Poker-AI.org • View topic - Exploitation/Adaptation By Purification

All times are UTC

Exploitation/Adaptation By Purification

cantina

Post subject: Exploitation/Adaptation By Purification

Posted: Sun Oct 20, 2013 5:41 pm

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

Say you have an EQ with a mixed strategy (typical of those created with CFRM). Has anybody tried using a purification threshold during live play as a means of adaptation/exploitation? For example, after each hand you would take the winnings/losses and raise or lower the purity threshold for a given section of the game, which increases or reduces the probabilities of the actions taken. How those sections are defined and how the thresholds are updated would, of course, be the subject of experimentation.
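To make the idea concrete, here is a minimal sketch of what applying a purification threshold to one mixed strategy could look like. It assumes the threshold works by zeroing out actions whose probability falls below it and renormalizing the rest, with a fallback to the single most likely action when nothing survives. The function name `purify` and that fallback rule are assumptions for illustration, not something stated in this post.

```python
def purify(probs, threshold):
    """Drop actions with probability below `threshold`, then renormalize.

    Assumption: if no action survives (e.g. threshold = 1.0), play the
    single most likely action, i.e. a fully pure strategy.
    """
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    if total == 0.0:
        best = max(range(len(probs)), key=lambda i: probs[i])
        return [1.0 if i == best else 0.0 for i in range(len(probs))]
    return [p / total for p in kept]


# Hypothetical fold/call/raise probabilities from an equilibrium strategy.
strategy = [0.05, 0.25, 0.70]
low = purify(strategy, 0.01)    # nothing is removed
mid = purify(strategy, 0.10)    # the 5% action is removed
full = purify(strategy, 1.00)   # only the most likely action remains
```

A higher threshold removes more low-probability actions, so the strategy gets more predictable as the threshold rises.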

cantina

Post subject: Re: Exploitation/Adaptation By Purification

Posted: Thu Oct 24, 2013 3:07 am

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

Some results: 25bb stack level, global threshold adjustment versus a static player, starting at 5% purification. The min/max purification levels were 1%/100%, and adjustments were made like:
threshold += threshold * inc

Average (optimum) is probably around 40%, which is about what I expected. A lower delta (increment percentage) would probably have shown a more stable optimum threshold. I don't have a player that can exploit overly predictable strategies (i.e. very pure strategies), so I'm not sure how to show it decreasing its threshold. In theory it should, though, against a player with that capability.

The increment sign (+/-) was decided by Zed * Utility of the deviated threshold levels.
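As a rough sketch of the update rule above: the threshold moves by a fixed percentage of its current value and is clamped to the 1%/100% limits. How the sign is actually derived is not spelled out here, so the sketch simply takes it as an input; the names `update_threshold` and `direction` are assumptions for illustration.

```python
MIN_THRESHOLD = 0.01  # 1% lower limit from the post
MAX_THRESHOLD = 1.00  # 100% upper limit from the post


def update_threshold(threshold, delta, direction):
    """Apply `threshold += threshold * inc` and clamp to the limits.

    `delta` is the increment percentage (0.10 for the 10% run).
    `direction` is +1, -1 or 0; how it is chosen is left to the caller
    (assumption: the sign comes from some utility comparison).
    """
    inc = delta * direction
    threshold += threshold * inc
    return min(MAX_THRESHOLD, max(MIN_THRESHOLD, threshold))


t = 0.05                              # starting point: 5% purification
t = update_threshold(t, 0.10, +1)     # one step up with a 10% delta
```

Because the step is proportional to the current threshold, a 10% delta produces large absolute jumps near the top of the range, which fits the remark that a lower delta would probably look more stable.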

Code:

Hand #    Threshold (10% delta)
--------------------------------------
50000     0.146923779810831
100000    0.186814088517662
150000    0.449290742512755
200000    0.496626159328362
250000    0.33744537676712
300000    0.442345245425452
350000    0.327312908941213
400000    0.406057514081512
450000    0.342283417617408
500000    0.420383316311766
550000    0.40578471755976
600000    0.488456269506705
650000    0.432960247785639
700000    0.513456650014193
750000    0.422146355060823
800000    0.510797089623596
850000    0.385637976310415
900000    0.403411666168254
950000    0.528818128102072
1000000   0.295419722492907
1050000   0.254078200455836

Next: somehow model a dynamic threshold based on game state.

spears

Post subject: Re: Exploitation/Adaptation By Purification

Posted: Thu Oct 24, 2013 9:39 am

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

How is this different from just mixing in a pure strategy?

cantina

Post subject: Re: Exploitation/Adaptation By Purification

Posted: Thu Oct 24, 2013 11:45 am

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

What do you mean by 'mixing in'?

spears

Post subject: Re: Exploitation/Adaptation By Purification

Posted: Thu Oct 24, 2013 1:00 pm

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

30% pure = play pure strategy 30% of the time and play NE 70% of the time

cantina

Post subject: Re: Exploitation/Adaptation By Purification

Posted: Thu Oct 24, 2013 5:24 pm

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

I don't know if the scenario you describe would be exactly the same, but yes, a mix of pure strategies could be the same as playing a mixed strategy. A completely pure strategy would exclude all but one action, while a completely un-purified strategy might play all available actions. Something in the middle could/would play fewer than all actions, but in the scenario you describe, it would be playing all actions 70% of the time, if that were the default EQ. That's not really the point of this post, though.
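A small sketch of the distinction, under two assumptions: "pure strategy" means always taking the most likely equilibrium action, and the threshold works by dropping low-probability actions and renormalizing. The helper names are hypothetical.

```python
def mix_with_pure(ne_probs, pure_weight):
    """Play the pure strategy with probability `pure_weight`, else the NE."""
    best = max(range(len(ne_probs)), key=lambda i: ne_probs[i])
    return [(1.0 - pure_weight) * p + (pure_weight if i == best else 0.0)
            for i, p in enumerate(ne_probs)]


def support_size(probs):
    """Number of actions that are played with non-zero probability."""
    return sum(1 for p in probs if p > 0.0)


ne = [0.05, 0.25, 0.70]           # hypothetical equilibrium probabilities
mixed = mix_with_pure(ne, 0.30)   # "30% pure": every action is still played

# Thresholding at 10% instead removes the 5% action entirely.
kept = [p if p >= 0.10 else 0.0 for p in ne]
thresholded = [p / sum(kept) for p in kept]
```

Mixing in a pure strategy shifts weight toward one action but keeps every action in play, while thresholding shrinks the set of actions that can be chosen at all.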

What I'm showing here is that the "amount" of purity can be adapted in real time to improve equity versus a given opponent. For example, against an opponent that plays a very static strategy (like another EQ), it would likely be better to play a purer strategy, while against an exploitative opponent (like most humans), it would be better to play a more defensive (less pure) strategy. You can decide that level by sampling.
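One simple way "deciding by sampling" could look, as a sketch only: record the result of each hand against the threshold level that was in use, then prefer the level with the best average. The class `ThresholdSampler` and the candidate levels are hypothetical, and this is not the adjustment scheme used for the results earlier in the thread.

```python
from collections import defaultdict


class ThresholdSampler:
    """Tracks average utility per candidate threshold level."""

    def __init__(self, candidates):
        self.candidates = list(candidates)
        self.total = defaultdict(float)  # summed winnings per level
        self.count = defaultdict(int)    # hands played per level

    def record(self, threshold, utility):
        """Store the result of one hand played at `threshold`."""
        self.total[threshold] += utility
        self.count[threshold] += 1

    def best(self):
        """Return the sampled level with the highest mean utility."""
        tried = [t for t in self.candidates if self.count[t] > 0]
        if not tried:
            return None
        return max(tried, key=lambda t: self.total[t] / self.count[t])


sampler = ThresholdSampler([0.05, 0.40, 1.00])
sampler.record(0.05, -1.0)
sampler.record(0.40, 2.0)
sampler.record(0.40, 1.0)
sampler.record(1.00, 0.5)
choice = sampler.best()
```

Poker results are noisy, so in practice each level would need a large number of hands before its average means much.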
