Poker-AI.org

Poker AI and Botting Discussion Forum
It is currently Mon Nov 13, 2023 1:52 pm

All times are UTC




Post new topic Reply to topic  [ 10 posts ] 
Author Message
PostPosted: Sat May 25, 2013 8:35 am 
Offline
Regular Member
User avatar

Joined: Sat May 25, 2013 7:36 am
Posts: 73
Hi,

I am new to this forum and comparably new to AI, which I consider a very interesting topic. Lately I solved a couple of toy games using CFRM. Now I want to approach poker (HUNL, ten BB stacks for convenience). Immediately I found myself confronted with a couple of issues:

a) The game tree appears to be too large to create it manually (hardcode) or entirely store it in RAM
b) After generating a eps-equilibrium strategy I need to store it on my HDD in a way that I can reload it quickly

For every issue there (most likely) exist a variety of solutions. I am now interested in collecting, finding and discussing pros&cons of several approaches.

So this thread is meant to find near-optimal (in terms of time and space) answers to the following questions:

1) How can the Extensive Game Tree (EGT) for a static/ dynamic abstraction be created in a simple way.
2) How can the Tree be efficiently stored in RAM (i.e. recursive data structure - which scales well when using dynamic/ evolutionary approaches - or simply an array, to avoid computational overhead for creating the context for each recursive function call, or a hybrid of both, or ...)? (i.e. 2: IIRC in some paper someone stated it would not be necessary to "store the entire game tree [in RAM] since it is only required to walk over it".)
3) How can the resulting strategy be extracted from the tree and stored on permanent storage devices (i.e. just serialise the recursive data-structure/ the array, use more sophisticated ways like creating a LUT which allows us to store more than one strategy in RAM and therefore allowing us to quickly switch between them on runtime)

Looking for a fruity discussion

Cheers,
Nose


Top
 Profile  
 
PostPosted: Sat May 25, 2013 9:06 am 
Offline
Regular Member
User avatar

Joined: Sat May 25, 2013 7:36 am
Posts: 73
Also a fact to consider is an efficient interface between streets when using an imperfect recall abstraction. Such like:

Is it favourable to solve each street independently by having multiple "initialisation points" (this refers only to the streets FLOP, TURN and RIVER) where, for example, you start solving the FLOP from various (pot size x abstract hand strength)-nodes such as { (pot=4BB, h_strength=strong), (pot=10BB, h_strength=strong), (pot=4BB, h_strength=weak), (pot=10BB, h_strength=weak) }

OR

Is it favourable trying to solve all streets at once, since outcomes somehow influence the play on earlier streets?

OR

First solve each street independently, combine then and adjust regrets for the overall strategy (which I think should not make a difference to solving streets independently and combining them, since information from earlier streets get lost anyway)


Hard to give a self-explaining example for this


Top
 Profile  
 
PostPosted: Sat May 25, 2013 10:03 am 
Offline
Junior Member

Joined: Sun Mar 10, 2013 10:30 am
Posts: 20
Theoretically computing CFRM for only subgames in imperfect-information game is wrong.

These two papers show techniques to get around this issue :

http://www.cs.cmu.edu/~sganzfri/Endgame_AAAI13.pdf
http://poker-ai.org/phpbb/viewtopic.php?f=25&t=2433


Top
 Profile  
 
PostPosted: Sun May 26, 2013 2:51 pm 
Offline
Regular Member
User avatar

Joined: Sat May 25, 2013 7:36 am
Posts: 73
LOLWorld wrote:
Theoretically computing CFRM for only subgames in imperfect-information game is wrong.

These two papers show techniques to get around this issue :

http://www.cs.cmu.edu/~sganzfri/Endgame_AAAI13.pdf
http://poker-ai.org/phpbb/viewtopic.php?f=25&t=2433


Thanks. The first is quite enlightening. For the CFR-D I miss the comparison to other CFR-variants.

By wrong you mean that the equilibrium of the full game differs from the combination of equilibria from the subgames? Yes, but I thought that is what imperfect recall is all about. Instead of solving a game with a huge tree we try to solve four "loosely chained" subgames ... or did I get it all wrong? :shock:


Top
 Profile  
 
PostPosted: Sun May 26, 2013 6:29 pm 
Offline
Junior Member

Joined: Sun Mar 10, 2013 10:30 am
Posts: 20
I asked the same question concerning imperfect recall here :
http://poker-ai.org/phpbb/viewtopic.php?f=24&t=2478#p4060

Maybe some of the forum gurus can enlighten us?


Top
 Profile  
 
PostPosted: Mon May 27, 2013 9:09 am 
Offline
Regular Member
User avatar

Joined: Sat May 25, 2013 7:36 am
Posts: 73
It appears to me that somehomelessguy answered your question when saying

somehomelessguy wrote:
no, you won't get good results in time. maybe with heavy postflop abstraction.

[...]

let's do some rough estimates:

140 preflop hands.
200 river sequences.

140*47*46*200 = 60536000 information sets

and so we need about 600 million iterations. we won't get that in a few seconds.

obviously it's not as many when removing isomorphs, but still not doable without buckets.


Concerning imperfect recall I agree with p2bb when he says

proud2bBot wrote:
But then you calculate only a subgame with strong restrictions (the given ranges for your and your opponents). Within the sub-game, you'll get an optimal solution, but it might be quite a bit off from the unrestricted nash solution. [...]


but in case your assumptions on the preflop ranges based on expert knowledge are valid I don't see why this approach should not perform well. On the other hand I don't see why you would want to take the risk of being exploited because of maybe poorly chosen preflop ranges and then not take advantage of its potential by playing a defensive strategy from that point on ... just my two cents.

Also I read the part about imperfect recall again and still cannot see how this contradicts with "our" understanding.

http://poker.cs.ualberta.ca/publications/schnizlein.msc.pdf wrote:
[...]
While imperfect recall allows us more flexibility in abstraction creation, using it means that we are no longer guaranteed to converge to an equilibrium. In practice, this has not been an issue.

Imperfect recall allows us to increase the number of buckets on earlier rounds without increasing the number of buckets in later rounds. For instance, in the preflop there are only 169 possible 2-card combinations one could have, taking into account suit isomorphisms. Using imperfect recall, we could create a player that has 169 buckets on the preflop that are forgotten once the flop is dealt. This could increase the number of flop buckets to 64, since the 8 buckets originally used to remember the preflop may now be used for the flop. Keeping 8 buckets on the turn and river would mean that the river still only has 4096 bucket sequences.
[...]


But yes, I also would appreciate a few comments from the more experienced board members.


Top
 Profile  
 
PostPosted: Mon May 27, 2013 10:29 am 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
2) You're going to search for EQ via evolutionary methods?


Top
 Profile  
 
PostPosted: Mon May 27, 2013 5:28 pm 
Offline
Regular Member
User avatar

Joined: Sat May 25, 2013 7:36 am
Posts: 73
Nasher wrote:
2) You're going to search for EQ via evolutionary methods?


I was thinking about it to find a suitable abstraction. The rough idea was to generate sub-game-trees with random action possibilities, train them (using CFRM), let them run against a baseline-strategy (i.e. 100% raise), let the not so well performing strategies "die" and use crossing over to hatch strategies that perform better

but I'm not even close to implementing that approach yet. I am still collecting experience about space of the game-tree, rate of convergence, time, and so on and so forth


Top
 Profile  
 
PostPosted: Mon May 27, 2013 5:36 pm 
Offline
Regular Member
User avatar

Joined: Sat May 25, 2013 7:36 am
Posts: 73
I created an expert based ruleset for bucketing hands (~200buckets with imperfect recall each street) and classifying boards (approx. 15, 30, 40 on Flop, Turn and River). Can anyone estimate how much space that would take in RAM? Or at least whether it is possible to put it in RAM

sorry for the many (stupid) questions, but I am highly interested in saving time by reducing the error-component in the trial&error learning process


Top
 Profile  
 
PostPosted: Mon May 27, 2013 6:43 pm 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
I think using evolution to find optimal bet sizes would take an insanely long time for even a short stacked game, if it's even possible, unless you have some trick in mind that would shorten the process considerably (or you have a lot resources at your disposal). You're probably better off using the methods described by the U of A guys.

Regarding game sizes, read: "Measuring the Size of Large No-Limit Poker Games"


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 10 posts ] 

All times are UTC


Who is online

Users browsing this forum: No registered users and 1 guest


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Powered by phpBB® Forum Software © phpBB Group