Poker-AI.org

Poker AI and Botting Discussion Forum

All times are UTC




Posted: Sat Mar 01, 2014 1:23 am
Junior Member

Joined: Wed Jul 17, 2013 10:52 pm
Posts: 21
I've made 100bb NL HU bots that use imperfect recall and only bucket based on their EV histograms and OCHS, which combines the public card info (the board cards) with the private card info (hole cards). Reference: http://poker.cs.ualberta.ca/publication ... action.pdf

From reading descriptions of other NL agents (hyperborean and slumbot), it seems they use some sort of hierarchical public bucketing scheme. That is, they bucket based on just the board cards, then they do another bucketing based on the public+private info. The slumbot NL paper specifically mentions "For the river we divide the game tree eight ways based on the public cards." Reference: https://www.aaai.org/ocs/index.php/WS/A ... /7044/6479

However, when I've tried making miniature test bots (~200MB in size), one with public bucketing and one without, both using the same total amount of memory and run through CFR (all flavors) for the same amount of time, I've yet to really have a convincing case where the bot using public bucketing has an advantage. Every time so far, the bot without public bucketing has performed better heads up. So basically, I made this thread looking to compare notes with anyone.

It could be my public bucketing scheme is bad. It could be that public buckets only become helpful once you already have 1000+ private buckets on each street (my bots were not that large). Perhaps bots with public buckets take much longer to converge because of how the game tree is split. Maybe my betting abstraction is interfering with the bot taking advantage of having public card information... There are so many variables and only so much time to test things :)

Currently, I'm waiting for some larger bots to do a few billion iterations of Pure CFR and I'll compare then, but it's gonna take a while. So, I'm here to ask if anyone else has any experience with this and would be willing to share.


Posted: Sat Mar 01, 2014 2:41 am
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
I thought the info in that paper you referenced showed that the new bucketing scheme beat the old hierarchical stuff? They may just be using hierarchical bucketing with certain hand value metrics because it's easier to tweak the game size, etc. without having to re-cluster.


Posted: Mon Mar 03, 2014 9:42 am
Junior Member

Joined: Wed Jul 17, 2013 10:52 pm
Posts: 21
You know, you could be right. I may have misinterpreted the little blurb they give when describing hyperborean_iro: "Buckets were calculated according to public card textures and k-means clustering over hand strength distributions". Now, after your comment, I think they do just mean the histogram method alone and not some public card bucketing AND the histogram method. Classic example of making a problem seem harder than it is :)

Still I have to wonder... once you get up to like 10k or 100k buckets, at some point would you rather split all the buckets into groups based on the suitedness of the board? The histogram abstraction captures so much of the card information it's hard to argue for spending memory on anything else, but maybe there are some special cases.
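As a rough illustration of what "splitting on the suitedness of the board" could look like, here is a minimal Python sketch. The card format (two-character strings like "Ah") and the three group labels are assumptions made up for this example; they are not taken from hyperborean, slumbot, or any published abstraction.

```python
from collections import Counter

def board_suitedness(board):
    """Classify a board by the largest number of cards sharing one suit.

    `board` is a list of two-character card strings such as "Ah" or "Td"
    (rank then suit). The labels are illustrative only.
    """
    suit_counts = Counter(card[1] for card in board)
    most_common = max(suit_counts.values())
    if most_common >= 3:
        return "three_plus_suited"  # a flush is already possible with two hole cards
    if most_common == 2:
        return "two_suited"
    return "rainbow"

# Hypothetical usage: pick the public group first, then bucket privately inside it.
flop = ["Ah", "Kd", "2h"]
group = board_suitedness(flop)  # "two_suited"
```

With only three groups this costs very little memory by itself; the open question from the post is whether the private buckets inside each group are worth more than simply having three times as many histogram buckets.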


Posted: Wed Mar 05, 2014 1:36 am
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
From that sentence it sounds like they used both. The histogram clustering they described didn't allow for board progression, so maybe they used texture buckets to show the public card history? There would also be some value in knowing if your hand projections rank high due to public cards.

It would be neat if you could convert the histogram into a normalized metric without clustering. For example, with statistical functions like skewness and/or kurtosis.
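For what it's worth, here is a minimal sketch of that idea in Python: summarizing an equity histogram by its mean, variance, skewness, and excess kurtosis instead of clustering it. The bin layout (equal-width bins over [0, 1], each represented by its midpoint) and the function name are assumptions for this example. A handful of moments is a lossy summary, so two differently shaped histograms can end up with similar values.

```python
import math

def histogram_moments(hist):
    """Return (mean, variance, skewness, excess_kurtosis) for a histogram.

    `hist` holds counts or weights for equal-width equity bins over [0, 1].
    Each bin is represented by its midpoint (an assumption of this sketch).
    """
    total = sum(hist)
    n_bins = len(hist)
    mids = [(i + 0.5) / n_bins for i in range(n_bins)]
    probs = [h / total for h in hist]

    mean = sum(p * x for p, x in zip(probs, mids))
    var = sum(p * (x - mean) ** 2 for p, x in zip(probs, mids))
    if var < 1e-12:
        # All mass in one bin: higher standardized moments are undefined, report 0.
        return mean, 0.0, 0.0, 0.0

    sd = math.sqrt(var)
    skew = sum(p * ((x - mean) / sd) ** 3 for p, x in zip(probs, mids))
    kurt = sum(p * ((x - mean) / sd) ** 4 for p, x in zip(probs, mids)) - 3.0
    return mean, var, skew, kurt

# Hypothetical 3-bin histograms: one symmetric, one with most mass in the top bin.
symmetric = histogram_moments([1, 2, 1])
top_heavy = histogram_moments([1, 1, 4])
```

The resulting tuple could be used directly as a feature vector, or scaled and binned on a grid, which avoids re-clustering whenever the game size changes.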


Posted: Fri Mar 14, 2014 4:11 am
Junior Member

Joined: Wed Jul 17, 2013 10:52 pm
Posts: 21
Following up on this, my bet is that hyperborean only used public bucketing for the "important" part of their betting tree, which had 100k+ buckets per street. (The important part is based on their importance sampling technique, which was maybe around 0.05% to 0.2% (?) of the betting sequences based on their frequency in self-play and pot size.)

In the Pure CFR paper linked from the other forum, Richard Gibson describes using public buckets by street with amounts of 9, 51, and 280. The ACPC page says hyperborean used 180k/1.53m/1.68m buckets. Divide and you see they must have had 20k/30k/6k private buckets per public bucket. All that's left is to guess what those curious 9/51/280 "hand-picked" buckets could be... I have some suspicions :)

http://richardggibson.appspot.com/stati ... -paper.pdf
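A quick sanity check of the division above, as a small Python sketch. The totals are the rounded figures quoted in the post (180k/1.53m/1.68m), and pairing them with flop/turn/river in that order is an assumption. A zero remainder only shows the numbers are consistent with the guess, not that the real strategy was built this way.

```python
# Rounded bucket totals as quoted in the post, and the public bucket counts
# from the Pure CFR paper. The street labels are an assumption.
total_buckets = {"flop": 180_000, "turn": 1_530_000, "river": 1_680_000}
public_buckets = {"flop": 9, "turn": 51, "river": 280}

private_per_public = {}
for street, total in total_buckets.items():
    # divmod gives (private buckets per public bucket, leftover)
    private_per_public[street] = divmod(total, public_buckets[street])

for street, (quotient, remainder) in private_per_public.items():
    print(f"{street}: {quotient} private buckets per public bucket, remainder {remainder}")
```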


Posted: Fri Mar 14, 2014 9:55 pm
Regular Member

Joined: Sat May 25, 2013 7:36 am
Posts: 73
nonpareil wrote:
[...]In the Pure CFR paper linked from the other forum, Richard Gibson describes using public buckets by street with amounts of 9, 51, and 280. The ACPC page says hyperborean used 180k/1.53m/1.68m buckets. Divide and you see they must have had 20k/30k/6k private buckets per public bucket. All that's left is to guess what those curious 9/51/280 "hand-picked" buckets could be... I have some suspicions :)
[...]

~2m Buckets? Just to make sure: This is about HUL right? Or HUNL?


Posted: Fri Mar 14, 2014 11:19 pm
Junior Member

Joined: Wed Jul 17, 2013 10:52 pm
Posts: 21
In the paper, the author is describing the CPRG's three-player limit hold'em strategy for the ACPC. However, the fact that the number of public buckets for that strategy also divides the number of buckets that the two-player no limit strategy used makes me believe that both used an identical public bucketing scheme for the "important" betting tree.

I'd guess the important betting tree is only like 200-400 betting nodes so they can afford to have that many buckets (e.g. sb raise, bb call; sb raise, bb call, bb check; sb raise, bb call, bb check, sb check; sb limp, bb check; etc.)

Unfortunately, without a pocket super computer, I'm not sure how to run hundreds of k-means groups when dealing with so many buckets without taking months.


Posted: Sat Mar 15, 2014 9:46 pm
Regular Member

Joined: Sat May 25, 2013 7:36 am
Posts: 73
nonpareil wrote:
In the paper, the author is describing the CPRG's three-player limit hold'em strategy for the ACPC. However, the fact that the number of public buckets for that strategy also divides the number of buckets that the two-player no limit strategy used makes me believe that both used an identical public bucketing scheme for the "important" betting tree.


Good observation

nonpareil wrote:
I'd guess the important betting tree is only like 200-400 betting nodes so they can afford to have that many buckets (e.g. sb raise, bb call; sb raise, bb call, bb check; sb raise, bb call, bb check, sb check; sb limp, bb check; etc.)


I assume by important you mean branches of the game tree that are likely to be traversed during gameplay?! So the bot is going to play smart against smart opponents and fishy against fishy opponents?

I am not yet convinced this is a good approach. Sure, the nemesis in the abstracted game will be able to extract less value, since he cannot benefit from the bot's strategy in unlikely action sequences. But the nemesis in the unabstracted game might be able to exploit the bot's weaknesses in these branches, so he might be tempted to force the bot into situations where it is weak.

In other words: if you are playing against a human player and the human recognizes that the bot is weak in spots that are unusual or involve unconventional play, he will play in an unconventional way.

nonpareil wrote:
Unfortunately, without a pocket super computer, I'm not sure how to run hundreds of k-means groups when dealing with so many buckets without taking months.


If I understood correctly, you would group some boards and create hole-card/board bins within these groups. Since each group consists of only a few boards, it is sufficient to sample only those boards, so each k-means run will converge much faster than a k-means run that groups all hole cards on all boards.

I.e., you divide all boards into two groups, A and B, each consisting of 50% of all boards, and run k-means on both groups. Since each group is only half the size of A+B, the algorithm is likely to converge in about half the time k-means requires for grouping all hole cards on all possible boards.

So I assume this won't be too much of a difference
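The per-group idea above can be sketched in a few lines of Python. This is a toy version under stated assumptions: each hand/board pair is already reduced to a small feature vector (for example a histogram), each pair already carries a public group id, and the k-means is plain Lloyd's algorithm with a deterministic, arbitrary initialization. The names `kmeans` and `bucket_by_group` are made up for this example.

```python
from collections import defaultdict

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on lists of floats; returns one label per point.

    Centroids start at evenly spaced input points so the result is
    deterministic (an arbitrary choice for this sketch).
    """
    step = max(1, len(points) // k)
    centroids = [list(points[min(i * step, len(points) - 1)]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def bucket_by_group(features, group_of, k):
    """Run an independent k-means inside each public group.

    Returns a list of (public_group, private_cluster) pairs, one per input.
    """
    by_group = defaultdict(list)
    for idx, group in enumerate(group_of):
        by_group[group].append(idx)

    result = [None] * len(features)
    for group, idxs in by_group.items():
        labels = kmeans([features[i] for i in idxs], min(k, len(idxs)))
        for i, lab in zip(idxs, labels):
            result[i] = (group, lab)
    return result

# Toy data: one-dimensional "hand strength" features in two public groups.
features = [[0.10], [0.12], [0.90], [0.88], [0.50], [0.52], [0.05], [0.07]]
group_of = ["A", "A", "A", "A", "B", "B", "B", "B"]
buckets = bucket_by_group(features, group_of, k=2)
```

Each assignment pass costs roughly (number of points) x (number of clusters) distance computations. If a fixed bucket budget is split evenly over G groups, every run sees 1/G of the points and 1/G of the clusters, and the runs are independent, so they can also be spread across cores or machines.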


Posted: Sun Mar 16, 2014 5:20 am
Junior Member

Joined: Wed Jul 17, 2013 10:52 pm
Posts: 21
Nose wrote:
I assume by important you mean branches of the game tree that are likely to be traversed during gameplay?! So the bot is going to play smart against smart opponents and fishy against fishy opponents?


First I have to say I haven't done any tests myself to know how the strategy may change, although several tests were discussed in the dissertation, and the CPRG used it with some of their ACPC submissions.

I agree that importance sampling can seem like an exploitative approach, but I think if you saw some of the relative "importance" values of betting nodes it might not seem that drastic of a change. (Remember that importance is frequency times pot size, so for example, a 3-bet pot might be uncommon, but it's still important because the pots are larger.) When I tested a prototype bot in self-play that had 188432 bet sequences, the sum of the importance from the top ~0.13% (same percentage mentioned in paper) accounted for 49.6% of the sum of the importance from all nodes. Just coincidence it's near 50% (?), but that does seem like a natural target anyhow. The top node (button minraise, bb call) was 10x as important as the 23rd most important node, and 100x more important than the 276th most important node.
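The "importance" calculation described above can be sketched roughly like this in Python. The node data is invented for illustration, and `node_importance` and `top_share` are hypothetical helper names, not anything from the dissertation. For scale, 0.13% of 188432 sequences is about 245 nodes.

```python
def node_importance(frequency, avg_pot):
    """Importance of a betting node: how often it is reached times pot size."""
    return frequency * avg_pot

def top_share(importances, fraction):
    """Share of total importance held by the top `fraction` of nodes."""
    ranked = sorted(importances, reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    return sum(ranked[:n_top]) / sum(ranked)

# Made-up (frequency, average pot in bb) pairs for four betting sequences.
toy_nodes = [(0.5, 4), (0.1, 20), (0.01, 2), (0.01, 2)]
toy_importances = [node_importance(f, pot) for f, pot in toy_nodes]

# Top half of the toy nodes; a real tree would use something like 0.0013.
share = top_share(toy_importances, 0.5)
print(f"top 50% of nodes hold {share:.1%} of total importance")
```

Note how the second toy node is reached five times less often than the first but ends up equally important because its pot is five times larger, which is the 3-bet pot effect mentioned above.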

The top 0.13% are just super common lines that I doubt can be "avoided" by a nemesis strategy. They include things like any time flop is checked through, or flop and turn checked through, facing cbets in 3-bet pots, facing leads, cbets, barrels in normal pots, etc. I don't think you can escape bumping into certain sequences, which is why they chose to boost the number of buckets for them (including the use of public buckets, connecting back to the start of this thread). The paper gave some information set numbers, and it seemed like they still spent ~6x as much memory on all the "unimportant" sequences as on the important ones.

How many more buckets to give, and how many nodes to label important, and thus how much additional memory you devote to them are all parameters you can choose yourself.

You make a good point with your k-means comment. I was feeling a little overwhelmed when I was first imagining it :)


Posted: Mon Mar 17, 2014 6:18 pm
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
So did they use no public buckets on unimportant rounds? Or did they just limit them to a smaller set size? I'd imagine the different sizes were based on board progression, something similar to what they've described in older papers. 9 seems somewhat small at first glance to describe the flop, but it's probably adequate. They manage some reduction on the turn from 9x9 to 51, and so forth onto the river. Whatever it is, I'm sure it's elegant. Maybe the new 'optimal, anytime board clustering' approach would shed light on that.


Reading all this, my abstractions are in desperate need of an overhaul.

Anybody interested in working on a very large and profitable short-stack strategy? :) I'm currently using External Pure-CFRM, some custom texture buckets, EHS^2, and HP, all of which need to be rebuilt.

