Poker-AI.org

Poker AI and Botting Discussion Forum

All times are UTC




 Post subject: Re: Dynamic Bucketing
PostPosted: Sun Apr 07, 2013 5:48 pm 
Offline
Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267
Great idea.


 Post subject: Re: Dynamic Bucketing
PostPosted: Sun Apr 07, 2013 6:39 pm 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
Thanks.

An interesting note about my DUMBASS method: the LUTs are the same size as those used in my other implementations, because I no longer need the board texture, etc. when running CFRM. So there's no impact on potential abstraction size (at least for me).

You can add more LUTs if you want, that represent, say, the BB hands, or one for each bet round, or you can add them in a more strategic fashion, like for specific branches of the game tree where you think the EVs will/should differ drastically.


 Post subject: Re: Dynamic Bucketing
PostPosted: Sun Apr 07, 2013 7:05 pm 
Offline
Senior Member

Joined: Mon Mar 11, 2013 10:24 pm
Posts: 216
nasher, you are a dumbass


...inventor :p


When I read about the idea I wanted to reply that I don't think it will work well, but your results prove me wrong... Just to see if I got it right: you are basically clustering hands based on their EV obtained by playing the CFRM strategy so far, right?

If so, I wonder why it works as I see some issues:
1. During CFRM, hands change buckets continuously, so the regrets accumulated for a specific bucket may no longer be accurate, as different hands are now connected to this bucket.
2. In theory we want to put hands that should strategically be played the same way into the same bucket. However, we might find that the EV of e.g. a draw and a made hand is the same, so they'd end up in the same bucket, even though it might be way better to play the draw passively and the made hand aggressively, or vice versa.


 Post subject: Re: Dynamic Bucketing
PostPosted: Sun Apr 07, 2013 7:36 pm 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
proud2bBot wrote:
nasher, you are a dumbass


...inventor :p
Thank you, p2bb, that is very kind of you to acknowledge my hard work and accomplishments. :)

proud2bBot wrote:
When I read about the idea I wanted to reply that I don't think it will work well, but your results prove me wrong... Just to see if I got it right: you are basically clustering hands based on their EV obtained by playing the CFRM strategy so far, right?
I edited the post about the results: they might have been due to the smaller abstraction converging more quickly, or to some other nuance that's making it outperform. I'm re-running it with an abstraction of roughly equal size. And, yes, you are correct about the EV clustering.

proud2bBot wrote:
If so, I wonder why it works as I see some issues:
1. During CFRM, hands change buckets continuously, so the regrets accumulated for a specific bucket may no longer be accurate, as different hands are now connected to this bucket.
2. In theory we want to put hands that should strategically be played the same way into the same bucket. However, we might find that the EV of e.g. a draw and a made hand is the same, so they'd end up in the same bucket, even though it might be way better to play the draw passively and the made hand aggressively, or vice versa.
1) See all my blabbering in previous posts about weighted updates.
2) You have a good point. My thought is: the EV is a combination of hand value AND the strategy, so it's not just how often the hand wins on the river, but how much it wins AND how it's played. Isn't this the essence of strategically similar bucketing? I'm not a poker pro, so such things are speculative to me. What do you think?


 Post subject: Re: Dynamic Bucketing
PostPosted: Sun Apr 07, 2013 8:04 pm 
Offline
Senior Member

Joined: Mon Mar 11, 2013 10:24 pm
Posts: 216
Regarding your second point: I can't come up with an example right now, but what I have in mind is this: we have two different hands that are currently in the same cluster. Hence, we have played these hands similarly so far, e.g. shoving the flop. I wonder if we can get into spots with two such hands where EV-shove(hand1) ~ EV-shove(hand2), but EV-check(hand1) << EV-shove(hand1) and EV-check(hand2) > EV-shove(hand2). If so, it means that we take a non-optimal action with hand2, and we won't change the strategy even if we let CFRM learn more, because we get hand1 and hand2 similarly often and, for the given bucket, a shove is the best average option.
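To make the concern concrete, here is a toy calculation. The EVs, the 50/50 hand frequencies, and all names are made-up assumptions for illustration, not output from any solver. Both hands have the same shove EV, so EV clustering would put them together, yet the bucket's best average action costs hand2 value.

```python
# Toy illustration: two hands share a bucket because their EV under the
# current strategy (shove) is the same. All numbers are hypothetical.
ev = {
    "hand1": {"shove": 1.0, "check": 0.2},  # EV-check << EV-shove
    "hand2": {"shove": 1.0, "check": 1.3},  # EV-check >  EV-shove
}
freq = {"hand1": 0.5, "hand2": 0.5}  # assumed: both hands equally likely

actions = ["shove", "check"]

# EV of each action for the bucket as a whole (frequency-weighted average).
bucket_ev = {a: sum(freq[h] * ev[h][a] for h in ev) for a in actions}
bucket_best = max(actions, key=lambda a: bucket_ev[a])

# Value each hand gives up by playing the bucket's action instead of its own best.
loss = {h: max(ev[h].values()) - ev[h][bucket_best] for h in ev}

print(bucket_ev, bucket_best, loss)
```

With these numbers the bucket keeps shoving, and hand2 gives up the difference between its check EV and its shove EV on every deal. How often real hands line up like this is exactly the open question.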


 Post subject: Re: Dynamic Bucketing
PostPosted: Sun Apr 07, 2013 8:26 pm 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
You might be, and probably are, right about that. To what degree is the question. My thought is that hands will eventually find and stay where they are the most 'stable.' This stability is dependent on the strategy, of which their EV is a product, and vice versa. Whatever the case, I don't really care, as long as it works better and the empirical results pay off. ;)

For my re-test, I'm going to run the same number of CFRM iterations on an abstraction of the same size as the original test (half the size of my baseline), using the same weighted updates on the AS and CR, but with EHS2 as a static metric for bucketing hands. Then I'll play it against my larger baseline strategy. If DUMBASS is better, the win rate in this test should be lower.


 Post subject: Re: Dynamic Bucketing
PostPosted: Mon Apr 08, 2013 2:21 am 
Offline
Senior Member

Joined: Mon Mar 11, 2013 10:24 pm
Posts: 216
Yeah, it's possible that the case I mentioned is rarely relevant. Let us know when you have the results of the 'fair' comparison - it might be irrelevant.


 Post subject: Re: Dynamic Bucketing
PostPosted: Mon Apr 08, 2013 9:41 am 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
-0.0018bb/hand with EHS2 bucketing after 500k games of duplicate poker. -0.0033bb/hand after about a million games. I'm going to try an abstraction of roughly equal size to that of my baseline and run DUMBASS on it to see what I get. There are still a lot of reasons it could have done better, variance in testing being one of them. I already deleted the strategy to make room for the new ones or I'd run it some more. I need a bigger SSD.

Notice how much noisier the DB graph is. Keep in mind, those EVs started out as EHS2 and are/were slowly interpolated to the hand's EV.
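For anyone who wants to try it, the interpolation and bucket lookup might look roughly like the sketch below. The 1/1000 default weight and the `bucket = M * number_of_buckets` mapping are the ones mentioned in this thread; everything else is an assumption for illustration, including the names `update_ev` and `bucket_of`, the sample hand, its values, and the idea that EVs are rescaled into [0, 1] so they share a scale with EHS2.

```python
def update_ev(ev_lut, hand, sampled_ev, weight=1.0 / 1000):
    """Nudge the stored value for `hand` toward the EV sampled this iteration."""
    ev_lut[hand] += weight * (sampled_ev - ev_lut[hand])
    return ev_lut[hand]


def bucket_of(metric, n_buckets):
    """Map a metric in [0, 1] to a bucket index in [0, n_buckets - 1]."""
    clamped = min(max(metric, 0.0), 1.0)
    return min(int(clamped * n_buckets), n_buckets - 1)


# Hypothetical hand whose LUT entry starts at its EHS2 value.
ev_lut = {"AhKh": 0.67}
before = bucket_of(ev_lut["AhKh"], 10)

# Hypothetical: every sampled EV (already rescaled to [0, 1]) comes out at 0.81.
for _ in range(5000):
    update_ev(ev_lut, "AhKh", 0.81)
after = bucket_of(ev_lut["AhKh"], 10)

print(before, after, ev_lut["AhKh"])
```

A smaller weight (such as 1/100k) makes the stored value move more slowly, so hands should change buckets less often, at the cost of tracking the current strategy's EV with more lag.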


Attachments:
dynamic_bucketing.jpg
ehs2_bucketing.jpg
 Post subject: Re: Dynamic Bucketing
PostPosted: Mon Apr 08, 2013 2:03 pm 
Offline
Senior Member

Joined: Mon Mar 11, 2013 10:24 pm
Posts: 216
I was thinking about it and have an alternative dynamic bucketing approach in mind:

Say we are given a fixed number of buckets, e.g. 100, and a bucketing metric M. We run CFRM for, e.g., 50M iterations and get a first game tree. Then we modify the game tree by collapsing similar buckets: if the strategy plays buckets a and b similarly, there is no need to distinguish between them, so we can remove them and create a new bucket that contains all regrets/cum. strategies of a and b. This leaves us with free bucket slots, which we can use to split more interesting buckets. To find those, we run the strategy for a number of games and determine the variance of each bucket in terms of EV. We then split the buckets with the highest variance into two parts, copying the regrets/cum. strategies.
Finally we continue CFRM learning.

For example, let's consider preflop (which is the easiest to explain, even though the effect is way more relevant postflop) with a bucketing size of 100. We might have an initial bucketing that looks as follows:
Bucket 0: 72o, 82o
Bucket 1: 93o, 73o
...
Bucket x: 88, KQs

After the first training, we learn that buckets 0 and 1 are played similarly preflop, so we add all stored numbers from bucket 1 to bucket 0 and mark bucket 1 as free. We further find that the variance of bucket x is high, so we split it up, copying the stored numbers from x into the new slot. This leads us to a new bucketing schema:
Bucket 0: 72o, 82o, 93o, 73o
Bucket 1: KQs
...
Bucket x: 88
Now we continue with regular CFRM.

The advantages are - theoretically - that we don't have a fixed bucket size, but can have a bucket with many elements if all of them play strategically similar. However, we can also have very fine-grained buckets for buckets with a high variance which seem more relevant. Furthermore, we don't have the issue of moving hands to different buckets as the information is moved with them.
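The merge and split steps from the preflop example could be sketched as follows. This is only an illustration under assumptions: the function names, the per-bucket number lists, the per-hand EVs, and the rule of splitting a bucket at its median hand EV are all hypothetical, since the post does not say how the two parts are chosen.

```python
def merge_buckets(assign, stats, keep, drop):
    """Fold bucket `drop` into `keep`: move its hands and add its stored numbers."""
    for hand, bucket in assign.items():
        if bucket == drop:
            assign[hand] = keep
    stats[keep] = [a + b for a, b in zip(stats[keep], stats[drop])]
    stats[drop] = None  # the slot is now free


def split_bucket(assign, stats, hand_ev, src, free):
    """Move the higher-EV half of `src` into the free slot, copying its numbers."""
    hands = sorted((h for h, b in assign.items() if b == src), key=hand_ev.get)
    for hand in hands[len(hands) // 2:]:
        assign[hand] = free
    stats[free] = list(stats[src])  # both halves start from the same numbers


# Initial bucketing from the example; bucket 2 plays the role of "bucket x".
assign = {"72o": 0, "82o": 0, "93o": 1, "73o": 1, "88": 2, "KQs": 2}
# Hypothetical stored numbers per bucket (e.g. regrets for two actions).
stats = {0: [1.0, 2.0], 1: [0.5, 0.5], 2: [3.0, -1.0]}
# Hypothetical per-hand EVs, used only to decide which half moves.
hand_ev = {"88": 0.4, "KQs": 0.5}

merge_buckets(assign, stats, keep=0, drop=1)
split_bucket(assign, stats, hand_ev, src=2, free=1)

print(assign)
print(stats)
```

After both steps, 72o, 82o, 93o and 73o share bucket 0, KQs sits in the freed bucket 1, and 88 stays in bucket x, matching the new schema above. Copying the numbers on a split means both halves start from the same strategy and only diverge as CFRM continues.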


 Post subject: Re: Dynamic Bucketing
PostPosted: Mon Apr 08, 2013 8:00 pm 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
I like that idea, and suggested something similar above, but based on bucket frequency. I define buckets on a per-round basis, but in your scenario the similar strategies are on a per-node basis. So, how would you define/store your bucket lattices? You couldn't use a full-hand LUT for every node. It's really easy to just say bucket = M * number_of_buckets, but how do you keep track of which buckets at a given node are twice as wide or twice as narrow, along with what hands go where?

I suppose you could use a LUT just for bucket centers/lattices at every node, just a few k in size.


 Post subject: Re: Dynamic Bucketing
PostPosted: Mon Apr 08, 2013 9:44 pm 
Offline
Senior Member

Joined: Mon Mar 11, 2013 10:24 pm
Posts: 216
I'd also stick with per-round buckets; we just need to define the similarity measure such that it takes into account all decisions within a round, ideally weighted by their probability.
Regarding bucket retrieval, I currently just use a LUT with the bucket number as the lookup value. We can use a similar approach there I guess (in my current abstraction, buckets have different sizes too - but are not dynamic in any way).


 Post subject: Re: Dynamic Bucketing
PostPosted: Mon Apr 08, 2013 11:25 pm 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
Hmm. How to do that? I suppose you could just line up your nodes for a round's bucket as one large array and do a distance measure of some kind between the other buckets. It would be simple, but maybe not the best solution.
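A rough sketch of that "one large array" idea: concatenate a bucket's action probabilities over every node in the round, then take a Euclidean distance in which each node is scaled by a weight (for instance its reach probability, as suggested earlier). The node names, weights, and strategies here are invented for illustration, and Euclidean distance is just one possible choice of measure.

```python
import math


def flatten(strategy, node_order, node_weight):
    """Line up a bucket's action probabilities for all nodes as one array."""
    values, weights = [], []
    for node in node_order:
        for p in strategy[node]:
            values.append(p)
            weights.append(node_weight[node])
    return values, weights


def bucket_distance(strat_a, strat_b, node_order, node_weight):
    """Weighted Euclidean distance between two buckets' round strategies."""
    a, w = flatten(strat_a, node_order, node_weight)
    b, _ = flatten(strat_b, node_order, node_weight)
    return math.sqrt(sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w)))


# Hypothetical round with two decision nodes and three actions at each.
node_order = ["open", "vs_raise"]
node_weight = {"open": 1.0, "vs_raise": 0.25}  # assumed reach probabilities

bucket_a = {"open": [0.1, 0.9, 0.0], "vs_raise": [0.5, 0.5, 0.0]}
bucket_b = {"open": [0.1, 0.9, 0.0], "vs_raise": [0.0, 0.5, 0.5]}
bucket_c = {"open": [0.9, 0.1, 0.0], "vs_raise": [0.5, 0.5, 0.0]}

d_ab = bucket_distance(bucket_a, bucket_b, node_order, node_weight)
d_ac = bucket_distance(bucket_a, bucket_c, node_order, node_weight)
print(d_ab, d_ac)
```

Here buckets a and b differ only at the rarely reached node, so they come out closer than a and c, which differ at the node every hand passes through.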

Latest test:
0.0237bb/hand after 500k games of duplicate poker. Aside from the abstraction size, one difference is the update weight on the EVs was changed from 1/1000 to 1/100k. 0.029+ after 5m games.

One thought was that the noise added by the EVs might actually be improving how the strategy converges, and not the EVs themselves?

Update: Interestingly, after another 50m iterations, the graph got more stable, but the over-all win-rate decreased to about 0.012bb/hand.

Update 2: After yet another 50m iterations the graph was even more stable and the strategy performance got even worse, now -0.011bb/hand versus my baseline. :)

Update 3: The decrease wasn't due to the DB, because it happened with static buckets too. O_o Maybe over-fitting, or something wrong with my abstraction/ASS.


 Post subject: Re: Dynamic Bucketing
PostPosted: Wed Apr 10, 2013 7:06 am 
Offline
Site Admin

Joined: Thu Feb 28, 2013 5:24 pm
Posts: 230
I am glad to see you took my initial idea and tuned it to something that can actually be implemented and tested quite quickly. The results are interesting and do seem to show that there is merit to this at least. I'll have to go back and go over your approach in more detail again when I have some more time but for now:
- How does your abstraction work, e.g. do you just separate by 5 card hands on the flop, not differentiating between hero Ac6c board 2c3d4s and 2c3d/Ac6c4s?

I'm actually quite surprised this works with just bucketing based on streets, not on previous action at all. I would've expected the bucketing to need to be specific to the action sequence as well. However, if it gets reasonably close to your baseline even with that, I'd be comfortable in saying this has great potential in outperforming static bucketing methods, especially based on EHS and EHS2.

@p2bb: I think the key thing to differentiate here is all-in equity versus play equity. The reason we say that we need to separate hands that might have equal EV, e.g. a made hand and a drawing hand, is only because the EV that we use to calculate this is based on rolling out cards without further actions. Those future actions that we know are happening however make those hands play differently. If we now use a different EV, that of actually playing the hand based on our and our opponent's strategy such as we are doing in CFRM calculations, then we can actually trust that calculation to put similarly playable hands into the same category.

_________________
Cheers.


 Post subject: Re: Dynamic Bucketing
PostPosted: Wed Apr 10, 2013 7:35 am 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
I'm trying it with probing CFRM instead of ASS, so we'll see what happens. I shall call this: Automatic N-bucket Arranging Logic w/ Probing-CFRM, or ANAL-Probing for short.

I really need somebody else to try it... It's pretty easy to implement. It might be that my betting abstraction has some weird tendency to push things off EQ as the strategy begins to converge. I discussed how I created it in other posts on the thread.


Last edited by cantina on Wed Apr 10, 2013 7:53 am, edited 1 time in total.

 Post subject: Re: Dynamic Bucketing
PostPosted: Wed Apr 10, 2013 7:45 am 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
Coffee4tw wrote:
@p2bb: I think the key thing to differentiate here is all-in equity versus play equity. The reason we say that we need to separate hands that might have equal EV, e.g. a made hand and a drawing hand, is only because the EV that we use to calculate this is based on rolling out cards without further actions. Those future actions that we know are happening however make those hands play differently. If we now use a different EV, that of actually playing the hand based on our and our opponent's strategy such as we are doing in CFRM calculations, then we can actually trust that calculation to put similarly playable hands into the same category.

I'm not sure I agree with this completely. I think it would be reasonable to assume that EVs for strategically different hands could be the same, despite being played differently. Though, I don't disagree that EV can hold some strategic differentiation, and since there aren't any other regularly updated variables readily available in CFRM, it's kind of a legacy option. :)


 Post subject: Re: Dynamic Bucketing
PostPosted: Wed Apr 10, 2013 8:06 am 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
Regarding similar strategies: You could use statistical functions, like Skewness, Kurtosis, and Variance, to decide bucket similarity across an entire round. Though, I'm not sure if this would be any better than doing just a straight distance comparison.

Thinking about this further, I saw Spears' post about an incrementally updated formula for variance. Do such formulas exist for Skewness and Kurtosis? If so, it would be fun to use these along with EV to update a hand's bucket during training, as those variables would/could be based on the strategy "flow" throughout the round, as opposed to the value, while maintaining a small enough size to fit in a LUT.
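Such formulas do exist: the one-pass variance update extends to the third and fourth central moments, so only five numbers need to be stored per entry. Below is a minimal sketch of the usual single-pass recurrences. The class name and the sample data are assumptions for illustration, and the statistics are the population versions (no small-sample correction).

```python
import math


class RunningMoments:
    """Single-pass mean, variance, skewness and excess kurtosis."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of (x - mean)^2
        self.m3 = 0.0  # running sum of (x - mean)^3
        self.m4 = 0.0  # running sum of (x - mean)^4

    def push(self, x):
        n1 = self.n
        self.n += 1
        n = self.n
        delta = x - self.mean
        delta_n = delta / n
        delta_n2 = delta_n * delta_n
        term1 = delta * delta_n * n1
        self.mean += delta_n
        # m4 and m3 must be updated before m2, because they use the old values.
        self.m4 += (term1 * delta_n2 * (n * n - 3 * n + 3)
                    + 6 * delta_n2 * self.m2 - 4 * delta_n * self.m3)
        self.m3 += term1 * delta_n * (n - 2) - 3 * delta_n * self.m2
        self.m2 += term1

    def variance(self):
        return self.m2 / self.n

    def skewness(self):
        return math.sqrt(self.n) * self.m3 / self.m2 ** 1.5

    def kurtosis(self):
        return self.n * self.m4 / (self.m2 ** 2) - 3.0


# Hypothetical samples, e.g. values observed for one hand during training.
rm = RunningMoments()
for sample in [1.0, 2.0, 3.0, 4.0, 10.0]:
    rm.push(sample)

print(rm.mean, rm.variance(), rm.skewness(), rm.kurtosis())
```

Whether skewness and kurtosis add anything useful for bucketing beyond EV is untested here; the sketch only shows that the per-entry storage stays small enough for a LUT.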


 Post subject: Re: Dynamic Bucketing
PostPosted: Thu Apr 11, 2013 2:51 am 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
ANAL-Probing is slowly improving versus my baseline after 3 runs of 6 hours each.


 Post subject: Re: Dynamic Bucketing
PostPosted: Tue Apr 16, 2013 12:32 pm 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
Since my ASS is still having problems, I could only test ANAL-Probing vs. plain Probing (static buckets). It lost by about 0.004bb/hand after 10m games of duplicate poker.

It could just be that it needs longer to converge than static buckets, which would make sense. If I find some time at a later date to start playing with this again I'll let you guys know.


 Post subject: Re: Dynamic Bucketing
PostPosted: Fri Feb 07, 2014 6:24 am 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
A note on all this: both of the sampling methods I tried (Average Strategy Sampling and Probing), for whatever reason, would reach a certain point in convergence and then slowly drag the strategy off EQ or prevent it from converging further. This effect was most pronounced with Average Strategy Sampling. I experienced the same thing when using ordinary bucketing methods and eventually switched back to plain External Sampling.

So, all the results that started out great but eventually lost over time were probably due to those problems. That said, it might be worth continuing exploration of EV-based dynamic bucketing.

