Poker-AI.org

Poker AI and Botting Discussion Forum
It is currently Sat Mar 23, 2019 7:36 am

All times are UTC




Post new topic Reply to topic  [ 28 posts ]  Go to page 1, 2  Next
Author Message
PostPosted: Mon Mar 11, 2013 10:44 pm 
Offline
Senior Member

Joined: Mon Mar 11, 2013 10:24 pm
Posts: 216
Evaluating State-Space Abstractions in Extensive-Form Games
by: Michael Johanson and Neil Burch and Richard Valenzano and Michael Bowling

Abstract
Efficient algorithms exist for finding optimal policies in extensive-form games. However, human-scale problems are typically so large that this computation remains infeasible with modern computing resources. State-space abstraction techniques allow for the derivation of a smaller and strategically similar abstract domain, in which an optimal strategy can be computed and then used as a suboptimal strategy in the real domain. In this paper, we consider the task of evaluating the quality of an abstraction, independent of a speci c abstract strategy. In particular, we use a recent metric for abstraction quality and examine imperfect recall abstractions, in which agents "forget" previously observed information to focus the abstraction effort on more recent and relevant state information. We present experimental results in the domain of Texas hold'em poker that validate the use of distribution-aware abstractions over expectation-based approaches, demonstrate that the new metric better predicts tournament performance, and show that abstractions built using imperfect recall outperform those built using perfect recall in terms of both exploitability and one-on-one play.

http://webdocs.cs.ualberta.ca/~johanson/publications/poker/2013-aamas-kmeans-abstraction/2013-aamas-kmeans-abstraction.pdf


Top
 Profile  
 
PostPosted: Wed Mar 13, 2013 8:30 pm 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 438
Anybody want to post an algorithm for the KE and KO metrics?


Top
 Profile  
 
PostPosted: Wed Mar 13, 2013 9:01 pm 
Offline
Senior Member

Joined: Mon Mar 11, 2013 10:24 pm
Posts: 216
I implemented in it Java, but you dont have one single algorithm but always 2 parts: data generation and data clustering.

For KO, you create the 8 ranges and compute the EHS of your hand vs those ranges.
For KE, you iterate over all hands/boards and store the EHS2 of the next street in an array. E.g. on the flop you iterate over all turns, calculate the EHS2 and store it an an array of size say e.g. 101 entries in the index (Math.round(ehs2*100).

Personally, I was using LUT tables for the data generation (they iterate automatically over all unique hand-board-patterns) and implemented a KMeans algorithm, using the initialization of clusters from KMeans++. I can post the KMeans algorithm (in Java) if there's interest. The LUTs I was using from the old forum (also Java).


Top
 Profile  
 
PostPosted: Fri Mar 15, 2013 12:00 am 
Offline
Site Admin
User avatar

Joined: Thu Feb 28, 2013 5:24 pm
Posts: 230
Implementing KMeans is pretty easy and I have some code flying around somewhere for that too but could you post the link to the LUT thread in our Archive that you are using?

_________________
Cheers.


Top
 Profile  
 
PostPosted: Fri Mar 15, 2013 1:25 pm 
Offline
Senior Member

Joined: Mon Mar 11, 2013 10:24 pm
Posts: 216
Here it is: http://poker-ai.org/archive/www.pokerai.org/pf3/viewtopica90d.html?f=3&t=2777

Note that it's a fully abstracting LUT, which is great in most cases as it's small, but you cannot distinguish for hands {Ks,Qs,Th,4d} if the flop was {Qs,Th,4d}, {Ks,Th,4d}, {Ks,Qs,4d}, or {Ks,Qs,Th}.


Last edited by Coffee4tw on Fri Mar 15, 2013 11:46 pm, edited 1 time in total.
Changed link to archive location. Please make sure to link to the archive instead of pokerai.org


Top
 Profile  
 
PostPosted: Fri Mar 15, 2013 1:29 pm 
Offline
Senior Member

Joined: Mon Mar 11, 2013 10:24 pm
Posts: 216
If you guys are implementing the approach using LUTs: note that the amount of data that is generated is huge (50M river situations), thus you will get a memory issue if you use larger partitions and floats/doubles. I coped with it by converting the EHS values to bytes (e.g. EHS 0,538752 --> (byte) 54), which introduces a tiny bias but its clustering anyway... Just leave the cluster centers/distances as floats, otherwise it will screw your results.


Top
 Profile  
 
PostPosted: Thu Apr 18, 2013 1:47 am 
Offline
Site Admin
User avatar

Joined: Thu Feb 28, 2013 5:24 pm
Posts: 230
I only just read this paper and it sounds like Imperfect Recall + Earth Mover's Distance is the current state of the art. Very interesting.

_________________
Cheers.


Top
 Profile  
 
PostPosted: Thu Apr 18, 2013 5:51 am 
Offline
Junior Member

Joined: Mon Apr 08, 2013 1:13 pm
Posts: 15
Why do they call it imperfect recall? It looks like "no recall" to me.
Wish they also tried the in-between case. Like you have N preflop buckets, but when you move to bucketing on flop you assume you have M<N buckets on preflop.


Top
 Profile  
 
PostPosted: Thu Apr 18, 2013 10:57 pm 
Offline
Site Admin
User avatar

Joined: Thu Feb 28, 2013 5:24 pm
Posts: 230
You can design imperfect recall however you want and they experimented with the in-between cases as well in previous papers. I think the main reason for not exploring it in this paper is that they wanted to compare the two extremes and given that it takes a while to calculate strategies and then run games between the different versions, they probably wanted to stick to a certain time frame. There is significantly more effort required too in designing an abstraction the way you mention it.

There are millions of possibilities that could be tested and each one might or might not be the answer. That is mentioned as well in the paper, currently the only way to know if an abstraction is better than another is to create it, create a strategy and pit it against other ones. That's what people have been doing for years.

_________________
Cheers.


Top
 Profile  
 
PostPosted: Thu Apr 18, 2013 11:42 pm 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 438
Couldn't you integrate such bucketing "choices" into the game itself and solve it along with the EQ? i.e. Dynamic Bucketing. The methods I came up with in the other thread didn't perform that great (yet), but maybe somebody could come up with a better idea of coupling the convergence to the bucketing.


Top
 Profile  
 
PostPosted: Fri Apr 19, 2013 4:39 pm 
Offline
Senior Member

Joined: Mon Mar 11, 2013 10:24 pm
Posts: 216
Coffee4tw wrote:
I only just read this paper and it sounds like Imperfect Recall + Earth Mover's Distance is the current state of the art. Very interesting.


Keep in mind that its for FL, I guess it changes if you play PL/NL...


Top
 Profile  
 
PostPosted: Mon Jun 03, 2013 10:17 pm 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 438
I'm implementing this now to see how it compares with my current bucketing methods. I'm not using earth movers as the distance metric, though. And, I'm using vanilla KNN to do the clustering.

Questions for others that have done this:
- How big of a histogram did you use? Currently I'm testing histograms with 100 buckets on the flop, and 50 on the turn. Is there an optimal size? I would think the bigger the better, because it offers more detail, but I could be completely wrong.
- Has anyone experimented with other clustering techniques? Such as: Cobweb, Gaussian Mixture Models, SOMs, or Decision Trees?

Edit: I'm testing SOMs now for the turn clustering, KNN is taking a long time (a week running and it's still not finished).


Last edited by cantina on Thu Jun 06, 2013 5:49 pm, edited 1 time in total.

Top
 Profile  
 
PostPosted: Mon Jun 03, 2013 10:24 pm 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 438
proud2bBot wrote:
For KE, you iterate over all hands/boards and store the EHS2 of the next street in an array. E.g. on the flop you iterate over all turns, calculate the EHS2 and store it an an array of size say e.g. 101 entries in the index (Math.round(ehs2*100).

The way I read it: say, on the flop, you roll out all possible boards to the river, then store in a histogram the hand strength of each one of those hands. Maybe this comes out to the same result...


Top
 Profile  
 
PostPosted: Wed Jun 12, 2013 5:10 pm 
Offline
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 438
I tried KE clustering using Self-Organizing Maps (SOMs) on the flop/turn and got results comparable to my current bucketing methods for HUNL. I'm not sure if a different size histogram or additional/better training with the SOMs would produce better results. I will say that SOMs seemed to be much faster than Kmeans, completing an iteration in about 2.5 hours for the flop histograms.


Top
 Profile  
 
PostPosted: Tue Jul 02, 2013 2:21 pm 
Offline
Regular Member
User avatar

Joined: Sat May 25, 2013 7:36 am
Posts: 73
I am not entirely sure I understand the calculation of earth mover's distance (EMD). Consider the attached histograms A, B, C and D; each consisting of three bins. Omitted values are either 0% or 100%. Let EMD(H1, H2) denote earth mover's distance between two histograms H1 and H2.

EMD(A,B) = 1, because we have to take 100% of "earth" from "pile" bin1 and move it to "pile" bin2.
EMD(B,C) = 1, for the same reasoning
EMD(A,C) = X, where reasonable values for X were
X=1 (same reasoning as above) and
X=2 (move 100% from bin1 to bin2 and carry on to bin3) - as in my understanding so far

EMD(A,D) = 0.9 (0.5 from bin1 to bin 2 and 0.2 from bin1 over bin2 to bin 3). Is that correct?


Attachments:
File comment: Sample histograms
histo.jpg
histo.jpg [ 33.06 KiB | Viewed 64447 times ]
Top
 Profile  
 
PostPosted: Wed Jul 24, 2013 10:09 am 
Offline
Junior Member

Joined: Mon Apr 08, 2013 1:13 pm
Posts: 15
2Nose:
Yes, I believe your understanding is correct. EMD(A,C)=2 and EMD(A,D)=0.9. The wikipedia article that has the algorithm for 1-D EMD calculation confirms this.

2All:
The article says that the clusterization problem is big, that there are 2,428,287,420 canonical combinations on river. But I believe for distance metrics they mention they can merge some of the canonical combinations. For example, on turn EMD should not care which of the four cards is the one that appeared last. So the following two boards

AcKcQcJc
AcKcJcQc

can IMHO be treated as identical (for those distance metrics).


Top
 Profile  
 
PostPosted: Wed Aug 07, 2013 11:29 am 
Offline
Junior Member

Joined: Mon Apr 08, 2013 1:13 pm
Posts: 15
I am thinking about implementing k-means based bucketing.

The tricky part seems to be the search of the nearest centroid. If it is done in a straightforward fashion the time complexity will be O(NK), where N - number of points (card/board combinations), K - number of clusters. The obvious way to speed up nearest-neighbour search is to use k-D trees. It should work fine for everything except the earth-movers distance. The article has about 50 histogram bars, which would give us a 50-dimensional space. And k-D trees don't work that well in high-dimensional spaces, they basicly don't work.

UoA say they used the triangle optimization inequality, but it mostly avoids distance calculations, the number of operations remains O(NK) - that is if I haven't missed anything.

The worst case scenario for turn (earth movers makes no sense on river) is:

~10^7 hand/board combinations to be clustered
~10^5 buckets

giving the time complexity of a SINGLE iteration of about 10^12.

Anybody has an idea how to speed this up?


Top
 Profile  
 
PostPosted: Fri Aug 09, 2013 8:11 pm 
Offline
Junior Member

Joined: Wed Jul 17, 2013 10:52 pm
Posts: 21
10^5 buckets is overkill imo. I got good results with the Hamerly implementation for k-means clustering that you can read about here: http://www.siam.org/proceedings/datamin ... merlyg.pdf. It uses triangle inequality optimizations without requiring a ton of additional memory.

I ran it over 10^7 hand/board combinations with 5000 buckets and (from what I remember) it took about 12-24 hours on a typical desktop CPU until it was swapping only ~0.1% of the observations into different centers per iteration, which was after like 100 iterations I think. I got similar performance with OCHS (8 dimensions) and earth mover's (with 50 histograms/dimensions).

In practice, the number of buckets seems like the biggest factor on time. For comparison, if I pick a tiny number of buckets, like 10, then the Hamerly algorithm can converge in a couple minutes even with the 10^7 observations.


Top
 Profile  
 
PostPosted: Tue Aug 13, 2013 11:32 am 
Offline
Junior Member

Joined: Mon Apr 08, 2013 1:13 pm
Posts: 15
Hi, thanks for the reply. So performance issues with a high number of buckets aren't imaginary.

Do you have an OCHS lookup table at hand? Did you calculate exact strengths against every cluster or did you use monte-carlo sampling? When I try to calculate OCHS values for the 4s4h hand I get numbers that are slightly different from the ones in the paper. My numbers are:
0.706792 0.637735 0.51629 0.62033 0.506075 0.570617 0.496239 0.189106

I'd appreciate if you could tell whether you get the same values for 4h4s or not.


Top
 Profile  
 
PostPosted: Wed Aug 14, 2013 2:24 pm 
Offline
Junior Member

Joined: Wed Jul 17, 2013 10:52 pm
Posts: 21
I got exact values and they match what you got for 44. I agree the values given in the paper (figure 3) are wrong. Just to be extra sure they made the mistake, you can quickly check with pro poker tools to see that 44 vs 88+ has EV of 0.1891 and not 0.180 like they have shown in their graphic (link).


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 28 posts ]  Go to page 1, 2  Next

All times are UTC


Who is online

Users browsing this forum: No registered users and 2 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
cron
Powered by phpBB® Forum Software © phpBB Group