Poker-AI.org
http://poker-ai.org/phpbb/

Using Earth Mover Distance with k-means
http://poker-ai.org/phpbb/viewtopic.php?f=24&t=2798
Page 1 of 1

Author:  ScoobySnacks [ Tue Aug 26, 2014 5:20 am ]
Post subject:  Using Earth Mover Distance with k-means

I asked this question in viewtopic.php?f=25&t=2381, but it didn't seem to get any visability so I'll ask it here. A common method it seems to create an abstraction of hold'em is to take all possible hands in a given round (say the flop), and cluster them somehow. The paper discussed in the linked thread is one example of this. For each possible hole-flop combo they compute the distribution of the hand value (that is percentage of possible opponent hands beaten) on the river starting from that flop, and use that distribution as a representation of that hand. The paper then said it used k-means clustering with the earth mover distance (EMD) to cluster the hands. This part was confusing for me because there's no theoretical guarantee that averaging a set of points produces the best centroid for that group for metrics other than the euclidean metric. So is this really what these papers did, or did they use other methods for dealing with this problem (e.g. k-mediods)?

It does seem plausible that using the mean with the EMD should still give decent results, maybe it just works in practice. I went ahead and tried this and grouped preflop hands using this method and it seemed to work ok, but I'm still skeptical that this is what these papers do. Does anyone know more about this?

Author:  nonpareil [ Wed Aug 27, 2014 7:55 am ]
Post subject:  Re: Using Earth Mover Distance with k-means

All I did was take the average. I'm much more practical than theoretical when it comes to these things. I'm always curious about potential improvements though...

Author:  ScoobySnacks [ Wed Aug 27, 2014 4:50 pm ]
Post subject:  Re: Using Earth Mover Distance with k-means

So I asked this question on stack exchange, and someone provided a good example of when the average does not give the best centroid.
http://stats.stackexchange.com/question ... er-metrics

I'm guessing these papers went ahead and just averaged, and it seems to work in practice. It would be interesting though to see if other clustering techniques give better results. My guess is that the other techniques are way too slow when you want to cluster 55mil+ possible turns.

Page 1 of 1 All times are UTC
Powered by phpBB® Forum Software © phpBB Group
http://www.phpbb.com/