nonpareil wrote:
In the paper, the author is describing the CPRG's three-player limit hold'em strategy for the ACPC. However, the fact that the number of public buckets for that strategy also divides the number of buckets that the two-player no-limit strategy used makes me believe that both used an identical public bucketing scheme for the "important" betting tree.
Good observation.
nonpareil wrote:
I'd guess the important betting tree is only like 200-400 betting nodes so they can afford to have that many buckets (e.g. sb raise, bb call; sb raise, bb call, bb check; sb raise, bb call, bb check, sb check; sb limp, bb check; etc.)
I assume by "important" you mean branches of the game tree that are likely to be traversed during gameplay? So the bot is going to play smart against smart opponents and fishy against fishy opponents?
I am not yet convinced this is a good approach. Sure, the nemesis in the abstracted game will be able to extract less value, since he cannot benefit from the bot's strategy in unlikely action sequences. But the nemesis in the unabstracted game might be able to exploit the bot's weaknesses in these branches, so he might be tempted to force the bot into situations where it is weak.
In other words: if you are playing against a human player and the human recognizes that the bot is weak in spots that are unusual or involve unconventional play, he will play in an unconventional way.
nonpareil wrote:
Unfortunately, without a pocket super computer, I'm not sure how to run hundreds of k-means groups when dealing with so many buckets without taking months.
If I understood correctly, you would group some boards and create holecard-board bins within these groups. Since each group consists of only a few boards, it is sufficient to sample only these boards, so each K-Means run will converge much faster than a K-Means run that groups all hole cards on all boards.
I.e., you divide all boards into two groups A and B, each consisting of 50% of all boards, and run K-Means on both groups. Since each group is only half the size of A+B, the algorithm is likely to converge in half the time K-Means requires for grouping all hole cards on all possible boards.
Two runs at half the time each add up to roughly the same total, so I assume this won't make much of a difference.
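A minimal sketch of that cost argument, in plain Python. It assumes a fixed number of Lloyd iterations (real convergence will vary), random 2-D vectors as stand-ins for the holecard-board features, and the same k per group; the function and variable names are made up for illustration and are not from the paper. It just counts point-to-centroid distance evaluations:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means with a fixed iteration count.

    Returns (centroids, labels, distance_evals) so the work can be compared.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as initial centroids
    labels = [0] * len(points)
    evals = 0
    for _ in range(iters):
        # Assignment step: every point is compared against every centroid.
        for i, p in enumerate(points):
            best, best_d = 0, float("inf")
            for c_idx, c in enumerate(centroids):
                d = sum((a - b) ** 2 for a, b in zip(p, c))
                evals += 1
                if d < best_d:
                    best, best_d = c_idx, d
            labels[i] = best
        # Update step: move each centroid to the mean of its members.
        for c_idx in range(k):
            members = [p for p, l in zip(points, labels) if l == c_idx]
            if members:  # an empty cluster keeps its old centroid
                centroids[c_idx] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, labels, evals

# Hypothetical data: 200 feature vectors for board group A, 200 for group B.
rng = random.Random(1)
group_a = [(rng.random(), rng.random()) for _ in range(200)]
group_b = [(rng.random(), rng.random()) for _ in range(200)]
all_points = group_a + group_b

_, _, cost_all = kmeans(all_points, k=10)
_, _, cost_a = kmeans(group_a, k=10)
_, _, cost_b = kmeans(group_b, k=10)

print("one run over A+B:", cost_all)
print("run on A + run on B:", cost_a + cost_b)
```

Under these assumptions the work per run is iters * n * k, so each half costs half as much and the two halves together cost the same as the single run. Note that the split version ends up with 2k buckets in total; any real saving would have to come from using a smaller k per group, converging in fewer iterations, or running the groups in parallel.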