Poker-AI.org
http://poker-ai.org/phpbb/

Payoff matrix representation
http://poker-ai.org/phpbb/viewtopic.php?f=24&t=2659
Page 1 of 1

Author:  ConvexPolytope [ Sun Dec 08, 2013 6:54 pm ]
Post subject:  Payoff matrix representation

Hello,

I have been looking at gradient-based methods for computing epsilon-equilibria (from Carnegie Mellon University). My question is not directly related to those methods. In one of their papers, they talk about Rhode Island Hold'em and the efficient representation of its payoff matrix in RAM (of course the concept extends to full-sized Hold'em as well).

Consider section 4.1 "Customizing the Algorithm for Poker Games --> Addressing the space requirements" in this paper: https://www.cs.cmu.edu/~gilpin/papers/egt.wine07.pdf

They mention that the payoff matrix for Rhode Island Hold'em would normally be 883,741 x 883,741. How do they arrive at that number? I get a much higher number:

1) the payoff matrix is indexed by the sequences for each player
2) every player has 10 possible sequences in each round (3 rounds total)
3) 7 of these end with a call and therefore lead to the next round
4) 52 possibilities to deal the hole card, 51 possibilities to deal the flop, 50 possibilities to deal the turn

For the total number of sequences for one player I get:
( 52 cards * 10 sequences * 7 continuing leaves)
* ( 51 cards * 10 sequences * 7 continuing leaves)
* ( 50 cards * 10 sequences)
= 6.4974 * 10^9
>> 883,741

I assume they use card isomorphism, but I haven't been able to figure out where that exact number comes from. Can someone shed some light on this?

Regarding the idea of representing the payoff matrix as many smaller matrices: what exactly is the structure of these matrices?

1) "The matrices F_i correspond to sequences of moves in round i that end with a fold"
The dimension of F_i is 10x10, which is exactly the number of sequences per round for each player. So I assume the rows and columns are indexed by the sequences of the row and column player respectively, and that its elements are 1 if playing the row and column sequences leads to a fold, and 0 otherwise.

2) "The matrices B_i encode the betting structures in round i, while W encodes the win/lose/draw information determined by poker hand ranks."
Here I'm lost. B_1 has dimensions of 13x13, B_2 has dimensions of 205x205 and B_3 has dimensions of 1774x1774.
B_1 reminds me of the 13 different card ranks, but what does this have to do with the betting structure? I can't figure out what the contents of this matrix should be such that the Kronecker Product of B_1 and F_1 results in something meaningful. Maybe my interpretation of F_i is wrong in the first place.

Lastly, I would expect that the Kronecker product of the smaller matrices as described in the paper would result in the payoff matrix with dimension 883,741 x 883,741. However:
dim(A_1) = dim(F_1 * B_1) = 130 x 130
dim(A_2) = dim(F_2 * B_2) = 2050 x 2050
dim(A_3) = dim(F_3 * B_3) = 17740 x 17740

(the * denotes the Kronecker Product)

Constructing the matrix A from the A_i matrices as described in the paper yields a matrix with dimensions 19920 x 19920. Why is this? I would expect it to have the dimension of the original payoff matrix.
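Just to double-check the dimension bookkeeping, here is a tiny numpy sketch (it only reproduces the arithmetic above with placeholder matrices, not the paper's actual construction):

Code:
import numpy as np

# The Kronecker product of an (m x m) and an (n x n) matrix is (m*n) x (m*n),
# so with 10x10 fold matrices the three blocks are 130, 2050 and 17740 rows,
# i.e. 19920 in total, nowhere near 883,741.
for seqs, cards in [(10, 13), (10, 205), (10, 1774)]:
    F = np.zeros((seqs, seqs))    # placeholder fold matrix
    B = np.zeros((cards, cards))  # placeholder card/betting matrix
    print(np.kron(F, B).shape)    # (130, 130), (2050, 2050), (17740, 17740)

print(130 + 2050 + 17740)         # 19920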

I'm very interested in a discussion about these questions.

Here is the game tree of a round of Rhode Island Hold'em for reference:

[Image: game tree of one betting round of Rhode Island Hold'em]

Author:  spears [ Mon Dec 09, 2013 12:38 pm ]
Post subject:  Re: Payoff matrix representation

I'm in awe of anyone who can understand any of this stuff.

There are 20 sequences in each round, so that is 10 per player.

13 is the total number of ways a round can terminate: 7 calls and 6 folds. Sorry, I can't see where 205 and 1774 come from.

Maybe he's talking about non-zero terms:
19920 = 130 + 2050 + 17740

Author:  amago [ Tue Dec 10, 2013 6:40 pm ]
Post subject:  Re: Payoff matrix representation

I believe the 883,741 figure is not correct. The next page in the same paper states that the lossless abstraction size (obtained by the automatic GameShrink algorithm) is 1,237,238.

There are 6 sequences ending with a fold in a Rhode Island poker round of betting.
{crf, crrf, crrrf, rf, rrf, rrrf}
There are 7 sequences ending with a call in a RI poker round of betting.
{cc, crc, crrc, crrrc, rc, rrc, rrrc}
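These two sets can be reproduced with a tiny enumeration (a sketch assuming a round opens with a check or a bet and allows at most 3 bets/raises):

Code:
def betting_sequences(cap=3):
    # Terminal betting sequences of one round with at most `cap` bets/raises.
    seqs = []

    def facing_bet(seq, raises):
        seqs.append(seq + "f")                 # fold ends the round
        seqs.append(seq + "c")                 # call ends the round
        if raises < cap:
            facing_bet(seq + "r", raises + 1)  # raise again

    seqs.append("cc")                          # check, check
    facing_bet("cr", 1)                        # check, bet, ...
    facing_bet("r", 1)                         # bet, ...
    return seqs

seqs = betting_sequences()
print([s for s in seqs if s.endswith("f")])    # the 6 fold endings
print([s for s in seqs if s.endswith("c")])    # the 7 call endings (incl. cc)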

1 card canonical forms = 13
2 card canonical forms = 169
3 card canonical forms = 1755 (is this correct?)

The matrices F_i are then of size 6x6 (sequences ending with a fold).
The matrices B_i are 13x13, 169x169 and 1755x1755.

Then matrix A_1 is F_1 x B_1, of size 78x78.

Matrix A_2 is F_2 x B_2, of size 1014x1014. As there are 7 ways to reach the flop, the actual A_2 matrix is 7,098x7,098.

Round 3 can end with a fold or a showdown, so there are 13 possible sequences and 1755 card canonical forms. A_3 is then of size 22,815x22,815. As there are 49 ways to reach the turn, the actual A_3 matrix is 1,117,935x1,117,935.

A contains A_1, A_2 and A_3 and has 1,117,935 + 7,098 + 78 = 1,125,111 rows.

So, I think the RI lossless abstraction payoff matrix size is 1,125,111 x 1,125,111. This figure is quite close to the size indicated in Table 1 of said paper. I suppose the difference can be due to one of the following:

- I have messed up the number of sequences or canonical forms in the analysis above,
- I did not correctly understand the structure of the payoff matrix, or
- the GameShrink algorithm does not obtain the smallest possible lossless abstraction.

Author:  spears [ Tue Dec 10, 2013 10:59 pm ]
Post subject:  Re: Payoff matrix representation

Quote:
1 card canonical forms = 13
2 card canonical forms = 169
3 card canonical forms = 1755 (is this correct?)

1755 is the correct number of 3-card canonical forms, but I find it hard to believe Gilpin got this and the 169 figure wrong.

Author:  amago [ Wed Dec 11, 2013 7:05 pm ]
Post subject:  Re: Payoff matrix representation

Gilpin does not calculate card isomorphisms. They used an automatic algorithm (GameShrink) that can obtain both lossless and lossy abstractions. They did not reduce the tree to the canonical elements by hand. The following reference describes the algorithm:

Gilpin, A., Sandholm, T., 2007. Lossless abstraction of imperfect information games. Journal of the ACM.

The above reference states that the unabstracted Rhode Island Poker payoff matrix size is
91,224,226 x 91,224,226

How do we get this number?

2 (an empty sequence for each player)
52*14 = 728
52*51*14*7 = 259,896
52*51*50*14*7*7 = 90,963,600

Which adds up to 91,224,226 rows.

The paper referenced in the opening post states an RI reduced tree of 1,237,238 rows (Table 1), which is consistent with:
2 + 13*14 + 205*14*7 + 1774*14*49.

A betting round with 4 bets allowed, as in LHE, instead of 3, as in RI, would have 14 sequences for each player. But then there would be 9 continuing sequences to the flop and 81 to the river instead of 7 and 49. I do not understand why 14 is used in the formulas above instead of 10.

I think that the actual unabstracted RI poker payoff matrix size should be:
2
52*10 = 520
52*51*10*7 = 185,640
52*51*50*10*7*7 = 64,974,000

65,160,162 x 65,160,162

And, correcting my post above, the payoff matrix size after reduction to canonical card forms would be:
2
13*10 = 130
169*10*7 = 11,830
1755*10*7*7 = 859,950

871,912 x 871,912
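All of the sums above in one place, if anyone wants to check them:

Code:
# Row counts: 2 empty sequences, plus (card classes x sequences) per round,
# multiplied by the number of ways of continuing into that round (1, 7, 49).
def rows(c1, c2, c3, seqs, cont):
    return 2 + c1*seqs + c2*seqs*cont + c3*seqs*cont**2

print(rows(52, 52*51, 52*51*50, 14, 7))  # 91,224,226 (JACM figure, 14 sequences)
print(rows(52, 52*51, 52*51*50, 10, 7))  # 65,160,162 (with 10 sequences)
print(rows(13, 169, 1755, 10, 7))        # 871,912    (canonical card forms)
print(rows(13, 205, 1774, 14, 7))        # 1,237,238  (GameShrink sizes from Table 1)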

The following paper contains, as an example, the payoff matrix for the KQJ game in sequence form [section 5.2, page 29/43], along with a very good explanation of the sequence form.

Koller, D., Pfeffer, A., 1996. Representations and solutions for game-theoretic problems.

Author:  ConvexPolytope [ Fri Dec 13, 2013 4:27 am ]
Post subject:  Re: Payoff matrix representation

Thanks a lot for the interesting posts.

Quote:
The paper referenced in the opening post states an RI reduced tree of 1,237,238 rows (Table 1), which is consistent with:
2 + 13*14 + 205*14*7 + 1774*14*49.


Where do the 205 and 1774 come from? I know they use them in the paper, but how do they arrive at those numbers?

Because after reading your post, I realized how they arrive at the 883,741 number:

1 +
13 * 10 +
205 * 10 * 7 +
1774 * 10 * 49
= 883,741

It does make sense to add 1 for the empty sequence, because the x and y directions in the matrix encode the sequences of one player each. So if we add 1 empty sequence for each player, we have to add 1 row and 1 column.
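A quick check of that sum (with the single empty sequence as the 1):

Code:
print(1 + 13*10 + 205*10*7 + 1774*10*49)  # 883741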

I believe that the numbers they use have to make sense somehow. Maybe we can't use canonical form because we have to account for flushes?

edit: OK, after rereading your post, I think you're saying that GameShrink outputs some bigger numbers and that's where the 205 and 1774 come from, but that they would be better off using canonical forms? Which then leads to:

Quote:
And, correcting my post above, the payoff matrix size after reduction to canonical card forms would be:
2
13*10 = 130
169*10*7 = 11,830
1755*10*7*7 = 859,950

871,912 x 871,912

Author:  amago [ Fri Dec 13, 2013 6:04 pm ]
Post subject:  Re: Payoff matrix representation

It would be a small improvement, reducing the game from 883,741 to 871,911 rows. You could go further and remove dominated strategies (folding an ace in the first round, for example) from the sequence form and the payoff matrix. The problem is that it would require hand-tuning each game and each abstraction, while GameShrink is automatic and applicable to many games, from LHE to RI.

Author:  ConvexPolytope [ Sat Dec 14, 2013 12:10 am ]
Post subject:  Re: Payoff matrix representation

I believe we have some flawed thinking here. There are more than 169 strategically different preflop/flop combinations. We have to account for the cases where the hole card and the flop card are of the same suit, because this constitutes a flush draw, which requires a different strategy.

13 * 13 +
13 * 12
= 325

So we have 325 different cases. I wonder how they got to 205 from that number.

Author:  spears [ Sat Dec 14, 2013 11:43 am ]
Post subject:  Re: Payoff matrix representation

Quote:
The matrices B_i encode the betting structures in round i


This doesn't sound like a consideration of the cards to me. Note that 205/13 is roughly twice 1774/205, which makes me wonder if the increase in bet size from 10 to 20 plays a part.

All this should come clear if you were to make a start on constructing these matrices.

Would you care to tell us why you prefer this approach to CFRM?

Author:  amago [ Sat Dec 14, 2013 12:08 pm ]
Post subject:  Re: Payoff matrix representation

Spears, I agree it's weird that they call it "betting structure", but in the reference I posted above they construct matrices for KQJ and other games, and that is what they do.

About the suits: 169 takes suits into account. 13 pairs, 13*12/2 suited unpaired and 13*12/2 offsuit unpaired hands.

But it would not take into account the order of the cards and would not differentiate between private and public cards.

This can't be right. JcJd2c is very different if your card is the Jc or the 2c, so you don't want to reduce to canonical form.

Then, to arrive at those numbers, they must have eliminated all of the dominated strategies.
About why to use this instead of CFRM... I suppose the question is for ConvexPolytope, but I think there are many possible advantages for some kinds of problems:
- much faster convergence
- can obtain an exact equilibrium of small games
- smaller memory needs for very large games
- more efficient parallelization

As I am an absolute beginner, the best I can say is that these are potential benefits.

Author:  ConvexPolytope [ Sat Dec 14, 2013 12:52 pm ]
Post subject:  Re: Payoff matrix representation

Quote:
This doesn't sound like a consideration of the cards to me. Note that 205/13 is roughly twice 1774/205, which makes me wonder if the increase in bet size from 10 to 20 plays a part.


Yes, I think you may be right.

Quote:
All this should come clear if you were to make a start on constructing these matrices.


Yes. I did play around a bit with Excel and Kuhn Poker, but no results yet.


Quote:
Would you care to tell us why you prefer this approach to CFRM?


It may or may not be better than CFRM. As far as I know, nobody is using it or has done a comparison, so there is no way to know.

Some reasons I like the Carnegie Mellon approach:
- The time complexity for convergence is known as a function of epsilon. No such bound is known for CFRM.
- The algorithm is perfectly suited for parallel computation. This is because the most expensive operations in each iteration are 3 Matrix-Vector products.
- It is not necessary to store a game tree in memory. It is sufficient to store the payoff matrix, which can be stored very efficiently.

The last 2 points make me believe that an optimized implementation could be way faster than CFRM. Computing matrix-vector products in a highly parallel fashion is something GPUs do best; the speed-up is tremendous compared to CPU implementations. In the paper they state that a Texas Hold'em abstraction required only 2.49 GB of memory. This means that we can fit the complete payoff matrix in the memory of a modern GPU. This should again be a significant speedup, because no RAM access is necessary.

I do believe that these methods have a lot of potential. It would be interesting to compare CFRM to the Carnegie Mellon algorithms (they actually published 2 different algorithms).

Author:  spears [ Sat Dec 14, 2013 5:16 pm ]
Post subject:  Re: Payoff matrix representation

Quote:
Quote:
All this should come clear if you were to make a start on constructing these matrices.

Yes. I did play around a bit with Excel and Kuhn Poker, but no results yet.

I just meant the algebra step.

Quote:
- The algorithm is perfectly suited for parallel computation. This is because the most expensive operations in each iteration are 3 Matrix-Vector products.

Interesting - I'd missed that bit.

Quote:
required only 2.49 GB of memory. This means that we can fit the complete payoff matrix in the memory of a modern GPU.

Wow. I never imagined GPUs had so much memory.

Author:  amago [ Wed Dec 18, 2013 3:05 pm ]
Post subject:  Re: Payoff matrix representation

Quote:
All this should come clear if you were to make a start on constructing these matrices.

KQJ game
· There is a 3-card deck, containing a K, a Q and a J.
· Players ante $1 and are dealt one card each without replacement.
· There is a round of betting with $1 left to be bet after which there is a showdown (if neither player folds). The high card wins.

There is only one round, so the matrix is calculated as F_1 x B_1 + S_1 x W_1 (x is the Kronecker product).

sequences for each player:

(1) c, b, cbf, cbc.
(2) cc, cb, bc, bf.

F_1 4x4 matrix
B_1 3x3 matrix

F_1 MATRIX
-- c, b,cbf,cbc.
cc 0, 0, 0, 0
cb 0, 0, 1, 0
bc 0, 0, 0, 0
bf 0,-1, 0, 0

B_1 MATRIX
- J, Q, K
J 0, 1, 1
Q 1, 0, 1
K 1, 1, 0

S_1 MATRIX
-- c, b,cbf,cbc.
cc 1, 0, 0, 0
cb 0, 0, 0, 2
bc 0, 2, 0, 0
bf 0, 0, 0, 0

W_1 MATRIX
- J, Q, K
J 0,-1,-1
Q 1, 0,-1
K 1, 1, 0

Add a row and column for empty sequence and you have the payoff matrix of the sequence form:

Code:
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
0, 0,-1,-1, 0, 0, 0, 0, 0, 0, 0, 0, 0
0, 1, 0,-1, 0, 0, 0, 0, 0, 0, 0, 0, 0
0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0,-2,-2
0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 2, 0,-2
0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 2, 2, 0
0, 0, 0, 0, 0,-2,-2, 0, 0, 0, 0, 0, 0
0, 0, 0, 0, 2, 0,-2, 0, 0, 0, 0, 0, 0
0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0
0, 0, 0, 0, 0,-1,-1, 0, 0, 0, 0, 0, 0
0, 0, 0, 0,-1, 0,-1, 0, 0, 0, 0, 0, 0
0, 0, 0, 0,-1,-1, 0, 0, 0, 0, 0, 0, 0
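If anyone wants to verify this, here is a minimal numpy sketch (my own reconstruction: it just takes the four matrices exactly as listed above, forms F_1 x B_1 + S_1 x W_1 with Kronecker products, and pads with the empty-sequence row and column):

Code:
import numpy as np

# Rows of F_1/S_1 are player 2's sequences {cc, cb, bc, bf}; columns are
# player 1's sequences {c, b, cbf, cbc}. B_1/W_1 are indexed by the two
# players' cards (J, Q, K).
F1 = np.array([[ 0, 0, 0, 0],
               [ 0, 0, 1, 0],
               [ 0, 0, 0, 0],
               [ 0,-1, 0, 0]])
S1 = np.array([[ 1, 0, 0, 0],
               [ 0, 0, 0, 2],
               [ 0, 2, 0, 0],
               [ 0, 0, 0, 0]])
B1 = np.array([[ 0, 1, 1],
               [ 1, 0, 1],
               [ 1, 1, 0]])
W1 = np.array([[ 0,-1,-1],
               [ 1, 0,-1],
               [ 1, 1, 0]])

core = np.kron(F1, B1) + np.kron(S1, W1)   # 12 x 12

A = np.zeros((13, 13), dtype=int)          # add the empty-sequence row and column
A[1:, 1:] = core
print(A)                                   # reproduces the 13 x 13 matrix above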

Author:  ConvexPolytope [ Thu Dec 19, 2013 12:37 am ]
Post subject:  Re: Payoff matrix representation

I did something similar with Kuhn Poker. This interpretation works for 1 round games but I wasn't able to extend it to multi-round games. I tried to write down an example, but the matrices get too big even with very small toy games and it turned into a big mess that nobody would want to read.

The problem is that at showdown, the payoff for the winning hand depends on the betting sequences of all previous rounds. This information is not included in the matrices if we construct them like above. Here's why:

The F_i matrices correspond to the folds in each round, and the S matrix corresponds to the showdown sequences in the last round. So S only gives a payoff for the last round, regardless of the betting that took place in earlier rounds.

If we construct the B_i and W matrices like above, they encode information about card sequences. That's redundant, because this information is only required in the W matrix in order to determine the winner at showdown. What's missing at this point is the information about how different betting sequences in earlier rounds change the payoff at showdown. This needs to be encoded in the B_i matrices somehow. In the paper it even says "The matrices B_i encode the betting structures in round i, while W encodes the win/lose/draw information determined by poker hand ranks."

It makes perfect sense that this information is required. But how can we encode it in the B_i matrices with the dimensions as described in the paper?

The weird thing is that the dimensions of W and the B_i in the last round clearly have to be the same, because otherwise the matrix addition wouldn't work. W encodes which hands win and which hands lose. B_i is supposed to encode the betting structure, which doesn't have anything to do with showdown hand values. Yet, they have the same dimension.

---
On a different note, every entry in the (complete) payoff matrix is multiplied by the probability that the corresponding sequence occurs. So this needs to be done in one of the smaller matrices as well.

Author:  amago [ Thu Dec 19, 2013 10:15 pm ]
Post subject:  Re: Payoff matrix representation

ConvexPolytope wrote:
The problem is that at showdown, the payoff for the winning hand depends on the betting sequences of all previous rounds. This information is not included in the matrices if we construct them like above. Here's why:

The F_i matrices correspond to the folds in each round, and the S matrix corresponds to the showdown sequences in the last round. So S only gives a payoff for the last round, regardless of the betting that took place in earlier rounds.

Remember how we explained the size of the RI poker payoff matrix: 1 + 13x10 + 205x10x7 + ...

That seven comes from the 7 possible sequences ending with a call before the flop. For example, s={crrc} ends with two antes of 10 and 4 bets of 10 in the pot; this is a 60-chip pot. So each of the 7 matrices of the flop and the 49 matrices of the turn encodes different sequences from the previous rounds. You will construct 49 different S and W matrices.

Yes, the payoff matrix gets really big! But take into consideration how sparse the payoff matrix is! EGT as Gilpin proposes it takes advantage of the matrices being that sparse to construct an algorithm that is memory efficient. There is a lot of literature out there about Sparse Matrix-Vector Multiplication, or SpMV. (Again, I'm not an experienced programmer, so I don't have a lot to say on the topic.)
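Besides the sparsity, the Kronecker structure itself means the big matrix never has to be formed: with X being x reshaped into a small matrix, the standard identity (F x B)*vec(X) = vec(F*X*B^T) lets you multiply using only the small factors. A quick numpy check of the idea (my own illustration with arbitrary contents, not Gilpin's code):

Code:
import numpy as np

rng = np.random.default_rng(0)
F = rng.random((6, 6))        # e.g. a fold matrix
B = rng.random((169, 169))    # e.g. a card matrix
x = rng.random(6 * 169)

explicit = np.kron(F, B) @ x                      # forms the 1014 x 1014 block
implicit = (F @ x.reshape(6, 169) @ B.T).ravel()  # same product, kron never formed

print(np.allclose(explicit, implicit))            # True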

By the way. What is the plural of matrix? matrix, matrixes or matrices? (english is not my mother language)
ConvexPolytope also wrote:
On a different note, every entry in the (complete) payoff matrix is multiplied with the probability that the corresponding sequence occurs. So this needs to be done in one of the smaller matrices as well.

(Relative) probabilities are contained in the B_i matrices. For example:

The B_i entry for (AA,AA) in Hold'em would contain a 6, and (AKo,K3s) would contain a 36, accounting for the different combinations of each pair of hands.
The B_i entry for (J,J) in the KQJ game contains a 0, because without replacement the probability of both players holding a jack is exactly 0.

Author:  ConvexPolytope [ Sat Dec 21, 2013 10:28 am ]
Post subject:  Re: Payoff matrix representation

Another insightful post, thanks :)

Quote:
Remember how we explained the size of the RI poker payoff matrix: 1 + 13x10 + 205x10x7 + ...

That seven comes from the 7 possible sequences ending with a call before the flop. For example, s={crrc} ends with two antes of 10 and 4 bets of 10 in the pot; this is a 60-chip pot. So each of the 7 matrices of the flop and the 49 matrices of the turn encodes different sequences from the previous rounds. You will construct 49 different S and W matrices.


Ah, I see what you're getting at. I think it's enough to keep 1 W matrix and 1 of each B_i matrix though, because these only encode the cards. Different betting sequences don't change the winning or losing hands. We do need several F_i and S matrices (which is not as bad, because they are very small).

Preflop: F_1
Flop: F_2.1, F_2.2, ..., F_2.7
Turn: F_3.1, F_3.2, ..., F_3.49 and S_1, S_2, ..., S_49

Would you agree?

Quote:
Yes, the payoff matrix gets really big! But take into consideration how sparse the payoff matrix is! EGT as Gilpin proposes it takes advantage of the matrices being that sparse to construct an algorithm that is memory efficient. There is a lot of literature out there about Sparse Matrix-Vector Multiplication, or SpMV. (Again, I'm not an experienced programmer, so I don't have a lot to say on the topic.)


In this paper they talk a bit about the implementation of Tartanian. They actually generate the C++ code for the matrix-vector product of a particular game abstraction automatically, from XML files that describe the betting sequences. That way all the C++ code does is crunch numbers; it doesn't have to traverse a game tree or care about the game logic or rules.

Quote:
By the way. What is the plural of matrix? matrix, matrixes or matrices?


I looked it up, it's matrices (English is not my mother tongue either)

Quote:
(Relative) probabilities are contained in the B_i matrices. For example:

The B_i entry for (AA,AA) in Hold'em would contain a 6, and (AKo,K3s) would contain a 36, accounting for the different combinations of each pair of hands.
The B_i entry for (J,J) in the KQJ game contains a 0, because without replacement the probability of both players holding a jack is exactly 0.


Ok, makes sense. So in the W matrix, this number will be negative or positive depending on which hand wins. In the B_i matrices the numbers are always positive, because the winner is determined by the betting sequence (which player folded?) rather than the cards.

Author:  amago [ Sat Dec 21, 2013 3:41 pm ]
Post subject:  Re: Payoff matrix representation

I think you have it. Now I will have to study, or at least revise, a lot of linear algebra if I want to understand Nesterov's algorithm :|

And just to point out: the biggest problem I see with using EGT is how to implement imperfect recall. Otherwise it won't be possible to do good abstractions, at least for No Limit games.

ConvexPolytope wrote:
I think it's enough to keep 1 W matrix and 1 of each B_i matrix though, because these only encode the cards. Different betting sequences don't change the winning or losing hands. We do need several F_i and S matrices (which is not as bad, because they are very small).

Yes, I think you are right and we only need one W matrix and one of each B_i matrix. By the way, even though we have 7 different sequences before the flop, there will be only 4 different F_2 matrices. This is because the F_i matrix changes only with the pot size; sequences {rc} and {crc} would have identical F matrices.
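A quick way to see the 4: count how many bets each player has put in after each call-ending first-round sequence, since the fold payoff in the next round only depends on that amount (a sketch, assuming each matched bet/raise adds one bet from each player):

Code:
sequences = ["cc", "crc", "crrc", "crrrc", "rc", "rrc", "rrrc"]

# Each "r" that gets called puts one more bet from each player into the pot.
bets_per_player = {s: s.count("r") for s in sequences}
print(bets_per_player)                     # {'cc': 0, 'crc': 1, 'crrc': 2, 'crrrc': 3, ...}
print(len(set(bets_per_player.values())))  # 4 distinct pot sizes -> 4 distinct F_2 matrices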

ConvexPolytope wrote:
So in the W matrix, this number will be negative or positive depending on which hand wins. In the B_i matrices the numbers are always positive, because the winner is determined by the betting sequence (which player folded?) rather than the cards.

Yes. The B_i matrices encode card probabilities, so the values will always be positive. The F_i matrix values will be positive when one player folds and negative when it is the other player who folds.

Page 1 of 1 All times are UTC
Powered by phpBB® Forum Software © phpBB Group
http://www.phpbb.com/