Poker-AI.org

Poker AI and Botting Discussion Forum
It is currently Mon Nov 13, 2023 2:10 pm

All times are UTC




Post new topic Reply to topic  [ 24 posts ]  Go to page 1, 2  Next
Author Message
PostPosted: Tue Apr 23, 2013 12:15 pm 
Offline
Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267
I am trying to come closer to my full game approximate nash equilibrium and since I still have some problems with the algorithm, I am trying to solve Kuhn Poker.

I have first created a tree:

Image

Player Nodes are stored in arrays and every playernode has a pointer to the following array for each action.

I have 2 kinds of terminal nodes: SD nodes and non-SD-nodes, the non SD nodes store the payoffs from player 1's perspective, the SD nodes store the payoff for the player winning the SD, I check who has the better hand at SD.

Now I am not sure, if the payoffs are right I also have a problem with the SD nodes and with the utility calculation.

With my current implementation I just itterate over the arrays and go to the enxt array recursivelly, but this leads to strange scenarios, like A checks, next K bets in the next array and Q calls in the last array before the terminal Node. Do I have to follow one path at a time, like A Checks, K bets or checks, A folds or calls?

I know my question is hard to understand, I just need to know in which order to iterate over the information sets and if the payoffs are right. Unfortunatelly there is very little code for vanilla implementations.


Top
 Profile  
 
PostPosted: Tue Apr 23, 2013 12:21 pm 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
Amax posted some nice code on the old forum at http://poker-ai.org/archive/www.pokerai ... 335#p40335


Top
 Profile  
 
PostPosted: Wed Apr 24, 2013 6:54 pm 
Offline
Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267
Thanks a lot. For anyone who wants do DL the code: the link doesnt work with Chrome, so try IE instead.

I have implemented it and I have following results:

BR0: -0.05499724013389262
BR1: 0.05500378289211444

Player: 0
Q : Check: 0.7958561684157758 Bet: 0.20414383158422414
K : Check: 0.999992507918599 Bet: 7.492081400953363E-6
A : Check: 0.38754625849042007 Bet: 0.61245374150958

Player: 1
Q : Check: 0.6666629177423801 Bet: 0.33333708225762004
K : Check: 0.9999965 Bet: 3.5000000000120003E-6
A : Check: 1.0000000000034287E-6 Bet:0.999999

Player: 1
Q : Fold: 0.9999995 Call: 5.000000000017144E-7
K : Fold: 0.6666551891296969 Call: 0.33334481087030304
A : Fold: 5.000000000017144E-7 Call: 0.9999995



So the Player0 always checks K, sometimes bets Q and A
Player 1 always Checks K after the first Check, always bets A, sometimes bluffs Q

After a bet Player1 always calls A, always folds Q and sometimes bluffcatches the K

did you guys have the same results? BR0 and BR1 are close to 1/18, but even after way more iterations, they dont converge.


Top
 Profile  
 
PostPosted: Wed Apr 24, 2013 8:14 pm 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
http://www.poker-ai.org/archive/www.pok ... 453#p39453

On Chrome, use save link as..


Top
 Profile  
 
PostPosted: Thu Apr 25, 2013 1:41 am 
Offline
Site Admin
User avatar

Joined: Thu Feb 28, 2013 5:24 pm
Posts: 230
HontoNiBaka wrote:
Thanks a lot. For anyone who wants do DL the code: the link doesnt work with Chrome, so try IE instead.

IE ???? You have got to be joking ...

_________________
Cheers.


Top
 Profile  
 
PostPosted: Thu Apr 25, 2013 1:44 am 
Offline
Site Admin
User avatar

Joined: Thu Feb 28, 2013 5:24 pm
Posts: 230
1/18 sounds right. I posted spears' and my solution to Kuhn as well at some point on the old forum. Too lazy to find it right now though...

Edit: spears was faster and actually already linked to that post so see his post above.

_________________
Cheers.


Top
 Profile  
 
PostPosted: Thu Apr 25, 2013 12:54 pm 
Offline
Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267
Thanks. My solution is different than your solutions, but if there is an infinite number of NEs, its to be xpected. At least the exploitability is correct, so I will assume the algorithm works.


Top
 Profile  
 
PostPosted: Thu Apr 25, 2013 6:23 pm 
Offline
Junior Member

Joined: Thu Apr 11, 2013 10:13 pm
Posts: 22
HontoNiBaka wrote:
Thanks. My solution is different than your solutions, but if there is an infinite number of NEs, its to be xpected. At least the exploitability is correct, so I will assume the algorithm works.



Doesn't Von Neumann's minimax theorem prove that for any two-player, zero-sum game there is a single (unique) mixed Nash equilibrium?


Top
 Profile  
 
PostPosted: Thu Apr 25, 2013 10:27 pm 
Offline
Site Admin
User avatar

Joined: Thu Feb 28, 2013 5:24 pm
Posts: 230
I am not familiar with that theorem but Kuhn is proven to have infinite number of equilibria if I recall correctly.

_________________
Cheers.


Top
 Profile  
 
PostPosted: Thu Apr 25, 2013 11:13 pm 
Offline
Junior Member

Joined: Thu Apr 11, 2013 10:13 pm
Posts: 22
Coffee4tw wrote:
I am not familiar with that theorem but Kuhn is proven to have infinite number of equilibria if I recall correctly.


I just looked it up again. The minimax theorem proves that every zero-sum game has a unique value, not a unique mixed strategy Nash equilibrium.


Top
 Profile  
 
PostPosted: Fri Apr 26, 2013 12:34 am 
Offline
Site Admin
User avatar

Joined: Thu Feb 28, 2013 5:24 pm
Posts: 230
That makes sense and is the case with Kuhn, all those equilibria have an EV of 1/18 and -1/18 respective of the player.

_________________
Cheers.


Top
 Profile  
 
PostPosted: Sun Sep 29, 2013 2:23 am 
Offline
Junior Member

Joined: Fri Sep 27, 2013 12:21 am
Posts: 10
Its my first post here so, System.out.println("hello!");

didn't find any Kuhn poker cfr source code in forums, maybe I'm too lazy to use search, so I've build my own implementation. It's in java. It's a bit slow because I wanted to control all aspects of cfr for training purposes. Here is results:

Code:
Iterations:   50000000
P1 expected value:   -0.055548106797223905
card   history     {pass/bet}
1       Profile: [0.79, 0.21]    P1
1   b    Profile: [1.00, 0.00]    P2
1   p    Profile: [0.67, 0.33]    P2
1   pb    Profile: [1.00, 0.00]    P1
2       Profile: [1.00, 0.00]    P1
2   b    Profile: [0.67, 0.33]    P2
2   p    Profile: [1.00, 0.00]    P2
2   pb    Profile: [0.46, 0.54]    P1
3       Profile: [0.37, 0.63]    P1
3   b    Profile: [0.00, 1.00]    P2
3   p    Profile: [0.00, 1.00]    P2
3   pb    Profile: [0.00, 1.00]    P1



This eventually rose some questions. If any one can answer it'll be great, if not it's ok too.
1) for real even HU NL tourney game tree is too big, so It shall be divided as in imperfect recall. Even so its too big for my PCs. So it's _really_ worth trying cfr? cause it'll take months to compute. Maybe you can point out some better algorithms.
2) As in imperfect recall, for example, 1 game tree shall be calculated for preflop, X for flop, Y for turn, Z for river. Giving 1+X+Y+Z LUTs. there each LUT is a consequence of (opponent model, some previous actions, day on week, etc) Am I correct so far?
3) How does cfr implements opponent model?
ty for answers.

PS if any one want to cooperate on writing poker bot, feel free to PM me. I'm total beginner in this.


Attachments:
Kuhn_cfr.rar [4.14 KiB]
Downloaded 605 times
Top
 Profile  
 
PostPosted: Mon Mar 17, 2014 12:34 am 
Offline
New Member

Joined: Tue Mar 11, 2014 12:40 pm
Posts: 4
Anyone implemented it on C#?

Im having issues using sortedlist/ditionary/...etc .... if someone did feel free to share ^^

thanks


Top
 Profile  
 
PostPosted: Mon Mar 17, 2014 9:24 am 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
Try to put a tiny little of work in and look at the code referenced above. And forget about sorted lists and dictionaries because they are too slow.


Top
 Profile  
 
PostPosted: Fri Mar 28, 2014 4:38 pm 
Offline
Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267
funkymonkey85 wrote:
Its my first post here so, System.out.println("hello!");

didn't find any Kuhn poker cfr source code in forums, maybe I'm too lazy to use search, so I've build my own implementation. It's in java. It's a bit slow because I wanted to control all aspects of cfr for training purposes. Here is results:

Code:
Iterations:   50000000
P1 expected value:   -0.055548106797223905
card   history     {pass/bet}
1       Profile: [0.79, 0.21]    P1
1   b    Profile: [1.00, 0.00]    P2
1   p    Profile: [0.67, 0.33]    P2
1   pb    Profile: [1.00, 0.00]    P1
2       Profile: [1.00, 0.00]    P1
2   b    Profile: [0.67, 0.33]    P2
2   p    Profile: [1.00, 0.00]    P2
2   pb    Profile: [0.46, 0.54]    P1
3       Profile: [0.37, 0.63]    P1
3   b    Profile: [0.00, 1.00]    P2
3   p    Profile: [0.00, 1.00]    P2
3   pb    Profile: [0.00, 1.00]    P1



This eventually rose some questions. If any one can answer it'll be great, if not it's ok too.
1) for real even HU NL tourney game tree is too big, so It shall be divided as in imperfect recall. Even so its too big for my PCs. So it's _really_ worth trying cfr? cause it'll take months to compute. Maybe you can point out some better algorithms.
2) As in imperfect recall, for example, 1 game tree shall be calculated for preflop, X for flop, Y for turn, Z for river. Giving 1+X+Y+Z LUTs. there each LUT is a consequence of (opponent model, some previous actions, day on week, etc) Am I correct so far?
3) How does cfr implements opponent model?
ty for answers.

PS if any one want to cooperate on writing poker bot, feel free to PM me. I'm total beginner in this.


1. I think it's worth trying. Make your algorithmt parametrized, with a variable number of buckets, so you can still use it, if you get a better pc at some time. But it will always take time to compute, the biggest factor is how much RAM you have, because with little RAM you will not be able to store the tree at all.

2/3 There is not really an opponent model in CFRM, it tries to be unexploitable and doesn't take really look at the opponent, it just tries to be so balanced, that it can not lose, no matter what the opponent strategy is.


Top
 Profile  
 
PostPosted: Mon Mar 31, 2014 3:31 pm 
Offline
Junior Member
User avatar

Joined: Sun Mar 16, 2014 3:36 am
Posts: 36
Location: Germany
Hello,

I have seen above, the "expected value: -0.055548106797223905" (EV of 1/18 )

also called

"Average game value = -0,0557101115915825".

What exactly does this EV "expected value: -0.055548106797223905" mean, and how can I use them?


Tom


Top
 Profile  
 
PostPosted: Tue Apr 01, 2014 12:13 pm 
Offline
Junior Member

Joined: Wed Dec 04, 2013 12:40 am
Posts: 49
It means that knowing P1 and P2 play the nash equilibrium strategies, before having the cards dealt (when P1 says to P2 : "Ok, let's play one hand dude!"), we know P1 will lose 1/18 = 0.05555... on average.


Top
 Profile  
 
PostPosted: Wed Apr 02, 2014 11:19 pm 
Offline
Junior Member
User avatar

Joined: Sun Mar 16, 2014 3:36 am
Posts: 36
Location: Germany
Hello,
Quote:
we know P1 will lose 1/18 = 0.05555... on average

OK, so Player1 loses after 18 games 1 ;-)

Has anyone perhaps a CFR table for post-flop HU FL 10 buckets?

ty
Tom


Top
 Profile  
 
PostPosted: Thu Apr 03, 2014 6:04 am 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
http://www.deducer.org/pmwiki/index.php ... ndFellOmen


Top
 Profile  
 
PostPosted: Mon Apr 07, 2014 4:17 pm 
Offline
Junior Member
User avatar

Joined: Sun Mar 16, 2014 3:36 am
Posts: 36
Location: Germany
Spears, thank you for the Link.

I have tested INOT and Fell Omen 1.

In my tests came out, that INOT was somewhat stronger than Fell Omen 1.

But INOT and Fell Omen 1 lost against SimBot from Poker Academy.

ty Tom


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 24 posts ]  Go to page 1, 2  Next

All times are UTC


Who is online

Users browsing this forum: No registered users and 2 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Powered by phpBB® Forum Software © phpBB Group