 Post subject: Monte-Carlo CFRM question
PostPosted: Tue Feb 05, 2013 6:01 am 
Disclaimer: I may be confusing the terminology, and what I'm describing may actually be called Chance-Sampled CFRM.

As far as I understand, I've implemented vanilla CFRM.
In this article http://webdocs.cs.ualberta.ca/~bowling/ ... tpoker.pdf and in a few topics on pokerai it is mentioned that Monte-Carlo CFRM converges to an equilibrium much faster than 'vanilla' CFRM.

I would like to implement Monte-Carlo CFRM, but I cannot grasp one fundamental detail.

Tree representation

For the sake of simplicity, let's assume we do not use card isomorphism. Let's also assume the game is FLHE, starting on a given river.

I store the public game tree, i.e. each non-leaf node corresponds to a betting sequence. The first node is <empty betting sequence>, with player1 to act. It has children <bet> and <check>. <bet> has children <bet,fold> (a leaf), <bet,call>, and <bet,raise>. <check> has children <check,check> (a leaf) and <check,bet>.

In each non-leaf node I store the strategy for the player to act at that node, i.e. strategy[hand][action] is the probability of taking action with hand.
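The tree and per-node strategy table described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names Node, build, init_uniform and MAX_BETS are hypothetical, the betting cap of four bets is inferred from the check-bet-raise-raise-raise example, and any sequence ending in call is treated as a showdown leaf.

```python
from dataclasses import dataclass, field

MAX_BETS = 4  # assumption: one bet plus three raises caps the betting round


@dataclass
class Node:
    history: tuple   # betting sequence, e.g. ("check", "bet")
    player: int      # player to act: 0 = player1, 1 = player2
    children: dict = field(default_factory=dict)  # action -> Node
    strategy: dict = field(default_factory=dict)  # hand -> {action: probability}

    @property
    def is_leaf(self):
        return not self.children


def build(history=(), player=0, bets=0):
    """Recursively build the public betting tree for a single street."""
    node = Node(history, player)
    last = history[-1] if history else None
    # Assumption: fold, call and check-check all end the hand on the river.
    if last in ("fold", "call") or history == ("check", "check"):
        return node
    if bets == 0:
        actions = ["check", "bet"]
    else:
        actions = ["fold", "call"] + (["raise"] if bets < MAX_BETS else [])
    for action in actions:
        added = 1 if action in ("bet", "raise") else 0
        node.children[action] = build(history + (action,), 1 - player, bets + added)
    return node


def init_uniform(node, hands):
    """Give every non-leaf node a uniformly random strategy for each hand."""
    if node.is_leaf:
        return
    n = len(node.children)
    for hand in hands:
        node.strategy[hand] = {action: 1.0 / n for action in node.children}
    for child in node.children.values():
        init_uniform(child, hands)
```

With this sketch, calling build() and walking check, bet, raise, raise, raise reaches a node where player2 can only fold or call, which matches the two-action (0.5, 0.5) strategy used in the example later in the post.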

Current algorithm

In each iteration I calculate the EV for each node and each hand. Then I calculate the regrets corresponding to each action for each node and hand and do regret matching.

On iteration n, the regret matching algorithm gives me a strategy new_strategy for each node and hand. I then make the assignment
Code:
current_strategy := (current_strategy * n + new_strategy) / (n+1)
The paper proves that the strategy will converge, with the distance to equilibrium around C/sqrt(#iterations), and my program seems to be similar to Algorithm 1 (page 11).
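For reference, the two steps described above (regret matching, then the running-average assignment) can be written as a minimal sketch. The function names are hypothetical, and this mirrors the update exactly as written in the post rather than the paper's Algorithm 1.

```python
def regret_matching(cum_regret):
    """Turn cumulative regrets (one per action) into a strategy.

    Each action gets probability proportional to its positive regret;
    if no action has positive regret, fall back to uniform.
    """
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(cum_regret)] * len(cum_regret)


def running_average(current_strategy, new_strategy, n):
    """current_strategy := (current_strategy * n + new_strategy) / (n + 1)"""
    return [(c * n + x) / (n + 1) for c, x in zip(current_strategy, new_strategy)]
```

For example, regret_matching([1.0, 3.0, -2.0]) ignores the negative regret and splits the probability 1:3 between the first two actions.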

What I can't figure out.

Let's assume we want to improve player2's strategy for a given hand. In the vanilla version we would improve it at all nodes. In the Monte-Carlo version, however, we sample the opponent's actions. Suppose we sampled that player1's first action was bet. What happens to the node <check>, which will never be reached in the current iteration? Do we do regret matching using the old values, or do we simply not change the strategy there?

If we do not change it, we have a problem with rarely reached nodes like check-bet-raise-raise-raise. Let's say we reach it once in a hundred iterations (maybe in this exact example we would reach it more often, but if the game starts on the flop, there will be nodes that are reached once in a few thousand iterations). Initially all my strategies are uniformly random, so the strategy at <check-bet-raise-raise-raise> is (0.5, 0.5).

Let's say we reach this node on the 100th iteration and regret matching tells us that we should always call. According to the algorithm,
Code:
current_strategy := (current_strategy * n + new_strategy) / (n+1)
i.e.
Code:
current_strategy = [(0.5, 0.5) * 100 + (0, 1)] / 101 approx= (0.495, 0.505)
If we reach it again on the 200th iteration and regret matching again tells us that we should always call, the strategy becomes approximately (0.4926, 0.5074).

We will reach this node 1000 times in 100k iterations. However, the weights will be so dismally small (<1/10k most of the time) that we will barely move the strategy away from (0.5, 0.5).
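The arithmetic above can be checked with a small simulation. It assumes the node is visited exactly every 100th iteration and that regret matching returns "always call" on every visit. For contrast it also tracks a purely hypothetical alternative that averages with a counter local to the node; the thread does not confirm that this is what Monte-Carlo CFRM does, so treat it only as an illustration of how much the choice of weight matters.

```python
def simulate(total_iters=100_000, visit_every=100):
    """Compare averaging with the global iteration count vs. a per-node count."""
    global_avg = [0.5, 0.5]  # updated with the global iteration number n
    local_avg = [0.5, 0.5]   # hypothetical: updated with the node's own visit count
    target = [0.0, 1.0]      # assumption: regret matching always says (fold=0, call=1)
    visits = 0
    for n in range(1, total_iters + 1):
        if n % visit_every != 0:
            continue  # node not reached this iteration, strategy left unchanged
        global_avg = [(c * n + t) / (n + 1) for c, t in zip(global_avg, target)]
        visits += 1
        local_avg = [(c * visits + t) / (visits + 1) for c, t in zip(local_avg, target)]
    return visits, global_avg, local_avg


visits, global_avg, local_avg = simulate()
print(visits, global_avg, local_avg)
```

Under these assumptions the global-counter average ends at roughly (0.46, 0.54) after 1000 visits, confirming that it barely leaves (0.5, 0.5), while the per-node counter ends very close to (0, 1).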

It seems that I am missing some fundamental idea. Please help me understand the idea behind Monte-Carlo CFRM.


 Post subject: Re: Monte-Carlo CFRM question
PostPosted: Tue Feb 05, 2013 9:02 am 
See the top of page 48 of http://poker.cs.ualberta.ca/publication ... on.msc.pdf
I found it easier to understand once I had the code: http://pokerai.org/pf3/viewtopic.php?f= ... ing#p40335


 Post subject: Re: Monte-Carlo CFRM question
PostPosted: Tue Feb 05, 2013 9:20 am 
Thank you so much, the code is very nice.

