 Post subject: Re: Monte Carlo Sampling for Regret Minimization in Extensive Games
PostPosted: Tue May 22, 2012 5:59 pm 
Regular member
Posts: 95
Favourite Bot: man
Added public chance sampling (although in this case PCS = vanilla).


Attachments:
RiverSolver2.zip [108.33 KB]
Downloaded 108 times
PostPosted: Mon Jun 25, 2012 9:22 pm
Junior member
Posts: 20
Favourite Bot: myBot
http://poker.cs.ualberta.ca/open_cfr.html
Is this vanilla or chance-sampled?


PostPosted: Wed Jun 27, 2012 7:34 pm
New member
Posts: 1
Favourite Bot: in development
TrumpeT wrote:
http://poker.cs.ualberta.ca/open_cfr.html
Is this vanilla or chance-sampled?

Vanilla


PostPosted: Thu Aug 30, 2012 10:32 pm
Junior member
Posts: 34
Favourite Bot: nanonoko
I've run amax's code with chance sampling on Kuhn poker, but I get a strange result:

-> With an equiprobable distribution (33% K, 33% Q, 33% J) I get winrates of -1/18 and +1/18 -> OK

-> But with player 1 (~0% K, ~0% Q, ~100% J) and player 2 (~100% K, ~0% Q, ~0% J),
player 1 is drawing dead, yet I get a winrate of -0.5 for player 1 and +1 for player 2 :cry:

It looks like a bug. Shouldn't I get -1 and +1? It's a zero-sum game ...


PostPosted: Fri Aug 31, 2012 8:13 am
PokerAI fellow
Posts: 1239
Favourite Bot: my bot
kenu7 wrote:
I've run amax's code with chance sampling on Kuhn poker, but I get a strange result:

-> With an equiprobable distribution (33% K, 33% Q, 33% J) I get winrates of -1/18 and +1/18 -> OK

-> But with player 1 (~0% K, ~0% Q, ~100% J) and player 2 (~100% K, ~0% Q, ~0% J),
player 1 is drawing dead, yet I get a winrate of -0.5 for player 1 and +1 for player 2 :cry:

It looks like a bug. Shouldn't I get -1 and +1? It's a zero-sum game ...

Did you initialise the probabilities in BestResponse() properly?
Why do you want to do this?


PostPosted: Fri Aug 31, 2012 5:51 pm
Junior member
Posts: 34
Favourite Bot: nanonoko
spears wrote:
Did you initialise the probabilities in BestResponse() properly?


Yes, I use the same distribution in Run() and in BestResponse().

spears wrote:
Why do you want to do this?


I've written a class that dynamically generates the game tree for no-limit hold'em.
So I wrote unit tests with trivial cases to verify the behavior of the CFR, and I immediately saw that the results were wrong when a player was drawing dead.

Then I tested this case on Kuhn poker and saw that it was also wrong there,
although it is correct for an equiprobable distribution of all hands (1/18).

It seems that the best action found for player 1 with his jack is betting 100% of the time :cry:

So he bets, player 2 calls 50% and folds 50%, so we lose 0.5*1 - 0.5*2 = -0.5
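
For reference, the drawing-dead case can be checked by hand. The sketch below is not amax's code; it is a small Python enumeration that assumes the usual Kuhn stakes (ante 1, bet 1) and uses made-up names (`p1_payoff`, `game_value`, `bet_vs_half_calls`). With player 1 always holding J and player 2 always holding K, it compares every pure line for player 1 against every pure reply for player 2, then reproduces the -0.5 figure from the calculation above.

```python
ANTE, BET = 1, 1  # assumed standard Kuhn poker stakes

P1_LINES = ("bet", "check-fold", "check-call")
P2_REPLIES = [(vs_check, vs_bet)
              for vs_check in ("check", "bet")
              for vs_bet in ("call", "fold")]


def p1_payoff(p1_line, vs_check, vs_bet):
    """Payoff to player 1 (always J) against player 2 (always K)."""
    if p1_line == "bet":
        # A fold wins the ante; a call loses ante plus bet at showdown.
        return ANTE if vs_bet == "fold" else -(ANTE + BET)
    if vs_check == "check":
        return -ANTE  # check-check showdown, K beats J
    # Player 2 bet after the check: folding loses the ante, calling loses more.
    return -ANTE if p1_line == "check-fold" else -(ANTE + BET)


def game_value():
    """Best worst-case payoff for player 1 over his pure lines."""
    return max(min(p1_payoff(line, c, b) for c, b in P2_REPLIES)
               for line in P1_LINES)


def bet_vs_half_calls():
    """Player 1 always bets, player 2 calls half the time and folds half."""
    return (0.5 * p1_payoff("bet", "check", "fold")
            + 0.5 * p1_payoff("bet", "check", "call"))


if __name__ == "__main__":
    print(game_value(), -game_value())
    print(bet_vs_half_calls())
```

Checking and folding to any bet loses exactly the ante, and player 2 can always win at least the ante, so the value is -1 for player 1 and +1 for player 2. The -0.5 only appears if player 2 folds a king half the time, which a best response never does.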


PostPosted: Fri Aug 31, 2012 8:39 pm
Junior member
Posts: 34
Favourite Bot: nanonoko
I've found it, it was my fault.

I am so stupid :xx08


PostPosted: Wed Sep 05, 2012 5:44 pm
PokerAI fellow
Posts: 1115
Favourite Bot: Johnny #5
kenu7 wrote:
I am so stupid :xx08

Jesus man, don't beat yourself up about it, everybody makes mistakes.


PostPosted: Sat Sep 08, 2012 6:35 pm
Senior member
Posts: 124
Favourite Bot: coming
I have an OOP question concerning amax's code and game tree representation in general.
How would you handle chance nodes? Would you create a new class "ChanceNode" extending "GameTreeNode", or would you modify "Decision" to handle chance?
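
To make the first option concrete, here is a minimal Python sketch of a separate "ChanceNode" class next to "Decision". It is only an illustration, not how amax's solver is structured: the `Terminal` class, the `expected_value` method and all field names are assumptions made for the example. The point is that a chance node carries fixed probabilities, while a decision node carries a strategy that training would update.

```python
from dataclasses import dataclass
from typing import List


class GameTreeNode:
    """Common base class; every node can report its expected value."""

    def expected_value(self) -> float:
        raise NotImplementedError


@dataclass
class Terminal(GameTreeNode):
    payoff: float  # payoff to the player being evaluated

    def expected_value(self) -> float:
        return self.payoff


@dataclass
class ChanceNode(GameTreeNode):
    children: List[GameTreeNode]
    probs: List[float]  # fixed by the rules of the game, never trained

    def expected_value(self) -> float:
        return sum(p * c.expected_value()
                   for p, c in zip(self.probs, self.children))


@dataclass
class Decision(GameTreeNode):
    children: List[GameTreeNode]
    strategy: List[float]  # action probabilities, updated by training

    def expected_value(self) -> float:
        return sum(s * c.expected_value()
                   for s, c in zip(self.strategy, self.children))


def example_tree() -> GameTreeNode:
    """Chance picks a terminal (25%) or a two-action decision (75%)."""
    decision = Decision([Terminal(-2.0), Terminal(2.0)], [0.5, 0.5])
    return ChanceNode([Terminal(1.0), decision], [0.25, 0.75])


if __name__ == "__main__":
    print(example_tree().expected_value())
```

Keeping the two classes apart means the training code never has to ask whether a node's probabilities may be changed, at the cost of one more node type to handle in every traversal.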


PostPosted: Tue Sep 11, 2012 2:45 pm
Regular member
Posts: 95
Favourite Bot: man
LOLWorld wrote:
I have an OOP question concerning amax's code and game tree representation in general.
How would you handle chance nodes? Would you create a new class "ChanceNode" extending "GameTreeNode", or would you modify "Decision" to handle chance?


I wrote a Rhode Island Hold'em solver to demonstrate this. It also has some other new stuff:

- lock-free multithreaded cfr
- better external sampling implementation
- "probing" cfr based on http://webdocs.cs.ualberta.ca/~games/poker/publications/AAAI12-generalmccfr.pdf


Attachments:
RhodeIsland.7z [22.68 KB]
Downloaded 121 times
PostPosted: Tue Sep 11, 2012 5:40 pm
Senior member
Posts: 124
Favourite Bot: coming
amax > Chuck Norris > God


PostPosted: Sun Sep 16, 2012 2:15 pm
Senior member
Posts: 124
Favourite Bot: coming
It seems that there is an infinite loop in BestResponse().
I'm investigating this :sherlok


PostPosted: Sat Sep 22, 2012 3:47 pm
Senior member
Posts: 124
Favourite Bot: coming
That's weird, I have the infinite loop problem only in debug mode.
I searched on Google and some people have the opposite problem.
I guess the parallel implementation could be the cause.

Anyone facing the same issue?


PostPosted: Wed Oct 03, 2012 8:24 pm
Regular member
Posts: 95
Favourite Bot: man
There's some O(n^2) code in Showdown.cs that's enabled only in debug builds, so it's probably just taking a long time.


PostPosted: Sat Nov 03, 2012 6:35 am
PokerAI fellow
Posts: 1115
Favourite Bot: Johnny #5
Question about a bit of code from your "probing" example: does ap = 1 hold true for all bet nodes, even in games with multiple bet possibilities? And, why 0.5 as the probing probability?

Code:
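for (int i = 0; i < children.Length; i++)
{
    // ap is the probability of sampling child i: 1.0 for the first child when the
    // node has three children or its second child is not a Fold, otherwise 0.5.
    double ap = (i == 0 && (children.Length == 3 || !(children[1] is Fold))) ? 1.0 : 0.5;

    if (rnd.Value.NextDouble() < ap)
        // Sampled: recurse with ooq divided by ap, passing the probe flag through.
        u[i] = children[i].TrainProbing(trainplayer, iteration, ooq / ap, probe);
    else
        // Not sampled: recurse with ooq unchanged and the probe flag set to true.
        u[i] = children[i].TrainProbing(trainplayer, iteration, ooq, true);

    // Accumulate the node value, weighting each child's utility u[i] by s[i].
    ev += u[i] * s[i];
}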


PostPosted: Sat Nov 03, 2012 1:30 pm
Regular member
Posts: 95
Favourite Bot: man
The sampling scheme is straight from the paper:

"Finally, in hold’em, we always sample fold and raise actions, while sampling call with probability 0.5. Folds are cheap to sample (since the game ends) and raise actions increase the number of bets and consequently the magnitude of the utilities."

I don't know why they chose 0.5, but it's a nice probability for an optimized implementation because you can use a fast 1-bit random number generator.
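
On the 1-bit generator point, a rough Python sketch of the idea follows. It is not taken from the solver: `BitSource`, `flip` and `estimate_heads` are made-up names, and Python's `random.getrandbits` stands in for whatever fast generator an optimized implementation would use. One 64-bit word is drawn at a time and handed out one bit per coin flip, so a 0.5 sampling decision costs a mask and a shift rather than a floating-point draw and compare.

```python
import random


class BitSource:
    """Hands out fair coin flips one bit at a time from a cached 64-bit word."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._word = 0
        self._bits_left = 0

    def flip(self):
        """Return True with probability 0.5."""
        if self._bits_left == 0:
            # Refill the cache only once every 64 flips.
            self._word = self._rng.getrandbits(64)
            self._bits_left = 64
        bit = self._word & 1
        self._word >>= 1
        self._bits_left -= 1
        return bit == 1


def estimate_heads(n, seed=0):
    """Fraction of True results over n flips, as a quick sanity check."""
    src = BitSource(seed)
    return sum(src.flip() for _ in range(n)) / n


if __name__ == "__main__":
    print(estimate_heads(10000))
```

Any other sampling probability would need more than one bit per decision, which is presumably part of why 0.5 is convenient.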


PostPosted: Wed Nov 07, 2012 7:49 am
PokerAI fellow
Posts: 1115
Favourite Bot: Johnny #5
So probing off policy at a certain % would be a bit like a Nash clone, or an RNR?

