Post new topic Reply to topic  [ 33 posts ]  Go to page Previous  1, 2
Author Message
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Thu Feb 25, 2010 9:11 pm 
Regular member
Posts: 64
Favourite Bot: MCTSBot
1) In my implementation, I add the chance nodes at the leaves of the game tree. You could put them anywhere in the tree, but you have to adapt your opponent model accordingly (their position defines what is known to the opponent model and what is unknown).
2) I have not experimented much with rollouts, but we use random actions drawn from the probability distribution given by the opponent model.
3) We're currently improving that part of the bot. We used to just sample a min bet, an all-in and some values in between. Now we're also using an opponent model there.
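
Points 2) and 3) above can be sketched roughly as follows. This is only an illustration, not the bot's actual code: the function names, the action labels and the probabilities are made up, and the evenly spaced bet sizes are an assumption about what "some values in between" could look like.

```python
import random

def sample_action(action_probs, rng):
    """Draw one rollout action according to the opponent model's distribution."""
    actions = list(action_probs)
    weights = [action_probs[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]

def candidate_bet_sizes(min_bet, stack, n_between=3):
    """Return the min bet, the all-in and a few evenly spaced sizes in between."""
    if min_bet >= stack:
        return [float(stack)]  # only an all-in is possible
    step = (stack - min_bet) / (n_between + 1)
    return [min_bet + i * step for i in range(n_between + 2)]

rng = random.Random(0)
# Hypothetical output of an opponent model for one decision point.
probs = {"fold": 0.2, "call": 0.5, "raise": 0.3}
print(sample_action(probs, rng))
print(candidate_bet_sizes(10, 100))
```

Replacing the fixed grid of bet sizes with an opponent model, as mentioned in 3), would mean drawing the size from a learned distribution instead of calling `candidate_bet_sizes`.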


 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Tue Mar 02, 2010 3:21 am 
Senior member
Posts: 168
Favourite Bot: none
Thanks for your response. So, if I understand you correctly, your opponent model does not use the board cards to decide on action probabilities, but does use them when you decide showdown ranges?

I wonder what the tradeoff is between this and differentiating between board cards and searching a smaller portion of the tree.


 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Tue Mar 02, 2010 4:19 pm 
Regular member
Posts: 64
Favourite Bot: MCTSBot
That's correct.
Mathematically, there is no difference. Putting the chance nodes near the leaves makes the action opponent models simpler but less informed.
The problem is that you only observe players' hand cards at showdown. This means that a model of the hand cards before showdown cannot be built online, because there is nothing to learn from, so the action model cannot use the hand cards as input. It could use the community cards, but we did not find a representation for the community cards that improved the accuracy of the action model. So we put the chance nodes at the leaves, at showdown.


 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Wed Mar 03, 2010 9:51 pm 
Senior member
Posts: 168
Favourite Bot: none
guyvdb wrote:
That's correct.
Mathematically, there is no difference. Putting the chance nodes near the leaves makes the action opponent models simpler but less informed.
The problem is that you only observe players' hand cards at showdown. This means that a model of the hand cards before showdown cannot be built online, because there is nothing to learn from, so the action model cannot use the hand cards as input. It could use the community cards, but we did not find a representation for the community cards that improved the accuracy of the action model. So we put the chance nodes at the leaves, at showdown.


What do you mean by "mathematically, there is no difference"? Do you mean the evaluated equities are the same?

It's pretty interesting that you couldn't find better action models that were flop dependent. I know my own play varies a fair amount depending on flop texture.

One question about rollouts: you say you use the opponent model to come up with a probability distribution for the simulated action. Do you do the same for your own actions? Do you have a sort of model of yourself that you use, or do you just pick randomly?


 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Fri Mar 05, 2010 5:09 pm 
Senior member
Posts: 172
Location: France
Favourite Bot: Lucy Liubot
Thanks for this very interesting paper :D
guyvdb wrote:
2) Yes, we use a regression tree to predict the probability that an opponent hand is in a certain "bucket". We take all hand ranks that are possible given the sampled community cards. We sort them and place them in N equally sized buckets. If a person raised a lot, for instance, it will be more likely that he has a hand from a high bucket. This allows us to compute the expected value. (See BucketRollOut class)
DistributionRollout4 you refer to is a much simpler model for showdown. It first samples equiprobable hands. Depending on the pot size, it weighs the expected value of those hands by the probability of seeing such a hand at showdown. This is a very simple but very fast model.
It looks like you use signalCardShowdown to save the value of the bucket when generating the opponent model.

There are two things that I don't understand:
- How can you use a distribution (bucketDistr) as output with M5P? Do you convert it before saving? Maybe you use the values in the 6 buckets as separate outputs? I see that you used the average value before but changed it.
- How do you save the data? (I found nothing in logShowdown)
Also, 6 samples per bucket seems a bit low; I've got more repeatable results with 100+. I guess it was a trade-off for speed.
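
For readers following along, the bucket scheme quoted above (sort the possible hand ranks, split them into N equally sized buckets, weight the buckets by predicted probabilities) might look roughly like this. It is a sketch under several assumptions, not the BucketRollOut code: a higher rank means a stronger hand, the game is heads-up, `pot` is the total pot including our own chips, and the bucket probabilities are invented rather than produced by a regression tree.

```python
def make_buckets(hand_ranks, n_buckets):
    """Sort the ranks and split them into n_buckets (nearly) equal-sized groups."""
    ranks = sorted(hand_ranks)
    size, extra = divmod(len(ranks), n_buckets)
    buckets, start = [], 0
    for i in range(n_buckets):
        end = start + size + (1 if i < extra else 0)
        buckets.append(ranks[start:end])
        start = end
    return buckets

def win_probability(our_rank, buckets, bucket_probs):
    """Probability that we beat the opponent, counting a tie as half a win."""
    total = 0.0
    for bucket, p in zip(buckets, bucket_probs):
        if not bucket:
            continue
        score = sum(
            1.0 if our_rank > r else 0.5 if our_rank == r else 0.0
            for r in bucket
        )
        total += p * score / len(bucket)
    return total

def showdown_ev(our_rank, buckets, bucket_probs, pot, our_investment):
    """Expected net result: win the whole pot with probability p, minus our chips."""
    p = win_probability(our_rank, buckets, bucket_probs)
    return p * pot - our_investment

# Hypothetical example: 12 possible opponent ranks, 3 buckets, and a model
# that thinks a strong hand is most likely after a lot of raising.
buckets = make_buckets(range(1, 13), 3)
bucket_probs = [0.2, 0.3, 0.5]
print(showdown_ev(7, buckets, bucket_probs, pot=100, our_investment=40))
```

Within a bucket every hand is treated as equally likely here, which is one reason the number of buckets (and samples per bucket) affects how repeatable the results are.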


 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Thu Mar 11, 2010 10:22 am 
Senior member
Posts: 172
Location: France
Favourite Bot: Lucy Liubot
iamnobody wrote:
There are two things that I don't understand:
- How can you use a distribution (bucketDistr) as output with M5P? Do you convert it before saving? Maybe you use the values in the 6 buckets as separate outputs? I see that you used the average value before but changed it.
- How do you save the data? (I found nothing in logShowdown)
Never mind.
I finally figured out how it works :mrgreen:


 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Mon May 24, 2010 6:08 pm 
New member
Posts: 3
Favourite Bot: Polaris
Hi,

I will be giving a presentation on this paper in a seminar soon. Has anybody performed more experiments with this approach? It would be nice if you could share some results.
Thanks in advance.


 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Wed Aug 04, 2010 7:26 pm 
Senior member
Posts: 166
Favourite Bot: ZBot
I guess it's too late for the seminar...

guyvdb wrote:
That's correct.
Mathematically, there is no difference. Putting the chance nodes near the leaves makes the action opponent models simpler but less informed.
The problem is that you only observe players' hand cards at showdown. This means that a model of the hand cards before showdown cannot be built online, because there is nothing to learn from, so the action model cannot use the hand cards as input. It could use the community cards, but we did not find a representation for the community cards that improved the accuracy of the action model. So we put the chance nodes at the leaves, at showdown.

Interesting... My action model is more accurate taking the board into account (it's a neural net).

I guess that is why I have fluctuating results with this approach. Computing the chance nodes at every street slows down the process and increases the size of the tree that gets traversed, as probability distributions also change at decision nodes...

I guess I will give it a try when I have the time ;)


 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Mon Jan 31, 2011 12:58 pm 
New member
Posts: 5
Favourite Bot: The human brain
Zoobie wrote:
I suppose you have fewer performance issues by running your bot in Java though; C# is definitely not as fast as Java.


I have the opposite experience, although my skill level in Java is far below my skill level in C#/.NET. Have you parallelized the algorithm? If performance suffers, it can also be distributed quite cheaply using Azure, Amazon or some other cloud provider.


 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Mon Feb 20, 2012 12:52 am 
Senior member
Posts: 458
Location: Now in the mighty PolarBoar variant!
Favourite Bot: ...
I have finally read this great paper (major fail for not getting to it earlier), and I have a question for Guy or anyone who's into the subject:

It seems to me that the proposed idea for selection in chance (and opponent) nodes is basically to select the child which:
a) has led to the greatest reward in previous iterations
b) has the greatest uncertainty regarding the reward (in comparison to other children), measured by either the number of samples or the std. deviation (depending on whether we use formula (1) or (5) from the paper)

While selecting with regard to b) is completely understandable, I can't really grasp why we want to explore children w.r.t. a). Intuitively, it seems to me that we should be focusing on the children with the highest probability (of occurring), since they'll be the most influential for the node's outcome.
Is my intuition wrong, or am I reading the paper incorrectly?
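
To make a) and b) concrete, here is a generic UCB1-style selection rule of the kind commonly used in UCT. It is an assumption that the paper's formula (1) has this general shape; the exact formulas (1) and (5) are not reproduced here, and the child statistics below are invented.

```python
import math

def ucb1_score(mean_reward, child_visits, parent_visits, c=math.sqrt(2)):
    """Exploitation term (a) plus an exploration bonus that shrinks with visits (b)."""
    if child_visits == 0:
        return math.inf  # unvisited children are always tried first
    bonus = c * math.sqrt(math.log(parent_visits) / child_visits)
    return mean_reward + bonus

def select_child(children, c=math.sqrt(2)):
    """Pick the child with the highest UCB1 score."""
    parent_visits = sum(ch["visits"] for ch in children)
    return max(
        children,
        key=lambda ch: ucb1_score(ch["mean"], ch["visits"], parent_visits, c),
    )

# Hypothetical statistics: "b" has a slightly lower mean but very few samples,
# so its uncertainty bonus dominates.
children = [
    {"name": "a", "mean": 1.0, "visits": 100},
    {"name": "b", "mean": 0.9, "visits": 1},
]
print(select_child(children)["name"])
```

Note that nothing in this rule looks at how likely a child is to occur, which is exactly the concern raised above for chance and opponent nodes.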


 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Tue Feb 21, 2012 5:53 pm 
Regular member
Posts: 58
Favourite Bot: working on it
I could be wrong, it's been a while since I read the paper. But if I remember correctly it works like this: Let's say we have the following result so far:

fold: EV $0
call: EV $1
1/2 pot sized raise: EV $9
2/3 pot sized raise: EV $20
pot sized raise: EV $19

Assuming we have a decent number of samples for each action, which one would you sample next? Probably not the calling option, because we are not going to do that anyway. So it's more about how we sample our own actions, not opponent or chance nodes.

EDIT: Exploring branches with the highest probability first, would bias the result, I think.


 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Tue Feb 21, 2012 6:30 pm 
Senior member
Posts: 458
Location: Now in the mighty PolarBoar variant!
Favourite Bot: ...
Jack wrote:
I could be wrong, it's been a while since I read the paper. But if I remember correctly it works like this: Let's say we have the following result so far:

fold: EV $0
call: EV $1
1/2 pot sized raise: EV $9
2/3 pot sized raise: EV $20
pot sized raise: EV $19

Assuming we have a decent number of samples for each action, which one would you sample next? Probably not the calling option, because we are not going to do that anyway. So it's more about how we sample our own actions, not opponent or chance nodes.


Sampling our options and sampling in opp/chance nodes are two separate matters. My doubts regard only the latter - sampling bot's plays as described in the paper intuitively makes sense to me.

Quote:
EDIT: Exploring branches with the highest probability first, would bias the result, I think.


I'm not saying to explore them first per se, just to choose the child randomly (using the children's probability distribution from the model). Obviously more likely children will be sampled more often, but IMO that's OK, since we get to explore the more influential branches in more detail.
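
The idea of choosing the child at random from the model's distribution can be sketched as below. The child values and probabilities are invented, and each child's value is treated as already known, which a real search would have to estimate recursively. Under those assumptions, the plain average of the sampled values converges to the probability-weighted value of the node.

```python
import random

def exact_node_value(child_values, child_probs):
    """Probability-weighted value of an opponent or chance node."""
    return sum(child_probs[name] * child_values[name] for name in child_probs)

def estimate_node_value(child_values, child_probs, n_samples, rng):
    """Estimate the node value by sampling children from the model distribution."""
    names = list(child_probs)
    weights = [child_probs[name] for name in names]
    total = 0.0
    for _ in range(n_samples):
        child = rng.choices(names, weights=weights, k=1)[0]
        total += child_values[child]
    return total / n_samples

# Hypothetical opponent node: our payoff after each opponent action.
child_values = {"fold": 15.0, "call": 2.0, "raise": -10.0}
child_probs = {"fold": 0.3, "call": 0.6, "raise": 0.1}

rng = random.Random(0)
print(exact_node_value(child_values, child_probs))
print(estimate_node_value(child_values, child_probs, 20000, rng))
```

Likely children receive most of the samples, so their subtrees would be explored in the most detail, while rare children still contribute in proportion to how often they occur.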


 Post subject: Monte-Carlo Tree Search
PostPosted: Mon Mar 26, 2012 4:26 pm 
Junior member
Posts: 38
Favourite Bot: Mine :D
Hi,

Here is a thesis on MCTS.

It really helped me to clearly understand the Monte Carlo Tree Search algorithm.

http://www.unimaas.nl/games/files/phd/Chaslot_thesis.pdf

Regards,

