Image Image Image




Post new topic Reply to topic  [ 33 posts ]  Go to page 1, 2  Next
Author Message
 Post subject: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Fri Oct 16, 2009 2:13 pm 
Offline
Senior member
User avatar

Posts: 166
Favourite Bot: ZBot
Monte-Carlo Tree Search in Poker using Expected Reward Distributions, 2009

Guy Van den Broeck, Kurt Driessens, and Jan Ramon
Department of Computer Science, Katholieke Universiteit Leuven, Belgium

Abstract: We investigate the use of Monte-Carlo Tree Search (MCTS) within the field of computer Poker, more specifically No-Limit Texas Hold'em. The hidden information in Poker results in so called miximax game trees where opponent decision nodes have to be modeled as chance nodes. The probability distribution in these nodes is modeled by an opponent model that predicts the actions of the opponents. We propose a modification of the standard MCTS selection and backpropagation strategies that explicitly model and exploit the uncertainty of sampled expected values. The new strategies are evaluated as a part of a complete Poker bot that is, to the best of our knowledge, the first exploiting no-limit Texas Hold'em bot that can play at a reasonable level in games of more than two players.

Download: https://lirias.kuleuven.be/bitstream/12 ... lpaper.pdf


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Fri Oct 16, 2009 4:47 pm 
Offline
PokerAI fellow
User avatar

Posts: 1239
Favourite Bot: my bot
Nice work Guy


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Sat Oct 17, 2009 12:04 am 
Offline
Senior member
User avatar

Posts: 166
Favourite Bot: ZBot
spears wrote:
Nice work Guy

I did not make the link with the initials ;)
That is definitely the next thing to be included in my bot, good job.


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Sat Oct 17, 2009 6:47 am 
Offline
PokerAI fellow
User avatar

Posts: 1239
Favourite Bot: my bot
Zoobie wrote:
I did not make the link with the initials ;)
There just aren't many of us using M5P ;)

@Guy
When you calculate the response do you assume that the opponent strategy
1. will be the same this hand as last, or
2. will have changed in response to your behaviour? (it seems you have sufficient info to do this)


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Mon Oct 19, 2009 8:03 pm 
Offline
Regular member
User avatar

Posts: 64
Favourite Bot: MCTSBot
Oh look :-)
Zoombie: we assume it will be the same


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Thu Oct 22, 2009 11:27 am 
Offline
Senior member
User avatar

Posts: 166
Favourite Bot: ZBot
I gave a shot at my first implementation of your algorithm for limit HE, and I obviously have a few questions ;)

1) You say in the paper that you deal with chance nodes the same as with opponents nodes. You also told me in another thread that you deal with them at the leaf nodes. It seems quite different to me as you really increase the game tree size in the first case, but you probably have more accuracy with your opponent model. So, do you use some board inputs in your opponent model, and do you have an idea of its accuracy ?

2) Still struggling with models, it seems to me that if you have no particular way to exploit an opponent in a specific game, you must have a good method to estimate his strength at showdown. Do you also use a regression tree to evaluate that, or do you use something simpler like histograms ? I think I will go for histograms as I already use that in my old simulator, but I am not really found of the results they give me.
I had a quick glance at your source code (damn, you love abstract classes and design patterns ! :D ) and could not understand how the DistributionRollout4 class works, maybe you can help me there.

I am definitely impatient to run the first tests... ;)


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Thu Oct 22, 2009 1:10 pm 
Offline
Senior member
User avatar

Posts: 451
Favourite Bot: gimmick
Quote:
I had a quick glance at your source code ...


Where can i find it?
In the paper there was no code.


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Thu Oct 22, 2009 2:01 pm 
Offline
Senior member
User avatar

Posts: 166
Favourite Bot: ZBot
Ockham wrote:
Where can i find it?
In the paper there was no code.

It is in CSPoker.
viewforum.php?f=74


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Thu Oct 22, 2009 5:12 pm 
Offline
Regular member
User avatar

Posts: 64
Favourite Bot: MCTSBot
Zoobie wrote:
I gave a shot at my first implementation of your algorithm for limit HE, and I obviously have a few questions ;)

1) You say in the paper that you deal with chance nodes the same as with opponents nodes. You also told me in another thread that you deal with them at the leaf nodes. It seems quite different to me as you really increase the game tree size in the first case, but you probably have more accuracy with your opponent model. So, do you use some board inputs in your opponent model, and do you have an idea of its accuracy ?

2) Still struggling with models, it seems to me that if you have no particular way to exploit an opponent in a specific game, you must have a good method to estimate his strength at showdown. Do you also use a regression tree to evaluate that, or do you use something simpler like histograms ? I think I will go for histograms as I already use that in my old simulator, but I am not really found of the results they give me.
I had a quick glance at your source code (damn, you love abstract classes and design patterns ! :D ) and could not understand how the DistributionRollout4 class works, maybe you can help me there.

I am definitely impatient to run the first tests... ;)


1) The paper doesn't explain the factorization of the probability distribution. We have a model that predicts opponent actions based on previous actions but independent of the estimate of their hand cards. At showdown, those hand cards need to be known, and we sample them from another model that's also based on the previous actions.
In theory, this should be equivalent to sampling the hand cards first and then predicting actions based on the hand cards. In practice, we believe our approach allows for easier sampling and modeling of the probability distribution.

2) Yes, we use a regression tree to predict the probability that an opponent hand is in a certain "bucket". We take all hand ranks that are possible given the sampled community cards. We sort them and place them in N equally sized buckets. If a person raised a lot, for instance, it will be more likely that he has a hand from a high bucket. This allows us to cumpute the expected value. (See BucketRollOut class)
DistributionRollout4 you refer to is a much simpler model for showdown. It first samples equiprobable hands. Depending on the pot size, it weighs the expected value of those hands by the probability of seeing such a hand at showdown. This is a very simple but very fast model.


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Thu Oct 22, 2009 5:14 pm 
Offline
Regular member
User avatar

Posts: 64
Favourite Bot: MCTSBot
Zoobie wrote:
I am definitely impatient to run the first tests... ;)


Please keep me informed of any findings you make. It's important for us to know how our bot performs compared to other approaches.


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Thu Oct 22, 2009 7:54 pm 
Offline
PokerAI fellow
User avatar

Posts: 1239
Favourite Bot: my bot
@Guy
First let me say that I think this is a very important and significant step forward. But of course there some questions:
- How do you choose how much MIXI there is in your MIXIMAX?
- Suppose the optimum response to a particular opponent involved bluffing your worst hands more than your intermediate hands, would your AI find that response?


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Fri Oct 23, 2009 10:19 am 
Offline
Senior member
User avatar

Posts: 166
Favourite Bot: ZBot
guyvdb wrote:
Please keep me informed of any findings you make. It's important for us to know how our bot performs compared to other approaches.

Sure. My implementation will certainly differ slightly though as I will use it for Limit HE against human players (most likely at 1/2$ at first), and develop it in C#. I just hope performance will not be an issue. It may take quite some time, as holidays are very close and I still have a lot to do ;)


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Fri Oct 23, 2009 1:45 pm 
Offline
Regular member
User avatar

Posts: 64
Favourite Bot: MCTSBot
spears wrote:
@Guy
First let me say that I think this is a very important and significant step forward. But of course there some questions:
- How do you choose how much MIXI there is in your MIXIMAX?
- Suppose the optimum response to a particular opponent involved bluffing your worst hands more than your intermediate hands, would your AI find that response?

- The Mixi in MixiMax stands for the weighted average that's taken in opponent decision nodes (using the opponent model probabilities).
- I think it could.


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Sun Nov 15, 2009 10:39 pm 
Offline
Level1 member
User avatar

Posts: 46
Favourite Bot: IvoAvido
Sorry if my questions seems stupid but i'm just starting to study these algoritms.
In the paper there is no clarification about the step 2(expansion) and step 3(simulation) of your algorithm .
About expansion: how is it possible to add a leaf as a child of an other leaf? Isn't the leaf a terminal node of the tree (where the game end)?
About simulation:
How can you start a game from a leaf(where the game reach it's conclusion)?
And what is this fast game-playing strategy ?

I really can't get these points :\


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Mon Nov 16, 2009 9:53 am 
Offline
Regular member
User avatar

Posts: 64
Favourite Bot: MCTSBot
Franconero wrote:
Sorry if my questions seems stupid but i'm just starting to study these algoritms.
In the paper there is no clarification about the step 2(expansion) and step 3(simulation) of your algorithm .
About expansion: how is it possible to add a leaf as a child of an other leaf? Isn't the leaf a terminal node of the tree (where the game end)?
About simulation:
How can you start a game from a leaf(where the game reach it's conclusion)?
And what is this fast game-playing strategy ?

I really can't get these points :\


You're confused about the 2 different kinds of leaves that are used.
First, there are the leaves of the real game tree, where the game concludes.
Second, there are the leaves of the subtree that is kept in memory. These leaves are not necessarily leafs of the first kind, but they could be.

In the expansion step you add children to a leaf of the second kind, which then stops being a leaf.
In the simulation step, you start from a leaf of the second kind and play randomly until you reach a leaf of the first kind.


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Wed Nov 18, 2009 4:27 am 
Offline
Level1 member
User avatar

Posts: 46
Favourite Bot: IvoAvido
Thanks , now it's clear ;)


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Fri Nov 20, 2009 12:33 pm 
Offline
Senior member
User avatar

Posts: 166
Favourite Bot: ZBot
My implementation seems to be running ok now (12 ptbb/100 over 2k hands yesterday, pretty sick). I have only been playing at 0.05/0.1 FLFR in case there were some bugs left though, and it is a really small sample so it does not mean much. I do not think it can sustain this win rate over 10k hands, but I have good hope it is not loosing anymore.

The bot went from a completely loosing bot raising all the time to a winning one just by tweaking the showdown nodes. It seems that it is the most sensitive part of the AI for me. Do you use a single model to evaluate the showdown strength ? What I did at my showdown nodes : bucketing the players to build 10 different showdown models based on the players vpip and pfr, and a very simple calculation to determine which percentage of the pot I can win to get EV at SD. I had to rewrite completely my neural nets training to have better results. I also had to optimize my hands rollout a lot too, as that was the most intensive part of the AI (40 times faster since my first implementation, only by tweaking this part). I suppose you have less perfomance issue by running your bot in java though, c# is definitely not as fast as java.

Is your implementation fast enough to always give the same results when you run the MCTS on the same situation several times ? I tried that when I saw the bot playing strangely (not badly though) and it seems that I do not parse a sufficient part of the tree to have perfectly stable results.

I am now thinking about incorporating Bayesian inference in the models (especially the showdown ones) as the bot is quite agressive atm, and I am not sure it would perform so well at higher stakes where people have two eyes and a brain. I am not really sure that adding new inputs to the neural networks would improve the results, but I will give it a try. Have you been thinking about this kind of implementation ?


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Sun Nov 22, 2009 6:12 pm 
Offline
Regular member
User avatar

Posts: 64
Favourite Bot: MCTSBot
I have a couple of models that do the showdown EV calculation. The most intelligent one tries to predict the bucket of the rank of the hand of each opponent. It's certainly not perfect.
I've also noticed very agressive play by the MCTS bot.

I'm not sure if our decisions are stable wrt the number of samples. I'll ask our student to look into that more.


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Mon Nov 23, 2009 10:56 am 
Offline
Senior member
User avatar

Posts: 166
Favourite Bot: ZBot
guyvdb wrote:
I have a couple of models that do the showdown EV calculation. The most intelligent one tries to predict the bucket of the rank of the hand of each opponent. It's certainly not perfect.

I use a similar approach, a NN with about 30 inputs regarding the opponent play style and his actions during the hand with 10 outputs representing the buckets of the ranks.

The hands I use to train the models totally modify the bot behaviour. I have to find a clever way to bucket them. If I train them on slow played strong hands the bot is very passive, but if I train them on very aggressive opponents playing crappy hands to showdown the bot is really aggressive.

That means that the main issue here is to be able to categorize the kind of opponent I am facing very quickly. I will probably go for specific models for each opponent as soon as I have enough hands that went to showdown for him, but I am afraid this kind of network needs a lot of data to be accurate. The generic models I use at the moment do not seem to give enough weight to the player stats (af for each street / vpip / pfr, this kind of stuff).


Top
 Profile E-mail  
 
 Post subject: Re: Monte-Carlo Tree Search in Poker using ERD (2009)
PostPosted: Thu Feb 25, 2010 12:52 am 
Offline
Senior member
User avatar

Posts: 168
Favourite Bot: none
I am looking to code up a version of this for NL Holdem, though I'm willing to do it for Limit Holdem first if necessary.
Some questions:
1. How do we deal with chance nodes? It says in the paper to treat them like opponent nodes, but it doesn't seem analogous (the chance nodes are random after all). As zoobie said, seems like it could lead to a large game tree.
2. How does the simulation work, do you just play it out taking random actions? Do you play it out multiple times per iteration or just once? Does the distribution of the actions you take seem to affect outcomes?
3. How are you abstracting No Limit bet sizes?

Thanks, I'll probably have more questions later.


Top
 Profile E-mail  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 33 posts ]  Go to page 1, 2  Next


Who is online

Users browsing this forum: No registered users and 3 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to: