 Post subject: Bayes-Relational Learning of Opponent Models from Incomplete ...
PostPosted: Tue Jan 20, 2009 4:48 am 
Regular member

Posts: 84
Favourite Bot: My own
Bayes-Relational Learning of Opponent Models from Incomplete Information in No-Limit Poker
Marc Ponsen, Jan Ramon, Tom Croonenborghs, Kurt Driessens and Karl Tuyls

Abstract: We propose an opponent modeling approach for no-limit Texas Hold'em poker that starts from a (learned) prior, i.e., general expectations about opponent behavior, and learns a relational regression-tree function that adapts these priors to specific opponents. An important asset is that this approach can learn from incomplete information (i.e., without knowing all players' hands in training games).

Download: http://www.personeel.unimaas.nl/m-ponse ... ponsen.pdf


 Post subject: Re: Bayes-Relational Learning of Opponent Models from Incomplete ...
PostPosted: Mon Jan 26, 2009 11:00 pm 
Senior member

Posts: 331
Favourite Bot: code: d
Has anyone ever tried to recreate this "system" or used it as an opponent modeller?
I am not exactly sure how they create the prior. Does this uniform distribution just give every possible "example" the same probability? So learning the general prior is actually nothing other than learning which situations are more probable?

On the flop, they predict the action of an opponent with about 57-60% accuracy without considering his hole cards. Do you think this is a high value?

_________________
You can look at developing pokerbots as doing someone's wife. You want it, she wants it, he doesn't want it, but it's how life is. <indiana>


 Post subject: Re: Bayes-Relational Learning of Opponent Models from Incomplete ...
PostPosted: Tue Jan 27, 2009 12:21 pm 
Regular member

Posts: 64
Favourite Bot: MCTSBot
I'm using this system for my bot, yes. Part of the implementation is/will be available on http://code.google.com/p/cspoker/ .

The first prior just gives the {check,bet,call,raise,fold} probability in the given round. A function is then learned that adjusts this probability according to the behaviour of an average player in certain situations. Then a player-specific model is learned that adjusts that probability to better fit the behaviour of a single opponent.
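A minimal sketch of that two-stage layering (the probabilities, weights and function names here are invented for illustration, not taken from the paper):

```python
# Round-level prior over actions, then two layers of learned corrections:
# one for the average player, one for a specific opponent. All numbers
# are made up for illustration.

flop_prior = {"check": 0.30, "bet": 0.20, "call": 0.25, "raise": 0.10, "fold": 0.15}

def apply_correction(prior, weights):
    """Reweight a prior with multiplicative corrections (as a regression-tree
    leaf might supply) and renormalize to a proper distribution."""
    scores = {a: p * weights.get(a, 1.0) for a, p in prior.items()}
    total = sum(scores.values())
    return {a: s / total for a, s in scores.items()}

# Layer 1: how an average player deviates from the prior in this situation.
avg_model = apply_correction(flop_prior, {"bet": 1.5, "fold": 0.5})

# Layer 2: how this specific opponent deviates from the average player.
specific_model = apply_correction(avg_model, {"raise": 2.0, "check": 0.8})
```

The point of the layering is that the specific model only has to learn the (usually small) deviation from the average model, rather than the whole distribution from scratch.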


 Post subject: Re: Bayes-Relational Learning of Opponent Models from Incomplete ...
PostPosted: Tue Jan 27, 2009 1:23 pm 
PokerAI fellow

Posts: 1239
Favourite Bot: my bot
CodeD wrote:
I am not exactly sure how they create the prior. Does this uniform distribution just give every possible "example" the same probability? So learning the general prior is actually nothing other than learning which situations are more probable?


I'm not 100% sure I understand, but I think they first create an "average" (or generic) opponent model and then create "specific" opponent models. They use similar mechanisms in both cases: in the first case the prior is a uniform distribution; in the second case the prior is the distribution of the "average" opponent model.

Using the same mechanism for both learning exercises is just plain elegant. Learning the specific model from the generic model should be quicker than learning the specific model from scratch. Indeed, you might take this a step further and learn a specific model from another specific model.


 Post subject: Re: Bayes-Relational Learning of Opponent Models from Incomplete ...
PostPosted: Wed Jan 28, 2009 3:41 am 
Senior member

Posts: 331
Favourite Bot: code: d
spears wrote:
CodeD wrote:
I am not exactly sure how they create the prior. Does this uniform distribution just give every possible "example" the same probability? So learning the general prior is actually nothing other than learning which situations are more probable?


I'm not 100% sure I understand, but I think they first create an "average" (or generic) opponent model and then create "specific" opponent models. They use similar mechanisms in both cases: in the first case the prior is a uniform distribution; in the second case the prior is the distribution of the "average" opponent model.

Using the same mechanism for both learning exercises is just plain elegant. Learning the specific model from the generic model should be quicker than learning the specific model from scratch. Indeed, you might take this a step further and learn a specific model from another specific model.


Yeah, I must agree, it is pretty elegant and simple. Instead of differentiating from a single "generic" model, maybe they should generate X clusters of players with similar stats and derive specific player prototypes from those. Then they could learn how a new player differs from these prototypes and use them as multiple deciders, weighted by his "distance" in stats from those prototypes. At least it sounds like a nice idea ;)
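A rough sketch of that prototype-weighting idea (the clusters, stats and numbers are invented for illustration; the paper itself doesn't do this):

```python
import math

def blend(prototypes, player_stats):
    """Mix the action distributions of cluster prototypes, weighting each
    prototype by the inverse distance between its stats (e.g. VPIP/PFR)
    and the new player's stats."""
    weights = [1.0 / (math.dist(stats, player_stats) + 1e-6)
               for stats, _ in prototypes]
    total = sum(weights)
    actions = prototypes[0][1].keys()
    return {a: sum(w * dist[a] for w, (_, dist) in zip(weights, prototypes)) / total
            for a in actions}

# Two invented prototypes: a tight and a loose player cluster.
tight = ((0.15, 0.10), {"fold": 0.7, "call": 0.2, "raise": 0.1})
loose = ((0.60, 0.40), {"fold": 0.2, "call": 0.5, "raise": 0.3})

# A new player whose stats sit between the two clusters gets a blend.
mixed = blend([tight, loose], (0.30, 0.20))
```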

@guyvdb
Quote:
I'm using this system for my bot, yes. Part of the implementation is/will be available on http://code.google.com/p/cspoker/ .

The first prior just gives the {check,bet,call,raise,fold} probability in the given round. A function is then learned that adjusts this probability according to the behaviour of an average player in certain situations. Then a player-specific model is learned that adjusts that probability to better fit the behaviour of a single opponent.

Do you use the Tilde system for that, or something else?



 Post subject: Re: Bayes-Relational Learning of Opponent Models from Incomplete ...
PostPosted: Wed Jan 28, 2009 10:19 am 
Regular member

Posts: 64
Favourite Bot: MCTSBot
CodeD wrote:
Do you use the Tilde system for that, or something else?


Right now I only use the output of Tilde (the average player model). That will hopefully change in a couple of months.


 Post subject: Re: Bayes-Relational Learning of Opponent Models from Incomplete ...
PostPosted: Wed Jan 28, 2009 1:41 pm 
PokerAI fellow

Posts: 1239
Favourite Bot: my bot
What advantages does Tilde have over competing ML schemes such as ANNs and C4.5?


 Post subject: Re: Bayes-Relational Learning of Opponent Models from Incomplete ...
PostPosted: Wed Jan 28, 2009 2:33 pm 
Regular member

Posts: 64
Favourite Bot: MCTSBot
spears wrote:
What advantages does Tilde have over competing ML schemes such as ANNs and C4.5?


It can do relational learning, which is more expressive than the propositional learning done by ANNs or C4.5. It can recognize structure and relations between objects.
It can theoretically learn concepts such as (I don't want to imply that these concepts are useful)
Code:
position(bot,A), B=A+1, position(C,B), stack(C,D), D<100.

(the player sitting after the bot has a stack of less than 100)
or
Code:
aggressive(X), passive(Y), stack(X,S), stack(Y,T), S<T

(there is an aggressive player at the table and there is a passive player, and the passive player has the larger stack)
and use these in the nodes of the decision tree.

If you needed that expressivity with a propositional learner such as an ANN, you'd have to consider every possible combination of constraints and turn each into an individual numerical input of the ANN, which is not tractable.
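To make the contrast concrete, here is the first of those concepts evaluated directly on a relational game state in Python (the state representation and names are invented for illustration). A relational learner tests one such condition as a single tree node; a propositional encoding would need a separate input feature for every seat/threshold combination.

```python
def next_seat_short_stacked(positions, stacks, me, limit=100):
    """position(me,A), B=A+1, position(C,B), stack(C,D), D<limit --
    true if the player seated directly after `me` has a stack below `limit`."""
    n = len(positions)
    target = (positions[me] + 1) % n  # the seat just after `me`, wrapping around
    for player, seat in positions.items():
        if seat == target and player != me:
            return stacks[player] < limit
    return False

# An invented three-handed table: seat numbers and chip counts per player.
positions = {"bot": 0, "alice": 1, "carol": 2}
stacks = {"bot": 250, "alice": 80, "carol": 400}
```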


Last edited by guyvdb on Wed Jan 28, 2009 3:48 pm, edited 1 time in total.

 Post subject: Re: Bayes-Relational Learning of Opponent Models from Incomplete ...
PostPosted: Wed Jan 28, 2009 3:17 pm 
PokerAI fellow

Posts: 1239
Favourite Bot: my bot
Thanks, interesting.


 Post subject: Re: Bayes-Relational Learning of Opponent Models from Incomplete ...
PostPosted: Wed Jan 28, 2009 7:40 pm 
Senior member

Posts: 331
Favourite Bot: code: d
guyvdb wrote:
spears wrote:
What advantages does Tilde have over competing ML schemes such as ANN and C4.5


It can do relational learning, which is more expressive than the propositional learning done by ANNs or C4.5. It can recognize structure and relations between objects.
It can theoretically learn concepts such as (I don't want to imply that these concepts are useful)
Code:
position(bot,A), B=A+1, position(C,B), stack(C,D), D<100.

(the player sitting after the bot has a stack of less than 100)
or
Code:
aggressive(X), passive(Y), stack(X,S), stack(Y,T), S<T

(there is an aggressive player at the table and there is a passive player, and the passive player has the larger stack)
and use these in the nodes of the decision tree.

If you needed that expressivity with a propositional learner such as an ANN, you'd have to consider every possible combination of constraints and turn each into an individual numerical input of the ANN, which is not tractable.


Really? I would think the ANN would only need the stack sizes + aggressiveness (i.e. every property you have) and would figure such "context" out itself. That does not necessarily mean you could see that the ANN is doing this, since it is a black-box system. But shouldn't it (theoretically) be possible to learn any function with "enough" layers/hidden units?
But you are correct: if the Tilde system outputs such concepts, it gives much more insight into the opponents' strategies.



 Post subject: Re: Bayes-Relational Learning of Opponent Models from Incomplete ...
PostPosted: Wed Apr 01, 2009 10:42 pm 
Senior member

Posts: 165
Location: Vienna, Austria
Favourite Bot: it's toasted
Just skimmed it for now, but I will definitely read it more thoroughly tomorrow (it's 11:30 pm here).

It seems they are using (causal?) Bayes nets for modelling the players - that's a reinforcement learning technique, am I right? It sounds like a great idea - I wonder, however, whether other (newer) approaches to reinforcement learning could provide better results than those 60%.

I will definitely give this idea a shot.


 Post subject: Re: Bayes-Relational Learning of Opponent Models from Incomplete ...
PostPosted: Thu Feb 18, 2010 5:21 pm 
Junior member

Posts: 11
Favourite Bot: none yet
Hey guys,

I've also read the paper. Since I'm not a statistics expert, I'm not too sure I got the idea right.
So here is what I think / did not understand:
1) In general: generate a model for the average player -> generate model(s) for specific players, both based on observations (= available hand histories).
2) They generate the average model starting from a uniform distribution, i.e. at the beginning we assume all actions possible in a certain situation (fold, check, call, bet, raise) to be equally likely. After learning, the average model prefers some actions over others in certain situations.
3) Same procedure for generating the specific player model; the only difference is that we now use the average player model's action probabilities as the starting point.

Right so far?

Now some points I'm not too sure about:
1) The technique used to learn these models is a relational decision tree (TILDE). Do they create one tree for both models, or only one for the specific model? Or does the decision tree represent the difference between the average and specific models? What might such a tree look like?
2) They describe the learning task as predicting a) the outcome of a hand based on hand history, board and action, and b) the action of the opponent given hand history, board and outcome. But how does that work (or make sense)? To predict the outcome I need the action, and to predict the action I need the outcome. That makes no sense to me. Maybe someone can provide an easy example?
3) In the section "Learning the corrective function" they describe the task in terms of (a mixture of) distributions. How do you get the distributions D_p and D_*? Given these distributions, the learning task can also be described as deciding whether an arbitrary example x (hand history) belongs to D_* (average model) or D_p (specific model). In other words: how likely is it that x belongs to D_p, i.e. P(D_p | x)? For that task they use/construct the TILDE decision tree, right? How does that work exactly?
Moreover, they call P(D_p|x) the "learned corrective function" (= the decision tree). But isn't it a probability rather than a function?
As a goal we try to compute the probability P(x|D_p), i.e. how likely it is that we observe a specific hand x given the specific player model.

Maybe someone can enlighten me a bit :-)

Thanks.


 Post subject: Re: Bayes-Relational Learning of Opponent Models from Incomplete ...
PostPosted: Thu Feb 18, 2010 10:15 pm 
PokerAI fellow

Posts: 1239
Favourite Bot: my bot
This might be rubbish, but here goes:
Quote:
Do they create one tree for both models or only one for the specific model?
I think they build a model of both.
Quote:
What might such a tree look like?
Tilde is a type of decision tree learner. There are some nice little tutorials here:
http://decisiontrees.net/?q=node/21
http://www.autonlab.org/tutorials/dtree18.pdf
http://www.dtreg.com/classregress.htm
You could replace Tilde with any machine learner, including a neural network. It's pretty easy to try the technique out in WEKA.
Quote:
They describe the learning task as predicting .... and b) the action of opponent given hand history, board and outcome.
I think P(a_i|B_i, H_{i-1}, r_p) means the probability of the opponent taking an action, given a context, assuming the opponent holds certain cards.
Quote:
How do you get the distributions D_p and D_*?
They are just derived from hand histories. D_* is a set of randomly chosen tuples (i, p, a, r_p, H, B) and D_p is a set of similar tuples for a particular player. When people build opponent models using neural networks they employ similar tuples.
Quote:
But isn't it a probability rather than a function?
If you submit an example x to the model of D_p, it will tell you the probability that x came from D_p. So it's a function returning a probability.

Another way to look at it: you could build a neural network or other sort of statistical model of many different players, and also of all players. But how do you know which of these players is your current opponent? This method tells you.
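A minimal sketch of that identification step (the models and numbers are invented): if each stored model can score the likelihood of an observed example x, Bayes' rule turns those scores into a posterior over which player you are facing.

```python
def posterior(likelihoods, priors):
    """likelihoods[p] = P(x | player p), priors[p] = P(player p).
    Returns P(player p | x) for every candidate player."""
    joint = {p: likelihoods[p] * priors[p] for p in priors}
    z = sum(joint.values())
    return {p: v / z for p, v in joint.items()}

# An observed betting line x, scored by two hypothetical stored models:
# the loose player's model assigns x five times the likelihood.
post = posterior({"tight_tim": 0.02, "loose_lou": 0.10},
                 {"tight_tim": 0.5, "loose_lou": 0.5})
```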


Last edited by spears on Mon Sep 27, 2010 9:34 pm, edited 1 time in total.
url of tutorial fixed - thanks to birchy


 Post subject: Re: Bayes-Relational Learning of Opponent Models from Incomplete ...
PostPosted: Mon Sep 27, 2010 9:30 pm 
Senior member

Posts: 360
Favourite Bot: Zander
spears wrote:
Url update: http://decisiontrees.net/?q=node/21

_________________
http://www.bespokebots.com

