Poker-AI.org • View topic - Adaptive Opponent Modeling


All times are UTC

Adaptive Opponent Modeling

Page 3 of 3

[ 59 posts ]


spears

Post subject: Re: Adaptive Opponent Modeling

Posted: Tue May 20, 2014 9:36 pm

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

The goal is to cluster together hands of similar strength and variance. This is slightly different from clustering together hands that have a similar strategy, because you might expect, for example, a medium strength / low variance hand to be played the same way as a high strength / high variance hand. But your MCTS should find out how to play the different hands.

I put in the dummy point to force k-means into making more divisions by strength and fewer by variance. You are right about raising the strength to a power, but it hasn't been that successful. The justification for this is "expert knowledge", but it could be verified by testing bots using different schemes against one another.

You could of course use expert knowledge for the pre flop hands, but this becomes a large task for the post flop hands. That is why I advocated an algorithmic approach.
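
A rough sketch of the idea of pushing k-means toward splitting on strength rather than variance. It uses an axis weight and an optional power on the strength axis instead of the dummy point, so it is not the Weka setup described above; `cluster_hands`, `strength_weight` and `strength_power` are made-up names, and the sample numbers are invented.

```python
def _dist2(a, b):
    # Squared Euclidean distance between two 2-D points.
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def cluster_hands(points, k, strength_weight=1.0, strength_power=1.0, iters=50):
    """Lloyd's k-means on (mean strength, variance) pairs.

    The strength axis is raised to strength_power and multiplied by
    strength_weight before clustering, so distances along that axis
    count for more and the clusters tend to divide by strength.
    """
    scaled = [((s ** strength_power) * strength_weight, v) for s, v in points]

    # Deterministic farthest-point initialisation (an assumption made
    # here so the example is reproducible).
    centers = [scaled[0]]
    while len(centers) < k:
        centers.append(max(scaled, key=lambda p: min(_dist2(p, c) for c in centers)))

    labels = [0] * len(scaled)
    for _ in range(iters):
        # Assignment step: nearest centre for every point.
        labels = [min(range(k), key=lambda c: _dist2(p, centers[c])) for p in scaled]
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(scaled, labels) if lab == c]
            if members:
                centers[c] = (sum(m[0] for m in members) / len(members),
                              sum(m[1] for m in members) / len(members))
    return labels


# Invented (mean strength, variance) values for six hands.
hands = [(0.2, 0.01), (0.2, 0.09), (0.5, 0.01), (0.5, 0.09), (0.8, 0.01), (0.8, 0.09)]
labels = cluster_hands(hands, k=3, strength_weight=10.0)
```

Whether a weight, a power or a dummy point gives the better buckets is exactly the kind of thing that would have to be checked by playing the resulting bots against each other.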


spears

Post subject: Re: Adaptive Opponent Modeling

Posted: Wed May 21, 2014 9:05 am

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

- What is the timescale for this work?
- Could you summarize the current project objectives and plan?
- I'm wondering if you could use some overall project advice, rather than technical details which is what I've concentrated on so far.
- Given your initial stated objective, the strength mean/variance is something of a distraction if you are in a hurry. You could use EHS2 instead; see page 25 of http://poker.cs.ualberta.ca/publication ... on.msc.pdf
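
For reference, a minimal sketch of EHS2 as I understand it from that thesis: the expectation of the squared final hand strength over the possible roll-outs. The function name and the sample numbers are invented; the hand strengths themselves would come from your own evaluator.

```python
def ehs_and_ehs2(final_strengths):
    """Return (E[HS], E[HS^2]) for a list of final hand strengths.

    Each entry is the hand strength (0..1) after one possible roll-out
    of the remaining board cards, all roll-outs equally likely.
    """
    n = len(final_strengths)
    ehs = sum(final_strengths) / n
    ehs2 = sum(hs * hs for hs in final_strengths) / n
    return ehs, ehs2


# A made hand that is always mediocre vs a draw that either hits or misses.
made_hand = ehs_and_ehs2([0.5, 0.5, 0.5, 0.5])
draw = ehs_and_ehs2([0.0, 0.0, 1.0, 1.0])
```

Both hands have the same E[HS] of 0.5, but the draw gets the higher EHS2, so a single number already captures some of what the separate mean/variance pair was meant to capture.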


Sacré d'Jeu

Post subject: Re: Adaptive Opponent Modeling

Posted: Wed May 21, 2014 2:41 pm

Junior Member

Joined: Sun Mar 17, 2013 10:03 pm
Posts: 25

spears wrote:

- In about two weeks, I've to hand in my research.
- The goal is to build as good a pokerbot as possible (NL, multiplayer) with adaptive opponent modeling. I hope to have a working pokerbot by the beginning of next week. Then I've got a week for tweaking, testing and improving.
But the goal is not the most important part. It's more important that I can show research and development, so don't worry about the result too much.

- My supervisor wasn't present today, so a final decision on the bucketing will be for tomorrow, but I'm guessing we will raise the means by some power (and maybe also the variance instead of a dummy point) and then use a clustering algorithm. The end results will not differ much though.
And with that, the outline of the bucketing is finished. I'll calculate the mean and variance after every possible flop for every pair of hole cards, so I can create transition tables. Then I'll do the same for flop->turn and turn->river.
(You are very kind to help me so much, and I really appreciate it!)
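
A small sketch of how such a transition table could be built once every (hole cards, board) combination has been assigned a bucket on both streets. The bucket ids and the `transition_table` name are placeholders, not part of any existing code.

```python
from collections import Counter, defaultdict


def transition_table(bucket_pairs):
    """Estimate P(next bucket | current bucket) from observed pairs.

    bucket_pairs is an iterable of (bucket_before, bucket_after), one
    entry per enumerated or sampled card that moves to the next street.
    """
    counts = defaultdict(Counter)
    for before, after in bucket_pairs:
        counts[before][after] += 1
    return {
        before: {after: n / sum(row.values()) for after, n in row.items()}
        for before, row in counts.items()
    }


# Invented example: bucket 0 stays in 0 once and moves to 1 twice.
table = transition_table([(0, 0), (0, 1), (0, 1), (1, 1)])
```

The same function can be reused for preflop->flop, flop->turn and turn->river, each with its own list of pairs.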

So, next problem: the simplified gamestate I. How can I describe the gamestate with a small number of features, so that I can accurately model (most of) the opponents' possible strategies P(a | b*, I)?
I'm thinking the belief distribution b* over the opponent's hole cards already carries much of the information about the opponent's previous actions, so I don't need to include such information here (correct me if I'm wrong).

Here is a first thought about the features I'll use:
- round (I'm thinking about eventually using a different model for each round, or preflop-postflop, but for now, I'll use only one)
- relative stacksize player (vs. potsize)
- absolute stacksize player (in BB)
- position (only against players still in the hand)
- relative amount to call (vs stacksize)
- absolute amount to call (in BB)
- size last raise (in BB)

- number of opponents (at the start of the hand)
- number of active opponents (= players who are still in the hand)
- average stacksize active opponents (or should I use max(player stacksize, opponent stacksize) and take the average of that?)
- average VPIP, AF and frequency actions active opponents

- number of opponents that raised this round
- average stacksize of active opponents that raised this round (same note as above)
- average VPIP, AF and frequency actions of active opponents that raised this round

- number of players all-in
- number of hands played
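
To make the first few features concrete, here is a sketch of how they might be turned into a numeric vector. The `TableState` fields, the scaling choices and the example values are all assumptions for illustration; the opponent statistics (VPIP, AF, etc.) are left out.

```python
from dataclasses import dataclass


@dataclass
class TableState:
    betting_round: int        # 0 = preflop, 1 = flop, 2 = turn, 3 = river
    stack: float              # player's stack in chips
    pot: float                # current pot in chips
    to_call: float            # amount the player has to call, in chips
    last_raise: float         # size of the last raise, in chips
    big_blind: float          # big blind, in chips
    opponents_at_start: int   # opponents dealt in at the start of the hand
    active_opponents: int     # opponents still in the hand


def state_features(s):
    """Map a TableState to a flat feature list in the order of the post."""
    return [
        s.betting_round / 3.0,                        # round, scaled to 0..1
        s.stack / s.pot if s.pot > 0 else 0.0,        # relative stack size (vs pot)
        s.stack / s.big_blind,                        # absolute stack size (in BB)
        s.to_call / s.stack if s.stack > 0 else 0.0,  # relative amount to call
        s.to_call / s.big_blind,                      # absolute amount to call (in BB)
        s.last_raise / s.big_blind,                   # size of last raise (in BB)
        float(s.opponents_at_start),
        float(s.active_opponents),
    ]


example = state_features(TableState(1, 200.0, 50.0, 10.0, 10.0, 2.0, 5, 2))
```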
------------------------------------------------------------------------------------
I've also thought about:
1. The player's own statistics (VPIP, AF, frequency of actions), but I'm not sure, because the opponent does not make decisions based on his own statistics. Furthermore, I guess it might cause the model to concentrate too much on these features. It should be a great help for the default model, as it would lead to different strategies for differently characterised players. But for an opponent-specific model, only statistics based on his last x actions/hands would make a difference, right?

2. Including information about the belief distribution of the opponents, as this implies information about the action sequence. (That would mean I'll have to keep a belief distribution for myself too.)

3. Whether there are any statistics I don't need to use because I'm using MCTS.

4. Information about the board (e.g. dry/wet board). I have statistics on this from the bucketing calculation. I don't think they are already implied by the belief distribution, so I could give, for example, the mean variance of the specific board over all possible hole cards as a feature.
-----------------------------------
For the model, I'm thinking to use a NN with the belief distribution and the described gamestate as input, but I'll consult my supervisor for that one too.


spears

Post subject: Re: Adaptive Opponent Modeling

Posted: Wed May 21, 2014 4:11 pm

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

Sacré d'Jeu wrote:

spears wrote:

- Thinking about writing an adaptive bot in less than two weeks makes me feel ill.
- Maybe you could build a reinforcement learning bot that learns which actions are good and which are bad given the strength of your hand and the context. That would be much less work than what you are doing at the moment.
- I'll try to think of some more ideas to cut down the work


jukofyork

Post subject: Re: Adaptive Opponent Modeling

Posted: Thu May 22, 2014 11:41 am

Junior Member

Joined: Thu Nov 14, 2013 2:56 pm
Posts: 12

spears wrote:

I've tried clustering in Weka. Example attached. I've messed about with stretching and compressing the strength axis, but it hasn't been very successful. Need to think about this some more. Also wondering if points should be weighted by the number of instances. Will return to this tonight/tomorrow.

Have a look at using "Principal Component Analysis" and/or "Spatial Sign Transformation".

Juk


spears

Post subject: Re: Adaptive Opponent Modeling

Posted: Thu May 22, 2014 12:24 pm

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

jukofyork wrote:

spears wrote:

Have a look at using "Principal Component Analysis" and/or "Spatial Sign Transformation".

Juk

I think I understand PCA but not Spatial Sign Transformation. Is your idea to divide the space with rectangles oriented on the principal axes?


jukofyork

Post subject: Re: Adaptive Opponent Modeling

Posted: Thu May 22, 2014 1:19 pm

Junior Member

Joined: Thu Nov 14, 2013 2:56 pm
Posts: 12

spears wrote:

Is your idea to divide the space with rectangles oriented on the principal axes?

Here is what you get with the pocket pairs removed:

So you can now break that up into say 3x3 or 4x4 (using either equal areas or equal densities), transform the bounding coordinates back to your original space, and then separately break the pocket pairs in 2, 3 or 4 more clusters, etc.

By the look of it, you might get even better clusters by scaling the factor 2 axis or using quadrilaterals instead of rectangles.
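
A sketch of the rotate-then-grid idea for 2-D (strength, variance) points, written without any libraries. It assigns grid cells directly in the rotated space rather than transforming the cell boundaries back, and it does not handle the pocket pairs separately; all function names are made up for this example.

```python
import math
from bisect import bisect_right


def pca_rotate(points):
    """Centre 2-D points and rotate them onto their principal axes."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Angle of the major axis of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    return [((x - mx) * c + (y - my) * s, -(x - mx) * s + (y - my) * c)
            for x, y in points]


def equal_density_bins(values, n_bins):
    """Bin index per value, with cut points taken at the quantiles."""
    ordered = sorted(values)
    cuts = [ordered[len(ordered) * i // n_bins] for i in range(1, n_bins)]
    return [bisect_right(cuts, v) for v in values]


def grid_buckets(points, n_bins=3):
    """Bucket id per point on an n_bins x n_bins equal-density grid."""
    rotated = pca_rotate(points)
    b1 = equal_density_bins([r[0] for r in rotated], n_bins)
    b2 = equal_density_bins([r[1] for r in rotated], n_bins)
    return [i * n_bins + j for i, j in zip(b1, b2)]


# Points on a diagonal line: all the spread ends up on factor 1.
line = [(float(i), float(i)) for i in range(9)]
rotated_line = pca_rotate(line)
bins = equal_density_bins([5, 1, 3, 2, 4, 6], 3)
buckets = grid_buckets(line, 3)
```

Swapping `equal_density_bins` for evenly spaced cut points would give the equal-area variant.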

Depending on exactly what you want to do, you might also be able to improve on the "stepiness" of the clustered approximations by interpolation via triangular fuzzy set membership, etc.
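
For the interpolation idea, a minimal 1-D sketch of triangular fuzzy membership: a value between two bucket centres belongs partly to each, with weights that fall off linearly. The centres and per-bucket values below are invented.

```python
from bisect import bisect_right


def triangular_memberships(x, centers):
    """Return {bucket index: weight} for value x.

    centers must be sorted ascending. Weights are linear in the
    distance to the two neighbouring centres and sum to 1.
    """
    if x <= centers[0]:
        return {0: 1.0}
    if x >= centers[-1]:
        return {len(centers) - 1: 1.0}
    hi = bisect_right(centers, x)
    lo = hi - 1
    w_hi = (x - centers[lo]) / (centers[hi] - centers[lo])
    return {lo: 1.0 - w_hi, hi: w_hi}


def interpolate(x, centers, bucket_values):
    """Blend per-bucket values by membership instead of a hard lookup."""
    return sum(w * bucket_values[i]
               for i, w in triangular_memberships(x, centers).items())


centers = [0.0, 0.5, 1.0]
weights = triangular_memberships(0.25, centers)
blended = interpolate(0.25, centers, [0.0, 10.0, 20.0])
```

With hard buckets the value at 0.25 would jump from 0 to 10 somewhere between the first two centres; with the memberships it moves smoothly through 5.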

The Spatial Sign transformation might map the values to something interesting in 1-dimensional space so it's worth a try.
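
As far as I know, the spatial sign transformation standardises each feature and then divides every sample by its Euclidean norm, so all points end up on the unit circle (in 2-D) and only a direction is left. A small sketch under that assumption, with invented function names and data:

```python
import math


def spatial_sign(points):
    """Standardise each column, then project every row onto the unit sphere."""
    n = len(points)
    dims = len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dims)]
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    stds = [math.sqrt(sum((p[d] - means[d]) ** 2 for p in points) / n) or 1.0
            for d in range(dims)]
    result = []
    for p in points:
        z = [(p[d] - means[d]) / stds[d] for d in range(dims)]
        norm = math.sqrt(sum(v * v for v in z))
        # A point exactly at the centre has no direction; leave it as is.
        result.append(tuple(v / norm for v in z) if norm > 0 else tuple(z))
    return result


def direction_angles(points):
    """The single remaining coordinate in 2-D: the angle of each point."""
    return [math.atan2(y, x) for x, y in spatial_sign(points)]


square = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
signs = spatial_sign(square)
angles = direction_angles(square)
```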

Juk


spears

Post subject: Re: Adaptive Opponent Modeling

Posted: Thu May 22, 2014 1:38 pm

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

Neat, thanks.


HontoNiBaka

Post subject: Re: Adaptive Opponent Modeling

Posted: Tue May 27, 2014 1:40 pm

Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267

I am currently working on my advisor again, currently it is pretty much just a set of rules.

My planned approach is to learn a model on a big database with Regression Trees and to classify each opponent I play against based on that model.

I think it will cover my bluffs pretty decently. For instance I am planning to learn how often a player folds to a 4Bet after he 3Bets. I plan to use his 3bet as feature, the 4Bet% of the 4 bettor, the 4 bettors fold to 5bet, PFR, a few moving averages of those stats and special features, like how many 4bets did the 4 bettor make in a row etc.

If I set fold to 1 and call/5Bet to 0, I should get a decent percentage of how often players fold to a 4bet in those situations. I will also learn models of how often they call vs how often they fold (basically 1-vs-all classification, to get a multiclass regression).

I plan to do the same for continuation bets, for instance, with the feature space now also containing the flop cards.

Since it does not matter that much for my bluffs exactly which range he continues with (what really counts is his folding %), I think this should be a relatively solid model.

Then I classify my opponents based on that, for instance all players who I have 0 hands from will be classified the same way, and also all players who have the exact same stats at the moment will be classified the same. In my mind that model will allow more variations, than clustering.

When it comes to value betting it will of course be more difficult, as it will when it comes to calling, because here his range matters. The only exception may be calling preflop all-ins, because then he cannot fold anymore and I can see all his all-in hands, so I can learn a vector of probabilities for each hand with a relatively small bias.

When it comes to actually determining a range of hands, I was thinking about a semi-GTO approach. Basically I was thinking about taking the % of hands my regression tells me he is holding and, through some sort of fictitious play, determining the hands that would have a high EV against my perceived range.

I don't know if my ideas make sense though, I have only used CFRM so far, but of course that won't help much for 6 max.


Sacré d'Jeu

Post subject: Re: Adaptive Opponent Modeling

Posted: Tue May 27, 2014 1:47 pm

Junior Member

Joined: Sun Mar 17, 2013 10:03 pm
Posts: 25

Hey guys, a brief update:

- We've decided to include the bucketing in our research. I'm going to compare different bucketing options, some of them discussed here.
- I'm implementing a good part of the PT statistics into the feature-set. I hope to finish this today.

Now I need training data to build a general model. I want to use a mix of table-sizes and blindsizes, all playing NL Texas Hold'em.
I've read here somewhere about a website where you could buy these, but I've forgotten the name.

And I've been thinking about testing and how to compare different implementations:
- Playing against other bots: alwaysCall, alwaysRaise, SimpleBot.
Are there any other bots shared in the pokerworld, that I could use?

- Playing against each other: in a heads-up setting, this is obvious. I guess, if you want to test with more players, you put in 'neutral' bots with the two bots you want to compare.

- Johansson used exploitability against exploitation, but I feel that this would be too much work to implement.


ibot

Post subject: Re: Adaptive Opponent Modeling

Posted: Tue May 27, 2014 1:57 pm

Regular Member

Joined: Tue Mar 05, 2013 9:19 pm
Posts: 50

Sacré d'Jeu wrote:

Now I need training data to build a general model. I want to use a mix of table-sizes and blindsizes, all playing NL Texas Hold'em.
I've read here somewhere about a website where you could buy these, but I've forgotten the name.

HandHQ released a large database of hand histories from several sites and in several different limits, see the second post by spears here: HandHQ DB.

Sacré d'Jeu wrote:

And I've been thinking about testing and how to compare different implementations:
- Playing against other bots: alwaysCall, alwaysRaise, SimpleBot.
Are there any other bots shared in the pokerworld, that I could use?

There are a limited number of multiplayer bots available; however, there is an MCTSBot that you can use via opentestbed, along with a few others.

Sacré d'Jeu wrote:

- Playing against each other: in a heads-up setting, this is obvious. I guess, if you want to test with more players, you put in 'neutral' bots with the two bots you want to compare.

I can't imagine you'll learn much about multiplayer play using heads-up matches. I would go for running simulations against different types of players and in several configurations. The more tests and the more hands played, the better.

HontoNiBaka wrote:

I think it will cover my bluffs pretty decently. For instance I am planning to learn how often a player folds to a 4Bet after he 3Bets. I plan to use his 3bet as feature, the 4Bet% of the 4 bettor, the 4 bettors fold to 5bet, PFR, a few moving averages of those stats and special features, like how many 4bets did the 4 bettor make in a row etc.

The problem with 4bet/5bet etc. stats is that they require following a player for a large number of hands before you get anything usable. This is generally the problem with simulation-based approaches. In theory it sounds like a solid enough approach; in practice, however, getting the data you need on each player is quite tough.


spears

Post subject: Re: Adaptive Opponent Modeling

Posted: Tue May 27, 2014 2:07 pm

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

Quote:

When you have no data on villain, model him as the weighted average of all villains. When you have a little data on villain, model him as the weighted average of all villains using the most frequent stats. Use less frequent stats the more data you have on villain.
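
One simple way to get this behaviour for a single stat is to blend villain's observed frequency with the population average using pseudo-counts. This is only one reading of the advice above, and `prior_strength` is an invented tuning parameter.

```python
def shrunk_stat(successes, opportunities, population_rate, prior_strength=20.0):
    """Blend a player's observed frequency with the population average.

    prior_strength acts like that many imaginary observations at the
    population rate. With no data the result is the population rate;
    with lots of data it approaches successes / opportunities.
    """
    return (successes + prior_strength * population_rate) / (opportunities + prior_strength)


# No hands on villain yet: fall back to the population's fold rate.
unknown_villain = shrunk_stat(0, 0, population_rate=0.6)
# Ten folds in ten opportunities, weak prior of ten pseudo-observations.
some_data = shrunk_stat(10, 10, population_rate=0.5, prior_strength=10.0)
```

Rare stats such as fold to 4bet have few opportunities, so they stay close to the population value for a long time, while frequent stats such as VPIP move toward villain's own numbers quickly.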


spears

Post subject: Re: Adaptive Opponent Modeling

Posted: Tue May 27, 2014 5:26 pm

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

HontoNiBaka wrote:

I am currently working on my advisor again, currently it is pretty much just a set of rules.

What terms are the rules written in? eg
- If I have a flush draw and there is so much in pot and villain has 3 bet then check
- If I have a 70% chance of winning at showdown vs a uniform hand yada yada
- Something else?

Are they deterministic, ie give the same action for the same situation, or non-deterministic, ie give a probability of action in a given situation?


HontoNiBaka

Post subject: Re: Adaptive Opponent Modeling

Posted: Wed May 28, 2014 11:52 am

Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267

spears wrote:

HontoNiBaka wrote:

I am currently working on my advisor again, currently it is pretty much just a set of rules.

I am using a combination of 2 models. The first one is pretty much like you describe it, it is my exploitive model.

For instance I have an AK high FD, villain raises me on a flop, he has a high fold to flop 3bet -> I 3bet.

The only opponent modeling I am using here are stats and laplace corrected stats.

My second model is a pseudo GTO model based on expert knowledge and some basic formulas.

For instance, I have my standard opening ranges based on a database analysis of winning players. The opening range itself is not pseudo-GTO yet, but if I face a 3bet, for instance, I will only fold 55-60% of my range, so no one can exploit me through loose 3betting.

On the flop, for instance, I will only fold 40-50% to a pot-sized continuation bet. My opponent is getting 1:1 odds on a potential bluff and also has some equity, which is why I don't fold exactly 50% as the odds themselves would dictate.
I rank my hands in this spot based on EHS2 and some heuristics and select the worst ones for a fold, the best ones for a raise and the ones in the middle for a call.
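
A sketch of the arithmetic and the ranking step as described. A bet of size b into a pot of size p breaks even as a pure bluff when it gets a fold b / (p + b) of the time, which is 50% for a pot-sized bet. The function names, the fractions and the scores are made up, and the split treats every hand as one equally weighted entry.

```python
def bluff_break_even(bet, pot):
    """Fold frequency at which a zero-equity bluff breaks even."""
    return bet / (pot + bet)


def split_range(scored_hands, fold_frac, raise_frac):
    """Split a range into (fold, call, raise) by hand score.

    scored_hands is a list of (hand, score) pairs, e.g. scored by EHS2.
    The worst fold_frac of hands fold, the best raise_frac raise and
    the rest call.
    """
    ranked = sorted(scored_hands, key=lambda hs: hs[1])
    n = len(ranked)
    n_fold = int(n * fold_frac)
    n_raise = int(n * raise_frac)
    fold = [h for h, _ in ranked[:n_fold]]
    call = [h for h, _ in ranked[n_fold:n - n_raise]]
    raise_ = [h for h, _ in ranked[n - n_raise:]]
    return fold, call, raise_


pot_bet = bluff_break_even(bet=1.0, pot=1.0)
# Ten placeholder hands with scores 0.0, 0.1, ..., 0.9.
hands = [("hand%d" % i, i / 10.0) for i in range(10)]
fold, call, raise_ = split_range(hands, fold_frac=0.4, raise_frac=0.2)
```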

Both models are deterministic.

I get some real randomness by combining both models. For instance, I might conclude, based on the number of hands I have on my opponent, that I want to play my exploitive strategy 20% of the time and my pseudo-GTO model 80% of the time; a random number generator decides which one I use.
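
A minimal sketch of that mixing step. How the exploitive share grows with the sample size is not stated in the post, so the linear ramp, `full_trust_hands` and `max_exploit_weight` are assumptions.

```python
import random


def exploit_weight(hands_on_villain, full_trust_hands=500, max_exploit_weight=0.8):
    """Share of decisions given to the exploitive model.

    Grows linearly with the number of hands on villain and is capped
    at max_exploit_weight once full_trust_hands have been seen.
    """
    return min(1.0, hands_on_villain / full_trust_hands) * max_exploit_weight


def choose_model(hands_on_villain, rng):
    """Pick which deterministic model makes this decision."""
    if rng.random() < exploit_weight(hands_on_villain):
        return "exploitive"
    return "pseudo_gto"


rng = random.Random(1)
# With 125 hands the exploitive share is 0.25 * 0.8 = 0.2, i.e. the 20/80 split.
share = exploit_weight(125)
picks_without_data = [choose_model(0, rng) for _ in range(100)]
```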

Of course I see a lot of weaknesses in my approach. The biggest one seems to be that I don't look far into the future when making decisions. I might make a good 3bet bluff based on his fold to 3bet and his fold to continuation bet, but I will never be able to plan my exploit through the whole hand. A good algorithm would already consider his fold to river bet and loosen up my preflop range based on situations that will occur much later in the hand.

The same goes for my pseudo-GTO approach: my preflop play may already give me such a weak river range that making a call that would be balanced if you looked at the river play in isolation will be very bad in the context of the whole hand.


shalako

Post subject: Re: Adaptive Opponent Modeling

Posted: Wed May 28, 2014 4:37 pm

Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269

Quote:

For instance, I have my standard opening ranges based on a database analysis of winning players. The opening range itself is not pseudo-GTO yet, but if I face a 3bet, for instance, I will only fold 55-60% of my range, so no one can exploit me through loose 3betting.

I am very interested in your winning player database. How did you come up with that?

Quote:

I rank my hands in this spot based on EHS2 and some heuristics and select the worst ones for a fold, the best ones for a raise and the ones in the middle for a call.

I would be careful using the above approach, as it is exploitable. You're going to have to slow play some of your value hands on the flop, or else you're telegraphing that you have a marginal holding. Balancing this out is not easy. Move some of your floating hands into your raising range (like backdoor combo draws) and some of your value hands into your calling range.

Quote:

Of course I see a lot of weaknesses in my approach. The biggest one seems to be that I don't look far into the future when making decisions. I might make a good 3bet bluff based on his fold to 3bet and his fold to continuation bet, but I will never be able to plan my exploit through the whole hand. A good algorithm would already consider his fold to river bet and loosen up my preflop range based on situations that will occur much later in the hand.

Yeah, the pros seem to have things all planned out for future streets, especially when it comes to taking away the pot later in the hand. Floating with complete air with the intention of making a big bluff later is not easy for a bot to do. However, if the villain's range is mostly air, then I suppose the GTO approach would be to call X% of the time so that he is not getting value on his bluffs, and to raise or check-raise on one of the later streets, preferably on a scare card or something. Technically it would be better to float with hands that have at least some equity if called, I guess.

So maybe considering his fold to river bet % is not really what you should do, and it should have more to do with his range and hand combinatorics?


HontoNiBaka

Post subject: Re: Adaptive Opponent Modeling

Posted: Wed May 28, 2014 7:48 pm

Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267

I am of course also slowplaying, bluffraising etc. But even for that I use a parametrization based on my EHS2 and on the river my HS.

I am pretty much parameterising my range like it is described in Mathematics of Poker.

I don't believe that you should float with hands that have absolutely zero equity. The good bots also almost never do that. You should use backdoor draws, overcards etc. for your floats and bluff raises, and only use some very trashy hands in very rare situations, so you can hit different boards.

Basing the whole thing on his distribution is of course a good idea, you should for instance bluff more and bet bigger if your range is much stronger than his, but in practice it is pretty hard to determine his range correctly if he also adapts.


shalako

Post subject: Re: Adaptive Opponent Modeling

Posted: Wed May 28, 2014 9:01 pm

Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269

Quote:

I don't believe that you should float with hands that have absolutely zero equity. The good bots also almost never do that. You should use backdoor draws, overcards etc. for your floats and bluff raises, and only use some very trashy hands in very rare situations, so you can hit different boards.

Yeah, mine does not float with any zero-equity hands either. As you said, it's backdoor draws, overcards, a pair plus an overcard, etc.

Quote:

Basing the whole thing on his distribution is of course a good idea, you should for instance bluff more and bet bigger if your range is much stronger than his, but in practice it is pretty hard to determine his range correctly if he also adapts.

Well, that is the trick: determining his range correctly. It is the key to everything, imo. I spent a good year working on this, trying to get the bot to range as accurately as possible from preflop to the river. It was not an easy task. I had the main code after three weeks, but the fine tuning was extremely time consuming. The little differences made a big impact on equity calculations. Studying how the pros do this was invaluable, one player in particular.

What I really want to do next is test the range finder against HH files that go to showdown to determine the accuracy (i.e. whether the hand was in the bot's predicted range). By the river you can seriously narrow his range just from the fact that he can only be betting a small number of hands for value.

The grey areas are the 3bet bluff ranges and slow-playing ranges. Some people think you should never eliminate hands from a person's range and should weight them instead, but I am not so sure about that. I came up with a solution to those problems and so far it is working OK, but it still bothers me that it could be more accurate.


proud2bBot

Post subject: Re: Adaptive Opponent Modeling

Posted: Thu May 29, 2014 4:05 pm

Senior Member

Joined: Mon Mar 11, 2013 10:24 pm
Posts: 216

I think weighting hands is always more accurate than a binary model. There are some cases where the weight will be 0, e.g. a 5bet preflop with 72o should be 0 or very close to it. But a hand like 55 on an 865r board, for example, could sometimes call and sometimes raise versus a cbet. After villain calls, you don't want to assume that he has all sets, nor do you want to assume that he never has a set, so you actually want to weight its probability.
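
A sketch of what weighting means mechanically: multiply each hand's current weight by the probability that villain takes the observed action with that hand, then renormalise. The hands and probabilities below are invented, and in practice the action probabilities would come from the opponent model.

```python
def update_range(weights, action_likelihood):
    """Bayesian update of a weighted range after one observed action.

    weights: {hand: prior weight}.
    action_likelihood: {hand: P(observed action | hand)}; hands missing
    from it are treated as never taking the action.
    """
    posterior = {h: w * action_likelihood.get(h, 0.0) for h, w in weights.items()}
    total = sum(posterior.values())
    if total == 0:
        # The model says the action was impossible; keep the prior
        # rather than dividing by zero.
        return dict(weights)
    return {h: w / total for h, w in posterior.items()}


prior = {"55": 0.25, "A8": 0.5, "72": 0.25}
# Assumed: the set calls half the time (raises otherwise), top pair
# always calls, air never calls.
p_call = {"55": 0.5, "A8": 1.0, "72": 0.0}
after_call = update_range(prior, p_call)
```

The set is neither removed from nor fully kept in the calling range; it simply ends up with less weight than a hand that always calls.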

I'd be very interested in a comparison of different algorithms for predicting opponents' ranges. Obviously we will always have a bias w.r.t. the folded hands (which we will not see). However, even if we only focus on hands that went to SD, what does a good evaluator function look like? It obviously can't just check whether the actual hand is in the range and return 1 if so and 0 otherwise; that way a stupid predictor (always 100% range) would be best. So we need to combine:
- how much weight does the actual hand have in the prediction range
- how large is the predicted range
Ideally, the measure should be scaled from 0 to 1, but I'm still lacking a good idea to normalize it. Did you put some thought into this? If so, it would be cool if we could select like 1M games from real money games as a test bed and compare our algorithms to see how an algorithm compares to others.
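
One candidate, offered only as a suggestion and not something settled in this thread: normalise the predicted weights so they sum to 1, score each showdown by the log of the weight given to the actual hand, and rescale against a uniform predictor. A wide range then automatically gives little weight to each hand. `n_hands=1326` assumes all two-card combos are possible, which ignores known board and hole cards.

```python
import math


def range_score(predictions, n_hands=1326, eps=1e-9):
    """Score a range predictor on showdown hands.

    predictions: list of (weights, actual_hand) pairs, where weights is
    {hand: non-negative weight}. Returns 1.0 for a perfect predictor
    and 0.0 for a uniform one; it goes negative for a predictor that
    does worse than uniform.
    """
    total_loss = 0.0
    for weights, actual in predictions:
        z = sum(weights.values())
        p = weights.get(actual, 0.0) / z if z > 0 else 0.0
        # eps keeps the loss finite when the actual hand was ruled out.
        total_loss += -math.log(max(p, eps))
    mean_loss = total_loss / len(predictions)
    return 1.0 - mean_loss / math.log(n_hands)


# Toy game with four possible hands.
uniform = {"AA": 1.0, "KK": 1.0, "QQ": 1.0, "JJ": 1.0}
always_wide = range_score([(uniform, "AA")], n_hands=4)
perfect = range_score([({"AA": 1.0}, "AA")], n_hands=4)
```

The measure tops out at 1 but is not bounded below by 0, so it only partly meets the 0-to-1 wish; clipping at 0 would be one option.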


shalako

Post subject: Re: Adaptive Opponent Modeling

Posted: Thu May 29, 2014 11:42 pm

Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269

Quote:

Yeah, it has been a dilemma for me for quite a while. When the villain is leading the betting postflop it's very accurate, but when the bot is leading and the villain is check-calling it gets difficult because of the slow-playing range. So what I did was keep the possible value hands in a separate range in case he decided to "let me know" he had a hand and raise me. So I have assumed that he does NOT have those hands until he lets me know that he does, which is why I have avoided weighting.

Quote:

I'd be very interested in a comparison of different algorithms for predicting opponents' ranges. Obviously we will always have a bias w.r.t. the folded hands (which we will not see). However, even if we only focus on hands that went to SD, what does a good evaluator function look like? It obviously can't just check whether the actual hand is in the range and return 1 if so and 0 otherwise; that way a stupid predictor (always 100% range) would be best. So we need to combine:
- how much weight does the actual hand have in the prediction range
- how large is the predicted range
Ideally, the measure should be scaled from 0 to 1, but I'm still lacking a good idea to normalize it. Did you put some thought into this? If so, it would be cool if we could select like 1M games from real money games as a test bed and compare our algorithms to see how an algorithm compares to others.

Hmm, I see what you mean now. I will think on this.
