Poker-AI.org

Poker AI and Botting Discussion Forum

All times are UTC




Posted: Mon Dec 02, 2013 12:00 pm
Junior Member

Joined: Fri Nov 08, 2013 6:21 pm
Posts: 17
Hi,
we just introduced ourselves in another section of this very interesting forum.

We are a team of two people: a very experienced poker player and a world-class physicist specialized in stochastic and Monte Carlo simulations and all that jazz. The physicist knows nothing about poker; the poker player knows nothing serious about maths and programming.

We are working on an AI for poker. Our current idea is to tune an open-source, already compiled poker AI with a heavily rule-based strategy, and then implement the result in a more evolved Monte Carlo-based AI with opponent modeling etc.
We wanted to know if you guys think this is a viable path to a decent poker AI or not.

Basically, what we are also asking is: do you think a rule-based bot could be the core of a NN and/or Monte Carlo-based AI? Our intent is not to gain scientific recognition or to solve the game once and for all, but "simply" to end up with a BEish bot for the low to mid stakes (say up to 200 NLH 6-max). And obviously the team already knows for a fact what is profitable and what is not in, say, preflop play, so we don't need a Monte Carlo AI to go through the whole gigantic magma of building a basic, solid, winning strategy. At least we don't need it to do that for some specific and almost already solved parts of the game, such as preflop play or c-betting.

At the moment the poker player is giving the physicist all the "correct" opening ranges, flatting ranges, 3-bet ranges etc. per position vs. specific positions, and the guy is implementing those in something like simplebot (maybe that is *exactly* it; sorry for the inaccuracy, the poker player is writing this).

What do you guys think about it? Do you think we could make our different skills cooperate better?
And do you think a team composed like this can do good things in the field? We think this could be a very good team in theory, but maybe there are things we are overlooking.

(We plan on paying a real programmer to do the "stealth" things when and if we have a decent AI.)

Any suggestions would be very much appreciated!

Ty!


Posted: Mon Dec 02, 2013 12:31 pm
Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
Personally, I think a pure rules-based bot is a quick solution for low-stakes play, but has little potential for improvement. I think it is much harder than most people imagine to collate all the rules that would ever be required for a really top-class player, even supposing you are one.

But I think cooperation between a scientific approach and a poker expert could work. For example, the scientific approach can easily calculate the odds of one hand beating another, but an expert could compose rules to recognize fish much quicker than a scientist. I wouldn't expect this cooperation to be easy, because you will be using different vocabularies.

When I think of more specific cases of how an expert could contribute I'll post them.


Posted: Mon Dec 02, 2013 1:16 pm
Junior Member

Joined: Fri Nov 08, 2013 6:21 pm
Posts: 17
Yes, in fact I already had to write a very detailed glossary for the maths guy to get into the poker vocabulary, and I totally see that it might take some time for us to understand each other. But we have been talking and working on it for quite some time now, and things are at least getting started.

I was a little verbose in my other post, and I beg your pardon for that. Mainly I'd like to hear what you think about implementing a rule-based core in a Monte Carlo AI, which was actually my idea, because the scientist wanted to go through all the decisions via more or less a GTO approach.

I can really see how complex and long it can be to put together a high-level rule-based bot with a complete strategy for every street of play, so in my thinking this could be a good compromise between the two approaches. Am I right in this?

For example, solving preflop play can be a really hard and long task for an MC AI, while compiling a really valid, even high-level preflop strategy in a rule-based bot is not hard at all. It takes no more than 5 hours of work; I did a decent one in more or less that time, using a range chart that my friend compiled for the purpose and that simplebot can read and use.
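
A range chart of the kind described here could be stored as a plain lookup table keyed by position. The sketch below is only an illustration under stated assumptions: the positions, the hands listed, and the names `OPEN_RANGES`, `hand_class` and `preflop_action` are hypothetical, and this is neither simplebot's actual chart format nor a recommended strategy.

```python
# Hypothetical opening ranges per position (illustrative only, not a real chart).
OPEN_RANGES = {
    "UTG": {"AA", "KK", "QQ", "JJ", "AKs", "AKo"},
    "BTN": {"AA", "KK", "QQ", "JJ", "TT", "AKs", "AKo", "AQs", "KQs", "T9s"},
}

RANKS = "23456789TJQKA"


def hand_class(card1: str, card2: str) -> str:
    """Map two cards such as 'As', 'Kd' to a chart key such as 'AKo'."""
    r1, s1 = card1[0], card1[1]
    r2, s2 = card2[0], card2[1]
    # Put the higher rank first so both card orders give the same key.
    if RANKS.index(r1) < RANKS.index(r2):
        r1, r2 = r2, r1
    if r1 == r2:
        return r1 + r2
    return r1 + r2 + ("s" if s1 == s2 else "o")


def preflop_action(position: str, card1: str, card2: str) -> str:
    """Return 'raise' if the hand is in the opening range for the position, else 'fold'."""
    in_range = hand_class(card1, card2) in OPEN_RANGES.get(position, set())
    return "raise" if in_range else "fold"
```

Keeping the chart as data rather than as nested if-statements means the poker player can edit ranges without touching the code that reads them.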

Thank you!


Posted: Mon Dec 02, 2013 3:18 pm
Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
Quote:
Mainly I'd like to hear what you think about implementing a rule-based core in a Monte Carlo AI

- What does this mean exactly?
- Do you model the opponents?
- Does chance play a part in the action you calculate?
- Can you describe what you propose to do with the board post-flop?


Posted: Mon Dec 02, 2013 6:02 pm
Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269
Quote:
I can really see how complex and long it can be to put together a high-level rule-based bot with a complete strategy for every street of play, so in my thinking this could be a good compromise between the two approaches. Am I right in this?


OK, it's not difficult at all to put together a high-level rules-based bot. It really isn't. I have been coding rules-based bots for 10 years and I still have not seen a reason to switch to a simulation-based approach, even partially, and I really considered it this year for HUNL. You would be surprised how easy it really is once you start coding, but it will take lots of time to fine-tune it.

Once you code up a basic strategy, put it in play money games to find leaks and make adjustments. Play money is good because you will see a wide range of player types and crazy bet sizes. Take notes on the mistakes it makes. Fix them and put it back in the ring until you think it's ready. This process could take you months, but I believe it is the best way to do it. Another method I use is to test it against itself to see how it handles a particular situation and what its stats would look like to the villain. An example is checking how often it c-bets the flop and what its range is composed of. This is key to checking how polarized it is. You do not want your bot to c-bet heavily with only value hands; that would be a leak.
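
The c-bet check described above can be sketched as a small report over logged flop spots. This is a minimal sketch, assuming the bot logs one record per flop where it was the preflop raiser; the `FlopSpot` record, its fields and the `cbet_report` function are hypothetical names, and how a hand gets labeled as "value" is left to the rule author.

```python
from dataclasses import dataclass


@dataclass
class FlopSpot:
    # One flop where the bot was the preflop raiser and had the option to c-bet.
    did_cbet: bool
    is_value_hand: bool  # hypothetical label, e.g. top pair or better


def cbet_report(spots):
    """Return (c-bet frequency, share of c-bets made with value hands)."""
    cbets = [s for s in spots if s.did_cbet]
    if not spots or not cbets:
        return 0.0, 0.0
    freq = len(cbets) / len(spots)
    value_share = sum(s.is_value_hand for s in cbets) / len(cbets)
    return freq, value_share
```

A high c-bet frequency combined with a value share close to 1.0 would point at the leak mentioned above: the bot is betting almost only its value hands.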

Use PT or write your own stats database for opponent modeling, then write rules to adapt to various stats. I have found the villain's folding stats to be the most important for adapting/exploiting (3BF, 4BF, turn fold etc.) and all the others for correct ranging (PFR, 3BC etc.). This may seem obvious, but it's easy to overlook some of the stats that make a huge impact on how well the bot plays.

So, write a basic core strategy that does not adjust at all. Once that is done, write the rules to properly range the villain (not as easy as it sounds, and probably the most difficult aspect of bot writing), then write code based on the villain's stats to either adapt to him or exploit his weaknesses.

I have found it to be better to write a hyper aggressive bot and force the villain to adapt to you instead of vice versa. Passive bots are usually very predictable and thus are exploitable.


Posted: Tue Dec 03, 2013 11:46 am
Junior Member

Joined: Fri Nov 08, 2013 6:21 pm
Posts: 17
spears wrote:
Mainly I'd like to hear what you think about implementing a rule-based core in a Monte Carlo AI

- What does this mean exactly?


Sorry, this must be a semantic problem, as this is the poker player writing. I meant to ask whether you can build a decent rule-based bot and then let a Monte Carlo and NN-based AI perfect its strategies. Or, in general, whether this is a viable approach or not.

spears wrote:
- Do you model the opponents?


We will, in the Monte Carlo phase. But obviously we might have to change this approach and implement opponent modeling in the rule-based core.

spears wrote:
- Does chance play a part in the action you calculate?


Do you mean to ask whether we are building the bot so that it will not always do the same thing in the same situation? If so, yes. It will have a part of its range with which it will sometimes do x and sometimes do y.
If not, again I'm sorry; it must be my poor understanding of the matter in a strictly theoretical, mathematical sense. I'll ask the physicist and come back (he's at conferences abroad at the moment).

spears wrote:
- Can you describe what you propose to do with the board post-flop?


Should we proceed with the idea we have now, we'll set some simple rules of engagement, so to speak. For example the bot will raise cbets on a class of strategically similar flops when his holding is in an assigned rank (TPTK+, MidPair + backdoor FD etc.) and the opener is in a steal position and his cbet % is a certain value and the bot is IP or OOP and so on.


Last edited by fisherking on Tue Dec 03, 2013 11:53 am, edited 1 time in total.

Posted: Tue Dec 03, 2013 11:49 am
Junior Member

Joined: Fri Nov 08, 2013 6:21 pm
Posts: 17
shalako wrote:
CUT


We are *very* thankful for your advice, sir. Ty.
This is already helping us better understand the long road we have ahead of us.


Posted: Tue Dec 03, 2013 3:10 pm
Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
Quote:

Sorry, this must be a semantic problem, as this is the poker player writing. I meant to ask whether you can build a decent rule-based bot and then let a Monte Carlo and NN-based AI perfect its strategies. Or, in general, whether this is a viable approach or not.

Still makes no sense to me. How do the rules and MC approaches play together? Could you give a simple example?

Quote:
Should we proceed with the idea we have now, we'll set some simple rules of engagement, so to speak. For example the bot will raise cbets on a class of strategically similar flops when his holding is in an assigned rank (TPTK+, MidPair + backdoor FD etc.) and the opener is in a steal position and his cbet % is a certain value and the bot is IP or OOP and so on.


OK. Here is something I can comment on. Instead of a long-winded description of "strategically similar" hands such as (TPTK+, MidPair + backdoor FD etc.), you could reduce this to a small number of parameters, for example "probability of win" and "standard deviation of probability of win". Draws have a high standard deviation. Calculating these quantities can be very quick if you use precalculated lookup tables. Search around the forum for LUT.
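
The two parameters can be illustrated with a short sketch. It assumes you already have one win probability per possible next card (in practice from a hand evaluator or a LUT); the function name `pwin_summary` and the numbers below are made up purely to show why a draw ends up with a higher standard deviation than a made hand.

```python
from statistics import mean, pstdev


def pwin_summary(pwin_by_runout):
    """Return (mean, standard deviation) of win probability across future cards."""
    return mean(pwin_by_runout), pstdev(pwin_by_runout)


# Made-up numbers for illustration: one pwin value per possible turn card.
# A made hand keeps roughly the same pwin whatever card comes.
made_hand = [0.70] * 40 + [0.55] * 6
# A flush draw jumps to a very high pwin on 9 cards and stays weak on the other 37.
flush_draw = [0.95] * 9 + [0.20] * 37

made_mean, made_std = pwin_summary(made_hand)
draw_mean, draw_std = pwin_summary(flush_draw)
```

With these toy inputs the made hand has the higher mean and the draw has the higher standard deviation, which is the separation the two-parameter description relies on.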


Posted: Tue Dec 03, 2013 3:40 pm
Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
I've just thought of a productive way you guys could work together that has quite good long term potential. There is some java code for a MCTS bot somewhere. It almost certainly needs some poker value judgements to make it work well so you could do that. IIRC it doesn't know how to bluff or slow play so you could introduce that. Get a crack of PokerAcademy to test it and to spew out diagnostics to help you tune it.


Posted: Tue Dec 03, 2013 4:10 pm
Regular Member

Joined: Wed Oct 02, 2013 5:00 pm
Posts: 64
spears wrote:
There is some java code for a MCTS bot somewhere.

In the opentestbed there is a full MCTS bot framework: https://github.com/corintio/opentestbed .
There are some discussions about it and how to customize it in the archived forums.


Posted: Tue Dec 03, 2013 5:39 pm
Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269
How do MC, MCTS, NN etc. handle ranges? I never hear anything about that, which I do not understand at all. My whole HUNL bot revolves around it. Not only do you have to get the range accurate (not easy against a pro who polarizes), you also have to dissect that range using hand combinatorics and the board to narrow it all the way to the river (and I won't even mention dissecting the bot's own perceived range for bluffing). A basic strategy without ranging will lose, so how does a LUT handle that?

Also, I think you do have to break down what your hand is instead of using a LUT. One thing that LUTs do not reflect is that certain hand types (like a pair and backdoor FD, or a backdoor FD and Str8) give you more turn barreling opportunities. Mathematically they suck, but when you show aggression on the flop and lead the turn, many players will just give up marginal holdings. Even if you miss the turn you can use hand combinatorics to figure out the correct fold equity, which might allow you to barrel again anyway. Any villain not holding TP+ will think twice about calling the turn.


Posted: Tue Dec 03, 2013 7:28 pm
Regular Member

Joined: Sat May 25, 2013 7:36 am
Posts: 73
shalako wrote:
How do MC, MCTS, NN etc. handle ranges? I never hear anything about that, which I do not understand at all. My whole HUNL bot revolves around it. [...]


Simulation-based bots aren't aware of the concept of ranges. The range (or better, the probability distribution) in a given spot is more or less given by the probabilities for all hands to reach that spot. These probabilities are balanced (preventing exploitability) and take into consideration the probability distribution of the best possible opponent.
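
One way to read "the probabilities for all hands to reach that spot" is as a Bayesian reweighting after each observed action. The sketch below is an interpretation under that assumption, not a description of how any particular bot works; `update_range`, `prior` and `action_prob` are hypothetical names, and the action probabilities would have to come from whatever strategy model the bot uses.

```python
def update_range(prior, action_prob):
    """Reweight a hand distribution by how likely each hand is to take the observed action.

    prior: dict mapping hand -> probability before the action
    action_prob: dict mapping hand -> P(observed action | hand)
    """
    weighted = {hand: p * action_prob.get(hand, 0.0) for hand, p in prior.items()}
    total = sum(weighted.values())
    if total == 0:
        # The action is impossible under the model; fall back to the prior.
        return dict(prior)
    return {hand: w / total for hand, w in weighted.items()}
```

For example, starting from a prior of 25% AA and 75% 72o, and a model where AA raises 90% of the time and 72o raises 10% of the time, an observed raise shifts the distribution to 75% AA and 25% 72o. Repeating the update street by street gives the distribution "at that spot" without ever naming a range explicitly.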

EDIT: yay, that post made me a regular member :-)


Posted: Tue Dec 03, 2013 7:44 pm
Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269
Quote:
Simulation-based bots aren't aware of the concept of ranges. The range (or better, the probability distribution) in a given spot is more or less given by the probabilities for all hands to reach that spot. These probabilities are balanced (preventing exploitability) and take into consideration the probability distribution of the best possible opponent.


Ah ok. I understand now.

So do they recognize weaknesses, or do they just play "balanced" all the time? Let's say the villain folds to 60% of 3-bets. My bot will start to 3-bet very wide until the guy adapts and it becomes unprofitable. Does the simulation bot adapt, or does it just play the same way in all situations?

It should also adapt in the opposite direction, meaning that if the guy is 3-betting the hell out of me, I start calling wider in position and 4-betting/shoving more until he gets the message that it's not going to be tolerated...
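
The point at which 3-betting wide "becomes unprofitable" can be estimated with a simple expected-value calculation. This is a rough sketch under strong assumptions: the bluff has no equity when called, and the villain either folds or continues and we lose the amount risked. The function names and the example sizes are illustrative only.

```python
def bluff_ev(fold_freq, pot, risk):
    """EV of a zero-equity bluff: win the pot on a fold, lose the risked amount otherwise."""
    return fold_freq * pot - (1.0 - fold_freq) * risk


def breakeven_fold_freq(pot, risk):
    """Fold frequency at which the zero-equity bluff above has an EV of exactly zero."""
    return risk / (pot + risk)
```

With a pot of 3 and a risk of 4 (arbitrary units), the break-even fold frequency is 4/7, so a villain folding 60% of the time makes the bluff slightly profitable under these assumptions. Real 3-bet bluffs usually have some equity when called, which lowers the fold frequency they need.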


Posted: Tue Dec 03, 2013 8:19 pm
Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
@shalako

They adapt. And that can be quite clever statistically. With the limited stats they have on a villain, they infer from their database of many other players how that villain is likely to play in spots they have not yet encountered.

A LUT is just a fast way of calculating something, usually in this case the probability of one hand beating another. But I also use the variance (http://en.wikipedia.org/wiki/Variance) of pwin. IMO this captures the "drawiness" of the hand. (A draw has lots of drawiness) So in my model the plethora of cases that you describe (like a pair and backdoor FD, or a backdoor FD and Str8) is reduced to 2 parameters. Maybe two parameters is too few, so if you have some suggestions for some other parameters I'm listening.

The tension between those advocating rules-based approaches and those advocating a pure mathematical approach is quite deep and quite common. I'm surprised fisherking can get physics guy to play along. Ultimately (and that might be a long way off), the math approach will win. Arguably it already did at Poker Paradime.


Posted: Tue Dec 03, 2013 9:53 pm
Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269
Quote:
So in my model the plethora of cases that you describe (like a pair and backdoor FD, or a backdoor FD and Str8) is reduced to 2 parameters. Maybe two parameters is too few, so if you have some suggestions for some other parameters I'm listening.


Well, I have adopted several "low equity" plays into my bot that most pros use. These are anything with 3-5 outs: gutshots, pair and backdoor draw, backdoor draw and overcard, and the worst being the backdoor str8 and flush draw combo (3 outs). All these plays generally do is make the bot more aggressive on the flop and provide more turn barreling opportunities. But the real value of these hands comes from their being very disguised, so when you hit you get more value out of them than with a standard open draw. Add in any fold equity you have on the turn and these plays start to look pretty good.
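
To put rough numbers on "3-5 outs", the chance of hitting can be computed directly from the count of unseen cards. This sketch assumes clean single-card outs counted on the flop with 47 unseen cards; backdoor draws need two running cards, so treating them as a number of outs is only an approximation. The function names are hypothetical.

```python
from math import comb


def hit_by_turn(outs, unseen=47):
    """Probability that the next card is one of the outs."""
    return outs / unseen


def hit_by_river(outs, unseen=47):
    """Probability that at least one of the next two cards is an out."""
    return 1.0 - comb(unseen - outs, 2) / comb(unseen, 2)
```

For a gutshot with 4 outs this gives about 8.5% on the turn and about 16.5% by the river, which is why these plays lean on fold equity and disguise rather than raw equity.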

Generally (I use this loosely because of predictability) you want to raise lower equity draws and call with higher equity hands like open draws. It sounds counter-intuitive, but you want to "realize your equity". This basically means you don't want to get blown off your good draws. But you don't always want to do that, so you have to check back/call with many of your good draws too. Many people make the mistake of raising the nut flush draw more often than not. The problem with this is that you may push a weaker draw off the hand, which loses value.

Quote:
The tension between those advocating rules-based approaches and those advocating a pure mathematical approach is quite deep and quite common. I'm surprised fisherking can get physics guy to play along. Ultimately (and that might be a long way off), the math approach will win. Arguably it already did at Poker Paradime.


What is Poker Paradime, and what happened there?

I can see how this controversy between rule guys and sim guys can run deep. For one thing, I think solving 100bb+ is a long way off, like you say, so rules seem like the better choice at the moment. I think Nasher is going to run simulations until the rapture, and by then it won't matter!


Posted: Tue Dec 03, 2013 10:12 pm
Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
Poker Paradime was a site featuring a good exploitative limit bot that you could play against amongst other things.

https://www.google.co.uk/search?q=site% ... 2&ie=UTF-8

https://www.google.co.uk/search?q=site% ... 2&ie=UTF-8

https://www.google.co.uk/search?q=site% ... .com+sonia


Posted: Tue Dec 03, 2013 10:23 pm
Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
Quote:
Well, I have adopted several "low equity" plays into my bot that most pros use. These are anything with 3-5 outs: gutshots, pair and backdoor draw, backdoor draw and overcard, and the worst being the backdoor str8 and flush draw combo (3 outs). All these plays generally do is make the bot more aggressive on the flop and provide more turn barreling opportunities. But the real value of these hands comes from their being very disguised, so when you hit you get more value out of them than with a standard open draw. Add in any fold equity you have on the turn and these plays start to look pretty good.

Generally (I use this loosely because of predictability) you want to raise lower equity draws and call with higher equity hands like open draws. It sounds counter-intuitive, but you want to "realize your equity". This basically means you don't want to get blown off your good draws. But you don't always want to do that, so you have to check back/call with many of your good draws too. Many people make the mistake of raising the nut flush draw more often than not. The problem with this is that you may push a weaker draw off the hand, which loses value.

I have nearly no idea what this means, and no clue how to get a number out of it. In my terms I think you are saying "call hands with high equity but medium variance, raise hands with low equity but high variance"? But that isn't a new parameter, it's just a (reasonable) rule based on my existing two parameters.
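
Expressed as code, the paraphrased rule is just a pair of threshold tests on the two existing parameters. The sketch below is illustrative only: the function name `draw_action` and every threshold value are hypothetical placeholders, not tuned numbers.

```python
def draw_action(pwin_mean, pwin_std, equity_cut=0.35, medium_std=0.15, high_std=0.25):
    """Apply the paraphrased rule to two parameters; all thresholds are hypothetical."""
    if pwin_mean >= equity_cut and medium_std <= pwin_std < high_std:
        return "call"   # high equity, medium variance
    if pwin_mean < equity_cut and pwin_std >= high_std:
        return "raise"  # low equity, high variance
    return "undecided"  # the rule as stated does not cover the other cases
```

The "undecided" branch makes the gap visible: two thresholds carve out only part of the (mean, standard deviation) plane, so further rules or further parameters are needed for everything else.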


Posted: Tue Dec 03, 2013 10:41 pm
Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
Quote:
I can see how this controversy between rule guys and sim guys can run deep. For one thing, I think solving 100bb+ is a long way off, like you say, so rules seem like the better choice at the moment. I think Nasher is going to run simulations until the rapture, and by then it won't matter!

Not all math approaches are the same. Nasher is working on Nash Equilibrium, but MCTS is an exploitative approach. There are quite a few variants of both of these approaches. Personally I think Nash Equilibrium is close for nearly all games, but it actually isn't a worthwhile goal because it doesn't maximise income and is easily detectable. What little time I have for development these days is devoted to a hybrid approach. Even if I were interested in writing a rules-based system, it's not an option for me because I don't know the rules.

I thought the approach I advocated for fisherking had some benefits both for him and physics guy. It would be an extensible platform that would allow poker guy quite a lot of discretion in the things that MCTS is bad at - like bluffing and slow play frequencies and opponent assessment. Anyway, that's enough shouting across no-man's land for tonight: I do enough of this in the day job as it is.


Posted: Tue Dec 03, 2013 11:36 pm
Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269
Quote:
I have nearly no idea what this means, and no clue how to get a number out of it. In my terms I think you are saying "call hands with high equity but medium variance, raise hands with low equity but high variance"? But that isn't a new parameter, it's just a (reasonable) rule based on my existing two parameters.


Right, that is basically what I am saying, but you have to polarize it a bit so you're not doing the same thing all the time. In general, yes, you should be raising/betting your lower equity/high variance draws more often than your higher equity/lower variance draws because of equity realization. I am pretty sure that is good GTO advice, but don't quote me on it! Some of this GTO stuff I am still trying to learn, but it does make sense.


Posted: Wed Dec 04, 2013 12:39 am
Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269
Quote:
Even if I were interested in writing a rules-based system, it's not an option for me because I don't know the rules.


Learning the rules is not easy, I must admit. I have turned to poker coaching sites in order to understand game theory. There is one guy in particular I have focused on.

Quote:
I thought the approach I advocated for fisherking had some benefits both for him and physics guy. It would be an extensible platform that would allow poker guy quite a lot of discretion in the things that MCTS is bad at - like bluffing and slow play frequencies and opponent assessment. Anyway, that's enough shouting across no-man's land for tonight: I do enough of this in the day job as it is.


Well, that sounds like a good combination to me. Bluffing and slowplaying are fairly easy to figure out using hand combinatorics.

Slowplaying is rather straightforward, and ideally its frequency should be balanced with your air. What I mean is that if you're betting air 25% of the time, you should be checking back value 25% of the time as well to remain balanced. To find the frequency, just run sims to see how often you have TP+ on the flop, which is what I did. I cannot remember what mine turned out to be.

Bluffing is a bit more complicated in that your perceived range is very important. You have to sell the bluff, and to do that your actions up to that point need to be consistent with the hand you're trying to represent. My bot will bluff when it has no showdown value, can represent at least 10 value combos based on its actions (I have a thread here on how to figure that out, which is a pain), and there is enough fold equity against the villain's range to pull it off.
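
The combo counting behind a condition like this can be sketched as follows. This is not the method from the thread mentioned above, only a minimal illustration of counting hand combinations with card removal: `count_combos` and `should_bluff` are hypothetical names, the 10-combo minimum comes from the post, and the fold equity threshold is a placeholder.

```python
SUITS = "cdhs"


def count_combos(hand_class, dead_cards):
    """Count combos of a class like 'AK' or 'QQ' (suits ignored) that use no dead card."""
    r1, r2 = hand_class[0], hand_class[1]
    dead = set(dead_cards)
    live1 = [r1 + s for s in SUITS if r1 + s not in dead]
    if r1 == r2:
        # Pocket pairs: choose 2 of the remaining cards of that rank.
        n = len(live1)
        return n * (n - 1) // 2
    live2 = [r2 + s for s in SUITS if r2 + s not in dead]
    return len(live1) * len(live2)


def should_bluff(has_showdown_value, value_combos, fold_equity,
                 min_combos=10, min_fold_equity=0.5):
    """Bluff only with no showdown value, enough representable value combos, and enough fold equity."""
    return (not has_showdown_value
            and value_combos >= min_combos
            and fold_equity >= min_fold_equity)
```

With no dead cards AK has 16 combos and QQ has 6; an ace on the board drops AK to 12, and a queen on the board drops QQ to 3. Summing such counts over the hands the bot's line can credibly represent gives the `value_combos` input.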

A lot of this goes out the window if the villain is nitty or passive, so you have to step up the pressure if he checks or folds too often in certain spots. Against those types I have incorporated more floats with complete air in order to take the pot away on a later street, because they are less likely to bet the turn unless they really have the goods. Generally I only float in position, but a check-raise is a powerful tool too...

