Poker-AI.org • View topic - Master Thesis: A.I-Learning fr. experiences / handhistories

All times are UTC

Master Thesis: A.I-Learning fr. experiences / handhistories

botfan

Post subject: Master Thesis: A.I-Learning fr. experiences / handhistories

Posted: Thu Aug 11, 2016 8:08 am

New Member

Joined: Thu Aug 11, 2016 7:12 am
Posts: 2

Hello everybody,

I have been programming for years, and for the last month I've spent my time on botting... but I also want to try out some things for my master's thesis at my university...

- I have millions of hands now and want to start building a learning bot...
- to begin with, I've switched from Shanky and OpenHoldem to AutoIt and am scraping the tables with OCR (works fine)

... but now the hard part starts... :roll:

For days I have been asking myself what the best approach to my decision problem is...

I now have a database with millions of hands:
"hole_card", "position", "street", "action_type", "amount", "flop_community", "turn_community", "river_community", "won_or_lost"

and I have a brain / memory table:
- "hand", "position", "street", "action", "flop_community", "turn_community", "river_community", "won_pots_qnty", "lost_pots_qnty"

But in every situation at the table I can't query the database for "how often does 72o win the pot from early position, and how often did the players in my database lose money with 72o" and only click a button after the result comes back... I think that way is weak and takes too long...

so I think I need something like an experience pattern maybe? (like a human memory...)

--> My first approach / thoughts:
- maybe create one table for each hand, so 169 hole-card tables
- split for example 20,000,000 hands across the 169 tables --> so "only" about 118,343 hands in each table on average
- generate an "experience table" for each of them, e.g.

72o experience Brain / memory MYSQL table:
- Hand: "72o" --> Pos: "UTG": --> Street: 0 (Preflop) --> Action: "All in" ---> Won pots: "22 times" ----> Lost pots: "1215 times"

AA experience Brain / memory MYSQL table:
- Hand: "AA" --> Pos: "UTG": --> Street: 0 (Preflop) --> Action: "All in" ---> Won pots: "9285 times" ----> Lost pots: "350 times"

and each time my table scraper recognizes a situation which the bot "has played before" / which is already in its brain / memory,
the bot only queries its own brain (only a small table with the end results of each played hand)

--> it should remember how often it had "good experiences", which means it won more often than it lost, and based on that it should decide in this
specific situation and click, for example, a button like "Raise", "Fold", "All in" or "Check"
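A minimal sketch of the "brain" idea described above, assuming a simplified schema that is not the real database: each raw row is one (hand, position, street, action) decision with a hypothetical `won` flag. The raw rows are aggregated once into a small `memory` table, and at the table only that small table is queried. All names (`build_memory`, `recall`, `best_action`, `min_samples`) are made up for illustration.

```python
import sqlite3

# Hypothetical toy rows: (hand, position, street, action, won), won is 1 or 0.
ROWS = [
    ("72o", "UTG", 0, "all_in", 0),
    ("72o", "UTG", 0, "all_in", 0),
    ("72o", "UTG", 0, "all_in", 1),
    ("AA", "UTG", 0, "all_in", 1),
    ("AA", "UTG", 0, "all_in", 1),
    ("AA", "UTG", 0, "all_in", 0),
    ("AA", "UTG", 0, "raise", 1),
]


def build_memory(rows):
    """Aggregate the raw decisions once into a small won/lost table."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE hands (hand TEXT, position TEXT, street INTEGER,"
        " action TEXT, won INTEGER)"
    )
    con.executemany("INSERT INTO hands VALUES (?, ?, ?, ?, ?)", rows)
    con.execute(
        "CREATE TABLE memory AS"
        " SELECT hand, position, street, action,"
        "        SUM(won) AS won_pots_qnty,"
        "        SUM(1 - won) AS lost_pots_qnty"
        " FROM hands GROUP BY hand, position, street, action"
    )
    # Index on the lookup key so the query at the table stays fast.
    con.execute("CREATE INDEX idx_memory ON memory (hand, position, street)")
    return con


def recall(con, hand, position, street):
    """Return {action: (won, lost)} for one remembered situation."""
    cur = con.execute(
        "SELECT action, won_pots_qnty, lost_pots_qnty FROM memory"
        " WHERE hand = ? AND position = ? AND street = ?",
        (hand, position, street),
    )
    return {action: (won, lost) for action, won, lost in cur}


def best_action(con, hand, position, street, min_samples=1):
    """Pick the remembered action with the highest share of won pots."""
    best, best_rate = None, -1.0
    for action, (won, lost) in recall(con, hand, position, street).items():
        total = won + lost
        if total < min_samples:
            continue  # too few samples to trust this memory
        rate = won / total
        if rate > best_rate:
            best, best_rate = action, rate
    return best  # None means the situation was never seen often enough


con = build_memory(ROWS)
print(best_action(con, "AA", "UTG", 0, min_samples=2))
```

One caveat with counting pots only: a fold can never "win", and a big loss counts the same as a small one, so summing the `amount` column per action would probably be a better score than the won/lost counts used here.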

My questions are:

- Is this approach completely the wrong way? :?:

- Are there also solutions out there where each situation at the table is calculated and simulated / solved in real time? (maybe that's better?)
- Is it also possible to implement some exploits, e.g. when the brain recognizes from the database results that c-betting from the BTN vs. the BB on the flop is +EV long term, etc.?
- My biggest problem is: "How can the logic recognize exactly what the best bet size is in a given situation?"

If you could give me any hints, I would be a really happy man hehe :lol:

spears

Post subject: Re: Master Thesis: A.I-Learning fr. experiences / handhistor

Posted: Thu Aug 11, 2016 8:38 am

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

How many hours have you got? Does this have to be original work, not previously published?

botfan

Post subject: Re: Master Thesis: A.I-Learning fr. experiences / handhistor

Posted: Thu Aug 11, 2016 8:46 am

New Member

Joined: Thu Aug 11, 2016 7:12 am
Posts: 2

I still have time until next spring hehe, but I am so interested in this that I want to start
on my own. And no, at the moment I can try everything I want by myself (that also means previously published stuff).

thx for your help!

greetings from Germany...

spears

Post subject: Re: Master Thesis: A.I-Learning fr. experiences / handhistor

Posted: Fri Aug 12, 2016 4:52 pm

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

viewtopic.php?f=24&t=2731

shalako

Post subject: Re: Master Thesis: A.I-Learning fr. experiences / handhistor

Posted: Fri Aug 12, 2016 6:37 pm

Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269

I am not exactly sure what methods you are using but I have thought about using HH files to determine several things but just have not had the time to do it.

I think the biggest reason to do this would be to verify how accurate your villain ranging method is using showdown information. So my plan was to take each position and assign a community frequency preflop range (such as a 15% UTG FI range), then verify at showdown that the hand was in the assigned range. So that was step 1. Step 2 would be post flop range assignment using the same method. Preflop might be overkill and not need to be done, as those ranges are pretty standard, but post flop I see this as very important. I only work on PLO now, but for NL I think the only preflop range that would need to be heavily analyzed is the 3B range, as in NL it can consist of a big chunk of bluffs. When I was working on NL that was the toughest range to get right, as the bluff hands varied widely from one villain to the next. You would have to have a ton of data on one particular player in order to get this correct, so I think in general using a community 3B frequency with range data provided by showdown info would be good here.

So I would then keep adjusting the ranges until the accuracy maxed out. You could start with basic community stats then run simulations on each one of your pre assigned ranges. If you did this for each position you would have a fairly complete picture of what to assign villains on an individual basis. An example is a loose UTG opener at 25% which is close to the CO opening range etc.
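A rough sketch of the verification step described above, with made-up data: it checks what fraction of showdown hands fall inside an assigned range, and how wide that range is as a share of all 1326 starting combos (6 per pair, 4 per suited class, 12 per offsuit class). The range, the showdown list and the function names are hypothetical, and a real run would read the hands from HH files.

```python
TOTAL_COMBOS = 1326  # number of two-card starting combos in hold'em


def combos(hand_class):
    """Number of card combos in a hand class such as 'AA', 'AKs' or 'AKo'."""
    if len(hand_class) == 2 and hand_class[0] == hand_class[1]:
        return 6
    if len(hand_class) == 3 and hand_class.endswith("s"):
        return 4
    if len(hand_class) == 3 and hand_class.endswith("o"):
        return 12
    raise ValueError(f"unknown hand class: {hand_class!r}")


def range_frequency(assigned_range):
    """Share of all starting combos covered by the assigned range."""
    return sum(combos(h) for h in assigned_range) / TOTAL_COMBOS


def range_accuracy(assigned_range, showdown_hands):
    """Fraction of showdown hands that fall inside the assigned range."""
    if not showdown_hands:
        raise ValueError("need at least one showdown hand")
    hits = sum(1 for hand in showdown_hands if hand in assigned_range)
    return hits / len(showdown_hands)


# Hypothetical assigned range and showdown sample for one spot.
UTG_OPEN_RANGE = {"AA", "KK", "QQ", "JJ", "TT", "AKs", "AKo", "AQs"}
SHOWDOWNS = ["AA", "KK", "AKs", "76s", "QQ", "AQs", "K9o", "TT", "AKo", "JJ"]

print(f"range width: {range_frequency(UTG_OPEN_RANGE):.1%}")
print(f"accuracy:    {range_accuracy(UTG_OPEN_RANGE, SHOWDOWNS):.0%}")
```

Accuracy alone always goes up when the range gets wider, so the adjustment only makes sense if the width is held at the community frequency for that spot, which is why both numbers are reported together.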

You would have to do this for all the various preflop scenarios (UOPFR, 3B, 3B call, limp, over limp, cold call, etc). In general I think you would only need 250k hands that have showdown info (for each limit, as these frequencies would be vastly different from limit to limit), which would equate to about 3 or 4 million normal hands.

If you can assign a proper preflop range then post flop will get easier, although it's much more difficult to get right. To do post flop you're going to have to analyze all the various lines a villain can take. The only time I really have big problems is with slow playing. My bot will get that wrong 100% of the time. I am not able to use weighted ranges, so it's a problem. Luckily villains only slow play less than 5% of the time in general, from what I have noticed.

There are really only a few post flop ranges to be concerned about (cbet, XC/call, bet call, raise). The turn and river are fairly standard, as in general there are fewer hands he can value bet, obviously. OTF it's a bit different, as he can bluff about 66% of his c-betting range, with 50% on the turn and 33% on the river. In PLO you're not going to see too much pure bluffing, as it's hard not to get a piece of any board, but in NL it's a different story.

So that is how I would use HH information to form an AI of sorts using nothing but range assignment. One idea of many I am sure.

spears

Post subject: Re: Master Thesis: A.I-Learning fr. experiences / handhistor

Posted: Fri Aug 12, 2016 8:08 pm

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

My plan is:
1. Find a good NE solution.
2. Find the distribution of opponent strategies from hand histories
3. Find best response strategies to each of the opponent strategies
4. At play time, find the distribution of the opponent strategies that fits opponent play
5. Combine the NE solution and the BR response solutions to produce a safe response

If only I could find the time.
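Steps 4 and 5 of the plan above could be sketched roughly as follows. This is only an illustration under strong assumptions: opponent "types" with fixed action probabilities stand in for the opponent strategies found in step 2, the update is a plain Bayes rule, and the combination is a simple linear blend controlled by a hypothetical `trust` parameter. A linear blend is a simplification and gives no formal safety guarantee. All names and numbers are made up.

```python
def posterior(prior, likelihoods, observed_actions):
    """Bayes update of opponent-type weights from observed actions.

    prior:       {type: probability}
    likelihoods: {type: {action: probability that this type takes it}}
    """
    weights = dict(prior)
    for action in observed_actions:
        for opp_type in weights:
            weights[opp_type] *= likelihoods[opp_type].get(action, 0.0)
    total = sum(weights.values())
    if total == 0:
        return dict(prior)  # observations fit no model, fall back to prior
    return {opp_type: w / total for opp_type, w in weights.items()}


def safe_response(ne, br_by_type, weights, trust):
    """Blend NE and best-response action probabilities.

    trust = 0 plays pure NE, trust = 1 plays the weighted best responses.
    """
    if not 0.0 <= trust <= 1.0:
        raise ValueError("trust must be between 0 and 1")
    actions = set(ne)
    for br in br_by_type.values():
        actions |= set(br)
    blended = {}
    for action in sorted(actions):
        exploit = sum(
            weights[t] * br_by_type[t].get(action, 0.0) for t in weights
        )
        blended[action] = (1 - trust) * ne.get(action, 0.0) + trust * exploit
    return blended


# Hypothetical models for one decision point.
PRIOR = {"tight": 0.5, "loose": 0.5}
LIKELIHOODS = {
    "tight": {"fold": 0.8, "call": 0.2},
    "loose": {"fold": 0.4, "call": 0.6},
}
NE = {"check": 0.5, "bet": 0.5}
BR = {"tight": {"bet": 1.0}, "loose": {"check": 1.0}}

weights = posterior(PRIOR, LIKELIHOODS, ["call", "call"])
print(weights)
print(safe_response(NE, BR, weights, trust=0.5))
```

Keeping `trust` low while few actions have been observed is one way to stay close to the NE solution until the opponent model is reasonably certain.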

shalako

Post subject: Re: Master Thesis: A.I-Learning fr. experiences / handhistor

Posted: Tue Aug 16, 2016 5:59 pm

Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269

I decided to test this method to see how accurate my range assignment was on some ranges that can be quite grey. So I used a UTG open limp range from micro limits. I had 500k worth of data with 2600 showdowns in that spot. Not a ton of data, but enough to get a good idea. The UTG PFR was 15% with a VPIP of 40%, so my assigned range was 15-40% of hands. I wrote a quick HH parser and found that the villain's hand was only in that assigned range 40% of the time, with 40% being between 40-100% and 20% between 1-15%. So it is pretty much impossible to accurately range an open limp in micro limits. What is accurate are the PFR ranges, at nearly 80%.

So what I learned from this is that a serious amount of trash hands are being played. The open limp and over limp ranges will never be accurate in lower limits as they are just too wide. I do believe that the higher the limit, the more accurate the ranges will be, as less and less trash will be limped. After running these simulations it's obvious that vs 1 limper the bot should probably be raising more, as it will have a bigger range advantage than I realized.
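The band breakdown reported above could be computed along these lines. This is a sketch, not the parser used for those numbers: it assumes each showdown hand has already been mapped to a strength percentile (small = strong) by some hand ranking, which is not shown here, and the band names and sample values are made up.

```python
from collections import Counter


def band_shares(percentiles, pfr=15.0, vpip=40.0):
    """Share of showdown hands in the raise, limp and trash bands.

    percentiles: strength percentile of each shown hand, 0 < p <= 100,
                 where a small value means a strong hand.
    """
    if not percentiles:
        raise ValueError("need at least one showdown hand")
    counts = Counter()
    for p in percentiles:
        if p <= pfr:
            counts["raise_band"] += 1   # strong enough to be in the PFR range
        elif p <= vpip:
            counts["limp_band"] += 1    # the assigned open-limp range
        else:
            counts["trash_band"] += 1   # weaker than the VPIP suggests
    n = len(percentiles)
    return {band: counts[band] / n
            for band in ("raise_band", "limp_band", "trash_band")}


# Hypothetical sample of ten shown-down open limps.
SAMPLE = [5, 12, 18, 22, 30, 38, 45, 60, 75, 90]
print(band_shares(SAMPLE))
```

With a split like 20% / 40% / 40%, the assigned 15-40% band catches well under half of the limped hands, which is the point made in the post.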

shalako

Post subject: Re: Master Thesis: A.I-Learning fr. experiences / handhistor

Posted: Thu Sep 01, 2016 4:25 pm

Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269

I took this method to the next level and wrote a post flop HH simulator. On my first run I decided to test my range assignment on any river bet, as this could have the biggest impact on winnings. I used HM2 to export all the hands with a river bet and showdown. After the first run my assignment was only 59% accurate. After many adjustments I maxed out at 68%. This makes sense, as GTO value is about 66% on the river, so the other 33% or so will always be out of range as they are bluffs. So with the value range now accurate, I was able to adjust the top end of the river checking range (aka showdown range) for the bluff catcher (which uses range distribution) to determine if the villain is over bluffing his range. The next step is to figure out the low point of the villain's checking range, which is probably overpairs or better. Preliminary HM analysis is showing that most top pairs are bet as a bluff in PLO, but I am not really sure yet. Position will also have an impact, as I think it would be better to bet fold top pair OOP.
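The 66% / 33% split matches the usual indifference calculation for a pot-sized river bet (the pot-sized bet is an assumption here, the post does not state a bet size). A bluff catcher calling a bet B into a pot P breaks even when the share of bluffs in the betting range is B / (P + 2B). A small sketch with hypothetical function names:

```python
def bluff_fraction(bet, pot):
    """Share of bluffs in a river betting range that makes a call break even.

    The caller risks `bet` to win `pot + bet`, so the break-even share x
    solves x * (pot + bet) - (1 - x) * bet = 0, i.e. x = bet / (pot + 2 * bet).
    """
    if bet <= 0 or pot <= 0:
        raise ValueError("bet and pot must be positive")
    return bet / (pot + 2 * bet)


def is_over_bluffing(bluffs_shown, bets_shown, bet, pot):
    """True if the observed bluff share exceeds the break-even share."""
    if bets_shown <= 0:
        raise ValueError("need at least one river bet with showdown")
    return bluffs_shown / bets_shown > bluff_fraction(bet, pot)


for size in (0.5, 1.0, 2.0):  # bet size as a fraction of the pot
    print(f"bet {size:.1f}x pot -> bluff share {bluff_fraction(size, 1.0):.1%}")
```

So a range assignment that tops out near 68% accuracy on pot-sized river bets is about what this calculation predicts, and smaller bet sizes would leave less room for bluffs.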
