Poker-AI.org

Poker AI and Botting Discussion Forum

All times are UTC




Posted: Sat Jul 19, 2014 12:20 am
Junior Member

Joined: Thu Jun 19, 2014 5:16 pm
Posts: 13
Hi guys!

I want to start this topic about my attempts at building a poker bot. I would like to share my experience and results (if I get any) and discuss them with you.

So, I started building a poker bot about 1.5 months ago. I found the poker-ai site and the Alberta publications and quickly read many articles. Then I decided to start with some simple infrastructure:
1. A poker server with a Meerkat interface and an HTML5 interface. I didn't complete it, because I hadn't decided on many things like FL or NL, HU or ring. But it showed me that a poker server is not as easy to implement as I first thought (e.g. re-raise sizing or side-pot calculation).
2. Wrote my own card evaluator. Also connected a few well-known evaluators (spears2p2 and ualberta).
3. Functions for hand strength, hand potential, ICM, pot odds, etc.
4. Wrote an interface from Meerkat to Clojure.

After this I wrote my first simple bot:
Code:
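(require '[clojure.core.async :refer [go-loop <! >!]]
         '[clojure.core.match :refer [match]])

;; Always check or call: reads events from to-bot and replies on from-bot.
(defn acall [to-bot from-bot]
  (go-loop [state {}]
    ;; when-some stops the loop once to-bot is closed (<! returns nil),
    ;; instead of spinning forever on the :else branch.
    (when-some [data (<! to-bot)]
      (match data
        ;; Our turn to act: answer with :check-or-call.
        [:turn _] (do
                    (>! from-bot :check-or-call)
                    (recur state))
        ;; Ignore every other event.
        :else (recur state)))))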


I ran it in opentestbed and happily watched my bot lose to SmartBot.
Image

At that point I really understood that I cannot write a bot in a short time; it will take many months.

So I started to implement a CFRM strategy for FL HU. The articles about CFRM were hard for me, and even amax's example didn't help me understand. But I found this wonderful article http://modelai.gettysburg.edu/2013/cfr/ with Java source code in about 50 lines. I think it is chance-sampling CFRM. I ported the code to Clojure and got results for Kuhn poker.
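
For readers who have not opened the linked tutorial, here is a minimal Python sketch of chance-sampling CFR on Kuhn poker, written in the spirit of the Java trainer in that article. It is not the Clojure port described above; the names Node, cfr and train, the seed and the iteration count are illustrative assumptions.

```python
import random

PASS, BET = 0, 1
ACTIONS = "pb"  # history characters: p = pass/fold, b = bet/call


class Node:
    """Regret and strategy accumulators for one information set."""

    def __init__(self):
        self.regret_sum = [0.0, 0.0]
        self.strategy_sum = [0.0, 0.0]

    def strategy(self, reach_weight):
        # Regret matching: play in proportion to positive cumulative regret.
        positive = [max(r, 0.0) for r in self.regret_sum]
        total = sum(positive)
        strat = [p / total for p in positive] if total > 0 else [0.5, 0.5]
        for a in (PASS, BET):
            self.strategy_sum[a] += reach_weight * strat[a]
        return strat

    def average_strategy(self):
        total = sum(self.strategy_sum)
        return [s / total for s in self.strategy_sum] if total > 0 else [0.5, 0.5]


def cfr(cards, history, p0, p1, nodes):
    """Return the expected utility for the player to act at this history."""
    plays = len(history)
    player, opponent = plays % 2, 1 - plays % 2
    if plays > 1:
        player_has_higher = cards[player] > cards[opponent]
        if history[-1] == "p":
            # "pp" is a showdown for the antes; otherwise the previous
            # player folded to a bet and the player to act wins 1.
            return (1 if player_has_higher else -1) if history == "pp" else 1
        if history[-2:] == "bb":
            return 2 if player_has_higher else -2
    node = nodes.setdefault(str(cards[player]) + history, Node())
    strat = node.strategy(p0 if player == 0 else p1)
    util = [0.0, 0.0]
    node_util = 0.0
    for a in (PASS, BET):
        nxt = history + ACTIONS[a]
        if player == 0:
            util[a] = -cfr(cards, nxt, p0 * strat[a], p1, nodes)
        else:
            util[a] = -cfr(cards, nxt, p0, p1 * strat[a], nodes)
        node_util += strat[a] * util[a]
    # Counterfactual regret is weighted by the opponent's reach probability.
    opponent_reach = p1 if player == 0 else p0
    for a in (PASS, BET):
        node.regret_sum[a] += opponent_reach * (util[a] - node_util)
    return node_util


def train(iterations, seed=0):
    rng = random.Random(seed)
    nodes, cards, total = {}, [1, 2, 3], 0.0
    for _ in range(iterations):
        rng.shuffle(cards)  # chance sampling: one random deal per iteration
        total += cfr(cards, "", 1.0, 1.0, nodes)
    return total / iterations, nodes


if __name__ == "__main__":
    value, nodes = train(10000)
    print("average utility for the first player:", value)
    for key in sorted(nodes):
        print(key, nodes[key].average_strategy())
```

Information-set keys are the player's card followed by the betting history, so "1b" is the lowest card facing a bet. The average strategy, not the current one, is what approaches the equilibrium.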

For now I have 2k lines of Code:
Clojure 1499
ClojureScript 109
Tests 511
---------
Total: 2119

I'm still working... In about a week I think I'll write another post in this thread about my CFRM abstraction and results vs simple bots, and maybe vs FellOmen.

So I have a few questions:
1. Is anybody interested in reading this kind of topic? Should I share this?
2. How can I measure the exploitability of my CFRM strategy? I see nice graphs in the Alberta publications where exploitability decreases over time, but how do they measure it?


Posted: Sat Jul 19, 2014 8:09 am
Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
Welcome, great first post, and yes I'm interested in reading. I think you can measure the exploitability (within the abstraction) by running CFRM on one player but not the other.
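
To make the idea concrete: exploitability is how much a best-responding opponent gains against each player's fixed strategy. In a poker abstraction the best response comes from a tree walk, or can be approximated as suggested above by training only one player against the frozen strategy. The Python sketch below uses rock-paper-scissors as a stand-in zero-sum game purely to keep the example short; the function names and the division by 2 (a common but not universal convention) are assumptions.

```python
def best_response_value(payoff, opponent_mix):
    """Best expected payoff a row player can get against a fixed column mix."""
    return max(
        sum(p * q for p, q in zip(row, opponent_mix))
        for row in payoff
    )


def exploitability(payoff, row_mix, col_mix):
    """Average gain of a best responder against each player's strategy.

    payoff[i][j] is the row player's payoff; the column player gets -payoff[i][j].
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Same game seen from the column player's side (transposed and negated).
    col_payoff = [[-payoff[i][j] for i in range(n_rows)] for j in range(n_cols)]
    br_vs_col = best_response_value(payoff, col_mix)
    br_vs_row = best_response_value(col_payoff, row_mix)
    return (br_vs_col + br_vs_row) / 2


# Rows and columns are ordered rock, paper, scissors.
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]
UNIFORM = [1 / 3, 1 / 3, 1 / 3]
ALWAYS_ROCK = [1.0, 0.0, 0.0]

if __name__ == "__main__":
    print(exploitability(RPS, UNIFORM, UNIFORM))
    print(exploitability(RPS, ALWAYS_ROCK, UNIFORM))
```

The sum of the two best-response values does not depend on the game value, so an equilibrium pair scores zero and anything else scores above zero.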


Posted: Sat Jul 19, 2014 11:09 am
Junior Member
spears wrote:
Welcome, great first post, and yes I'm interested in reading. I think you can measure the exploitability (within the abstraction) by running CFRM on one player but not the other.

Hi spears,
When I run CFRM once, I get some utility, but this utility depends on the cards dealt to the players. Maybe with a different type of CFRM (like vanilla or public sampling) it is much easier. But with my CFRM I would have to run it many, many times, maybe with some sort of DIVAT technique to reduce variance.

But thinking about your answer gave me an idea. What if I first generate some cards (i.e. test data: all public and private information) and then run CFRM (without learning) on each test deal and sum the results? Something like the test-data concept in machine learning.
Example: Test1. ["K" "T"] Test2. ["2" "5"] Test3. ["8" "A"]
Then, after every 1000 iterations of CFRM, run every test and display the sum of all utilities.
Ideally this sum should tend to zero.

I will try this )

UPDATE:
It does not tend to zero. The utility tends to some negative value. At first I wondered about this, then I realized: the second player gets more information, so the second player should be more profitable.

Maybe I'll stop trying to measure exploitability for now and focus on the FL HU abstraction, and come back to exploitability afterwards.


Posted: Sun Jul 20, 2014 1:55 pm
Site Admin
You've got the idea.

I'm sceptical if you will ever make any money with a Nash Equilibrium strategy for FL HU. You might with NL HU, but even then it's not guaranteed, and it is a much bigger problem in terms of run time and memory requirements. So you have to think how to determine villain's strategy, exploit it safely, and keep the runtime and memory requirements within sensible bounds.


Posted: Tue Jul 22, 2014 9:25 pm
Junior Member
spears wrote:
I'm sceptical if you will ever make any money with a Nash Equilibrium strategy for FL HU. You might with NL HU, but even then it's not guaranteed, and it is a much bigger problem in terms of run time and memory requirements.

Yeah, I realize that, and I don't have much hope for CFR. But the algorithm is simple and doesn't need opponent modeling, so I will just run it for a few days and watch the results against bots. Maybe it will be enough for microlimits. When the FL strategy is okay, I should move on to NL (I hope changing the abstraction will not be too hard).

So, about my results over the last few days:
The bucketing system generates 9 cards and then converts them to buckets:
[Tc 8h Ad Ks 3d 7d 9s 5h 4d] -> [[2 1 2 2] [4 2 1 2]] (first part for player 1, second for player 2)

So I have a sequence of bucket vectors like [[2 1 2 2] [4 2 1 2]], and for every game I pick a random one from the sequence. With this trick I don't need to calculate EHS or build a LUT for training (only for generating the buckets). On each round a player just gets 1 number: the current strength.
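
The bucketing step could look roughly like the Python sketch below. The post does not say how strength maps to a bucket or how many buckets there are, so the 5 equal-width buckets, the function names and the input strengths are all assumptions; the strengths were picked only so that the output matches the vectors shown above.

```python
import random

NUM_BUCKETS = 5  # assumption: the post does not state the bucket count


def to_bucket(strength, num_buckets=NUM_BUCKETS):
    """Map a hand strength in [0, 1] to a 1-based, equal-width bucket index."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return min(int(strength * num_buckets), num_buckets - 1) + 1


def bucket_deal(strengths_by_player):
    """Convert per-street strengths for each player into bucket vectors."""
    return [[to_bucket(s) for s in streets] for streets in strengths_by_player]


def sample_deal(pool, rng):
    """Pick one pre-generated bucketed deal at random for a training game."""
    return rng.choice(pool)


# Hypothetical strengths on (preflop, flop, turn, river) for players 1 and 2.
deal = bucket_deal([[0.30, 0.10, 0.30, 0.35],
                    [0.70, 0.30, 0.10, 0.25]])
pool = [deal]  # in practice the pool would hold many pre-generated deals

if __name__ == "__main__":
    print(sample_deal(pool, random.Random(0)))
```

Because the pool is generated once up front, training only ever touches small integer vectors and never calls a hand evaluator.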

Example of information set:
Code:
{:cards   [1 1 1 1]
 :history {:preflop [:call :call]
           :flop    [:call :call]
           :turn    [:call :call]
           :river   [:call]}}


The abstracted FL HU poker is just 100 LOC, but the tests for it are more than 600 LOC.

Clojure 1727
ClojureScript 109
Tests 1085
---------
Total: 2921

So while CFR is learning I will try either MCTS or maybe something for opponent modeling.


Posted: Wed Jul 23, 2014 8:13 am
Site Admin
- Avoiding a lut is a clever idea. There is a small disadvantage in not covering the whole game space, but I doubt it's significant
- Making betting decisions on strength alone will not work: draws and made hands of the same strength are played differently.


Posted: Mon Aug 04, 2014 9:15 pm
Junior Member
Good news, everyone! My CFRM abstraction isn't working. I can only beat AlwaysCallBot.
Image
Image

I think there are a few possible reasons:
- "Making betting decisions on strength alone will not work: draws and made hands of the same strength are played differently". Thanks spears. To fix this, maybe I should bucket cards into a vector of 2 values: [hand strength, hand potential].
- Maybe there is some error in the implementation.
- Maybe I should run the learning process for a week or more.

So I stopped all work on CFRM. Now I'm trying MCTS, and then I'll choose one.

To try MCTS, I converted this implementation ( http://mcts.ai/code/python.html ) of the Nim game to Clojure and compared speeds.
I have known the Nim game since childhood from the TV show "Fort Boyard".

And these are the results:
100,000 iterations
- Python (http://mcts.ai/code/python.html) - 4.2s
- Clojure (only immutable data structures, like zipper) - 3.2s
- PyPy = Python+JIT (http://mcts.ai/code/python.html) - 0.9s
- Clojure Mutable Tree - 0.3s

In pure Java I think it could take about 0.2s, maybe less with some optimization. But for now I'm happy with the performance of Clojure with a mutable tree.
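
For reference, the UCT loop being timed has four steps: selection, expansion, simulation and backpropagation. Below is a compact Python sketch on a Nim variant (take 1 to 3 chips, whoever takes the last chip wins). It is a generic illustration, not the mcts.ai code and not the Clojure port benchmarked above; class and function names, the exploration constant and the iteration counts are assumptions.

```python
import math
import random


class NimState:
    """Players alternately take 1-3 chips; taking the last chip wins."""

    def __init__(self, chips, player_just_moved=2):
        self.chips = chips
        self.player_just_moved = player_just_moved  # player 1 moves first

    def clone(self):
        return NimState(self.chips, self.player_just_moved)

    def moves(self):
        return [m for m in (1, 2, 3) if m <= self.chips]

    def play(self, move):
        self.chips -= move
        self.player_just_moved = 3 - self.player_just_moved

    def result(self, player):
        # Only meaningful at a terminal state: the last mover took the last chip.
        return 1.0 if self.player_just_moved == player else 0.0


class Node:
    def __init__(self, state, move=None, parent=None):
        self.move, self.parent = move, parent
        self.children = []
        self.untried = state.moves()
        self.player_just_moved = state.player_just_moved
        self.wins, self.visits = 0.0, 0

    def select_child(self, c=math.sqrt(2)):
        # UCB1: average reward plus an exploration bonus.
        return max(
            self.children,
            key=lambda ch: ch.wins / ch.visits
            + c * math.sqrt(math.log(self.visits) / ch.visits),
        )


def uct(root_state, iterations, rng):
    root = Node(root_state)
    for _ in range(iterations):
        node, state = root, root_state.clone()
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = node.select_child()
            state.play(node.move)
        # 2. Expansion: add one untried child.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            state.play(move)
            child = Node(state, move, node)
            node.children.append(child)
            node = child
        # 3. Simulation: uniform random playout to the end of the game.
        while state.moves():
            state.play(rng.choice(state.moves()))
        # 4. Backpropagation: score each node for the player who moved into it.
        while node is not None:
            node.visits += 1
            node.wins += state.result(node.player_just_moved)
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move


if __name__ == "__main__":
    print(uct(NimState(15), 5000, random.Random(0)))
```

Storing wins from the point of view of the player who just moved lets every node use the same maximizing selection rule, whichever side is to act.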

Also, I played a little with the HHEX poker database.

Soon I'll look at MCTS for poker and share my experience.

PS: I wrote a "hello" message about myself: viewtopic.php?f=22&t=2&p=6209#p6209


Posted: Tue Aug 05, 2014 6:34 am
Site Admin
- If this is good news I wonder what is bad news...

- How much does this bot lose in terms of bb/100?
- Are you playing duplicate?
- 1000 games is not many. Are you sure the result is statistically significant? There is a discussion of this somewhere in the archive

- It would be useful to find out if you've reached convergence
- Hand potential, ehs2, and strength variance are all ways to deal with drawing strength

- There is a java implementation of MCTS at https://code.google.com/p/cspoker/sourc ... ava?r=1382
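
On the drawing-strength point: a made hand and a draw can have the same expected hand strength while the strength is distributed very differently over the cards still to come. The second moment E[HS^2] (what "ehs2" usually refers to) and the variance pick up that difference. A small Python sketch with made-up rollout outcomes; the function name and the sample values are illustrative assumptions.

```python
def ehs_stats(final_strengths):
    """Return (EHS, E[HS^2], variance) over rolled-out final hand strengths."""
    n = len(final_strengths)
    ehs = sum(final_strengths) / n
    ehs2 = sum(s * s for s in final_strengths) / n
    variance = ehs2 - ehs * ehs
    return ehs, ehs2, variance


# Hypothetical final strengths over four equally likely runouts.
made_hand = [0.5, 0.5, 0.5, 0.5]  # medium strength whatever comes
draw = [0.0, 1.0, 0.0, 1.0]       # either misses completely or hits big

if __name__ == "__main__":
    print("made hand:", ehs_stats(made_hand))
    print("draw:     ", ehs_stats(draw))
```

Both hands have the same EHS, so a strength-only bucket puts them together; bucketing on E[HS^2] or on the pair (strength, variance) separates them.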


Posted: Tue Aug 12, 2014 8:32 pm
Junior Member
Hi guys!

Another week has passed, and I made some progress with MCTS. I think only now do I understand the basics of MCTS. Between the Nim game and poker there were so many misunderstandings and bugs in my MCTS implementation. Now my bot beats SimpleBot, SmarterBot and FlockBot, but loses to ChumpBot. So I should stop playing with FL and move on to the next level. I decided to work on No-Limit Hold'em 6-max, so I'm starting to port my FL abstraction to NL.

---

Hi spears,

> If this is good news I wonder what is bad news...
I'm just a big fan of Futurama )

> How much does this bot lose in terms of bb/100?
> Are you playing duplicate?
> 1000 games is not many. Are you sure the result is statistically significant?
opentestbed uses a serialized deck, so in the first 1000 hands I already know when the bot is dealt good cards and when not, and what the graph of good play should look like.

> It would be useful to find out if you've reached convergence
It didn't reach convergence. Maybe the CFRM implementation was wrong, or there was too little training time.

> There is a java implementation of MCTS at ...
I love CSPoker. It's a really good framework. I spent many happy hours in the NetBeans debugger )


Posted: Wed Aug 13, 2014 8:25 am
Site Admin
Dowakin wrote:
I'm just a big fan of Futurama )

Thanks for the tip. Added to my rental list

Quote:
I love CSPoker. It's a really good framework. I spent many happy hours in the NetBeans debugger )

Sarcasm? Really, I never tried it and would like to know.


Posted: Wed Aug 13, 2014 10:41 am
Junior Member
spears wrote:
Quote:
I love CSPoker. It's a really good framework. I spent many happy hours in the NetBeans debugger )

Sarcasm? Really, I never tried it and would like to know.


Really good. The MCTS implementation has many pluggable components for selection, backpropagation, etc., and a good OpponentModel. And it beats all the bots I have (FellOmen2, ChumpBot, SmarterBot, etc.).


Posted: Thu Aug 28, 2014 11:13 am
Junior Member
Hi!

Another update
  • created NL abstraction (3 max and more)
  • created dev web UI
  • NL MCTS

This is image of my dev UI:
Image

My MCTS is weak. Action selection is just uniform random. The showdown model is the rollout model from CSPoker. So the "hard" part is done and I should move on to the "easy" part, I mean the opponent model )
I will start with action selection based on VPIP, AF and maybe a few other stats.
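
As a rough illustration of that plan: VPIP is the percentage of hands in which a player voluntarily puts money into the pot preflop, and AF (aggression factor) is (bets + raises) / calls. The Python sketch below computes both and replaces uniform action selection with sampling from observed action counts. The function names, the three-action set and the smoothing prior are assumptions, not the design actually used in the bot.

```python
import random
from collections import Counter

ACTIONS = ("fold", "call", "raise")


def vpip(hands_dealt, hands_voluntarily_in):
    """Percentage of hands where money went in voluntarily preflop."""
    return 100.0 * hands_voluntarily_in / hands_dealt if hands_dealt else 0.0


def aggression_factor(bets, raises, calls):
    """(bets + raises) / calls; infinite for a player who never calls."""
    return (bets + raises) / calls if calls else float("inf")


def sample_action(observed, rng, prior=1.0):
    """Sample an opponent action in proportion to observed counts.

    `prior` is a pseudo-count added to every action so that unseen
    actions keep a small probability while the sample is still small.
    """
    weights = [observed[a] + prior for a in ACTIONS]
    return rng.choices(ACTIONS, weights=weights)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    seen = Counter({"fold": 60, "call": 30, "raise": 10})
    print(vpip(200, 50), aggression_factor(30, 10, 20))
    print([sample_action(seen, rng) for _ in range(5)])
```

With few observations the prior dominates and the selection stays close to uniform; as counts grow it drifts toward the opponent's actual tendencies.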

PS: LOC stats
Clojure 3121
ClojureScript 162
Tests 1307
---------
Total: 4590


Posted: Sat Jan 10, 2015 10:54 pm
Junior Member
Hi guys!
It has been 6 months since my last post, and I haven't stopped my journey in the world of poker. I just didn't have much to say. All this time I fixed some stuff, added some attributes for opponent modeling in MCTS, and tested, tested and tested the bot.

Now my bot plays NL 6-max against other bots in Poker Academy with a small plus, about 5-10 bb. It's hard to say exactly without some DIVAT in Poker Academy, and because I constantly change things, so the result is always different. But the main point is that my bot is not good; the opponents are just so lame (in NL short-ring). And I don't think I'm ready for real play.
These are the stats:

Code:
|             :id | :count | :mean |  :std |  :allsum |
|-----------------+--------+-------+-------+----------|
| SergoSlowSeidel |   1215 |  0.17 |  9.82 |   205.50 |
|            Zeno |   3969 |  0.17 |  9.67 |   681.04 |
|         Trogdor |   3536 |  0.27 | 15.04 |   951.60 |
|           Enoch |   3614 | -0.36 | 14.15 | -1309.66 |
|        Sklansky |   3886 | -0.10 |  7.42 |  -371.41 |
|   Raspberry Jam |   3838 |  0.04 |  9.99 |   160.00 |
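
One way to read this table is to turn each row into a confidence interval for the true mean. The Python sketch below assumes that :count is the number of hands, that :mean and :std are per-hand values in the same unit, and that hands can be treated as independent with a normal approximation. None of that is stated in the post, so treat the output as a rough guide only.

```python
import math

# (count, mean, std) copied from the table above.
ROWS = {
    "SergoSlowSeidel": (1215, 0.17, 9.82),
    "Zeno": (3969, 0.17, 9.67),
    "Trogdor": (3536, 0.27, 15.04),
    "Enoch": (3614, -0.36, 14.15),
    "Sklansky": (3886, -0.10, 7.42),
    "Raspberry Jam": (3838, 0.04, 9.99),
}


def confidence_interval(count, mean, std, z=1.96):
    """Normal-approximation interval for the mean; z=1.96 gives about 95%."""
    standard_error = std / math.sqrt(count)
    return mean - z * standard_error, mean + z * standard_error


def is_significant(count, mean, std, z=1.96):
    """True when the interval excludes zero."""
    low, high = confidence_interval(count, mean, std, z)
    return low > 0 or high < 0


if __name__ == "__main__":
    for name, row in ROWS.items():
        low, high = confidence_interval(*row)
        print(f"{name:16s} [{low:+.2f}, {high:+.2f}] "
              f"significant={is_significant(*row)}")
```

Under these assumptions every interval contains zero, so a few thousand hands are not enough to separate these players; the standard error shrinks only with the square root of the hand count.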


And I can give you some advice on what I would do if I started from scratch:
1. I would start with a rule-based player (maybe tight-aggressive style).
2. Add OCR and get a working, profitable bot on micro-limits.
3. Only after that would I keep improving the bot, with CFR for preflop and MCTS for postflop.
4. For stats, use PokerTracker or HoldemManager. I spent so much time implementing stats from scratch, and I don't even know why.
5. MCTS doesn't work preflop. Or I'm doing something wrong, because for me it plays so aggressively preflop, and I don't know how to fix it or make it more reasonable (even in theory). So I just found some rules for preflop and implemented them (only for the first move).

So guys, I have spent 1 year on the poker bot and I haven't got close to a good AI result. For real play I need OCR, and this will take me a few months, maybe more. And after that I don't know whether my bot will be profitable or not. So in the next months I will find out whether I'll be working on the poker bot full-time, or just go to a "normal" job as a developer and code bots at night.

About OCR: I'm planning to use the VideoCapture API from VirtualBox, and the Mouse API for playing. I don't know how stable that API is, because when I played with it many years ago it was not stable and I got many crashes. VMware was much more stable at that time. I hope this has changed.

This is a screenshot of my Emacs and VirtualBox with Poker Academy:
Image


Posted: Sun Jan 11, 2015 8:00 pm
Site Admin
It's a tough project, that's for sure.

I can see the merit of writing a rules based bot especially if you have good poker knowledge. But instead of writing rules that use the visible cards as inputs, maybe you could write rules that take hand strengths and their variance instead. There would be much fewer rules so the system would be much more manageable.


Posted: Tue Jan 13, 2015 1:20 am
Regular Member

Joined: Wed Oct 02, 2013 5:00 pm
Posts: 64
Dowakin wrote:
About OCR: I'm planning to use the VideoCapture API from VirtualBox, and the Mouse API for playing. I don't know how stable that API is, because when I played with it many years ago it was not stable and I got many crashes. VMware was much more stable at that time. I hope this has changed.


Yes, it is very stable. I use the VirtualBox Java API with my bot for long periods (up to 10 hours) and it never crashed. But keep in mind that the screen capture for Java is very slow, about 200ms for a 1440x900 screen on a MacBook Air.

Also the Mouse API does not show the mouse pointer. It's weird but the developers said that if a program running on the guest OS queries for the mouse position, it will get the correct value.

I'm planning to try the RDP protocol (Remote Desktop), as VirtualBox has an extension for that and it looks very fast using an RDP client.


Posted: Tue Jan 13, 2015 1:10 pm
Junior Member
spears wrote:
It's a tough project, that's for sure.

I can see the merit of writing a rules based bot especially if you have good poker knowledge. But instead of writing rules that use the visible cards as inputs, maybe you could write rules that take hand strengths and their variance instead. There would be much fewer rules so the system would be much more manageable.

Another way for a good poker player is to use machine learning on their own hand history with known hole cards. I think even a few attributes can be profitable on microlimits, where so many people go all-in with weak cards (like my bot :) ), even without opponent stats. Then, step by step, the player should notice what is important for the decision in a given situation and add new features for the machine learning.

There was a PDF about a similar method where an ANN was used, but the author used logs without hole cards (or only hands that went to showdown); with hole cards it should be a much stronger bot.

I think it's a really good path for a good poker player. Sadly, I'm not such a good player. I have thought a lot about it, but I cannot find hand histories with hole cards. If someone could give me their own hand history I would be very grateful. Maybe once I have a working interface to the poker table, I will try to find a poker expert to cooperate on this method.


corintio wrote:
Yes, it is very stable. I use the VirtualBox Java API with my bot for long periods (up to 10 hours) and it never crashed. But keep in mind that the screen capture for Java is very slow, about 200ms for a 1440x900 screen on a MacBook Air.

Also the Mouse API does not show the mouse pointer. It's weird but the developers said that if a program running on the guest OS queries for the mouse position, it will get the correct value.

I'm planning to try the RDP protocol (Remote Desktop), as VirtualBox has an extension for that and it looks very fast using an RDP client.

Oh, I read about this weird mouse visibility. It may be a problem for debugging. I have played a lot with FPGAs and worked with keyboard and mouse interfaces, and in theory it's not so hard to do, but investing time in this hardware stuff only makes sense once the AI is working.

