Poker-AI.org

Poker AI and Botting Discussion Forum
It is currently Mon Nov 13, 2023 6:36 pm

All times are UTC




Post new topic Reply to topic  [ 19 posts ] 
Author Message
PostPosted: Sat Dec 16, 2017 12:48 am 
Offline
Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267
CFR+ is pretty similar to Vanilla, so far I have always used Monte Carlo methods.

So basically the value of a Turn all in should be similar to a River Showdown Node, but instead of just summing wins/losses I think you have to multiply the probability of each opponent hand with your equity against it.

Code:
   std::vector<double> ev(1326, 0);
   
   const Distribution &pholes = distributions[trainPlayer];
   const Distribution &oholes = distributions[!trainPlayer];

   for (int i = 0; i < p.size(); i++){
      double sum = 0;

      for (int j = 0; j < op.size(); j++){

         if (!pholes[i].overlaps(oholes[j])){
            sum += op[j] * (2 * turnEquities[i][j] - 1);
         }
      }

      ev[i] = sum * payoff;
   }

   return ev;


So far my games are always starting from the Turn. I created an equity table, but it takes a long time to create for each turn and it's rather big.
How do programs like Pio Solver do it? I don't think they are creatign tables, but I think they are using Vanilla or CFR+
Any ideas? You see any bugs in my code? It's loosely based on Amax' code.
One idea would of course be to just enumerate all Rivers, but I guess that would be slow?


Top
 Profile  
 
PostPosted: Sat Dec 16, 2017 2:57 pm 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
Is an equity table a table of the chance of each hand winning versus every other hand? I think I can do for the turn that in 2 seconds. Is that too long?


Top
 Profile  
 
PostPosted: Sat Dec 16, 2017 4:55 pm 
Offline
Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267
spears wrote:
Is an equity table a table of the chance of each hand winning versus every other hand? I think I can do for the turn that in 2 seconds. Is that too long?

Yes. Strange, it takes me much longer than that, more like 2 minutes, I am using the Alberta handevaluator.
Maybe I have some bugs or inefficiencies in my code, if it can be done in 2 seconds that would be more than ok, I will try to improve the table generation then.


Top
 Profile  
 
PostPosted: Sun Dec 17, 2017 6:46 pm 
Offline
Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267
Ok, I brought it down to about 5 seconds, could even go lower if I really wanted. It's always interesting how a simple thing can make C++ code run orders of magnitude slower.


Top
 Profile  
 
PostPosted: Tue Dec 19, 2017 7:23 am 
Offline
Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267
I think my code is working now for the turn. I am not sure about the speed, I compared it to what the UoA said about their implementation for deepstack, and mine seems to be even a bit faster, but when I compare it to Pio Solver it looks much slower.
Anyone here implemented vanilla or cfr+? How long does an iteration from the turn usually take?


Top
 Profile  
 
PostPosted: Tue Dec 19, 2017 7:38 am 
Offline
Senior Member
User avatar

Joined: Sun Mar 10, 2013 10:31 am
Posts: 139
Make shure you have no one operator new or std::vector.pushback in your loops.
Or any others working with memory. Only constant size arrays and operator [].


Last edited by nefton on Tue Dec 19, 2017 11:30 am, edited 1 time in total.

Top
 Profile  
 
PostPosted: Tue Dec 19, 2017 10:29 am 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
I think PioSolver may be doing something cleverer than CFR or CFR+ I don't properly understand it and it's been a while since I looked at it but I think the CMU paper on Libratus shows how CFR+ could be accelerated.


Top
 Profile  
 
PostPosted: Tue Dec 19, 2017 2:19 pm 
Offline
Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267
nefton wrote:
Make shure you have no one operator new or std::vector.pushback in your loops.
Or any others working with memory. Only constant size arrays and operator [].

I already only use constant size vectors. I never allocate memory after building the initial tree, each node has fixed vectors for all regrets, cumulative strategies, and even reach probabilities and returned evs. All I ever do is passing pointers.

spears wrote:
I think PioSolver may be doing something cleverer than CFR or CFR+ I don't properly understand it and it's been a while since I looked at it but I think the CMU paper on Libratus shows how CFR+ could be accelerated.

Ok I will check it out, thx.


Top
 Profile  
 
PostPosted: Wed Dec 20, 2017 10:58 am 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
Ok, had an idea for this. For a given turn there are at most 48 unique rivers, maybe less if you can work isomorphs in too (IIRC it's less than 30) There are 1081 hands on the river. You don't have to store an equity table on the river, just the rank of each hand. So the total storage is 1081*48, much less than the turn equity table. Do you actually need the turn equity table?

Edit1: Just noticed it's for all ins, so maybe it's faster to calculate the turn equity table.
Edit2: This isn't a huge problem. Could you solve it using linear programming?


Top
 Profile  
 
PostPosted: Fri Dec 22, 2017 11:34 pm 
Offline
Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267
I didn't really think about LP. A friend of mine once implemented the river endgame solving as described by CMU, if I remember correctly the river alone was already using up most of his RAM, so I think it's probably not possible. Then Alberta also used CFR to create the training examples for DeepStack, so I thought it's probably the best method at this point.

I see a few points that I could improve, for example I could collapse most rivers if no flush draw is possible. I tried using some isomorphisms, but didn't get it to work so far. But even then, it will be much slower than Pio, I guess they really do something different. However I think I will just let it run the way it's now. I will build some Neural Networks and see if I can replicate what DeepStack did.


Top
 Profile  
 
PostPosted: Sat Dec 23, 2017 10:41 am 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
Can you just clarify the problem you are trying to solve. Are the only possible actions allin or fold on the turn? And do you have hand probability distributions for both players?
if so, that is a very small problem and I'm fairly sure it could be solved by LP. LP would become infeasible when there are more actions.
If significant numbers of the hpd = zero you might be able to cut down the problem even more.

Edit: Also let me know how long your CFR solution is taking. I have a relatively simple idea that might improve CFR if you are just doing push fold but I need you to confirm my understanding of the problem first.


Top
 Profile  
 
PostPosted: Mon Dec 25, 2017 7:08 pm 
Offline
Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267
I am basically trying to recreate the Turn Counterfactual Neural Networks, that were used for Deepstack. On the Turn as well as on the River the possible actions are fold, call, potsize bet and all-in. Stacksize is 100, potsize ranges between 5 and 50.
The reason why I only asked about Turn all-in nodes was, because I already had code for the normal showdowns from Amax' code.

I create random ranges and a random potsize, it terminates quicker with a bigger potsize as the players have less stack left, but on average I need something like 500 seconds for 1000 iterations, which seems to be similar to the runtimes Alberta had.


Top
 Profile  
 
PostPosted: Sat Dec 30, 2017 8:45 pm 
Offline
Junior Member

Joined: Mon Sep 11, 2017 8:01 pm
Posts: 19
PioSolver switched from a variant of CFR to their own algorithm according to some comment by the author I found on twoplustwo forums.

I'm somewhat confused by your post. Equity is not needed in CFR, so I assume you're using it for all-in situations on the turn as an optimization? My intuition is that it wouldn't speed things up too much, because the non-all-in cases would dominate computation time.

Anyway, look into OMPEval. This should be doable in about one millisecond.


Top
 Profile  
 
PostPosted: Sun Dec 31, 2017 8:44 am 
Offline
Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267
Yes, I only need the equity as an optimization. It takes only a few seconds now and I only have to do it once, so it's ok.

I guess I have to do more research on Pio though. I actually assumed CFR was still the best algorithm, I mean it's used by all the top researchers afterall.


Top
 Profile  
 
PostPosted: Sun Dec 31, 2017 12:32 pm 
Offline
Junior Member

Joined: Mon Sep 11, 2017 8:01 pm
Posts: 19
I would guess PioSolver will be pretty secretive about their algorithm. If your goal is just to replicate DeepStack, an algorithm based on CFR should be enough because that's what they've used.

By the way, I haven't really looked into CFR+ yet, any idea how does it compare to CFR with external sampling (the fastest sampling varian for poker I heard)?

Also, to those who know understand CFR, what's the intuitive explanation why regrets are weighted not by probability of achieving a node, but by "external probability" (our contribution to the probability is excluded)?


Top
 Profile  
 
PostPosted: Sun Dec 31, 2017 3:22 pm 
Offline
Junior Member

Joined: Sat May 27, 2017 2:35 pm
Posts: 33
Some insight into what Pio uses now


Top
 Profile  
 
PostPosted: Sun Dec 31, 2017 7:35 pm 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
mediacalc wrote:


Nice find.

Years ago I tried a variant of fictitious play where on each iteration I
- calculated the best response of the current player
- estimated the value of the game as the average of the best responses of both players
- found the difference between the current best response value and the estimated value of the game (d)
- added the current best response * d / constant to the current strategy. At equilibrium d is zero, so no updating is required

It worked really well on the toy games I tested but failed when I tried something larger.


Top
 Profile  
 
PostPosted: Mon Jan 01, 2018 12:10 pm 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
listerofsmeg wrote:
Also, to those who know understand CFR, what's the intuitive explanation why regrets are weighted not by probability of achieving a node, but by "external probability" (our contribution to the probability is excluded)?


I'm not sure this explanation is going to survive much scrutiny but here goes:
- Hero's reach probability is taken as 1.0 because it is a hypothetical case - if I take this this or that action what is the regret? Suppose the reach probability was 0, there would be no regret.
- Villain's reach probability is taken as the current one. If that were to be zero, there would be no regret.


Top
 Profile  
 
PostPosted: Mon Jan 01, 2018 8:31 pm 
Offline
Junior Member

Joined: Mon Sep 11, 2017 8:01 pm
Posts: 19
spears wrote:
I'm not sure this explanation is going to survive much scrutiny but here goes:
- Hero's reach probability is taken as 1.0 because it is a hypothetical case - if I take this this or that action what is the regret? Suppose the reach probability was 0, there would be no regret.
- Villain's reach probability is taken as the current one. If that were to be zero, there would be no regret.


That actually makes a lot of sense, thanks.


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 19 posts ] 

All times are UTC


Who is online

Users browsing this forum: No registered users and 4 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
cron
Powered by phpBB® Forum Software © phpBB Group