Poker-AI.org

Poker AI and Botting Discussion Forum
It is currently Wed Jul 17, 2019 5:09 am

All times are UTC




Post new topic Reply to topic  [ 30 posts ]  Go to page 1, 2  Next
Author Message
PostPosted: Fri Jun 28, 2019 5:59 am 
Offline
Junior Member

Joined: Wed Jun 26, 2019 4:02 am
Posts: 22
I implemented vanilla CFR and it touches 70 million nodes per second. I'm trying to solve turn subgames currently (both players have 100% ranges, 4 cards on board specified, one bet/raise size allowed on each street). Unfortunately, one full traversal of the game tree requires touching 6.5 billion nodes, so it would take 1.5 minutes to traverse the whole game tree once. That's obviously way too slow for one iteration as I imagine solving the turn subgame to a low level of exploitability would take 10,000 iterations. Maybe there's something wrong with my game tree and it shouldn't consist of 6.5*10^12 nodes. My game tree has 3700 decision nodes and there is a chance node for dealing the river. I take care of dealing hole cards separately. Anyways, 3700 * 1326 * 1326 = 6.5 billion.

Thoughts?


Top
 Profile  
 
PostPosted: Fri Jun 28, 2019 8:12 am 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 575
Quote:
3700 * 1326 * 1326 = 6.5 billion


This sounds like each decision node is visited by just two hands. Is that what you are doing?

I have used two alternative schemes:
- visit with one training hand and a vector of opponent hands
- visit with a vector of training hands and a vector of opponent hands. If you order the river hands you can do 1326 comparisons at showdown instead of 1326 * 1326 comparisons. Unfortunately I think this also means you lose card removal effects.

I suggest you try CFR+ or sampling.
http://poker.cs.ualberta.ca/publication ... cm2017.pdf
https://poker.cs.ualberta.ca/publicatio ... 12-pcs.pdf

What language are you using?


Top
 Profile  
 
PostPosted: Fri Jun 28, 2019 12:23 pm 
Offline
Junior Member

Joined: Wed Jun 26, 2019 4:02 am
Posts: 22
spears wrote:
Quote:
3700 * 1326 * 1326 = 6.5 billion


This sounds like each decision node is visited by just two hands. Is that what you are doing?

I have used two alternative schemes:
- visit with one training hand and a vector of opponent hands
- visit with a vector of training hands and a vector of opponent hands. If you order the river hands you can do 1326 comparisons at showdown instead of 1326 * 1326 comparisons. Unfortunately I think this also means you lose card removal effects.

I suggest you try CFR+ or sampling.
http://poker.cs.ualberta.ca/publication ... cm2017.pdf
https://poker.cs.ualberta.ca/publicatio ... 12-pcs.pdf

What language are you using?


I’m using Java. C++ is probably faster, but I’m more familiar with developing a GUI in java (I’m hoping to make a commercial solver).

Either I’m not touching enough nodes per second or my game tree is too big or my approach is wrong.

Instead of dealing two hands and walking the tree for every possible combination of two hands, is it faster to pass in a 52x52 2D array (1x1326 may be better) for both ranges? I’m not sure if that would speed things up because it sounds like you end up multiplying this by a 52x52 strategy 2D array for every action and you end up just doing more work at each node, but perhaps I’m wrong.

I have heard of vector form CFR being faster than scalar form CFR. Not sure if vector form CFR refers to vectors for private information.


Top
 Profile  
 
PostPosted: Fri Jun 28, 2019 3:11 pm 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 575
Quote:
I’m using Java.
That's fast enough to start with. Just as long as you aren't using python.

Quote:
Instead of dealing two hands and walking the tree for every possible combination of two hands, is it faster to pass in a 52x52 2D array (1x1326 may be better) for both ranges?

I think passing in one 1326 long array of double and one int is faster than two ints and a double but it is a very long time ago that I compared the two so can't remember. The meat of the work is the same but the two ints has a lot more traversal overhead.

Passing in two 1326 long arrays of doubles is much faster than passing in one 1326 long array of double and one int but that speed up mostly comes from the faster showdown comparison (proportional to 1326 rather than 1326 * 1326). As I wrote before I think means you lose card removal effects.

Quote:
I have heard of vector form CFR being faster than scalar form CFR. Not sure if vector form CFR refers to vectors for private information.
If you can find the quote I'll see what I think it means.


Top
 Profile  
 
PostPosted: Fri Jun 28, 2019 4:31 pm 
Offline
Junior Member

Joined: Wed Jun 26, 2019 4:02 am
Posts: 22
Now that I'm passing in vectors of reach probabilities for both players ranges and doing O(n) terminal node evaluations I'm getting 18 iterations per second. This is much better than 1 iteration every 1.5 minutes. :D

Kind of worried it's operating too fast and there's something wrong with my code...


Top
 Profile  
 
PostPosted: Sat Jun 29, 2019 11:42 pm 
Offline
Junior Member

Joined: Wed Jun 26, 2019 4:02 am
Posts: 22
I've implemented PCS-CFR and I'm getting 787 iterations per second (1.27 ms per iteration, 1 iteration = 1 full walk of the game tree for 1 player) without any form of parallelization. Is that good? Bad? Average? My turn subgame consists of 1352 information sets and both players have 100% ranges. I heard somewhere that you'll have good converge after 1000 iterations * # of information sets which means my program would take 29 minutes to converge which is really bad since piosolver only takes 10 seconds for the same subgame.


Top
 Profile  
 
PostPosted: Sun Jun 30, 2019 7:25 am 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 575
Lots of people, me included, would love to know how piosolver works so fast. I read somewhere that it doesn't use cfr but some form of gradient descent.


Top
 Profile  
 
PostPosted: Mon Jul 01, 2019 10:42 am 
Offline
Junior Member

Joined: Wed Jun 26, 2019 4:02 am
Posts: 22
Finally got best response implemented and was able to test my PCS implementation. Everything is working since exploitability is converging to zero, but as I mentioned before my program is too slow. It takes several minutes to solve a turn subgame. It can do a river subgame pretty quickly though.

These are the functions taking up the most time in my code:

* Terminal node evaluation. I'm doing this in O(n) rather than O(n^2). Currently I'm doing 12 float operations * the number of hands in the largest range. I'm trying to think of a more clever way to account for card removal.

* Decision node evaluation. I have to create new arrays at these nodes for reach probabilities and counterfactual utilities.

* Updating cumulative strategy, getting current strategy, and updating cumulative regret.

These are things that I'm already doing:

* I have a LUT for evaluating hands on the river.

* When I sort player's ranges on the river, I add the sorted ranges to a map so I can retrieve them later for the same board.

I tried the following but they didn't improve speed:

* Creating arrays ahead of time (object pooling). Thought this would help since I create new arrays all the time for reach probabilities and counterfactual utilities.

I haven't parallelized my algorithm yet and I'm not accounting for hand isomorphisms (AhKh, AsKs, and AdKd are strategically the same on 9c7c5c), but even if my program became 4-8x faster it would still be way too slow, so I feel like I'm doing something fundamentally wrong.

Piosolver used to use cfr, so while the algorithm they're using now is faster, the fact that they used it before means it's viable. There's also a lot of new CFR variants, a couple of which may be faster than CFR+ and PCS.


Top
 Profile  
 
PostPosted: Mon Jul 01, 2019 1:52 pm 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 575
You been on this less than week and know as much as I figured out in several years so I'm not sure I can help much. I think Noam Brown managed to prune the trees so that might help. What do you know that is faster than CFR+?


Top
 Profile  
 
PostPosted: Mon Jul 01, 2019 2:41 pm 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 575
You might get some ideas from viewtopic.php?p=7843#p7843


Top
 Profile  
 
PostPosted: Mon Jul 01, 2019 3:06 pm 
Offline
Junior Member

Joined: Wed Jun 26, 2019 4:02 am
Posts: 22
spears wrote:
You been on this less than week and know as much as I figured out in several years so I'm not sure I can help much. I think Noam Brown managed to prune the trees so that might help. What do you know that is faster than CFR+?


There is a variant called Lazy CFR that allegedly converges 500 times faster than CFR+ (their Lazy CFR+ variant specifically). Pretty bold claim, but they do lazy updates, meaning they update the strategy at an information set every few iterations rather than every iteration. So on each iteration they only update O(log n) information sets. I’m having trouble understanding their pseudocode but I got most of it figured out.

There is also a variant called instant CFR, but I haven’t looked to deeply into it.


Top
 Profile  
 
PostPosted: Mon Jul 01, 2019 3:33 pm 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 575
Thanks. Interesting


Top
 Profile  
 
PostPosted: Mon Jul 01, 2019 10:43 pm 
Offline
Junior Member

Joined: Wed Jun 26, 2019 4:02 am
Posts: 22
My implementation is multithreaded now and 3x faster. It takes less only a few seconds to solve a river subgame but it takes several minutes to do a turn subgame, which is bad since piosolver only takes a few seconds to solve turn subgames.

I wishI could tell if my implementation of PCS is any good or not. I'm not aware of any benchmarks. I'm getting 750 iterations per second (1 iteration = full walk of game tree for each player) for a tree consisting of 1352 information sets.

For some reason my program gets stuck after a while on larger trees (i.e. exploitability won't go any lower and it hovers around the same value).

I experimented with pruning where if the opponent's reach probability with a hand is ~0, then I skip the float operations for that hand at terminal nodes. As far as I can tell this makes the algorithm 3-5% faster, but I haven't tested it fully.

I'm going to work on isomorphic hands and turns next and see how much that speeds things up.


Top
 Profile  
 
PostPosted: Tue Jul 02, 2019 7:09 pm 
Offline
Junior Member

Joined: Wed Jun 26, 2019 4:02 am
Posts: 22
It's kind of weird, but creating a new array every time I get the strategy at a node slows things down by a factor of 3 relative to having that array be preallocated. However, I seem to get no speedup from preconstructing arrays for subgame counterfactual utilities, even though these arrays are the same size as the strategy ones.


Top
 Profile  
 
PostPosted: Tue Jul 02, 2019 8:05 pm 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 575
Is the lifetime of one array entirely within a method and the other not?


Top
 Profile  
 
PostPosted: Tue Jul 02, 2019 9:01 pm 
Offline
Junior Member

Joined: Wed Jun 26, 2019 4:02 am
Posts: 22
spears wrote:
Is the lifetime of one array entirely within a method and the other not?


In either case, the lifetime of the array is contained within a single method call. The arrays aren't passed down recursively or anything. The only difference I can think of was that the strategy array was being created by a method in another class and returned to the method that needed it whereas the subgame counterfactual utilities array was being created within the method itself.


Top
 Profile  
 
PostPosted: Tue Jul 02, 2019 9:11 pm 
Offline
Junior Member

Joined: Wed Jun 26, 2019 4:02 am
Posts: 22
I have a new interesting problem. My solver is converging to 0% exploitability for river subgames, but for turn subgames it will hit a low exploitability (1-3%) after only 20-30k iterations, but then the exploitability will barely improve after another million iterations. I thought maybe it had something to do with the size of the game tree, but on river game trees with tons of bet sizes and ranges consisting of 60x as many hands, my solver still converges fine. I can't even converge on a turn subgame with 18 combos for each player, 1 bet size, and a stack to pot ratio of 1.

For turn subgames, I sample the river ahead of time and set the reach probabilities for the appropriate hole card combinations to 0, but maybe I need to be renormalizing in some way.


Top
 Profile  
 
PostPosted: Tue Jul 02, 2019 10:03 pm 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 575
Quote:
I also checked that river cards were being dealt uniformly, so that's not the issue.


Are you sampling rivers? Can you create a small version of the game that doesn't sample rivers to eliminate errors arising from sampling?


Top
 Profile  
 
PostPosted: Tue Jul 02, 2019 10:46 pm 
Offline
Junior Member

Joined: Wed Jun 26, 2019 4:02 am
Posts: 22
spears wrote:
Quote:
I also checked that river cards were being dealt uniformly, so that's not the issue.


Are you sampling rivers? Can you create a small version of the game that doesn't sample rivers to eliminate errors arising from sampling?


By turn subgame I mean four cards are already on the board and then I sample rivers. If I don't sample river with four cards on the board then I'm just doing vanilla CFR instead of PCS.


Top
 Profile  
 
PostPosted: Tue Jul 02, 2019 10:52 pm 
Offline
Junior Member

Joined: Wed Jun 26, 2019 4:02 am
Posts: 22
If anyone is curious, I'm getting faster convergence if I reset negative cumulative regret to 0 after every iteration and if I multiply the cumulative strategy by (n/(n+1))^2 after updating it whenever iterationCount % possible board runouts == 0, where n = iterationCount % possible board runouts. Basically applying CFR+ techniques to PCS.


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 30 posts ]  Go to page 1, 2  Next

All times are UTC


Who is online

Users browsing this forum: No registered users and 1 guest


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
cron
Powered by phpBB® Forum Software © phpBB Group