Poker-AI.org • View topic - Public chance sampling

View unanswered posts | View active topics

Board index » Public Forums » AI Research

All times are UTC

Public chance sampling

Page 2 of 2

[ 28 posts ]

Go to page Previous 1, 2

Print view

Previous topic | Next topic

Author

Message

FlashPlayer

Post subject: Re: Public chance sampling

Posted: Mon Apr 22, 2019 10:31 am

Junior Member

Joined: Tue Jun 28, 2016 7:12 pm
Posts: 20

Looks like anyway in any CFR method each node has few iterations that used it in training. So we always must have a strategy there even if it will be poorly researched.

I am also working on best response inplementation now and very confused how commercial solvers (CREv, Monker, Pio etc) compute it so fast? They compute it on fly. But as i understand right - coarse algo here is:
1. Train few iterations with CFR
2. Stop training and compute BR. BR here is a brute force of all chance situations (our range, opp range, board). In every situation we calculate our best response and in the end sum them with probabilities (i mean i know what is BR and how it is calculated in general).
3. Go to 1 and so on..

But point 2 takes lot of time because this is at least one tree travel with lot of computations inside. But as i mentioned before - solvers do this "on fly" like they store and reuse some data during training that helps them to compute BR fast.
I searched for any algos for this, but didn't find anything except http://martin.zinkevich.org/publication ... 1_rgbr.pdf. Algo in this paper states how to minimize tree walks to one. But computations are still too long - checked it.

Any ideas what helps them (solvers) compute BR in second?

P.S (found after more researching): can it be something called "Frequentist Best Response" described here: https://poker.cs.ualberta.ca/publicatio ... -rnash.pdf ?

Top

spears

Post subject: Re: Public chance sampling

Posted: Tue Apr 23, 2019 9:03 am

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

I would love to know how commercial programs are as fast as they are, both for solving NE and for calculating BR. Somebody on this board HontoNiBaka? recently pointed out that the trees they use are a lot smaller than you would expect so maybe they exploit isomorphisms. I've never tried myself, but can you use sampling to speed up BR?

I doubt FBR helps you as I expect you are calculating BR in your own abstraction already.

Top

FlashPlayer

Post subject: Re: Public chance sampling

Posted: Tue Apr 23, 2019 2:05 pm

Junior Member

Joined: Tue Jun 28, 2016 7:12 pm
Posts: 20

spears wrote:

I tried to use sampling for BR computing but for some reason results are always different with "brute force" BR calculations. Very different. So i'm sure that i made it totally wrong but didn't find a reason.

What i did is just this:
1. Sample player cards, opp cards and board.
2. Walk the tree like we are walking it in classic BR algo - choosing max ev in player nodes and weighting ev (weights are from average strategy of course) in opponent nodes.
3. In terminal nodes as we have all info about cards and board i can calcultate utility - so everything looks easy here (no hand vs range calculations).
3. As result of this walk i have a br that i sum with other br from other iterations and in the end just divide on iterations count.

For some reason this resulting in wrong BRs. And i think i have some error in approach. Like maybe i don't need to sample opponent cards and only use his range or smth like this. So i tried this and then returned to optimizations of "brute force" approach.

About isomorphisms - yes it could speed up, but for example for simple push/checkdown tree from flop (around 13 nodes at all include checks on turn and river) my BR for one player calculates around 80 seconds in one thread. And this is after i did ALL calculations of hand ranks for showdowns. So literally this is a time that CPU needs to iterate through all tripples like hand1/hand2/board and check which rank is bigger in every utility node. Isomorphism can speed up this but not sure it can speed up somehow to 40 times.
Anyway speed of calculations depends on how we preparing data before tree walk.

I think i should describe my current algo a bit - maybe we can find weak place.

After CFR algo we have a tree. I'm using abstractions, so in each node (actually on each street) for every pair hand/board i have a bucket ("information set" they call it, but i preffer a "bucket"). So as result of CFR algo i have a strategy (probabilities with which i should choose every child action here) for each bucket in every node.
Now i take player's range, opponent's range and initial board (i am working on postlop solver, so i have a board from start - lets assume we are talking about a tree from flop - so i have three board cards on start).
Next i iterate through all player's hands and all possible boards. One iteration has on start player hand, opp range (i don't iterate opp range), full board (5 cards) and a probability of this choise. In general probability here is a player's hand probability, because board has same probability every time.
In iteration main idea is to take opponent's range and update probabilities of each hand walking through tree. In each opponent's node i take every hand from range, calculate a bucket for them (i know a board, so i can get a bucket), and then multiply this hand probability on strategy probability for this bucket in this node. And so on until we are not in utility node.
Inside utility node i still have player's hand (it is needed only here) and updated opponent's range. So i can calculate an utility just comparing hand ranks (with probability) of all opponent's range and player's hand rank.

That's all.

Problem here is that i need to iterate all possible boards. I can't take only opponent's range and with only one tree walk update all hands probabilities for all boards. Because in each opponent's node i need a board to get bucket. And i need a bucket to get strategy to update probabilities. And so on.
Yes of course i can do one tree walk, but then i will need to iterate through all boards in every utility node. Anyway this loop must be somewhere. And as i undestand right there is a major upgrade of this algo that removes this "all boards" loop somehow. But i cant find this ugrade)

Any ideas?) Maybe i'm doing something totally wrong?

Top

spears

Post subject: Re: Public chance sampling

Posted: Wed Apr 24, 2019 3:53 pm

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

Have you tried these from above?
viewtopic.php?p=7165#p7165
viewtopic.php?p=4125#p4125

Sorry, can't put much time into this, very busy at the moment.

Top

FlashPlayer

Post subject: Re: Public chance sampling

Posted: Thu Apr 25, 2019 8:58 am

Junior Member

Joined: Tue Jun 28, 2016 7:12 pm
Posts: 20

spears wrote:

Have you tried these from above?
viewtopic.php?p=7165#p7165
viewtopic.php?p=4125#p4125

Sorry, can't put much time into this, very busy at the moment.

Thanks)) That was really helpful.
Irony was that all answers are in the same topic lol)
I checked amax code and this helped alot with fast BR realization.

Actualy about this topic. At this moment there are so much CFR algo variants. Sampling class and not sampling. So general question is - what algo at this moment is a state of art in terms of fast convergence for poker domain (HU NLH)? I am interested in sampling algos because they can be paralleled with good efficiency.
I know that in general all sampling algos (Monte Carlo class of algo) are faster than non-sampling (like CFR, CFR+, Pure CFR etc). And PCS is the best in sampling class. Any updates with last years?
Also i heard that Pio solver uses non-CFR class algo inside. What type of algos to find equilibria also exists?

Top

spears

Post subject: Re: Public chance sampling

Posted: Thu Apr 25, 2019 9:15 am

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

CFR+ is quick.
There is good info and links in http://www.poker-ai.org/phpbb/viewtopic.php?f=26&t=3100

Myself, I'm going for my own low cost variant of DeepStack. This is the zeitgeist. AlphaGo, DeepStack, Libratus all use machine learning to generate evaluation functions at the leaves of the current tree.

Top

FlashPlayer

Post subject: Re: Public chance sampling

Posted: Fri Apr 26, 2019 6:33 pm

Junior Member

Joined: Tue Jun 28, 2016 7:12 pm
Posts: 20

Do you use your DeepStack implementation for botting?

Top

spears

Post subject: Re: Public chance sampling

Posted: Fri Apr 26, 2019 9:05 pm

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

Still developing around the day job.

Top

Page 2 of 2

[ 28 posts ]

Go to page Previous 1, 2

Board index » Public Forums » AI Research

All times are UTC

Who is online

Users browsing this forum: No registered users and 1 guest

You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum