Poker-AI.org

Poker AI and Botting Discussion Forum
It is currently Wed Jan 19, 2022 2:06 pm

All times are UTC




PostPosted: Sun Dec 26, 2021 5:48 pm 
Offline
New Member

Joined: Sat Dec 25, 2021 9:00 am
Posts: 5
I implemented an MCCFR algorithm for HU and calculated some blueprint strategies for different stack sizes.
My card abstraction is based on bucket sizes of 169-200-200-200 per street and on the earth mover's distance metric. My action abstraction allows 6 different actions and is limited to 4 actions per street.
The biggest problem is saving and loading blueprints. I cannot keep the entire game tree in memory at once, as I quickly run out of RAM. For that reason, I do runs that use my entire RAM (~64GB), save all nodes to disk, and then resume training.
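For the bucketing step above, the 1-D earth mover's distance between two equity histograms (of equal total mass) reduces to summing the absolute differences of the running cumulative sums. A minimal sketch - the function name is mine, not from the original code:

```cpp
#include <vector>
#include <cmath>

// 1-D earth mover's distance between two histograms with equal total mass.
// The running cumulative difference is the net mass that still has to be
// moved to the right past bin i; summing its absolute value gives the EMD.
double emd1d(const std::vector<double>& p, const std::vector<double>& q) {
    double cumulative = 0.0, distance = 0.0;
    for (size_t i = 0; i < p.size(); ++i) {
        cumulative += p[i] - q[i];
        distance += std::fabs(cumulative);
    }
    return distance;
}
```

Clustering hands into the 200 buckets per street would then run k-means (or similar) with this distance over the per-hand equity histograms.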

For now, I just put everything in a folder structure that represents the game tree and create a small binary file for each leaf holding the strategy. This makes it quite convenient to navigate the game tree, but saving and loading takes 95% of the training time. (And deleting the tree from disk takes ages...)
I also tried storing a single binary file for the entire blueprint strategy, but that does not help if I re-run the training and need to load only parts of the blueprint strategy into memory.

I am currently thinking of creating a database with nested sets. Once the entire game tree is created, its topology should not change (assuming you stick to your card and action abstraction), and I can update the leaves in place.
Has anyone tried something in that direction, or is there an alternative approach you could recommend?
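Since the topology is fixed once the abstraction is fixed, one alternative to a database is a single flat file where each node id maps to a fixed-size record at a known offset; any subtree can then be loaded or updated by seeking, without touching the rest of the file. A sketch under that assumption - NodeRecord and NUM_ACTIONS are illustrative names, not from the original post:

```cpp
#include <cstdio>
#include <cstdint>

constexpr int NUM_ACTIONS = 6;  // matches the 6-action abstraction above

// Fixed-size record per information-set node, so node id * sizeof(record)
// is the byte offset in the file. No index structure is needed on disk.
struct NodeRecord {
    float regret[NUM_ACTIONS];
    float strategySum[NUM_ACTIONS];
};

bool saveNode(std::FILE* f, std::uint64_t nodeId, const NodeRecord& rec) {
    if (std::fseek(f, long(nodeId * sizeof(NodeRecord)), SEEK_SET) != 0) return false;
    return std::fwrite(&rec, sizeof(NodeRecord), 1, f) == 1;
}

bool loadNode(std::FILE* f, std::uint64_t nodeId, NodeRecord& rec) {
    if (std::fseek(f, long(nodeId * sizeof(NodeRecord)), SEEK_SET) != 0) return false;
    return std::fread(&rec, sizeof(NodeRecord), 1, f) == 1;
}
```

Memory-mapping the same file (mmap) would let the OS page in only the subtrees a training run actually touches, which addresses the partial-loading problem directly.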


PostPosted: Mon Dec 27, 2021 2:40 pm 
Offline
New Member

Joined: Sat Dec 25, 2021 9:00 am
Posts: 5
I ran my first HU training tonight for effective stack sizes of 25BB and 6BB. The screenshots show the effective stack sizes in BB/10. I aggregated all raise sizes and just show the probability of raise/all-in, check/fold, and call for the first action of the SB. Interestingly, the strategy seems to converge toward completing with pocket pairs.

Attachment: 25BB.png

Attachment: 6BB.png


I still have not worked out a proper way to save blueprints. For now, I am digging through Fossana's C++ solver; I added support for post-river bets and removed the Threading Building Blocks dependency. I think I got it working - but I don't have any GTO solver at hand to compare results against. In the next few days, I'll try to plug some ranges from my blueprint strategies into the solver and do some research on depth-limited solving.


I'll keep you posted!

Thanks,
Ace


PostPosted: Wed Jan 05, 2022 7:05 pm 
Offline
New Member

Joined: Sat Dec 25, 2021 9:00 am
Posts: 5
Here is a short update: after running 5 million iterations of discounted MCCFR, I am able to significantly beat SAGE on the effective stack sizes that SAGE applies to (3 million sample hands rolled out). It is fun to play with this type of analysis.

Since this topic does not receive any feedback and the board seems rather dead, I will suspend this thread.

Thanks,
Ace


PostPosted: Wed Jan 12, 2022 12:33 pm 
Offline
New Member

Joined: Fri Mar 12, 2021 3:53 pm
Posts: 8
hey, sorry the forum is pretty dead

you should not be running out of memory with an abstraction that small. how many preflop/flop/turn/river nodes are in your public tree?

saving/loading the blueprint is also pretty simple: just save/load the strategy to/from disk. if you are storing the regrets/strategy at each node, then just iterate over every node and save/load it.
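A minimal sketch of that "iterate over every node" approach - stream every node's regrets and strategy sums to disk in a fixed order, and read them back in the same order. The Node type and container here are illustrative assumptions, not the poster's actual types:

```cpp
#include <cstdio>
#include <vector>

// Illustrative per-node storage: one regret and one strategy-sum
// accumulator per abstract action at this information set.
struct Node {
    std::vector<float> regret;
    std::vector<float> strategySum;
};

// Dump all nodes sequentially. The node order and per-node sizes are
// implied by the (fixed) tree topology, so no index needs to be written.
bool saveBlueprint(std::FILE* f, const std::vector<Node>& nodes) {
    for (const Node& n : nodes) {
        if (std::fwrite(n.regret.data(), sizeof(float), n.regret.size(), f) != n.regret.size()) return false;
        if (std::fwrite(n.strategySum.data(), sizeof(float), n.strategySum.size(), f) != n.strategySum.size()) return false;
    }
    return true;
}

// Assumes `nodes` was rebuilt with the same topology (same sizes) as when saved.
bool loadBlueprint(std::FILE* f, std::vector<Node>& nodes) {
    for (Node& n : nodes) {
        if (std::fread(n.regret.data(), sizeof(float), n.regret.size(), f) != n.regret.size()) return false;
        if (std::fread(n.strategySum.data(), sizeof(float), n.strategySum.size(), f) != n.strategySum.size()) return false;
    }
    return true;
}
```

One sequential pass over one file avoids the per-leaf file-open overhead that was eating 95% of the training time.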

your 25bb results look off. that is the small blind strategy? the only hands that should be folding are like 92o-32o, maybe 83o/73o.

what is discounted mccfr? the way regret weighting works in cfr+/discounted cfr doesn't apply well to mccfr because of the variance. linear mccfr is a method that does a big block of iterations and then weights the blocks.

5m iterations for mccfr is really really small. it doesn't sound like external sampling mccfr. are you using something else? maybe public chance sampling?

pm me if you want another place to discuss this type of stuff


PostPosted: Fri Jan 14, 2022 2:44 pm 
Offline
New Member

Joined: Sat Dec 25, 2021 9:00 am
Posts: 5
Thanks for your reply!

I will calculate the size of the tree and estimate the memory needs. Maybe there is a mistake on my end. Thanks for the hint!

By discounted MCCFR, I meant that I discount the regrets as described in the Pluribus supplementary material:
https://www.science.org/doi/suppl/10.1126/science.aay2400/suppl_file/aay2400-brown-sm.pdf
I am not sure whether this differs from linear MCCFR?

Code:
// At iteration t, scale the accumulated regrets by t/(t+1), as in the Pluribus supplement
float decay = float(m_iteration) / (float(m_iteration) + 1.0f);
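Note that the Pluribus supplement's linear discounting multiplies the accumulated regrets (and strategy sums) by t/(t+1) after iteration t. Applied to a node, that step could look like this - discountNode and the vector types are illustrative names, not the poster's actual code:

```cpp
#include <vector>

// Hypothetical per-node discounting pass: after iteration t, scale every
// accumulated regret and strategy sum by t/(t+1). This is equivalent to
// weighting iteration t's contribution linearly in t.
void discountNode(std::vector<float>& regret,
                  std::vector<float>& strategySum, int t) {
    float decay = float(t) / (float(t) + 1.0f);
    for (float& r : regret) r *= decay;
    for (float& s : strategySum) s *= decay;
}
```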


I am using external sampling MCCFR, as far as I understand.
And you are right, 5M iterations is indeed a very small number of runs, but as I said, I cannot run more for now without running into memory problems - I believe there is a mistake on my end then... I will double-check my code and count my public nodes. Thanks for your help! Highly appreciated!

Thanks
Ace

