Poker-AI.org

Poker AI and Botting Discussion Forum

All times are UTC




 Post subject: Best Response Sampling
Posted: Wed Apr 30, 2014 12:12 pm
Junior Member

Joined: Sat Nov 02, 2013 2:21 pm
Posts: 26
Hey,
If we use Monte Carlo CFR and train only one player while holding the opponent's strategy constant, will it converge to a best response?


Posted: Thu May 01, 2014 9:27 am
Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
Yes, I think I did that a long time ago. Very easy to test though...


Posted: Mon May 12, 2014 8:24 am
Junior Member

Joined: Sat Nov 02, 2013 2:21 pm
Posts: 26
Thanks for your answer, spears.
I tested it with Kuhn poker and it seems to work there at least. I'm asking because I really want to avoid implementing best response for hold'em with imperfect-recall bucketing, because that's ugly. :)
IMO it totally makes sense that CFRM should converge to a best response if we train only one player, but I hope someone can confirm it. Or am I missing an obvious, easy way to test it without calculating the real best response within the abstraction?
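
For anyone who wants to repeat the Kuhn poker check, here is a minimal sketch, not the code anyone in this thread actually used. It is my own illustration under stated assumptions: the opponent (player 1) is frozen at a uniform random strategy, and the traversal is full (vanilla) CFR rather than Monte Carlo sampling so that the result is deterministic. All names (`opponent`, `cfr`, `train`, `value`, `best_response_value`) are hypothetical. The value of the trained average strategy is compared with a brute-force best response over all pure strategies of player 0.

```python
import itertools

ACTIONS = "pb"  # p = pass (check or fold), b = bet (or call)
DEALS = list(itertools.permutations(range(3), 2))  # (player 0 card, player 1 card)
KEYS = [(c, h) for c in range(3) for h in ("", "pb")]  # player 0 information sets

def payoff(h, cards):
    """Payoff to player 0 at terminal history h, or None if play continues."""
    if h == "bp":
        return 1.0   # player 1 folds to a bet
    if h == "pbp":
        return -1.0  # player 0 folds to a bet
    if h in ("pp", "bb", "pbb"):
        stake = 1.0 if h == "pp" else 2.0
        return stake if cards[0] > cards[1] else -stake
    return None

def opponent(card, h):
    """Fixed player 1 strategy. Uniform random is an assumption of this sketch."""
    return [0.5, 0.5]

regret = {k: [0.0, 0.0] for k in KEYS}
strat_sum = {k: [0.0, 0.0] for k in KEYS}

def regret_matching(r):
    pos = [max(x, 0.0) for x in r]
    total = sum(pos)
    return [x / total for x in pos] if total > 0 else [0.5, 0.5]

def cfr(h, cards, reach0, reach1, sigma):
    """Return player 0's expected payoff; update regrets for player 0 only."""
    u = payoff(h, cards)
    if u is not None:
        return u
    if len(h) % 2 == 1:  # player 1 acts with the frozen strategy, no update
        probs = opponent(cards[1], h)
        return sum(p * cfr(h + a, cards, reach0, reach1 * p, sigma)
                   for a, p in zip(ACTIONS, probs))
    key = (cards[0], h)
    probs = sigma[key]
    utils = [cfr(h + a, cards, reach0 * p, reach1, sigma)
             for a, p in zip(ACTIONS, probs)]
    node = sum(p * v for p, v in zip(probs, utils))
    for i in range(2):
        regret[key][i] += reach1 * (utils[i] - node)  # counterfactual regret
        strat_sum[key][i] += reach0 * probs[i]        # reach-weighted average
    return node

def train(iterations):
    for _ in range(iterations):
        # Strategy is fixed for the whole iteration, then regrets are updated.
        sigma = {k: regret_matching(regret[k]) for k in KEYS}
        for deal in DEALS:
            cfr("", deal, 1.0, 1.0, sigma)

def average(key):
    total = sum(strat_sum[key])
    return [x / total for x in strat_sum[key]] if total > 0 else [0.5, 0.5]

def value(policy):
    """Expected payoff to player 0 when it plays policy(key) -> [P(p), P(b)]."""
    def walk(h, cards):
        u = payoff(h, cards)
        if u is not None:
            return u
        probs = opponent(cards[1], h) if len(h) % 2 == 1 else policy((cards[0], h))
        return sum(p * walk(h + a, cards) for a, p in zip(ACTIONS, probs))
    return sum(walk("", d) for d in DEALS) / len(DEALS)

def best_response_value():
    """Brute force over all 2**6 pure strategies of player 0."""
    best = float("-inf")
    for choice in itertools.product(range(2), repeat=len(KEYS)):
        pure = dict(zip(KEYS, choice))
        best = max(best, value(lambda k: [1.0 - pure[k], float(pure[k])]))
    return best

train(1000)
trained = value(average)
exact = best_response_value()
print(f"trained one-sided CFR value: {trained:.4f}")
print(f"exact best-response value:   {exact:.4f}")
```

The two printed values should be close, with the trained value never above the exact one. Compare values rather than action probabilities: information sets that the trained player stops reaching can keep a near-uniform average without hurting the value. The brute-force check only scales to tiny games like Kuhn poker.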


Posted: Mon May 12, 2014 3:59 pm
Senior Member

Joined: Mon Mar 11, 2013 10:24 pm
Posts: 216
Yes, you'll get a best response. That's basically how CFRM works: if you train both players, each keeps readjusting toward a best response against the other, which leads to a Nash equilibrium.


Posted: Tue Oct 07, 2014 1:34 am
Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267
Why is implementing a best response ugly? You will walk the real game tree in BR anyway, so the abstraction you used in CFRM doesn't matter much.


Posted: Sun Mar 01, 2015 10:40 am
Veteran Member

Joined: Wed Mar 20, 2013 1:43 am
Posts: 267
Just saw this thread again. The problem with the OP's approach is that you will indeed get a best response, but it is the best response within the abstract game, not the unabstracted one. In other words, you get the best response that can be represented with the buckets you used in your CFRM. Its EV is also only valid within your abstraction; it might be much worse in the real game.

You will have to do the unabstracted best response in order to know your real game exploitability.
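
A toy sketch of why the bucketed best response can understate exploitability. Everything here is invented for illustration: a single call-or-fold decision, six equally likely hands with made-up call EVs against some fixed opponent, and a two-bucket abstraction. The names `call_ev`, `buckets` and both functions are hypothetical. A responder that must pick one action per bucket can never earn more than one that decides per hand.

```python
# Hypothetical EV of calling (folding is worth 0) for six equally likely
# hands against some fixed opponent strategy. The numbers are invented.
call_ev = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]

# Hypothetical card abstraction: hands in one bucket must share an action.
buckets = [[0, 1, 2, 3], [4, 5]]

def real_best_response_value(evs):
    """Choose call or fold separately for every hand."""
    return sum(max(ev, 0.0) for ev in evs) / len(evs)

def abstract_best_response_value(evs, buckets):
    """Choose call or fold once per bucket, so mixed buckets lose value."""
    total = 0.0
    for bucket in buckets:
        total += max(sum(evs[h] for h in bucket), 0.0)
    return total / len(evs)

real = real_best_response_value(call_ev)
abstract = abstract_best_response_value(call_ev, buckets)
print(f"unabstracted best response: {real:.4f}")
print(f"bucketed best response:     {abstract:.4f}")
```

In this toy the hand worth +1 sits in a bucket that is negative overall, so the bucketed responder folds it and gives up that value. The gap between the two numbers is the part of the exploitability that the abstraction hides.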

