Poker-AI.org
http://poker-ai.org/phpbb/

Best Response Sampling
http://poker-ai.org/phpbb/viewtopic.php?f=24&t=2742
Page 1 of 1

Author:  flopnflush [ Wed Apr 30, 2014 12:12 pm ]
Post subject:  Best Response Sampling

Hey,
if we use Monte Carlo CFR and train only one player while holding the opponent's strategy constant, will it converge to a best response?

Author:  spears [ Thu May 01, 2014 9:27 am ]
Post subject:  Re: Best Response Sampling

Yes, I think I did that a long time ago. Very easy to test though...

Author:  flopnflush [ Mon May 12, 2014 8:24 am ]
Post subject:  Re: Best Response Sampling

Thanks for your answer spears.
I tested it with Kuhn poker and it seems to work there at least. I'm asking because I really want to avoid implementing best response for hold'em with imperfect-recall bucketing, because that's ugly. :)
IMO it totally makes sense that CFRM should converge to a best response if we train only one player, but I hope someone can confirm it. Or am I missing an obvious, very easy way to test it without calculating the real best response within the abstraction?
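
A minimal sketch of the Kuhn poker check described above, not the code actually used in this thread. It assumes chance-sampled CFR as the Monte Carlo variant, an arbitrary fixed opponent (opp_strategy, a made-up rule that bets or calls more often with higher cards), and a brute-force best response over all 64 pure strategies, which is only feasible in a toy game. All function and variable names are illustrative.

```python
import random
from itertools import permutations, product

ACTIONS = "pb"                      # p = pass/check/fold, b = bet/call
HERO, OPP = 0, 1                    # HERO acts first and is the only learner
DEALS = list(permutations(range(3), 2))                 # (hero card, opp card)
KEYS = [(c, h) for c in range(3) for h in ("", "pb")]   # hero information sets
TERMINAL = {"pp", "bb", "bp", "pbp", "pbb"}

def payoff(h, cards):
    """Payoff to HERO at terminal history h (each player antes 1)."""
    if h == "bp":                   # opponent folded to hero's bet
        return 1.0
    if h == "pbp":                  # hero folded to opponent's bet
        return -1.0
    stake = 1.0 if h == "pp" else 2.0
    return stake if cards[HERO] > cards[OPP] else -stake

def opp_strategy(card, h):
    """Fixed opponent (arbitrary assumption): bet/call probability grows with the card."""
    bet = 0.25 * (card + 1)
    return [1.0 - bet, bet]

regret = {k: [0.0, 0.0] for k in KEYS}
strat_sum = {k: [0.0, 0.0] for k in KEYS}

def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else [0.5, 0.5]

def current_strategy(key):
    """Regret matching on the positive part of the cumulative regrets."""
    return normalise([max(r, 0.0) for r in regret[key]])

def average_strategy(key):
    return normalise(strat_sum[key])

def cfr(h, cards, hero_reach, opp_reach):
    """Return hero's value for one sampled deal; only hero's regrets are updated."""
    if h in TERMINAL:
        return payoff(h, cards)
    if len(h) % 2 == OPP:           # opponent node: expectation over the fixed strategy
        probs = opp_strategy(cards[OPP], h)
        return sum(p * cfr(h + a, cards, hero_reach, opp_reach * p)
                   for a, p in zip(ACTIONS, probs))
    key = (cards[HERO], h)
    sigma = current_strategy(key)
    values = [cfr(h + a, cards, hero_reach * s, opp_reach)
              for a, s in zip(ACTIONS, sigma)]
    node_value = sum(s * v for s, v in zip(sigma, values))
    for i in range(2):
        regret[key][i] += opp_reach * (values[i] - node_value)
        strat_sum[key][i] += hero_reach * sigma[i]
    return node_value

def value(h, cards, hero_policy):
    """Exact value of hero_policy against the fixed opponent for one deal."""
    if h in TERMINAL:
        return payoff(h, cards)
    if len(h) % 2 == OPP:
        probs = opp_strategy(cards[OPP], h)
    else:
        probs = hero_policy((cards[HERO], h))
    return sum(p * value(h + a, cards, hero_policy) for a, p in zip(ACTIONS, probs))

def expected_value(hero_policy):
    return sum(value("", d, hero_policy) for d in DEALS) / len(DEALS)

def best_response_value():
    """Brute force: try every pure hero strategy and keep the best value."""
    best = float("-inf")
    for choice in product(range(2), repeat=len(KEYS)):
        table = dict(zip(KEYS, choice))
        pure = lambda key, t=table: [1.0 - t[key], float(t[key])]
        best = max(best, expected_value(pure))
    return best

rng = random.Random(0)
for _ in range(50000):              # one sampled deal per iteration
    cfr("", rng.choice(DEALS), 1.0, 1.0)

br = best_response_value()
learned = expected_value(average_strategy)
print(f"best response value: {br:.4f}  one-sided CFR value: {learned:.4f}")
```

If the two printed values are close, the one-sided training has (approximately) found a best response in this toy game. That says nothing about how the result transfers to an abstracted hold'em game.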

Author:  proud2bBot [ Mon May 12, 2014 3:59 pm ]
Post subject:  Re: Best Response Sampling

Yes, you'll get a best response. That's basically how CFRM works: if you train both players, they keep readjusting toward a best response against each other, which leads to a Nash equilibrium.

Author:  HontoNiBaka [ Tue Oct 07, 2014 1:34 am ]
Post subject:  Re: Best Response Sampling

Why is implementing a best response ugly? You will walk the real game tree in BR anyway, so the abstraction you used in CFRM doesn't matter much.

Author:  HontoNiBaka [ Sun Mar 01, 2015 10:40 am ]
Post subject:  Re: Best Response Sampling

Just saw this thread again. The problem with OP's approach is that you will really get a best response, but it is the abstract game best response, not the unabstracted game best response. In other words, you will get the best response that can be represented with the buckets you used in your CFRM. Its EV will also only be valid in your abstraction; it might be much worse in the real game.

You will have to do the unabstracted best response in order to know your real game exploitability.
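
One way to write down the point above, as a sketch under a stated assumption: every strategy that can be expressed with the buckets is also a legal strategy in the real game, so the abstract strategy set \Sigma_1^{A} is a subset of the full set \Sigma_1. The symbols are introduced here for illustration and do not come from the thread.

```latex
\max_{\sigma_1 \in \Sigma_1^{A}} u_1(\sigma_1, \sigma_2)
\;\le\;
\max_{\sigma_1 \in \Sigma_1} u_1(\sigma_1, \sigma_2)
```

Here \sigma_2 is the fixed opponent strategy and u_1 is the responder's expected payoff in the real game. Under that assumption, the value reached by training one player inside the abstraction can only underestimate how much the real game best response wins against \sigma_2.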

All times are UTC