Poker-AI.org
http://poker-ai.org/phpbb/

Calculate Exploitability and convergence
http://poker-ai.org/phpbb/viewtopic.php?f=24&t=2645

Author:  MrNice [ Mon Nov 18, 2013 8:47 am ]
Post subject:  Calculate Exploitability and convergence

Hi Guyz,

I'm working on CFRM-CS (chance-sampled CFRM) for FLHU.

Which values should I look at to see whether my implementation is converging?

And by the way, how should I measure exploitability? Should I calculate a best response against each player's strategy and then compare, sum, or subtract the two values?

Thanks for your help.

MrNice

Author:  flopnflush [ Mon Nov 18, 2013 9:27 am ]
Post subject:  Re: Calculate Exploitability and convergence

AFAIR the sum of the two best-response values (one per player, each computed against the other player's strategy) should converge to zero.
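
To make that concrete, here is a minimal sketch of the check on a toy matrix game (rock-paper-scissors) rather than poker. The function names and the payoff matrix are made up for the example; for FLHU each best-response value would come from a full tree walk instead of a row maximum.

```python
from typing import List

# Payoff to the row player; rows and columns are rock, paper, scissors.
RPS = [
    [0.0, -1.0, 1.0],
    [1.0, 0.0, -1.0],
    [-1.0, 1.0, 0.0],
]


def best_response_value(payoff: List[List[float]], opponent: List[float]) -> float:
    """Best the row player can do against a fixed column mix.

    A best response can always be a pure action, so taking the maximum
    over rows is enough.
    """
    return max(sum(p * q for p, q in zip(row, opponent)) for row in payoff)


def exploitability(payoff: List[List[float]],
                   row_strategy: List[float],
                   col_strategy: List[float]) -> float:
    """Sum of both players' best-response values in a zero-sum matrix game."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    # The column player's payoff is the negated transpose of the row player's.
    col_payoff = [[-payoff[r][c] for r in range(n_rows)] for c in range(n_cols)]
    return (best_response_value(payoff, col_strategy)
            + best_response_value(col_payoff, row_strategy))


uniform = [1 / 3, 1 / 3, 1 / 3]
too_much_rock = [0.5, 0.25, 0.25]

print(exploitability(RPS, uniform, uniform))        # equilibrium pair
print(exploitability(RPS, too_much_rock, uniform))  # row player is exploitable
```

The first number should be zero (up to rounding) because uniform play is the equilibrium. The second is positive, and it shrinks towards zero as the skewed strategy moves back towards uniform.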

Author:  MrNice [ Mon Nov 18, 2013 11:44 am ]
Post subject:  Re: Calculate Exploitability and convergence

Meaning that if it does, the implementation is converging, right?

Author:  flopnflush [ Mon Nov 18, 2013 1:59 pm ]
Post subject:  Re: Calculate Exploitability and convergence

yes!

Author:  MrNice [ Mon Nov 18, 2013 3:10 pm ]
Post subject:  Re: Calculate Exploitability and convergence

oki thanks :D

Any idea about exploitability? Is it linked to convergence? And how should I calculate it?

Author:  cantina [ Mon Nov 18, 2013 4:56 pm ]
Post subject:  Re: Calculate Exploitability and convergence

I think exploitability is the sum of the two best-response values. I'd be curious to see a heuristic that estimates this faster than a full best-response calculation.
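
Written out for a two-player zero-sum game, that definition looks roughly like this. The symbols are my own notation, not taken from any particular paper: u_i is player i's expected payoff and b_i is the value of player i's best response.

```latex
b_1(\sigma_2) = \max_{\sigma_1'} u_1(\sigma_1', \sigma_2), \qquad
b_2(\sigma_1) = \max_{\sigma_2'} u_2(\sigma_1, \sigma_2'), \qquad
\varepsilon(\sigma_1, \sigma_2) = b_1(\sigma_2) + b_2(\sigma_1) \ge 0
```

The sum is zero exactly when the pair is an equilibrium. Some authors report half of this sum (the per-player average) instead, so it is worth checking which convention a given number uses.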

Author:  fraction [ Mon Dec 02, 2013 3:23 pm ]
Post subject:  Re: Calculate Exploitability and convergence

I'm having real trouble getting my head around how to calculate a best response to my CFRM-generated strategy. I've read the accelerated BR paper and I can almost see how it might be done with PCS, but I'm using plain old CS.

My (probably bad) current understanding is: one player uses the CFRM strategy to make decisions, the other player uses the best response strategy to make decisions. Exploitability is the profit of the best response. So far so good, I think. Where I fall down is how to calculate the best response. I know the CFRM strategy should be available to the best response, but in my mind I can't do it without turning the CFRM player's hand face-up. (or is that the idea here :? )

Can anyone point me in the right direction or explain it in, er, non-greek terms?

Author:  flopnflush [ Mon Dec 02, 2013 5:31 pm ]
Post subject:  Re: Calculate Exploitability and convergence

What kind of bucketing method do you use? And do you want to find the best response within your full abstraction, or the best response within your betting abstraction but with unabstracted cards?

Author:  fraction [ Mon Dec 02, 2013 5:45 pm ]
Post subject:  Re: Calculate Exploitability and convergence

flopnflush wrote:
What kind of bucketing method do you use? And do you want to find the best response within your full abstraction, or the best response within your betting abstraction but with unabstracted cards?

Cheers for the response.

My bucketing is really simple at the moment. It's just EHS buckets based on PokerStove-like rollouts vs. random hands, and my betting is unabstracted (it's limit). I'd be happy to find its best response within its own abstraction, just to check whether it's converging, but I'd like to be able to check its unabstracted best response if possible.

Author:  flopnflush [ Mon Dec 02, 2013 6:43 pm ]
Post subject:  Re: Calculate Exploitability and convergence

You can look at amax's code to get an idea:
http://www.poker-ai.org/archive/www.pok ... 335#p40335

If you use perfect-recall buckets, I would recommend starting by writing a recursive best response function. You can use precalculated bucket-vs-bucket EV lookup tables to speed it up. The unabstracted best response can also be calculated recursively, but that might be very slow. Implementing a best response within an imperfect-recall abstraction is tricky and I haven't done this yet.

Btw, the sampling method of your CFRM algorithm doesn't matter. We don't use sampling when we calculate the best response; it is computed over the full tree. At least I haven't seen anyone sample it, but it could be possible.
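
Here is a minimal sketch of such a recursive best response, on Kuhn poker (three cards, one bet) instead of FLHU so that it fits in a post. It is not amax's code; every name in it is made up for illustration, and the raw cards stand in for buckets. The point it tries to show is that the best responder never sees the opponent's card: it carries a vector holding, for each card the opponent could have, the probability that the opponent holds it and has played to this point. There is no sampling anywhere, every branch is visited.

```python
from typing import Dict, List, Tuple

CARDS = (0, 1, 2)  # Jack, Queen, King
TERMINALS = {"pp", "bp", "bb", "pbp", "pbb"}  # 'p' = check/fold, 'b' = bet/call
Profile = Dict[Tuple[int, str], float]  # (card, history) -> probability of 'b'


def payoff_to_player0(history: str, card0: int, card1: int) -> float:
    if history == "bp":   # player 1 folds to a bet
        return 1.0
    if history == "pbp":  # player 0 folds to a bet
        return -1.0
    stake = 1.0 if history == "pp" else 2.0  # showdown, with or without a called bet
    return stake if card0 > card1 else -stake


def br_value(history: str, my_card: int, opp_reach: List[float],
             br_player: int, profile: Profile) -> float:
    """Best-response value at (my_card, history), weighted by opponent reach."""
    if history in TERMINALS:
        total = 0.0
        for opp_card in CARDS:
            cards = (my_card, opp_card) if br_player == 0 else (opp_card, my_card)
            u0 = payoff_to_player0(history, *cards)
            total += opp_reach[opp_card] * (u0 if br_player == 0 else -u0)
        return total
    if len(history) % 2 == br_player:
        # Our decision: pick the action that is best against the whole range.
        return max(br_value(history + a, my_card, opp_reach, br_player, profile)
                   for a in "pb")
    # Opponent's decision: split every card's reach by the fixed strategy.
    total = 0.0
    for a in "pb":
        reach = [r * (profile[(c, history)] if a == "b" else 1.0 - profile[(c, history)])
                 for c, r in zip(CARDS, opp_reach)]
        total += br_value(history + a, my_card, reach, br_player, profile)
    return total


def best_response_value(br_player: int, profile: Profile) -> float:
    total = 0.0
    for my_card in CARDS:
        # The opponent holds each of the two remaining cards with probability 1/2.
        opp_reach = [0.0 if c == my_card else 0.5 for c in CARDS]
        total += br_value("", my_card, opp_reach, br_player, profile) / 3.0
    return total


def exploitability(profile: Profile) -> float:
    return best_response_value(0, profile) + best_response_value(1, profile)


always_pass: Profile = {(c, h): 0.0 for c in CARDS for h in ("", "p", "b", "pb")}
# One known equilibrium of Kuhn poker: player 0 never bets first.
equilibrium: Profile = dict(always_pass)
equilibrium.update({(0, "p"): 1 / 3, (2, "p"): 1.0,
                    (1, "b"): 1 / 3, (2, "b"): 1.0,
                    (1, "pb"): 1 / 3, (2, "pb"): 1.0})

print(exploitability(always_pass))
print(exploitability(equilibrium))
```

The always-pass profile should come out at about 2 (each side wins the ante by betting every hand), and the equilibrium at about 0. In a bucketed game the showdown loop is where a bucket-vs-bucket EV table would replace the direct card comparison.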

Author:  fraction [ Mon Dec 02, 2013 8:38 pm ]
Post subject:  Re: Calculate Exploitability and convergence

flopnflush wrote:
You can look at amax's code to get an idea:
http://www.poker-ai.org/archive/www.pok ... 335#p40335

If you use perfect-recall buckets, I would recommend starting by writing a recursive best response function. You can use precalculated bucket-vs-bucket EV lookup tables to speed it up. The unabstracted best response can also be calculated recursively, but that might be very slow. Implementing a best response within an imperfect-recall abstraction is tricky and I haven't done this yet.

Btw, the sampling method of your CFRM algorithm doesn't matter. We don't use sampling when we calculate the best response; it is computed over the full tree. At least I haven't seen anyone sample it, but it could be possible.


I've got very loose imperfect recall. I'll check the code out anyway.

I've got an idea about building lookup tables during the CFRM recursion to help with the best response calcs. Need to look into it more, cheers.

All times are UTC