Poker-AI.org

Poker AI and Botting Discussion Forum

All times are UTC




Posted: Fri May 01, 2015 5:11 pm
Junior Member

Joined: Mon Jan 19, 2015 4:58 pm
Posts: 15
Dear all,


AFAIK, CFR's goal is to find a strategy that has the highest EV against its worst-case (best-response, BR) opponent. I come from an investment background, where decisions are made based on risk-reward ratios. I am wondering whether there is a way to adjust CFR so that it trades part of the EV for a smaller variance (preferably with a specified tradeoff between std and EV).

First, I was thinking of replacing the outcome with some risk-adjusted version of it, say outcome/standard deviation. The problems are how to track the std and whether this makes sense at all.
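On tracking the std: a minimal sketch, assuming payoffs arrive one at a time (for example one sampled hand per iteration). It uses Welford's online update, so no payoff history has to be stored. The names `RunningStats` and `risk_adjusted` are made up for illustration and do not come from any CFR library, and the payoffs are invented numbers. Whether outcome/std is a sensible quantity to feed into CFR is left open here.

```python
import math


class RunningStats:
    """Online mean/variance via Welford's algorithm (hypothetical helper)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def push(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        # Population variance; 0.0 until at least one sample has been seen.
        return self.m2 / self.n if self.n > 0 else 0.0

    def std(self):
        return math.sqrt(self.variance())


def risk_adjusted(outcome, stats, eps=1e-9):
    """Outcome divided by the std seen so far (eps avoids division by zero)."""
    return outcome / (stats.std() + eps)


stats = RunningStats()
for payoff in [2.0, -1.0, 4.0, -1.0, 1.0]:  # invented sample payoffs
    stats.push(payoff)
```

Note that the ratio blows up when the std is close to 0 (an always-fold line, for instance), which is one reason a difference form such as mean minus a multiple of std may behave better than a ratio.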

Supposing I manage some quick-and-dirty fix for the previous point, I started thinking about how to test it. The first thing that came to mind is to play EQ (the equilibrium strategy) against risk-adjusted EQ, but then it's obvious that EQ will win and risk-adjusted will lose :) I could compare it to EQ vs EQ, but that somehow feels a bit off.

Once these two issues are solved, I plan to dedicate some of my time to this and am willing to share the results!


Cheers,
DIB

PS. I am assuming HUNL here.

_________________
Let's drop conventional languages and talk C++ finally.


 
Posted: Fri May 01, 2015 10:54 pm
Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
In investment, variance and return are correlated, but I'm pretty sure the idea doesn't extend well to poker. If you want a better EV than Nash, you have to exploit villain's weaknesses without getting exploited yourself. There are a few papers from U of A on this subject, the most recent being Data Biased Robust Counter Strategies. There are a few threads here, but http://poker-ai.org/archive/www.pokerai ... =79&t=4514 is probably quite a good place to start.


 
Posted: Sat May 02, 2015 7:15 am
Junior Member

Joined: Mon Jan 19, 2015 4:58 pm
Posts: 15
Hi spears,


I have looked into DBR, but with limited success so far, and IMHO DBR is just about EV.

You are right that in investment risk and reward are correlated, but I came to my conclusion about poker from the following argument: tight players experience smaller variance than loose players. Now say you have an EQ strategy and you make two equally good adjusted strategies, one loose and one tight. If you play those two against EQ, I would expect both to underperform, but with different variance levels.

Now that I think about it, I probably just want to make CFR generate a strategy with adjustable "tightness".


Cheers,
DIB



 
Posted: Wed Feb 10, 2016 9:24 am
Junior Member

Joined: Mon Jan 19, 2015 4:58 pm
Posts: 15
Hi again,

I am currently revisiting CFR in a different context where I suspect there is a mean-variance tradeoff, and my question is again: how can CFR be biased towards actions that make the whole payoff structure less variable, even if that results in a lower EV?

I am now experimenting again, in the HU poker context, with how to achieve this. The way I see it, there are two extremes:
1) The EQ strategy, which maximises utility irrespective of variance.
2) The minimum-variance strategy, which minimises variance irrespective of utility. An example would be for the first player to always fold, resulting in a fixed payoff and a variance of 0.
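To make the two extremes concrete, here is a toy sketch under heavy assumptions: a single decision against a fixed opponent, where folding always loses 1 and playing is a 50/50 gamble between +3 and -2. All payoffs are invented, and the objective "mean minus risk_aversion times variance" is just one possible way to express the tradeoff, not something CFR does out of the box. The function names are hypothetical.

```python
def mixture_moments(p_play, fold_payoff, play_outcomes):
    """Mean and variance of the payoff when playing with probability p_play.

    play_outcomes is a list of (probability, payoff) pairs for the 'play' branch.
    """
    mean_play = sum(q * x for q, x in play_outcomes)
    sq_play = sum(q * x * x for q, x in play_outcomes)
    mean = (1 - p_play) * fold_payoff + p_play * mean_play
    second = (1 - p_play) * fold_payoff ** 2 + p_play * sq_play
    return mean, second - mean ** 2


def best_play_probability(risk_aversion, steps=100):
    """Grid search for the p that maximises mean - risk_aversion * variance."""
    fold_payoff = -1.0                         # invented: lose the blind
    play_outcomes = [(0.5, 3.0), (0.5, -2.0)]  # invented payoff distribution
    best_p, best_score = 0.0, float("-inf")
    for i in range(steps + 1):
        p = i / steps
        mean, var = mixture_moments(p, fold_payoff, play_outcomes)
        score = mean - risk_aversion * var
        if score > best_score:
            best_p, best_score = p, score
    return best_p
```

With `risk_aversion = 0` the search picks always playing (extreme 1), and with a large value such as 10 it picks always folding (extreme 2). In this particular toy the optimum jumps between the two pure choices once risk_aversion is large enough; a real game tree would not necessarily behave that way.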

The first one we can already achieve with regular CFR, but how to do the second? I suspect that I should meddle with either the payoffs at the bottom of the tree or the regret updates, but I am slightly puzzled about how to compute variance from a single traversal...
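On variance from a single traversal: one option, assuming both players' strategies are held fixed during the pass, is to return the first and second moments of the payoff from every node instead of only the expected value. Both moments average linearly over children, and the variance follows as E[X^2] - E[X]^2 at any node. The tree encoding below (a number for a terminal payoff, a list of (probability, child) pairs otherwise) is a made-up stand-in for a real game tree, and the numbers are invented.

```python
def payoff_moments(node):
    """Return (E[X], E[X^2]) of the payoff X below this node.

    A node is either a number (terminal payoff) or a list of
    (probability, child) pairs whose probabilities sum to 1. The
    probabilities combine chance and both players' fixed strategies.
    """
    if isinstance(node, (int, float)):
        return float(node), float(node) ** 2
    first = second = 0.0
    for prob, child in node:
        m1, m2 = payoff_moments(child)
        first += prob * m1
        second += prob * m2
    return first, second


def payoff_variance(node):
    m1, m2 = payoff_moments(node)
    return m2 - m1 ** 2


# Invented toy tree: fold for -1 half the time, otherwise a coin-flip showdown.
tree = [(0.5, -1.0), (0.5, [(0.5, 4.0), (0.5, -4.0)])]
mean, second = payoff_moments(tree)
variance = payoff_variance(tree)
```

How to feed such a variance back into the regret updates is a separate design question that this sketch does not answer.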

What are your thoughts on this?



 
Posted: Wed Feb 10, 2016 6:57 pm
Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
You can calculate an EV variance due to the different future board cards, in addition to an EV, during the CFR calculation. Drawing hands have greater variance than made hands, so I guess you could penalise high-variance EVs. This doesn't address the other big source of variance though: uncertainty about villain's hand. Since you know villain's current strategy, I guess you could figure out what he is holding (in a statistical sense) and hence deduce a variance from that. It feels messy, and I think the calculation is quite expensive.
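A rough sketch of how the two sources could be separated, using the law of total variance: Var(X) = Var(E[X | board]) + E[Var(X | board)]. The first term is the board-card part, the second the villain-hand part. It assumes that for each runout you already have villain's range weights (for example from his current strategy) and hero's payoff against each hand. The function names, the input layout, and all numbers are invented for illustration, and the penalty form (EV minus a multiple of the std) is only one possible choice.

```python
import math


def weighted_mean_var(pairs):
    """Mean and variance of values given (weight, value) pairs; weights are normalised."""
    total = sum(w for w, _ in pairs)
    mean = sum(w * x for w, x in pairs) / total
    var = sum(w * (x - mean) ** 2 for w, x in pairs) / total
    return mean, var


def variance_by_source(runouts):
    """Split payoff variance into board-card and villain-hand parts.

    runouts is a list of (board_prob, villain_pairs), where villain_pairs holds
    (weight, payoff) for each hand in villain's range on that board.
    """
    per_board = [(p, weighted_mean_var(pairs)) for p, pairs in runouts]
    # Variance of the per-board EVs: the part caused by future board cards.
    ev, board_var = weighted_mean_var([(p, m) for p, (m, _) in per_board])
    # Average within-board variance: the part caused by villain's unknown hand.
    total_p = sum(p for p, _ in per_board)
    hand_var = sum(p * v for p, (_, v) in per_board) / total_p
    return ev, board_var, hand_var


def penalised_ev(ev, variance, penalty):
    return ev - penalty * math.sqrt(variance)


# Invented numbers: a draw that hits on one of two equally likely runouts.
runouts = [
    (0.5, [(1.0, 10.0), (1.0, 6.0)]),   # draw hits
    (0.5, [(1.0, -4.0), (1.0, -4.0)]),  # draw misses
]
ev, board_var, hand_var = variance_by_source(runouts)
```

In this made-up example almost all of the variance comes from the board cards, which matches the remark that drawing hands are the high-variance ones. The cost concern stands: the inner loop runs over villain's whole range for every runout.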


 
Posted: Sat Feb 13, 2016 10:49 pm
New Member

Joined: Tue Sep 29, 2015 5:56 pm
Posts: 9
Perhaps you can use public chance sampling to measure variance at terminal nodes?


 