Poker-AI.org • View topic - Efficient MCCFR in Games with Many Player Actions (2012)

View unanswered posts | View active topics

Board index » Public Forums » Poker Papers

All times are UTC

Efficient MCCFR in Games with Many Player Actions (2012)

Page 1 of 1

[ 8 posts ]

Print view

Previous topic | Next topic

Author

Message

Joshua

Post subject: Efficient MCCFR in Games with Many Player Actions (2012)

Posted: Fri Mar 15, 2013 12:57 pm

New Member

Joined: Sun Mar 10, 2013 1:59 pm
Posts: 5

PROPERTIES

type = "article"
title = "Efficient Monte Carlo Counterfactual Regret Minimization in Games with Many Player Actions"
author = "GIBSON, Richard and BURCH, Neil and LANCTOT, Marc and SZAFRON, Duane"
year = "2012"
organization = "University of Alberta - Department of Computing Science"
url = "http://poker.cs.ualberta.ca/publications/NIPS12.pdf"
language = "en"
size = "9"

SUBJECT

abstract = "Counterfactual Regret Minimization (CFR) is a popular, iterative algorithm for computing strategies in extensive-form games. The Monte Carlo CFR (MCCFR) variants reduce the per iteration time cost of CFR by traversing a smaller, sampled portion of the tree. The previous most effective instances of MCCFR can still be very slow in games with many player actions since they sample every action for a given player. In this paper, we present a new MCCFR algorithm, Average Strategy Sampling (AS), that samples a subset of the player’s actions according to the player’s average strategy. Our new algorithm is inspired by a new, tighter bound on the number of iterations required by CFR to converge to a given solution quality. In addition, we prove a similar, tighter bound for AS and other popular MCCFR variants. Finally, we validate our work by demonstrating that AS converges faster than previous MCCFR algorithms in both no-limit poker and Bluff."

Top

Nose

Post subject: Re: Efficient MCCFR in Games with Many Player Actions (2012)

Posted: Sat Jun 08, 2013 1:53 pm

Regular Member

Joined: Sat May 25, 2013 7:36 am
Posts: 73

Question about the given implementation (p5).

Line 9: Shouldn't q be multiplied with the frequency of performing action a given infoset I?

(ie.

Code:

s(I,a) <- s(I,a) + (SIGMA(I,a) * q

instead of

Code:

s(I,a) <- s(I,a) + (SIGMA(I,a) / q

)

referring to cumulative profile-formula (attached as picture)

btw @ admins: a (La)TeX-Plugin ([TEX]-BB-Code respectively) would be nice to create this kind of formula

also when looking at line 5 of the algorithm - does anyone know the rationale of dividing the utility by the probability of reaching terminal node z instead of multiplying with it?

say, why

Code:

if(h \in Z) then return u_i(h)/q end if

EDIT: Attached algorithm for your convenience

Attachments:

File comment: MCCFR (AS)

as-mccfr.jpg [ 64.14 KiB | Viewed 28804 times ]

File comment: Cumulative profile

cum_regret.jpg [ 5 KiB | Viewed 28816 times ]

Top

cantina

Post subject: Re: Efficient MCCFR in Games with Many Player Actions (2012)

Posted: Sun Jun 09, 2013 12:58 am

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

You can multiply your regret and cumulative updates by Q, or divide just the regret update and the leaf node utility.

Top

Nose

Post subject: Re: Efficient MCCFR in Games with Many Player Actions (2012)

Posted: Mon Jun 10, 2013 4:50 am

Regular Member

Joined: Sat May 25, 2013 7:36 am
Posts: 73

Thanks

I have another question: Assume the following scenario:

Let Player i be the SB. Right after the hole cards where dealt ASSampling decides the sample two cases: SB checks and SB min-bets. Assume in both cases ASSampling decides to let BB check behind/ call. In each case a chance node (dealing flop cards) follows.

According to the specification different community cards are dealt in each scenario whereas the outcome of all branches are summed up in order to obtain the regret for each action.

Does anyone know about the impact to the convergence rate if you ensure that each chance node returns the same community cards each iteration?

ASSampling will then be a little closer to CS than to pure, random MC

Top

Isildur11

Post subject: Re: Efficient MCCFR in Games with Many Player Actions (2012)

Posted: Sun Jun 30, 2013 9:22 pm

Junior Member

Joined: Mon Jun 03, 2013 9:06 pm
Posts: 20

"for games with mores than 2 players, a second tree walk is required and we omit these details",

it is not clear what that means ? :?:

Top

cantina

Post subject: Re: Efficient MCCFR in Games with Many Player Actions (2012)

Posted: Sun Jun 30, 2013 9:38 pm

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

Nose wrote:

Does anyone know about the impact to the convergence rate if you ensure that each chance node returns the same community cards each iteration?

Unless I'm misunderstanding your question, all action nodes are supposed to use the same community cards (relative to the round) when doing an iteration of CFRM.

Top

cantina

Post subject: Re: Efficient MCCFR in Games with Many Player Actions (2012)

Posted: Sun Jun 30, 2013 9:48 pm

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

Isildur11 wrote:

"for games with mores than 2 players, a second tree walk is required and we omit these details",

it is not clear what that means ? :?:

Can you show more of the quote? It may mean they need to train players separately.

Top

Isildur11

Post subject: Re: Efficient MCCFR in Games with Many Player Actions (2012)

Posted: Sun Jun 30, 2013 10:02 pm

Junior Member

Joined: Mon Jun 03, 2013 9:06 pm
Posts: 20

Quote:

At opponent nodes, we also update the cumulative proﬁle (line
9) for reasons that we describe in a previous paper [2, Algorithm 1]. For games with more than two
players, a second tree walk is required and we omit these details.

and

Quote:
For future work, we would like to apply AS to games with
many player actions and with more than two players

weird

Top

Page 1 of 1

[ 8 posts ]

Board index » Public Forums » Poker Papers

All times are UTC

Who is online

Users browsing this forum: No registered users and 1 guest

You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum