Poker-AI.org Poker AI and Botting Discussion Forum 2013-06-30T22:02:09+00:00 http://poker-ai.org/phpbb/feed.php?f=25&t=2398 2013-06-30T22:02:09+00:00 2013-06-30T22:02:09+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2398&p=4371#p4371 <![CDATA[Re: Efficient MCCFR in Games with Many Player Actions (2012)]]> Quote:

At opponent nodes, we also update the cumulative profile (line
9) for reasons that we describe in a previous paper [2, Algorithm 1]. For games with more than two
players, a second tree walk is required and we omit these details.



and

Quote:

For future work, we would like to apply AS to games with
many player actions and with more than two players


weird

Statistics: Posted by Isildur11 — Sun Jun 30, 2013 10:02 pm


]]>
2013-06-30T21:48:03+00:00 2013-06-30T21:48:03+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2398&p=4370#p4370 <![CDATA[Re: Efficient MCCFR in Games with Many Player Actions (2012)]]> Isildur11 wrote:

"for games with mores than 2 players, a second tree walk is required and we omit these details",

it is not clear what that means ? :?:

Can you show more of the quote? It may mean they need to train players separately.

Statistics: Posted by cantina — Sun Jun 30, 2013 9:48 pm


]]>
2013-06-30T21:38:07+00:00 2013-06-30T21:38:07+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2398&p=4369#p4369 <![CDATA[Re: Efficient MCCFR in Games with Many Player Actions (2012)]]> Nose wrote:

Does anyone know about the impact to the convergence rate if you ensure that each chance node returns the same community cards each iteration?

Unless I'm misunderstanding your question, all action nodes are supposed to use the same community cards (relative to the round) when doing an iteration of CFRM.

Statistics: Posted by cantina — Sun Jun 30, 2013 9:38 pm


]]>
2013-06-30T21:22:31+00:00 2013-06-30T21:22:31+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2398&p=4366#p4366 <![CDATA[Re: Efficient MCCFR in Games with Many Player Actions (2012)]]>
it is not clear what that means ? :?:

Statistics: Posted by Isildur11 — Sun Jun 30, 2013 9:22 pm


]]>
2013-06-10T04:50:00+00:00 2013-06-10T04:50:00+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2398&p=4310#p4310 <![CDATA[Re: Efficient MCCFR in Games with Many Player Actions (2012)]]>
I have another question: Assume the following scenario:

Let Player i be the SB. Right after the hole cards where dealt ASSampling decides the sample two cases: SB checks and SB min-bets. Assume in both cases ASSampling decides to let BB check behind/ call. In each case a chance node (dealing flop cards) follows.

According to the specification different community cards are dealt in each scenario whereas the outcome of all branches are summed up in order to obtain the regret for each action.

Does anyone know about the impact to the convergence rate if you ensure that each chance node returns the same community cards each iteration?

ASSampling will then be a little closer to CS than to pure, random MC

Statistics: Posted by Nose — Mon Jun 10, 2013 4:50 am


]]>
2013-06-09T00:58:46+00:00 2013-06-09T00:58:46+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2398&p=4309#p4309 <![CDATA[Re: Efficient MCCFR in Games with Many Player Actions (2012)]]> Statistics: Posted by cantina — Sun Jun 09, 2013 12:58 am


]]>
2013-06-08T13:53:40+00:00 2013-06-08T13:53:40+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2398&p=4308#p4308 <![CDATA[Re: Efficient MCCFR in Games with Many Player Actions (2012)]]>
Line 9: Shouldn't q be multiplied with the frequency of performing action a given infoset I?

(ie.
Code:
s(I,a) <- s(I,a) + (SIGMA(I,a) * q


instead of

Code:
s(I,a) <- s(I,a) + (SIGMA(I,a) / q

)

referring to cumulative profile-formula (attached as picture)

btw @ admins: a (La)TeX-Plugin ([TEX]-BB-Code respectively) would be nice to create this kind of formula

also when looking at line 5 of the algorithm - does anyone know the rationale of dividing the utility by the probability of reaching terminal node z instead of multiplying with it?

say, why
Code:
if(h \in Z) then return u_i(h)/q end if


EDIT: Attached algorithm for your convenience

Statistics: Posted by Nose — Sat Jun 08, 2013 1:53 pm


]]>
2013-03-15T12:57:40+00:00 2013-03-15T12:57:40+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2398&p=3228#p3228 <![CDATA[Efficient MCCFR in Games with Many Player Actions (2012)]]> PROPERTIES

type = "article"
title = "Efficient Monte Carlo Counterfactual Regret Minimization in Games with Many Player Actions"
author = "GIBSON, Richard and BURCH, Neil and LANCTOT, Marc and SZAFRON, Duane"
year = "2012"
organization = "University of Alberta - Department of Computing Science"
url = "http://poker.cs.ualberta.ca/publications/NIPS12.pdf"
language = "en"
size = "9"


SUBJECT

abstract = "Counterfactual Regret Minimization (CFR) is a popular, iterative algorithm for computing strategies in extensive-form games. The Monte Carlo CFR (MCCFR) variants reduce the per iteration time cost of CFR by traversing a smaller, sampled portion of the tree. The previous most effective instances of MCCFR can still be very slow in games with many player actions since they sample every action for a given player. In this paper, we present a new MCCFR algorithm, Average Strategy Sampling (AS), that samples a subset of the player’s actions according to the player’s average strategy. Our new algorithm is inspired by a new, tighter bound on the number of iterations required by CFR to converge to a given solution quality. In addition, we prove a similar, tighter bound for AS and other popular MCCFR variants. Finally, we validate our work by demonstrating that AS converges faster than previous MCCFR algorithms in both no-limit poker and Bluff."

Statistics: Posted by Joshua — Fri Mar 15, 2013 12:57 pm


]]>