Quote:
To create an aggressive agent, we used the Counterfactual Regret Minimization technique from
Chapter 3, but with one change. On every terminal node of the game tree, we gave a 7% bonus in
utility to the winning player, while the losing player did not pay any extra cost. This means that the
game is no longer zero sum, as more money is being introduced on each hand. The important benefit
of this approach is that the agent always thinks it has better odds than it actually does — it is more
willing to fight for marginal pots. For every $1 it invests in a bet, it thinks it will receive $1.07 in
return even if the opponent immediately folds, and $2.14 if the opponent calls the bet and the agent
wins the showdown. This extra return on investment encourages it to bet in more situations where it
might otherwise call or fold. However, since it learns its strategy by playing against an agent trying
to exploit it, it learns to express this extra aggression in a balanced way that still effectively hides
information.
source: http://poker.cs.ualberta.ca/publications/johanson.msc.pdf, pg.82
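A minimal sketch of that terminal-node adjustment, in Python. It is not code from the thesis: the function name `terminal_utilities` and its parameters are made up for illustration, and it assumes the adjustment is applied to the whole pot the winner collects, which is the reading that matches the $1.07 and $2.14 figures in the quote. A positive rate gives the aggression bonus; a negative rate takes money out of the pot instead.

```python
def terminal_utilities(winner_invested, loser_invested, pot_adjustment):
    """Return (winner_utility, loser_utility) at a terminal node.

    Hypothetical helper, not from the thesis.
    Assumption: the adjustment scales the whole pot the winner collects.
      pot_adjustment = +0.07 -> 7% bonus paid to the winner
      pot_adjustment = -0.05 -> 5% removed from the pot (e.g. rake)
    The loser's utility is unchanged: it only loses what it invested.
    """
    pot = winner_invested + loser_invested
    collected = pot * (1.0 + pot_adjustment)
    winner_utility = collected - winner_invested
    loser_utility = -loser_invested
    return winner_utility, loser_utility


if __name__ == "__main__":
    # Agent bets $1, opponent folds: agent collects about $1.07.
    print(terminal_utilities(1.0, 0.0, 0.07))
    # Agent bets $1, opponent calls and loses: agent collects about $2.14.
    print(terminal_utilities(1.0, 1.0, 0.07))
    # Same showdown with a 5% deduction: utilities no longer sum to zero.
    print(terminal_utilities(1.0, 1.0, -0.05))
```

With any non-zero rate the two utilities no longer sum to zero, so the CFR run is solving a non-zero-sum game in both cases. Real rake rules such as caps or no rake on hands that end preflop are not modeled here.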
So to model rake, you would do the opposite: instead of adding a 7% bonus to each pot, you would subtract x% from it for rake.
Statistics: Posted by Magnum on Fri Mar 22, 2013 11:24 pm
]]>