Poker-AI.org :: View topic - A Pattern Learning Strategy Using Convolutional Networks


Author:	Orac [ Mon Jun 20, 2016 1:37 am ]
Post subject:	A Pattern Learning Strategy Using Convolutional Networks
Poker-CNN: A Pattern Learning Strategy for Making Draws and Bets in Poker Games Using Convolutional Networks
by: Nikolai Yakovenko, Liangliang Cao, Colin Raffel, James Fan

Abstract: Poker is a family of card games that includes many variations. We hypothesize that most poker games can be solved as a pattern matching problem, and propose creating a strong poker playing system based on a unified poker representation. Our poker player learns through iterative self-play, and improves its understanding of the game by training on the results of its previous actions without sophisticated domain knowledge. We evaluate our system on three poker games: single player video poker, two-player Limit Texas Hold’em, and finally two-player 2-7 triple draw poker. We show that our model can quickly learn patterns in these very different poker games while it improves from zero knowledge to a competitive player against human experts. The contributions of this paper include: (1) a novel representation for poker games, extendable to different poker variations, (2) a Convolutional Neural Network (CNN) based learning model that can effectively learn the patterns in three different games, and (3) a self-trained system that significantly beats the heuristic-based program on which it is trained, and our system is competitive against human expert players.

http://colinraffel.com/publications/aaai2016poker.pdf

I hope this paper is of interest.

Author:	spears [ Sat Jun 25, 2016 8:51 pm ]
Post subject:	Re: A Pattern Learning Strategy Using Convolutional Networks
See also http://www.pokernews.com/strategy/poker ... -24246.htm

It's good to see something new that is not yet another version of CFR, and also that huge models are not required.

I don't understand how the training process leads to a mixed strategy. Does it move the action frequency a little in the direction of greatest reward in each game, using this Nesterov momentum thingy?

Author:	AlephZero [ Thu Aug 25, 2016 4:27 pm ]
Post subject:	Re: A Pattern Learning Strategy Using Convolutional Networks
spears wrote:
See also http://www.pokernews.com/strategy/poker ... -24246.htm It's good to see something new that is not yet another version of CFR, and also that huge models are not required. I don't understand how the training process leads to a mixed strategy. Does it move the action frequency a little in the direction of greatest reward in each game, using this Nesterov momentum thingy?

Where have you read that this method leads to a mixed strategy? The goal of this neural network is to find the best move; after that, the most reasonable thing is to take it 100% of the time, I believe.
Nesterov momentum is simply a variant of gradient descent, technical stuff to train the network.

CFR leads to an equilibrium strategy, that is, a mixed strategy: a probability distribution over a support of strategies that all have the same utility. This exploitative method leads to a pure best-response strategy. Trained against an equilibrium agent, it should lead to a pure strategy in the support of the equilibrium response, getting the same utility as the equilibrium response but still being a pure strategy.
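To illustrate the point that Nesterov momentum is only an optimizer detail and says nothing by itself about whether the learned strategy is pure or mixed, here is a minimal sketch of the standard Nesterov update applied to a toy one-dimensional loss. The names `nesterov_step` and `grad`, the quadratic loss, and the learning-rate and momentum values are assumptions chosen for illustration; none of them come from the Poker-CNN paper.

```python
def nesterov_step(w, v, grad_fn, lr=0.1, mu=0.9):
    """One Nesterov momentum update for a scalar parameter w with velocity v."""
    lookahead = w + mu * v      # peek ahead along the current velocity
    g = grad_fn(lookahead)      # gradient evaluated at the look-ahead point
    v_new = mu * v - lr * g     # decay the old velocity, add the new gradient step
    return w + v_new, v_new


def grad(w):
    """Gradient of the toy loss (w - 3)^2 (hypothetical, for illustration only)."""
    return 2.0 * (w - 3.0)


w, v = 0.0, 0.0
for _ in range(200):
    w, v = nesterov_step(w, v, grad)

print(w)  # approaches the minimizer of the toy loss, w = 3
```

The update only decides how the network weights move toward lower loss. Whether the resulting player mixes its actions depends on what the network outputs and how an action is picked from that output.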

Author:	SkyBot [ Thu Aug 25, 2016 7:24 pm ]
Post subject:	Re: A Pattern Learning Strategy Using Convolutional Networks
Thanks a lot for that link. At the moment I am playing with deep Q-learning (see Google's DeepMind: DQN, double DQN, dueling DQN, ...). However, while all the DeepMind approaches use convolutions, I have not used them until now; I thought they would not help much for poker. But I think it is time to try this too...

Author:	spears [ Thu Aug 25, 2016 7:48 pm ]
Post subject:	Re: A Pattern Learning Strategy Using Convolutional Networks
AlephZero wrote:
Where have you read that this method leads to a mixed strategy? The goal of this neural network is to find the best move; after that, the most reasonable thing is to take it 100% of the time, I believe. Nesterov momentum is simply a variant of gradient descent, technical stuff to train the network. CFR leads to an equilibrium strategy, that is, a mixed strategy: a probability distribution over a support of strategies that all have the same utility. This exploitative method leads to a pure best-response strategy. Trained against an equilibrium agent, it should lead to a pure strategy in the support of the equilibrium response, getting the same utility as the equilibrium response but still being a pure strategy.

I was assuming and hoping it produces a mixed strategy. I believe a pure best response strategy would be highly exploitable.
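The worry about pure strategies being exploitable can be made concrete with a toy game. The sketch below uses rock-paper-scissors as a stand-in for poker (an assumption made purely for illustration; the thread itself is about poker) and measures how much a best-responding opponent wins against a given strategy. The names `PAYOFF` and `exploitability` are hypothetical.

```python
# Payoff to us (row player) in rock-paper-scissors.
# Rows are our action, columns the opponent's action, both ordered rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def exploitability(strategy):
    """Expected winnings of an opponent who best-responds to `strategy`.

    `strategy` is our probability distribution over rock, paper, scissors.
    The game is zero-sum, so the opponent's payoff is the negative of ours.
    """
    n = len(PAYOFF)
    opponent_payoffs = [
        -sum(strategy[i] * PAYOFF[i][j] for i in range(n))
        for j in range(n)
    ]
    return max(opponent_payoffs)


pure_rock = [1, 0, 0]
uniform = [1 / 3, 1 / 3, 1 / 3]

print(exploitability(pure_rock))  # opponent always plays paper and wins every game
print(exploitability(uniform))    # no opponent action gains anything
```

Always playing rock gives up a full unit per game to an opponent who notices, while the uniform mixed strategy gives up nothing. A pure best response is only safe as long as the opponent does not adapt.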

Poker-AI.org http://poker-ai.org/phpbb/

A Pattern Learning Strategy Using Convolutional Networks http://poker-ai.org/phpbb/viewtopic.php?f=25&t=2982	Page 1 of 1

Author:	SkyBot [ Fri Aug 26, 2016 8:18 pm ]
Post subject:	Re: A Pattern Learning Strategy Using Convolutional Networks
spears wrote:
I was assuming and hoping it produces a mixed strategy. I believe a pure best response strategy would be highly exploitable.

Not sure, but I think you can just train an additional net as an explicit policy net. So you have one net that tells you the value of states, and one that tells you the percentages with which to take each action.

old: I would probably still use the action that leads to the maximum state value by default, and only use the policy if the state values are near each other (during usage; in training/self-play you always use the policy, except for epsilon-greedy exploration stuff)...

edit: stupid me: you make the enemy indifferent to calling/folding, not yourself, doh

new: just trust your nets

But I am not sure whether training the policy net would require you to evaluate all possible actions (more expensive to train, but no problem for HU-Limit; more of a problem for my case: 6-max-NL...). But we can use the state value estimation network for that.
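A minimal sketch of the two ways of picking an action discussed in this post, assuming the value net and the policy net have already produced their outputs for the current state. The function names, the action list, and the example numbers are all hypothetical; no actual network is trained or called here.

```python
import random

ACTIONS = ["fold", "call", "raise"]  # hypothetical action set for illustration


def greedy_action(action_values):
    """Pure strategy: always take the action with the highest estimated value."""
    return max(range(len(action_values)), key=lambda i: action_values[i])


def sample_policy_action(policy_probs, epsilon=0.0, rng=random):
    """Mixed strategy: sample an action from the policy net's probabilities.

    With probability `epsilon`, explore by picking a uniformly random action
    instead (a simple exploration rule for self-play; an assumption here).
    """
    n = len(policy_probs)
    if rng.random() < epsilon:
        return rng.randrange(n)
    return rng.choices(range(n), weights=policy_probs, k=1)[0]


# Made-up outputs of the two nets for one state.
values = [0.0, 0.42, 0.40]   # value net: estimated value after each action
policy = [0.1, 0.6, 0.3]     # policy net: probability of taking each action

rng = random.Random(0)
counts = [0, 0, 0]
for _ in range(10000):
    counts[sample_policy_action(policy, rng=rng)] += 1

print(ACTIONS[greedy_action(values)])   # the greedy choice is always the same action
print([c / 10000 for c in counts])      # sampled frequencies roughly follow `policy`
```

The greedy rule turns the value estimates into a pure strategy, while sampling from the policy output plays each action at the frequency the policy net suggests, which is what "just trust your nets" amounts to in this sketch.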

Author:	SkyBot [ Thu Sep 08, 2016 2:50 am ]
Post subject:	Re: A Pattern Learning Strategy Using Convolutional Networks
For details on how to connect the 2 nets to get a mixed strategy profile, read this: http://arxiv.org/abs/1603.01121

Page 1 of 1	All times are UTC