Poker-AI.org :: View topic - Neural net based approach inspired by CFR

Author:	listerofsmeg [ Fri Jan 05, 2018 2:37 pm ]
Post subject:	Neural net based approach inspired by CFR
Do you think the following approach might work for 6-max no-limit? Basically, have a neural net that takes the information visible to the current player as input and outputs action probabilities. The training would work like this: play a batch of, say, 1000 games and calculate regrets and regret-matched strategies for each information set encountered, similarly to outcome-sampling CFR. Then take these information sets and their strategies and use them to update the neural network, so that it plays closer to the computed strategies. Thoughts?
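For reference, the "regret-matched strategy" for one information set can be sketched like this. It is only a minimal illustration of standard regret matching, not the poster's code; the function name and the example regrets are made up.

```python
def regret_matching(cumulative_regrets):
    """Turn the cumulative regrets of one information set into a strategy."""
    # Only actions with positive regret get any probability.
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    # No action has positive regret: fall back to a uniform strategy.
    n = len(cumulative_regrets)
    return [1.0 / n] * n


# Hypothetical cumulative regrets for (fold, call, raise).
regrets = [-2.0, 1.0, 3.0]
strategy = regret_matching(regrets)
print(strategy)  # [0.0, 0.25, 0.75]
```

These per-information-set strategies would then be the training targets for the network.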

Author:	spears [ Sat Jan 06, 2018 9:44 am ]
Post subject:	Re: Neural net based approach inspired by CFR
I think it would be hard. The inputs to the NN would have to encode the current hand, current and past boards, and all previous actions (of the current hand) of all players. What would the encoding be? If there are too many inputs there would be no generalisation so learning would be very slow. If there are too few inputs there would be error. I think the NN would find it difficult to extract the important features of unencoded inputs in any reasonable timescale.

Author:	listerofsmeg [ Sat Jan 06, 2018 3:25 pm ]
Post subject:	Re: Neural net based approach inspired by CFR
My thinking was to encode hand type (e.g. middle pair + flush draw), hole cards, our stack size, opponent stack sizes summary, pot size and very summarized betting history. For example, on the flop, preflop betting can be summarized as "3bet, we were the aggressor". The pot and stacks are needed because they can't be inferred from the betting history, because it's summarized. Anyway, I'm probably gonna go ahead with the approach of making a formula- or rule-based decision for every hand in my range. First form my raise range, then form a call range based on betsize, and finally the bluff range based on the number of value bets.
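A summarized encoding along these lines might look like the sketch below. Everything in it is an assumption for illustration: the category lists, the field names and the scaling by 100 big blinds are invented, and hole cards are left out to keep it short.

```python
from dataclasses import dataclass

# Hypothetical vocabularies; a real bot would need much finer categories.
HAND_TYPES = ["air", "draw", "weak_pair", "middle_pair", "top_pair", "two_pair_plus"]
PREFLOP_SUMMARIES = ["limped", "single_raised", "3bet", "4bet_plus"]


def one_hot(value, vocabulary):
    """Return a 0/1 list with a single 1 at the position of `value`."""
    return [1.0 if value == v else 0.0 for v in vocabulary]


@dataclass
class Observation:
    hand_type: str
    has_flush_draw: bool
    our_stack_bb: float
    avg_opp_stack_bb: float  # "opponent stack sizes summary"
    pot_bb: float
    preflop_summary: str
    we_were_aggressor: bool

    def to_features(self, scale_bb=100.0):
        """Flatten the observation into a fixed-length input vector."""
        return (
            one_hot(self.hand_type, HAND_TYPES)
            + [float(self.has_flush_draw)]
            + [self.our_stack_bb / scale_bb,
               self.avg_opp_stack_bb / scale_bb,
               self.pot_bb / scale_bb]
            + one_hot(self.preflop_summary, PREFLOP_SUMMARIES)
            + [float(self.we_were_aggressor)]
        )


# "middle pair + flush draw" on the flop of a 3bet pot where we were the aggressor
obs = Observation("middle_pair", True, 85.0, 110.0, 22.0, "3bet", True)
features = obs.to_features()
print(len(features))  # 15
```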

Author:	menc [ Thu Jan 11, 2018 8:48 pm ]
Post subject:	Re: Neural net based approach inspired by CFR
maybe it could be possible. but in my opinion, the CFR algo and value-based Q-learning are almost the same: CFR focuses on "how much you didn't get" while Q-learning focuses on "how much you get". i wonder how you would set a differentiable loss function so that the SGD based optimizers could work? the original CFR algo seems undifferentiable.

Author:	listerofsmeg [ Sat Jan 13, 2018 12:25 am ]
Post subject:	Re: Neural net based approach inspired by CFR
Quote: i wonder how you would set a differentiable loss function so that the SGD based optimizers could work? the original CFR algo seems undifferentiable.

Not sure I entirely follow you. I imagine it could work like this. In every iteration:
1. Sample from the game tree, calculate regrets for each information set, calculate strategies matching the regrets. For example, we can have "with AA preflop, raise 100% of the time".
2. Do one training backpropagation step on the neural network(s). Let's say the neural network currently predicts "with AA preflop, raise 90% of the time". We say, "no, 90% is incorrect, the correct answer is 100%" and perform an update in that direction. So after the update, it might output "91%" for example, depending on the learning rate.

An important note is that the learning rate would be proportional to the counterfactual reach probability. This is similar to how regrets are weighted in CFR.

That's the main idea behind this neural network approach. Intuitively, it seems that this could possibly work well, because it works for CFR. Another note, the strategies from step one are obviously incorrect, because they're based on just one sample. But on average, they should be correct (or not? not sure here).
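For a single information set, the update in step 2 can be sketched as follows. This assumes a softmax output and a cross-entropy loss toward the regret-matched strategy, neither of which is specified in the post; `update_logits`, `base_lr` and `reach_prob` are made-up names, and the two logits stand in for a whole network.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def update_logits(logits, target, base_lr, reach_prob):
    """One gradient step on cross-entropy(target, softmax(logits)).

    The step size is scaled by the counterfactual reach probability,
    as suggested in the post (an assumption, not a proven scheme).
    """
    probs = softmax(logits)
    step = base_lr * reach_prob
    # For softmax + cross-entropy the gradient w.r.t. each logit is p - t.
    return [z - step * (p - t) for z, p, t in zip(logits, probs, target)]


# "with AA preflop": actions are (call, raise), the net currently raises 90%.
logits = [0.0, math.log(9.0)]
target = [0.0, 1.0]  # regret-matched strategy says raise 100%

often_reached = update_logits(logits, target, base_lr=0.5, reach_prob=1.0)
rarely_reached = update_logits(logits, target, base_lr=0.5, reach_prob=0.1)
print(softmax(logits)[1], softmax(rarely_reached)[1], softmax(often_reached)[1])
```

The raise probability moves from 90% toward 100% in both cases, and it moves less when the reach probability is small.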

Poker-AI.org http://poker-ai.org/phpbb/

Neural net based approach inspired by CFR http://poker-ai.org/phpbb/viewtopic.php?f=24&t=3107	Page 1 of 1

Author:	menc [ Sat Jan 13, 2018 9:09 pm ]
Post subject:	Re: Neural net based approach inspired by CFR
Quote: So after the update, it might output "91%" for example, depending on the learning rate.

yep, that's where it could be hard for an NN approach.

how NN updates its weights in one iter:
1. you set a differentiable loss function loss(x)
2. for each weight wi, calculate the partial derivative dloss/dwi (its gradient).
3. update wi = wi - learning_rate * gradient.

for NN or DL approaches, the forward is like:
one_layer = activate(wx + b)
pred = softmax(one_layer(one_layer(...)))

how would you design your loss function and one iter?
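One possible answer, sketched under assumptions: use cross-entropy between the network output and the regret-matched strategy as the loss, so that steps 1 to 3 apply unchanged. The example below uses a single linear layer with softmax instead of a deep net, and all names and numbers are illustrative.

```python
import math


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(W, b, x):
    """pred = softmax(Wx + b) for one linear layer."""
    z = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    return softmax(z)


def cross_entropy(pred, target):
    """Step 1: a differentiable loss against the target strategy."""
    return -sum(t * math.log(p) for p, t in zip(pred, target) if t > 0.0)


def sgd_step(W, b, x, target, lr):
    pred = forward(W, b, x)
    # Step 2: for softmax + cross-entropy, dloss/dz_i = pred_i - target_i,
    # hence dloss/dW[i][j] = (pred_i - target_i) * x_j.
    dz = [p - t for p, t in zip(pred, target)]
    # Step 3: w = w - learning_rate * gradient.
    new_W = [[w - lr * d * xi for w, xi in zip(row, x)] for row, d in zip(W, dz)]
    new_b = [bi - lr * d for bi, d in zip(b, dz)]
    return new_W, new_b


x = [1.0, 0.5]              # hypothetical encoded information set
target = [0.0, 0.25, 0.75]  # regret-matched strategy (fold, call, raise)
W = [[0.0, 0.0] for _ in range(3)]
b = [0.0, 0.0, 0.0]

loss_before = cross_entropy(forward(W, b, x), target)
W, b = sgd_step(W, b, x, target, lr=0.5)
loss_after = cross_entropy(forward(W, b, x), target)
print(loss_before, loss_after)
```

The loss goes down after the step; with more layers the gradient would come from backpropagation instead of the closed form used here.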

Author:	PassiveBot [ Sat Jan 13, 2018 11:20 pm ]
Post subject:	Re: Neural net based approach inspired by CFR
menc wrote:
Quote: So after the update, it might output "91%" for example, depending on the learning rate.
yep, that's where it could be hard for an NN approach.
how NN updates its weights in one iter:
1. you set a differentiable loss function loss(x)
2. for each weight wi, calculate the partial derivative dloss/dwi (its gradient).
3. update wi = wi - learning_rate * gradient.
for NN or DL approaches, the forward is like:
one_layer = activate(wx + b)
pred = softmax(one_layer(one_layer(...)))
how would you design your loss function and one iter?

If you're generating examples of State & Action and the Regret Value, then all you would need to do is use mean squared error as a loss function and train the network on those values. Of course you're going to need a lot of examples and/or to represent the state in a way that buckets things enough.
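A rough sketch of that idea, with a linear model standing in for the network: each example pairs a (state, action) feature vector with a sampled regret, and the weights are fitted by SGD on squared error. The features and regret values below are invented purely for illustration.

```python
def predict_regret(weights, features):
    """Linear stand-in for a network that predicts regret for (state, action)."""
    return sum(w * f for w, f in zip(weights, features))


def mse(weights, examples):
    """Mean squared error over all (features, regret) examples."""
    errors = [(predict_regret(weights, f) - r) ** 2 for f, r in examples]
    return sum(errors) / len(errors)


def mse_sgd_epoch(weights, examples, lr):
    """One pass of per-example SGD on the squared error."""
    for features, regret in examples:
        error = predict_regret(weights, features) - regret
        # d/dw_j (error^2) = 2 * error * feature_j
        weights = [w - lr * 2.0 * error * f for w, f in zip(weights, features)]
    return weights


# Hypothetical training set: (state + action features, sampled regret).
examples = [
    ([1.0, 0.0, 1.0], 2.0),
    ([1.0, 1.0, 0.0], -1.0),
    ([0.0, 1.0, 1.0], 0.5),
]

weights = [0.0, 0.0, 0.0]
mse_before = mse(weights, examples)
for _ in range(200):
    weights = mse_sgd_epoch(weights, examples, lr=0.05)
mse_after = mse(weights, examples)
print(mse_before, mse_after)
```

A strategy could then be read off by applying regret matching to the predicted regrets of each action.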

Author:	cantina [ Tue Jul 24, 2018 11:05 am ]
Post subject:	Re: Neural net based approach inspired by CFR
I think it's feasible with a robust enough network. These days it's not unusual for DBNs to have many thousands of inputs for image recognition. I tried this a while back with HUL and SFF NNs. It converged to a point but began favoring more common hands (center of the bell curve).

Author:	PassiveBot [ Wed Jul 25, 2018 3:41 pm ]
Post subject:	Re: Neural net based approach inspired by CFR
cantina wrote:
I think it's feasible with a robust enough network. These days it's not unusual for DBNs to have many thousands of inputs for image recognition. I tried this a while back with HUL and SFF NNs. It converged to a point but began favoring more common hands (center of the bell curve).

It's definitely more than feasible. DeepStack is a neural net approach inspired by CFR. I've had some moderate testing success with some non-CFR based neural networks. I experimented with both ANNs and CNNs (and also a hybrid of a CNN and an ANN), but I didn't bother with CFR as I was looking for something that stood a chance at 6-max.

Author:	happypepper [ Wed Jul 25, 2018 5:54 pm ]
Post subject:	Re: Neural net based approach inspired by CFR
I think CFR has a chance against 6-max. Using the DeepStack method, the solving can be limited to 1 street or even half streets. Furthermore, postflop situations are most of the time 2-3 player only. We'd only have to worry about solving 6-max for a single street, preflop.

Author:	PassiveBot [ Thu Jul 26, 2018 2:33 pm ]
Post subject:	Re: Neural net based approach inspired by CFR
happypepper wrote:
I think CFR has a chance against 6-max. Using the DeepStack method, the solving can be limited to 1 street or even half streets. Furthermore, postflop situations are most of the time 2-3 player only. We'd only have to worry about solving 6-max for a single street, preflop.

Perhaps I put it too strongly, but at the same time the tree does get larger, even if many of those scenarios don't play out in reality. I guess in reality, if you do have all players in, the pot gets larger and things potentially simplify as well. It's possible that there are abstractions that don't lose much. Have you experimented at all in any such fashion?

All times are UTC