Coffee4tw wrote:
1) While the average is the same I do think the result will be different if we pick random values at each node because that essentially changes the utility of each action sequence. If you only have fixed ones then one action sequence will always be greater in utility than the other, which might not be true in real life. If we randomize it, the utility changes each time and will eventually lead to a different result. If you can mathematically prove that I'm wrong I believe you, you might actually be on to something there but I'd like to believe it's better. Maybe I am mixing two inherently different things up here but why else would RNR work for example?
As far as I understand, the authors of the paper proved it here: http://poker.cs.ualberta.ca/publications/AAAI11.pdf
Coffee4tw wrote:
3) Yeah that's why I was saying we might need to split up the one node into two. This is opponent dependent though; if we have a model that does well against the average opponent we are in a pretty good position. For exploiting opponents we can use what Nasher said and precompute multiple ones, including RNR, into this approach. Essentially what I am trying to gain is a small tree, not necessarily a less exploitable abstraction, but I still believe it is a lot better than anything with one fixed size raise node would be. Is it better than two fixed size raise nodes? Hard to say but that's kind of an unfair comparison.
I still don't get why it should be less exploitable. Let's assume we are in the learning phase. Our data suggest that we should bet 1/3 or 2/3 of the pot, each half of the time, so on average we make a 1/2 pot bet. As the algorithm isn't aware of our cards and picks the size randomly, the resulting strategy at this node would be the same as with a fixed bet size of 1/2 pot.
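To make the arithmetic concrete, here is a minimal sketch. The 1/3 and 2/3 sizes and the 50/50 split are from the example above; the hand prior (25% strong, 75% weak) and the function names are made up purely for illustration. It only shows two things: the mean bet size is 1/2 pot, and a size picked independently of our cards tells the opponent nothing about our hand.

```python
from fractions import Fraction

# Bet sizes (as a fraction of the pot) and how often each is chosen.
SIZES = {Fraction(1, 3): Fraction(1, 2), Fraction(2, 3): Fraction(1, 2)}

# Hypothetical prior over our hand strength (assumption, not from the thread).
PRIOR = {"strong": Fraction(1, 4), "weak": Fraction(3, 4)}


def mean_bet_size(sizes):
    """Expected bet size when the size is drawn at random."""
    return sum(size * prob for size, prob in sizes.items())


def posterior_strong(size, sizes, prior):
    """P(strong | observed size) when the size is chosen independently of
    the hand, i.e. P(size | hand) = sizes[size] for every hand."""
    joint = {hand: prior[hand] * sizes[size] for hand in prior}
    return joint["strong"] / sum(joint.values())


print(mean_bet_size(SIZES))
for observed in SIZES:
    print(observed, posterior_strong(observed, SIZES, PRIOR))
```

The posterior equals the prior for both sizes, which is the sense in which the random sizing carries no information. Whether that makes the node strictly equivalent to a fixed 1/2 pot bet is my claim above, not something this snippet proves.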
Now we apply your algorithm in a real game and our opponent bets into us. He value bets 2/3 pot and bluffs 1/3 pot. As we only see "a bet", we still respond the same way to both actions, which leads to us getting exploited.
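A toy calculation of that scenario, as a sketch only. Assumptions that are not in the thread: the pot is 1 unit, the opponent holds value and bluffs half of the time each, and we hold a pure bluff catcher that beats every bluff and loses to every value bet. EV is measured relative to folding (0), and all names are hypothetical.

```python
from fractions import Fraction

POT = Fraction(1)
VALUE_BET = Fraction(2, 3)   # opponent's sizing with value hands
BLUFF_BET = Fraction(1, 3)   # opponent's sizing with bluffs
P_VALUE = Fraction(1, 2)     # assumed share of value hands in his range


def call_ev(bet, villain_has_value):
    """EV of calling with a bluff catcher, relative to folding."""
    return -bet if villain_has_value else POT + bet


def size_blind_ev(call_freq):
    """We only see 'a bet', so one call frequency is used for both sizes."""
    ev_vs_value = call_freq * call_ev(VALUE_BET, True)
    ev_vs_bluff = call_freq * call_ev(BLUFF_BET, False)
    return P_VALUE * ev_vs_value + (1 - P_VALUE) * ev_vs_bluff


def size_aware_ev(call_freq_vs_large, call_freq_vs_small):
    """We see the size, so each size can get its own call frequency."""
    ev_vs_value = call_freq_vs_large * call_ev(VALUE_BET, True)
    ev_vs_bluff = call_freq_vs_small * call_ev(BLUFF_BET, False)
    return P_VALUE * ev_vs_value + (1 - P_VALUE) * ev_vs_bluff


print(size_blind_ev(Fraction(1)))                # always call, size unseen
print(size_aware_ev(Fraction(0), Fraction(1)))   # fold to 2/3, call 1/3
```

Under these assumptions the best the size-blind player can do is 1/3 of a pot per hand (always calling), while a player who sees the size gets 2/3 by folding to the big bet and calling the small one. The gap is what the merged bet node gives away against this particular opponent.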
Thinking a bit more about your idea, I think it has some value given that our abstraction does not use board textures: on boards like 552r, bets are typically small, while bets on JT8ss are typically large, for obvious reasons. Even if we have different bet size nodes, the algorithm can't exploit those, given that our abstraction isn't aware of public cards. With your approach, however, we could bet differently depending on the board texture. This should increase our EV a bit, but not change our exploitability.
Statistics: Posted by proud2bBot on Wed Mar 13, 2013 1:09 pm