FlashPlayer wrote:
Got a new question
Lets assume that we have HU game, we have a big (for example FULL) tree of this game and we calculated GTO for it. Perfect situation. So we have a strategy for every situation in this tree.
But in this tree in some opponent node there are few actions, and one of them (like bet with sizing x100500, or fold with nuts only) has 0% probability for opponent to choose during GTO play.
And what we need to do if our opponent during real game choosed this 0% prob action?
In our tree, if we used CS CFR algo we visited this node maybe one time, got very negative regret for opponent, set probability to zero and then didn't visit this node again, so we have no strategy for this node and all nodes after it.
But logic says that if opponent choosed so unliked action - we must exploit him so much in this situation if we play GTO (and we have this FULL tree with calculated GTO).
But how we can do this?
I don't really know, but maybe this might help.
There are two cases:
1. Villain takes an action that is legal with some hand but not the one he holds. This is non optimal so he will lose in the long term against a hero that continues to play according to the actions that villain takes
2. Villain takes an action that is illegal with any hand. Does this ever happen? If it does, I'd be inclined to proceed assuming he took the most infrequently taken action.
It might be useful to look at
https://www.youtube.com/watch?v=qndXrHcV1sMLibratus suggests you don't really have to exploit to win conclusively. Nevertheless, my opinion is that you have to:
1. Record your opponents action frequencies and showdown hands, then determine his strategy from that info. This takes too many hands to be useful, so you could accelerate it by seeing if the statistics you do collect correspond to previously identified patterns of poor play
2. Exploit his strategy, while not being too exploitable yourself. I've done some experiments on toy games that suggest you can do this by finding the pure strategy that exploits his strategy, and mixing it with your NE strategy. The results I got suggest that small deviations from NE pay off quite well. I guess you could monitor villain statistics for signs that he reacts to your exploitation too. University of Alberta did some work on this - Data biased robust counter strategies. From the work I did on toy games I don't completely agree with their conclusion that you have to recompute NE for every opponent - but I could be wrong.