HontoNiBaka wrote:
There are a few decent AIs that were trained with neural network self-play. As far as I know, Snowie did it, and Neo also used neural networks, among other things.
I am not really sure how that is done in principle, though. So you have your NN and you have your features; you have to find a useful representation of the poker game.
Maybe the bucket your hand belongs to will be a feature, maybe also public buckets, etc., but I don't understand how to progress from there.
I have used NNs several times in different domains. Basically, I had labeled training examples and I tried to predict the class from the features, simple enough.
But what is the class label in self play? What do I try to fit or predict? I really have problems with the general idea of using NNs and self play.
I work on something similar.
My idea is to do self-play and evaluate multiple possibilities (e.g. call, raise half pot, raise pot), and then for each decision choose/learn the action that maximizes the return. [Edit2: But this does not work for learning good folds; it would mean folding any losing hand preflop]
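Here is a rough sketch of how that labeling step could look. It is only one reading of the idea: averaging over several rollouts per action is my assumption, and `label_by_return`, `toy_rollout` and the `state` keys (`equity`, `pot`, `to_call`) are made-up names, not part of any real bot. The toy rollout just wins with probability `equity`; a real one would play the hand out against a copy of the network.

```python
import random
from typing import Callable, Dict, Tuple

# Candidate actions mentioned above; fold is deliberately left out.
ACTIONS = ("call", "raise_half_pot", "raise_pot")


def label_by_return(
    state: dict,
    rollout: Callable[[dict, str, random.Random], float],
    n_rollouts: int = 200,
    seed: int = 0,
) -> Tuple[str, Dict[str, float]]:
    """Try every action, average the returns, and use the best one as the label."""
    rng = random.Random(seed)
    mean_return = {}
    for action in ACTIONS:
        total = sum(rollout(state, action, rng) for _ in range(n_rollouts))
        mean_return[action] = total / n_rollouts
    best = max(ACTIONS, key=lambda a: mean_return[a])
    return best, mean_return


def toy_rollout(state: dict, action: str, rng: random.Random) -> float:
    """Hypothetical stand-in for self-play: win with probability state['equity']."""
    # Toy bet sizes as multiples of the amount to call (not real poker sizing).
    size = {"call": 1.0, "raise_half_pot": 1.5, "raise_pot": 2.0}[action]
    bet = size * state["to_call"]
    won = rng.random() < state["equity"]
    return state["pot"] + bet if won else -bet


if __name__ == "__main__":
    state = {"equity": 0.6, "pot": 10.0, "to_call": 2.0}
    label, means = label_by_return(state, toy_rollout)
    print(label, means)
```

The pair (features of `state`, `label`) would then be an ordinary supervised training example. If fold were added as a fourth action with a fixed return and each hand were judged on its single outcome, every lost hand would be labeled fold, which is the problem described in Edit2.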
The hard question for me is how to learn good folding.
There are many possibilities:
- if you lose the hand, relabel a random action in it as a fold (maybe not for all lost hands; maybe choose those at random too?)
- pick an action to be a fold using heuristics (e.g. where hand strength/potential is worst, based on pot odds, ...)
- use another bot or hand histories and learn folding from that
[Edit: just to be clear: the NN should learn folding; the question is how I, as the teacher, decide which folds are good]
But I just started with that approach, maybe it will not work well...
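For the heuristic option, a minimal sketch of a pot-odds teacher might look like this. It assumes an equity estimate for the hand is available from somewhere (hand strength/potential, rollouts, ...); `required_equity` and `teacher_label` are hypothetical names. The only fixed part is the standard pot-odds break-even formula: a call is break-even when equity equals to_call / (pot + to_call).

```python
def required_equity(pot: float, to_call: float) -> float:
    """Break-even equity for a call: the share of the final pot you pay for."""
    return to_call / (pot + to_call)


def teacher_label(equity: float, pot: float, to_call: float, best_non_fold: str) -> str:
    """Label the decision as a fold when the estimated equity is below pot odds.

    best_non_fold is whatever label the return-maximizing step produced.
    """
    # Nothing to call means checking is free, so never label that as a fold.
    if to_call > 0 and equity < required_equity(pot, to_call):
        return "fold"
    return best_non_fold


if __name__ == "__main__":
    # Pot is 100 and it costs 50 to call: at least 1/3 equity is needed.
    print(required_equity(100, 50))
    print(teacher_label(0.20, 100, 50, "call"))
    print(teacher_label(0.50, 100, 50, "call"))
```

This ignores implied odds and future betting, so it is only a rough teacher signal, but it does give fold labels that do not depend on whether the hand happened to lose.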