HontoNiBaka wrote:
SkyBot wrote:
[edit:]
Note: I don't follow that paper; I follow the main ideas of another reinforcement learning poker paper, but with some significant changes that greatly reduce the effort.
Which paper do you follow in your training? It would be great if you could provide the name or a link.
Deep Reinforcement Learning from Self-Play in Imperfect-Information Games
https://arxiv.org/abs/1603.01121
I use their main idea of having an average-policy and a best-response neural net and mixing them during training. However, while their approach should converge to a Nash equilibrium, I use some brutal optimizations where I may lose those convergence properties.
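
To make the mixing idea concrete, here is a minimal Python sketch of NFSP-style action selection as described in the paper: with a small anticipatory probability the agent acts with its (epsilon-greedy) best-response network and records that action as a supervised target for the average-policy network; otherwise it acts with the average policy. The network placeholders, buffer sizes, and the ETA/EPSILON values are illustrative assumptions, not SkyBot's actual settings or the paper's exact hyperparameters.

import random
import numpy as np
from collections import deque

NUM_ACTIONS = 3  # assumed action count (e.g. fold/call/raise), for illustration
ETA = 0.1        # anticipatory parameter: prob. of acting with the best response
EPSILON = 0.06   # epsilon-greedy exploration for the best-response net

rl_buffer = deque(maxlen=200_000)   # replay memory for Q-learning (best response)
sl_buffer = []                      # memory of own greedy actions (average policy)

def best_response_q(state):
    """Placeholder: Q-values over actions from the best-response network."""
    return np.random.randn(NUM_ACTIONS)

def average_policy(state):
    """Placeholder: action probabilities from the average-policy network."""
    p = np.random.rand(NUM_ACTIONS)
    return p / p.sum()

def select_action(state):
    """Mix the two policies: with probability ETA play the (epsilon-greedy)
    best response and store the chosen action as a supervised-learning target;
    otherwise sample from the average policy."""
    if random.random() < ETA:
        if random.random() < EPSILON:
            action = random.randrange(NUM_ACTIONS)
        else:
            action = int(np.argmax(best_response_q(state)))
        sl_buffer.append((state, action))   # target for the average-policy net
    else:
        action = int(np.random.choice(NUM_ACTIONS, p=average_policy(state)))
    return action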