It should be able to play at 6-max and 9-max tables, shouldn't require a lot of computation to act, and ideally it could be adjusted in some way to exploit the environment.
What is a good approach?
I've been following poker research for years, but this still seems like somewhat uncharted territory. The state of the art seems to be Pluribus, but it's very complex, it's 6-max only, it requires fixed starting stacks, and it uses a lot of computational resources. And a near-GTO strategy may not even be appropriate in a highly exploitable setting.
Right now I'm considering the following, although it's still a very vague idea: create some sort of parametrized strategy template with hundreds or thousands of parameters and optimize the parameters via self-play.
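To make the idea concrete, here is a minimal sketch on a toy game rather than full hold'em. It assumes Kuhn poker (three cards, one bet), a hypothetical 12-parameter template (one probability per decision point and card), and a simple accept-if-better hill climb as the optimizer. All names (`DECISIONS`, `ev_first`, `match`, `self_play`) are illustrative only, and none of this is taken from Pluribus or any other published bot.

```python
import itertools
import random

CARDS = (0, 1, 2)  # Kuhn poker deck: Jack, Queen, King
# Hypothetical template: one probability per (decision point, card).
DECISIONS = ("p1_bet", "p2_call", "p2_bet", "p1_call")


def random_params(rng):
    """Draw a random strategy: every parameter is a probability in [0, 1]."""
    return {d: [rng.random() for _ in CARDS] for d in DECISIONS}


def ev_first(a, b):
    """Exact expected payoff for strategy `a` in seat 1 against `b` in seat 2."""
    deals = list(itertools.permutations(CARDS, 2))
    total = 0.0
    for c1, c2 in deals:
        show = 1.0 if c1 > c2 else -1.0  # showdown result from seat 1's view
        bet = a["p1_bet"][c1]
        # Seat 1 bets: seat 2 calls (2 chips at stake) or folds (seat 1 wins 1).
        p2_call = b["p2_call"][c2]
        ev_bet = p2_call * 2 * show + (1 - p2_call) * 1.0
        # Seat 1 checks: seat 2 checks back (showdown for 1 chip) or bets,
        # after which seat 1 calls (2 chips at stake) or folds (loses 1).
        p2_bet = b["p2_bet"][c2]
        call = a["p1_call"][c1]
        ev_check = (1 - p2_bet) * show + p2_bet * (call * 2 * show - (1 - call))
        total += bet * ev_bet + (1 - bet) * ev_check
    return total / len(deals)


def match(a, b):
    """Average payoff for `a` against `b` with both seat assignments."""
    return 0.5 * (ev_first(a, b) - ev_first(b, a))


def mutate(params, rng, sigma=0.1):
    """Add Gaussian noise to every parameter and clip back into [0, 1]."""
    return {
        d: [min(1.0, max(0.0, p + rng.gauss(0, sigma))) for p in ps]
        for d, ps in params.items()
    }


def self_play(generations=2000, seed=0):
    """Hill climb: a mutated challenger replaces the champion if it beats it."""
    rng = random.Random(seed)
    champion = random_params(rng)
    for _ in range(generations):
        challenger = mutate(champion, rng)
        if match(challenger, champion) > 0:
            champion = challenger
    return champion
```

Calling `self_play()` returns the final parameter dictionary. Two caveats under these assumptions: beating only the current champion can cycle instead of converging, and a real template would need sampled hands rather than exact enumeration. Replacing the champion in `match` with a pool of modeled opponents would be one way to steer the same loop toward exploiting a particular environment.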
What do you think?

Statistics: Posted by listerofsmeg on Sat Apr 09, 2022 7:36 pm