Some results: 25bb stacks, a global threshold adjustment versus a static player, starting at 5% purification. The minimum and maximum purification levels were 1% and 100%, and each adjustment was made like this:
threshold += threshold * inc
The average (the optimum) is probably around 40%, which is about what I expected. A lower delta (increment percentage) would probably have shown a more stable optimum threshold. I don't have a player that can exploit overly predictable (i.e. very pure) strategies, so I'm not sure how to show it decreasing its threshold. In theory it should, though, against a player with that capability.
The increment sign (+/-) was decided by Zed * Utility of the deviated threshold levels.
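A minimal sketch of the update step described above, assuming the sign is picked by comparing the utility measured at a slightly higher threshold against the utility at a slightly lower one. The function and parameter names (`update_threshold`, `utility_up`, `utility_down`) are hypothetical, and how the utilities (and the Zed factor) are actually computed is not shown here, so they are passed in as plain numbers.

```python
def clamp(x, lo, hi):
    """Keep x inside the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def update_threshold(threshold, delta, utility_up, utility_down,
                     lo=0.01, hi=1.0):
    """One multiplicative step: threshold += threshold * inc.

    inc is +delta when the upward deviation scored better, -delta when the
    downward deviation scored better, and 0 on a tie (the tie rule is an
    assumption, not something stated in the post).
    """
    if utility_up > utility_down:
        inc = delta
    elif utility_up < utility_down:
        inc = -delta
    else:
        inc = 0.0
    # Clamp to the 1%/100% min/max purification levels.
    return clamp(threshold + threshold * inc, lo, hi)


if __name__ == "__main__":
    t = 0.05  # starting purification level (5%)
    # Hypothetical utilities: the higher threshold did better this round.
    t = update_threshold(t, delta=0.10, utility_up=1.0, utility_down=0.5)
    print(t)
```

Because the step is proportional to the current threshold, moves are small near 1% and large near 100%, which may be part of why the 10% delta run looks noisy at higher levels.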
Code:
Hand #     Threshold (10% delta)
--------------------------------------
50000      0.146923779810831
100000     0.186814088517662
150000     0.449290742512755
200000     0.496626159328362
250000     0.33744537676712
300000     0.442345245425452
350000     0.327312908941213
400000     0.406057514081512
450000     0.342283417617408
500000     0.420383316311766
550000     0.40578471755976
600000     0.488456269506705
650000     0.432960247785639
700000     0.513456650014193
750000     0.422146355060823
800000     0.510797089623596
850000     0.385637976310415
900000     0.403411666168254
950000     0.528818128102072
1000000    0.295419722492907
1050000    0.254078200455836
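As a rough check on the "around 40%" estimate, the samples can be averaged once the threshold has left its starting region. Treating hand 150000 as the end of the warm-up is an assumption made for this sketch; the values are copied from the table above.

```python
from statistics import mean

# Threshold samples from hand 150000 through 1050000 (10% delta run).
samples = [
    0.449290742512755, 0.496626159328362, 0.33744537676712,
    0.442345245425452, 0.327312908941213, 0.406057514081512,
    0.342283417617408, 0.420383316311766, 0.40578471755976,
    0.488456269506705, 0.432960247785639, 0.513456650014193,
    0.422146355060823, 0.510797089623596, 0.385637976310415,
    0.403411666168254, 0.528818128102072, 0.295419722492907,
    0.254078200455836,
]

mean_threshold = mean(samples)      # average level after warm-up
spread = max(samples) - min(samples)  # how far the level wanders

if __name__ == "__main__":
    print(f"mean={mean_threshold:.3f} spread={spread:.3f}")
```

The mean lands at roughly 0.41, in line with the estimate, while the spread shows how much the level still swings with a 10% delta.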
Next: somehow model a dynamic threshold based on game state.