At the moment I've got it set up so that the utility for reaching showdown is:
* for the winner: the total invested by the loser
* for the loser: minus the total invested by the loser
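To make the payoff rule above concrete, here is a minimal heads-up sketch. The function name `showdown_utility`, the `invested` list and the `winner=None` convention for a split pot are all hypothetical, introduced only for illustration; the tie case returning zero for both players is an assumption, since the post does not say how ties are handled.

```python
def showdown_utility(invested, winner):
    """Return (u0, u1) for a heads-up showdown.

    invested: chips each player has put into the pot, e.g. [5, 5]
    winner:   0 or 1, or None for a split pot (assumed to be worth 0 to both)
    """
    if winner is None:
        return (0.0, 0.0)
    loser = 1 - winner
    u = [0.0, 0.0]
    # The winner gains exactly what the loser put in; the loser loses the same.
    u[winner] = float(invested[loser])
    u[loser] = -float(invested[loser])
    return tuple(u)


# Example: both players invested 5 chips and player 1 wins.
print(showdown_utility([5, 5], winner=1))
```

Because the two utilities always sum to zero, an implementation could return only player 0's value and negate it for player 1.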
I also think I may be doing my CS-CFRM utilities wrong when calculating regret. Should I be returning the values above as the utilities, or should I be multiplying them by the opponent's reach probability for that node first?
I ask because some implementations appear to do the first (including mine), and others do the second. For example, amex appears to do the latter, but maybe I'm reading his code wrong. (This is before the decision(s) that lead to it multiply by its own strategy percentage.) I've tried out the new way and it appears to produce weird behaviour, but it's not over many iterations. Edit: ignore this, I think it's due to the way our code is organised; I'm doing the same thing but in the parent node.
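For anyone comparing the two conventions, the sketch below runs both on a toy two-player zero-sum tree with fixed strategies and checks that the regrets agree. It is only an illustration under stated assumptions: the dict-based tree, the helper names (`terminal`, `decision`, `cfr_raw`, `cfr_weighted`) and the numbers are all made up, and it is not taken from anyone's actual code. Convention A returns raw utilities from terminals and multiplies by the opponent's reach when the regret is computed at the decision node. Convention B multiplies by the opponent's reach at the terminal and applies the acting player's own strategy on the way back up.

```python
import math


def terminal(u0):
    # u0 is player 0's utility; player 1 gets -u0 (zero-sum).
    return {"u": u0}


def decision(name, player, strategy, children):
    return {"name": name, "player": player,
            "strategy": strategy, "children": children}


def cfr_raw(node, reach, regrets):
    """Convention A: terminals return raw utility (for player 0);
    the opponent's reach is applied at the decision node."""
    if "u" in node:
        return node["u"]
    p, sigma = node["player"], node["strategy"]
    child_u = []
    for a, child in enumerate(node["children"]):
        r = list(reach)
        r[p] *= sigma[a]
        child_u.append(cfr_raw(child, r, regrets))
    node_u = sum(s * u for s, u in zip(sigma, child_u))
    sign = 1.0 if p == 0 else -1.0  # flip to the acting player's view
    regrets[node["name"]] = [reach[1 - p] * sign * (u - node_u)
                             for u in child_u]
    return node_u


def cfr_weighted(node, reach, regrets):
    """Convention B: terminals return (v0, v1), each already multiplied
    by the *other* player's reach; nodes only apply their own strategy."""
    if "u" in node:
        return (reach[1] * node["u"], -reach[0] * node["u"])
    p, sigma = node["player"], node["strategy"]
    child_v = []
    for a, child in enumerate(node["children"]):
        r = list(reach)
        r[p] *= sigma[a]
        child_v.append(cfr_weighted(child, r, regrets))
    # Acting player's value: weight children by own strategy.
    own = sum(s * v[p] for s, v in zip(sigma, child_v))
    # Opponent's value: sigma is already inside the terminal weights.
    opp = sum(v[1 - p] for v in child_v)
    regrets[node["name"]] = [v[p] - own for v in child_v]
    return (own, opp) if p == 0 else (opp, own)


# Toy tree: P0 chooses between a terminal and handing the move to P1.
TREE = decision("P0", 0, [0.4, 0.6], [
    terminal(-1.0),
    decision("P1", 1, [0.25, 0.75], [terminal(3.0), terminal(-2.0)]),
])

regrets_a, regrets_b = {}, {}
cfr_raw(TREE, [1.0, 0.5], regrets_a)
cfr_weighted(TREE, [1.0, 0.5], regrets_b)

for name in regrets_a:
    for ra, rb in zip(regrets_a[name], regrets_b[name]):
        assert math.isclose(ra, rb, abs_tol=1e-12)
print("both conventions give the same regrets")
```

The point is that the opponent's reach has to be applied exactly once, either at the terminal or in the parent node. Applying it in both places, or in neither, would change the regrets.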
Any help with this is greatly appreciated.

Statistics: Posted by fraction — Wed Dec 11, 2013 2:53 pm