Question about the given implementation (p5).
Line 9: Shouldn't the frequency of performing action a at infoset I be multiplied by q rather than divided by it?
(i.e.
Code:
s(I,a) <- s(I,a) + SIGMA(I,a) * q
instead of
Code:
s(I,a) <- s(I,a) + SIGMA(I,a) / q
)
I'm referring to the cumulative profile formula (attached as a picture).
Btw @admins: a (La)TeX plugin (or a [TEX] BB code) would be nice for writing this kind of formula.
Also, looking at line 5 of the algorithm: does anyone know the rationale for dividing the utility by the probability of reaching terminal node z instead of multiplying by it?
In other words, why does it read:
Code:
if(h \in Z) then return u_i(h)/q end if
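My own tentative reading, in case it helps frame the question: if q is the probability with which the sampler reaches the terminal node (an assumption on my part), then dividing by q looks like ordinary importance sampling. A node that is sampled rarely gets a larger weight when it is sampled, so the estimate comes out right on average. A minimal sketch with made-up numbers (the utilities and probabilities below are hypothetical, not from the paper) that compares the exact expectation of both variants:

```python
from fractions import Fraction

# Hypothetical toy game: three terminal nodes, each with a utility and
# the probability q with which a sampler would pick that node.
utilities = [Fraction(4), Fraction(-2), Fraction(10)]
sample_probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

def expected_estimate(estimator):
    """Exact expectation of a one-sample estimator over the sampling distribution."""
    return sum(q * estimator(u, q) for u, q in zip(utilities, sample_probs))

# Quantity a full (unsampled) traversal would return: the sum over all terminals.
target = sum(utilities)

# Variant used in the algorithm: return u / q at the sampled terminal.
divide = expected_estimate(lambda u, q: u / q)

# Variant asked about: return u * q at the sampled terminal.
multiply = expected_estimate(lambda u, q: u * q)

print("full traversal:", target)
print("E[u / q]:", divide)
print("E[u * q]:", multiply)
```

In this toy setup only the divided version matches the full traversal in expectation. Whether the same argument is what the authors intend for line 9 is exactly what I'd like confirmed.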
EDIT: Attached the algorithm for your convenience.