I have a couple of questions regarding optimization.
[1] In
http://webdocs.cs.ualberta.ca/~bowling/papers/15science.pdf they state
Quote:
Finally, unlike with CFR, we have empirically observed that the exploitability of the players’ current strategies during the computation regularly approaches zero. Therefore, we can skip the step of computing and storing the average strategy, instead using the players’ current strategies as the CFR+ solution.
So I modified the pseudocode provided in
http://arxiv.org/pdf/1407.5042.pdf according to my understanding (attached at the bottom of this post). Do you agree with that modification?
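For reference, this is roughly how I read the change, as a minimal Python sketch rather than the paper's actual pseudocode. The function names are my own. I am assuming the regret-matching+ update (accumulated regrets clipped at zero) and that the current strategy is read directly from those regrets, so no average strategy is accumulated or stored:

```python
def regret_matching_plus_update(cum_regret, instant_regret):
    """Add the new regrets and clip at zero (my reading of the CFR+ update)."""
    return [max(r + d, 0.0) for r, d in zip(cum_regret, instant_regret)]


def current_strategy(cum_regret):
    """Normalize the non-negative regrets into action probabilities.

    Falls back to a uniform strategy when no action has positive regret.
    """
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    n = len(cum_regret)
    if total <= 0.0:
        return [1.0 / n] * n
    return [p / total for p in positive]


# Hypothetical numbers for one information set with three actions.
regrets = regret_matching_plus_update([1.0, 0.0, 2.0], [-2.0, 1.0, 1.0])
strategy = current_strategy(regrets)  # used directly as the solution
```

In this reading the only per-information-set state is the regret vector; whether that is safe in practice depends on the empirical observation quoted above.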
[2] In
http://webdocs.cs.ualberta.ca/~bowling/papers/15science.pdf they explain
Quote:
To address the memory challenge we store the average strategy and accumulated regrets using compression. We use fixed-point arithmetic by first multiplying all values by a scaling factor and truncating them to integers. The resulting integers are then ordered to maximize compression efficiency, with compression ratios around 13-to-1 on the regrets and 28-to-1 on the strategy. Overall, we require less than 11 TB of storage to store the regrets and 6 TB to store the average strategy during the computation, which is distributed across a cluster of computation nodes.
What might that scaling factor be? 1,000? 10,000? 100,000?
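To make the question concrete, here is a small sketch of what I understand by "multiplying by a scaling factor and truncating to integers". The helper names and the sample values are my own assumptions, not taken from the paper. It only shows that the truncation error stays below 1/scale, so the choice of factor is a trade-off between precision and the integer range needed:

```python
def to_fixed(values, scale):
    """Scale each value and truncate toward zero to get an integer."""
    return [int(v * scale) for v in values]


def from_fixed(ints, scale):
    """Recover approximate floating-point values from the integers."""
    return [i / scale for i in ints]


def max_abs_error(values, scale):
    """Largest round-trip error introduced by truncation at this scale."""
    restored = from_fixed(to_fixed(values, scale), scale)
    return max(abs(v - r) for v, r in zip(values, restored))


# Hypothetical strategy probabilities, just for illustration.
sample = [0.123456, 0.5, 0.987654]
for scale in (1_000, 10_000, 100_000):
    print(scale, to_fixed(sample, scale), max_abs_error(sample, scale))
```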
[3] The pseudocode uses a vector m that is neither declared nor initialized. What might that be? A temporary variable?
[4] The entire strategy is ~524 TB in size:
- Since the average strategy is not used: 524 / 2 = 262 TB required
- Since we are using 4-byte Int32 instead of 8-byte doubles: 262 / 2 = 131 TB required
- Since storage allows a compression factor of 13: 131 / 13 = ~10 TB required
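Spelled out as a quick calculation (a sketch only; the 524 TB starting point and the compression factor of 13 are simply the figures I assumed in the list, and the function name is my own):

```python
def storage_estimate_tb(total_tb=524.0, compression_ratio=13.0):
    """Apply the three reductions from the list, in order."""
    after_dropping_average = total_tb / 2          # no average strategy stored
    after_int32 = after_dropping_average / 2       # 4-byte ints, not 8-byte doubles
    after_compression = after_int32 / compression_ratio
    return after_dropping_average, after_int32, after_compression


steps = storage_estimate_tb()
print([round(s, 1) for s in steps])
```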
Correct? Thanks in advance.