Poker-AI.org
http://poker-ai.org/phpbb/

Sync Locking
http://poker-ai.org/phpbb/viewtopic.php?f=24&t=2692

Author:  cantina [ Mon Jan 27, 2014 12:46 pm ]
Post subject:  Sync Locking

I thought I would mention this to you guys because in other areas I've stated that I just let my threads collide when doing updates with CFRM.

I found, at least for my setup, this can have disastrous consequences. I don't know how exactly, but some of the more frequently accessed cumulative regrets were becoming (maximally) negative as a result of collision (which would be impossible otherwise because you're only adding positive numbers). As soon as I implemented sync locking for the updates, it didn't happen again.

Edit: By cumulative regrets I meant the average strategy (positive normalized cumulative regrets).
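
A minimal sketch of what locking the updates could look like, written in Python as a stand-in for whatever language was actually used. The Node class, add_strategy, and the weights are hypothetical names for illustration only, not the code from this post.

```python
import threading

class Node:
    """Hypothetical information-set node: one lock plus the cumulative strategy."""
    def __init__(self, num_actions):
        self.lock = threading.Lock()
        self.cum_strategy = [0.0] * num_actions

    def add_strategy(self, weights):
        # The read-modify-write on every slot happens under the node's lock,
        # so two threads cannot interleave inside the same update.
        with self.lock:
            for a, w in enumerate(weights):
                self.cum_strategy[a] += w

def worker(node, iterations):
    # Stand-in for a traversal thread that keeps hitting the same node.
    for _ in range(iterations):
        node.add_strategy([0.25, 0.75])

node = Node(2)
threads = [threading.Thread(target=worker, args=(node, 10_000)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(node.cum_strategy)
```

With one lock per node, threads only wait on each other when they actually touch the same node, which is presumably why the frequently visited nodes are the ones that matter most.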

Author:  Pitt [ Mon Jan 27, 2014 1:23 pm ]
Post subject:  Re: Sync Locking

I lock every player node when I reach it going forward and compute the current strategy, then unlock once the new regrets have been computed.
Note that releasing the lock before the back-pass isn't algorithmically right.
You can do it anyway, but then you need to copy the current strategy for each player node and thread each time you go deeper.

Even like that, I wonder if it's correct... Having two parallel iterations updating the regret according to the same base strategy.
I feel it should make pretty much no difference, but my intuition says it could converge a bit slower than with the strict lock.
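
The release-early variant described above could be sketched roughly like this. It is only an illustration under assumptions: Node, visit, and action_values_fn are hypothetical names, and the deeper traversal is reduced to a callback that returns one value per action.

```python
import threading

def regret_matching(regrets):
    # Standard regret matching: play in proportion to positive regret.
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    n = len(regrets)
    return [p / total for p in positive] if total > 0 else [1.0 / n] * n

class Node:
    """Hypothetical player node holding cumulative regrets and a lock."""
    def __init__(self, num_actions):
        self.lock = threading.Lock()
        self.regrets = [0.0] * num_actions

def visit(node, action_values_fn):
    # Forward pass: take a private copy of the current strategy, then release.
    with node.lock:
        strategy = regret_matching(node.regrets)
    # The deeper traversal runs without holding the lock (hypothetical callback).
    values = action_values_fn(strategy)
    node_value = sum(s * v for s, v in zip(strategy, values))
    # Back pass: re-acquire the lock only for the regret update itself.
    with node.lock:
        for a, v in enumerate(values):
            node.regrets[a] += v - node_value
    return node_value

node = Node(2)
first = visit(node, lambda strategy: [1.0, -1.0])
second = visit(node, lambda strategy: [1.0, -1.0])
print(first, second, node.regrets)
```

The strict variant would simply hold the lock across the whole body of visit, at the cost of more waiting.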

On the other hand, my threads have a small % of sleep that I'd like to reduce..!

Author:  drunkandrich [ Tue Jan 28, 2014 9:38 pm ]
Post subject:  Re: Sync Locking

I'm too paranoid and opted not to worry about any of that. Using CSCFR, I just check that no conflicting bucket is dealt in each batch. If one is, I curtail that round and hold the bucket sequence with the duplicate over until the next round. For 8 threads, 169 preflop buckets, and other streets much > 169, I only get about a 25% performance hit (I don't remember exactly). And with 4 - 6 threads the penalty is negligible.

It might not work so well using public CS or others.
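
A rough sketch of the batching idea, under the assumption that "conflicting" means two deals in the same round share a bucket on some street. The function name make_batches and the tuple layout (one bucket id per street) are made up for illustration.

```python
def make_batches(deals, batch_size):
    """Group deals into rounds in which no (street, bucket) pair repeats.

    Each deal is a tuple of bucket ids, one per street (hypothetical layout).
    """
    batches, current, used = [], [], set()
    for deal in deals:
        keys = set(enumerate(deal))  # (street, bucket) pairs touched by this deal
        if len(current) == batch_size or keys & used:
            # Curtail the round; the conflicting deal opens the next one.
            batches.append(current)
            current, used = [], set()
        current.append(deal)
        used |= keys
    if current:
        batches.append(current)
    return batches

deals = [(0, 10), (1, 11), (0, 12), (2, 13)]
batches = make_batches(deals, batch_size=8)
print(batches)
```

Each round can then be handed to the threads without any locking, since no two deals in it write to the same bucket.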

Author:  algonoob [ Sat Feb 01, 2014 8:11 am ]
Post subject:  Re: Sync Locking

drunkandrich wrote:
I'm too paranoid and opted not to worry about any of that. Using CSCFR, I just check that no conflicting bucket is dealt in each batch. If one is, I curtail that round and hold the bucket sequence with the duplicate over until the next round. For 8 threads, 169 preflop buckets, and other streets much > 169, I only get about a 25% performance hit (I don't remember exactly). And with 4 - 6 threads the penalty is negligible.

It might not work so well using public CS or others.

That can be improved by generating more cards for that batch instead of curtailing.
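
One way this might look, again only as a sketch: conflicting deals are set aside and the round keeps drawing until it is full. fill_batch, deal_source, and pending are hypothetical names, and a deal is assumed to be a tuple of bucket ids, one per street.

```python
from collections import deque

def fill_batch(deal_source, pending, batch_size):
    """Fill one round with non-conflicting deals, deferring conflicts instead of stopping."""
    batch, used, deferred = [], set(), []
    while len(batch) < batch_size:
        # Deals held over from earlier rounds go first, then fresh ones.
        deal = pending.popleft() if pending else next(deal_source, None)
        if deal is None:
            break  # source exhausted
        keys = set(enumerate(deal))  # (street, bucket) pairs touched by this deal
        if keys & used:
            deferred.append(deal)  # keep it for a later round
        else:
            batch.append(deal)
            used |= keys
    # Put the deferred deals back at the front, preserving their order.
    pending.extendleft(reversed(deferred))
    return batch

source = iter([(0, 10), (0, 11), (1, 12), (2, 13)])
pending = deque()
round_one = fill_batch(source, pending, batch_size=3)
round_two = fill_batch(source, pending, batch_size=3)
print(round_one, round_two)
```

Note that this version has no cap on the number of draws, so with an endless deal source it relies on a non-conflicting deal turning up eventually.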

Author:  Nose [ Thu Mar 06, 2014 10:47 pm ]
Post subject:  Re: Sync Locking

Nasher wrote:
I thought I would mention this to you guys because in other areas I've stated that I just let my threads collide when doing updates with CFRM.

I found, at least for my setup, this can have disastrous consequences. I don't know how exactly, but some of the more frequently accessed cumulative regrets were becoming (maximally) negative as a result of collision (which would be impossible otherwise because you're only adding positive numbers). As soon as I implemented sync locking for the updates, it didn't happen again.


First of all, thank you for sharing this. I will add an occasional check for negative cumulative strategies in addition to my checks for NaN and +/- Inf.
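
Such a check could be as small as the following sketch. find_bad_entries is a hypothetical helper name, and the cumulative strategy is assumed to be a flat list of floats.

```python
import math

def find_bad_entries(cum_strategy):
    """Return (index, value) pairs that are NaN, infinite, or negative."""
    return [(i, v) for i, v in enumerate(cum_strategy)
            if math.isnan(v) or math.isinf(v) or v < 0.0]

sample = [0.5, -1.0, float("nan"), float("inf"), 2.0]
bad = find_bad_entries(sample)
print([i for i, _ in bad])
```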

How did you come up with that observation? Did you check for it because ... You know ... It's impossible to happen? Or did you observe some degenerations somewhere in the preflop strategy? Good work!

I am curious about more details:

(1) You stated this happens to nodes that are frequently visited. Do you know whether that happens to actions that:
- are intuitively a strategically reasonable choice, or
- are intuitively a strategically bad choice, or
- completely at random?

(2) At what point of your simulation did you observe this phenomenon? After a few iterations or after a great many? If known: would you mind sharing the total number of iterations completed?

(3) Just for clarity: are you using Doubles for the cumulative regret and the average strategy?

EDIT: (4) What do you mean by maximally negative? Double.NegativeInfinity?

EDIT: Just a quick thought on numerical fuckups:

Assume a very long path with sampling probabilities p_opp, p_train. In the showdown nodes, utilities are scaled by the opponent's sampling probability (return u/p_opp). Now we have 3 possible cases:
Case A: p_opp is 'big' enough not to cause problems - a reasonable value is returned
Case B: p_opp is 0.0 - impossible, since this node would not be sampled then
Case C: p_opp is slightly (really, really slightly) larger than 0.0. Now a very large (but still reasonable) value is returned.

In a node's ev computation we have something like
ev = (sum over sampled ev_i) / sampledActions

Since the ev_i are very big now, you might get Infinity in the node's estimated ev and then, during the regret update (ev_i - Infinity), subtract Infinity, which would lead to -Infinity in the cumulative profile

Even if the value of ev is not Infinity yet, the buckets are accessed relatively frequently, so they might accumulate huge negative values (since ev_i - ev, i.e. the estimated ev of an action minus the sampled ev of that node = sampled regret, can be negative)

But all that does not explain why you are not facing the problem when using syncs ... Maybe you just got lucky this time

My advice would be to check the return value in Showdown-nodes and the sampled ev in decision nodes for reasonable values
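
The overflow scenario described above can be reproduced in a few lines. This is a toy illustration with made-up numbers (a utility of 100 and p_opp = 1e-308), not values from any real run; showdown_value is a hypothetical name.

```python
import math

def showdown_value(u, p_opp):
    # Sampling correction: utility divided by the opponent's sampling probability.
    return u / p_opp

# Case C: p_opp is tiny but non-zero, so the quotient overflows a double.
ev_actions = [showdown_value(100.0, 1e-308), 1.0]
ev = sum(ev_actions) / len(ev_actions)           # node ev estimate becomes inf
regrets = [ev_i - ev for ev_i in ev_actions]     # inf - inf and 1.0 - inf
print(ev, regrets)
```

The second regret is -Infinity and the first one is NaN, so a check in the showdown and decision nodes would catch both.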

Author:  cantina [ Fri Mar 07, 2014 12:31 am ]
Post subject:  Re: Sync Locking

I was just looking at the strategy for first to act. After many restarts, for some actions in what seemed like random pocket hands, the average strategy values (cumulative positive normalized regrets) were negative, which should never happen. I started looking at the tree traversals, waiting for one to go negative, which seemed to happen spontaneously (i.e. the last addition wasn't the cause). I assumed it was collision that caused it, implemented syncing, and it stopped.

1) Randomly.
2) It happened immediately.
3) Doubles.
4) Double.MinValue (so there was no chance for it to recover).

This was on a dual Xeon six-core workstation (for a total of 24 logical cores with hyperthreading), on a somewhat small-ish game. I haven't seen/noticed this in larger/different games, nor with fewer threads. So, you might not have to worry about it.

Edit in response to your edits: the numerical 'fuck ups' you describe don't apply.

Author:  Nose [ Fri Mar 07, 2014 10:23 am ]
Post subject:  Re: Sync Locking

Interestingly odd

I was curious about your case because I had a similar problem (Screenshot attached)

I have a simulation that has been running for three days already, but I guess I will end up hitting CTRL+C and running it again with sync locks implemented, since I am now uncertain about the results

Anyway, thanks again. Very interesting post

Attachments:
File comment: Messy strategy
noooo.jpg [ 72.61 KiB ]

Author:  Nose [ Fri Mar 07, 2014 1:34 pm ]
Post subject:  Re: Sync Locking

The "sync-feature" comes at a price of 500ips per thread. Quite expensive ...

Author:  cantina [ Mon Mar 10, 2014 8:02 am ]
Post subject:  Re: Sync Locking

Nose wrote:
The "sync-feature" comes at a price of 500ips per thread. Quite expensive ...

Yeah, it's not very forgiving.

All times are UTC