Poker-AI.org

Poker AI and Botting Discussion Forum

All times are UTC




 Post subject: Sync Locking
PostPosted: Mon Jan 27, 2014 12:46 pm 
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
I thought I would mention this to you guys because in other areas I've stated that I just let my threads collide when doing updates with CFRM.

I found, at least for my setup, this can have disastrous consequences. I don't know how exactly, but some of the more frequently accessed cumulative regrets were becoming (maximally) negative as a result of collision (which would be impossible otherwise because you're only adding positive numbers). As soon as I implemented sync locking for the updates, it didn't happen again.

Edit: By cumulative regrets I meant the average strategy (positive normalized cumulative regrets).
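
For illustration, a minimal sketch of what sync locking the update could look like. It is written in Python for brevity; the class, field, and function names are all hypothetical and not taken from anyone's solver here. Each node carries its own lock, and the read-modify-write on the average-strategy accumulators happens only while that lock is held. The sketch shows the locking pattern only and does not try to reproduce the corruption described above.

```python
import threading


class Node:
    """Hypothetical information-set node; names are illustrative only."""

    def __init__(self, num_actions):
        self.lock = threading.Lock()
        self.strategy_sum = [0.0] * num_actions

    def add_to_average(self, strategy, weight):
        # The whole read-modify-write runs under the node's own lock,
        # so two threads can never interleave inside one update.
        with self.lock:
            for a, p in enumerate(strategy):
                self.strategy_sum[a] += weight * p


def worker(node, iterations):
    for _ in range(iterations):
        node.add_to_average([0.25, 0.75], 1.0)


node = Node(2)
threads = [threading.Thread(target=worker, args=(node, 1000)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With one lock per node, threads only wait for each other when they hit the same node at the same time, which is presumably why contention concentrates on the most frequently visited nodes.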


Last edited by cantina on Fri Mar 07, 2014 12:44 am, edited 1 time in total.

 Post subject: Re: Sync Locking
PostPosted: Mon Jan 27, 2014 1:23 pm 
Junior Member

Joined: Wed Dec 04, 2013 12:40 am
Posts: 49
I lock every player node when I reach it on the forward pass, compute the current strategy, and unlock only once the new regrets have been computed.
Note that releasing the lock before the back-pass isn't algorithmically correct.
You can still do it, but then each thread has to copy the current strategy of every player node at the moment it goes deeper.

Even then, I wonder whether it's correct: two parallel iterations would update the regrets based on the same base strategy.
I feel it should make pretty much no difference, but my intuition says it could converge a bit more slowly than with the strict lock.
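
A rough sketch of the "copy the strategy, release the lock, re-lock for the update" variant, in Python with hypothetical names. The regret update is deliberately simplified (no reach or sampling weights), and `action_values` stands in for whatever the recursion into the child nodes would return.

```python
import threading


class Node:
    """Hypothetical player node holding cumulative regrets."""

    def __init__(self, num_actions):
        self.lock = threading.Lock()
        self.regret_sum = [0.0] * num_actions


def current_strategy(regret_sum):
    """Regret matching: normalise positive regrets, uniform if none are positive."""
    positive = [max(r, 0.0) for r in regret_sum]
    total = sum(positive)
    n = len(regret_sum)
    return [p / total for p in positive] if total > 0 else [1.0 / n] * n


def visit(node, action_values):
    # Forward pass: take a private copy of the strategy under the lock.
    with node.lock:
        strategy = current_strategy(node.regret_sum)

    # The lock is released here. A real solver would now recurse into the
    # children; action_values is a stand-in for the values that come back.
    node_value = sum(p * v for p, v in zip(strategy, action_values))

    # Back pass: re-acquire the lock only for the regret update.
    with node.lock:
        for a, v in enumerate(action_values):
            node.regret_sum[a] += v - node_value
    return node_value


node = Node(3)
value = visit(node, [1.0, 0.0, -1.0])
```

Another thread may change `regret_sum` between the two locked sections, so the update is computed against a strategy that can already be stale. That is exactly the situation questioned above.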

On the other hand, my threads have a small % of sleep that I'd like to reduce..!


 Post subject: Re: Sync Locking
PostPosted: Tue Jan 28, 2014 9:38 pm 
New Member

Joined: Tue May 21, 2013 1:22 am
Posts: 4
I'm too paranoid and opted not to worry about any of that. Using CSCFR, I just check that no conflicting bucket is dealt in each batch. If one is, I curtail that round and save the bucket sequence with the duplicate until the next round. With 8 threads, 169 preflop buckets, and many more than 169 buckets on the other streets, I only take about a 25% performance hit (I don't remember exactly). And with 4 - 6 threads the penalty is negligible.

It might not work so well using public CS or others.
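
One way the batch check could be sketched, in Python. The representation is an assumption: a deal is a tuple with one bucket id per street, and two deals conflict if they share a bucket on the same street. `next_batch` and `draw_deal` are hypothetical names.

```python
def conflicts(deal, used):
    """True if the deal shares a bucket with an earlier deal on any street."""
    return any(bucket in used[street] for street, bucket in enumerate(deal))


def next_batch(draw_deal, num_threads, num_streets, carried=None):
    """Return (batch, carried): conflict-free deals plus the deal that cut the batch short."""
    batch = []
    used = [set() for _ in range(num_streets)]
    while len(batch) < num_threads:
        # A deal carried over from the previous round goes first.
        deal = carried if carried is not None else draw_deal()
        carried = None
        if conflicts(deal, used):
            return batch, deal  # curtail; this deal opens the next round
        for street, bucket in enumerate(deal):
            used[street].add(bucket)
        batch.append(deal)
    return batch, None


# Usage with a fixed deal sequence so the behaviour is easy to follow.
deals = iter([(0, 5), (1, 6), (0, 7), (2, 8)])
batch1, carried1 = next_batch(lambda: next(deals), 3, 2)
batch2, carried2 = next_batch(lambda: next(deals), 2, 2, carried1)
```

Since no two deals in a batch touch the same bucket, the threads working on that batch never write to the same entries and need no locks. The cost is the idle threads whenever a batch is cut short.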


 Post subject: Re: Sync Locking
PostPosted: Sat Feb 01, 2014 8:11 am 
Junior Member

Joined: Thu May 23, 2013 11:35 pm
Posts: 23
drunkandrich wrote:
I'm too paranoid and opted not to worry about any of that. Using CSCFR, I just check that no conflicting bucket is dealt in each batch. If one is, I curtail that round and save the bucket sequence with the duplicate until the next round. With 8 threads, 169 preflop buckets, and many more than 169 buckets on the other streets, I only take about a 25% performance hit (I don't remember exactly). And with 4 - 6 threads the penalty is negligible.

It might not work so well using public CS or others.

That can be improved by generating more cards for that batch instead of curtailing.
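
A sketch of that idea in Python, under the same assumed representation (a deal is a tuple with one bucket id per street, and sharing a bucket on a street counts as a conflict). Conflicting deals are set aside in a `pending` queue and retried first in the next batch rather than thrown away. All names are hypothetical.

```python
from collections import deque


def fill_batch(draw_deal, num_threads, num_streets, pending, max_draws=1000):
    """Fill a batch completely, deferring conflicting deals to `pending`."""
    batch = []
    used = [set() for _ in range(num_streets)]
    deferred = deque()
    draws = 0
    while len(batch) < num_threads and draws < max_draws:
        # Deals deferred in earlier rounds get priority over fresh draws.
        deal = pending.popleft() if pending else draw_deal()
        draws += 1
        if any(bucket in used[street] for street, bucket in enumerate(deal)):
            deferred.append(deal)  # keep it for a later batch
            continue
        for street, bucket in enumerate(deal):
            used[street].add(bucket)
        batch.append(deal)
    pending.extend(deferred)
    return batch


# Usage with a fixed deal sequence.
deals = iter([(0, 5), (0, 6), (1, 7), (2, 8)])
pending = deque()
batch1 = fill_batch(lambda: next(deals), 2, 2, pending)
pending_after_first = list(pending)
batch2 = fill_batch(lambda: next(deals), 2, 2, pending)
```

The `max_draws` cap is an arbitrary safeguard so the loop terminates even when a full conflict-free batch cannot be found.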


 Post subject: Re: Sync Locking
PostPosted: Thu Mar 06, 2014 10:47 pm 
Regular Member

Joined: Sat May 25, 2013 7:36 am
Posts: 73
Nasher wrote:
I thought I would mention this to you guys because in other areas I've stated that I just let my threads collide when doing updates with CFRM.

I found, at least for my setup, this can have disastrous consequences. I don't know how exactly, but some of the more frequently accessed cumulative regrets were becoming (maximally) negative as a result of collision (which would be impossible otherwise because you're only adding positive numbers). As soon as I implemented sync locking for the updates, it didn't happen again.


First of all, thank you for sharing this. I will add an occasional check for negative cumulative strategies in addition to my checks for NaN and +/- Inf.
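
Such a check might look roughly like this in Python. The function name and the list-of-floats layout are assumptions for illustration.

```python
import math


def check_strategy_sum(strategy_sum):
    """Return (action index, problem) pairs for every suspicious entry."""
    problems = []
    for a, x in enumerate(strategy_sum):
        if math.isnan(x):
            problems.append((a, "NaN"))
        elif math.isinf(x):
            problems.append((a, "infinite"))
        elif x < 0.0:
            # The average-strategy sum only ever adds non-negative terms.
            problems.append((a, "negative"))
    return problems


report = check_strategy_sum([1.0, float("nan"), float("-inf"), -2.0, 0.0])
```

Running it every so many iterations rather than on every update keeps the overhead small.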

How did you come up with that observation? Did you check for it because ... you know ... it's impossible? Or did you observe some degenerate behaviour somewhere in the preflop strategy? Good work!

I am curious about more details:

(1) You stated this happens to nodes that are frequently visited. Do you know whether it happens to actions that are:
- intuitively a strategically reasonable choice,
- intuitively a strategically bad choice, or
- completely random?

(2) At what point of your simulation did you observe this phenomenon? After a few iterations, or after a large number of iterations? If known: would you mind sharing the total number of iterations completed?

(3) Just to clarify: you are using doubles for the cumulative regret and the average strategy?

EDIT: (4) What do you mean by maximally negative? Double.NegativeInfinity?

EDIT: Just a quick thought on numerical fuckups:

Assume a very long path with sampling probabilities p_opp and p_train. In the showdown nodes, utilities are scaled by the opponent player's sampling probability (return u/p_opp). Now we have 3 possible cases:
Case A: p_opp is 'big' enough not to cause problems - a reasonable value is returned
Case B: p_opp is 0.0 - impossible, since this node would not be sampled then
Case C: p_opp is only slightly (really, really slightly) larger than 0.0 - now a very large (but still reasonable) value is returned

In a node's ev computation we have something like
ev = (sum over sampled ev_i) / sampledActions

Since the ev_i are very big now, you might get Infinity in the node's estimated ev and then, during the regret update (ev_i - Infinity), subtract Infinity, which would lead to -Infinity in the cumulative profile.

Even if the value of ev is not Infinity yet, the buckets are accessed relatively frequently, so they might accumulate huge negative values (ev_i - ev, i.e. the estimated ev of an action minus the sampled ev of that node, is the sampled regret and can be negative).

But all that does not explain why you are not facing the problem when using syncs ... Maybe you just got lucky this time.

My advice would be to check the return value in Showdown-nodes and the sampled ev in decision nodes for reasonable values
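
The overflow path in Case C can be reproduced with plain double arithmetic. A small Python sketch, where the concrete numbers and the `limit` threshold are made up purely to trigger the effect:

```python
import math


def showdown_value(u, p_opp):
    # Utility scaled by the opponent's sampling probability, as described above.
    return u / p_opp


def is_reasonable(x, limit=1e12):
    """Sanity check; the limit is an arbitrary assumption."""
    return math.isfinite(x) and abs(x) <= limit


ev_1 = showdown_value(1e8, 1e-300)  # roughly 1e308: huge, but still finite
ev_2 = showdown_value(1e8, 1e-300)
node_ev = (ev_1 + ev_2) / 2         # the sum overflows to inf before the division
regret = ev_1 - node_ev             # finite minus inf gives -inf
cumulative = 5.0 + regret           # the accumulator can never recover from -inf
```

A check like `is_reasonable` on the showdown return value would flag `ev_1` long before anything reaches the accumulators.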


 Post subject: Re: Sync Locking
PostPosted: Fri Mar 07, 2014 12:31 am 
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
I was just looking at the strategy for first to act. After many restarts, for some actions in what seemed like random pocket hands, the average strategy (cumulative positive normalized regrets) was negative, which should never happen. I started looking at the tree traversals, waiting for one to go negative, which seemed to happen spontaneously (i.e. the last addition wasn't the cause). I assumed it was collision that caused it, implemented syncing, and it stopped.

1) Randomly.
2) It happened immediately.
3) Doubles.
4) Double.MinValue (so there was no chance for it to recover).

This was on a dual Xeon six-core workstation (24 logical cores in total with hyperthreading), on a smallish game. I haven't noticed this in larger or different games, nor with fewer threads. So, you might not have to worry about it.

Edit in response to your edits: the numerical 'fuck ups' you describe don't apply.


 Post subject: Re: Sync Locking
PostPosted: Fri Mar 07, 2014 10:23 am 
Regular Member

Joined: Sat May 25, 2013 7:36 am
Posts: 73
Interestingly odd

I was curious about your case because I had a similar problem (screenshot attached).

I currently have a simulation that has already been running for three days, but I guess I will end up hitting CTRL+C and running it again with sync locks implemented, since I am now uncertain about the results.

Anyway, thanks again. Very interesting post


Attachment: noooo.jpg (file comment: Messy strategy)
 Post subject: Re: Sync Locking
PostPosted: Fri Mar 07, 2014 1:34 pm 
Regular Member

Joined: Sat May 25, 2013 7:36 am
Posts: 73
The "sync-feature" comes at a price of 500ips per thread. Quite expensive ...


 Post subject: Re: Sync Locking
PostPosted: Mon Mar 10, 2014 8:02 am 
Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437
Nose wrote:
The "sync-feature" comes at a price of 500ips per thread. Quite expensive ...

Yeah, it's not very forgiving.

