Poker-AI.org

Poker AI and Botting Discussion Forum
It is currently Mon Nov 13, 2023 3:50 pm

All times are UTC




Post new topic Reply to topic  [ 13 posts ] 
Author Message
 Post subject: Iterative CFR
PostPosted: Thu Apr 10, 2014 7:38 am 
Offline
Junior Member

Joined: Wed Sep 04, 2013 6:05 pm
Posts: 47
Hi Guyz,

I wondering if it's possible to mod the CFR aglorithm to be iterative (without recursive function) ?

Have you tried ? Is there any speed increase ?

Regards,

MrNice


Top
 Profile  
 
 Post subject: Re: Iterative CFR
PostPosted: Thu Apr 10, 2014 8:51 am 
Offline
Site Admin
User avatar

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
It's definitely possible because there is a cs theorem that says it is. Dunno if it's faster: received wisdom says it is, but I've never proved it.

http://www.poker-ai.org/archive/www.pok ... &sk=t&sd=a

Have you eliminated all temporary object creations?


Top
 Profile  
 
 Post subject: Re: Iterative CFR
PostPosted: Fri Apr 11, 2014 8:35 am 
Offline
Junior Member

Joined: Wed Sep 04, 2013 6:05 pm
Posts: 47
Hi Spears,

Thanks for the info. I already have read this post as well but it's suggested to use iterative but that's not clear if it's speed up the training...

Regarding dynamic variablse, most are created before the training... I still have some variables created into the regret calculation function... I will try to avoid the creation of this variable in the function and come back to you with the results...

Anyway, I have already double the iterations/seconds just by activating optimization for compilation...


Next step I will try the multithreading and move to iterative function if I found a interesting way (right now my idea is to use multi pass on my gametree)...

Regards,

MrNice


Top
 Profile  
 
 Post subject: Re: Iterative CFR
PostPosted: Fri Apr 18, 2014 10:54 am 
Offline
Junior Member

Joined: Wed Dec 04, 2013 12:40 am
Posts: 49
Hi there !

Just to testify, iterative algorithm is a very very huge performance improvement compared to the recursive one.

That's it ;)

EDIT : .. and knowing some attributes of your tree, you can have zero variable creation.


Top
 Profile  
 
 Post subject: Re: Iterative CFR
PostPosted: Thu Apr 24, 2014 2:00 pm 
Offline
Junior Member

Joined: Wed Sep 04, 2013 6:05 pm
Posts: 47
Hi Pitt,

Thanks for the info ;)

Which CFR are you using ? CS/ES/PCS/OS/... ?


Regards,


MrNice


Top
 Profile  
 
 Post subject: Re: Iterative CFR
PostPosted: Thu Apr 24, 2014 2:43 pm 
Offline
Junior Member

Joined: Wed Dec 04, 2013 12:40 am
Posts: 49
I'm using CS.
The games supplied to my engine are interfaced state machines, and all they have to provide as parameters to initialize training variables is :
- The (maximum) depth of the tree
- The number of players
- The maximum number of possible actions over all players nodes.

For other samplings, you'll need numbers related to chances that are not sampled.
Anyway if you don't implement it for generic extensive-form game as I do, it should be easier to initialize what you need :)

Cheers !

PS : I'll release my framework one day, you'll see it on poker-ai, and I think it will be a good example of CS-CFRM implementation (but obviously not the more efficient as it can not take benefits of game specificities... and it's in Java :twisted: ). Anyway don't wait for it, I can't tell when this will happen :mrgreen:


Top
 Profile  
 
 Post subject: Re: Iterative CFR
PostPosted: Sat May 10, 2014 4:44 pm 
Offline
Junior Member

Joined: Wed Sep 04, 2013 6:05 pm
Posts: 47
Hi Guyz,

I have finished a first version of an iterative function...

The point is before calcualte Ev, regret at a node I have to update probabilities and reach :

Code:
void get_reach_value(int nodeidx, double reach[2], int cards[2])
{
   reach[0] = 1;
   reach[1] = 1;
   
   while(nodeidx)
   {   
            int next_node = nodeidx;
      nodeidx = NodeArray[nodeidx].PreviousNode;
      int player = NodeArray[nodeidx].Player;
      
      double acc_regret= 0;
      
      //Calculate accumulated regret
      for(int i=0; i<2; i++)
      {
         acc_regret += max((double)0, NodeArray[nodeidx].regret[(cards[NodeArray[nodeidx].Player]*2)+i]);
      }
      
      if (acc_regret > EPSILON)
      {
         for(int i=0; i<2; i++)
         {
            NodeArray[nodeidx].probability[i] = max((double)0, NodeArray[nodeidx].regret[(cards[NodeArray[nodeidx].Player]*2)+i])/acc_regret;
         }
      
      }
      else
      {
         NodeArray[nodeidx].probability[0] = 0.5;
         NodeArray[nodeidx].probability[1] = 0.5;
      }
      
   reach[NodeArray[nodeidx].Player] *= NodeArray[nodeidx].probability[NodeArray[next_node].PreviousActionIdx];
   }
}





But in terms of perfomance that's not really good :(

For Kuhn, for recusive I'm at 5.3m of iteration/sec and with this iterative function : 3.5m of iteration/sec... Clearly my iterative function is not really fast ;(

Do you have any idea to help me to improve the speed of my iterative function ?

By the way, I'm working a second version of the function that will based on two passes : the 1st one will update probabilities and reach and the second will calculate regret... But this one is not yet fully functionnal...

Regards,

MrNice


Top
 Profile  
 
 Post subject: Re: Iterative CFR
PostPosted: Sat May 10, 2014 5:15 pm 
Offline
Junior Member

Joined: Wed Sep 04, 2013 6:05 pm
Posts: 47
Ok so now with the code with two passes (first update prob&reach and after update regret&ev) I'm reaching 6.3m iterations/sec...

I'll will post my code in a couple of hours just the time to clean it a bit...

Regards,

MrNice


Top
 Profile  
 
 Post subject: Re: Iterative CFR
PostPosted: Sat May 10, 2014 6:21 pm 
Offline
Junior Member

Joined: Wed Sep 04, 2013 6:05 pm
Posts: 47
Ok following the code of my Chance Sampling Iterative CFRM:


Main loop :
Code:
//Update reach and probabilities
for(int nodeidx = 0; nodeidx<NODESNUMBER; nodeidx++)
{
   update_reach_and_probabilities(nodeidx, cards);
}

//Update EV
for(int nodeidx=NODESNUMBER-1; nodeidx>=0; nodeidx--)
{
   update_regret_it(nodeidx, cards, results);   
}


Update Reach&Prob :
Code:
void update_reach_and_probabilities(int nodeidx, int cards[2])
{
   
   double sum = 0;
   
   //Define new prob
   for(int i = 0; i<2; i++)
   {
      sum += max((double)0,(NodeArray[nodeidx].regret[(cards[NodeArray[nodeidx].Player]*2)+i]));
   }
   
   if(sum > EPSILON)
   {
      for(int i=0; i<2; i++)
      {
         
         NodeArray[nodeidx].probability[i] = max((double)0, NodeArray[nodeidx].regret[(cards[NodeArray[nodeidx].Player]*2)+i])/sum;
      }
   }
   else
   {
      NodeArray[nodeidx].probability[0] = 0.5;
      NodeArray[nodeidx].probability[1] = 0.5;
   }
      
   //Update Reach
   if(nodeidx != 0)
   {   
      NodeArray[nodeidx].reach[NodeArray[NodeArray[nodeidx].PreviousNode].Player] = NodeArray[NodeArray[nodeidx].PreviousNode].reach[NodeArray[NodeArray[nodeidx].PreviousNode].Player] * NodeArray[NodeArray[nodeidx].PreviousNode].probability[NodeArray[nodeidx].PreviousActionIdx];   
      NodeArray[nodeidx].reach[NodeArray[nodeidx].Player] = NodeArray[NodeArray[nodeidx].PreviousNode].reach[NodeArray[nodeidx].Player];
   }

}


Update Regret&Ev :
Code:
void update_regret_it(int nodeidx, int cards[2], int results)
{
   if(NodeArray[nodeidx].NextNodes[0]==-1 && NodeArray[nodeidx].NextNodes[1]==-1)
   {   
      if(NodeArray[nodeidx].PreviousAction==FOLD)
      {
         if(NodeArray[nodeidx].Player==0)
         {// So P2 folds         
            NodeArray[nodeidx].ev[0]=(NodeArray[nodeidx].Pot[1])*NodeArray[nodeidx].reach[1];
            NodeArray[nodeidx].ev[1]=-(NodeArray[nodeidx].Pot[1])*NodeArray[nodeidx].reach[0];      
         }
         else
         {//So P1 folds      
            NodeArray[nodeidx].ev[0]=-(NodeArray[nodeidx].Pot[0])*NodeArray[nodeidx].reach[1];
            NodeArray[nodeidx].ev[1]=(NodeArray[nodeidx].Pot[0])*NodeArray[nodeidx].reach[0];
         }

      }
      else
      {
         //If results == 0 --> P1 WIN, retults == 1, P2 win
         if(results == 0)//--> P1 wins
         {
            NodeArray[nodeidx].ev[0]=(NodeArray[nodeidx].Pot[1])*NodeArray[nodeidx].reach[1];
            NodeArray[nodeidx].ev[1]=-(NodeArray[nodeidx].Pot[1])*NodeArray[nodeidx].reach[0];
         }
         else//P2 wins
         {
            NodeArray[nodeidx].ev[0]=-(NodeArray[nodeidx].Pot[0])*NodeArray[nodeidx].reach[1];
            NodeArray[nodeidx].ev[1]=(NodeArray[nodeidx].Pot[0])*NodeArray[nodeidx].reach[0];
         }
      }      
   }
   else if(NodeArray[nodeidx].reach[0]<EPSILON && NodeArray[nodeidx].reach[1]<EPSILON)
   {
      NodeArray[nodeidx].ev[0]=0;
      NodeArray[nodeidx].ev[1]=0;
   }
   else
   {

      
      NodeArray[nodeidx].ev[0] = 0;
      NodeArray[nodeidx].ev[1] = 0;
      
      //Calculated Expected Value
      for(int i=0; i<2; i++)
      {
         NodeArray[nodeidx].ev[NodeArray[nodeidx].Player] += NodeArray[NodeArray[nodeidx].NextNodes[i]].ev[NodeArray[nodeidx].Player]*NodeArray[nodeidx].probability[i];
         NodeArray[nodeidx].ev[!NodeArray[nodeidx].Player] += NodeArray[NodeArray[nodeidx].NextNodes[i]].ev[!NodeArray[nodeidx].Player];
      }
      
      //Calculate regret
      for(int i=0; i<2; i++)
      {
         NodeArray[nodeidx].regret[(cards[NodeArray[nodeidx].Player]*2)+i] += (NodeArray[NodeArray[nodeidx].NextNodes[i]].ev[NodeArray[nodeidx].Player] - NodeArray[nodeidx].ev[NodeArray[nodeidx].Player]);
      }
   }
}



So have you some advices to speed up the process ?


Thanks for your help.

MrNice


Top
 Profile  
 
 Post subject: Re: Iterative CFR
PostPosted: Sun May 11, 2014 10:03 am 
Offline
Junior Member

Joined: Wed Dec 04, 2013 12:40 am
Posts: 49
Yep,

* Dereferencing your node's attributes only once should be really better, for example update_reach_and_probabilities should take :

int player, double[2] regret, double[2] probability, double[2] reach, int previousPlayer, double[2] previousProba, double[2] previousReach,...

as argument.

* Second improvement, have a class that contains all possible iteration variables you need (player, regret, probability, reach, previousPlayer...) as attributes, and have the iteration method in that class, doing :

for each node
set attributes variables for current node update
do update reach and proba for this node

for each node (reversed)
set attributes for current node regret computing
do update regret for this node

... so that you can have :
- no method call for one iteration
- all variables are class attributes so nothing is created during the iteration (no "double sum" for example).
- avoided all redundant array dereferencings

Other improvements can be performed after that, but I think it can be a good start.

Have a good coding time ;)


Top
 Profile  
 
 Post subject: Re: Iterative CFR
PostPosted: Sun May 11, 2014 8:40 pm 
Offline
Junior Member

Joined: Wed Sep 04, 2013 6:05 pm
Posts: 47
Hi Pitt,

Thanks for your help.

Your second statement is clear but not really the 1st one.... Could you please explain me a bit more ?


Thanks.

MrNice


Top
 Profile  
 
 Post subject: Re: Iterative CFR
PostPosted: Mon May 12, 2014 10:51 am 
Offline
Junior Member

Joined: Wed Dec 04, 2013 12:40 am
Posts: 49
Sorry I wasn't clear.
The first improvement is passing all the variables you need directly in your calls to the two submethods :

Code:
Node n, previousNode ; // Put this as method variable or better : as class attribute

for(int nodeidx = 0; nodeidx<NODESNUMBER; nodeidx++)
{
n = NodeArray[nodeidx];
previousNode = NodeArray[n.PreviousNode];
update_reach_and_probabilities(n.regret, n.player, previousNode.player, previousNode.proba  ...);
}

... same for the backward one.


But I proposed it only to introduce the second one that should perform better inlining your two submethods so without method calls :)


Top
 Profile  
 
 Post subject: Re: Iterative CFR
PostPosted: Wed May 21, 2014 4:23 pm 
Offline
Junior Member

Joined: Wed Dec 04, 2013 12:40 am
Posts: 49
Hi !
I'm responding in this thread as this could be helpful for other people. So here is the MrNice's message :

Quote:
Hi Pitt,


I'm still looking how to move to iterative CFR...

I have applied your advices and I have increased speeds for iterative and recursive (thanks for that)... But my recursive has the same speed (more or less) then my iterative one...

I suppose there is another way to move to iterative... I'm doing iterative with 2 passes... Is there another way ?

But due to the fact that prob/reach need to be propagated top-down and ev down-top I don't really see how...

So if you have to advice that will be great ;)

Regards,

MrNice



First I think you should try to look at performance over bigger games, because small depth games have not many recursive calls.

And to answer how you could do it another way, I do it in a single loop that walks the tree like a recursive method.
It took me a pretty long time to do it. I'll try to explain it simple :

let's say you have a recursive method that walks your tree :
Code:
Object recCall(Node n, Object arg1, ..., argN){
    Object tmp1 = doFirstPass(n, arg1, ..., argN);
    Object[] childrenResults = new Object[n.childrenSize];
   for(int i = 0; i < n.childrenSize; i++)
      childrenResults[depth][i] = recCall(n.children[i], tmp1, argK, ...);
   return doSecondPass(n, childrenResults, tmp1, argH...);
}


First identify what data you need unmodified after each child recursive call (here : i, tmp1, argK, argH)
Identify then what are the recursive values you will need for your second pass (here : childrenResults).

If you have arrays to store all this for each depth, you can have a while loop that walks the tree exactly the same way as recursive algorithm.

Constants :
Code:
int maxDepth;
int maxChildren;


Iteration variables :

Code:
int depth : to know at what depth you are
boolean forward : to indicate the direction of your algorithm.


Iteration arrays to store all what you need:

Code:
Object[][] childrenResults = new Object[maxDepth][maxChildren];
int[] lastChildVisited = new int[maxDepth]; (to store i)
...
and other arrays with maxDepth length to store args etc...


Your algorithm will have the general form :
Code:
Object result;
Node n = getRoot();
while(true){
    if(forward){
      // Do the first pass
       // Store what you need to keep
       lastChildVisited[depth] = 0;
       node = n.children[0];
       depth++;
       continue;
   }
   
   childrenResults[depth][lastChildVisited[depth] ] = result;
   if(lastChildVisited[depth] == node.childrenSize -1){
        result = // Process children results;
        node = n.parent;
        depth--;
        continue;
    }
   lastChildVisited[depth] ++;
   node = n.children[ lastChildVisited[depth]];
   depth++;
   forward = true;
    continue;
}


I didn't write anything about terminal nodes and stop conditions (on depth obviously) and it's poorly pseudo-coded, but I hope you understand the principle.

Keep in mind I'm not sure it's really better than your two passes algorithm for performance. But you asked for another way, hope you'll enjoy ;)

Just for the algo programming skills, I think this systematical way to transform recursive algo to iterative one is a good knowledge .

And don't forget to test on bigger games so that you can really benchmark performance !

Cheers, tell me if something is unclear there...


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 13 posts ] 

All times are UTC


Who is online

Users browsing this forum: No registered users and 1 guest


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
cron
Powered by phpBB® Forum Software © phpBB Group