Poker-AI.org
http://poker-ai.org/phpbb/

Iterative CFR
http://poker-ai.org/phpbb/viewtopic.php?f=24&t=2733
Page 1 of 1

Author:  MrNice [ Thu Apr 10, 2014 7:38 am ]
Post subject:  Iterative CFR

Hi Guyz,

I wondering if it's possible to mod the CFR aglorithm to be iterative (without recursive function) ?

Have you tried ? Is there any speed increase ?

Regards,

MrNice

Author:  spears [ Thu Apr 10, 2014 8:51 am ]
Post subject:  Re: Iterative CFR

It's definitely possible because there is a cs theorem that says it is. Dunno if it's faster: received wisdom says it is, but I've never proved it.

http://www.poker-ai.org/archive/www.pok ... &sk=t&sd=a

Have you eliminated all temporary object creations?

Author:  MrNice [ Fri Apr 11, 2014 8:35 am ]
Post subject:  Re: Iterative CFR

Hi Spears,

Thanks for the info. I already have read this post as well but it's suggested to use iterative but that's not clear if it's speed up the training...

Regarding dynamic variablse, most are created before the training... I still have some variables created into the regret calculation function... I will try to avoid the creation of this variable in the function and come back to you with the results...

Anyway, I have already double the iterations/seconds just by activating optimization for compilation...


Next step I will try the multithreading and move to iterative function if I found a interesting way (right now my idea is to use multi pass on my gametree)...

Regards,

MrNice

Author:  Pitt [ Fri Apr 18, 2014 10:54 am ]
Post subject:  Re: Iterative CFR

Hi there !

Just to testify, iterative algorithm is a very very huge performance improvement compared to the recursive one.

That's it ;)

EDIT : .. and knowing some attributes of your tree, you can have zero variable creation.

Author:  MrNice [ Thu Apr 24, 2014 2:00 pm ]
Post subject:  Re: Iterative CFR

Hi Pitt,

Thanks for the info ;)

Which CFR are you using ? CS/ES/PCS/OS/... ?


Regards,


MrNice

Author:  Pitt [ Thu Apr 24, 2014 2:43 pm ]
Post subject:  Re: Iterative CFR

I'm using CS.
The games supplied to my engine are interfaced state machines, and all they have to provide as parameters to initialize training variables is :
- The (maximum) depth of the tree
- The number of players
- The maximum number of possible actions over all players nodes.

For other samplings, you'll need numbers related to chances that are not sampled.
Anyway if you don't implement it for generic extensive-form game as I do, it should be easier to initialize what you need :)

Cheers !

PS : I'll release my framework one day, you'll see it on poker-ai, and I think it will be a good example of CS-CFRM implementation (but obviously not the more efficient as it can not take benefits of game specificities... and it's in Java :twisted: ). Anyway don't wait for it, I can't tell when this will happen :mrgreen:

Author:  MrNice [ Sat May 10, 2014 4:44 pm ]
Post subject:  Re: Iterative CFR

Hi Guyz,

I have finished a first version of an iterative function...

The point is before calcualte Ev, regret at a node I have to update probabilities and reach :

Code:
void get_reach_value(int nodeidx, double reach[2], int cards[2])
{
   reach[0] = 1;
   reach[1] = 1;
   
   while(nodeidx)
   {   
            int next_node = nodeidx;
      nodeidx = NodeArray[nodeidx].PreviousNode;
      int player = NodeArray[nodeidx].Player;
      
      double acc_regret= 0;
      
      //Calculate accumulated regret
      for(int i=0; i<2; i++)
      {
         acc_regret += max((double)0, NodeArray[nodeidx].regret[(cards[NodeArray[nodeidx].Player]*2)+i]);
      }
      
      if (acc_regret > EPSILON)
      {
         for(int i=0; i<2; i++)
         {
            NodeArray[nodeidx].probability[i] = max((double)0, NodeArray[nodeidx].regret[(cards[NodeArray[nodeidx].Player]*2)+i])/acc_regret;
         }
      
      }
      else
      {
         NodeArray[nodeidx].probability[0] = 0.5;
         NodeArray[nodeidx].probability[1] = 0.5;
      }
      
   reach[NodeArray[nodeidx].Player] *= NodeArray[nodeidx].probability[NodeArray[next_node].PreviousActionIdx];
   }
}





But in terms of perfomance that's not really good :(

For Kuhn, for recusive I'm at 5.3m of iteration/sec and with this iterative function : 3.5m of iteration/sec... Clearly my iterative function is not really fast ;(

Do you have any idea to help me to improve the speed of my iterative function ?

By the way, I'm working a second version of the function that will based on two passes : the 1st one will update probabilities and reach and the second will calculate regret... But this one is not yet fully functionnal...

Regards,

MrNice

Author:  MrNice [ Sat May 10, 2014 5:15 pm ]
Post subject:  Re: Iterative CFR

Ok so now with the code with two passes (first update prob&reach and after update regret&ev) I'm reaching 6.3m iterations/sec...

I'll will post my code in a couple of hours just the time to clean it a bit...

Regards,

MrNice

Author:  MrNice [ Sat May 10, 2014 6:21 pm ]
Post subject:  Re: Iterative CFR

Ok following the code of my Chance Sampling Iterative CFRM:


Main loop :
Code:
//Update reach and probabilities
for(int nodeidx = 0; nodeidx<NODESNUMBER; nodeidx++)
{
   update_reach_and_probabilities(nodeidx, cards);
}

//Update EV
for(int nodeidx=NODESNUMBER-1; nodeidx>=0; nodeidx--)
{
   update_regret_it(nodeidx, cards, results);   
}


Update Reach&Prob :
Code:
void update_reach_and_probabilities(int nodeidx, int cards[2])
{
   
   double sum = 0;
   
   //Define new prob
   for(int i = 0; i<2; i++)
   {
      sum += max((double)0,(NodeArray[nodeidx].regret[(cards[NodeArray[nodeidx].Player]*2)+i]));
   }
   
   if(sum > EPSILON)
   {
      for(int i=0; i<2; i++)
      {
         
         NodeArray[nodeidx].probability[i] = max((double)0, NodeArray[nodeidx].regret[(cards[NodeArray[nodeidx].Player]*2)+i])/sum;
      }
   }
   else
   {
      NodeArray[nodeidx].probability[0] = 0.5;
      NodeArray[nodeidx].probability[1] = 0.5;
   }
      
   //Update Reach
   if(nodeidx != 0)
   {   
      NodeArray[nodeidx].reach[NodeArray[NodeArray[nodeidx].PreviousNode].Player] = NodeArray[NodeArray[nodeidx].PreviousNode].reach[NodeArray[NodeArray[nodeidx].PreviousNode].Player] * NodeArray[NodeArray[nodeidx].PreviousNode].probability[NodeArray[nodeidx].PreviousActionIdx];   
      NodeArray[nodeidx].reach[NodeArray[nodeidx].Player] = NodeArray[NodeArray[nodeidx].PreviousNode].reach[NodeArray[nodeidx].Player];
   }

}


Update Regret&Ev :
Code:
void update_regret_it(int nodeidx, int cards[2], int results)
{
   if(NodeArray[nodeidx].NextNodes[0]==-1 && NodeArray[nodeidx].NextNodes[1]==-1)
   {   
      if(NodeArray[nodeidx].PreviousAction==FOLD)
      {
         if(NodeArray[nodeidx].Player==0)
         {// So P2 folds         
            NodeArray[nodeidx].ev[0]=(NodeArray[nodeidx].Pot[1])*NodeArray[nodeidx].reach[1];
            NodeArray[nodeidx].ev[1]=-(NodeArray[nodeidx].Pot[1])*NodeArray[nodeidx].reach[0];      
         }
         else
         {//So P1 folds      
            NodeArray[nodeidx].ev[0]=-(NodeArray[nodeidx].Pot[0])*NodeArray[nodeidx].reach[1];
            NodeArray[nodeidx].ev[1]=(NodeArray[nodeidx].Pot[0])*NodeArray[nodeidx].reach[0];
         }

      }
      else
      {
         //If results == 0 --> P1 WIN, retults == 1, P2 win
         if(results == 0)//--> P1 wins
         {
            NodeArray[nodeidx].ev[0]=(NodeArray[nodeidx].Pot[1])*NodeArray[nodeidx].reach[1];
            NodeArray[nodeidx].ev[1]=-(NodeArray[nodeidx].Pot[1])*NodeArray[nodeidx].reach[0];
         }
         else//P2 wins
         {
            NodeArray[nodeidx].ev[0]=-(NodeArray[nodeidx].Pot[0])*NodeArray[nodeidx].reach[1];
            NodeArray[nodeidx].ev[1]=(NodeArray[nodeidx].Pot[0])*NodeArray[nodeidx].reach[0];
         }
      }      
   }
   else if(NodeArray[nodeidx].reach[0]<EPSILON && NodeArray[nodeidx].reach[1]<EPSILON)
   {
      NodeArray[nodeidx].ev[0]=0;
      NodeArray[nodeidx].ev[1]=0;
   }
   else
   {

      
      NodeArray[nodeidx].ev[0] = 0;
      NodeArray[nodeidx].ev[1] = 0;
      
      //Calculated Expected Value
      for(int i=0; i<2; i++)
      {
         NodeArray[nodeidx].ev[NodeArray[nodeidx].Player] += NodeArray[NodeArray[nodeidx].NextNodes[i]].ev[NodeArray[nodeidx].Player]*NodeArray[nodeidx].probability[i];
         NodeArray[nodeidx].ev[!NodeArray[nodeidx].Player] += NodeArray[NodeArray[nodeidx].NextNodes[i]].ev[!NodeArray[nodeidx].Player];
      }
      
      //Calculate regret
      for(int i=0; i<2; i++)
      {
         NodeArray[nodeidx].regret[(cards[NodeArray[nodeidx].Player]*2)+i] += (NodeArray[NodeArray[nodeidx].NextNodes[i]].ev[NodeArray[nodeidx].Player] - NodeArray[nodeidx].ev[NodeArray[nodeidx].Player]);
      }
   }
}



So have you some advices to speed up the process ?


Thanks for your help.

MrNice

Author:  Pitt [ Sun May 11, 2014 10:03 am ]
Post subject:  Re: Iterative CFR

Yep,

* Dereferencing your node's attributes only once should be really better, for example update_reach_and_probabilities should take :

int player, double[2] regret, double[2] probability, double[2] reach, int previousPlayer, double[2] previousProba, double[2] previousReach,...

as argument.

* Second improvement, have a class that contains all possible iteration variables you need (player, regret, probability, reach, previousPlayer...) as attributes, and have the iteration method in that class, doing :

for each node
set attributes variables for current node update
do update reach and proba for this node

for each node (reversed)
set attributes for current node regret computing
do update regret for this node

... so that you can have :
- no method call for one iteration
- all variables are class attributes so nothing is created during the iteration (no "double sum" for example).
- avoided all redundant array dereferencings

Other improvements can be performed after that, but I think it can be a good start.

Have a good coding time ;)

Author:  MrNice [ Sun May 11, 2014 8:40 pm ]
Post subject:  Re: Iterative CFR

Hi Pitt,

Thanks for your help.

Your second statement is clear but not really the 1st one.... Could you please explain me a bit more ?


Thanks.

MrNice

Author:  Pitt [ Mon May 12, 2014 10:51 am ]
Post subject:  Re: Iterative CFR

Sorry I wasn't clear.
The first improvement is passing all the variables you need directly in your calls to the two submethods :

Code:
Node n, previousNode ; // Put this as method variable or better : as class attribute

for(int nodeidx = 0; nodeidx<NODESNUMBER; nodeidx++)
{
n = NodeArray[nodeidx];
previousNode = NodeArray[n.PreviousNode];
update_reach_and_probabilities(n.regret, n.player, previousNode.player, previousNode.proba  ...);
}

... same for the backward one.


But I proposed it only to introduce the second one that should perform better inlining your two submethods so without method calls :)

Author:  Pitt [ Wed May 21, 2014 4:23 pm ]
Post subject:  Re: Iterative CFR

Hi !
I'm responding in this thread as this could be helpful for other people. So here is the MrNice's message :

Quote:
Hi Pitt,


I'm still looking how to move to iterative CFR...

I have applied your advices and I have increased speeds for iterative and recursive (thanks for that)... But my recursive has the same speed (more or less) then my iterative one...

I suppose there is another way to move to iterative... I'm doing iterative with 2 passes... Is there another way ?

But due to the fact that prob/reach need to be propagated top-down and ev down-top I don't really see how...

So if you have to advice that will be great ;)

Regards,

MrNice



First I think you should try to look at performance over bigger games, because small depth games have not many recursive calls.

And to answer how you could do it another way, I do it in a single loop that walks the tree like a recursive method.
It took me a pretty long time to do it. I'll try to explain it simple :

let's say you have a recursive method that walks your tree :
Code:
Object recCall(Node n, Object arg1, ..., argN){
    Object tmp1 = doFirstPass(n, arg1, ..., argN);
    Object[] childrenResults = new Object[n.childrenSize];
   for(int i = 0; i < n.childrenSize; i++)
      childrenResults[depth][i] = recCall(n.children[i], tmp1, argK, ...);
   return doSecondPass(n, childrenResults, tmp1, argH...);
}


First identify what data you need unmodified after each child recursive call (here : i, tmp1, argK, argH)
Identify then what are the recursive values you will need for your second pass (here : childrenResults).

If you have arrays to store all this for each depth, you can have a while loop that walks the tree exactly the same way as recursive algorithm.

Constants :
Code:
int maxDepth;
int maxChildren;


Iteration variables :

Code:
int depth : to know at what depth you are
boolean forward : to indicate the direction of your algorithm.


Iteration arrays to store all what you need:

Code:
Object[][] childrenResults = new Object[maxDepth][maxChildren];
int[] lastChildVisited = new int[maxDepth]; (to store i)
...
and other arrays with maxDepth length to store args etc...


Your algorithm will have the general form :
Code:
Object result;
Node n = getRoot();
while(true){
    if(forward){
      // Do the first pass
       // Store what you need to keep
       lastChildVisited[depth] = 0;
       node = n.children[0];
       depth++;
       continue;
   }
   
   childrenResults[depth][lastChildVisited[depth] ] = result;
   if(lastChildVisited[depth] == node.childrenSize -1){
        result = // Process children results;
        node = n.parent;
        depth--;
        continue;
    }
   lastChildVisited[depth] ++;
   node = n.children[ lastChildVisited[depth]];
   depth++;
   forward = true;
    continue;
}


I didn't write anything about terminal nodes and stop conditions (on depth obviously) and it's poorly pseudo-coded, but I hope you understand the principle.

Keep in mind I'm not sure it's really better than your two passes algorithm for performance. But you asked for another way, hope you'll enjoy ;)

Just for the algo programming skills, I think this systematical way to transform recursive algo to iterative one is a good knowledge .

And don't forget to test on bigger games so that you can really benchmark performance !

Cheers, tell me if something is unclear there...

Page 1 of 1 All times are UTC
Powered by phpBB® Forum Software © phpBB Group
http://www.phpbb.com/