Poker-AI.org Poker AI and Botting Discussion Forum 2013-09-06T18:37:13+00:00 http://poker-ai.org/phpbb/feed.php?f=24&t=2576 2013-09-06T18:37:13+00:00 2013-09-06T18:37:13+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2576&p=4860#p4860 <![CDATA[Re: opencfr and recursive function]]>
When I pass an array to a function, I actually pass a pointer to it...

That's the point :D

Regards,

Statistics: Posted by MrNice — Fri Sep 06, 2013 6:37 pm


]]>
2013-09-06T18:22:54+00:00 2013-09-06T18:22:54+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2576&p=4858#p4858 <![CDATA[Re: opencfr and recursive function]]>
I have simplified the function to make it easier to understand what's going on:

Code:
#include <iostream>

void recursive_function(int a[2], int tracker)
{
   std::cout << "[+] Starting function\n";
   if (tracker == 3)
   {
      return;
   }
   if (tracker == 2)
   {
      a[0] = 5;
      a[1] = 6;
   }
   if (tracker < 2)
   {
      recursive_function(a, tracker + 1);
   }
   std::cout << "A at " << tracker << " : " << a[0] << " et " << a[1] << "\n";
   std::cout << "[-] End function\n";
}

int main(int argc, char* argv[])
{
   int a[2];

   for (int i = 0; i < 2; i++)
   {
      a[i] = 0;
   }
   recursive_function(a, 0);
   return 0;
}



And the results are:
Code:
[+] Starting function
[+] Starting function
[+] Starting function
A at 2 : 5 et 6
[-] End function
A at 1 : 5 et 6
[-] End function
A at 0 : 5 et 6
[-] End function


So if you have an explanation, I'm interested... I thought I would get 5 and 6 for the last step and 0 for the previous ones, but that's not the case :(

Statistics: Posted by MrNice — Fri Sep 06, 2013 6:22 pm


]]>
2013-09-06T13:32:44+00:00 2013-09-06T13:32:44+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2576&p=4852#p4852 <![CDATA[opencfr and recursive function]]>
In opencfr, there is a portion of the code that I don't understand: when the algorithm is at a terminal node and the EV has been calculated, how is the EV back-propagated to the previous node????

any idea ?

It seems that the EVs are back-propagated via the ev function parameter, but is that possible? Is it?

Here is the update_regret function:
Code:
static void update_regret(leduc::sequence u, int buckets[2][2], int hole[2], int board,
    int result, double reach[2], double chance, double ev[2], double cfr[2],
    regret_strategy strat[2]) {

  if (u.is_terminal()) {

    /* sequence is terminal */
    int amount = u.win_amount();

    if (u.is_fold()) {

      if (u.who_folded() == 0) {

        ev[0] = -amount*reach[1]*chance;
        ev[1] = amount*reach[0]*chance;
      } else {

        ev[0] = amount*reach[1]*chance;
        ev[1] = -amount*reach[0]*chance;
      }

    } else {

      /* sequence is a showdown */
      ev[0] = result*reach[1]*amount*chance;
      ev[1] = -result*reach[0]*amount*chance;
    }

  } else if (reach[0] < EPSILON && reach[1] < EPSILON) {

    /* cutoff, do nothing */
    ev[0] = ev[1] = 0;

  } else {

    /* some convenience variables */
    int player   = u.whose_turn();
    int opponent = leduc::opposite_player(player);
    int round    = u.get_round();

    /* player is using a regret-minimizing strategy */
    // get the accumulated average strategy and regrets for the 3 possible actions
    double * average_probability = strat[player].get_average_probability(u, buckets[player][round]);
    double * regret = strat[player].get_regret(u, buckets[player][round]);

    /* get the probability tuple for the player */
    double probability[3];
    strat[player].get_probability(u, buckets[player][round], probability);

    /* first average the strategy for the player */
    for(int i=0; i<3; ++i) {

      average_probability[i] += reach[player]*probability[i];
    }

    /* now compute the regret on each of our actions */
    double expected = 0, sum = 0;
    double old_reach = reach[player];
    double delta_regret[3];
    for(int i=0; i<3; ++i) {

      if (u.can_do_action(i)) {

        reach[player] = old_reach*probability[i];
        update_regret(u.do_action(i), buckets, hole, board, result, reach, chance, ev, cfr, strat);

        delta_regret[i] = ev[player];
        // compute the EV of the strategy = current expected value
        expected += ev[player]*probability[i];
        sum      += ev[opponent];
      }
    }

    /* restore reachability value */
    reach[player] = old_reach;

    /* subtract off expectation to get regret for each action */
    for(int i=0; i<3; ++i) {

      if (u.can_do_action(i)) {

        delta_regret[i] -= expected;
        // regret for each action
        regret[i]       += delta_regret[i];
        cfr[player]     += max(0., delta_regret[i]);
      }
    }

    /* set return value */
    ev[player]   = expected;
    ev[opponent] = sum;
  }
}

Statistics: Posted by MrNice — Fri Sep 06, 2013 1:32 pm


]]>