Poker-AI.org http://poker-ai.org/phpbb/ |
opencfr and recusive function http://poker-ai.org/phpbb/viewtopic.php?f=24&t=2576 |
Page 1 of 1 |
Author: | MrNice [ Fri Sep 06, 2013 1:32 pm ] |
Post subject: | opencfr and recusive function |
Hi guys,

There is a part of the opencfr code that I don't understand: when the algorithm reaches a terminal node and the EV has been calculated, how is the EV propagated back to the previous node? Any idea?

It seems that the EV is propagated back through the function parameter ev, but is that possible?

Here is the update_regret function:

Code:
static void update_regret(leduc::sequence u, int buckets[2][2], int hole[2], int board,
                          int result, double reach[2], double chance, double ev[2],
                          double cfr[2], regret_strategy strat[2]) {

  if (u.is_terminal()) {

    /* sequence is terminal */
    int amount = u.win_amount();

    if (u.is_fold()) {

      if (u.who_folded() == 0) {
        ev[0] = -amount*reach[1]*chance;
        ev[1] = amount*reach[0]*chance;
      } else {
        ev[0] = amount*reach[1]*chance;
        ev[1] = -amount*reach[0]*chance;
      }
    } else {

      /* sequence is a showdown */
      ev[0] = result*reach[1]*amount*chance;
      ev[1] = -result*reach[0]*amount*chance;
    }
  } else if (reach[0] < EPSILON && reach[1] < EPSILON) {

    /* cutoff, do nothing */
    ev[0] = ev[1] = 0;
  } else {

    /* some convenience variables */
    int player = u.whose_turn();
    int opponent = leduc::opposite_player(player);
    int round = u.get_round();

    /* player is using regret minimizing strategy */
    // Get probability for the 3 possible actions
    double * average_probability = strat[player].get_average_probability(u, buckets[player][round]);
    double * regret = strat[player].get_regret(u, buckets[player][round]);

    /* get the probability tuple for each player */
    double probability[3];
    strat[player].get_probability(u, buckets[player][round], probability);

    /* first average the strategy for the player */
    for(int i=0; i<3; ++i) {
      average_probability[i] += reach[player]*probability[i];
    }

    /* now compute the regret on each of our actions */
    double expected = 0, sum = 0;
    double old_reach = reach[player];
    double delta_regret[3];
    for(int i=0; i<3; ++i) {

      if (u.can_do_action(i)) {

        reach[player] = old_reach*probability[i];
        update_regret(u.do_action(i), buckets, hole, board, result, reach, chance, ev, cfr, strat);

        delta_regret[i] = ev[player];
        // compute EV of the strategy = current expected value
        expected += ev[player]*probability[i];
        sum += ev[opponent];
      }
    }

    /* restore reachability value */
    reach[player] = old_reach;

    /* subtract off expectation to get regret for each action */
    for(int i=0; i<3; ++i) {

      if (u.can_do_action(i)) {

        delta_regret[i] -= expected;
        // regret for each action
        regret[i] += delta_regret[i];
        cfr[player] += max(0., delta_regret[i]);
      }
    }

    /* set return value */
    ev[player] = expected;
    ev[opponent] = sum;
  }
}
Author: | MrNice [ Fri Sep 06, 2013 6:22 pm ] |
Post subject: | Re: opencfr and recusive function |
Hey guys,

I have simplified the function to make it easier to see what's going on:

Code:
#include <stdio.h>
#include <stdlib.h>
#include <iostream>

void recursive_function(int a[2], int tracker) {
  std::cout<<"[+] Starting function\n";

  if(tracker == 3) {
    return;
  }

  if(tracker == 2) {
    a[0] = 5;
    a[1]=6;
  }

  if(tracker<2) {
    recursive_function(a, tracker+1);
  }

  std::cout<<"A at "<<tracker<<" : "<<a[0]<<" et "<<a[1]<<"\n";
  std::cout<<"[-] End function\n";
}

int main(int argc, char* argv[]) {
  int a[2];

  for(int i=0; i<2; i++) {
    a[i]=0;
  }

  recursive_function(a,0);
}

And the results are:

Code:
[+] Starting function
[+] Starting function
[+] Starting function
A at 2 : 5 et 6
[-] End function
A at 1 : 5 et 6
[-] End function
A at 0 : 5 et 6
[-] End function

So if you have an explanation... I thought I would get 5 and 6 at the last step and 0 at the previous ones, but that's not the case.
Author: | MrNice [ Fri Sep 06, 2013 6:37 pm ] |
Post subject: | Re: opencfr and recusive function |
OK, I get it (I think)... When I pass an array to a function, I am actually passing a pointer to it, so every call writes into the same array. That's the point.

Regards,
All times are UTC