Poker-AI.org

Poker AI and Botting Discussion Forum

All times are UTC




 Post subject: CFRM nodes with p=0
Posted: Tue Sep 29, 2015 6:31 pm
New Member

Joined: Tue Sep 29, 2015 5:56 pm
Posts: 9
Hi, I am trying to learn CFRM and I am experimenting with external sampling. My code for bet nodes looks something like this (inspired by amax's code, which was very helpful):

Code:
   public double betNode(GameState state, double p, double op) {
      // p:  product of the trained player's own strategy probabilities on the path to this node
      // op: product of the probabilities of the sampled opponent actions on that path
      if (player == trainPlayer) {
         Node node = nodeMap.getNode(state, trainPlayer);
         double[] strategy = node.getRegretBasedStrategy(possibleActions);
         double factor = (1.0 / op) * p;
         node.updateCumulativeStrategy(strategy, factor);// DOES NOTHING IF p = 0!
         double[] u = new double[NUM_ACTIONS];
         double ev = 0;
         // Trained player: try every possible action, even those with strategy[i] == 0
         for (int i : possibleActions) {
            state.playerAction(i);
            u[i] = node(state, p * strategy[i], op);
            state.undo();
            ev += u[i] * strategy[i];
         }
         node.updateRegrets(u, ev, possibleActions);
         return ev;
      } else {
         // Opponent: sample a single action from the opponent's current strategy
         playerStrategies[player].getStrategy(state, scratchStrategy);
         int action = Node.sampleStrategy(rnd, scratchStrategy);
         state.playerAction(action);
         double result = node(state, p, op * scratchStrategy[action]);
         state.undo();
         return result;
      }
   }


Suppose we arrive at infoset A, where the trained player faces a big raise and holds extremely bad cards. The regret-based strategy says: 100% fold. All actions are tried anyway, and when trying the raise action we sample an opponent reraise and reach infoset B with the trained player to move. Because the raise action at infoset A had zero probability in A's regret-based strategy, parameter p is now zero. The regret-based strategy at B is (also) 100% fold.

Does it make any sense to evaluate moves other than fold at infoset B? Because p = 0, the other moves don't contribute to B's EV (and the cumulative strategy update is a no-op), so the only merit would be to refine B's regrets in the hope that B will be reached with p > 0 at some point in the future. But isn't that a waste of time? Wouldn't it be better (leading to faster convergence) to concentrate the effort on moves in infosets reached with p > 0? Or would skipping such moves lead to incorrect results? Or have I misunderstood something in the algorithm?
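
To make the situation concrete, here is a minimal sketch of how p becomes zero. It assumes that getRegretBasedStrategy does plain regret matching (that is an assumption, the real Node class is not shown here); the class name, the regret values and the value of op are made up for illustration.

```java
import java.util.Arrays;

final class RegretMatchingSketch {
    // Plain regret matching (assumed behaviour of getRegretBasedStrategy):
    // each action is played in proportion to its positive cumulative regret,
    // with a uniform fallback when no action has positive regret.
    public static double[] regretMatching(double[] cumulativeRegret) {
        double[] strategy = new double[cumulativeRegret.length];
        double positiveSum = 0.0;
        for (double r : cumulativeRegret) {
            positiveSum += Math.max(r, 0.0);
        }
        for (int i = 0; i < strategy.length; i++) {
            strategy[i] = positiveSum > 0.0
                    ? Math.max(cumulativeRegret[i], 0.0) / positiveSum
                    : 1.0 / strategy.length;
        }
        return strategy;
    }

    public static void main(String[] args) {
        // Hypothetical cumulative regrets at infoset A for {fold, call, raise}.
        double[] regretA = {4.0, -2.5, -6.0};
        double[] strategyA = regretMatching(regretA);
        System.out.println(Arrays.toString(strategyA)); // prints [1.0, 0.0, 0.0]

        double p = 1.0;                      // reach probability of the trained player at A
        double pAtB = p * strategyA[2];      // after trying the raise action: 0.0
        double op = 0.3;                     // hypothetical sampled opponent probability
        double factor = (1.0 / op) * pAtB;   // weight of the cumulative strategy update at B
        System.out.println(factor);          // prints 0.0
    }
}
```

So at B the cumulative strategy update has weight zero, but updateRegrets is still called, and that regret update is the only thing the traversal below the raise action achieves.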


 Post subject: Re: CFRM nodes with p=0
Posted: Sat Oct 17, 2015 1:33 pm
New Member

Joined: Tue Sep 29, 2015 5:56 pm
Posts: 9
Update: I tried this in an experiment (heads-up no-limit) and it did not work out.

The algorithm converged quickly to what looked like an equilibrium. At first glance the resulting strategy looked promising. But upon closer inspection I found that the fold probability for 44 as first action was 100%, and there was no way the algorithm would ever learn that a call or a raise would be better: the sub nodes of call and/or raise, which still contained all pure strategies, would never update their regret values because of my "optimization".

So the resulting strategy was certainly not (close to) a Nash equilibrium...


 Post subject: Re: CFRM nodes with p=0
Posted: Sat Oct 24, 2015 9:27 pm
Junior Member

Joined: Mon Apr 22, 2013 11:46 am
Posts: 34
You can't ignore them completely, but you don't necessarily have to traverse them every time.

Average strategy sampling sort of does this:
viewtopic.php?f=24&t=5
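
As far as I understand the average strategy sampling paper (Gibson et al., 2012), the key ingredient is a per-action probability of exploring each of the trained player's actions, based on the cumulative average strategy at the node. A rough sketch of that probability follows. The parameter names epsilon, tau and beta are how I recall them from the paper; the class and method names, and the handling of a zero denominator, are my own assumptions.

```java
final class AverageStrategySamplingSketch {
    /**
     * Probability of exploring one action at a node of the trained player:
     *   rho = max(epsilon, (beta + tau * s[a]) / (beta + sum of s)), capped at 1,
     * where s is the cumulative (average) strategy stored at the node.
     * epsilon is a floor so that every action keeps being explored occasionally.
     */
    public static double exploreProbability(double[] cumulativeStrategy, int action,
                                            double epsilon, double tau, double beta) {
        double total = 0.0;
        for (double s : cumulativeStrategy) {
            total += s;
        }
        double denominator = beta + total;
        if (denominator <= 0.0) {
            return 1.0; // own choice: nothing known yet, so explore everything
        }
        double rho = (beta + tau * cumulativeStrategy[action]) / denominator;
        return Math.min(1.0, Math.max(epsilon, rho));
    }

    public static void main(String[] args) {
        // Hypothetical cumulative strategy for {fold, call, raise}.
        double[] cumulative = {900.0, 90.0, 10.0};
        for (int a = 0; a < cumulative.length; a++) {
            System.out.println("action " + a + ": "
                    + exploreProbability(cumulative, a, 0.05, 1.0, 0.0));
        }
    }
}
```

An action that was explored with probability rho needs that probability carried down the traversal so the sampled utilities can be corrected for it; this sketch leaves that part out.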


 Post subject: Re: CFRM nodes with p=0
Posted: Sun Oct 25, 2015 7:21 pm
New Member

Joined: Tue Sep 29, 2015 5:56 pm
Posts: 9
Thank you, poor homeless guy, for this link; it is really interesting. It describes exactly the kind of optimization I was looking for.

I will give average strategy sampling a try when I have the time. Having glanced through the paper, I think intuitively that it should also be possible to skip moves with negative regret when p=0: not always (which is what I tested), but so that such moves are still explored with some non-zero probability. I will retest this some time as well.
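
A minimal sketch of what such probabilistic skipping could look like, under stated assumptions: an action counts as skippable when p = 0 and its cumulative regret is negative, it is still explored with probability exploreProb, and an explored value is divided by exploreProb so that the estimate stays correct on average. All names here are hypothetical, and I have not verified that this converges in practice.

```java
import java.util.Random;
import java.util.function.IntToDoubleFunction;

final class ProbabilisticSkipSketch {
    /**
     * Returns sampled action utilities for one node of the trained player.
     * childValue.applyAsDouble(i) stands in for the recursive traversal of action i.
     */
    public static double[] sampledUtilities(double[] regret, double p, double exploreProb,
                                            Random rnd, IntToDoubleFunction childValue) {
        double[] u = new double[regret.length];
        for (int i = 0; i < regret.length; i++) {
            boolean skippable = p == 0.0 && regret[i] < 0.0;
            if (!skippable) {
                u[i] = childValue.applyAsDouble(i);               // always traversed
            } else if (rnd.nextDouble() < exploreProb) {
                u[i] = childValue.applyAsDouble(i) / exploreProb; // importance-weighted
            }
            // otherwise the action is skipped this time and u[i] stays 0
        }
        return u;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        double[] regret = {3.0, -1.0, -2.0};  // hypothetical regrets for {fold, call, raise}
        double[] payoff = {0.0, -4.0, 2.0};   // hypothetical true values of the sub trees
        int n = 100_000;
        double sum = 0.0;
        for (int k = 0; k < n; k++) {
            sum += sampledUtilities(regret, 0.0, 0.1, rnd, i -> payoff[i])[2];
        }
        // The raise sub tree is visited in roughly 10% of the iterations,
        // yet the mean estimate should be close to its true value of 2.0.
        System.out.println("mean estimate for raise: " + sum / n);
    }
}
```

The price is higher variance in the regret updates for the skipped actions, so exploreProb trades traversal time against noise.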

