Poker-AI.org Poker AI and Botting Discussion Forum 2019-04-26T21:05:46+00:00 http://poker-ai.org/phpbb/feed.php?f=24&t=2483 2019-04-26T21:05:46+00:00 2019-04-26T21:05:46+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7925#p7925 <![CDATA[Re: Public chance sampling]]> Statistics: Posted by spears — Fri Apr 26, 2019 9:05 pm


]]>
2019-04-26T18:33:33+00:00 2019-04-26T18:33:33+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7924#p7924 <![CDATA[Re: Public chance sampling]]> Statistics: Posted by FlashPlayer — Fri Apr 26, 2019 6:33 pm


]]>
2019-04-25T09:15:03+00:00 2019-04-25T09:15:03+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7923#p7923 <![CDATA[Re: Public chance sampling]]> There is good info and links in http://www.poker-ai.org/phpbb/viewtopic.php?f=26&t=3100

Myself, I'm going for my own low-cost variant of DeepStack. That's the zeitgeist: AlphaGo, DeepStack, and Libratus all use machine learning to generate evaluation functions at the leaves of the current tree.

Statistics: Posted by spears — Thu Apr 25, 2019 9:15 am


]]>
2019-04-25T08:58:42+00:00 2019-04-25T08:58:42+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7922#p7922 <![CDATA[Re: Public chance sampling]]> spears wrote:

Have you tried these from above?
viewtopic.php?p=7165#p7165
viewtopic.php?p=4125#p4125

Sorry, can't put much time into this, very busy at the moment.


Thanks)) That was really helpful.
The irony is that all the answers were in this same topic lol)
I checked amax's code and it helped a lot with a fast BR implementation.

Actually, about this topic: at this moment there are so many CFR algorithm variants, sampling and non-sampling. So the general question is: which algorithm is currently state of the art in terms of fast convergence for the poker domain (HU NLH)? I am interested in sampling algorithms because they can be parallelized with good efficiency.
I know that in general sampling algorithms (the Monte Carlo class) are faster than non-sampling ones (like vanilla CFR, CFR+, Pure CFR etc.), and PCS is the best in the sampling class. Any updates from the last years?
Also, I heard that PioSolver uses a non-CFR class of algorithm inside. What other types of algorithms for finding equilibria exist?

Statistics: Posted by FlashPlayer — Thu Apr 25, 2019 8:58 am


]]>
2019-04-24T15:53:25+00:00 2019-04-24T15:53:25+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7920#p7920 <![CDATA[Re: Public chance sampling]]> viewtopic.php?p=7165#p7165
viewtopic.php?p=4125#p4125

Sorry, can't put much time into this, very busy at the moment.

Statistics: Posted by spears — Wed Apr 24, 2019 3:53 pm


]]>
2019-04-23T14:05:32+00:00 2019-04-23T14:05:32+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7917#p7917 <![CDATA[Re: Public chance sampling]]> spears wrote:

I would love to know how commercial programs are as fast as they are, both for solving NE and for calculating BR. Somebody on this board (HontoNiBaka?) recently pointed out that the trees they use are a lot smaller than you would expect, so maybe they exploit isomorphisms. I've never tried it myself, but can you use sampling to speed up BR?


I tried to use sampling for BR computation, but for some reason the results always differ from the "brute force" BR calculation. Very different. So I'm sure I did it totally wrong, but I didn't find the reason.

What I did is just this:
1. Sample player cards, opponent cards and the board.
2. Walk the tree as in the classic BR algorithm: choosing the max-EV action in player nodes and weighting the EVs (with weights from the average strategy, of course) in opponent nodes.
3. In terminal nodes, since we have full information about cards and board, I can calculate the utility directly, so everything looks easy here (no hand-vs-range calculations).
4. As the result of this walk I get a BR value that I sum with the BR values from other iterations, and at the end I just divide by the iteration count.

For some reason this results in wrong BRs, and I think I have some error in the approach. Maybe I don't need to sample opponent cards and should only use his range, or something like that. So I tried that and then returned to optimizing the "brute force" approach.
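Maybe the issue is the order of max and expectation: a best response maximizes the value averaged over chance, but the sampled walk above takes the max per sampled deal and then averages, which biases the estimate upward. A toy example (all payoffs made up, just to illustrate the gap):

```python
import random

random.seed(0)

# Toy node: hero chooses between two actions whose payoff depends on a
# sampled chance outcome (made-up numbers, just to illustrate the bias).
payoff = {  # payoff[action][runout]
    "bet":   {"A": 2.0, "B": -3.0},
    "check": {"A": 0.5, "B": 0.5},
}

runouts = [random.choice("AB") for _ in range(100_000)]

# Per-sample max (what the sampled walk does): biased upward.
per_sample_max = sum(max(payoff[a][r] for a in payoff) for r in runouts) / len(runouts)

# Max of averages (the true best-response value).
avg = {a: sum(payoff[a][r] for r in runouts) / len(runouts) for a in payoff}
true_br = max(avg.values())

print(per_sample_max)  # ~1.25, well above the true value
print(true_br)         # 0.5 (always checking is the best response here)
```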

About isomorphisms: yes, they could speed things up, but for example for a simple push/checkdown tree from the flop (around 13 nodes in total, including the checks on the turn and river), my BR for one player takes around 80 seconds in one thread. And this is after I precomputed ALL hand ranks for the showdowns. So literally this is the time the CPU needs to iterate through all triples hand1/hand2/board and check which rank is bigger in every utility node. Isomorphism can speed this up, but I'm not sure it can give anything close to a 40x speedup.
Anyway, the speed of the calculation depends on how we prepare the data before the tree walk.

I think I should describe my current algorithm a bit; maybe we can find the weak spot.

After the CFR algorithm we have a tree. I'm using abstractions, so in each node (actually on each street), for every hand/board pair I have a bucket (an "information set", as they call it, but I prefer "bucket"). So as the result of the CFR algorithm I have, for each bucket in every node, a strategy: the probabilities with which I should choose each child action.
Now I take the player's range, the opponent's range and the initial board (I am working on a postflop solver, so I have a board from the start; let's assume we are talking about a tree from the flop, so I have three board cards initially).
Next I iterate over all of the player's hands and all possible boards. One iteration starts with a player hand, the opponent's range (I don't iterate over the opponent's range), a full board (5 cards) and the probability of this combination. In general the probability here is just the player's hand probability, because every board is equally likely.
The main idea of an iteration is to take the opponent's range and update the probability of each hand while walking down the tree. In each opponent node I take every hand from the range, calculate its bucket (I know the board, so I can get the bucket), and then multiply this hand's probability by the strategy probability for this bucket in this node. And so on until we reach a utility node.
Inside the utility node I still have the player's hand (it is needed only here) and the updated opponent's range, so I can calculate the utility by just comparing the hand ranks (weighted by probability) of the whole opponent range against the player's hand rank.

That's all.
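A condensed sketch of this walk in Python (the bucket function, the tree layout, and all names here are hypothetical stand-ins, not the real solver's structures):

```python
def bucket(hand, board):
    # Hypothetical abstraction: in the real solver this is the hand/board
    # bucket lookup; here a toy stand-in so the sketch runs.
    return hand % 3

def br_walk(node, hero_hand, board, opp_range, strategy):
    """One iteration's tree walk. opp_range maps opponent hand -> reach
    probability and is filtered by the average strategy as we descend."""
    if node["type"] == "terminal":
        hero_rank = hero_hand  # stand-in: the hand doubles as its rank
        return node["payoff"] * sum(
            p * (1 if hero_rank > h else -1 if hero_rank < h else 0)
            for h, p in opp_range.items())
    if node["type"] == "hero":
        # Best response: hero picks the max-EV child.
        return max(br_walk(c, hero_hand, board, opp_range, strategy)
                   for c in node["children"])
    # Opponent node: weight each hand by its strategy probability for
    # this bucket, then recurse into every action.
    total = 0.0
    for a, child in enumerate(node["children"]):
        filtered = {h: p * strategy[(node["id"], bucket(h, board), a)]
                    for h, p in opp_range.items()}
        total += br_walk(child, hero_hand, board, filtered, strategy)
    return total
```

The outer loops over hero hands and boards (weighted by the hero hand probability) would then average these walk values, as described above.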

The problem here is that I need to iterate over all possible boards. I can't take only the opponent's range and, with a single tree walk, update all hand probabilities for all boards, because in each opponent node I need the board to get the bucket, and I need the bucket to get the strategy that updates the probabilities. And so on.
Of course I could do one tree walk, but then I would need to iterate over all boards in every utility node; this loop has to live somewhere. And as I understand it, there is a major upgrade of this algorithm that somehow removes this "all boards" loop, but I can't find that upgrade)

Any ideas?) Maybe I'm doing something totally wrong?

Statistics: Posted by FlashPlayer — Tue Apr 23, 2019 2:05 pm


]]>
2019-04-23T09:03:38+00:00 2019-04-23T09:03:38+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7916#p7916 <![CDATA[Re: Public chance sampling]]>
I doubt FBR helps you, as I expect you are calculating BR in your own abstraction already.

Statistics: Posted by spears — Tue Apr 23, 2019 9:03 am


]]>
2019-04-22T10:31:31+00:00 2019-04-22T10:31:31+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7913#p7913 <![CDATA[Re: Public chance sampling]]>
I am also working on a best response implementation now, and I'm very confused about how commercial solvers (CREv, Monker, Pio etc.) compute it so fast. They compute it on the fly. As I understand it, the coarse algorithm is:
1. Train a few iterations with CFR.
2. Stop training and compute BR. BR here is a brute force over all chance situations (our range, opponent's range, board): in every situation we calculate our best response and at the end sum them with probabilities (I mean, I know what a BR is and how it is calculated in general).
3. Go to 1, and so on.
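In outline (the two helpers are hypothetical stubs standing in for a real CFR update and a real full-tree best-response walk):

```python
# The loop described above, with hypothetical stub helpers.

def cfr_iteration(tree, t):
    pass  # stub: one regret-matching / strategy-accumulation pass

def best_response_value(tree, player):
    return 0.0  # stub: value of a best response to the current average strategy

def solve(tree, n_iters=1000, check_every=100, target=0.001):
    for t in range(1, n_iters + 1):
        cfr_iteration(tree, t)                  # step 1: train
        if t % check_every == 0:                # step 2: stop and measure
            expl = (best_response_value(tree, 0) +
                    best_response_value(tree, 1)) / 2
            if expl < target:
                return t, expl                  # converged
    return n_iters, None                        # step 3 is the loop itself
```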

But point 2 takes a lot of time, because it is at least one tree traversal with a lot of computation inside. And as I mentioned before, solvers do this "on the fly", as if they store and reuse some data during training that helps them compute BR fast.
I searched for algorithms for this but didn't find anything except http://martin.zinkevich.org/publication ... 1_rgbr.pdf. The algorithm in that paper shows how to reduce the number of tree walks to one, but the computation is still too long; I checked.

Any ideas what helps these solvers compute a BR in a second?

P.S. (found after more research): could it be the "Frequentist Best Response" described here: https://poker.cs.ualberta.ca/publicatio ... -rnash.pdf ?

Statistics: Posted by FlashPlayer — Mon Apr 22, 2019 10:31 am


]]>
2019-04-21T12:14:51+00:00 2019-04-21T12:14:51+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7912#p7912 <![CDATA[Re: Public chance sampling]]> FlashPlayer wrote:

spears wrote:
2. Villain takes an action that is illegal with any hand. Does this ever happen? If it does, I'd be inclined to proceed assuming he took the most infrequently taken action.


Thanks. Can we discuss this option more closely?
...
That's why I don't understand how you want to proceed here?
...

You must play. If you don't play, the casino will assume you fold, and I doubt that is optimal.
I'm assuming that we are talking about the case where villain is making a mistake, rather than taking an action that you haven't modelled.
Do you know for sure that there are modelled actions (other than folds) for which there are no hands that should play them?
Assuming there are, my suggestions:
1. Play assuming the closest case, i.e. assume villain took the least probable non-zero action.
2. Take a look at the game tree. Depending on the algorithm you used, there might be a strategy for both you and villain remaining from earlier in the solution process, when villain's chance of taking this action was greater than zero.
3. Call to the end of the hand.
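For what it's worth, suggestion 1 could look something like this (`node_strategy` is a hypothetical mapping from actions to their average-strategy probability at the node in question):

```python
def map_off_tree_action(observed, node_strategy):
    """Map an action our model gives zero probability to the 'closest' case:
    the least probable action that the model still plays with p > 0."""
    if node_strategy.get(observed, 0.0) > 0.0:
        return observed  # in-tree action, nothing to do
    played = {a: p for a, p in node_strategy.items() if p > 0.0}
    return min(played, key=played.get)
```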

Statistics: Posted by spears — Sun Apr 21, 2019 12:14 pm


]]>
2019-04-21T11:06:38+00:00 2019-04-21T11:06:38+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7911#p7911 <![CDATA[Re: Public chance sampling]]> spears wrote:

2. Villain takes an action that is illegal with any hand. Does this ever happen? If it does, I'd be inclined to proceed assuming he took the most infrequently taken action.


Thanks. Can we discuss this option more closely?
An illegal action here means that every hand in the GTO strategy has zero probability of choosing this action. And if my opponent is too far from GTO play and chooses this zero-probability action, I can't do anything with my GTO strategy, because I have no strategy for this case: zero probability of the action means that a zero fraction of the opponent's range chooses it, so nothing can be calculated there for my GTO strategy (utilities, CFR values etc.).
That's why I don't understand how you want to proceed here, if we have no strategy for this case.

The core problem is that despite having the full tree and a calculated GTO strategy, there may be nodes that are unreachable under GTO. But in real play villain can obviously reach any node. Or did I miss something?

This is a very confusing case for me, because on one side I have a full GTO strategy, but there are situations where I can't do anything about the opponent's play.

Statistics: Posted by FlashPlayer — Sun Apr 21, 2019 11:06 am


]]>
2019-04-21T08:47:33+00:00 2019-04-21T08:47:33+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7910#p7910 <![CDATA[Re: Public chance sampling]]> FlashPlayer wrote:

Got a new question :D

Let's assume we have a HU game, we have a big (for example FULL) tree of this game, and we calculated GTO for it. A perfect situation: we have a strategy for every situation in this tree.
But in this tree, in some opponent node, there are a few actions, and one of them (like a bet with sizing x100500, or folding with only the nuts) has 0% probability of being chosen by the opponent during GTO play.
What do we need to do if our opponent chooses this 0%-probability action during a real game?
In our tree, if we used the CS CFR algorithm, we visited this node maybe once, got a very negative regret for the opponent, set the probability to zero, and then never visited this node again, so we have no strategy for this node and all nodes after it.
But logic says that if the opponent chose such an unlikely action, we should be exploiting him heavily in this situation if we play GTO (and we have this FULL tree with calculated GTO).
But how can we do this?


I don't really know, but maybe this might help.

There are two cases:
1. Villain takes an action that is legal with some hand, but not the one he holds. This is non-optimal, so he will lose in the long term against a hero who continues to play according to the actions that villain takes.
2. Villain takes an action that is illegal with any hand. Does this ever happen? If it does, I'd be inclined to proceed assuming he took the most infrequently taken action.

It might be useful to look at https://www.youtube.com/watch?v=qndXrHcV1sM

Libratus suggests you don't really have to exploit to win conclusively. Nevertheless, my opinion is that you have to:
1. Record your opponent's action frequencies and showdown hands, then determine his strategy from that info. This takes too many hands to be useful, so you could accelerate it by seeing whether the statistics you do collect correspond to previously identified patterns of poor play.
2. Exploit his strategy, while not being too exploitable yourself. I've done some experiments on toy games that suggest you can do this by finding the pure strategy that exploits his strategy and mixing it with your NE strategy. The results I got suggest that small deviations from NE pay off quite well. I guess you could also monitor villain's statistics for signs that he reacts to your exploitation. The University of Alberta did some work on this ("Data Biased Robust Counter Strategies"). From the work I did on toy games I don't completely agree with their conclusion that you have to recompute the NE for every opponent, but I could be wrong.
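The mixing in point 2 can be sketched as a per-infoset convex combination (the epsilon and the strategy layout here are illustrative, not taken from the paper):

```python
def mix_strategies(ne, exploit, epsilon=0.1):
    """Per-infoset convex combination: mostly NE, with a small tilt toward
    the pure best response to the opponent's estimated strategy."""
    return {
        infoset: {a: (1 - epsilon) * ne[infoset][a]
                     + epsilon * exploit[infoset].get(a, 0.0)
                  for a in ne[infoset]}
        for infoset in ne
    }
```

Larger epsilon exploits harder but is itself more exploitable; the toy-game experiments mentioned above amount to tuning this trade-off.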

Statistics: Posted by spears — Sun Apr 21, 2019 8:47 am


]]>
2019-04-20T08:51:41+00:00 2019-04-20T08:51:41+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7908#p7908 <![CDATA[Re: Public chance sampling]]>

Let's assume we have a HU game, we have a big (for example FULL) tree of this game, and we calculated GTO for it. A perfect situation: we have a strategy for every situation in this tree.
But in this tree, in some opponent node, there are a few actions, and one of them (like a bet with sizing x100500, or folding with only the nuts) has 0% probability of being chosen by the opponent during GTO play.
What do we need to do if our opponent chooses this 0%-probability action during a real game?
In our tree, if we used the CS CFR algorithm, we visited this node maybe once, got a very negative regret for the opponent, set the probability to zero, and then never visited this node again, so we have no strategy for this node and all nodes after it.
But logic says that if the opponent chose such an unlikely action, we should be exploiting him heavily in this situation if we play GTO (and we have this FULL tree with calculated GTO).
But how can we do this?

Statistics: Posted by FlashPlayer — Sat Apr 20, 2019 8:51 am


]]>
2019-04-18T15:12:02+00:00 2019-04-18T15:12:02+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7904#p7904 <![CDATA[Re: Public chance sampling]]> I finally solved this problem and my CFR algo now provides close to famous solvers results.

Statistics: Posted by FlashPlayer — Thu Apr 18, 2019 3:12 pm


]]>
2019-04-15T10:07:54+00:00 2019-04-15T10:07:54+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7900#p7900 <![CDATA[Re: Public chance sampling]]> Quote:

So my question is: how do I correctly use a non-zero starting pot when calculating CFR (I am using the CS variant) from the flop?

Quote:

If the 1st player goes all in and the second folds, what is the utility here? 40 for the 1st and 0 for the 2nd?


You just need to pick a convention for how the pot grows and how winnings are allocated. Here is one suggestion, from many possibilities:

- Assume both starting stacks are 100.
- Increment the pot by 2 * (amount to call) whenever a call or raise occurs.
- At the start of the flop the pot is 2 * 40 = 80.
- Player 1 goes all in, i.e. bets 60. Player 2 calls. The pot is now 200. Player 2 has the best hand and gets half the pot, i.e. 100. The terminal utility is +100 and goes to the last player to act, player 2.
- Or:
- Player 1 goes all in, i.e. bets 60. Player 2 folds. The pot is still 80. Player 2 loses half the pot, i.e. -40. The terminal utility is -40 and goes to the last player to act, player 2.
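Worked out in code (numbers straight from the example; `stack`/`committed` are just illustrative names):

```python
# Worked arithmetic for the convention above: stacks 100, each player
# committed 40 preflop, so the flop pot is 2 * 40 = 80.
stack, committed = 100, 40
pot = 2 * committed                 # 80

# All-in and call: each player adds the remaining 60; pot becomes 200.
allin = stack - committed           # 60
pot_after_call = pot + 2 * allin    # 200
utility_call_win = pot_after_call // 2   # +100: winner nets the opponent's half

# All-in and fold: pot stays 80; the folder forfeits his half of it.
utility_fold = -(pot // 2)          # -40 to the last player to act (who folded)
```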

Statistics: Posted by spears — Mon Apr 15, 2019 10:07 am


]]>
2019-04-14T19:39:59+00:00 2019-04-14T19:39:59+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7897#p7897 <![CDATA[Re: Public chance sampling]]>
So my question is: how do I correctly use a non-zero starting pot when calculating CFR (I am using the CS variant) from the flop?

Let's look at a push/checkdown tree from the flop, where the first player can check or push, and the second player can check or push after the first player's check. If they both check, they check down the turn and river to showdown.

So if both players have stacks of 100 at the start of the flop and a zero pot (yeah, I know this is impossible, but just as an example), the utilities in terminal nodes are 100 or 0: 100 when there was an all-in and a call, 0 when they both checked, or when one went all in and the other folded.
And this type of tree, after a lot of iterations, ends up with best responses around zero for both players. That looks OK.

But what if the starting pot equals 40, for example? How do I need to change the utilities, and what best response should I get after training?

If the 1st player goes all in and the second folds, what is the utility here? 40 for the 1st and 0 for the 2nd? But this is a zero-sum game, so the sum of the utilities in each terminal node must be zero. Or not?
I also thought that the best response here (after training) should be 20 for both players, because with a symmetrical tree and a non-zero starting pot both players must print money in the long run, and logic (and commercial solvers) says their EV (and best response) is half the starting pot. But my experiments end with something like BR1 = 31 and BR2 = 29. This may be a result of incorrect utilities; that's why I'm looking for help.

Statistics: Posted by FlashPlayer — Sun Apr 14, 2019 7:39 pm


]]>
2017-05-17T18:23:28+00:00 2017-05-17T18:23:28+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7211#p7211 <![CDATA[Re: Public chance sampling]]>
Has anyone tried extending this to more than 2 players?

Statistics: Posted by DreamInBinary — Wed May 17, 2017 6:23 pm


]]>
2017-04-18T12:34:55+00:00 2017-04-18T12:34:55+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7167#p7167 <![CDATA[Re: Public chance sampling]]> DreamInBinary wrote:

... I think this could be vectorised further by replacing the for loop with something along the lines of:
Code:
strengths = oppreach[0:-1].cumsum() - oppreach[1:]

assuming Scala allows for element-wise subtraction and Python-like subindexing of arrays.

Sorry, can't figure out Python fast enough to comment.

DreamInBinary wrote:

Correct me if I'm wrong, but your code seems to assume:
1. oppReach is indexed by ranks, i.e. own and opponent rank vectors are the same.
2. Ranks do not repeat. Maybe oppReach is summed over the same ranks?
3. The code does not allow for dependent ranks. E.g. suppose that if our own rank is 5, then the opponent can't have 5 or 3.

Cases 1 and 2 I see how to handle, but I am puzzled by the third one. I wonder whether something like this would work:
Code:
  for(i <- 1 until ranks.length) strengths(i) = strengths(i - 1) + oppReach(i - 1) - oppReach(i) - sum_{i conflicts with j} oppReach(j)



You've got the idea. Ranks repeat a lot, and potentially this allows us to shorten the loop, though I didn't do that myself. I deal with conflicts approximately, and only when oppReach is very uneven.

Statistics: Posted by spears — Tue Apr 18, 2017 12:34 pm


]]>
2017-04-18T10:22:46+00:00 2017-04-18T10:22:46+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7166#p7166 <![CDATA[Re: Public chance sampling]]>
Code:
strengths = oppreach[0:-1].cumsum() - oppreach[1:]

assuming Scala allows for element-wise subtraction and Python-like subindexing of arrays.

Correct me if I'm wrong, but your code seems to assume:
1. oppReach is indexed by ranks, i.e. own and opponent rank vectors are the same.
2. Ranks do not repeat. Maybe oppReach is summed over the same ranks?
3. The code does not allow for dependent ranks. E.g. suppose that if our own rank is 5, then the opponent can't have 5 or 3.

Cases 1 and 2 I see how to handle, but I am puzzled by the third one. I wonder whether something like this would work:
Code:
  for(i <- 1 until ranks.length) strengths(i) = strengths(i - 1) + oppReach(i - 1) - oppReach(i) - sum_{i conflicts with j} oppReach(j)

Statistics: Posted by DreamInBinary — Tue Apr 18, 2017 10:22 am


]]>
2017-04-18T16:25:43+00:00 2017-04-18T08:12:33+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7165#p7165 <![CDATA[Re: Public chance sampling]]> - Ahead of time I find the vector of integers that sorts ranks into ascending order.
- At runtime I use this sort vector to rearrange the opponent's probabilities.
- Then, in Scala

Code:
  // ranks sorted ascending; oppReach rearranged with the same sort vector
  val ranks = Array(1,2,3,4,5)
  val oppReach = Array(0.1,0.2,0.3,0.1,0.2)
  val payoff = 1.3
 
  // strengths(i) = P(opp rank below ours) - P(opp rank above ours)
  val strengths = Array.ofDim[Double](ranks.length)
  strengths(0) = oppReach(0) - oppReach.sum   // weakest hand loses to all hands above it
  for(i <- 1 until ranks.length) strengths(i) = strengths(i - 1) + oppReach(i - 1) + oppReach(i)
 
  val evs = strengths.map { s => s * payoff }
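The same recurrence can be cross-checked against the naive O(n²) comparison; a Python sketch (assuming, as above, that ranks are already sorted and, for simplicity, distinct):

```python
def ev_naive(ranks, opp_reach, payoff):
    # O(n^2): compare every hero rank against the whole opponent range.
    return [payoff * sum(p * (1 if r > orank else -1 if r < orank else 0)
                         for orank, p in zip(ranks, opp_reach))
            for r in ranks]

def ev_prefix(ranks, opp_reach, payoff):
    # O(n): strengths[i] = P(win) - P(lose), maintained incrementally as
    # in the Scala above.
    n = len(ranks)
    strengths = [0.0] * n
    strengths[0] = opp_reach[0] - sum(opp_reach)
    for i in range(1, n):
        strengths[i] = strengths[i - 1] + opp_reach[i - 1] + opp_reach[i]
    return [s * payoff for s in strengths]
```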

Statistics: Posted by spears — Tue Apr 18, 2017 8:12 am


]]>
2017-04-15T10:52:52+00:00 2017-04-15T10:52:52+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=7162#p7162 <![CDATA[Re: Public chance sampling]]>
Currently I have just the naive form:
terminal_ev = ( non_conflict_matrix * payoffs ) dot reach_opp

Is a vector form of the efficient version even possible?

Statistics: Posted by DreamInBinary — Sat Apr 15, 2017 10:52 am


]]>
2013-05-10T15:24:12+00:00 2013-05-10T15:24:12+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=4128#p4128 <![CDATA[Re: Public chance sampling]]> Statistics: Posted by amax — Fri May 10, 2013 3:24 pm


]]>
2013-05-10T08:16:32+00:00 2013-05-10T08:16:32+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=4127#p4127 <![CDATA[Re: Public chance sampling]]> Statistics: Posted by spears — Fri May 10, 2013 8:16 am


]]>
2013-05-10T07:22:07+00:00 2013-05-10T07:22:07+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=4126#p4126 <![CDATA[Re: Public chance sampling]]> and tie. In the code however, only wins and loses are handled (including card removal):

Code:

      // p / op: reach-probability vectors for the training player and the
      // opponent, with holes sorted by hand rank; wincr / losecr accumulate
      // per-card probability so hands blocked by card removal can be
      // subtracted out. 'value' is the terminal payoff of this node.
      public override double[] TrainPublicChanceSampling(int trainplayer, PublicIteration iteration, double[] p, double[] op)
      {
         var pholes = iteration.GetHoles(trainplayer);
         var oholes = iteration.GetHoles(trainplayer ^ 1);

         var ev = new double[p.Length];

         var wincr = new double[52];

         double winsum = 0;
         int j = 0;

         // Forward pass: winsum accumulates the probability of opponent hands we beat.
         for (int i = 0; i < p.Length; i++)
         {
            while (oholes[j].Rank < pholes[i].Rank)
            {
               winsum += op[j];
               wincr[oholes[j].Card1] += op[j];
               wincr[oholes[j].Card2] += op[j];
               j++;
            }

            ev[i] = (winsum - wincr[pholes[i].Card1] - wincr[pholes[i].Card2]) * value;
         }

         var losecr = new double[52];
         double losesum = 0;
         j = op.Length - 1;

         // Backward pass: losesum accumulates the probability of opponent hands that beat us.
         for (int i = p.Length - 1; i >= 0; i--)
         {
            while (oholes[j].Rank > pholes[i].Rank)
            {
               losesum += op[j];
               losecr[oholes[j].Card1] += op[j];
               losecr[oholes[j].Card2] += op[j];
               j--;
            }

            ev[i] -= (losesum - losecr[pholes[i].Card1] - losecr[pholes[i].Card2]) * value;
         }

         return ev;
      }

Statistics: Posted by proud2bBot — Fri May 10, 2013 7:22 am


]]>
2013-05-10T06:17:43+00:00 2013-05-10T06:17:43+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=4125#p4125 <![CDATA[Re: Public chance sampling]]>
http://webdocs.cs.ualberta.ca/~bowling/papers/11ijcai-rgbr.pdf wrote:

Example 3. Suppose we can sort each players’ information sets by “rank”, and the utility only depends upon the relative ordering of the players’ ranks. This is exactly the situation that occurs in poker. For the moment, let us assume the distribution of the players’ ranks are independent. In this case, evaluating each of our information sets requires only O(n) work. We know that our weakest information set will be weaker than some of the opponent’s hands, equal to some, and better than some. We keep indices into the opponent’s ordered list of information sets to mark where these changes occur. To evaluate our information set, we only need to know the total probability of the opponent’s information sets in these three sections. After we evaluate one of our information sets and move to a stronger one, we just adjust these two indices up one step in rank.

Statistics: Posted by spears — Fri May 10, 2013 6:17 am


]]>
2013-05-09T21:49:15+00:00 2013-05-09T21:49:15+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=4124#p4124 <![CDATA[Re: Public chance sampling]]> It looks however that the implementation is for holdem, so I wonder if this case is handled differently, if the EV is not accurate or if I just didnt understand the algorithm completely?

Statistics: Posted by proud2bBot — Thu May 09, 2013 9:49 pm


]]>
2013-05-07T11:17:39+00:00 2013-05-07T11:17:39+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=4114#p4114 <![CDATA[Re: Public chance sampling]]> Statistics: Posted by amax — Tue May 07, 2013 11:17 am


]]>
2013-05-07T06:36:06+00:00 2013-05-07T06:36:06+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=4112#p4112 <![CDATA[Re: Public chance sampling]]> http://poker-ai.org/archive/www.pokerai ... 22&p=42973

Statistics: Posted by spears — Tue May 07, 2013 6:36 am


]]>
2013-05-06T01:04:41+00:00 2013-05-06T01:04:41+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2483&p=4109#p4109 <![CDATA[Public chance sampling]]> http://webdocs.cs.ualberta.ca/~johanson/publications/poker/2012-aamas-pcs/2012-aamas-pcs.pdf). As far as I understand, the variables π_i and π_{-i} are vectors of size (48 choose 2), where each entry represents one of the possible hole-card combinations we can have given the board and indicates how often we have this hand in our range given the history. The strategy σ is similarly a matrix with (48 choose 2) rows and one column for each action. Is this correct so far?
Now what I don't get is the following: how do we, given the two reachability vectors and the hand strength of each hand, perform the efficient terminal-node evaluation?

Statistics: Posted by proud2bBot — Mon May 06, 2013 1:04 am


]]>