Poker-AI.org

Poker AI and Botting Discussion Forum
All times are UTC




Posted: Mon Nov 16, 2015 6:01 pm
New Member

Joined: Mon Nov 16, 2015 5:29 pm
Posts: 3
As a poker player who is generally interested in game theory and AI, I've been curious about this problem for quite a while.

Suppose we are looking at the equity for a specific hand against another specific hand on a hold'em flop. It might be 70/30 or 91/9 or 50/50. But this doesn't really tell the whole story. If we're going to plan out the hand, we should know the equities for every turn card. So a better way to describe the equity situation would be as an (unordered) set of 45 numbers between 0 and 1, representing the equity at each turn card. (However, there are a very limited number of these sets that could actually occur.)

So my first question is: What are some good ways to reduce the dimensionality of these "equity sets"? I'm familiar with deep learning and things like restricted Boltzmann machines as ways to reduce dimensionality, but I'm thinking maybe that's overkill here?
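To make the "equity set" idea concrete, here is a minimal sketch of two simple reductions: sorting the 45 numbers to get a canonical form for the unordered set, and bucketing them into a short histogram. The function names, the bin count, and the example equities are assumptions chosen for illustration, not values from a real evaluator.

```python
def canonical_equity_set(turn_equities):
    """Sort the per-turn equities so an unordered set has one canonical form."""
    return tuple(sorted(turn_equities))


def equity_histogram(turn_equities, bins=10):
    """Bucket per-turn equities into equal-width bins over [0, 1].

    Returns a list of length `bins` whose entries sum to 1.
    """
    counts = [0] * bins
    for eq in turn_equities:
        # Clamp so that an equity of exactly 1.0 lands in the last bin.
        idx = min(int(eq * bins), bins - 1)
        counts[idx] += 1
    total = len(turn_equities)
    return [c / total for c in counts]


# Hypothetical equity set: 9 turn cards complete a draw, 36 do not.
example = [0.95] * 9 + [0.25] * 36
canonical = canonical_equity_set(example)
hist = equity_histogram(example, bins=5)
```

With 5 bins the 45 numbers collapse to 5, at the cost of losing which exact equities fell inside each bin.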

Now we move on from flop hand vs hand to flop hand vs range analysis. This could be modeled as a probability distribution over all the possible equity sets for the hand vs hand problem. Are there techniques to reduce the dimensionality here?

Finally we move from flop hand vs range to flop range vs range analysis. This would be a probability distribution over the flop hand vs range distributions.

In hold'em, intuitively we can think of our range vs range at any given point as comprising a relatively small number of measurements. E.g., there would be the overall equity, how polarized each range is, how "nut-heavy" each range is, how draw-heavy they are, how many of the draws are very strong (more than 9 outs) vs weak (less than 8 outs), etc. Basically, what I'm wondering is: what kind of machine learning techniques, if any, have been applied to extract these kinds of characteristics for range vs range situations?
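As a rough illustration of what such measurements could look like, the sketch below computes three of them from a weighted list of per-hand equities. The thresholds (0.25, 0.75, 0.90) and the definitions of "polarization" and "nut share" are assumptions made up for this example, not established definitions.

```python
def range_features(hand_equities, weights, low=0.25, high=0.75, nut=0.90):
    """Summarize a range as a few numbers.

    hand_equities: equity of each hand in the range vs the opposing range.
    weights: how often each hand appears in the range (any positive scale).
    The thresholds are illustrative assumptions.
    """
    total = sum(weights)
    probs = [w / total for w in weights]
    pairs = list(zip(probs, hand_equities))
    # Overall equity is the weighted mean of the per-hand equities.
    mean_eq = sum(p * e for p, e in pairs)
    # Share of the range that is either very weak or very strong.
    polarization = sum(p for p, e in pairs if e <= low or e >= high)
    # Share of the range that is close to the nuts.
    nut_share = sum(p for p, e in pairs if e >= nut)
    return {"equity": mean_eq, "polarization": polarization, "nut_share": nut_share}


# Hypothetical three-hand range: one near-nut hand, two medium hands, one air hand.
features = range_features([0.95, 0.5, 0.1], [1, 2, 1])
```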

From an AI perspective, I would think any spot in poker (no limit hold'em) can be modeled as:
1. Amount in pot and effective stack sizes
2. My position, size of bet I'm facing
3. The hand vs range characteristics of this spot (think of this as just a list of quantitative features/measurements)
4. The range vs range characteristics of this spot (again just a list of quantitative features/measurements)

Given actual hands, ranges and boards, how might we reduce this to get the features/measurements for number 3 and 4 above? I would think that each street would have a different way of modeling these range vs range characteristics. Pre flop would get even more complex.


Posted: Mon Nov 16, 2015 10:18 pm
Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269
Quote:
So my first question is: What are some good ways to reduce the dimensionality of these "equity sets"? I'm familiar with deep learning and things like restricted Boltzmann machines as ways to reduce dimensionality, but I'm thinking maybe that's overkill here?


I think these questions are too complex for today's computational limitations. Maybe in the next 5-10 years, when quantum processors start to become available, it will not be a problem. There is just no way that I am aware of to precompute equity sets (at least in the manner that you describe) for NLH or PLO, although you could do it for the situations that come up most often, which a number of bot runners do with software like PioSolver.

But all of this is easy enough to compute in real time on a hand-by-hand basis, so much of it is overkill, like you say. To compute a post-flop decision, my PLO bot needs at least these 3 calculations:

1. hand vs range to compute equity
2. range vs range to compute value combos (for bet sizing and hand repping)
3. range vs range to compute HU fold equity (pointless when 3+)

Those 3 are just the basics but it could take as many as 7 or 8 depending on the situation such as 3 way low SPR, river bluff catching, peel and jam etc.
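For readers unfamiliar with these calculations, here is a minimal sketch of the first one (hand vs range equity as a weighted average) plus the break-even fold frequency that a heads-up fold equity estimate would be compared against. This is not the bot's actual code. The hand labels, weights, and per-hand equities are hypothetical, and the fold equity formula assumes a pure bluff that always loses when called.

```python
def hand_vs_range_equity(equity_vs_hand, range_weights):
    """Weighted average of our equity against each hand in the villain's range.

    equity_vs_hand: dict mapping a villain hand label to our equity vs that hand.
    range_weights: dict mapping the same labels to combo counts or weights.
    """
    total = sum(range_weights.values())
    if total == 0:
        raise ValueError("villain range is empty")
    return sum(w * equity_vs_hand[h] for h, w in range_weights.items()) / total


def breakeven_fold_equity(pot, bet):
    """Fold frequency at which a pure bluff has zero EV.

    Solves f * pot - (1 - f) * bet = 0 for f.
    """
    return bet / (pot + bet)


# Hypothetical inputs: two groups of villain hands with made-up equities.
equity = hand_vs_range_equity(
    {"strong_made_hands": 0.2, "draws": 0.7},
    {"strong_made_hands": 6, "draws": 16},
)
needed_folds = breakeven_fold_equity(pot=100, bet=50)
```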

Quote:
Given actual hands, ranges and boards, how might we reduce this to get the features/measurements for number 3 and 4 above? I would think that each street would have a different way of modeling these range vs range characteristics. Pre flop would get even more complex.


I use the same methods from flop to river, so there is not really any significant difference in modeling/methods that I am aware of. OTR things are a bit different, but the concepts are the same. My methods for preflop are different from post flop, but they are rather straightforward.


Posted: Tue Nov 17, 2015 10:39 am
Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
slocketping wrote:
Suppose we are looking at the equity for a specific hand against another specific hand on a hold'em flop. It might be 70/30 or 91/9 or 50/50. But this doesn't really tell the whole story. If we're going to plan out the hand, we should know the equities for every turn card. So a better way to describe the equity situation would be as an (unordered) set of 45 numbers between 0 and 1, representing the equity at each turn card. (However, there are a very limited number of these sets that could actually occur.)

So my first question is: What are some good ways to reduce the dimensionality of these "equity sets"? I'm familiar with deep learning and things like restricted Boltzmann machines as ways to reduce dimensionality, but I'm thinking maybe that's overkill here?

UofA use Earth Mover's Distance and k-means to cluster hands: http://poker-ai.org/phpbb/viewtopic.php?f=18&t=2381 Personally I think variance (and maybe skew and kurtosis too) in equity is a lot simpler and might be worth trying. I've found ways of reducing the dimensionality, but I'm sceptical that deep learning would find them because the dimensionality of the original problem is so high.
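A minimal sketch of both ideas, assuming the equity set is a plain list of per-turn equities: the first function computes the mean, variance, skewness and excess kurtosis (population versions), and the second computes a one-dimensional Earth Mover's Distance between two histograms on the same equal-width bins. This is an illustration only, not UofA's implementation, and the function names are made up.

```python
import math


def equity_moments(equities):
    """Return (mean, variance, skewness, excess kurtosis) of a list of equities."""
    n = len(equities)
    mean = sum(equities) / n
    var = sum((e - mean) ** 2 for e in equities) / n
    if var == 0:
        # A constant equity set has no spread, so higher moments are reported as 0.
        return mean, 0.0, 0.0, 0.0
    std = math.sqrt(var)
    skew = sum(((e - mean) / std) ** 3 for e in equities) / n
    kurt = sum(((e - mean) / std) ** 4 for e in equities) / n - 3.0
    return mean, var, skew, kurt


def emd_1d(hist_a, hist_b):
    """1-D Earth Mover's Distance between two normalized histograms.

    Both histograms must use the same equal-width bins. The result is in
    units of bin widths: the running surplus is carried to the next bin.
    """
    distance = 0.0
    carried = 0.0
    for a, b in zip(hist_a, hist_b):
        carried += a - b
        distance += abs(carried)
    return distance


moments = equity_moments([0.0, 1.0])
far_apart = emd_1d([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
```

The four moments give a fixed-size feature vector, while the EMD gives a pairwise distance that a clustering step could consume.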

slocketping wrote:
Now we move on from flop hand vs hand to flop hand vs range analysis. This could be modeled as a probability distribution over all the possible equity sets for the hand vs hand problem. Are there techniques to reduce the dimensionality here?

Finally we move from flop hand vs range to flop range vs range analysis. This would be a probability distribution over the flop hand vs range distributions.

Why flop range vs range when you know flop hand? I don't necessarily disagree, but want to check we are on the same wavelength.

slocketping wrote:
In hold'em, intuitively we can think of our range vs range at any given point as comprising a relatively small number of measurements. E.g., there would be the overall equity, how polarized each range is, how "nut-heavy" each range is, how draw-heavy they are, how many of the draws are very strong (more than 9 outs) vs weak (less than 8 outs), etc. Basically, what I'm wondering is: what kind of machine learning techniques, if any, have been applied to extract these kinds of characteristics for range vs range situations?


Apart from the work quoted above I don't think anything has been published. I have my own approach which reduces the problem size to something quite manageable, but it doesn't use a machine learning approach. Rather, I looked at what calculations had to be done to calculate everything completely accurately and then searched for good approximations. It doesn't use poker domain knowledge, because I don't have any. It has taken at least five years. I'd much rather have used a machine learning approach but couldn't find one that did the job.

slocketping wrote:
From an AI perspective, I would think any spot in poker (no limit hold'em) can be modeled as:
1. Amount in pot and effective stack sizes
2. My position, size of bet I'm facing
3. The hand vs range characteristics of this spot (think of this as just a list of quantitative features/measurements)
4. The range vs range characteristics of this spot (again just a list of quantitative features/measurements)

Given actual hands, ranges and boards, how might we reduce this to get the features/measurements for number 3 and 4 above? I would think that each street would have a different way of modeling these range vs range characteristics. Pre flop would get even more complex.


Once post flop is solved isn't pre-flop a piece of cake? Just run and cache all possible post flops?


Posted: Tue Nov 17, 2015 6:52 pm
Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269
Quote:
Why flop range vs range when you know flop hand? I don't necessarily disagree, but want to check we are on the same wavelength.


Range vs Range is really important as you need it to compute the proper bet size and also to calculate GTO bluffs. You use RvR to figure out how many value combos you have in your perceived range which in turn affects how often you can bluff at GTO to make the villain indifferent to calling. The harder the board hits your perceived range the lower your bet size can be and vice versa. The actual hand has nothing to do with bet sizing which is a mistake many people make.
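The link between value combos, bet size and bluffing frequency can be sketched with the standard indifference arithmetic. The sketch assumes a simplified river spot: the betting range is perfectly polarized (value bets always win when called, bluffs always lose) and the villain can only call or fold. The function names and the numbers in the example are hypothetical.

```python
def bluff_fraction(pot, bet):
    """Fraction of the betting range that can be bluffs at indifference.

    The caller risks `bet` to win `pot + bet`, so indifference requires
    bluffs * (pot + bet) == value * bet, i.e. bluffs / (bluffs + value)
    equals bet / (pot + 2 * bet).
    """
    return bet / (pot + 2 * bet)


def max_bluff_combos(value_combos, pot, bet):
    """Number of bluff combos that balances a given number of value combos."""
    return value_combos * bet / (pot + bet)


def minimum_defense_frequency(pot, bet):
    """How often the villain must continue to make a pure bluff break even."""
    return pot / (pot + bet)


# Hypothetical spot: 20 value combos, pot-sized bet of 100 into 100.
fraction = bluff_fraction(pot=100, bet=100)
bluffs = max_bluff_combos(value_combos=20, pot=100, bet=100)
mdf = minimum_defense_frequency(pot=100, bet=100)
```

Under these assumptions a smaller bet supports fewer bluffs per value combo, which is why the count of value combos in the perceived range matters for sizing.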


Posted: Tue Nov 17, 2015 7:38 pm
Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642
@Shalako Agreed.


Posted: Tue Nov 17, 2015 9:41 pm
New Member

Joined: Mon Nov 16, 2015 5:29 pm
Posts: 3
shalako wrote:
Quote:
Why flop range vs range when you know flop hand? I don't necessarily disagree, but want to check we are on the same wavelength.


Range vs Range is really important as you need it to compute the proper bet size and also to calculate GTO bluffs. You use RvR to figure out how many value combos you have in your perceived range which in turn affects how often you can bluff at GTO to make the villain indifferent to calling. The harder the board hits your perceived range the lower your bet size can be and vice versa. The actual hand has nothing to do with bet sizing which is a mistake many people make.


As a live poker player, I both agree and disagree. When I'm choosing a sizing for a value bet, I'll look both at the range I'm attacking (that I think will call) as well as how my bet is perceived, and also things like does my sizing cap my range (making it more likely that opponent will bluff raise), etc. So I think there's a mix of sizings, even in GTO play. But generally, yes, if we have a very strong range, we generally have to lower our bet size.


Posted: Tue Nov 17, 2015 9:57 pm
New Member

Joined: Mon Nov 16, 2015 5:29 pm
Posts: 3
spears wrote:
I don't necessarily disagree, but want to check we are on the same wavelength.

Once post flop is solved isn't pre-flop a piece of cake? Just run and cache all possible post flops?


My "approach" to the problem of creating a poker bot would look like the following. Note, I'm not all that familiar with the state of the art, though I have at times read through UofA's papers and things like CFRM. This is just my off-the-cuff take on how I would tackle the problem as a first shot.

1. We start out with an actual game state including hole cards, community cards, pot size, stack size, a range, and an opponent range.
2. We take the actual game state and apply a transformation to reduce it to an abstracted state (including the hand v range and range v range features I mentioned in the OP).
3. From hand histories (or through self-play), we apply machine learning techniques to approximate EV given: (a) the abstract game state, (b) the bet size / action.
4. We choose the action with the highest EV. In order to have a mixed strategy, we might make use of many agents as well as look at expected variance in a single agent's EV estimation to determine a distribution among all possible actions. This of course gets very complicated if we want to approach GTO play. My idea is to have a continuous self-learning model so that the action probabilities might be very exploitable at any single decision, but because they would change each time a decision is made, they shouldn't be nearly as exploitable.
5. Once we've decided upon an action, we use Bayesian inference to update our range. Or if it was villain's action, we update their range using this model.
6. Now that we have a concrete game state again where ranges are assigned, we can go back to (2) and abstract the state again for the next state/action.
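Step 5 can be sketched as a direct application of Bayes' rule: each hand's posterior weight is its prior weight times the probability that the hand takes the observed action, renormalized. The hand labels and the action probabilities below are hypothetical; in practice they would come from whatever model is learned in step 3.

```python
def update_range(prior, action_likelihood):
    """Bayesian range update after observing an action.

    prior: dict mapping a hand label to its prior probability.
    action_likelihood: dict mapping a hand label to P(observed action | hand).
    Hands missing from action_likelihood are assumed never to take the action.
    """
    unnormalized = {
        hand: p * action_likelihood.get(hand, 0.0) for hand, p in prior.items()
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observed action has zero probability under the model")
    return {hand: w / total for hand, w in unnormalized.items()}


# Hypothetical villain range and made-up raise frequencies per hand class.
prior = {"strong": 0.2, "medium": 0.5, "weak": 0.3}
raise_likelihood = {"strong": 0.8, "medium": 0.1, "weak": 0.3}
posterior = update_range(prior, raise_likelihood)
```

In this made-up example the raise shifts weight toward strong hands, keeps weak hands (bluffs) roughly where they were, and shrinks the medium part of the range.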


Anyway, hopefully this clarifies where I'm coming from with this question. I understand that this is probably a lot different than how most bots work (I guess they are forced basically to solve whole game trees in a reduced version of the game), but I think this approach is straightforward and intuitive.


Posted: Tue Nov 17, 2015 11:26 pm
Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269
Quote:
As a live poker player, I both agree and disagree. When I'm choosing a sizing for a value bet, I'll look both at the range I'm attacking (that I think will call) as well as how my bet is perceived, and also things like does my sizing cap my range (making it more likely that opponent will bluff raise), etc. So I think there's a mix of sizings, even in GTO play. But generally, yes, if we have a very strong range, we generally have to lower our bet size.


Hey. I never thought about sizing capping the range, but you're exactly right. I am not sure how I can fix this, but I will think on it. Can you give me some examples just so I can wrap my head around it?

I like GTO as a good foundation to follow, but realistically it's rather tough to achieve (especially live). My bot follows it generally (like my sizing) but it deviates quite a bit. One example is c-betting. Generally the field over-folds, so you have to bet-fold at a higher rate than GTO, otherwise you're missing out on a ton of value. So I guess it's a combo of GTO and exploitative concepts. OTR a lot of GTO goes out the window and you have to do more range exploitation anyway, I guess.

So now you got me thinking of ways I can improve it, and you have me hooked on more sizing options. It's also not taking advantage of thin value OTR. I am not quite sure what to do about that yet.

