Poker-AI.org

Re: Merging Strategies

2013-11-28T10:41:21+00:00

You should distinguish between game, player, strategy, etc. Tolstoy has a phrase for this (everything mixed up ... people, horses).
Let me dot all the i's.
1. Game. Poker game. There is no game in general, and no poker game in general. There is only a particular poker game.
Or N particular poker games (if you want to use the mixed strategy idea in your research): link

2. Player strategy. See (1). A strategy is one of a finite number, because a particular poker game has finitely many states, and in each state the player has to choose a move from a finite set of options. One strategy will win or lose a fixed amount (on average) against another strategy.

3. Poker tournaments. Or life. Poker players usually play not one game but many (5000 hands, for example). Poker players are also people. In each hand a player can choose any poker strategy (mixed or not). It is important to understand that a mixed strategy is not the same thing as a player choosing that strategy in every game (hand) of a tournament.
See the link at (1) for what a mixed strategy is.

4. Poker players. Each poker player chooses a particular strategy in each hand, but he can also vary his strategy from hand to hand.
A poker player is a person living in a world with infinitely many reasons why he might vary his strategy.

5. Poker player model. If you have read 1-4, it's simple. No matter how the player looks at the game, his strategy in the current hand boils down to a single number. You can then come up with as many reasons as you want for how the player will change it through the tournament (life).

Statistics: Posted by nefton — Thu Nov 28, 2013 10:41 am

Re: Merging Strategies

2013-05-27T10:15:19+00:00

I think what you're talking about, jerrys, has to do more with opponent modeling than optimally merging (known) strategies (unless I completely misunderstood your point). You'll find similar topics in the older U of A papers that discuss op range estimation via betting histories -- "something, something ... in a hostile environment."

Statistics: Posted by cantina — Mon May 27, 2013 10:15 am

Re: Merging Strategies

2013-05-24T16:24:45+00:00

Hi,

Long-term lurker here. This thread seems very interesting and I finally felt obliged to register. I have experience with machine learning and (especially) its limitations, so I appreciate the idea of using different modules for specific feature extraction and then combining their outputs on a higher level. That's how our brain works, isn't it?

As a poker player I've also thought a lot about the marriage between betting lines and equity. Like all marriages, it will only work if the two share something in common. One possibility is that these are really two independent domains, so the marriage will (mostly) never work. But there is another: fold equity.

I'm referring to this COTW thread on 2+2 and especially the attached graph.

So basically, what happens when we take particular lines against villain is that we shift his "natural" foldequity=f(equity) graph up and down. At some point we hit villain's threshold and he folds. Everybody has a threshold for his perceived equity below which he's not interested in playing a hand. For example, against fish we have no fold equity if they catch anything on the flop (or even have an ace in their hole cards). So the best (GTO?) strategy against them is to value bet every street.

Now, if we could model an opponent's graph, which I suspect will have an interesting shape common to particular opponent types, we could somehow restrict villain's range based on the betting actions and thus provide a two-way conduit between the two domains.

Poker player example: Hero holds AJ, flop comes Jxx. Villain is a TAG who knows where the fold button is. We have TPTK, great equity, great fold equity. We bet pot, villain reraises. Now many would agree that, bluffing aside, since we didn't realise our theoretical fold equity, we can safely limit villain's range to a made hand he's trying to protect (e.g. an overpair) or a draw for which he thinks he has no implied odds to continue (e.g. an obvious FD, so if he hits we won't pay him). So at this point we can perhaps correct our equity vs villain's range by taking out the improbable holdings, coming closer to our true equity in this hand and, from there, to the most +EV move.

More or less this type of reasoning can be automated to link the card holdings with betting lines and perhaps a probabilistic algorithm sitting on top of the two inputs can create a good strategy.
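One way to sketch this kind of range correction is a Bayes-style reweighting: multiply each holding's prior weight by how likely villain is to take the observed action with it, renormalize, and recompute equity against the narrowed range. The hand classes, likelihoods, and equities below are invented for illustration only; they are not taken from the COTW thread or from any real opponent model.

```python
from typing import Dict

def update_range(prior: Dict[str, float],
                 action_likelihood: Dict[str, float]) -> Dict[str, float]:
    """Reweight villain's range by P(observed action | holding) and renormalize."""
    raw = {h: w * action_likelihood.get(h, 0.0) for h, w in prior.items()}
    total = sum(raw.values())
    if total == 0.0:
        # The action is impossible under the model: keep the prior unchanged.
        return dict(prior)
    return {h: w / total for h, w in raw.items()}

def equity_vs_range(range_weights: Dict[str, float],
                    equity_vs_hand: Dict[str, float]) -> float:
    """Hero's equity as the range-weighted average of per-holding equities."""
    return sum(w * equity_vs_hand[h] for h, w in range_weights.items())

# Hypothetical numbers: villain's range before he raises our pot-sized bet.
prior = {"overpair": 0.1, "flush_draw": 0.2, "air": 0.7}
# Hypothetical P(raise | holding) for a TAG who folds his weak hands.
raise_likelihood = {"overpair": 0.8, "flush_draw": 0.5, "air": 0.05}
# Hypothetical hero equity against each class of holding.
hero_equity = {"overpair": 0.2, "flush_draw": 0.6, "air": 0.9}

posterior = update_range(prior, raise_likelihood)
equity_before = equity_vs_range(prior, hero_equity)
equity_after = equity_vs_range(posterior, hero_equity)
```

With these made-up inputs the raise moves most of the weight onto the overpair and the draw, so hero's equity against the narrowed range is clearly lower than against the starting range.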

Statistics: Posted by jerrys — Fri May 24, 2013 4:24 pm

Re: Merging Strategies

2013-05-24T03:17:05+00:00

I thought about the memory issues as well, and I can actually compress the cumulative strategies into single byte references (or use some other method of modeling) which gives a close-enough representation of the strategies while reducing their size to 1/8th the original. Slumbot 2012 did the same thing to fit his massive strategy on disk for the competition. I actually planned on encoding the strategies into NNs then running the above CFRM code on them so I don't have to worry about memory or game state discrepancies.
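A rough sketch of the size reduction, assuming the strategy is stored as 8-byte doubles and each probability is rounded to one of 256 levels. This is only one possible reading of "single byte references"; the function names are hypothetical, and it is not a description of how Slumbot stored its strategy.

```python
import struct
from typing import List, Sequence

def quantize(probs: Sequence[float]) -> bytes:
    """Store each probability as one byte (0..255) instead of an 8-byte double."""
    return bytes(min(255, max(0, round(p * 255))) for p in probs)

def dequantize(data: bytes) -> List[float]:
    """Recover an approximate distribution; renormalize to absorb rounding error."""
    total = sum(data)
    if total == 0:
        # Every entry rounded to zero: fall back to a uniform distribution.
        return [1.0 / len(data)] * len(data)
    return [b / total for b in data]

original = [0.5, 0.25, 0.25]
packed = quantize(original)
restored = dequantize(packed)

bytes_as_doubles = struct.calcsize("d") * len(original)  # 8 bytes per entry
bytes_packed = len(packed)                               # 1 byte per entry
```

The restored values are only accurate to roughly 1/255, which is the "close-enough representation" trade-off.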

I think there has to be something mutually inclusive about the strategies by which you can tie information sets to one another, in all dimensions, as the dimensions themselves are only loosely related (i.e. hands to bet nodes). A strategy without cards and a strategy without bets have no mutual domain from which to extrapolate further. Think of two circles, A and B, drawn on a piece of paper. If they overlap to some degree, you can say that some part of circle A outside the overlap relates to some other part of circle A that IS in the overlap, and therefore relates to circle B. If the circles aren't touching, or there isn't enough overlap, then it's hard or impossible to relate them because you have no decent reference by which to make the comparison.

There is probably some theoretical optimum domain overlap by which you can merge strategies, whereby it gives you the largest domain exclusivity while maintaining relational value in order to produce the best possible (merged) performance. Graph theory, maybe?

Statistics: Posted by cantina — Fri May 24, 2013 3:17 am

Re: Merging Strategies

2013-05-23T22:19:05+00:00

Hm isn't that how Hyperborean worked? Or at least the version that played against the human pros?
They had multiple strategies and kept EVs for each and picked the best one over time.
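As a generic illustration of "keep an EV per strategy and pick the best one over time", here is a minimal running-average selector. It is an assumption-laden sketch with hypothetical names, not a description of how Hyperborean actually worked.

```python
from typing import List

class StrategySelector:
    """Tracks the average payoff of each candidate strategy."""

    def __init__(self, n_strategies: int) -> None:
        self.totals: List[float] = [0.0] * n_strategies
        self.counts: List[int] = [0] * n_strategies

    def record(self, index: int, payoff: float) -> None:
        """Add the payoff of one hand played with strategy `index`."""
        self.totals[index] += payoff
        self.counts[index] += 1

    def mean_ev(self, index: int) -> float:
        """Average payoff so far; 0.0 if the strategy has not been tried."""
        if self.counts[index] == 0:
            return 0.0
        return self.totals[index] / self.counts[index]

    def best(self) -> int:
        """Try every strategy once, then return the one with the highest mean."""
        for i, count in enumerate(self.counts):
            if count == 0:
                return i
        return max(range(len(self.counts)), key=self.mean_ev)
```

A real selector would also need some exploration, since purely greedy choice can lock onto a strategy that got lucky early.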

Of course, this is more of an exploitive approach (is this or that strategy better against this opponent?), and I think what you are looking for is a state-dependent choice of strategy. Honestly, I don't see why this shouldn't be possible, but I don't recall any paper actually doing that. I think the problem is that, to really get the benefit of differently sized abstractions, you need to keep another strategy to store the cumulative strategy, and it is as big as the two state spaces multiplied with each other.

I played around with an extreme of that idea:
Strategy 1 abstracts cards perfectly but only allows one bet size and one bet per round.
Strategy 2 abstracts bet sizes and betting sequences perfectly but disregards cards entirely.
In theory we can then merge these and benefit from only having to store two small trees with really high precision. Sounds great, right? Well, I couldn't get it to work. Either you just multiply action frequencies with each other, but then you're mixing totally different information states into one value, and I don't think it would work well in practice.
Or we can do what you proposed and run a learning process to find how much of each strategy to mix together for each state. But now we need another tree that keeps all the different information states we want to differentiate, which gets really large if we want this to be useful.
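Both options can be sketched in a few lines. The first function multiplies the two strategies' action frequencies and renormalizes; it assumes, for simplicity, that both strategies share the same action labels, which the two-tree setup described here would not give you for free. The second function just counts how many mixing weights the learned alternative would need. All names and numbers are hypothetical.

```python
from typing import Dict

def product_merge(card_probs: Dict[str, float],
                  bet_probs: Dict[str, float]) -> Dict[str, float]:
    """Multiply action frequencies from two strategies and renormalize."""
    actions = sorted(card_probs.keys() & bet_probs.keys())
    raw = {a: card_probs[a] * bet_probs[a] for a in actions}
    total = sum(raw.values())
    if total == 0.0:
        # The strategies never agree on an action: fall back to uniform.
        return {a: 1.0 / len(actions) for a in actions}
    return {a: p / total for a, p in raw.items()}

def mixing_table_size(n_card_states: int, n_bet_sequences: int) -> int:
    """Number of per-state mixing weights needed if every combination is kept."""
    return n_card_states * n_bet_sequences

# The card-aware strategy leans towards folding, the bet-aware one away from it.
merged = product_merge(
    {"fold": 0.5, "call": 0.25, "raise": 0.25},
    {"fold": 0.2, "call": 0.4, "raise": 0.4},
)
table_entries = mixing_table_size(1_000, 50_000)
```

In this example the two opinions cancel into a uniform distribution, which illustrates how the product throws away the reason each strategy preferred an action. The table size shows why learned per-state weights give back the memory that the two small trees saved.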

So I couldn't really get the benefit working, which I thought would be memory reduction. If the benefit you are trying to achieve is better accuracy and you don't really care about large memory consumption, then yes, I think this is a good method, especially to speed up the learning process, as I could see this converging a lot faster than a single large strategy (law of averages and such).

Statistics: Posted by Coffee4tw — Thu May 23, 2013 10:19 pm

Re: Merging Strategies

2013-05-22T09:58:59+00:00

Thinking about this further, you can actually use CFRM to decide the optimal distribution of strategies. This may also tie into previous discussions about using a formulaic bet size. Instead of just a bet size, it's essentially a formulaic everything.

My terribly sloppy pseudo-code (assumes three strategies to choose from):

Code:

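' u_s(j): utility returned by training child j of this node
Dim u_s(n) As Double
For j = 0 To n
  u_s(j) = children(j).Train()
Next j

' s(i): current probability of picking strategy i, derived from the regrets r
Dim s(2) As Double
GetStrategy(r, s)

' U(i): value of following strategy i at this node
' s_s(i, j): presumably the probability that fixed strategy i gives to child j
Dim EV As Double = 0
Dim U(2) As Double
For i = 0 To 2
  For j = 0 To n
    U(i) += u_s(j) * s_s(i, j)
  Next j
  EV += U(i) * s(i)
Next i

' Accumulate regret for not having played strategy i
' (usual CFR sign convention: strategy value minus node EV)
For i = 0 To 2
  r(i) += U(i) - EV
Next i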

All it's doing is treating the strategies as additional chance nodes.
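For anyone who wants to run the idea, here is a minimal Python sketch of one regret update at a single node, assuming the candidate strategies are fixed action distributions and the child utilities are already known. The function names are hypothetical, and the regret uses the usual convention of strategy value minus node EV.

```python
from typing import List, Sequence

def regret_matching(regrets: Sequence[float]) -> List[float]:
    """Turn cumulative regrets into a distribution over the candidate strategies."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    # No positive regret yet: pick uniformly.
    return [1.0 / len(regrets)] * len(regrets)

def update_strategy_regrets(regrets: List[float],
                            child_utils: Sequence[float],
                            fixed_strategies: Sequence[Sequence[float]]) -> float:
    """Update `regrets` in place for one node visit and return the node EV."""
    mix = regret_matching(regrets)
    # Value of following each fixed strategy: its action probabilities
    # weighted by the utility of the corresponding child.
    values = [sum(p * u for p, u in zip(probs, child_utils))
              for probs in fixed_strategies]
    ev = sum(w * v for w, v in zip(mix, values))
    for i, value in enumerate(values):
        regrets[i] += value - ev
    return ev

# Three candidate strategies over two child actions (illustrative numbers).
regrets = [0.0, 0.0, 0.0]
strategies = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
first_ev = update_strategy_regrets(regrets, [1.0, -1.0], strategies)
```

After this single update all positive regret sits on the first strategy, so the next call to `regret_matching` would play it with probability 1.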

Statistics: Posted by cantina — Wed May 22, 2013 9:58 am

Merging Strategies

2013-05-22T05:15:58+00:00

I was wondering if anybody has tried this, or if there is some literature about it somewhere? As I crunch different abstractions, generally speaking, I wonder if certain strategies wouldn't perform better given certain circumstances. For example, I can crunch an abstraction with fewer bet-type nodes but larger hand value buckets (and any shade of gray thereof). Then, when a given game is being played, if the opponent uses a narrow range of actions, I could go with the strategy that employs larger hand value buckets while maintaining an accurate state translation. Thoughts?

Additionally, could I simply combine the two strategies by some weighted average if both games reflect the same history up to a given point in the hand? How to define that average? Might it be better to play strategy A at 90% and strategy B at 10% in certain histories but 50/50 at others? I'm not talking about methods of exploitation, just a way to generally improve performance by combining game abstractions solved with different depths/types of information.
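To make the question concrete, here is a minimal sketch of a history-dependent weighted average of two strategies. The per-history weights and history labels are hypothetical placeholders; how to choose those weights is exactly the open question.

```python
from typing import Dict

Strategy = Dict[str, float]  # action -> probability

def blend(strategy_a: Strategy, strategy_b: Strategy, weight_a: float) -> Strategy:
    """Mix two action distributions: weight_a on A, the remainder on B."""
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must be in [0, 1]")
    actions = sorted(set(strategy_a) | set(strategy_b))
    return {
        a: weight_a * strategy_a.get(a, 0.0)
        + (1.0 - weight_a) * strategy_b.get(a, 0.0)
        for a in actions
    }

# Hypothetical weights: trust strategy A 90% in one history, 50% in another.
WEIGHT_A_BY_HISTORY = {"raise-call": 0.9, "check-bet": 0.5}

def blended_strategy(history: str, strategy_a: Strategy, strategy_b: Strategy,
                     default_weight: float = 0.5) -> Strategy:
    """Look up the weight for this betting history and blend accordingly."""
    weight_a = WEIGHT_A_BY_HISTORY.get(history, default_weight)
    return blend(strategy_a, strategy_b, weight_a)

mixed = blended_strategy(
    "raise-call",
    {"fold": 0.2, "call": 0.8},
    {"fold": 0.6, "call": 0.4},
)
```

If both inputs are valid probability distributions, the blend is one too, so no renormalization is needed.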

Statistics: Posted by cantina — Wed May 22, 2013 5:15 am