Heads-up Limit Hold’em Poker is Solved (thanks to CFR+)

In 2014, University of Alberta researchers used the CFR+ algorithm to essentially solve heads-up limit hold’em, which is no small feat. The game has about 3.19×10^14 decision points! It’s like attempting to count all of the stars in a galaxy, except instead of stars, you have intricate poker hands and betting tactics.

Imagine you’re playing a game of poker against yourself, but instead of for pleasure, you’re trying to become the best poker player you can be. That is precisely what the CFR+ (Counterfactual Regret Minimization Plus) algorithm accomplishes. It starts out with no idea how to play and makes random moves. After each game, it reflects on its decisions and wonders, “If I had done this instead of that, would I have done better?” It then alters its strategy to favor the better moves more frequently.

Think of it as learning to ride a bike by repeatedly falling off. Each time you fall, you learn what not to do next time. CFR+ does this again and again across the game’s many situations, fine-tuning its strategy to reduce its “regret”: the gap between what its chosen action earned and what the best alternative would have earned. It gradually approaches a strategy that no opponent can beat, just as you finally stop falling off your bike and begin riding smoothly. CFR+ converges toward a Nash equilibrium by averaging its strategies across all of these iterations, resulting in a super-smart approach that cannot lose in the long run, even against the toughest opponents.
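To make the regret idea concrete, here is a minimal sketch of the regret-matching+ update that gives CFR+ its name, run on a tiny two-action zero-sum matrix game instead of poker. The payoff matrix, the function names, and the iteration count are all assumptions made up for illustration. Real CFR+ applies this kind of update at every decision point of the poker game tree using counterfactual values, which this sketch does not attempt.

```python
def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    total = sum(regrets)
    if total <= 0:
        return [1.0 / len(regrets)] * len(regrets)  # no regret yet: play uniformly
    return [r / total for r in regrets]


def solve_matrix_game(payoff, iterations=20000):
    """Self-play with regret matching+ on a two-player zero-sum matrix game.

    payoff[a][b] is the row player's payoff; the column player gets -payoff[a][b].
    Returns the weighted-average strategies [row_strategy, column_strategy].
    """
    sizes = [len(payoff), len(payoff[0])]
    regrets = [[0.0] * sizes[0], [0.0] * sizes[1]]
    strategy_sum = [[0.0] * sizes[0], [0.0] * sizes[1]]

    for t in range(1, iterations + 1):
        for player in (0, 1):  # alternating updates, one player at a time
            own = strategy_from_regrets(regrets[player])
            opp = strategy_from_regrets(regrets[1 - player])
            if player == 0:
                values = [sum(payoff[a][b] * opp[b] for b in range(sizes[1]))
                          for a in range(sizes[0])]
            else:
                values = [-sum(payoff[a][b] * opp[a] for a in range(sizes[0]))
                          for b in range(sizes[1])]
            expected = sum(p * v for p, v in zip(own, values))
            # The "+" part: cumulative regrets are clipped at zero.
            regrets[player] = [max(r + v - expected, 0.0)
                               for r, v in zip(regrets[player], values)]
            # Later iterations get more weight in the average strategy.
            for a, p in enumerate(own):
                strategy_sum[player][a] += t * p

    return [[s / sum(sums) for s in sums] for sums in strategy_sum]


# Hypothetical toy game: a lopsided version of matching pennies.
row_strategy, col_strategy = solve_matrix_game([[2, -1], [-1, 1]])
print(row_strategy, col_strategy)
```

For this toy game the equilibrium can be worked out by hand: each player should pick their first action 40% of the time, and the averaged strategies should drift toward that mix as the iterations pile up.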

But what truly tickles me is that the solution confirms some long-held poker wisdom while refuting other bits. For example, the algorithm almost never ‘limps’ (just calls the big blind instead of raising), something many experienced players would sagely nod at. However, it also shows that it is occasionally preferable to cap the betting (make the final permitted raise) with a pair of twos rather than aces. Imagine a seasoned poker player reading this and exclaiming, “No way!” at the screen.

This study does more than merely solve a game; it also provides insight into decision-making and strategy under uncertainty. The implications extend well beyond the poker table. These algorithms can be used in fields such as security and medical decision-making, where uncertainty and strategy are critical. Who knew that skills honed over late-night poker sessions could be applied to something as important as airport security protocols?

Turing once defended his work on game-playing algorithms by admitting that a big part of the motivation was the sheer fun of it. This study exemplifies that spirit. Solving poker was more than just a scientific achievement; it was a fascinating intellectual challenge. If only we could all find jobs that let us play games all day in the name of science!

Finally, it is obvious that, while humans have been playing poker for centuries, we are only now beginning to appreciate the complexity of the strategy involved. And who knows? Perhaps the next breakthrough will come from examining another game, such as Monopoly, where we can finally figure out the ideal approach, allowing family game nights to finish in peace rather than board-flipping frustration…

Miximax-based betting approach

I’m reminded of a fantastic thesis (Davidson A. – Opponent Modeling in Poker: Learning and Acting in a Hostile and Uncertain Environment) that I discovered years ago. Back then, I was immersed in coding projects, and each new insight felt like a mini-revelation. Here are some of the highlights as I remember them, seen through a rather blurry lens.

Davidson compared a Miximax-based betting approach against three other programs: FBS-Poki, SBS-Poki, and ArtBot. ArtBot, as it turns out, is a very loose, fairly passive player. Imagine playing against someone who is difficult to read because they are defensive and unpredictable. Wins against ArtBot come in small pots, not the massive victories you might hope for. ArtBot beats FBS-Poki, who is a bit of a pushover, by roughly +0.35 small bets per hand.

It’s like having a variety of sparring partners to put your moves to the test, each with their own unique style. Davidson’s Miximax player was put to the test here, starting from zero and occasionally with a pre-built strategy from prior games. This isn’t your typical minimax player; strictly speaking it’s a Miximix, a slightly modified Miximax that doesn’t always go for the highest-EV (expected value) move. Think of it as adding a little randomness to avoid becoming too predictable, a necessary quirk that keeps all options open.
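Here is a rough sketch of the difference between the two, as I understand it. At the opponent’s decision points the search averages over what the opponent model predicts; at our own decision points Miximax takes the best action, while Miximix spreads probability across actions. The `Node` layout, the toy tree, and especially the softmax mixing rule with its `temperature` parameter are my own assumptions for illustration and are not taken from Davidson’s thesis.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                                      # "leaf", "ours", or "opponent"
    value: float = 0.0                             # payoff in small bets (leaves only)
    children: dict = field(default_factory=dict)   # action name -> Node
    probs: dict = field(default_factory=dict)      # opponent nodes: action -> modeled probability


def miximax(node):
    """Expected value: average over modeled opponent actions, maximize over ours."""
    if node.kind == "leaf":
        return node.value
    values = {action: miximax(child) for action, child in node.children.items()}
    if node.kind == "opponent":
        return sum(node.probs[action] * v for action, v in values.items())
    return max(values.values())


def miximix_policy(node, temperature=1.0):
    """Mixed strategy at one of our nodes (softmax over EVs, an assumed rule)."""
    values = {action: miximax(child) for action, child in node.children.items()}
    best = max(values.values())
    weights = {a: math.exp((v - best) / temperature) for a, v in values.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


# Hypothetical spot: the opponent model says a bet gets a fold 30% of the time.
after_bet = Node("opponent",
                 children={"fold": Node("leaf", 2.0), "call": Node("leaf", 0.5)},
                 probs={"fold": 0.3, "call": 0.7})
root = Node("ours", children={"fold": Node("leaf", -1.0),
                              "check": Node("leaf", 0.4),
                              "bet": after_bet})

print(miximax(root))          # EV of always taking the best action
print(miximix_policy(root))   # probabilities for fold / check / bet
```

A low `temperature` makes the mixed policy behave almost like plain Miximax, while a higher one trades a little expected value for unpredictability.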

So, what are the results? Miximax suffers a little during the first few thousand hands, much like a new student navigating their first few games. However, once it understands the opponent’s playing style, it begins to accumulate chips. Against FBS-Poki and SBS-Poki, it averages +0.4 to +0.5 small bets per hand. Against ArtBot it struggles more, with results ranging from +0.1 to +0.2 small bets per hand. ArtBot’s fast-changing style appears to expose flaws in Miximax’s context trees.

Here’s an interesting part of the thesis: Davidson adds that this is only scratching the surface. The AI needs to learn faster, either by improving the context trees or by borrowing ideas from previous opponent models. The biggest challenge is extending this to multiplayer games, where the game tree becomes extremely complex. Davidson implies that handling more players may be impractical unless some major pruning is done. That was hard to manage when desktop PCs ran at around 1GHz and 4GB of RAM was considered a lot. Today’s hardware makes it far less of an obstacle.

I recall this brilliant ending in which Davidson quotes Josh Billings: “Life consists not in holding good cards, but in playing well those you do hold”. It serves as a reminder that in poker, as in life, making the most of what you have frequently outweighs pure luck.

Miximax

There’s this graph that displays how Miximax performs versus various opponents. It’s very clear that as the AI learns more about its opponents, it adjusts and improves its win rate. It’s unstable at first, but it settles down as it collects more data, much like how humans adapt via experience.

So, Davidson’s work demonstrates that developing a poker AI is more than just crunching numbers. It’s about navigating through a fog of uncertainty and making educated guesses based on trends and probabilities. If you enjoy poker and AI, or simply seeing how machines can play games, this is a must-read.

Poker AI for Texas Hold’em

I looked into some old research on poker AI, but after a long day and a few beers, my brain is a little fuzzy. Anyway, I read this article about AI for Texas Hold’em, and it made me think about how we can train machines to bluff better than the average Joe.

First and foremost, poker is more than just a card game. It’s a magical combination of statistics, psychology, and a touch of voodoo. Have you ever tried to guess if the player across the table will raise or fold? It’s like guessing what my cat wants for dinner: impossible. So, naturally, we turn to artificial intelligence to help us make sense of this pandemonium.

Now, developing a poker AI is similar to teaching a toddler to play chess, but with more snacks and less screaming (typically). Start with the basics: teach it to recognize strong hands. However, poker differs from chess in that it involves chance and players keep their cards hidden. You need your AI to deal with incomplete information, make educated guesses, and not look foolish when someone pulls a fast one.

Assume you have this hand: A♣ Q♥. The flop shows up: 3♦ 4♠ J♥. Your AI must figure out, “What are the chances this hand is any good?” It’s like determining whether the leftover pizza is still safe to eat after a week in the fridge. Spoiler: the hand fares better than the pizza. Against one opponent holding a random hand, your ace-high is currently in front about 58.5% of the time. However, when the number of players increases, the odds drop faster than your WiFi during a critical Zoom call. Against five opponents, your fancy A-Q is the best hand only ~6.9% of the time (roughly 0.585^5). Ouch.
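The hand-strength number for A-Q on this flop can be reproduced by brute force: deal the opponent every possible pair of unseen cards and count how often we are ahead, tied, or behind right now. Below is a minimal sketch of that enumeration. The card notation ("Ac" for A♣), the function names, and the compact five-card evaluator are my own choices, and the sketch only handles the flop, when each player has exactly five cards to work with.

```python
from collections import Counter
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"  # clubs, diamonds, hearts, spades


def rank5(cards):
    """Score a 5-card hand as a tuple; a larger tuple is a stronger hand."""
    ranks = sorted((RANKS.index(c[0]) for c in cards), reverse=True)
    counts = Counter(ranks)
    # Order ranks by how often they appear, then by rank (pairs before kickers).
    ordered = sorted(counts, key=lambda r: (counts[r], r), reverse=True)
    shape = sorted(counts.values(), reverse=True)
    flush = len({c[1] for c in cards}) == 1
    straight = len(counts) == 5 and ranks[0] - ranks[4] == 4
    if ranks == [12, 3, 2, 1, 0]:  # A-2-3-4-5, where the ace plays low
        straight, ordered = True, [3, 2, 1, 0, -1]
    if straight and flush:
        category = 8
    elif shape == [4, 1]:
        category = 7
    elif shape == [3, 2]:
        category = 6
    elif flush:
        category = 5
    elif straight:
        category = 4
    elif shape == [3, 1, 1]:
        category = 3
    elif shape == [2, 2, 1]:
        category = 2
    elif shape == [2, 1, 1, 1]:
        category = 1
    else:
        category = 0
    return (category, tuple(ordered))


def hand_strength(hole, flop):
    """Return (ahead, tied, behind, strength) against one random opponent hand."""
    seen = set(hole) | set(flop)
    deck = [r + s for r in RANKS for s in SUITS if r + s not in seen]
    ours = rank5(hole + flop)
    ahead = tied = behind = 0
    for opp in combinations(deck, 2):
        theirs = rank5(list(opp) + flop)
        if ours > theirs:
            ahead += 1
        elif ours == theirs:
            tied += 1
        else:
            behind += 1
    strength = (ahead + tied / 2) / (ahead + tied + behind)
    return ahead, tied, behind, strength


ahead, tied, behind, strength = hand_strength(["Ac", "Qh"], ["3d", "4s", "Jh"])
print(ahead, tied, behind, round(strength, 3))
# Rough multi-opponent estimate, assuming the five opponents' hands are independent.
print(round(strength ** 5, 3))
```

Raising the single-opponent strength to the fifth power is only an approximation, since the opponents’ cards are not truly independent, but it shows how quickly a decent hand shrinks at a full table.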

Then there’s the concept of “potential”: your current hand may be weak, but with a little luck it may turn into a monster. It’s like wagering that your startup will succeed despite the odds. Consider holding 6♦ 7♦ with a flop of 5♦ A♠ 8♦. Right now you have nothing but a draw. However, with the right turn or river card, you could be sitting on a straight, a flush, or even a straight flush. Suddenly, your AI must go from “I’m screwed” to “I might just win this!”
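One simple way to put a number on potential for this 6♦ 7♦ example is to count “outs”: the unseen cards that would complete a straight or a flush. The sketch below does only that. It uses my own card notation ("6d" for 6♦) and helper names, and it ignores what the opponent holds, so it measures the chance of hitting the draw, not the chance of winning the pot.

```python
from math import comb

RANKS = "23456789TJQKA"
SUITS = "cdhs"  # clubs, diamonds, hearts, spades


def has_straight_or_flush(cards):
    """True if any five of the given cards form a straight or a flush."""
    suits = [c[1] for c in cards]
    if any(suits.count(s) >= 5 for s in SUITS):
        return True
    ranks = {RANKS.index(c[0]) for c in cards}
    if 12 in ranks:
        ranks.add(-1)  # the ace can also play low in A-2-3-4-5
    return any(all(low + i in ranks for i in range(5)) for low in range(-1, 9))


def draw_outs(hole, flop):
    """Return the unseen cards that complete a straight or flush, plus the unseen count."""
    seen = set(hole) | set(flop)
    deck = [r + s for r in RANKS for s in SUITS if r + s not in seen]
    outs = [c for c in deck if has_straight_or_flush(hole + flop + [c])]
    return outs, len(deck)


outs, unseen = draw_outs(["6d", "7d"], ["5d", "As", "8d"])
# Chance that at least one of the outs arrives on the turn or the river.
hit_by_river = 1 - comb(unseen - len(outs), 2) / comb(unseen, 2)
print(len(outs), unseen, round(hit_by_river, 3))
```

The outs here are the nine remaining diamonds plus the non-diamond fours and nines, so a hand that is nearly worthless on the flop completes its draw more often than not by the river.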

Next, let’s talk about the opponents, because poker isn’t solitaire. Your AI must model opponents’ behavior, working out whether they play tight or loose, aggressively or passively. It’s like trying to predict whether your neighbor will return your lawnmower on time. The AI employs neural networks, Bayesian methods, and even something called particle filtering (whatever that is). I suppose it’s like sprinkling some magic dust to predict your opponent’s next move.
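As a taste of the Bayesian flavor of opponent modeling, here is a minimal sketch that keeps a belief over four player types and updates it after each observed action. The four types, the action probabilities in `LIKELIHOODS`, and the observed action sequence are invented purely for illustration; a real system would estimate them from hand histories and condition on much more context.

```python
# Illustrative numbers only: P(action | opponent type), not measured from real play.
LIKELIHOODS = {
    "tight-passive":    {"fold": 0.70, "call": 0.25, "raise": 0.05},
    "tight-aggressive": {"fold": 0.70, "call": 0.10, "raise": 0.20},
    "loose-passive":    {"fold": 0.30, "call": 0.60, "raise": 0.10},
    "loose-aggressive": {"fold": 0.30, "call": 0.30, "raise": 0.40},
}


def update_beliefs(beliefs, action, likelihoods=LIKELIHOODS):
    """One Bayes step: the posterior is proportional to prior times likelihood."""
    unnormalized = {kind: p * likelihoods[kind][action] for kind, p in beliefs.items()}
    total = sum(unnormalized.values())
    return {kind: p / total for kind, p in unnormalized.items()}


# Start with no opinion, then watch a (hypothetical) opponent for four decisions.
beliefs = {kind: 1 / len(LIKELIHOODS) for kind in LIKELIHOODS}
for action in ["raise", "raise", "call", "raise"]:
    beliefs = update_beliefs(beliefs, action)

most_likely = max(beliefs, key=beliefs.get)
print(most_likely, {kind: round(p, 3) for kind, p in beliefs.items()})
```

The same update rule works with any number of types or actions, as long as every type assigns a non-zero probability to every action that can be observed.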

To make things even cooler, the AI creates game trees that simulate all conceivable outcomes. It’s like planning every conceivable conversation you may have with your employer – it’s useful but exhausting. These trees assist the AI in making the optimal decisions by calculating the value of each action. Raise, fold, and call – it’s all laid out.

Finally, the goal is to make AI smarter and more adaptive. It learns from millions of hands and becomes a poker master. It’s like seeing your child suddenly improve at video games after hours of practice. Or, say, your cat has finally mastered the art of knocking items off the table just perfectly.

Look at this figure. The chart displays VPIP (Voluntarily Put Money In Pot) values for each player cluster. It’s a clever way of expressing, “Who’s the sucker that’s always betting?” It turns out that the smaller the stakes, the more people want to see the flop, as if everyone is a dreamer hoping for a miracle river card.
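For reference, VPIP is easy to compute once you have preflop actions per hand: it is the share of dealt hands in which the player chose to put chips in before the flop, with forced blinds not counting. The sketch below assumes a made-up record format (one dict per hand, mapping player name to a list of preflop action strings). It is not the actual layout of hm2 files or any other tracker database.

```python
VOLUNTARY_ACTIONS = {"call", "bet", "raise"}  # posting a blind or checking does not count


def vpip(hands, player):
    """Fraction of dealt hands in which `player` voluntarily put money in preflop."""
    dealt = voluntary = 0
    for hand in hands:
        actions = hand.get(player)
        if actions is None:
            continue  # player was not dealt into this hand
        dealt += 1
        if any(action in VOLUNTARY_ACTIONS for action in actions):
            voluntary += 1
    return voluntary / dealt if dealt else 0.0


# Hypothetical preflop records for a heads-up match.
sample_hands = [
    {"alice": ["fold"], "bob": ["post_bb", "check"]},
    {"alice": ["call"], "bob": ["post_bb", "raise"]},
    {"alice": ["post_bb", "check"], "bob": ["call"]},
    {"alice": ["raise"], "bob": ["post_bb", "call"]},
]

print(vpip(sample_hands, "alice"), vpip(sample_hands, "bob"))
```

A high VPIP marks a loose player who sees a lot of flops, and a low one marks a tight player, which is why it is a natural axis for clustering players.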

Thus, developing a poker bot is not an easy task. It’s more akin to running a marathon with a fresh math problem handed to you every mile. But the fun of watching it outsmart humans or pull off cunning moves? Priceless. Just remember to feed your AI quality data (we have billions of these hm2 files), just like you would with a cat: avoid giving it too much at once and steer clear of anything that might come back to bite you.