A Journey Through Regret Minimization and Poker AI

I clearly remember the day I stumbled upon Richard Gibson’s Ph.D. thesis from the University of Alberta. It was a huge piece of work on regret minimization and strategy stitching in extensive-form games, applied to three-player limit Texas hold’em. It has a special place in my heart because it coincided with the most intense period of learning and discovery in my journey to becoming a poker AI developer.

I was deep into my studies at the time, but I still didn’t understand the intricacies of game theory and how it applies to poker. Gibson’s work came in handy. His treatment of Counterfactual Regret Minimization (CFR) and its application to poker gave me the sound theoretical base I desperately needed. His findings also had enormous practical implications, and I was excited to use these concepts in my own projects.

One evening, after a whole day of coding and reading, I came across Gibson’s Chapter 5. He proposed new algorithms such as Probing, Average Strategy Sampling, and Pure CFR, which struck me not as theoretical novelties but as very practical tools for reducing computation time and memory costs. This was a game-changer, considering the limited computational resources at my disposal.

His work gave me the impetus to build some of these algorithms into my own poker bot. I recall those nights of endless debugging, a cup of coffee in one hand and Gibson’s dissertation in the other. There was one particular night when everything clicked. My bot, which had been muddling through and unable to make profitable decisions, started to show improvement. It was as if Gibson himself was guiding my hand through all the intricacies of CFR and its applications.

The most rewarding thing was testing my bot in a small online poker tournament. Watching it sail through the hands with its newfound efficiency and strategy was thrilling. After all those hours of study, coding, and pure willpower, it all came down to this.

It’s pretty amazing how far we have come in poker AI since those days. Theories and algorithms that once lived only in academic papers are now part of advanced poker bots. And it all started with the inspiration that works like Gibson’s provided.

Anybody interested in how poker AI works or how those strategies came to be should really dive into the resources and see what’s out there today. And if you ever get lost in the complexity, remember: every great journey in AI starts with a single line of code and a lot of curiosity.

Keep learning, keep coding, and maybe one day your project will be one of the trend-setters in the world of poker AI.

Best regards 😉

Poker Bot Philosophy

So, I am an Artificial Intelligence enthusiast. Of course, developing AI that plays poker means that, more than once, you bump into some of the deeper philosophical questions of our time. Not the meaning of life or whether pineapple belongs on pizza (it does, by the way), but something a bit more important: the nature of poker and how to model it for a bot.

First off, let’s break down poker. Obviously, it’s a game. To a bot engineer, however, poker is all about systems: systems of rules and interactions. Poker isn’t really about cards or chips; it’s about the players. It doesn’t make much sense to have a poker game without a player, much like having a computer and never putting it on the Internet.

A player’s decision is a blend of the game situation, the opponents’ history, the player’s own mood, and the phase of the moon. Yes, some players are that superstitious. And that mix of the logical and the illogical is exactly what makes modeling a poker player interesting and challenging.

The comforting thought in all of this, for engineers, is that poker has a finite number of states. Every player begins with some chips; there are only so many cards in the deck; at any time during a game, the number of possible moves is limited. It’s a godsend when trying to model a game. Trying to model something infinite is like trying to find the end of the internet.

Now, this is where it gets interesting. Players don’t take the same action in the same situation all of the time. They mix up their choices, making decisions like “I’ll fold 45 percent of the time and call 55 percent of the time.” It’s this randomness that your poker bot has to emulate, since in poker, when you become predictable, you’re dead.

But that brings up a difficult-sounding question: if players can make mixed moves, does that mean I have to model an infinity of possibilities? Thankfully, no. While it sounds daunting, it’s really an issue of accuracy. When I make small changes to a player’s mixed strategy, I get only small changes in that player’s expected profit. It’s not about modeling infinity here, but about how much reality one can cope with without going nuts.

Now, let us take a look at some ideas that make things simpler. Whether a player’s head is full of brains, sawdust, or algorithms, their strategy can always be reduced to a Look-Up Table (LUT) of decisions. That is, the table says what to do in any given situation. When several players sit at the table, each with their own LUT, the tables interact, and that strategic interplay determines each player’s expected profit.
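To make the look-up table idea concrete, here is a minimal sketch of how I picture it. The situation names and probabilities are made up for illustration (only the 45/55 fold/call split comes from the example earlier), and a real bot would key the table on far richer game state.

```python
import random

# Hypothetical look-up table: each situation maps to a mixed strategy,
# i.e. a probability for every action. Keys and numbers are illustrative.
STRATEGY_TABLE = {
    "river_facing_bet_weak_pair": {"fold": 0.45, "call": 0.55},
    "preflop_button_premium": {"raise": 0.90, "call": 0.10},
}

def choose_action(situation, rng=random):
    """Sample one action according to the mixed strategy stored for `situation`."""
    mix = STRATEGY_TABLE[situation]
    actions = list(mix)
    weights = [mix[action] for action in actions]
    return rng.choices(actions, weights=weights, k=1)[0]

if __name__ == "__main__":
    # The same situation can produce different actions from hand to hand.
    print([choose_action("river_facing_bet_weak_pair") for _ in range(5)])
```

The table itself is fixed and finite; the unpredictability comes only from the sampling step.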

Mixed moves bring back the temptation of infinity. Let us kill this myth at once. Imagine planning to play an infinite number of hands. It is just impractical. We think in terms of finite sessions and look at the expected profit over these sessions.

Consider: if you call, your profit may be $40, and if you fold, it could be -$10. Mixing your moves, calling 40% of the time and folding 60% of the time, gives an expected profit of 0.4 * $40 + 0.6 * -$10 = $10. Small changes in the mix alter expected profit only slightly, which shows the issue is precision, not infinity.
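The same arithmetic as a tiny sketch, using the $40 and -$10 figures from the example. The function name and defaults are my own; the point is just that nudging the calling frequency by one percentage point moves the expected profit by only fifty cents.

```python
def expected_profit(p_call, profit_call=40.0, profit_fold=-10.0):
    """Expected profit when calling with probability p_call and folding otherwise."""
    return p_call * profit_call + (1.0 - p_call) * profit_fold

if __name__ == "__main__":
    base = expected_profit(0.40)
    nudged = expected_profit(0.41)
    print(f"{base:.2f} {nudged:.2f} {nudged - base:.2f}")  # prints "10.00 10.50 0.50"
```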

To drive the point home, let’s consider a practical example. Suppose you sit down at the table behind a stack of $665. You are up against a superstitious opponent who always folds when he sees the number 666. Suddenly, what seemed like a tiny difference in your stack size, a single dollar, can make a huge difference in your expected profit. It’s those kinds of quirky little details that make poker so fascinating and maddening all at once.

There you go. Modeling a poker bot involves the finiteness of the game, randomness in the form of mixed moves, and the simplification of strategies into something our bots can handle. It all comes down to a mix of philosophy, math, and a dash of humor.

Miximax-based betting approach

I’m thinking back to a fantastic manuscript (Davidson A. – Opponent Modeling in Poker: Learning and Acting in a Hostile and Uncertain Environment) that I discovered years ago. Back then, I was immersed in coding projects, and each new understanding felt like a mini-revelation. Now I’m recalling some of the highlights, though through a rather blurry lens.

Davidson compared the Miximax-based betting approach against three other programs: FBS-Poki, SBS-Poki, and ArtBot. ArtBot, as it turns out, is a really loose player, almost passive. Imagine playing against someone who is difficult to read because they are defensive and unpredictable. Winning against ArtBot yields small pots, not the massive victories you might hope for. ArtBot outperforms FBS-Poki, who is a bit of a pushover, by roughly +0.35 small bets per hand.

It’s like having a variety of sparring partners to put your moves to the test, each with their own unique style. Davidson’s Miximax player was put to the test here, starting from zero and occasionally using a pre-built strategy from prior games. This Miximax isn’t your typical minimax player; it’s a Miximix, a slightly modified version that doesn’t always go for the highest-EV move. Think of it as adding a little randomness to avoid becoming too predictable: a necessary quirk to keep things interesting and keep all options open.
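Here is a rough sketch of the difference as I understand it: one chooser always takes the highest-EV action, the other samples actions so that better ones are merely more likely. The softmax weighting and the `temperature` parameter are my own stand-ins for illustration, not necessarily the mixing rule Davidson used, and the EV numbers are invented.

```python
import math
import random

def miximax_choice(action_evs):
    """Always pick the action with the highest expected value."""
    return max(action_evs, key=action_evs.get)

def miximix_choice(action_evs, temperature=1.0, rng=random):
    """Sample an action, favoring higher EV (softmax weighting is an assumption)."""
    actions = list(action_evs)
    best = max(action_evs.values())
    # Subtracting the best EV keeps the exponentials numerically tame.
    weights = [math.exp((action_evs[a] - best) / temperature) for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]

if __name__ == "__main__":
    evs = {"fold": 0.0, "call": 0.6, "raise": 0.9}  # made-up EVs in small bets
    print(miximax_choice(evs))
    print([miximix_choice(evs) for _ in range(5)])
```

A lower `temperature` makes the sampled player behave more like the pure maximizer; a higher one makes it more random.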

So, what are the results? Miximax suffers a little during the first few thousand hands, much like a new student navigating their first few games. However, once it understands the opponent’s playing style, it begins to accumulate chips. Against the FBS and SBS programs, it averages +0.4 to +0.5 small bets per hand. Against ArtBot it struggles more, with results ranging from +0.1 to +0.2 small bets per hand. ArtBot’s fast-changing style appears to uncover flaws in Miximax’s context trees.

Here’s an interesting part of the thesis: Davidson adds that this is only scratching the surface. The AI needs to learn faster, either by enhancing the context trees or by taking ideas from previous opponent models. The biggest challenge is expanding this to multiplayer games, where the game tree becomes extremely complex. Davidson implies that handling more players may be difficult unless some major pruning is done. This was hard to manage when desktop PCs ran at 1 GHz and 4 GB of RAM was considered a lot of capacity. Now it’s not an issue.

I recall this brilliant ending in which Davidson quotes Josh Billings: “Life consists not in holding good cards, but in playing well those you do hold”. It serves as a reminder that in poker, as in life, making the most of what you have frequently outweighs pure luck.

Miximax

There’s this graph that displays how Miximax performs versus various opponents. It’s very clear that as the AI learns more about its opponents, it adjusts and improves its win rate. It’s unstable at first, but it settles down as it collects more data, much like how humans adapt via experience.

So, Davidson’s work demonstrates that developing a poker AI is more than just crunching numbers. It’s about navigating through a veil of uncertainty and making educated estimates based on trends and probabilities. If you enjoy poker and AI, or simply seeing how machines can play games, this is a must-read.

Poker AI for Texas Hold’em

I looked into some old research on poker AI, but after a long day and a few beers, my brain is a little fuzzy. Anyway, I read this article about AI for Texas Hold’em, and it made me think about how we can train machines to bluff better than the average Joe.

First and foremost, poker is more than just a card game. It’s a magical combination of statistics, psychology, and a touch of voodoo. Have you ever tried to guess if the player across the table will raise or fold? It’s like guessing what my cat wants for dinner: impossible. So, naturally, we turn to artificial intelligence to help us make sense of this pandemonium.

Now, developing a poker AI is similar to teaching a toddler to play chess, but with more snacks and less screaming (typically). Start with the basics: teach it to recognize strong hands. However, poker differs from chess in that it involves randomness and the players do not reveal their cards. You need your AI to deal with incomplete information, make educated predictions, and not look foolish when someone pulls a fast one.

Assume you have this hand: A♣ Q♥. The flop shows up: 3♦ 4♠ J♥. Your AI must figure out, “What are the chances this hand is any good?” It’s like determining whether the leftover pizza is still safe to eat after a week in the fridge. Spoiler: the pizza probably isn’t, but against a single opponent holding random cards your A-Q is ahead about 58.5% of the time. However, when the number of players increases, the odds decrease faster than your WiFi during a critical Zoom call. Against five opponents, your fancy A-Q may only be the best hand ~6.9% of the time. Ouch.
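The drop from ~58.5% to ~6.9% is what you get if you raise the one-opponent figure to the power of the number of opponents. A minimal sketch of that arithmetic follows; it assumes every opponent holds an independent random hand, which is a simplification, and the function name is my own.

```python
def strength_vs_opponents(strength_vs_one, n_opponents):
    """Chance of being ahead of all opponents, assuming independent random hands."""
    return strength_vs_one ** n_opponents

if __name__ == "__main__":
    for n in (1, 2, 5):
        print(n, f"{strength_vs_opponents(0.585, n):.3f}")
```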

Then there’s the concept of “potential”: your current hand may be bad, but with a little luck, it may transform into a royal flush. It’s like wagering that your startup will succeed despite the odds. Consider holding 6♦ 7♦ with a flop of 5♦ A♠ 8♦. Right now it’s just seven-high, which doesn’t look like much. However, with the perfect turn and river, you may be sitting on a straight flush. Suddenly, your AI must go from “I’m screwed” to “I might just win this!”
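One crude way to put a number on potential is to count outs. For 6♦ 7♦ on a 5♦ A♠ 8♦ flop, any diamond makes a flush and any 4 or 9 makes a straight. The sketch below counts those cards among the unseen ones and computes the chance of catching at least one by the river. It is only an approximation under my own assumptions: it ignores other ways of improving and does not ask whether the made hand actually wins.

```python
from itertools import product

RANKS = "23456789TJQKA"
SUITS = "cdhs"

def count_outs(hole, board, is_out):
    """Return (number of outs, number of unseen cards) for the known cards."""
    seen = set(hole) | set(board)
    unseen = [r + s for r, s in product(RANKS, SUITS) if r + s not in seen]
    return sum(1 for card in unseen if is_out(card)), len(unseen)

def hit_by_river(outs, unseen):
    """Chance that at least one of the next two cards is an out."""
    misses = unseen - outs
    return 1.0 - (misses / unseen) * ((misses - 1) / (unseen - 1))

def completes_draw(card):
    """Any diamond completes the flush; any 4 or 9 completes the straight."""
    return card[1] == "d" or card[0] in "49"

if __name__ == "__main__":
    outs, unseen = count_outs(["6d", "7d"], ["5d", "As", "8d"], completes_draw)
    print(outs, unseen, f"{hit_by_river(outs, unseen):.3f}")  # prints "15 47 0.541"
```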

Next, let’s talk about the opponents, because poker isn’t like solitaire. Your AI must model opponents’ behavior, determining whether they are playing tight or loose, aggressively or passively. It’s like trying to predict if your neighbor will return your lawnmower on time or not. The AI employs neural networks, Bayesian methods, and even something called particle filtering (whatever that is). I suppose it’s like sprinkling some magic dust to predict your opponent’s next move.

To make things even cooler, the AI creates game trees that simulate all conceivable outcomes. It’s like planning every conceivable conversation you may have with your employer – it’s useful but exhausting. These trees assist the AI in making the optimal decisions by calculating the value of each action. Raise, fold, and call – it’s all laid out.
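A toy version of such a tree, just to show the mechanics: leaves hold payoffs, chance nodes average over their children, and our own decision nodes take the best child. The tree shape, probabilities, and payoffs are invented for illustration, and real poker trees are vastly larger and have to deal with hidden cards.

```python
def node_value(node):
    """Expected value of a node in a small decision/chance tree."""
    if node["kind"] == "leaf":
        return node["payoff"]
    if node["kind"] == "chance":
        # Opponent actions or card outcomes, weighted by estimated probability.
        return sum(p * node_value(child) for p, child in node["children"])
    # Decision node: we pick the action with the highest value.
    return max(node_value(child) for child in node["children"].values())

def action_values(decision_node):
    """Value of each action available at a decision node."""
    return {a: node_value(c) for a, c in decision_node["children"].items()}

def leaf(payoff):
    return {"kind": "leaf", "payoff": payoff}

def chance(*branches):
    return {"kind": "chance", "children": list(branches)}

# Hypothetical spot: payoffs in small bets, probabilities are guesses.
SHOWDOWN_AFTER_RAISE = chance((0.4, leaf(8.0)), (0.6, leaf(-4.0)))
TREE = {
    "kind": "decision",
    "children": {
        "fold": leaf(-1.0),
        "call": chance((0.4, leaf(5.0)), (0.6, leaf(-2.0))),
        "raise": chance((0.3, leaf(3.0)), (0.7, SHOWDOWN_AFTER_RAISE)),
    },
}

if __name__ == "__main__":
    values = action_values(TREE)
    print({a: round(v, 2) for a, v in values.items()})
    print(max(values, key=values.get))
```

With these made-up numbers, raising comes out ahead because the opponent sometimes folds.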

Finally, the goal is to make AI smarter and more adaptive. It learns from millions of hands and becomes a poker master. It’s like seeing your child suddenly improve at video games after hours of practice. Or, say, your cat has finally mastered the art of knocking items off the table just perfectly.

Look at this figure. The chart displays VPIP (Voluntarily Put Money In Pot) values for each player cluster. It’s a clever way of expressing, “Who’s the sucker that’s always betting?” It turns out that the smaller the stakes, the more people want to see the flop, as if everyone is a dreamer hoping for a miracle river card.
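VPIP is usually reported as the percentage of hands in which a player voluntarily puts money into the pot before the flop (posting a blind doesn’t count). A minimal sketch of the calculation is below. The `(player, put_money_in)` record format is hypothetical and only for illustration; it is not the layout of any real hand-history file.

```python
from collections import defaultdict

def vpip_by_player(records):
    """Return VPIP in percent per player from (player, put_money_in) records."""
    dealt = defaultdict(int)
    voluntary = defaultdict(int)
    for player, put_money_in in records:
        dealt[player] += 1
        if put_money_in:
            voluntary[player] += 1
    return {player: 100.0 * voluntary[player] / dealt[player] for player in dealt}

if __name__ == "__main__":
    sample = [
        ("tight_tom", True), ("tight_tom", False),
        ("tight_tom", False), ("tight_tom", False),
        ("loose_lucy", True), ("loose_lucy", True),
    ]
    print(vpip_by_player(sample))
```

A low number suggests a tight player; a high one suggests someone who wants to see every flop.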

Thus, developing a poker bot is not an easy task. It’s more akin to running a marathon with a fresh math problem handed to you every mile. But the fun of seeing it outsmart humans or pull off cunning moves? Priceless. Just remember to feed your AI quality data (we have billions of these hm2 files), just like you would with a cat: avoid giving it too much at once and steer clear of anything that might come back to bite you.