Poker-AI.org :: View topic - DeepStack: Expert-Level Artificial Intelligence in No-Limit

Author:	botishardwork [ Tue Jan 10, 2017 4:34 pm ]
Post subject:	DeepStack: Expert-Level Artificial Intelligence in No-Limit
DeepStack: Expert-Level Artificial Intelligence in No-Limit
by Matej Moravčík, Martin Schmid, Neil Burch, Viliam Lisý, Dustin Morrill, Nolan Bard, Trevor Davis, Kevin Waugh, Michael Johanson, Michael Bowling

Artificial intelligence has seen a number of breakthroughs in recent years, with games often serving as significant milestones. A common feature of games with these successes is that they involve information symmetry among the players, where all players have identical information. This property of perfect information, though, is far more common in games than in real-world problems. Poker is the quintessential game of imperfect information, and it has been a longstanding challenge problem in artificial intelligence. In this paper we introduce DeepStack, a new algorithm for imperfect information settings such as poker. It combines recursive reasoning to handle information asymmetry, decomposition to focus computation on the relevant decision, and a form of intuition about arbitrary poker situations that is automatically learned from self-play games using deep learning. In a study involving dozens of participants and 44,000 hands of poker, DeepStack becomes the first computer program to beat professional poker players in heads-up no-limit Texas hold'em. Furthermore, we show this approach dramatically reduces worst-case exploitability compared to the abstraction paradigm that has been favored for over a decade.

Link: https://arxiv.org/pdf/1701.01724.pdf

Author:	pulser [ Thu Jan 12, 2017 6:47 pm ]
Post subject:	Re: DeepStack: Expert-Level Artificial Intelligence in No-Li
Excited to finally see them going for the much more scalable approach of online solving combined with trained models for look-ahead, instead of sticking to precomputed strategies.

One thing confuses me though: they say that they ignore the opponent's actual action when doing the recalc. Does that mean they ignore the opponent's bet size as well, and then just map it to one of the "2 or 3 bet/raise actions" post-recalc? Why not consider the actual size as an optional path?

Also, for their own bets I don't see any mention of bet sizing, which leads me to believe they used the same ½P, P, 2P, all-in sizings they used for training the networks(?). Again, it sounds like they're leaving an unnecessary amount of chips on the table.

Either way, impressive results!

Author:	spears [ Thu Jan 12, 2017 10:51 pm ]
Post subject:	Re: DeepStack: Expert-Level Artificial Intelligence in No-Li
Quote: One thing confuses me though: they say that they ignore the opponent's actual action when doing the recalc. Does that mean they ignore the opponent's bet size as well, and then just map it to one of the "2 or 3 bet/raise actions" post-recalc? Why not consider the actual size as an optional path?

Good question. As usual, there is more than one thing that confuses me in UoA papers.

The overall idea of doing a simulation over the next few actions and then using an estimate of expected values to represent the remainder of the game has been used in game-playing programs since the beginning of time. Clever to use it in poker.

Author:	AlephZero [ Fri Jan 20, 2017 10:44 am ]
Post subject:	Re: DeepStack: Expert-Level Artificial Intelligence in No-Li
Reposting my post from another thread:

The test was not done in the best way: only the top 3 of 30 players finished in the money, so the humans were encouraged to gamble rather than play their real A-game as they would at a cash table. Anyway, 45bb is truly amazing.

I would like to try to reproduce the DeepStack algorithm. The bulk of the cost is reproducing the 10M-sample training set of random situations solved by CFR, used to train the network with 3500 hidden units. They ran 6144 CPUs for 11 days; I estimate that would cost 50k euro. At best I can do 1M samples with a 5k investment, so I was thinking of starting with a less deep poker game, like HUSNG, where I could produce 3-5M samples, or of doing some data expansion using multiple examples from the same solved game, although that is inappropriate for DeepStack's re-solving mechanism.

pulser wrote: Excited to finally see them going for the much more scalable approach of online solving combined with trained models for look-ahead, instead of sticking to precomputed strategies. One thing confuses me though: they say that they ignore the opponent's actual action when doing the recalc. Does that mean they ignore the opponent's bet size as well, and then just map it to one of the "2 or 3 bet/raise actions" post-recalc? Why not consider the actual size as an optional path? Also, for their own bets I don't see any mention of bet sizing, which leads me to believe they used the same ½P, P, 2P, all-in sizings they used for training the networks(?). Again, it sounds like they're leaving an unnecessary amount of chips on the table. Either way, impressive results!

The value function (the previously trained neural network) returns the approximate counterfactual utility of every possible opponent hand, taking as input only the pot size and DeepStack's range. So during the simulation the algorithm doesn't consider the preceding action or its size, only the pot size. The abstraction is implicit and continuous in the network that produces the value function; they don't map anything explicitly. In other words, the value function is a way to assign a value to being in a certain position during the game.

The exploitability of DeepStack goes to zero (so its strategy converges to a Nash equilibrium) if the approximation error of the network goes to zero. That is not possible, but judging from the test the error is small enough that humans cannot exploit it.
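
For reference, the relationship between network error and exploitability described in this post has the same shape as the bound the DeepStack paper gives as its main theoretical result. The symbols below follow my reading of that theorem and should be checked against the paper: \epsilon is the error of the value function at the depth limit, T is the number of CFR iterations used per re-solve, and k_1, k_2 are game-specific constants.

```latex
\[
  \text{exploitability} \;\le\; k_1\,\epsilon \;+\; \frac{k_2}{\sqrt{T}}
\]
```

Under this form both terms have to shrink: a perfect network (\epsilon \to 0) still leaves the k_2/\sqrt{T} term unless the re-solve also runs for more iterations.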

Author:	pulser [ Sun Jan 22, 2017 12:51 pm ]
Post subject:	Re: DeepStack: Expert-Level Artificial Intelligence in No-Li
AlephZero wrote: The value function (the previously trained neural network) returns the approximate counterfactual utility of every possible opponent hand, taking as input only the pot size and DeepStack's range. So during the simulation the algorithm doesn't consider the preceding action or its size, only the pot size. The abstraction is implicit and continuous in the network that produces the value function; they don't map anything explicitly.

With regard to the look-ahead network computed from the next public state, I totally understand. However, my comment concerned the live re-solve of the current street. I don't see how the opponent's action can be ignored there.

Let's say the opponent bets 2/3 pot on the turn. DeepStack can't possibly ignore that action and just consider the pot, stack size and resulting sub-game after the bet, or can it?

In this case, my understanding was that they solved the entire street (the turn) with CFRM, using the look-ahead network to get regrets or values for the river, with the inputs to the simulation being the opponent's regrets, DeepStack's range, the pot and the stack size. Then, after the solve, they select an action based on whatever the opponent actually did.

Poker-AI.org http://poker-ai.org/phpbb/

DeepStack: Expert-Level Artificial Intelligence in No-Limit http://poker-ai.org/phpbb/viewtopic.php?f=25&t=3009	Page 1 of 1

Author:	AlephZero [ Sun Jan 22, 2017 1:06 pm ]
Post subject:	Re: DeepStack: Expert-Level Artificial Intelligence in No-Li
pulser wrote: With regard to the look-ahead network computed from the next public state, I totally understand. However, my comment concerned the live re-solve of the current street. I don't see how the opponent's action can be ignored there. Let's say the opponent bets 2/3 pot on the turn. DeepStack can't possibly ignore that action and just consider the pot, stack size and resulting sub-game after the bet, or can it? In this case, my understanding was that they solved the entire street (the turn) with CFRM, using the look-ahead network to get regrets or values for the river, with the inputs to the simulation being the opponent's regrets, DeepStack's range, the pot and the stack size. Then, after the solve, they select an action based on whatever the opponent actually did.

I think (but this is just my opinion at the moment, and their paper is not clear) that DeepStack uses the counterfactual value for the pot after the call in the case of a call, for the pot plus the raise in the case of a raise, and for the current pot in the case of a fold. In other words, a call is evaluated in the situation described by pot + call amount, and a raise in the situation described by pot + raise amount.

In your example the starting pot is 1, so the pot considered for a call is 1 + 2/3 + 2/3 = 7/3, and for a raise it is 1 + 2/3 + raise amount. I think the utility is easily reconstructed from the sub-game's counterfactual utility for the player, minus the player's contribution to the starting pot.

This is just my supposition; I don't completely understand the DeepStack algorithm yet.
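
The pot arithmetic in this post can be checked with a few lines of Python. This is only a sketch of the poster's supposition, not DeepStack's actual bookkeeping. The function names are hypothetical, and `raise_amount` is taken at face value as "whatever is added to the pot by the raise", since the post does not say whether it means raise-by or raise-to.

```python
from fractions import Fraction


def pot_after_call(starting_pot, bet):
    """Pot once the bet has been made and called (the bet is counted twice)."""
    return starting_pot + 2 * bet


def pot_after_raise(starting_pot, bet, raise_amount):
    """Pot as written in the post: starting pot + bet + raise amount.

    What exactly 'raise amount' covers is an assumption left open here.
    """
    return starting_pot + bet + raise_amount


starting_pot = Fraction(1)
bet = Fraction(2, 3) * starting_pot  # opponent bets 2/3 pot

print(pot_after_call(starting_pot, bet))  # 7/3
```

Using `Fraction` keeps the result exact, so it can be compared directly with the 7/3 given in the post.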

Author:	DreamInBinary [ Sat Jan 28, 2017 10:47 am ]
Post subject:	Re: DeepStack: Expert-Level Artificial Intelligence in No-Li
pulser wrote: With regard to the look-ahead network computed from the next public state, I totally understand. However, my comment concerned the live re-solve of the current street. I don't see how the opponent's action can be ignored there. Let's say the opponent bets 2/3 pot on the turn. DeepStack can't possibly ignore that action and just consider the pot, stack size and resulting sub-game after the bet, or can it?

I do not think they ignore the action; they're just not using it explicitly. Instead, after the opponent has acted, they update the opponent's range according to the "Opponent Ranges in Re-Solving" section in the Appendix. Then they use that range (and their own) to solve what remains of the street. At least that's the way I see it.

What bothers me is that they imply they implemented CFR-D on a GPU, which sounds like quite a daunting task, to be honest...
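
To make "update a range after an action" concrete, here is a minimal sketch of a generic Bayes' rule update over hands. It is not code from the paper, and whether DeepStack applies such an update to the opponent's range, as this post reads it, is an assumption. The hand labels, probabilities and the function name `bayes_update` are made up for illustration.

```python
def bayes_update(range_probs, action_probs):
    """Posterior range after observing an action.

    range_probs:  hand -> prior probability of holding that hand
    action_probs: hand -> probability of taking the observed action with that hand
    """
    joint = {h: range_probs[h] * action_probs.get(h, 0.0) for h in range_probs}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("observed action has zero probability under this range")
    # Normalise so the posterior sums to 1.
    return {h: p / total for h, p in joint.items()}


# Hypothetical three-hand range and betting frequencies.
prior = {"AA": 0.25, "KQ": 0.25, "76": 0.5}
bet_prob = {"AA": 0.8, "KQ": 0.2, "76": 0.4}

posterior = bayes_update(prior, bet_prob)
print(posterior)
```

Hands that take the observed action more often gain weight in the posterior, and the re-solve would then start from that updated range instead of from the action itself.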

Author:	Abrriston [ Tue Mar 21, 2017 3:34 pm ]
Post subject:	Re: DeepStack: Expert-Level Artificial Intelligence in No-Li
I guess artificial intelligence is beyond question nowadays; I suppose almost everybody believes in it and considers it a miracle!

Author:	LektorDonz [ Sun Oct 15, 2017 8:32 am ]
Post subject:	Re: DeepStack: Expert-Level Artificial Intelligence in No-Li
For those who do not want to read the paper: here is the DeepStack website, https://www.deepstack.ai, which includes a nice overview talk by Michael Bowling and some videos of games that DeepStack played against humans.

Author:	AlephZero [ Sat Oct 21, 2017 6:24 pm ]
Post subject:	Re: DeepStack: Expert-Level Artificial Intelligence in No-Li
Here is a Lua/Torch implementation for Leduc poker: https://github.com/lifrordi/DeepStack-Leduc

Author:	HontoNiBaka [ Sat Mar 17, 2018 5:37 pm ]
Post subject:	Re: DeepStack: Expert-Level Artificial Intelligence in No-Li
It seems that they do ignore the opponent's action. In the video on the DeepStack site, M. Bowling said that the counterfactual values of the last re-solve are an upper bound on the counterfactual values after the opponent's action, so they can be used.

One thing that I don't understand, though, is how they initialize the counterfactual values at the root. They said that they initialize them to the value of being dealt the hand, but what does that mean? I kind of assume that they use counterfactual values computed from a full CFR solution from one of their earlier bots or something, but on the other hand that would be weird.

Author:	optimizer [ Tue Mar 20, 2018 5:35 pm ]
Post subject:	Re: DeepStack: Expert-Level Artificial Intelligence in No-Li
HontoNiBaka wrote: It seems that they do ignore the opponent's action. In the video on the DeepStack site, M. Bowling said that the counterfactual values of the last re-solve are an upper bound on the counterfactual values after the opponent's action, so they can be used. One thing that I don't understand, though, is how they initialize the counterfactual values at the root. They said that they initialize them to the value of being dealt the hand, but what does that mean? I kind of assume that they use counterfactual values computed from a full CFR solution from one of their earlier bots or something, but on the other hand that would be weird.

These are just the values at the root of the game, computed with the same algorithm. For Kuhn poker, for example, they would be [-1/3, -1/9, 7/18].
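
A quick sanity check on those Kuhn poker numbers, as a sketch under two assumptions the post does not state: that the values are listed in the order J, Q, K, and that each is weighted by the 1/3 chance of being dealt that card. Under those assumptions the three values should sum to the first player's game value in Kuhn poker, which is known to be -1/18.

```python
from fractions import Fraction

# Root counterfactual values from the post; the J, Q, K ordering is an assumption.
cfvs = {"J": Fraction(-1, 3), "Q": Fraction(-1, 9), "K": Fraction(7, 18)}

# Assumed chance weighting: each card is dealt to the player with probability 1/3.
deal_prob = Fraction(1, 3)

# Summing the chance-weighted values gives the value of the whole game.
root_value = sum(cfvs.values())

# Dividing out the chance weight gives the expected value of each hand once dealt.
per_hand_ev = {hand: v / deal_prob for hand, v in cfvs.items()}

print(root_value)  # -1/18
print(per_hand_ev["J"], per_hand_ev["Q"], per_hand_ev["K"])  # -1 -1/3 7/6
```

Read this way, "the value of being dealt the hand" is the equilibrium expected value of holding each card at the start of the game, scaled by the probability of the deal.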

Author:	HontoNiBaka [ Sat Mar 24, 2018 4:10 pm ]
Post subject:	Re: DeepStack: Expert-Level Artificial Intelligence in No-Li
Yea makes sense, thx.

All times are UTC