Hi there ! First post
I'm trying to implement the MCTS algorithm described in many papers.
I already have a simple MonteCarlo algorithm working where I do the selection and a bunch of expansions to get the average value of each actions and choose the best one.
Basically, I have a GameState class and to find the best action, I do:
Code:
foreach (action in state.PossibleActions)
newState = state.Clone();
for i=1 to 1000
while (!newState.IsFinished)
newState.ChooseActionOrBoardCard
value = newState.PayoutForMe
deduce the averageValueForMove
And basically choose the move that gives the best average value.
After reading a few papers on MCTS, I think I roughly understand the algorithm (this one was the best:
http://www.ai.rug.nl/~mwiering/Tom_van_ ... Thesis.pdf)
My question is about how to implement it ? (I've looked a bit at
https://code.google.com/p/opentestbed, but didn't understand it all)
1) Is it best to have a Node<-DecisionNode, OpponentNode, ChanceNode hierarchy of classes that each wrap one GameState clone ?
2) If every node has a copy of a GameState, it's going to take a lot of memory, is that normal ?
3) I'm thinking about putting the OpponentModels inside the GameState as they will be updated everytime the GameStates change. Is that the best way ? I see that OpenTestBed puts them in a shared Config and calls Config.Model.AssumeTemporarily(GameState), I don't really understand what it's doing exactly there...
4) In OpenTestBed, I cannot see the ChanceNode, has it been optimized away ?
5) In the AkiRealbot (from the DarmstadtPokerBots), I see they have a hierarchy of nodes AND of arcs between the nodes. Any advantage to using arcs ? It seems wasteful to me, the nodes should have the arc information.
6) I understand that the main advantage of MCTS over my simple algorithm is that it will balance exploitation vs exploration. Any other advantages ?
I'm trying to get my head around it, and hopefully avoid some annoying design mistakes here.
I'm using C#, but I believe a Java viewpoint should be just as helpful.
Thanks guys,
Vlad