Poker-AI.org

Poker AI and Botting Discussion Forum
It is currently Mon Nov 13, 2023 1:25 pm

All times are UTC




Post new topic Reply to topic  [ 1 post ] 
Author Message
PostPosted: Mon Aug 12, 2013 7:48 pm 
Offline
Junior Member

Joined: Mon Aug 12, 2013 7:28 pm
Posts: 16
Hi there ! First post

I'm trying to implement the MCTS algorithm described in many papers.
I already have a simple MonteCarlo algorithm working where I do the selection and a bunch of expansions to get the average value of each actions and choose the best one.
Basically, I have a GameState class and to find the best action, I do:
Code:
foreach (action in state.PossibleActions)
  newState = state.Clone();
  for i=1 to 1000
    while (!newState.IsFinished)
      newState.ChooseActionOrBoardCard
    value = newState.PayoutForMe
  deduce the averageValueForMove
And basically choose the move that gives the best average value.

After reading a few papers on MCTS, I think I roughly understand the algorithm (this one was the best: http://www.ai.rug.nl/~mwiering/Tom_van_ ... Thesis.pdf)
My question is about how to implement it ? (I've looked a bit at https://code.google.com/p/opentestbed, but didn't understand it all)

1) Is it best to have a Node<-DecisionNode, OpponentNode, ChanceNode hierarchy of classes that each wrap one GameState clone ?
2) If every node has a copy of a GameState, it's going to take a lot of memory, is that normal ?
3) I'm thinking about putting the OpponentModels inside the GameState as they will be updated everytime the GameStates change. Is that the best way ? I see that OpenTestBed puts them in a shared Config and calls Config.Model.AssumeTemporarily(GameState), I don't really understand what it's doing exactly there...
4) In OpenTestBed, I cannot see the ChanceNode, has it been optimized away ?
5) In the AkiRealbot (from the DarmstadtPokerBots), I see they have a hierarchy of nodes AND of arcs between the nodes. Any advantage to using arcs ? It seems wasteful to me, the nodes should have the arc information.
6) I understand that the main advantage of MCTS over my simple algorithm is that it will balance exploitation vs exploration. Any other advantages ?

I'm trying to get my head around it, and hopefully avoid some annoying design mistakes here.
I'm using C#, but I believe a Java viewpoint should be just as helpful.

Thanks guys,
Vlad


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 1 post ] 

All times are UTC


Who is online

Users browsing this forum: No registered users and 2 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Powered by phpBB® Forum Software © phpBB Group