Poker-AI.org • View topic - Parallel/Multi-threaded Clustering?

All times are UTC

Parallel/Multi-threaded Clustering?

cantina

Post subject: Parallel/Multi-threaded Clustering?

Posted: Sat Sep 28, 2013 9:19 pm

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

Does anybody know of an algorithm available that can cluster large datasets in a multi-threaded fashion with comparable results to Xmeans? I found the below paper on PXM, but haven't found it implemented anywhere.

ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6324625&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D6324625

I found this code for multi-threaded k-means, but Xmeans is much better and more stable:
www.codethinked.com/multi-threaded-k-me ... -in-net-40

ibot

Post subject: Re: Parallel/Multi-threaded Clustering?

Posted: Thu Nov 21, 2013 12:24 am

Regular Member

Joined: Tue Mar 05, 2013 9:19 pm
Posts: 50

Any luck? About to start working more on clustering now. How are you working with the data?

cantina

Post subject: Re: Parallel/Multi-threaded Clustering?

Posted: Fri Nov 22, 2013 5:18 pm

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

I think I just went with the single-threaded implementation of Xmeans for the flop and FarthestFirst for the turn, using Weka. The data was based on past/future statistical projections for each hand, as mentioned in another thread. It didn't work well for me -- maybe too many dimensions to the data?

FarthestFirst is a fast but weak clusterer. Xmeans might have taken a month to cluster the turn.
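For readers unfamiliar with it, the farthest-first traversal idea can be sketched roughly as below. This is a minimal illustration of the general technique, not Weka's implementation; the function name `farthest_first` and its parameters are made up for the example. It needs only k passes over the data with no iterative refinement, which is why it is fast, and because each new center is simply the most distant point, outliers tend to become centers.

```python
import math
import random

def farthest_first(points, k, seed=0):
    """Pick k centers by farthest-first traversal, then label each point."""
    rng = random.Random(seed)
    # Start from one randomly chosen point.
    centers = [points[rng.randrange(len(points))]]
    # dist[i] = distance from points[i] to its nearest center chosen so far.
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:
        # The next center is the point farthest from all current centers.
        idx = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[idx])
        dist = [min(d, math.dist(p, points[idx])) for d, p in zip(dist, points)]
    # Assign every point to its nearest center.
    labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
    return centers, labels
```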

flopnflush

Post subject: Re: Parallel/Multi-threaded Clustering?

Posted: Tue Nov 26, 2013 11:46 am

Junior Member

Joined: Sat Nov 02, 2013 2:21 pm
Posts: 26

I have just adapted the KMeansPlusPlusClusterer from Apache Commons Math to support multithreading and point-frequencies (e.g. for suit isomorphisms). If anyone needs it: https://github.com/flopnflush/kmeans
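To illustrate what point frequencies change, here is a small sketch of one k-means iteration in which each point carries a weight saying how many identical hands it stands for. It is not taken from the linked repository; `weighted_kmeans_step` and its arguments are hypothetical names for this example. The only difference from plain k-means is that the centroid update becomes a weighted mean.

```python
import math

def weighted_kmeans_step(points, weights, centers):
    """One Lloyd iteration where points[i] stands for weights[i] identical points."""
    k, dim = len(centers), len(centers[0])
    sums = [[0.0] * dim for _ in range(k)]
    totals = [0.0] * k
    for p, w in zip(points, weights):
        # Assignment is unchanged: nearest center by Euclidean distance.
        c = min(range(k), key=lambda j: math.dist(p, centers[j]))
        totals[c] += w
        for d in range(dim):
            sums[c][d] += w * p[d]
    # Weighted mean per cluster; an empty cluster keeps its previous center.
    return [
        tuple(s / totals[j] for s in sums[j]) if totals[j] > 0 else tuple(centers[j])
        for j in range(k)
    ]
```

Clustering one representative per isomorphism class with its frequency gives the same centroids as clustering every duplicate, while touching far fewer points.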

proud2bBot

Post subject: Re: Parallel/Multi-threaded Clustering?

Posted: Tue Nov 26, 2013 9:43 pm

Senior Member

Joined: Mon Mar 11, 2013 10:24 pm
Posts: 216

If you are using KMeans/KMeans++, you should perform multiple runs (using different seeds) and choose the best clustering from all runs. I guess it's easier to execute each run in its own thread, as you don't need to change any existing algorithms.
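A rough sketch of that approach, assuming points are given as tuples of floats: an unmodified single-threaded k-means is run once per seed in a worker pool, and the run with the lowest within-cluster sum of squares is kept. The names `kmeans` and `best_of_runs` are invented for this example, and the initialization is plain random sampling rather than KMeans++.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

def kmeans(points, k, seed, max_iter=100):
    """Plain Lloyd's k-means with random initial centers; returns (cost, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda j: math.dist(p, centers[j]))].append(p)
        # Mean of each group; an empty group keeps its previous center.
        new_centers = [
            tuple(sum(xs) / len(g) for xs in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    # Cost = sum of squared distances to the nearest center.
    cost = sum(min(math.dist(p, c) for c in centers) ** 2 for p in points)
    return cost, centers

def best_of_runs(points, k, seeds, workers=4):
    """Run k-means once per seed in a pool and keep the lowest-cost result."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda s: kmeans(points, k, s), seeds))
    return min(results, key=lambda r: r[0])
```

Note that in CPython, threads will not speed up a pure-Python loop like this one because of the global interpreter lock; the sketch shows the structure only. For CPU-bound work a process pool (with a module-level function instead of the lambda) is the usual substitute, whereas on the JVM plain threads are enough.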

flopnflush

Post subject: Re: Parallel/Multi-threaded Clustering?

Posted: Tue Nov 26, 2013 11:04 pm

Junior Member

Joined: Sat Nov 02, 2013 2:21 pm
Posts: 26

Yes, that would have been the easier solution, but I have done it now anyway. Might still be the better solution if I run it on Amazon EC2 instances with many threads. And I think you also need more RAM if you perform multiple runs simultaneously.

I have added the MultiKMeansPlusPlusClusterer class to my code now, which performs multiple runs and chooses the best solution.

How many runs do you perform usually?

proud2bBot

Post subject: Re: Parallel/Multi-threaded Clustering?

Posted: Wed Nov 27, 2013 12:18 am

Senior Member

Joined: Mon Mar 11, 2013 10:24 pm
Posts: 216

I always performed something like 10-20 different runs, but it mainly depends on the data, so you can't name a good value in advance.

flopnflush

Post subject: Re: Parallel/Multi-threaded Clustering?

Posted: Thu Nov 28, 2013 11:04 pm

Junior Member

Joined: Sat Nov 02, 2013 2:21 pm
Posts: 26

The cool thing is that I can now run each trial in its own EC2 instance with 32 threads. This reduces the time needed for clustering tremendously.
