Poker-AI.org • View topic - Iteratively Modeling Irregularly Distributed Data Streams?

All times are UTC

Iteratively Modeling Irregularly Distributed Data Streams?

cantina

Post subject: Iteratively Modeling Irregularly Distributed Data Streams?

Posted: Thu May 16, 2013 1:54 am

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

The title says it all... How do you deal with regressing a model from data that is not uniformly distributed? In poker, for example, you see a lot of hands with an average value, but comparatively fewer hands with a great value. If I were to train a model to recognize hand strength given some pattern in the cards as they're observed at random, the function that is learned would be "compacted" towards the frequently observed, average hands in a nonlinear fashion. <-- What is the best way to avoid this? What if the real distribution isn't completely known? What if the frequency of values changes over time?

Magnum

Post subject: Re: Iteratively Modeling Irregularly Distributed Data Stream

Posted: Thu May 16, 2013 3:31 am

New Member

Joined: Sun Mar 10, 2013 12:18 am
Posts: 3

I think the exact method will depend on what type of model you are using. You can either generate duplicates of the "great hands" or remove a portion of the "average value hands". For a regression problem you might be able to do something like the following...

1. Find the distribution of your training data
2. Take a random sample of your training data, weighted by the inverse of this distribution

AFAIK, as long as your resampled training data is uniform, you should get the same predictions regardless of what the "real" distribution is.
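The two steps above can be sketched as follows. This is a minimal illustration rather than Magnum's exact method: it assumes the targets lie in [0, 1], estimates their distribution with a fixed-width histogram, and resamples with replacement. The function name `inverse_density_sample` and the parameters `n_bins` and `k` are hypothetical.

```python
import random
from collections import Counter

def inverse_density_sample(samples, n_bins=10, k=1000, rng=None):
    """Resample (x, y) pairs so that targets y in [0, 1] are closer to uniform.

    Step 1: estimate the target distribution with a histogram.
    Step 2: draw k pairs with replacement, each weighted by 1 / (its bin count).
    """
    rng = rng or random.Random()

    def bin_of(y):
        # Clamp so that y == 1.0 falls into the last bin.
        return min(int(y * n_bins), n_bins - 1)

    counts = Counter(bin_of(y) for _, y in samples)
    # Every occupied bin ends up with the same total weight.
    weights = [1.0 / counts[bin_of(y)] for _, y in samples]
    return rng.choices(samples, weights=weights, k=k)
```

Each occupied bin receives equal total weight, so rare values are drawn as often as common ones. Bins that contain no data stay empty, and heavy duplication of a few rare samples can encourage overfitting to them.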

longshot

Post subject: Re: Iteratively Modeling Irregularly Distributed Data Stream

Posted: Thu May 16, 2013 4:33 am

Junior Member

Joined: Thu Apr 11, 2013 10:13 pm
Posts: 22

I'm not really clear on what you're trying to predict here. What are the desired inputs and outputs of your regression model?

cantina

Post subject: Re: Iteratively Modeling Irregularly Distributed Data Stream

Posted: Thu May 16, 2013 5:22 am

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

longshot wrote:

What are the desired inputs and outputs of your regression model?

The inputs are numbers from 0..1, outputs are numbers from 0..1.

I thought about caching the stream, taking maybe 100k instances at a time, then using a model/training method that considered the global error rate of the data, like annealing or RPROP.

cantina

Post subject: Re: Iteratively Modeling Irregularly Distributed Data Stream

Posted: Thu May 16, 2013 5:30 am

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

Magnum wrote:

1. Find the distribution of your training data
2. Take a random sample of your training data, weighted by the inverse of this distribution

Well, it's a stream, so I can't do that outright. And, the distribution changes. Caching is the best thing I could think of ATM. I saw various papers on using interpolation methods for irregular point data distributions.

longshot

Post subject: Re: Iteratively Modeling Irregularly Distributed Data Stream

Posted: Thu May 16, 2013 6:40 am

Junior Member

Joined: Thu Apr 11, 2013 10:13 pm
Posts: 22

Nasher wrote:

longshot wrote:

What are the desired inputs and outputs of your regression model?

The inputs are numbers from 0..1, outputs are numbers from 0..1.

I thought about caching the stream, taking maybe 100k instances at a time, then using a model/training method that considered the global error rate of the data, like annealing or RPROP.

I don't see how annealing or RPROP would help with this. Based on my 30-second reading of the Wikipedia page, it seems like RPROP is good when you have the correct polarity and frequency of training, but maybe the actual signal is noisy. That doesn't seem to be your problem.

So if I understand, you're trying to predict an opponent's HS using a neural network, where you take in some features about the hand and output what the expected HS is. So what are you doing with folded hands? Are you just ignoring them, which is what's creating your bias?

If so, then what you want is to basically marginalize over all the possible hands. You probably would want to adjust your backprop learning rate to alpha * p(HC | History). Wouldn't that correct for the bias?
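A rough sketch of what scaling the learning rate per sample could look like. It swaps the neural network for a plain linear model to stay short, and it assumes the caller already has the factor `p` (standing in for p(HC | History)); how to compute that probability is not shown. The name `weighted_sgd_step` is hypothetical.

```python
def weighted_sgd_step(w, b, x, y, alpha, p):
    """One squared-error SGD step for y_hat = w . x + b.

    The effective step size is alpha * p, so samples with a small p
    move the parameters less than samples with a large p.
    """
    y_hat = sum(wi * xi for wi, xi in zip(w, x)) + b
    err = y_hat - y
    step = alpha * p
    new_w = [wi - step * err * xi for wi, xi in zip(w, x)]
    new_b = b - step * err
    return new_w, new_b
```

With `p = 0` the sample is ignored entirely, and with `p = 1` this reduces to an ordinary SGD step.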

cantina

Post subject: Re: Iteratively Modeling Irregularly Distributed Data Stream

Posted: Thu May 16, 2013 7:21 am

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

longshot wrote:

I don't see how annealing or RPROP would help with this.

It works better than something like simple back-prop.

longshot wrote:

So if I understand, you're trying to predict an opponent's HS using a neural network, where you take in some features about the hand and output what the expected HS is.

No, it's a different problem.

I might try splitting up the model. But, I really hate to do that. Maybe just keep a set of slots for the various distribution intervals and wait for each slot to be filled? But, again, I don't know what the upper/lower bounds will be, and they change.
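One way the slot idea could be sketched, under assumptions that are not in the thread: samples are slotted by a scalar key (here the target value), the slot boundaries are recomputed from the minimum and maximum currently in the cache so no bounds need to be known in advance, and the cache is capped so stale samples eventually fall out. The class name `SlotCache` and its parameters are hypothetical.

```python
from collections import deque

class SlotCache:
    """Cache streamed (x, y) pairs and release a batch once every slot is filled."""

    def __init__(self, n_slots=5, per_slot=3, max_cache=100_000):
        self.n_slots = n_slots
        self.per_slot = per_slot
        self.items = deque(maxlen=max_cache)  # oldest samples drop out first

    def add(self, x, y):
        self.items.append((x, y))

    def _slots(self):
        # Slot edges follow the current min/max, so they adapt as bounds change.
        ys = [y for _, y in self.items]
        lo, hi = min(ys), max(ys)
        width = (hi - lo) / self.n_slots or 1.0  # avoid zero width when hi == lo
        slots = [[] for _ in range(self.n_slots)]
        for x, y in self.items:
            i = min(int((y - lo) / width), self.n_slots - 1)
            slots[i].append((x, y))
        return slots

    def take_batch(self):
        """Return a balanced batch (newest per_slot items per slot), or None."""
        if not self.items:
            return None
        slots = self._slots()
        if any(len(s) < self.per_slot for s in slots):
            return None
        batch = [pair for s in slots for pair in s[-self.per_slot:]]
        self.items.clear()
        return batch
```

The obvious weakness is that a very rare slot may take a long time to fill, which delays training on everything else.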

longshot

Post subject: Re: Iteratively Modeling Irregularly Distributed Data Stream

Posted: Thu May 16, 2013 7:37 am

Junior Member

Joined: Thu Apr 11, 2013 10:13 pm
Posts: 22

Nasher wrote:

No, it's a different problem.

So... the $64,000 question: what's the problem?

Nasher wrote:

I might try splitting up the model. But, I really hate to do that. Maybe just keep a set of slots for the various distribution intervals and wait for each slot to be filled? But, again, I don't know what the upper/lower bounds will be, and they change.

So in the general case, you have some skewed distribution and you want to learn the regression model for the uniform distribution, online, in an incremental way so that you can do some small computation after seeing each sample from the skewed distribution. Is that right?

If so, then why not simply correct it using a fictitious sampling approach like Magnum suggested?
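For the online case, one possible reading of this correction is to keep a running histogram of the stream and give each incoming sample a weight inversely proportional to how often its bin has been seen. The sketch below assumes scalar values in [0, 1] and adds exponential decay so the estimate can follow a distribution that changes over time. The class name `DecayedHistogram` and the parameters `n_bins` and `decay` are hypothetical.

```python
class DecayedHistogram:
    """Running histogram over [0, 1] that gradually forgets old samples."""

    def __init__(self, n_bins=10, decay=0.999):
        self.n_bins = n_bins
        self.decay = decay
        self.counts = [0.0] * n_bins

    def update(self, y):
        """Record y and return its importance weight.

        The weight is 1.0 when the stream is uniform so far, above 1.0 for
        values seen less often than that, and below 1.0 for common values.
        """
        self.counts = [c * self.decay for c in self.counts]
        i = min(int(y * self.n_bins), self.n_bins - 1)
        self.counts[i] += 1.0
        freq = self.counts[i] / sum(self.counts)
        return 1.0 / (freq * self.n_bins)
```

The returned weight could be used to scale a per-sample training step, or normalized into an acceptance probability for deciding which samples to train on. A fixed-width histogram like this is only practical for low-dimensional values.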

spears

Post subject: Re: Iteratively Modeling Irregularly Distributed Data Stream

Posted: Thu May 16, 2013 7:38 am

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

Weight the data according to pot or winnings?

cantina

Post subject: Re: Iteratively Modeling Irregularly Distributed Data Stream

Posted: Thu May 16, 2013 3:28 pm

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

Good idea, spears. But, the winnings may not be uniformly distributed based on hand value. Or would it... :twisted:

cantina

Post subject: Re: Iteratively Modeling Irregularly Distributed Data Stream

Posted: Thu May 16, 2013 3:42 pm

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

longshot wrote:

I think you're misunderstanding. Think of it like various points on a 2D map. I get x/y coordinates, and the associated elevation for that point, one at a time (in a streamed fashion). I'm trying to model the elevation based on the coordinates; however, most of the samples are from one small area of the map, and the others are from the surrounding regions. Let's say 90% of the point data represents 10% of the area. <-- That's the irregular distribution I'm talking about. Now, consider that my point data is from the melting polar cap in Antarctica, where the elevation slowly changes over time. Also, consider that the sample locations change as well, so the 90% that represents 10% is no longer in the same x/y region it was before. Now, also consider that we're in a different Universe, where point data isn't 2D but 34D and elevation isn't 1D but 12D. This is the problem I'm working on.

trojanrabbit

Post subject: Re: Iteratively Modeling Irregularly Distributed Data Stream

Posted: Thu May 16, 2013 4:46 pm

Junior Member

Joined: Fri Apr 05, 2013 2:21 am
Posts: 11

Maybe use kNN with the weights decreasing with distance?

Tysen
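A minimal sketch of distance-weighted kNN regression, assuming inverse-distance weights and a scalar output for brevity (a multi-dimensional output would apply the same weights to each output component). The function name `knn_predict` and the `eps` guard are hypothetical choices.

```python
import math

def knn_predict(points, query, k=3, eps=1e-9):
    """Predict a value at `query` from stored (coords, value) pairs.

    Takes the k nearest stored points and averages their values with
    weights 1 / (distance + eps), so closer neighbours count for more.
    """
    nearest = sorted(points, key=lambda p: math.dist(p[0], query))[:k]
    weights = [1.0 / (math.dist(coords, query) + eps) for coords, _ in nearest]
    total = sum(w * value for w, (_, value) in zip(weights, nearest))
    return total / sum(weights)
```

Because the prediction depends only on local neighbours, a densely sampled region does not pull predictions in sparse regions toward its own values. The linear scan here is for illustration; a real stream would need a bounded store and a faster neighbour search.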

cantina

Post subject: Re: Iteratively Modeling Irregularly Distributed Data Stream

Posted: Thu May 16, 2013 7:07 pm

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

Will that work for streamed data?

I suppose I could: cache, cluster, train.

Last edited by cantina on Fri May 17, 2013 12:23 am, edited 1 time in total.

sn0w

Post subject: Re: Iteratively Modeling Irregularly Distributed Data Stream

Posted: Thu May 16, 2013 7:54 pm

Junior Member

Joined: Wed Mar 06, 2013 3:58 am
Posts: 10

You could use a Kalman filter (or any other Bayes filter: variations of the KF, particle filters, grid estimators, etc.); it works pretty well with streamed data. However, all those smart filters require a motion/observation model of the system, which is hard to derive analytically. But even with a poor (e.g. constant or linear) motion model, you will get a good estimate of the posterior distribution in some cases, just because of the nature of recursive estimation.
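To make the recursive estimation concrete, here is a minimal scalar Kalman filter using the "constant" motion model mentioned above: the state is assumed not to move between observations, apart from a small process noise. The noise values `q` and `r` and the initial guesses `x0` and `p0` are illustrative assumptions, not tuned for any particular problem.

```python
def kalman_1d(observations, q=1e-4, r=0.01, x0=0.0, p0=1.0):
    """Track a scalar state from noisy observations with a constant motion model.

    q: process noise variance (how much the state may drift per step)
    r: observation noise variance
    Returns the state estimate after each observation.
    """
    x, p = x0, p0
    estimates = []
    for z in observations:
        p = p + q            # predict: state unchanged, uncertainty grows
        k = p / (p + r)      # Kalman gain: how much to trust the new observation
        x = x + k * (z - x)  # update the estimate toward the observation
        p = (1.0 - k) * p    # uncertainty shrinks after the update
        estimates.append(x)
    return estimates
```

This tracks a single quantity observed repeatedly over time, which is the setting the filter is designed for.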

cantina

Post subject: Re: Iteratively Modeling Irregularly Distributed Data Stream

Posted: Thu May 16, 2013 10:12 pm

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

I don't receive a stream of information about a single point, rather I receive a stream of information about random points. So, I don't see how I would apply the Kalman filter? It would eventually just give me the (weighted) average of all point coordinates, would it not? The point coordinates (inputs) are not noisy, they are precise, they're just unevenly distributed.

ibot

Post subject: Re: Iteratively Modeling Irregularly Distributed Data Stream

Posted: Sat May 18, 2013 10:58 pm

Regular Member

Joined: Tue Mar 05, 2013 9:19 pm
Posts: 50

trojanrabbit wrote:

Maybe use kNN with the weights decreasing with distance?

Tysen

An Adaptive Nearest Neighbor Classification Algorithm for Data Streams
Had a quick glance over but seems quite clear and may have some useful information.
What about PCA or similar techniques relating to dimensionality problems?

Looks like the problem is with the rarity of some data - also looks like the main research in the cases of rare data comes from the medical side of things.
Mining With Rare Cases is more general but has a few ideas that could be looked into.

cantina

Post subject: Re: Iteratively Modeling Irregularly Distributed Data Stream

Posted: Sat May 18, 2013 11:42 pm

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

Found this:
http://sourceforge.net/projects/moa-datastream/
