Poker-AI.org Poker AI and Botting Discussion Forum 2013-05-18T23:42:21+00:00 http://poker-ai.org/phpbb/feed.php?f=24&t=2494 2013-05-18T23:42:21+00:00 2013-05-18T23:42:21+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4203#p4203 <![CDATA[Re: Iteratively Modeling Irregularly Distributed Data Stream]]> http://sourceforge.net/projects/moa-datastream/

Statistics: Posted by cantina — Sat May 18, 2013 11:42 pm


]]>
2013-05-18T22:58:08+00:00 2013-05-18T22:58:08+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4202#p4202 <![CDATA[Re: Iteratively Modeling Irregularly Distributed Data Stream]]> trojanrabbit wrote:

Maybe use kNN with the weights decreasing with distance?

Tysen

An Adaptive Nearest Neighbor Classification Algorithm for Data Streams
I only had a quick glance over it, but it seems quite clear and may have some useful information.
What about PCA or similar techniques relating to dimensionality problems?

It looks like the problem is the rarity of some of the data. Most of the research on rare data also seems to come from the medical side of things.
Mining With Rare Cases is more general but has a few ideas that could be looked into.
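
For reference, a minimal sketch of the distance-weighted kNN idea quoted above, assuming Euclidean distance and 1/d weights (neither is specified in the thread). The function name `knn_predict` and its parameters are made up for illustration.

```python
import math

def knn_predict(samples, query, k=3, eps=1e-9):
    """Distance-weighted kNN regression (illustrative sketch).

    samples: list of (x, y) pairs, where x and y are sequences of floats.
    Returns the weighted mean of the y vectors of the k nearest samples,
    using weights 1 / (distance + eps) so closer points count more.
    """
    if not samples:
        raise ValueError("need at least one sample")
    # Sort by Euclidean distance to the query and keep the k closest.
    nearest = sorted(
        ((math.dist(x, query), y) for x, y in samples),
        key=lambda pair: pair[0],
    )[:k]
    weights = [1.0 / (d + eps) for d, _ in nearest]
    total = sum(weights)
    dims = len(nearest[0][1])
    return [
        sum(w * y[j] for w, (_, y) in zip(weights, nearest)) / total
        for j in range(dims)
    ]
```

For a drifting stream, the samples could be kept in a `collections.deque(maxlen=...)` so that old points expire. That is one possible design choice, not something proposed in the thread.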

Statistics: Posted by ibot — Sat May 18, 2013 10:58 pm


]]>
2013-05-16T22:12:50+00:00 2013-05-16T22:12:50+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4194#p4194 <![CDATA[Re: Iteratively Modeling Irregularly Distributed Data Stream]]> Statistics: Posted by cantina — Thu May 16, 2013 10:12 pm


]]>
2013-05-16T19:54:35+00:00 2013-05-16T19:54:35+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4192#p4192 <![CDATA[Re: Iteratively Modeling Irregularly Distributed Data Stream]]> Statistics: Posted by sn0w — Thu May 16, 2013 7:54 pm


]]>
2013-05-17T00:23:08+00:00 2013-05-16T19:07:02+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4191#p4191 <![CDATA[Re: Iteratively Modeling Irregularly Distributed Data Stream]]>
I suppose I could: cache, cluster, train.

Statistics: Posted by cantina — Thu May 16, 2013 7:07 pm


]]>
2013-05-16T16:46:32+00:00 2013-05-16T16:46:32+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4189#p4189 <![CDATA[Re: Iteratively Modeling Irregularly Distributed Data Stream]]>
Tysen

Statistics: Posted by trojanrabbit — Thu May 16, 2013 4:46 pm


]]>
2013-05-16T15:42:46+00:00 2013-05-16T15:42:46+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4186#p4186 <![CDATA[Re: Iteratively Modeling Irregularly Distributed Data Stream]]> longshot wrote:

So in the general case, you have some skewed distribution and you want to learn the regression model for the uniform distribution, online, in an incremental way so that you can do some small computation after seeing each sample from the skewed distribution. Is that right?

If so, then why not simply correct it using a fictitious sampling approach like Magnum suggested?


I think you're misunderstanding. Think of it like various points on a 2D map. I get x/y coordinates and the associated elevation for each point, one at a time (in a streamed fashion). I'm trying to model the elevation based on the coordinates. However, most of the samples are from one small area of the map, and the others are from the various surrounding regions. Let's say 90% of the point data represents 10% of the area. <-- That's the irregular distribution I'm talking about. Now, consider that my point data is from the melting polar cap in Antarctica, where the elevation slowly changes over time. Also, consider that the sample locations change as well, so the 90% that represents 10% is no longer in the same x/y region it was before. Now, also consider that we're in a different Universe, where point data isn't 2D but 34D and elevation isn't 1D but 12D. This is the problem I'm working on.

Statistics: Posted by cantina — Thu May 16, 2013 3:42 pm


]]>
2013-05-16T15:28:45+00:00 2013-05-16T15:28:45+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4184#p4184 <![CDATA[Re: Iteratively Modeling Irregularly Distributed Data Stream]]>

Statistics: Posted by cantina — Thu May 16, 2013 3:28 pm


]]>
2013-05-16T07:38:36+00:00 2013-05-16T07:38:36+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4176#p4176 <![CDATA[Re: Iteratively Modeling Irregularly Distributed Data Stream]]> Statistics: Posted by spears — Thu May 16, 2013 7:38 am


]]>
2013-05-16T07:37:03+00:00 2013-05-16T07:37:03+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4175#p4175 <![CDATA[Re: Iteratively Modeling Irregularly Distributed Data Stream]]> Nasher wrote:

No, it's a different problem.


So... the $64,000 question: what's the problem?

Nasher wrote:

I might try splitting up the model. But, I really hate to do that. Maybe just keep a set of slots for the various distribution intervals and wait for each slot to be filled? But, again, I don't know what the upper/lower bounds will be, and they change.


So in the general case, you have some skewed distribution and you want to learn the regression model for the uniform distribution, online, in an incremental way so that you can do some small computation after seeing each sample from the skewed distribution. Is that right?

If so, then why not simply correct it using a fictitious sampling approach like Magnum suggested?

Statistics: Posted by longshot — Thu May 16, 2013 7:37 am


]]>
2013-05-16T07:21:59+00:00 2013-05-16T07:21:59+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4173#p4173 <![CDATA[Re: Iteratively Modeling Irregularly Distributed Data Stream]]> longshot wrote:

I don't see how annealing or RPROP would help with this.

It works better than something like simple back-prop.

longshot wrote:

So if I understand, you're trying to predict an opponent's HS using a neural network, where you take in some features about the hand and output what the expected HS is.

No, it's a different problem.

I might try splitting up the model. But, I really hate to do that. Maybe just keep a set of slots for the various distribution intervals and wait for each slot to be filled? But, again, I don't know what the upper/lower bounds will be, and they change.
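
A rough sketch of the "slots" idea, assuming each sample can be reduced to a scalar key in 0..1 and that the interval bounds are fixed. It does not address the moving bounds mentioned above. `SlotBuffer`, `n_slots` and `per_slot` are hypothetical names.

```python
from collections import deque

class SlotBuffer:
    """Hold a few recent samples per interval; emit a balanced batch
    once every interval has been filled (illustrative sketch)."""

    def __init__(self, n_slots=4, per_slot=2):
        self.n_slots = n_slots
        self.per_slot = per_slot
        self._reset()

    def _reset(self):
        # deque(maxlen=...) keeps only the newest samples in each slot.
        self.slots = [deque(maxlen=self.per_slot) for _ in range(self.n_slots)]

    def _index(self, key):
        # Assumes key is in [0, 1]; clamp so the edges land in valid slots.
        return max(0, min(int(key * self.n_slots), self.n_slots - 1))

    def add(self, key, sample):
        """Store a sample; return a balanced batch when all slots are full,
        otherwise return None."""
        self.slots[self._index(key)].append(sample)
        if all(len(slot) == self.per_slot for slot in self.slots):
            batch = [s for slot in self.slots for s in slot]
            self._reset()
            return batch
        return None
```

Samples from the over-represented interval simply overwrite each other while the buffer waits for the rare intervals, so each released batch covers all slots equally.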

Statistics: Posted by cantina — Thu May 16, 2013 7:21 am


]]>
2013-05-16T06:40:14+00:00 2013-05-16T06:40:14+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4172#p4172 <![CDATA[Re: Iteratively Modeling Irregularly Distributed Data Stream]]> Nasher wrote:

longshot wrote:
What are the desired inputs and outputs of your regression model?

The inputs are numbers from 0..1, outputs are numbers from 0..1. :)

I thought about caching the stream, taking maybe 100k instances at a time, then using a model/training method that considered the global error rate of the data, like annealing or RPROP.


I don't see how annealing or RPROP would help with this. Based on my 30s reading of the Wikipedia page, it seems like RPROP is good when you have the correct polarity and frequency of training, but maybe the actual signal is noisy. That doesn't seem to be your problem.

So if I understand, you're trying to predict an opponent's HS using a neural network, where you take in some features about the hand and output what the expected HS is. So what are you doing with folded hands? Are you just ignoring them, which is what's creating your bias?

If so, then what you want is to basically marginalize over all the possible hands. You probably would want to adjust your backprop learning rate to alpha * p(HC | History). Wouldn't that correct for the bias?
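
A minimal sketch of what scaling the learning rate per sample could look like. A linear model with squared error stands in for the neural network, and `sample_weight` is where a term like p(HC | History) would go. The function `sgd_step` and its arguments are illustrative assumptions, not code from this thread.

```python
def sgd_step(weights, x, y, alpha, sample_weight):
    """One SGD step on squared error for a linear model y_hat = w . x.

    The effective learning rate is alpha * sample_weight, so samples
    with a small weight move the model less (illustrative sketch).
    """
    y_hat = sum(w * xi for w, xi in zip(weights, x))
    err = y_hat - y
    lr = alpha * sample_weight
    # Gradient of 0.5 * err**2 with respect to w_i is err * x_i.
    return [w - lr * err * xi for w, xi in zip(weights, x)]
```

A weight of 0 leaves the model unchanged and a weight of 1 gives the plain update, so the weight acts as a per-sample correction for how over- or under-represented that sample is.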

Statistics: Posted by longshot — Thu May 16, 2013 6:40 am


]]>
2013-05-16T05:30:27+00:00 2013-05-16T05:30:27+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4171#p4171 <![CDATA[Re: Iteratively Modeling Irregularly Distributed Data Stream]]> Magnum wrote:

1. Find the distribution of your training data
2. Take a random sample of your training data, weighted by the inverse of this distribution

Well, it's a stream, so I can't do that outright. And, the distribution changes. Caching is the best thing I could think of ATM. I saw various papers on using interpolation methods for irregular point data distributions.
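
One possible streaming variant of the inverse-distribution weighting, sketched under the assumption that each sample can be mapped to some coarse region label (a grid cell, a cluster id, etc.). The exponential decay is an added assumption to let the estimate follow a changing distribution. `DecayedDensity` and `decay` are hypothetical names.

```python
from collections import defaultdict

class DecayedDensity:
    """Track how often each region is seen, fading old counts so the
    estimate follows a drifting stream (illustrative sketch)."""

    def __init__(self, decay=0.99):
        self.decay = decay
        self.counts = defaultdict(float)
        self.total = 0.0

    def update(self, region):
        """Record one sample in `region` and return its weight.

        The weight is (uniform share) / (observed share): above 1 for
        rare regions, below 1 for over-sampled ones.
        """
        # Fade old evidence; O(number of regions) per sample, which is
        # fine for a sketch but not tuned for speed.
        for key in self.counts:
            self.counts[key] *= self.decay
        self.total = self.total * self.decay + 1.0
        self.counts[region] += 1.0
        share = self.counts[region] / self.total
        uniform_share = 1.0 / len(self.counts)
        return uniform_share / share
```

The returned weight could scale a learning rate or serve as an acceptance probability when deciding whether to train on the sample. With 34 input dimensions a plain grid will not scale, so the region label would have to come from something coarser.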

Statistics: Posted by cantina — Thu May 16, 2013 5:30 am


]]>
2013-05-16T05:22:08+00:00 2013-05-16T05:22:08+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4170#p4170 <![CDATA[Re: Iteratively Modeling Irregularly Distributed Data Stream]]> longshot wrote:

What are the desired inputs and outputs of your regression model?

The inputs are numbers from 0..1, outputs are numbers from 0..1. :)

I thought about caching the stream, taking maybe 100k instances at a time, then using a model/training method that considered the global error rate of the data, like annealing or RPROP.

Statistics: Posted by cantina — Thu May 16, 2013 5:22 am


]]>
2013-05-16T04:33:30+00:00 2013-05-16T04:33:30+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4168#p4168 <![CDATA[Re: Iteratively Modeling Irregularly Distributed Data Stream]]> Statistics: Posted by longshot — Thu May 16, 2013 4:33 am


]]>
2013-05-16T03:31:45+00:00 2013-05-16T03:31:45+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4167#p4167 <![CDATA[Re: Iteratively Modeling Irregularly Distributed Data Stream]]>
1. Find the distribution of your training data
2. Take a random sample of your training data, weighted by the inverse of this distribution

AFAIK, as long as your training data is uniform, you should be able to predict the same regardless of what the "real" distribution is.
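
The two steps above can be sketched roughly as follows for a cached batch, assuming the distribution is estimated by counting samples per region. `inverse_density_resample` and `region_of` are hypothetical names, and `region_of` is whatever binning you choose.

```python
import random
from collections import Counter

def inverse_density_resample(data, region_of, n, rng=None):
    """Draw n items (with replacement) so that every occupied region is
    about equally likely, however skewed `data` is (illustrative sketch).

    region_of: function mapping an item to a hashable region label.
    """
    rng = rng or random.Random()
    # Step 1: estimate the distribution by counting items per region.
    counts = Counter(region_of(item) for item in data)
    # Step 2: weight each item by the inverse of its region's count.
    weights = [1.0 / counts[region_of(item)] for item in data]
    return rng.choices(data, weights=weights, k=n)
```

For example, with 90 points in one half of the range and 10 in the other, the resampled set comes out close to 50/50.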

Statistics: Posted by Magnum — Thu May 16, 2013 3:31 am


]]>
2013-05-16T01:54:34+00:00 2013-05-16T01:54:34+00:00 http://poker-ai.org/phpbb/viewtopic.php?t=2494&p=4166#p4166 <![CDATA[Iteratively Modeling Irregularly Distributed Data Streams?]]> Statistics: Posted by cantina — Thu May 16, 2013 1:54 am


]]>