longshot wrote:
So in the general case, you have some skewed distribution and you want to learn the regression model for the uniform distribution, online, in an incremental way so that you can do some small computation after seeing each sample from the skewed distribution. Is that right?
If so, then why not simply correct it using a fictitious sampling approach like Magnum suggested?
I think you're misunderstanding. Think of it as various points on a 2D map. I get x/y coordinates and the associated elevation for each point, one at a time (in a streamed fashion). I'm trying to model the elevation as a function of the coordinates; however, most of the samples come from one small area of the map, and the rest come from the surrounding regions. Let's say 90% of the point data represents 10% of the area. That's the irregular distribution I'm talking about. Now, consider that my point data is from the melting polar cap in Antarctica, where the elevation slowly changes over time. Also, consider that the sample locations change as well, so the 90% of the samples that represents 10% of the area is no longer in the same x/y region it was before. Now, also consider that we're in a different universe, where point data isn't 2D but 34D and elevation isn't 1D but 12D. This is the problem I'm working on.
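To make the setup concrete, here is a rough sketch of a synthetic stream with the same properties. It is only an illustration, not my actual data: the function name `skewed_stream`, the unit-cube domain, the box-shaped dense region, the linear target, and the random-walk drift are all assumptions made up for the example.

```python
import random


def skewed_stream(n_samples, in_dim=34, out_dim=12, hot_prob=0.9,
                  hot_volume=0.1, drift=1e-3, seed=0):
    """Yield (x, y) pairs from a hypothetical skewed, drifting stream.

    Inputs live in the unit cube [0, 1]^in_dim. With probability hot_prob a
    sample is drawn from a small box covering hot_volume of the cube;
    otherwise it is drawn uniformly from the whole cube (so a few of those
    also land inside the box). Both the box position and the target
    function drift slowly after every sample.
    """
    rng = random.Random(seed)
    # Edge length of a box whose volume is hot_volume of the unit cube.
    side = hot_volume ** (1.0 / in_dim)
    # Lower corner of the dense box, kept inside the cube.
    corner = [rng.uniform(0.0, 1.0 - side) for _ in range(in_dim)]
    # Assumed target: a linear map from inputs to outputs (out_dim x in_dim).
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)]
               for _ in range(out_dim)]

    for _ in range(n_samples):
        if rng.random() < hot_prob:
            x = [c + rng.uniform(0.0, side) for c in corner]
        else:
            x = [rng.random() for _ in range(in_dim)]
        y = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
        yield x, y

        # The dense region wanders (sample locations change over time).
        corner = [min(max(c + rng.gauss(0.0, drift), 0.0), 1.0 - side)
                  for c in corner]
        # The target itself changes slowly (the "melting" elevation).
        for row in weights:
            for j in range(in_dim):
                row[j] += rng.gauss(0.0, drift)


if __name__ == "__main__":
    for x, y in skewed_stream(3):
        print(len(x), len(y))
```

Whatever learner gets used has to consume these pairs one at a time, and it never knows in advance where the dense region currently sits or how far the target has moved.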