And You Don't Stop, a dmourati Production: May 2008

Sunday, May 25, 2008

Where Da Cash At

In the business world, the applications are more grown-up and mission-critical. One example is churn prediction, i.e. finding out whether a (say, wireless) customer will stay loyal to the current provider or move on ("churn") to a competitor (in which case the current provider could try to entice him/her to stay with appropriate promos). The customer data used for such churn prediction applications contains categorical (e.g. gender, education, occupation) and numerical (e.g. age, salary, FICO score, distance of residence from a metro) attributes/columns in a table. The data in these numerical columns can be widely dispersed and on different scales. For example, values within salary can range from tens of thousands to several millions, and two numerical attributes such as salary (30K - 2 mil) and age (1-100) sit on entirely different scales. Such disparity in scales, if left untreated, can throw many mining algorithms (distance-based ones in particular) out of whack: the attributes with a higher range of values start outweighing those with a lower range during the computation of the prediction. For such algorithms, the numerical data is normalized first. It is either rescaled to a small fixed range such as [0, 1] or [-1, 1] using min-max scaling, or standardized to zero mean and unit variance using the z-score transform (which centers and rescales the values but does not bound them to a fixed range). Either way, the algorithm can then handle the numerical columns uniformly.
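To make the normalization step concrete, here is a minimal sketch in plain Python. It is not taken from any particular mining tool; the function names `min_max_scale` and `z_score` and the sample salary and age values are made up for illustration. It shows both a rescale to a fixed range and a z-score standardization, so the difference between the two is visible.

```python
from statistics import fmean, pstdev


def min_max_scale(values, lo=0.0, hi=1.0):
    """Linearly rescale values into the range [lo, hi] (hypothetical helper)."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # A constant column carries no information; map it to the midpoint.
        return [(lo + hi) / 2.0 for _ in values]
    return [lo + (v - v_min) * (hi - lo) / span for v in values]


def z_score(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        # A constant column: every value equals the mean.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Illustrative data only: two columns on very different scales.
salaries = [30_000, 45_000, 120_000, 2_000_000]
ages = [22, 35, 48, 61]

scaled_salaries = min_max_scale(salaries)        # values in [0, 1]
scaled_ages = min_max_scale(ages)                # values in [0, 1]
centered_ages = min_max_scale(ages, -1.0, 1.0)   # values in [-1, 1]
standardized_salaries = z_score(salaries)        # mean 0, std 1, unbounded
```

After min-max scaling, salary and age both live in the same interval, so neither dominates a distance computation just because of its units. The z-score version keeps outliers (like the 2 mil salary) far from the rest, which may or may not be what you want.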

Saturday, May 31, 2008

Sequoia Century

Monday, May 26, 2008

Luck (PIC)

Sunday, May 18, 2008

Snowmachine Races Snowcats Downhill [pic]
