Saturday, May 31, 2008

Sequoia Century

Today I helped register people for the Sequoia Century ride put on by Western Wheelers. Tomorrow, I do the ride.

Sphere: Related Content

Monday, May 26, 2008

Luck (PIC)

Ok, so now how do we get out of here?

read more | digg story

Sphere: Related Content

Sunday, May 25, 2008

Where Da Cash At

In the business world, the applications are more grown-up and mission-critical. One example is churn prediction - i.e. finding out if a (say, wireless) customer would stay loyal with the current provider, or move on ("churn") to competitor (in which case, the current provider could try to entice him/her to stay with appropriate promos). The customer data used for such churn prediction applications contains categorical (e.g. gender, education, occupation) and numerical (e.g. age, salary, fico score, distance of residence from a metro) attributes/columns in a table. The data in these numeric columns will be widely dispersed, across different scales. For e.g. values within salary can be from 10s of thousands to several millions. Two numerical attributes will be in different scales - example salary (30K - 2 mil) vs age (1-100). Such disparity in scales, if left untreated, can throw most mining algorithms out of whack - the attributes with higher range of values will start outweighing those in the lower range during the computation of prediction. For such algorithms, the numerical data is normalized to a smaller range [-1, 1] or [0, 1] using the z-transform, to enable uniform handling of numerical data by the algorithm.

Sphere: Related Content