Institute of Mathematics and Informatics, Bulgarian Academy of Sciences
Serdica Mathematical Journal, Vol. 35, No. 1 (2009), pp. 109-116
Locally Linear Embedding (LLE) has gained prominence as a tool for unsupervised non-linear dimensionality reduction. The algorithm aims to preserve certain proximity relations between the observed points, but this may not be desirable when the higher-dimensional shape we are trying to capture is observed with noise. This note suggests that a desirable first step is to remove, or at least reduce, the noise in the observations before applying the LLE algorithm. Careful denoising requires knowledge of (i) the level of noise, (ii) the local sampling density, and (iii) the local curvature at the point in question, but in most practical situations such information is not readily available. Under the model we discuss, simply averaging the neighboring points reduces the noise and is easy to implement. We use the Swiss roll example to illustrate how well this procedure works. Finally, we apply these ideas to biological data and perform clustering after this two-step procedure of denoising and dimension reduction.
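The denoising step described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`knn_average`, `noisy_circle`, `mean_radial_error`), the choice of k = 10, the Euclidean metric, the inclusion of each point in its own average, and the use of a noisy unit circle as a simplified stand-in for the Swiss roll are all assumptions made for illustration. The circle is chosen only because the distance of a point from the noise-free shape can be computed exactly.

```python
import math
import random


def knn_average(points, k):
    """Replace each point by the mean of itself and its k nearest neighbors.

    Assumption: Euclidean distance, and the point itself is part of the average.
    """
    denoised = []
    for p in points:
        # Sort all points by Euclidean distance to p; p itself comes first.
        nearest = sorted(points, key=lambda q: math.dist(p, q))[: k + 1]
        dim = len(p)
        denoised.append(
            tuple(sum(q[d] for q in nearest) / len(nearest) for d in range(dim))
        )
    return denoised


def noisy_circle(n, sigma, seed=0):
    """Sample n points on the unit circle and add Gaussian noise to each coordinate."""
    rng = random.Random(seed)
    pts = []
    for _ in range(n):
        t = rng.uniform(0.0, 2.0 * math.pi)
        pts.append(
            (math.cos(t) + rng.gauss(0.0, sigma), math.sin(t) + rng.gauss(0.0, sigma))
        )
    return pts


def mean_radial_error(points):
    """Mean distance of the points from the unit circle (the noise-free shape)."""
    return sum(abs(math.hypot(x, y) - 1.0) for x, y in points) / len(points)


noisy = noisy_circle(n=300, sigma=0.05)
smoothed = knn_average(noisy, k=10)
error_before = mean_radial_error(noisy)
error_after = mean_radial_error(smoothed)
print(f"mean distance from circle: before={error_before:.4f}, after={error_after:.4f}")
```

In a two-step pipeline of the kind described, the smoothed points would then be passed to an LLE implementation, for example scikit-learn's `LocallyLinearEmbedding`, in place of the raw observations. The neighborhood size trades noise reduction against the bias that averaging introduces where the shape is strongly curved, which is why the noise level, sampling density and curvature matter for more careful denoising.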