Usenet.com


Sci Thread Archive from Usenet.com


Re: MDS with different matrices



Aleks Jakulin wrote:
"Rich Ulrich" <[EMAIL PROTECTED]> wrote:

Has somebody cared enough about MDS  to update the
computer programs?  It's long been my impression that
'marketing' was using MDS.  From google, it also seems
like MDS  sometimes is included in the tools of data mining.


MDS has become slightly outdated, but there has been some good recent work in
the same direction, primarily locally linear embedding. The problem with MDS
is globality: in most stress functions, small dissimilarities are approximated
with the same precision as large ones. In reality, what a human analyst
expects from MDS is a structure akin to clustering: one that identifies groups
of similar objects.
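
To make the globality point concrete, here is a minimal sketch assuming an unweighted raw stress, i.e. a sum of squared residuals between dissimilarities and embedded distances. The function name and the numbers are made up for illustration. The same absolute error costs the same whether the dissimilarity is small or large, although the relative error differs by a factor of ten.

```python
def raw_stress_terms(dissim, dist):
    """Per-pair squared residuals (d - delta)^2 of an unweighted raw stress."""
    return [(d - delta) ** 2 for delta, d in zip(dissim, dist)]

# Hypothetical data: one small and one large dissimilarity,
# each reproduced with the same absolute error of 0.5.
dissimilarities = [1.0, 10.0]
embedded = [1.5, 10.5]

terms = raw_stress_terms(dissimilarities, embedded)
relative_errors = [abs(d - delta) / delta
                   for delta, d in zip(dissimilarities, embedded)]

print(terms)            # both pairs contribute equally to the stress
print(relative_errors)  # but the small dissimilarity is distorted far more
```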

The 'old' solution was Shepard's non-metric MDS. NMDS transforms the
dissimilarities so that the fitted distances preserve their rank order rather
than their metric values. For example, we are not interested in the exact
distances between A, B and C, but we do want distance A-C to be greater than
A-B if C is more dissimilar from A than is B. It does not work well in
practice, unfortunately. Today, new methods have emerged. For example,
locally linear embedding evaluates only the distances from an object to K of
its nearest neighbors. This yields nice results.
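
The rank-order criterion can be sketched as a count of pairs whose order is reversed in the embedding. This is only the criterion, not an NMDS fitting algorithm; the function name and the toy A, B, C data are assumptions for illustration.

```python
from itertools import combinations
import math

def rank_violations(dissim, coords):
    """Count pairs of object-pairs whose dissimilarity order is
    reversed by the embedded distances (ties are not counted)."""
    n = len(coords)
    pairs = list(combinations(range(n), 2))
    embedded = {(i, j): math.dist(coords[i], coords[j]) for i, j in pairs}
    violations = 0
    for (a, b), (c, d) in combinations(pairs, 2):
        delta_diff = dissim[a][b] - dissim[c][d]
        dist_diff = embedded[(a, b)] - embedded[(c, d)]
        if delta_diff * dist_diff < 0:  # opposite signs: order reversed
            violations += 1
    return violations

# Hypothetical dissimilarities for objects A, B, C (indices 0, 1, 2):
# A-B = 1, A-C = 5, B-C = 3, so C is more dissimilar from A than B is.
dissim = [[0, 1, 5],
          [1, 0, 3],
          [5, 3, 0]]

good = [(0.0,), (1.0,), (2.0,)]  # A-C farther than A-B: order respected
bad = [(0.0,), (2.0,), (1.0,)]   # A-B farther than A-C: order reversed

violations_good = rank_violations(dissim, good)
violations_bad = rank_violations(dissim, bad)
```

The first configuration distorts the metric values (A-C is embedded at 2 rather than 5) but keeps the ordering, which is all the non-metric criterion asks for.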

Good starting points for further exploration are
http://www.cs.toronto.edu/~roweis/lle/
http://basis.stanford.edu/carrie-web/

Aleks



Forrest Young still distributes the Fortran code and DOS executable of Alscal, the descendant of KYST, the Bell Labs NMDS program. MDS, for some reason, is often used as an acronym for NMDS. I follow those who restrict MDS to Shepard's metric scaling, which is identical to Gower's principal coordinates analysis.
http://forrest.psych.unc.edu/


MDS still has huge advantages if you have access to the cases-by-variables matrix. A variety of transformations can be applied to the data matrix prior to doing the metric scaling. Pierre Legendre & I wrote a paper in 2001 that describes a number of these transformations. Pierre provides code on his web page, for Mac & PC, for doing the transformations and PCAs, and I provide Matlab 4 & Matlab 6 code for doing the same:
http://www.es.umb.edu/edgwebp.htm#LegGallMat6
Strictly speaking, the MDS model (not the NMDS model) can have problems with non-metric distances, producing negative eigenvalues. Legendre & Legendre (1998) Numerical Ecology, 2nd ed., review 3 solutions to this problem, and their algorithms are programmed in Pierre's and my programs.
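
A minimal sketch of how the negative eigenvalues arise, assuming NumPy and the usual Gower double-centring B = -1/2 J D^2 J. The three-object distance matrix is made up and deliberately violates the triangle inequality. The remedy shown, adding a constant to the squared off-diagonal distances (often called the Lingoes correction), is one common approach and is not claimed to be the exact code in the programs mentioned above.

```python
import numpy as np

def pcoa_eigen(D):
    """Eigen-decomposition of the Gower-centred matrix B = -1/2 J D^2 J."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centring matrix
    B = -0.5 * J @ (D ** 2) @ J
    vals, vecs = np.linalg.eigh(B)        # B is symmetric
    order = np.argsort(vals)[::-1]        # largest eigenvalue first
    return vals[order], vecs[:, order]

# Hypothetical non-metric distances: d(0,2) = 3 > d(0,1) + d(1,2) = 2.
D = np.array([[0.0, 1.0, 3.0],
              [1.0, 0.0, 1.0],
              [3.0, 1.0, 0.0]])
vals, _ = pcoa_eigen(D)
print(vals)  # the smallest eigenvalue is negative

# Add 2c to every squared off-diagonal distance, with c = |most negative
# eigenvalue|. This shifts the eigenvalues of B by c on the centred subspace.
c = abs(vals.min())
D_corr = np.sqrt(D ** 2 + 2 * c)
np.fill_diagonal(D_corr, 0.0)
vals_corr, _ = pcoa_eigen(D_corr)
print(vals_corr)  # no eigenvalue is negative beyond rounding error
```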
Often it is not the ordination of the cases that is important, but explaining why the cases take the positions they do in low-dimension space. The Gabriel Euclidean distance biplot and correlation biplot are important tools for interpreting the low-dimension ordination. Gower's book 'Biplots' is the best overall description of the process.
MDS has also been revived by the use of constrained ordination techniques. Canonical correspondence analysis can impose the condition that the distances among cases in low-dimension space must either be linear functions of a set of external explanatory variables, or be uncorrelated with a set of covariates (partial canonical correspondence analysis). When preserving Euclidean distances, as in the MDS model, the technique is called redundancy analysis. Both redundancy analysis and canonical correspondence analysis (not to be confused with canonical correlation analysis) are available in the CANOCO package. Using algorithms presented by ter Braak or by Legendre & Legendre, the basic CANOCO or redundancy analysis models can be programmed in languages such as Matlab.
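
As a rough illustration of the redundancy analysis idea, here is a sketch in Python with NumPy (rather than Matlab), assuming the textbook two-step formulation: regress the centred response matrix on the centred explanatory variables, then run a PCA on the fitted values. The function name and the simulated data are assumptions; this is not the CANOCO implementation and it omits scaling options and significance tests.

```python
import numpy as np

def redundancy_analysis(Y, X):
    """Sketch of RDA: multivariate regression of Y on X, then PCA of the fit."""
    n = Y.shape[0]
    Yc = Y - Y.mean(axis=0)               # centre responses
    Xc = X - X.mean(axis=0)               # centre explanatory variables
    coef, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    Y_fit = Xc @ coef                     # part of Y that is linear in X
    _, s, Vt = np.linalg.svd(Y_fit, full_matrices=False)
    eigvals = s ** 2 / (n - 1)            # variance along each canonical axis
    scores = Y_fit @ Vt.T                 # case scores constrained by X
    total_var = (Yc ** 2).sum() / (n - 1)
    return eigvals, scores, total_var

# Simulated data: 20 cases, 2 explanatory variables, 4 response variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
Y = X @ rng.normal(size=(2, 4)) + 0.5 * rng.normal(size=(20, 4))

eigvals, scores, total_var = redundancy_analysis(Y, X)
constrained_fraction = eigvals.sum() / total_var
print(constrained_fraction)  # share of the total variance explained by X
```

With two explanatory variables there can be at most two non-zero canonical axes, however many response variables there are.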
BTW, you can perform an MDS, aka Gower principal coordinates analysis, on a correlation matrix after converting it to a distance matrix. Depending on the conversion used, this distance matrix may not be Euclidean, so anticipate negative eigenvalues. The standard NMDS programs have an option to specify whether the matrix entered is a distance or a similarity matrix.
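
Whether negative eigenvalues appear depends on how the correlations are converted. The sketch below, assuming NumPy and a made-up 3 x 3 correlation matrix, compares two common conversions: d = 1 - r and d = sqrt(2(1 - r)). The helper name is hypothetical.

```python
import numpy as np

def gower_eigenvalues(D):
    """Eigenvalues of the double-centred matrix -1/2 J D^2 J, ascending."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return np.linalg.eigvalsh(-0.5 * J @ (D ** 2) @ J)

# Hypothetical correlation matrix (positive definite).
R = np.array([[1.00, 0.90, 0.65],
              [0.90, 1.00, 0.90],
              [0.65, 0.90, 1.00]])

D_linear = 1.0 - R                 # d = 1 - r
D_sqrt = np.sqrt(2.0 * (1.0 - R))  # d = sqrt(2(1 - r))

eig_linear = gower_eigenvalues(D_linear)
eig_sqrt = gower_eigenvalues(D_sqrt)

print(eig_linear)  # includes a negative eigenvalue: not Euclidean
print(eig_sqrt)    # non-negative up to rounding: Euclidean
```

Here 1 - r gives distances 0.10, 0.10 and 0.35, which break the triangle inequality, so no Euclidean configuration can reproduce them. The square-root form equals the distance between unit-length vectors whose inner products are the correlations, so it embeds exactly.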
Gene Gallagher




