What's New?

Version 1.3:


pars.inequ=1 switchtes to INEQUALITIES instead of EQUALITIES. In other words this means, that local distances can shrink, but never grow. With some data sets this results in better unfolding.


parameters can now be passed on as strings.

e.g. mvu(D,k,'inequ','maxiter',20);

is equivalent to pars.inequ=1; pars.maxiter=20; mvu(D,k,pars);


The code is less memory consuming. All diag, eye commands are now spdiags, eyes. Thanks, Ashley, for pointing this out to me.


1. Download

In order to run MVU, you need to download the following files:

original MVU Matlab code (v 1.3)

•NEW!!! landmark-MVU Matlab code (faster) (v 1.3) NEW!!!


•Either one of the following SDP solvers:

CSDP (version 4.8 or higher) Fastest solver!!!


SeDuMi (version 1.05 or higher)

SDPT3

(Note: CSDP and SDPT3 are about 4 times faster than SeDuMi)


2. Installation

First download all the files mentioned above (only one SDP solver is required).

Installing MVU Matlab code

The installation of the MVU Matlab code should be straightforward. Unpack the tar-ball and make sure that the mvu directory is in the matlab path.

Installing SeDuMi Solver

Installing SeDuMi is rather uncomplicated. Follow the instructions on the SeDuMi homepage and make sure that the SeDuMi directory is in the matlab path.



Installing CSDP Solver

Installing CSDP is slightly trickier:

•First download and untar the csdp files. If you didn't go for the binaries, you still have to compile them

•Locate the full path of the csdp executable (something like "/home/username/csdp4.8p4/bin/csdp" )

•Edit the csdp.m file and change the "system" and "dos" calls of csdp to the actual "csdp-root-path"/bin
e.g. info=dos(['csdp ' fname '.dat-s' ' ' fname '.sol'],'-echo');
becomes
info=dos(['~/csdp4.8p4/bin/csdp ' fname '.dat-s' ' ' fname '.sol'],'-echo');

•Make sure csdp.m is in the matlab path




Installing SDPT3 Solver


•First download the SDPT3 files from the SDPT3 homepage.

•Download and unzip the SDPT3Patch into the SDPT3-3.02 directory.

•Execute ./patchscript

•Startup Matlab and execute Installmex

•Make sure to always run startup before you use MVU!!


You can now test MVU running mvudemo.

Hopefully everything is working fine.


3. Instructions


After unzipping the MVU Matlab code, you should obtain the following files:

•mvu.m -- canonical MVU

•mvuIncXL.m -- An incremental solver for MVU (used for very large data sets)

•distance.m -- A little helper function used to compute the euclidean distance matrix of input vectors

•readme.htm -- essentially this file


mvu.m

mvu.m does semidefinite embedding. It is optimized for data sets with up to 1500 data points. mvu(DD,k,pars)

Required Input parameters:

DD

NxN symmetric Distance Matrix where DD(i,j) corresponds to the distance between the i-th and the j-th input vector



Optional Input parameters:



K

Minimum size of local neighborhood (default K=3)

pars.solver

chooses the MVU solver:

pars.solver=0 CSDP (default)

pars.solver=1 SeDuMi

pars.slack

enables slack variables:

pars.slack=0 slack variables disabled (default)

pars.slack=1 slack variables enabled

pars.factor

Weight of Slack variables (only if pars.slack=1) the very closer to 1 ther harders do the slack variables get pars.factor=0.9999 (default)

pars.verify

pars.verify=1 (default) increases K if graph is unconnected

pars.verify=0 does not check if graph is unconnected

pars.angles

pars.angles=1 does preserve angles in local neighborhood (default)

pars.angles=0 does not preserve angles in local neighborhood

pars.repell

Mx2 matrix of pairs of indices of input points that repell each other

pars.repell=[] (*default*)

e.g. pars.repell=[1 3; 6 9; 9 200]; causes point 1 and 3, 6 and 9, 9 and 200 to have a pairwise repelling forces between them

pars.rweight

only if pars.repell~=[]

factor between 0 and 1, of how much the repelling force should weight compared to the normal objective function.

pars.rweight=0.3 (default)


Output:


Y

NxN matrix where Y(1:d,i) is the d-dimensional embedding of the i-th input vector

details.D

List of eigenvalues

details.info

info field fromm the SDP solver

details.pars

identical to pars (only useful for batch jobs to automatically save all parameters)

details.K

actual neighborhood size used in computation (could be higher than the input K, if graph was unconnected and pars.verify=1)



Demo code:

Th=linspace(0,2*pi,100);

X=[Th;sin(Th)];

clf;

subplot(2,1,1);

scatter(X(1,:),X(2,:),'o',Th,'filled');

[Y,details]=mvu(distance(X),3 );

subplot(2,1,2);

scatter(Y(1,:),Y(2,:),'o',Th,'filled');




mvuIncXL.m

mvuIncXL.m is an incremental approximation for semidefinite embedding for very large data sets. It finds a low dimensional embedding of some small subset of the input data and adds all remaining input points incrementally.

[Y,details]=mvuInc(X,K,FirstSet,d,pars);

Required Input Parameters


X

DxN input matrix where each column corresponds to an input vector

K

Minimum size of local neighborhood

FirstSet

Size of the initial subset

d

dimensionality of output (it is recommended to let d be slightly higher than needed)



Optional Input parameters:


d

dimensionality of output (it is recommended to let d be slightly higher than needed)

pars.solver

chooses the MVU solver:

pars.solver=0 CSDP (default)

pars.solver=1 SeDuMi

pars.slack

enables slack variables:

pars.slack=0 slack variables disabled (default)

pars.slack=1 slack variables enabled

pars.factor

Weight of Slack variables (only if pars.slack=1) the very closer to 1 ther harders do the slack variables get pars.factor=0.9999 (default)

pars.verify

pars.verify=1 (default) increases K if graph is unconnected

pars.verify=0 does not check if graph is unconnected

pars.angles

pars.angles=1 does preserve angles in local neighborhood (default)

pars.angles=0 does not preserve angles in local neighborhood

pars.repell

Mx2 matrix of pairs of indices of input points that repell each other

pars.repell=[] (*default*)

e.g. pars.repell=[1 3; 6 9; 9 200]; causes point 1 and 3, 6 and 9, 9 and 200 to have a pairwise repelling forces between them

pars.rweight

only if pars.repell~=[]

factor between 0 and 1, of how much the repelling force should weight compared to the normal objective function.

pars.rweight=0.3 (default)


Output:


Y

dxN matrix where Y(1:d,i) is the d-dimensional embedding of the i-th input vector

details.SY

MxM (M=FirstSet) matrix where details.SY(1:d,i) is the d-dimensional embedding of the i-th subsampled input vector (i.e. the result of MVU applied to the subsample)

details.DS

list of eigenvalues from the subsample calculation

details.time

time needed for computation



Demo code:

tt=linspace(0,4*pi,100);

X=[tt.*sin(tt);tt.*cos(tt)];

figure;

pars.slack=1;


try

fprintf('Computing distances...');

Dis=distance(X);

fprintf('done\n');

catch

error('ERROR! Are you sure distance.m is in the path?');

end;



subplot(2,1,1);

scatter(X(1,:),X(2,:),60,tt,'filled'); axis equal;

title('Original');

drawnow;

try

pars.solver=0;

[Y,D]=mvu(Dis,3,pars); % CSDP

fprintf('\n\nCSDP is working!\n');

catch

pars.solver=1;

[Y,D]=mvu(Dis,3,pars); % SEDUMI

fprintf('\n\nCSDP does not seem to be installed correctly.\n');

fprintf('SeDuMi is working!\n');

end;

subplot(2,1,2);

scatter(Y(1,:),Y(2,:),60,tt,'filled'); axis equal;

title('Reduced Dimensionality');



distance.m

DD=distance(X);
Returns the distance matrix DD of the column vectors of X.
DD(i,j)=sum((X(:,i)-X(:,j)).^2);


This material is based upon work supported by the National Science Foundation under Grant No. 0238323.

 

Maximum Variance Unfolding

(formerly known as sde)