Practical on Kernel Methods!

This webpage contains the online material for the kernel method practical.

Setup

To get started, use your account to log in to the practical server (password: mlss379):

ssh -X mlssXXX@172.16.172.16

where mlssXXX is your assigned username (e.g. mlss013). Now run

~lwk/setup.sh

Reading this line? Great, you are ready and good to go!

Part 1: Kernel Approximations

The first part of this exercise concerns the quality of the random Fourier approximations. Start Matlab by typing

./kernelApprox1.sh

Now type

exercise1

and have a look at the output.

Question 1: How does the error depend on the number of dimensions D? Does it go up or down? Is that expected?
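If you want to experiment with the approximation yourself while thinking about Question 1, here is a minimal sketch of random Fourier features for the Gaussian kernel. It is not the code behind exercise1; the toy data, the kernel width sigma, the chosen values of D, and all variable names are assumptions made for illustration.

```matlab
% Sketch only: random Fourier features for the Gaussian kernel
% k(x,y) = exp(-||x-y||^2 / (2*sigma^2)). All names here are illustrative.
rng(0);                          % fix the random seed for reproducibility
d = 2; n = 200; sigma = 1;
X = randn(d, n);                 % toy data, columns are data points

% Exact kernel matrix from pairwise squared distances
sq    = sum(X.^2, 1);
dist2 = bsxfun(@plus, sq', sq) - 2*(X'*X);
K     = exp(-dist2 / (2*sigma^2));

Ds  = [10 100 1000];             % numbers of random features to try
err = zeros(size(Ds));
for i = 1:numel(Ds)
    D = Ds(i);
    W = randn(D, d) / sigma;     % frequencies drawn from N(0, I/sigma^2)
    b = 2*pi*rand(D, 1);         % random phases in [0, 2*pi)
    Z = sqrt(2/D) * cos(bsxfun(@plus, W*X, b));   % D x n feature matrix
    Khat = Z' * Z;               % approximate kernel matrix
    err(i) = max(abs(K(:) - Khat(:)));
    fprintf('D = %4d   max abs error = %.4f\n', D, err(i));
end
```

Run it a few times with different seeds and compare the printed errors with what exercise1 reports.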

OK, apparently the approximation is working then. Let us see it in action now. We classify a simple circle: points with radius smaller than one are negative, points with larger radius are positive. To run this classification problem for the first time, start

./kernelApprox2.sh

then run

exercise2

Question 2: Make sure you understand the plots and the text output. Do the performance numbers make sense to you? Which classifier is best? How does the linear SVM perform? Why?

Question 3: The linear SVM does not perform well. Maybe an embedding into a higher-dimensional space in the following way works? (Do not type this in; read on for the rest of the question first.)

% no need to type this in the shell, read on
w = randn(D,size(X,1));
X = w*X;

What do you expect? Think first! Then set the variable do_embed to one and re-run exercise2; this will apply the transformation shown above:

do_embed = 1;
exercise2

Does this match your expectations?

Question 4: Change the number of training points to 200 and re-run exercise2:

num_train=200;
exercise2

What happens to the training times of the SVMs? Why do you think that is? Is there something special about the dataset?

Question 5: Shogun supports specialized linear SVM solvers, and those are much faster. Let us switch to a different solver with

linear_solver=1;
exercise2

Is the runtime better now?

Question 6: Appreciate the speed gain by changing the number of training examples to something very high, say 100k, and re-running:

num_train=100000;
exercise2

Which method is the best in terms of speed/accuracy?

Question 7: Exit Matlab:

exit;

Bonus Question (if you have time before the next part): In case you know what a neural network is, do you see a connection? How could these random Fourier features also be interpreted? Do you see where backpropagation could be used? Is the overall system still convex?

Slides

Literature

This is not the only kernel approximation there is. This literature list gives you some links to other works on the same topic. Some other approximations (Nyström) are usually more accurate but slower; some are coarser but faster (Fastfood).

Williams & Seeger, Using the Nyström Method to Speed Up Kernel Machines, NIPS 2001
Rahimi & Recht, Random Features for Large-Scale Kernel Machines, NIPS 2008
Le, Sarlos & Smola, Fastfood - Approximating Kernel Expansions in Loglinear Time, ICML 2013
references therein...

In case you are interested in different SVM solvers, I recommend checking out the following webpages, which contain state-of-the-art solvers for SVMs:

Shogun Toolbox, which binds to almost all SVM solvers
liblinear
libsvm
svmlight and svmstruct for structured output learning

Part 2: Model Selection for kernel-based regularization

The second part of this exercise concerns model selection for a kernel-based regularization method. Use the following slides as a reference.

Slides

Model Selection Challenge

Train a Kernel Ridge Regression model and select the kernel and regularization parameter so as to achieve the best test performance!

Setup: If you are within a Matlab session, exit Matlab first. To get started for this part, run the script

./MSchallenge.sh

This will open a Matlab session and ask you to provide a name for your team. Next, it will load the dataset, train a baseline linear least squares model, and save your first submission.

Challenge: Now, you should write your own script to improve over the baseline and, possibly, win the competition! You can use the function krr.m to train a Kernel Ridge Regression model for several values of the regularization parameter. Type

 help krr

for a description of the interface. You will need to design a proper model selection mechanism for KRR.
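One possible model selection mechanism is a hold-out validation grid over the kernel width and the regularization parameter. The sketch below illustrates the idea on toy data; it does not use krr.m (whose interface is described by help krr), and the Gaussian kernel, the data, the grids, the split sizes, and all variable names are assumptions made for illustration.

```matlab
% Sketch only: Gaussian-kernel ridge regression with a hold-out validation grid.
% Does NOT use krr.m; data, grids, and names are illustrative assumptions.
rng(1);
n = 120;
x = linspace(-3, 3, n)';                 % toy 1-D inputs
y = sin(2*x) + 0.1*randn(n, 1);          % toy noisy targets

perm = randperm(n);
tr = perm(1:80);  va = perm(81:end);     % training / validation split

sigmas  = [0.1 0.3 1 3];                 % candidate kernel widths
lambdas = 10.^(-4:1);                    % candidate regularization parameters
best = struct('rmse', inf, 'sigma', NaN, 'lambda', NaN);

for s = sigmas
    % Kernel blocks for 1-D inputs, built from pairwise differences
    Ktr = exp(-bsxfun(@minus, x(tr), x(tr)').^2 / (2*s^2));
    Kva = exp(-bsxfun(@minus, x(va), x(tr)').^2 / (2*s^2));
    for lam = lambdas
        % KRR coefficients (one common convention: K + lambda*I)
        alpha = (Ktr + lam*eye(numel(tr))) \ y(tr);
        rmse  = sqrt(mean((Kva*alpha - y(va)).^2));   % validation error
        if rmse < best.rmse
            best = struct('rmse', rmse, 'sigma', s, 'lambda', lam);
        end
    end
end
fprintf('best sigma = %g, lambda = %g, validation RMSE = %.3f\n', ...
        best.sigma, best.lambda, best.rmse);
```

A single hold-out split is the simplest choice; k-fold cross-validation or generalized cross-validation follow the same pattern with a different error estimate inside the loops.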

Editing: If for some reason you cannot use the Matlab GUI, you can resort to editors such as vim, joe, pico, or emacs. Alternatively, you can transfer the files to your local computer and work with a local text editor. You will need to transfer the files back to the MLSS machine in order to be evaluated. If you have a Linux/macOS system, you can use the command

scp

to transfer the files. If you use Windows, please download and install WinSCP.

Hints: If you do not know where to start, edit one of the two incomplete example scripts rrval.m or krrrbfGCV.m and try to address the TODOs in the code. Make sure to save the results using the function writeoutput.m.

LeaderBoard: We will update the LeaderBoard below once in a while (obviously, not too often!) by reading your output files and computing your performance score. Keep an eye on it to see how well you are doing. Again, note that the leaderboard is deliberately not real-time, so that you cannot optimize directly against the performance score! Remember that FIT = 1-NormalizedRMSE (the higher, the better).
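To get a feeling for the score, here is a minimal sketch of how a fit value of this form can be computed. The normalization used for the leaderboard is not specified on this page; the sketch assumes NormalizedRMSE = ||y - yhat|| / ||y - mean(y)||, which is one common definition, and the numbers are made up for illustration.

```matlab
% Sketch only: one common definition of the fit score.
% ASSUMPTION: NormalizedRMSE = ||y - yhat|| / ||y - mean(y)||; the
% normalization used for the leaderboard is not specified on this page.
y    = [1; 2; 3; 4; 5];            % illustrative targets
yhat = [1.1; 1.9; 3.2; 3.8; 5.0];  % illustrative predictions

nrmse = norm(y - yhat) / norm(y - mean(y));
fit   = 1 - nrmse;
fprintf('FIT = %.4f\n', fit);

% Sanity checks: perfect predictions give 1, predicting the mean gives 0
fit_perfect = 1 - norm(y - y) / norm(y - mean(y));
fit_mean    = 1 - norm(y - mean(y)*ones(size(y))) / norm(y - mean(y));
```

Under this assumed definition, a model that is worse than always predicting the mean of the targets gets a negative score.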
