This webpage contains the online material for the kernel method practical.

./kernelApprox1.sh

Now type

exercise1

and have a look at the output.

**Question 1**: How does the error depend on the
number of dimensions D? Does it go up or down? Is
that expected?
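If you want to experiment outside the script, below is a minimal sketch of a D-dimensional random Fourier feature approximation of a Gaussian kernel (Rahimi & Recht; see the references below). This is an assumption about what exercise1 computes, and all variable names are illustrative:

```matlab
% Minimal random Fourier feature sketch (illustrative; exercise1 may
% differ in details). Approximates a Gaussian kernel with D features.
n = 100; d = 2; sigma = 1.0;           % sample size, input dim, kernel width
X = randn(d, n);                       % toy inputs, one column per point

% exact Gaussian kernel: K(i,j) = exp(-||x_i - x_j||^2 / (2*sigma^2))
sq = sum(X.^2, 1);
K  = exp(-(bsxfun(@plus, sq', sq) - 2*(X'*X)) / (2*sigma^2));

D = 1000;                              % number of random features
W = randn(D, d) / sigma;               % random frequencies ~ N(0, sigma^-2)
b = 2*pi*rand(D, 1);                   % random phases ~ U[0, 2*pi]
Z = sqrt(2/D) * cos(bsxfun(@plus, W*X, b));   % D-by-n feature matrix

% Z'*Z approximates K; the error should shrink as D grows
fprintf('max abs error: %g\n', max(max(abs(K - Z'*Z))));
```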

OK, so the approximation works. Let us see it in action. We classify a simple circle: points with radius smaller than one are negative, points with radius larger than one are positive. Start the first run of this classification problem with

./kernelApprox2.sh

then run

exercise2
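For orientation, the circle data could be generated along these lines (a toy sketch to fix ideas; the script builds its own dataset):

```matlab
% Illustrative toy version of the circle dataset, columns as points.
n = 200;
X = 3*rand(2, n) - 1.5;                % points in [-1.5, 1.5]^2
y = sign(sum(X.^2, 1) - 1);           % radius > 1 -> +1, radius < 1 -> -1
scatter(X(1,:), X(2,:), 25, y, 'filled');
```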

**Question 2**: Make sure you understand the plots
and the text output. Do the performance numbers make
sense to you? Which classifier is best? How does
the linear SVM perform? Why?

**Question 3**: The linear SVM does not perform
well. Maybe an embedding into a higher-dimensional space
in the following way works? (Do not type this in;
read the rest of the question first.)

% no need to type this into the shell, read on
w = randn(D, size(X,1)); X = w*X;

What do you expect? Think first! Then set the variable

do_embed = 1; exercise2

Does this match your expectations?

**Question 4**: Change the number of training
points to 200 and re-run exercise2:

num_train = 200; exercise2

What happens to the runtimes of the SVM trainings? Why do you think that is? Is there something special about the dataset?

**Question 5**: Shogun supports specialized
linear SVM solvers, and those are much faster. Let
us switch to a different solver:

linear_solver = 1; exercise2

Is the runtime better now?

**Question 6**: Appreciate the speed gain by
changing the number of training examples to
something very high, say 100k, and re-run:

num_train = 100000; exercise2

Which method is best in terms of speed and accuracy?

**Question 7**: Exit Matlab:

exit;

**Bonus Question** (if you have time before
the next part): If you know what a neural network is, do you
see a connection? How else could these random Fourier
features be interpreted? Do you see where
backpropagation could be used? Is the overall
system still convex?

- Williams & Seeger, *Using the Nyström Method to Speed Up Kernel Machines*, NIPS 2001
- Rahimi & Recht, *Random Features for Large-Scale Kernel Machines*, NIPS 2008
- Le, Sarlos & Smola, *Fastfood: Approximating Kernel Expansions in Loglinear Time*, ICML 2013
- and references therein

- The Shogun Toolbox binds to almost all SVM solvers:
  - LibLinear
  - LibSVM
  - SVMlight and SVMstruct for structured-output learning

**Setup**: If you are within a Matlab session, exit Matlab first. To get started for this part, run the script

./MSchallenge.sh

This will open a Matlab session and ask you to provide a name for your team. Next, it will load the dataset, train a baseline linear least-squares model, and save your first submission.

**Challenge**: Now, you should write your own script to improve over the baseline and, possibly, win the competition!
You can use the function *krr.m* to train a Kernel Ridge Regression model for several values of the regularization parameter. Type

help krr

for a description of the interface. You will need to design a proper model selection mechanism for KRR.
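If you are unsure what such a mechanism could look like, here is a minimal hold-out search over the regularization parameter. It implements KRR directly for illustration only; the provided *krr.m* may have a different interface (check *help krr*), and the names *Xtr* (n-by-d inputs), *ytr* (targets), and the RBF bandwidth are assumptions:

```matlab
% Hold-out model selection sketch for kernel ridge regression (KRR).
sigma   = 1.0;                         % RBF kernel width (tune this too)
lambdas = 10.^(-6:1);                  % candidate regularization values

% split the training data into a training and a validation part
n   = size(Xtr, 1);
idx = randperm(n);  nval = round(n/5);
va  = idx(1:nval);  tr = idx(nval+1:end);

% squared distances and RBF kernel matrices
sq  = @(A,B) bsxfun(@plus, sum(A.^2,2), sum(B.^2,2)') - 2*(A*B');
Ktr = exp(-sq(Xtr(tr,:), Xtr(tr,:)) / (2*sigma^2));
Kva = exp(-sq(Xtr(va,:), Xtr(tr,:)) / (2*sigma^2));

best = inf;  bestlam = lambdas(1);
for lam = lambdas
    alpha = (Ktr + lam*eye(length(tr))) \ ytr(tr);  % KRR solution
    err   = sqrt(mean((Kva*alpha - ytr(va)).^2));   % validation RMSE
    if err < best, best = err; bestlam = lam; end
end
fprintf('best lambda: %g (validation RMSE %g)\n', bestlam, best);
```

Cross-validation over several splits would give a more stable choice than a single hold-out set; the sketch keeps it minimal.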

**Editing**: If for some reason you cannot use the Matlab GUI, you can resort to editors such as *vim*, *joe*, *pico*, or *emacs*. Alternatively, you can transfer the files to your local computer and work with a local text editor. You will need to transfer the files back to the MLSS machine in order to be evaluated. If you have a linux/macos system, you can use the command

scp

to transfer the files. If you use Windows, please download and install WinSCP.
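For example (username, hostname, and paths are placeholders):

scp rrval.m krrrbfGCV.m username@mlss-machine:~/practical/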

**Hints:** If you do not know where to start, edit one of the two incomplete example scripts *rrval.m* or *krrrbfGCV.m* and try to address the TODOs in the code. Make sure to save the results using the function *writeoutput.m*.
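For the GCV route, one standard formulation (an assumption about what *krrrbfGCV.m* is aiming at) scores each regularization value via the KRR smoother matrix:

```matlab
% Generalized cross-validation score for KRR at regularization lam
% (assumes kernel matrix K, targets y, and lam are already defined).
n = length(y);
S = K / (K + lam*eye(n));              % smoother: K * inv(K + lam*I)
r = y - S*y;                           % training residuals
gcv = mean(r.^2) / (1 - trace(S)/n)^2; % pick the lam minimizing this
```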

**LeaderBoard**: We will update the LeaderBoard below once in a while (deliberately not too often!) by reading your output files and computing your performance score. Keep an eye on it to see how well you are doing. Note that the leaderboard is intentionally not real-time; this keeps you from over-optimizing the performance score. Remember that FIT = 1 - NormalizedRMSE (the higher, the better).
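For a rough local estimate of your score, one common definition is sketched below; this is an assumption about how the RMSE is normalized, and the organizers' evaluation script is authoritative:

```matlab
% FIT = 1 - NormalizedRMSE, assuming normalization by the RMSE of the
% constant mean predictor (assumption; y are targets, yhat predictions).
nrmse = sqrt(mean((y - yhat).^2)) / sqrt(mean((y - mean(y)).^2));
fit   = 1 - nrmse;
```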