6.825 Exercise 3: SVMs
Due Thursday December 8th at 5pm
Please turn in to the box outside Tomas's door

This exercise asks you to go through some small experiments with support vector machines. For more information on support vector machines, see the svm_links.html page, also available in this directory.

The main idea here is to use SVMs to find a classifier that separates two classes based on their feature values. For example, you might want to classify whether an image is of a woman or a man based on the 16x16 pixels that make up the image (so here the features are the pixels and the class is "man" or "woman").

In this exercise you are asked to learn a classifier on a set of training examples, and then test this classifier on a set of test data. We want you to explore how different kernels allow you to learn different sorts of classifiers: a linear kernel only allows the classifier to learn a line that separates the input examples, but other kernels allow more complicated, nonlinear functions to separate the data.

Here we are using SVM light to run our experiments. See http://svmlight.joachims.org/ for more information on SVM light. I have set it up in my Public athena account, and you are welcome to use it there or to download and install it yourself.

There are 2 example directories, ex1 and ex2, in the 6825 directory of my Public athena account:

    http://web.mit.edu/emmab/Public/6825

Inside each is a training set and a testing set of data, as well as a picture of the training data in a pdf file.
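To make the difference between the kernels mentioned above concrete, here is a minimal Python sketch of the three kernel functions in their usual textbook forms. It is only an illustration, not SVM light's code: the function names are made up for this example, and the scale and offset parameters s and c in the polynomial kernel (both set to 1 here) are assumptions about how the kernel is parameterized.

```python
import math


def linear_kernel(a, b):
    """Plain dot product: the resulting classifier is a line (hyperplane)."""
    return sum(x * y for x, y in zip(a, b))


def poly_kernel(a, b, degree=2, s=1.0, c=1.0):
    """Polynomial kernel (s * a.b + c)^degree; s and c are assumed values."""
    return (s * linear_kernel(a, b) + c) ** degree


def rbf_kernel(a, b, gamma=10.0):
    """Radial basis function kernel exp(-gamma * ||a - b||^2)."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)


if __name__ == "__main__":
    u, v = [1.0, 2.0], [3.0, 4.0]
    print("linear:", linear_kernel(u, v))
    print("poly 2:", poly_kernel(u, v))
    print("rbf   :", rbf_kernel(u, v))
```

A larger gamma makes the RBF kernel fall off faster with distance, so each training point influences a smaller neighborhood; that is one reason the number of support vectors can change noticeably between kernels.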
To train using svmlight, run

    ./svm_learn options input-data-file output-model-filename

Useful options include

    -t 0          = linear kernel
    -t 1 -d 2     = poly kernel of degree 2
    -t 2 -g 10    = RBF kernel with sigma = 1/10
    -c 1          = set C to be 1 (penalty of classification errors)

Note that the number of SVs and the number of misclassifications on the training set, as well as the C used (if you don't set it), will be written to stdout.

To classify a test set and check accuracy, run

    ./svm_classify test-filename model-filename prediction-filename

Note that the accuracy level will be written to stdout.

** Important note: You can either run the code directly from my Athena Public directory from the command line, or you can download and install SVM light. If you choose to run the code in my Public directory, then your commands will look something like

    ./svm_learn options input-data-file ~/...output-model-filename
    ./svm_classify test-filename ~/...model-filename ~/...prediction-filename

(That is, you can run the executables in my Public directory, but you don't have write permission to my directory, so the model and prediction files you create must be written back to your own directory.)

For each example:

Pick 3 different kernels, then train, test, and report:
- how many SVs there were
- how many misclassifications there were on the training set
- the percent accuracy on the test set

For at least 1 kernel (do it for all 3 if you have time), choose 3 values of C (such as C=0.1, C=1, C=20, but you can pick whatever you want; just try to see if there are interesting differences), and again train, test, and report results as above.

Please report all these results in a table similar to the following, so that it is easy to compare between kernels and C values:

example table:

    Kernel        C    # SVs   # Misclassified on Training   % Test Accuracy
    poly 2        1    15      0                             97%
    rbf sig=.5    2    ...

Also please answer the following questions (briefly):

- As you change C, does this alter the number of support vectors?
- As you change the kernel, does this change the number of support vectors?
- How does changing C affect the number of training misclassifications and the test accuracy rate in these 2 examples (if at all)?
- How does changing the kernel affect the test accuracy and the number of training misclassifications?

For each example, also state:
- What you thought the best kernel and value of C were out of your experiments
- Why you picked the kernel you did (generalization as reflected by test accuracy, number of support vectors, etc.)
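If you would rather not type every command by hand, the kernel and C combinations above can be generated with a small script. The following Python sketch only builds and prints the svm_learn and svm_classify command lines; it uses just the options listed in this handout. The file names (train.dat, test.dat), the output directory, and the model/prediction naming scheme are hypothetical placeholders, so substitute the actual files from ex1 and ex2 and a directory you can write to.

```python
import shlex

# Kernel options exactly as listed in this handout.
KERNELS = {
    "linear": ["-t", "0"],
    "poly2": ["-t", "1", "-d", "2"],
    "rbf10": ["-t", "2", "-g", "10"],
}

# The example C values suggested in the handout.
C_VALUES = ["0.1", "1", "20"]


def build_commands(kernel, c, train_file, test_file, out_dir):
    """Return (learn_cmd, classify_cmd) as argument lists for one run."""
    # Hypothetical naming scheme: one model and one prediction file per run.
    model = f"{out_dir}/{kernel}_c{c}.model"
    preds = f"{out_dir}/{kernel}_c{c}.pred"
    learn = ["./svm_learn", *KERNELS[kernel], "-c", c, train_file, model]
    classify = ["./svm_classify", test_file, model, preds]
    return learn, classify


if __name__ == "__main__":
    # train.dat, test.dat and out are placeholder names, not the real files.
    for kernel in KERNELS:
        for c in C_VALUES:
            learn, classify = build_commands(
                kernel, c, "train.dat", "test.dat", "out"
            )
            print(shlex.join(learn))
            print(shlex.join(classify))
```

The script deliberately stops at printing the commands: the numbers for your table (SVs, training misclassifications, test accuracy) still have to be read from what each program writes to stdout.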