Reference no: EM132392844
CS5087701 Machine Learning
Question. Analyze the Adult dataset from UCI and the Spooky Author Identification dataset from Kaggle.
Question 2.1. Derive the gradient descent learning rule for a single perceptron where the prediction function is given by
$h(x) = h(x_1, x_2, x_3, \ldots, x_n) = w_0 + w_1 x_1 + w_1 x_1^2 + \cdots + w_n x_n + w_n x_n^2$.
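As a starting point for Question 2.1, the sketch below shows what one batch gradient descent step could look like for this prediction function. It assumes the usual sum-of-squares error $E(w) = \frac{1}{2}\sum_i (y^{(i)} - h(x^{(i)}))^2$, which the question does not state explicitly; under that assumption the update is $w_0 \leftarrow w_0 + \eta \sum_i (y^{(i)} - h(x^{(i)}))$ and $w_j \leftarrow w_j + \eta \sum_i (y^{(i)} - h(x^{(i)}))(x_j^{(i)} + (x_j^{(i)})^2)$. The function names and the learning rate `eta` are illustrative only.

```python
def predict(w, x):
    """h(x) = w0 + sum_j w_j * (x_j + x_j^2); w[0] is the bias w0."""
    return w[0] + sum(wj * (xj + xj * xj) for wj, xj in zip(w[1:], x))


def squared_error(w, X, y):
    """Assumed error function: E(w) = 1/2 * sum_i (y_i - h(x_i))^2."""
    return 0.5 * sum((yi - predict(w, xi)) ** 2 for xi, yi in zip(X, y))


def gradient_step(w, X, y, eta):
    """One batch gradient descent step on the assumed squared error."""
    errors = [yi - predict(w, xi) for xi, yi in zip(X, y)]
    # dE/dw0 = -sum_i error_i, so w0 moves by +eta * sum_i error_i
    new_w = [w[0] + eta * sum(errors)]
    for j in range(1, len(w)):
        # dE/dwj = -sum_i error_i * (x_ij + x_ij^2)
        term = sum(e * (xi[j - 1] + xi[j - 1] ** 2) for e, xi in zip(errors, X))
        new_w.append(w[j] + eta * term)
    return new_w


if __name__ == "__main__":
    # Toy data generated from w = [1, 2, -1]; values are made up for illustration.
    X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, -0.5]]
    y = [-1.0, 5.0, 3.0, 2.75]
    w = [0.0, 0.0, 0.0]
    for _ in range(50):
        w = gradient_step(w, X, y, eta=0.05)
    print(w, squared_error(w, X, y))
```

Because $x_j$ and $x_j^2$ share the weight $w_j$, the only change from the linear unit is that the input factor in the update becomes $x_j + x_j^2$.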
Question 2.2. Consider the alternative error function given below for a single-layer network. Derive the gradient descent learning rule for this error function. Show that it can be implemented by multiplying each weight by some constant before performing the standard gradient descent learning rule.
$E(w) = \frac{1}{2} \sum_i \sum_k \left( y_k^{(i)} - h_k(x^{(i)}) \right)^2 + \gamma \sum_j w_j^2$.
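The claim in Question 2.2 can be checked numerically before deriving it. The sketch below assumes a single linear output $h(x) = \sum_j w_j x_j$ with no separate bias term (a simplification of the single-layer network in the question) and compares two updates: a direct gradient step on the penalised error, and a step that first multiplies every weight by $(1 - 2\eta\gamma)$ and then applies the standard update using the data gradient evaluated at the original weights. All function names are hypothetical.

```python
def data_gradient(w, X, y):
    """Gradient of the unpenalised error 1/2 * sum_i (y_i - w.x_i)^2."""
    grad = [0.0] * len(w)
    for xi, yi in zip(X, y):
        err = yi - sum(wj * xj for wj, xj in zip(w, xi))
        for j, xj in enumerate(xi):
            grad[j] -= err * xj
    return grad


def penalised_step(w, X, y, eta, gamma):
    """Direct step on E(w): the penalty gamma * sum_j w_j^2 adds 2*gamma*w_j."""
    g = data_gradient(w, X, y)
    return [wj - eta * (gj + 2.0 * gamma * wj) for wj, gj in zip(w, g)]


def shrink_then_standard_step(w, X, y, eta, gamma):
    """Scale each weight by (1 - 2*eta*gamma), then apply the standard rule."""
    g = data_gradient(w, X, y)  # evaluated at the original weights
    shrunk = [(1.0 - 2.0 * eta * gamma) * wj for wj in w]
    return [sj - eta * gj for sj, gj in zip(shrunk, g)]


if __name__ == "__main__":
    # Made-up toy values, for illustration only.
    X = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
    y = [1.0, 0.0, 2.0]
    w = [0.3, -0.7]
    print(penalised_step(w, X, y, eta=0.01, gamma=0.1))
    print(shrink_then_standard_step(w, X, y, eta=0.01, gamma=0.1))
```

The two functions agree up to floating-point rounding because $w_j - \eta(g_j + 2\gamma w_j) = (1 - 2\eta\gamma) w_j - \eta g_j$, where $g_j$ is the gradient of the unpenalised error.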
Question 2.3. Continue analyzing the Adult and Spooky Author Identification (SAI) datasets, but using Artificial Neural Networks this time. Similarly, you should give all necessary details of your study in a mini report of around two to four pages with a discussion section.
In your report, you should include the following items:
(a) All the experimental settings in the experiments.
(b) The prediction accuracy with cross-validation and possible different data partitions.
(c) Can we use ANN for both datasets with acceptable performance?
(d) If we multiply some of the input attributes by a large number (say, for the Adult dataset), will we obtain similar results afterwards?
(e) Give the reasons why the result is good (or bad) for different numbers of iterations in the ANN training. Do you observe any overfitting?
(f) Will we obtain different results if we use different numbers of attributes as the input?
(g) Provide at least one case study on how to choose a good ANN structure, for example the number of hidden layers and the number of hidden nodes in each hidden layer. Do you observe any overfitting?
(h) Given the SAI dataset, can you suggest an alternative way to feed the inputs to the network so that we may have a better result?
(i) What information do you think is stored in the hidden nodes? Can you explain the hidden node values produced for a given input? Answer with regard to the Adult dataset, the SAI dataset, or both.
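For item (d), one experiment worth running is to compare training on raw inputs with training on standardised inputs. The sketch below is not a prescribed method, only an illustration of why standardisation (subtracting each column's mean and dividing by its standard deviation) removes the effect of multiplying an attribute by a large constant. The function name and the toy rows are made up and are not taken from the Adult dataset.

```python
from statistics import mean, pstdev


def standardise_columns(rows):
    """Return rows with each column shifted to mean 0 and scaled to std 1."""
    columns = list(zip(*rows))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [
        # A constant column has zero spread; map it to 0.0 to avoid dividing by zero.
        [(v - m) / s if s > 0 else 0.0 for v, (m, s) in zip(row, stats)]
        for row in rows
    ]


if __name__ == "__main__":
    # Hypothetical numeric attributes, e.g. an age-like and an hours-like column.
    rows = [[25.0, 40.0], [38.0, 50.0], [52.0, 20.0]]
    # Multiply the first attribute by a large number, as item (d) suggests.
    scaled = [[r[0] * 1000.0, r[1]] for r in rows]
    print(standardise_columns(rows))
    print(standardise_columns(scaled))
```

Both printed results match up to rounding, so a network trained on standardised inputs sees the same data in either case. Without such preprocessing, the rescaled attribute would dominate the weighted sums and the gradients, which is the effect item (d) asks you to investigate.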