Reference no: EM132489029
Security and Privacy Issues in Analytics
Pass Task 1: Attack Classification using Naïve Bayes Algorithm
Overview
An Intrusion Detection System (IDS) monitors network traffic for suspicious activity and issues alerts when such activity is detected. Supervised learning techniques have proven very effective for intrusion detection.
Task Description
Instructions:
This task is a binary classification (two-class) problem. Follow the steps below to complete the task. Once you have the results and reports, compile them into a PDF.
1. Load the training dataset (from the NSL-KDD data). The data files are KDDTrain and KDDTest.
Once the data is loaded, you can check its distribution by selecting the class attribute; it will appear as in Figure 1.
2. Now apply the "Naïve Bayes" classification algorithm from the "Classify" tab.
3. Check the results with 10-fold cross-validation.
4. Now load the test dataset and check the classification results.
5. Compare the results of 10-fold cross-validation with those obtained using the test dataset. Use the confusion matrices to explain the results. Also include a brief description of 10-fold cross-validation.
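The idea behind steps 3 and 5 can be sketched in a few lines of plain Python. This is a minimal illustration, not what the "Classify" tab runs internally: the function names, the `train_and_predict` callback, and the label strings "normal" and "anomaly" are assumptions made for the example, and the folds here are simply shuffled rather than stratified by class.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices 0..n_samples-1 and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def confusion_matrix(actual, predicted, positive="anomaly"):
    """Return (TP, FN, FP, TN) counts for a two-class problem."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def cross_validate(labels, train_and_predict, k=10, seed=0):
    """Hold out each fold once, train on the other k-1 folds, pool predictions."""
    predicted = [None] * len(labels)
    for test_idx in k_fold_indices(len(labels), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        # train_and_predict is a hypothetical callback standing in for any classifier
        for i, pred in zip(test_idx, train_and_predict(train_idx, test_idx)):
            predicted[i] = pred
    return confusion_matrix(labels, predicted)


# Toy data: 12 "normal" and 8 "anomaly" records (made up for illustration).
labels = ["normal"] * 12 + ["anomaly"] * 8


def majority_baseline(train_idx, test_idx):
    """Trivial classifier: always predict the most common training label."""
    majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    return [majority] * len(test_idx)


tp, fn, fp, tn = cross_validate(labels, majority_baseline, k=10)
print(tp, fn, fp, tn)
```

Every record is used for testing exactly once, so the pooled confusion matrix covers the whole training file. The test-set result in step 4 instead comes from a single model trained on all of the training data, which is why the two confusion matrices can differ.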
Pass Task 2: Intrusion Detection using Supervised Learning Techniques
Task Description
Instructions:
This task is a binary classification problem. Follow the steps below to complete the task. Once you have the results and reports, compile them into a PDF and submit it.
1. Load the training dataset (from the NSL-KDD data). The data files are KDDTrain and KDDTest.
Once the data is loaded, you can check its distribution by selecting the class attribute; it will appear as in Figure 1.
2. Now apply the "Naïve Bayes" classification algorithm from the "Classify" tab.
3. Check the results with 10-fold cross-validation.
4. Now load the test dataset and check the classification results.
5. Compare the results of 10-fold cross-validation with those obtained using the test dataset. Use the confusion matrices to explain the results. Also include a brief description of 10-fold cross-validation.
*The five steps above are the same as in Pass Task 1, so you can reuse those results when preparing this report. Pay careful attention to the steps below.
6. As with "Naïve Bayes", apply at least five other supervised classification techniques and compare their performance. To report the performance, create a table that presents the following measures, then compare the outcomes of your five nominated algorithms. You may choose any five, but try to pick high-performing algorithms.
• TP Rate
• FP Rate
• Precision
• Recall
• F-Measure
• ROC Area
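For reference, the measures listed above can all be derived from a binary confusion matrix, plus classifier scores for the ROC area. The sketch below treats one class as "positive" and uses hypothetical function names and made-up counts; tools that report per-class or weighted-average figures may show slightly different numbers.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute the table measures from confusion-matrix counts (one class as positive)."""
    tp_rate = tp / (tp + fn) if (tp + fn) else 0.0      # same quantity as recall
    fp_rate = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp_rate
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "TP Rate": tp_rate,
        "FP Rate": fp_rate,
        "Precision": precision,
        "Recall": recall,
        "F-Measure": f_measure,
    }


def roc_area(scores, is_positive):
    """ROC area as the chance that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is quadratic in the number of
    records, so it is only meant for small illustrative inputs.
    """
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up counts and scores, purely for illustration.
metrics = binary_metrics(tp=80, fn=20, fp=10, tn=90)
auc = roc_area([0.9, 0.8, 0.3, 0.1], [True, False, True, False])
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"ROC Area: {auc:.3f}")
```

TP Rate and Recall are the same quantity, so those two columns of the table will always match for a given class.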
7. Some algorithms have tuning parameters. Consider the SMO-based SVM algorithm. You can try different kernels, as shown below. Change the kernel to "PolyKernel" and ensure that the filter normalizes the training data, as shown in the figure. Running the task on the full dataset will take too long, so you need to reduce the sample size of the dataset to make it manageable (note: this may affect the performance).
8. Now perform the classification task with the SVM classifier (SMO) using the polynomial (POLY) and RBF kernels, and report the confusion matrices and computation times.
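To make steps 7 and 8 more concrete, the following sketch shows textbook forms of min-max normalization, the polynomial kernel, and the RBF kernel, together with simple random subsampling and timing. It is an illustration under stated assumptions, not the SMO implementation: the function names, the default `exponent` and `gamma` values, and the tiny dataset are all hypothetical, and the subsample is drawn at random without preserving class proportions.

```python
import math
import random
import time


def min_max_normalize(rows):
    """Rescale each column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [max(c) - min(c) for c in columns]
    return [
        [(v - lo) / span if span else 0.0 for v, lo, span in zip(row, lows, spans)]
        for row in rows
    ]


def poly_kernel(x, y, exponent=2, lower_order=False):
    """Polynomial kernel: <x, y>^p, or (<x, y> + 1)^p with lower-order terms."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + 1.0) ** exponent if lower_order else dot ** exponent


def rbf_kernel(x, y, gamma=0.01):
    """RBF kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def subsample(rows, fraction, seed=0):
    """Draw a random subset without replacement (not stratified by class)."""
    size = max(1, round(len(rows) * fraction))
    return random.Random(seed).sample(rows, size)


def timed_gram_matrix(rows, kernel):
    """Evaluate the kernel on every pair of rows and report the elapsed seconds."""
    start = time.perf_counter()
    gram = [[kernel(a, b) for b in rows] for a in rows]
    return gram, time.perf_counter() - start


# Tiny made-up dataset with two numeric attributes.
rows = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
norm = min_max_normalize(rows)
gram_poly, seconds_poly = timed_gram_matrix(norm, poly_kernel)
gram_rbf, seconds_rbf = timed_gram_matrix(norm, rbf_kernel)
print(norm)
print(gram_poly[1][2], gram_rbf[0][0])
```

The number of pairwise kernel evaluations in `timed_gram_matrix` grows with the square of the number of records, which gives some intuition for why reducing the sample size shortens the run so noticeably.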
Pass Task 3: Taxonomy of Attacks, Defenses, and Consequences in Adversarial Machine Learning
Task Description
Suppose you work in an organization that is developing a report on the vulnerabilities of machine learning models to adversarial attacks. Your manager has asked you to provide a 600-word report within the next week. He expects the report to cover attack taxonomies, defense mechanisms, and consequences.
1. Read the NIST article.
2. Identify five important attack types. Summarize them in approximately 300 words.
3. Summarize the defense mechanisms for the attack types you identified in step 2.
Attachment: Security and Privacy Issues in Analytics.rar