Implement benchmark classification algorithms

Reference no: EM133058123

Task Description

Instructions:

In industry, you would need to communicate with your manager, client, other stakeholders, and/or the IT team to understand the source of the data and to gather it.

Here, the teaching team has already gathered the data for you.

In this task, you need to perform experiments on TWO DATASETS.

1. The first dataset, "NSL-KDD", can be obtained from the data folder, in the "Week_5_NSL-KDD-Dataset" subfolder.

2. The second dataset is the "Processed Combined IoT dataset".

Step 1: Predictive Modelling (Prediction of the classes)

Dataset 1:

The DecisionTreeClassifier has been implemented for you. Now, you need to implement other techniques and compare. Please do the following tasks:
1. Implement at least 5 benchmark classification algorithms.
2. Tune the parameters if applicable to obtain a good solution.
3. Obtain the confusion matrix for each of the scenarios (Use the test dataset).
4. Calculate the performance measures for each of the classification algorithms, including Precision (%), Recall (%), F-Score (%), and False Alarm Rate (FPR, %).
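The first three tasks above can be sketched as follows. This is a minimal sketch rather than the provided solution: it assumes scikit-learn (the library that supplies `DecisionTreeClassifier`), the five algorithms and their parameter grids are illustrative choices, and `build_candidates` and `fit_and_evaluate` are hypothetical helper names. Synthetic data stands in for the real training and test files.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_candidates():
    """Return {name: (estimator, parameter grid)}; an empty grid means no tuning."""
    return {
        "LogisticRegression": (
            make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
            {"logisticregression__C": [0.1, 1.0, 10.0]},
        ),
        "KNeighbors": (
            make_pipeline(StandardScaler(), KNeighborsClassifier()),
            {"kneighborsclassifier__n_neighbors": [3, 5, 7]},
        ),
        "GaussianNB": (GaussianNB(), {}),
        "RandomForest": (
            RandomForestClassifier(random_state=0),
            {"n_estimators": [50, 100], "max_depth": [None, 10]},
        ),
        "SVC": (
            make_pipeline(StandardScaler(), SVC()),
            {"svc__C": [0.1, 1.0, 10.0]},
        ),
    }


def fit_and_evaluate(X_train, y_train, X_test, y_test):
    """Tune each candidate on the training set, then score it on the test set."""
    results = {}
    for name, (estimator, grid) in build_candidates().items():
        if grid:
            # Cross-validated grid search uses the training data only.
            search = GridSearchCV(estimator, grid, cv=3, scoring="accuracy")
            search.fit(X_train, y_train)
            model = search.best_estimator_
        else:
            model = estimator.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "confusion_matrix": confusion_matrix(y_test, y_pred),
        }
    return results


def synthetic_demo():
    """Run the comparison on synthetic stand-in data (NOT the NSL-KDD files)."""
    X, y = make_classification(
        n_samples=400, n_features=10, n_informative=5, n_classes=3, random_state=0
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=0
    )
    return fit_and_evaluate(X_train, y_train, X_test, y_test), y_test
```

For Dataset 1, the synthetic arrays would be replaced by features and labels loaded from the supplied training and test files, with the test file used only for the final predictions.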

Compare the results using the table format below. Create one table for each algorithm (use the test dataset).

Attack Class | Precision (%) | Recall (%)... | ... | ... | ... | ...
DoS          |               |               |     |     |     |
Normal       |               |               |     |     |     |
Prob         |               |               |     |     |     |
R2L          |               |               |     |     |     |
U2R          |               |               |     |     |     |
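The per-class measures in this table can be derived directly from a confusion matrix. The sketch below is one way to do it in plain Python. It assumes the matrix is laid out with true classes as rows and predicted classes as columns (the layout scikit-learn's `confusion_matrix` returns); `per_class_metrics` is a hypothetical helper name, and the counts in the usage example are toy numbers, not results from either dataset.

```python
from typing import Dict, Sequence


def per_class_metrics(
    cm: Sequence[Sequence[int]], labels: Sequence[str]
) -> Dict[str, Dict[str, float]]:
    """Return precision, recall, F-score and FPR (all in %) for every class.

    cm[i][j] is the number of samples whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    total = sum(sum(row) for row in cm)
    results = {}
    for k, label in enumerate(labels):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                 # true class k, predicted as another class
        fp = sum(row[k] for row in cm) - tp  # predicted k, true class is another one
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f_score = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        fpr = fp / (fp + tn) if fp + tn else 0.0  # false alarm rate
        results[label] = {
            "precision": 100 * precision,
            "recall": 100 * recall,
            "f_score": 100 * f_score,
            "fpr": 100 * fpr,
        }
    return results


# Toy three-class matrix for illustration only.
toy_cm = [[8, 1, 1], [2, 7, 1], [0, 2, 8]]
toy_metrics = per_class_metrics(toy_cm, ["A", "B", "C"])
```

Each class is treated one-vs-rest, so the FPR of a class is the share of samples from all other classes that were wrongly assigned to it.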

Finally, summarize the results in a table similar to the one below (use the test dataset):

Algorithms | Accuracy (%) | Precision (%) | Recall (%)... | ... | ... | ... | ...
Alg 1      |              |               |               |     |     |     |
Alg 2      |              |               |               |     |     |     |
...        |              |               |               |     |     |     |
...        |              |               |               |     |     |     |
...        |              |               |               |     |     |     |
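One row of this summary table needs a single number per measure for each algorithm. The task does not say how the per-class values should be combined, so the sketch below assumes macro averaging (an unweighted mean over classes); a weighted average would be an equally defensible choice. `summary_row` is a hypothetical helper name, and the matrix layout is again assumed to be true classes as rows, predicted classes as columns.

```python
from typing import Dict, Sequence


def summary_row(cm: Sequence[Sequence[int]]) -> Dict[str, float]:
    """Return overall accuracy and macro-averaged precision and recall, in %."""
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n_classes))
    precisions, recalls = [], []
    for k in range(n_classes):
        tp = cm[k][k]
        predicted_k = sum(row[k] for row in cm)  # column total
        actual_k = sum(cm[k])                    # row total
        precisions.append(tp / predicted_k if predicted_k else 0.0)
        recalls.append(tp / actual_k if actual_k else 0.0)
    return {
        "accuracy": 100 * correct / total if total else 0.0,
        "precision": 100 * sum(precisions) / n_classes,
        "recall": 100 * sum(recalls) / n_classes,
    }


# Toy two-class matrix for illustration only.
toy_row = summary_row([[5, 0], [1, 4]])
```

Whichever averaging scheme is used, it should be stated in the report so the summary numbers can be compared with the published benchmarks.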

Dataset 2:
A sample Random Forest implementation is given to you. Repeat the procedure described for Dataset 1. The only difference is that, because there is no separate test set file, you need to use a 70:30 train-test split (70% for training and 30% for testing). K-fold cross-validation is also acceptable; however, because it can take a very long time to run, it is not mandatory.
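A 70:30 split and the optional k-fold alternative might look like the sketch below, assuming scikit-learn. Stratifying by class label and fixing the random seed are assumptions added here for reproducibility, not requirements of the task; `split_70_30` and `kfold_accuracy` are hypothetical helper names, and the arrays in the usage example are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import (
    StratifiedKFold,
    cross_val_score,
    train_test_split,
)


def split_70_30(X, y, seed=42):
    """Hold out 30% of the samples for testing, keeping class proportions."""
    return train_test_split(X, y, test_size=0.30, stratify=y, random_state=seed)


def kfold_accuracy(model, X, y, k=5, seed=42):
    """Return the accuracy of `model` on each of k stratified folds."""
    cv = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
    return cross_val_score(model, X, y, cv=cv, scoring="accuracy")


# Synthetic stand-in for the IoT dataset: 100 samples, 2 balanced classes.
X_demo = np.arange(200).reshape(100, 2)
y_demo = np.array([0, 1] * 50)
X_train, X_test, y_train, y_test = split_70_30(X_demo, y_demo)
```

With k-fold cross-validation every sample is used for testing exactly once, which is why it costs roughly k times as much as a single split.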

Comparison of Results:
Your results need to be comparable with those of benchmark algorithms. For example, for Dataset 1, see the results reported in the recent article "An Adaptive Ensemble Machine Learning Model for Intrusion Detection", published in IEEE Access, July 2019.

For Dataset 2, please see the article "TON_IoT Telemetry Dataset: A New Generation Dataset of IoT and IIoT for Data-Driven Intrusion Detection Systems".

Your results will not be exactly the same, and that is nothing to worry about. Your target is to select the best-performing algorithms you can and achieve comparable results.

Step 2: Data Visualization

Perform the following tasks for both of the datasets:

1. Visualize and compare the accuracy of different algorithms.
2. Plot the confusion matrix for each scenario.
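Both plots can be produced with matplotlib alone, as in the sketch below. It assumes a headless `Agg` backend so the script runs without a display; `plot_accuracy_bars` and `plot_confusion_matrix` are hypothetical helper names, and the numbers in the usage example are placeholders, not measured results.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend: figures are saved rather than shown
import matplotlib.pyplot as plt


def plot_accuracy_bars(accuracies):
    """Bar chart comparing accuracy (%) across algorithms."""
    names = list(accuracies)
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar(names, [accuracies[name] for name in names])
    ax.set_ylabel("Accuracy (%)")
    ax.set_ylim(0, 100)
    ax.set_title("Accuracy by algorithm")
    fig.tight_layout()
    return fig


def plot_confusion_matrix(cm, labels, title):
    """Heat map of a confusion matrix (rows: true class, columns: predicted)."""
    fig, ax = plt.subplots(figsize=(5, 4))
    image = ax.imshow(cm, cmap="Blues")
    fig.colorbar(image, ax=ax)
    ax.set_xticks(range(len(labels)))
    ax.set_xticklabels(labels)
    ax.set_yticks(range(len(labels)))
    ax.set_yticklabels(labels)
    ax.set_xlabel("Predicted class")
    ax.set_ylabel("True class")
    # Write the raw count into every cell.
    for i, row in enumerate(cm):
        for j, count in enumerate(row):
            ax.text(j, i, str(count), ha="center", va="center")
    ax.set_title(title)
    fig.tight_layout()
    return fig


# Placeholder values for illustration only.
bar_fig = plot_accuracy_bars({"Alg 1": 50.0, "Alg 2": 75.0})
cm_fig = plot_confusion_matrix([[3, 1], [0, 4]], ["A", "B"], "Alg 1 (toy data)")
```

Calling `fig.savefig("name.png")` on each returned figure writes an image file that can be embedded in the report.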

Step 3: Results delivery:

Once you have completed the data analysis task for your security project, you need to deliver the outcome. (PLEASE NOTE: the results obtained from the above steps need to be submitted in REPORT format rather than as screenshots.)

Here, you need to write a report (at least 3500 words) based on the outcome and results you obtained by performing the above steps. The report should describe the algorithms used, their working principles, their key parameters, and the results. The results should cover all the key performance measures and include comparative results in the form of tables, graphs, etc.

Attachment:- Predictive Modelling.rar



All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd