NIT3202 Data Analytics for Cyber Security Assignment

Reference no: EM132827551

NIT3202 Data Analytics for Cyber Security - Victoria University

In this assessment, you will apply supervised machine learning methods to classify Twitter spam using the provided dataset (the Twitter spam dataset is attached). Table 1 shows the feature descriptions of the dataset.

Follow the instructions, complete all the tasks, and organize your answers into an essay. Each question should include the R script, R screenshots, your results, and explanations. The answers should be written as an essay that meets the assessment requirements, e.g., an introduction to the assessment, a conclusion from the data exploration, and detailed explanations of the tasks you have done.

Please save the R script file under my name, "Noyanna NIT3202", so that it is visible in the screenshots. Make sure there are screenshots for each step showing the commands and the results, and explain the tasks and the output.

Here are your tasks:

1. Load the dataset into RStudio and randomly split it into a training dataset and a testing dataset at a ratio of 9:1.

2. Use the training dataset to train a machine learning model with the random forest algorithm for Twitter spam classification.

3. Use the testing dataset to test and evaluate the model trained in step 2, and print the confusion matrix.

4. Use the training dataset to train another machine learning model with the K nearest neighbours algorithm.

5. Use the testing dataset to test and evaluate the model trained in step 4, and print the confusion matrix.

6. Compare the performance of the Twitter spam classifiers built in step 2 and step 4. Which algorithm achieves better prediction results for this Twitter spam detection task? Why?

7. Change the ratio of training dataset to testing dataset to 8:2 and retrain the random forest and K nearest neighbours models. Compare their performance with the classifiers built in step 2 and step 4. Which ratio achieves better prediction results? Why?
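
The tasks above can be sketched in R roughly as follows. This is a minimal outline, not the required solution. It assumes the randomForest and class packages are installed, that the dataset is a CSV file (the name "twitter_spam.csv" is hypothetical), that the class label sits in a column called "label" (also hypothetical, since Table 1 is not reproduced here), and that every other column is a numeric, non-constant feature. The helper names split_data, evaluate_split and accuracy are introduced only for illustration.

```r
# Sketch only: file name, label column name and k are assumptions.
library(randomForest)  # provides randomForest()
library(class)         # provides knn()

# Randomly split a data frame into training and testing parts.
split_data <- function(df, train_frac, seed = 42) {
  set.seed(seed)  # fixed seed so the split is reproducible
  idx <- sample(nrow(df), size = floor(train_frac * nrow(df)))
  list(train = df[idx, , drop = FALSE], test = df[-idx, , drop = FALSE])
}

# Train random forest and kNN on one split; return both confusion matrices.
evaluate_split <- function(df, train_frac, label_col = "label", k = 5) {
  df[[label_col]] <- as.factor(df[[label_col]])
  parts <- split_data(df, train_frac)
  train <- parts$train
  test <- parts$test
  feats <- setdiff(names(df), label_col)

  # Steps 2 and 3: random forest, then confusion matrix on the test set.
  rf <- randomForest(x = train[, feats, drop = FALSE], y = train[[label_col]])
  rf_pred <- predict(rf, newdata = test[, feats, drop = FALSE])
  rf_cm <- table(Predicted = rf_pred, Actual = test[[label_col]])

  # Steps 4 and 5: kNN is distance based, so scale the features using the
  # training set's mean and standard deviation only.
  train_x <- scale(train[, feats, drop = FALSE])
  test_x <- scale(test[, feats, drop = FALSE],
                  center = attr(train_x, "scaled:center"),
                  scale = attr(train_x, "scaled:scale"))
  knn_pred <- knn(train = train_x, test = test_x, cl = train[[label_col]], k = k)
  knn_cm <- table(Predicted = knn_pred, Actual = test[[label_col]])

  list(rf = rf_cm, knn = knn_cm)
}

# Overall accuracy from a square confusion matrix.
accuracy <- function(cm) sum(diag(cm)) / sum(cm)

# Usage: runs only if the (hypothetical) file is present.
path <- "twitter_spam.csv"
if (file.exists(path)) {
  spam <- read.csv(path)
  for (frac in c(0.9, 0.8)) {  # steps 1 and 7: 9:1 and 8:2 splits
    res <- evaluate_split(spam, frac)
    cat("Training fraction:", frac, "\n")
    print(res$rf)
    print(res$knn)
    cat("RF accuracy:", accuracy(res$rf),
        " kNN accuracy:", accuracy(res$knn), "\n")
  }
}
```

Accuracy alone may not be enough for steps 6 and 7. Precision, recall and the false positive rate for the spam class can be read from the same confusion matrices, and the value of k here is an arbitrary starting point rather than a tuned choice.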

Attachment:- Data analytics for cyber security.rar
