Discuss seeding trends in brief

Reference no: EM132759795

Assignment: R Analysis and Report (maximum 3000 words)

In this assignment you are going to simulate data from an area of your own choosing. It can be cyber related, healthcare, industrial, financial/credit card fraud, commerce - anything. However, run your ideas past me first before diving in. If you recall from the dplyr tutorials, we were able to simulate small amounts of data based on several data frames. We then linked the data we required using join commands (e.g. left_join()), obtained summaries of the data, and used ggplot2 to highlight trends.

1. Choose your domain carefully. Give a rationale for simulating it.

2. Define your data frames and generate them using sample_n() and/or other commands. The charlatan package may be useful for generating personal names and other values. About 4-5 data frames will suffice.

3. Think about seeding trends and patterns in your simulated data that you can "detect" later.

4. Use dplyr to extract the columns you need from the dataframes.

5. Use some sort of analysis, such as summaries, to get statistics on your data. Break it down by a category variable, e.g. time, gender, fraudulent vs. normal, etc.

6. In the write-up, I will expect to see an introduction, a methods section, and then sections on simulating and transforming the data and on analysing the data. Plots should of course go in the Analysis section, where they will be marked.
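Steps 2 to 5 above can be sketched in a few lines of R. This is only a minimal illustration under stated assumptions, not a model answer: the credit card fraud domain, the table and column names (customers, transactions, is_fraud, etc.), the fraud probabilities, and the threefold amount multiplier are all hypothetical choices made for the example. It uses base sample() instead of sample_n(), and shows two data frames, not the 4-5 the brief asks for.

```r
library(dplyr)

set.seed(42)  # fixed seed so the simulated data are reproducible

n_customers <- 200   # assumed size, chosen for illustration
n_tx <- 5000         # assumed size, chosen for illustration

# Hypothetical customer table
customers <- tibble(
  customer_id = seq_len(n_customers),
  gender = sample(c("F", "M"), n_customers, replace = TRUE),
  region = sample(c("North", "South", "East", "West"),
                  n_customers, replace = TRUE)
)

# Hypothetical transaction table with two seeded patterns
transactions <- tibble(
  tx_id = seq_len(n_tx),
  customer_id = sample(customers$customer_id, n_tx, replace = TRUE),
  hour = sample(0:23, n_tx, replace = TRUE),
  amount = round(rlnorm(n_tx, meanlog = 3.5, sdlog = 0.6), 2)
) %>%
  mutate(
    # Seeded pattern 1 (assumed): fraud is ten times more likely
    # between 00:00 and 05:59 than during the rest of the day
    p_fraud = if_else(hour <= 5, 0.10, 0.01),
    is_fraud = rbinom(n(), size = 1, prob = p_fraud) == 1,
    # Seeded pattern 2 (assumed): fraudulent amounts are tripled
    amount = if_else(is_fraud, amount * 3, amount)
  ) %>%
  select(-p_fraud)

# Join the tables, keep only the columns needed, and label the period
tx_joined <- transactions %>%
  inner_join(customers, by = "customer_id") %>%
  select(tx_id, gender, hour, amount, is_fraud) %>%
  mutate(period = if_else(hour <= 5, "night", "day"))

# "Detect" the seeded patterns with grouped summaries
fraud_rate <- tx_joined %>%
  group_by(period) %>%
  summarise(n = n(), fraud_rate = mean(is_fraud), .groups = "drop")

amount_by_class <- tx_joined %>%
  group_by(is_fraud) %>%
  summarise(n = n(), mean_amount = mean(amount), .groups = "drop")

print(fraud_rate)
print(amount_by_class)
```

Because the patterns are injected deliberately, the summaries should recover them: the night-time fraud rate should sit well above the daytime rate, and the mean fraudulent amount well above the mean normal amount. Changing the seed or the assumed probabilities changes the exact figures but not the direction of the effect.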

Part 1: Analysis of the Data

You will need to develop R code to support your analysis; use dplyr where possible to get the numeric answers. With ggplot2, be careful about which type of plot you use and how you use it, as you have many records and want the charts to be readable. Place the R code in an appendix at the back of the report (it will not add to the word count). Divide the code into sections with # comments and include screenshots of the outputs.

• Simulation of data

• Transforming data

• Analysis of data and plots

• Write-up of the data analysis (in a similar format to my R tutorials)
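One way to keep a chart readable with many records is to aggregate with dplyr first and plot the summary instead of the raw rows. The sketch below assumes a hypothetical transactions data frame with hour and is_fraud columns and an invented night-time fraud pattern; none of these names or values come from the brief.

```r
library(dplyr)
library(ggplot2)

set.seed(1)  # reproducible simulation

n_tx <- 20000  # assumed number of records, chosen for illustration

# Hypothetical transactions: fraud is assumed to be more likely
# between 00:00 and 05:59
transactions <- tibble(
  hour = sample(0:23, n_tx, replace = TRUE),
  is_fraud = rbinom(n_tx, size = 1,
                    prob = if_else(hour <= 5, 0.10, 0.01)) == 1
)

# Reduce 20,000 rows to 24 before plotting
hourly <- transactions %>%
  group_by(hour) %>%
  summarise(n = n(), fraud_rate = mean(is_fraud), .groups = "drop")

# A bar per hour is far easier to read than 20,000 overlapping points
p <- ggplot(hourly, aes(x = hour, y = fraud_rate)) +
  geom_col() +
  scale_y_continuous(labels = function(x) paste0(round(100 * x), "%")) +
  labs(
    title = "Simulated fraud rate by hour of day",
    x = "Hour of day",
    y = "Fraud rate"
  )

print(p)
```

A bar chart of 24 aggregated values suits a rate broken down by a category. For continuous variables such as transaction amount, a histogram or box plot per group is usually a more readable choice than a scatter plot of every record.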

Part 2: Scale-up Report

The second part involves writing a report. Assuming your Part 1 was an initial study for your organisation, what issues arise when you scale it up and start using it in practice?

• Discussion of cyber security, big data, and GDPR issues

• Structure of report, neatness, references. Applies to both Part 1 and Part 2

Penalties: Do not exceed the word limit of 3,000 (a 10% margin is allowed), as marks will be deducted according to the university guidance on penalties.
Output: Submit an electronic PDF copy to Canvas before the deadline, along with a file containing your R code. The data should be generated by the R code, so do not submit any data.

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd