What is the average and median duration of the journeys

Reference no: EM131761942

Data Cleaning

Problem 1
The dataset is missing a lot of data. Suggest an explanation for:
(a) missing payment card numbers,
(b) missing prices (see note 1), and
(c) missing end addresses.
Finally, (d) calculate what percentage of prices are missing, and suggest a way to deal with both the missing prices and the other missing data before using the dataset for analysis.
Whatever you decide to do in (d), apply it to the given dataset, including fixing data that is present but inconsistent as well as data that is missing, and save the result as stage1.csv in your final submission.
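As a rough illustration of part (d), the share of missing prices is the number of empty price cells divided by the number of rows. The sketch below is in Python with the standard library only (the assignment itself asks for R, where the same steps apply); the sample rows and the column names `price`, `end_address` and `card` are hypothetical and will need to be adapted to the real file.

```python
import csv
import io

# Hypothetical sample; the real column names in the assignment dataset may differ.
SAMPLE = """start_time,end_time,start_address,end_address,price,card
2017-01-03 08:10,2017-01-03 08:35,Home,Office,21.5,1234
2017-01-03 18:02,,Office,,,
2017-01-04 08:12,2017-01-04 08:40,Home,Office,,1234
2017-01-05 09:00,2017-01-05 09:20,Home,Mall,18.0,1234
"""


def percent_missing(rows, column):
    """Return the percentage of rows whose value in `column` is empty."""
    missing = sum(1 for row in rows if not (row[column] or "").strip())
    return 100.0 * missing / len(rows)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
price_missing = percent_missing(rows, "price")
print(f"{price_missing:.1f}% of prices are missing")
```

The same helper can be run over every column to decide, column by column, whether to drop rows, impute values, or keep the gaps and flag them.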
Descriptive Statistics
For this problem, include in your LaTeX file the R code you used to answer each part of the question.

Problem 2
a) What is the average journey cost?
b) Which journey (between which addresses) is the most popular, and what is the median time for this journey?
c) What are the average and median durations of the journeys? Hint: You may find it helpful to create an additional column in Excel to hold the calculated journey times, and to read the resulting file into R before doing the analysis.
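For part (c), the journey time is the difference between the end and start timestamps, after which the mean and median follow directly. A minimal sketch in Python (standard library only; the assignment itself asks for R), assuming hypothetical timestamps in `YYYY-MM-DD HH:MM` format, which may not match the real dataset:

```python
from datetime import datetime
from statistics import mean, median

# Hypothetical (start, end) timestamps; the real dataset's format may differ.
journeys = [
    ("2017-01-03 08:10", "2017-01-03 08:35"),
    ("2017-01-04 08:12", "2017-01-04 08:40"),
    ("2017-01-05 09:00", "2017-01-05 09:20"),
    ("2017-01-05 23:50", "2017-01-06 00:35"),  # crosses midnight
]
FMT = "%Y-%m-%d %H:%M"


def duration_minutes(start, end):
    """Journey duration in minutes between two timestamp strings."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60


durations = [duration_minutes(s, e) for s, e in journeys]
mean_duration = mean(durations)
median_duration = median(durations)
print(f"mean = {mean_duration} min, median = {median_duration} min")
```

Working with full date-time values rather than clock times alone keeps journeys that cross midnight from producing negative durations.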
Modelling
For this problem, describe in your LaTeX file the R code you used to answer each part of the question.

Problem 3
Note 1 (for Problem 1b): The prices are not missing because the corresponding transactions were paid in cash.
a) Create a plot of the relationship between journey time and cost
b) Is there a linear relationship between these variables? Show your reasoning for this answer, mentioning the type of model you use to answer this question
c) Can you suggest a rough set of categories by which journeys can be clustered? Suggest the model that you can use to find this out, and the values representative of the clusters. Furthermore, explain what these clusters can be understood to represent conceptually with respect to the journeys
d) Run the appropriate validation test for your clustering model, and explain how this affects your certainty about your categories
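For part (b), one common way to check for a linear relationship is an ordinary least-squares fit of cost on journey time, judged by its slope and by R squared. The sketch below uses Python's standard library only (the assignment itself asks for R); the four data points are invented purely for illustration and are not taken from the dataset.

```python
from statistics import mean

# Invented illustration data: journey time in minutes and cost in SAR.
times = [10, 20, 30, 40]
costs = [12, 19, 33, 40]


def least_squares(x, y):
    """Fit y = intercept + slope * x; return (slope, intercept, r_squared)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = least_squares(times, costs)
print(f"cost = {intercept:.2f} + {slope:.2f} * time, R^2 = {r_squared:.2f}")
```

An R squared close to 1 together with residuals that show no pattern would support a linear relationship; a low value, or a clear curve in the residual plot, would argue against it.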
Question Refinement and Hypothesis Testing
For this problem, describe in your LaTeX file the R code you used to answer each part of the question.

Problem 4 : As a data scientist hired by Uber, you have been asked simply to figure out ways to reduce costs. However, you only have the attached customer data as input. Uber has said that this customer's behaviour is representative of an important sector of the market in the Al Naseem area.

Your task is to determine whether the question 'how can costs be reduced?' can be answered with the given data. Your initial consultation with someone in finance reveals that troublesome customers, defined as indecisive customers who keep cancelling their Ubers within 5 minutes of ordering them, are an increasing cost.

Your further discussion with the product engineering manager shows that there is an idea for creating a private rating of Uber users based on this troublesome behaviour: users who cancel a large percentage of their trips will be given low ratings, and users with low ratings will not be 'actually assigned Ubers' (even though the application may show otherwise) until a few minutes after they have ordered the Uber.

a) Explain briefly how costs can be reduced with such a rating system.
b) Suggest a refined question about saving costs, and state what you expect to gain from answering this question.
c) What would be a way to answer this question with the given data?
d) Suggest a hypothesis test, stating the null and alternative hypotheses. Assume here that if the user cancels 30 percent or more of their rides then they will get low ratings.
e) Perform the test on the attached dataset. Are you inclined to accept the null or the alternative hypothesis? Explain your choice.
f) Given that the user data you analysed is representative of 1000 users, and assuming that cancellations within 5 minutes cost on average 3 SAR, how much money do you think you can save, and over how many months?
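One possible framing of parts (d) to (f), offered as a sketch and not as the expected answer: let p be the user's true cancellation rate, and test H0: p = 0.30 against H1: p > 0.30 with a one-sided exact binomial test. The Python below uses the standard library only (the assignment itself asks for R); the ride counts, the cancellations per month and the helper names are hypothetical, and the savings figure assumes the rating system prevents every such cancellation, so it is an upper bound.

```python
from math import comb


def binomial_upper_tail(k, n, p0):
    """P(X >= k) for X ~ Binomial(n, p0): the one-sided exact p-value."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


def estimated_savings(users, cancellations_per_user_per_month, cost_sar, months):
    """Upper-bound saving if every early cancellation were avoided."""
    return users * cancellations_per_user_per_month * cost_sar * months


# Hypothetical counts; replace with the values from the attached dataset.
n_rides = 20
n_cancelled = 9
p_value = binomial_upper_tail(n_cancelled, n_rides, 0.30)
print(f"p-value = {p_value:.3f}")

# Hypothetical: 2 early cancellations per user per month, over 6 months.
savings = estimated_savings(1000, 2, 3, 6)
print(f"estimated saving: {savings} SAR")
```

A p-value below the chosen significance level (for example 0.05) would favour H1, meaning the user's cancellation rate is credibly above the 30 percent threshold; otherwise the data do not justify rejecting H0.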

Presentation

Problem 5: Communicate your problem, question, refined question, statistical test results and overall conclusions from Problem 4 to your manager using the necessary visualisations. You should use your results from the prior problems to support your final argument.





All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd