Reference no: EM132755604
CIS7031 Programming for Data Analysis - Cardiff University
Learning Outcome 1: Critically analyse and evaluate various statistical and computational techniques for analysing datasets and determine the most appropriate technique for a business problem;
Learning Outcome 2: Critically evaluate, develop and implement solutions for processing datasets and solving complex problems in various environments using relevant programming paradigms;
Learning Outcome 3: Evaluate and apply key steps and issues involved in data preparation, cleaning, exploring, creating, optimizing and evaluating models;
Learning Outcome 4: Evaluate and apply aspects of data science applications and their use.
EDGE
The Cardiff Met EDGE supports students in graduating with the knowledge, skills, and attributes that allow them to contribute positively and effectively to the communities in which they live and work.
This module assessment provides opportunities for students to demonstrate development of the following EDGE Competencies:
Assessment Requirements / Tasks (include all guidance notes)
This assignment uses COVID-19 data from the Our World in Data source. The dataset is updated daily and includes data on confirmed cases, deaths, hospitalizations, and testing, as well as other variables of potential interest. The description of the fields can be found here.
For this assignment students will undertake a data analysis and machine learning approach to reveal the .
1. Data processing
1.1. Download the dataset from the link above and create a dataframe that contains Europe data only.
1.2. Check for null values and outliers. If any are found, treat them appropriately (for example, replace them with zeros or means, or drop the affected rows).
1.3. Drop the columns related to handwashing, testing, smoking and continent.
Show the result.
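Steps 1.1 to 1.3 could be sketched with pandas roughly as follows. The column names (`continent`, `location`, `new_cases`, `total_tests`, `handwashing_facilities`, `female_smokers`, `male_smokers`) are assumed to match the Our World in Data CSV layout and should be checked against the actual download. The small inline frame holds made-up placeholder values and only stands in for the real file.

```python
import numpy as np
import pandas as pd

# Placeholder stand-in for the downloaded CSV (values are made up).
# With the real file, pd.read_csv on the downloaded path would replace this.
raw = pd.DataFrame({
    "continent": ["Europe", "Europe", "Asia", "Europe"],
    "location": ["Country A", "Country A", "Country X", "Country B"],
    "date": ["2020-03-01", "2020-03-02", "2020-03-01", "2020-03-01"],
    "new_cases": [30.0, np.nan, 3.0, 566.0],
    "total_tests": [np.nan, 100.0, 50.0, 200.0],
    "handwashing_facilities": [np.nan, np.nan, 59.6, np.nan],
    "female_smokers": [30.1, 30.1, 1.9, 19.8],
    "male_smokers": [35.6, 35.6, 20.6, 27.8],
})

# 1.1 Keep European rows only.
europe = raw[raw["continent"] == "Europe"].copy()

# 1.2 Count nulls per column, then treat them.
# Filling missing daily counts with 0 is one option; means or dropping
# rows may suit other columns better.
null_counts = europe.isna().sum()
europe["new_cases"] = europe["new_cases"].fillna(0)


def iqr_outliers(series: pd.Series) -> pd.Series:
    """Flag values outside 1.5 * IQR of the quartiles (one common rule)."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (series < q1 - 1.5 * iqr) | (series > q3 + 1.5 * iqr)


outlier_mask = iqr_outliers(europe["new_cases"])

# 1.3 Drop columns about handwashing, testing and smoking, plus continent.
keywords = ("handwashing", "test", "smoker")
drop_cols = [c for c in europe.columns if any(k in c for k in keywords)]
europe = europe.drop(columns=drop_cols + ["continent"])

print(null_counts)
print(europe)
```

Matching on keywords rather than listing every column by name is a design choice: the real dataset has several testing-related columns, so a name filter avoids missing one, but the resulting `drop_cols` list is worth printing and checking.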
2. Data analysis
For each question provide graph/chart along with your own interpretation (~ 50 words)
2.1. Which countries reported the highest and lowest number of COVID-19 cases over the period?
2.2. Which countries have the highest and lowest deaths per million residents?
2.3. On which date was the highest number of cases reported, and by which country?
2.4. On which date did the UK report its lowest number of cases?
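One possible way to answer questions 2.1 to 2.4 with pandas is sketched below. The column names (`location`, `date`, `new_cases`, `total_deaths_per_million`) are assumptions about the dataset layout, and the numbers are made-up placeholders rather than real figures.

```python
import pandas as pd

# Placeholder data (made-up values) shaped like the assumed dataset.
df = pd.DataFrame({
    "location": ["France", "France", "Italy", "Italy",
                 "United Kingdom", "United Kingdom"],
    "date": pd.to_datetime(["2020-03-01", "2020-03-02"] * 3),
    "new_cases": [30, 61, 566, 342, 12, 5],
    "total_deaths_per_million": [0.03, 0.04, 0.57, 0.86, 0.0, 0.01],
})

# 2.1 Total reported cases per country over the period.
cases = df.groupby("location")["new_cases"].sum()
highest_country, lowest_country = cases.idxmax(), cases.idxmin()

# 2.2 Latest cumulative deaths per million for each country.
deaths_pm = (df.sort_values("date")
               .groupby("location")["total_deaths_per_million"]
               .last())
highest_dpm, lowest_dpm = deaths_pm.idxmax(), deaths_pm.idxmin()

# 2.3 The single row with the largest daily case count.
peak = df.loc[df["new_cases"].idxmax(), ["date", "location", "new_cases"]]

# 2.4 Date on which the UK reported its fewest cases.
uk = df[df["location"] == "United Kingdom"]
uk_low_date = uk.loc[uk["new_cases"].idxmin(), "date"]

print(highest_country, lowest_country)
print(highest_dpm, lowest_dpm)
print(peak.to_dict())
print(uk_low_date.date())
```

For the required charts, `cases.sort_values().plot.bar()` would give a bar chart per country if matplotlib is installed. On real data, zero or negative daily counts (reporting corrections) may need handling before taking the minimum in 2.4.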
3. Visual analysis
Create a dynamic scatter/bubble plot showing the total cases per country over the period using Plotly Express. Write your interpretation of the findings (~100 words).
4. PCA/Correlation
4.1. Undertake a PCA with two principal components (columns named PC1 and PC2) and produce a scatter plot. Write your interpretation of the plot in relation to the analysis in sections 2 and 3, and explain the variance.
4.2. Is there a correlation between the number of deaths and age? Support your answer with discussion and plots.
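The two tasks in this section might be approached as in the sketch below, using scikit-learn for the PCA and pandas for the correlation. The feature columns (`total_cases_per_million`, `total_deaths_per_million`, `median_age`, `gdp_per_capita`) are assumed names, the per-country values are made-up placeholders, and standardising before PCA is a choice made here, not something the brief mandates.

```python
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Placeholder per-country summary (made-up values).
countries = pd.DataFrame(
    {
        "total_cases_per_million": [1000, 2500, 1800, 3200, 900, 2100],
        "total_deaths_per_million": [20, 60, 35, 80, 15, 50],
        "median_age": [38, 45, 41, 47, 36, 43],
        "gdp_per_capita": [25000, 40000, 30000, 35000, 20000, 45000],
    },
    index=[f"Country {c}" for c in "ABCDEF"],
)

# 4.1 Standardise so no column dominates because of its units,
# then project onto two principal components.
scaled = StandardScaler().fit_transform(countries)
pca = PCA(n_components=2)
pcs = pd.DataFrame(
    pca.fit_transform(scaled),
    columns=["PC1", "PC2"],
    index=countries.index,
)
explained = pca.explained_variance_ratio_  # share of variance per component

# 4.2 Pearson correlation between age and deaths per million.
r = countries["median_age"].corr(countries["total_deaths_per_million"])

print(pcs)
print("explained variance ratio:", explained)
print("age/deaths correlation:", r)
```

`pcs.plot.scatter(x="PC1", y="PC2")` would produce the scatter plot if matplotlib is available. The `explained` array is what the "explain the variance" part of 4.1 refers to: it gives the fraction of total variance captured by PC1 and PC2.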
5. Clustering (k means & hierarchical)
5.1. Using the hospital beds and GDP per capita columns, undertake a K-means clustering analysis (K=2 and K=3) and identify which countries cluster together. Write your own interpretation (~100 words).
5.2. Using the same data as in 5.1, create a hierarchical clustering. Compare the resulting clusters with the K-means clusters (~100 words).
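Both clustering tasks could be sketched as follows, with scikit-learn for K-means and SciPy for the hierarchical clustering. The column names `hospital_beds_per_thousand` and `gdp_per_capita` are assumed, the values are made-up placeholders, and Ward linkage is just one reasonable choice of method.

```python
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Placeholder per-country features (made-up values).
features = pd.DataFrame(
    {
        "hospital_beds_per_thousand": [2.5, 2.8, 3.0, 7.9, 8.1, 7.5],
        "gdp_per_capita": [20000, 22000, 21000, 45000, 47000, 44000],
    },
    index=[f"Country {c}" for c in "ABCDEF"],
)

# Scale first: GDP is orders of magnitude larger than bed counts and
# would otherwise dominate the Euclidean distances.
X = StandardScaler().fit_transform(features)

# 5.1 K-means for K=2 and K=3; labels collected in a separate frame.
labels = pd.DataFrame(index=features.index)
for k in (2, 3):
    km = KMeans(n_clusters=k, n_init=10, random_state=0)
    labels[f"kmeans_k{k}"] = km.fit_predict(X)

# 5.2 Agglomerative clustering on the same scaled data, cut into 2 groups.
Z = linkage(X, method="ward")
labels["hier_k2"] = fcluster(Z, t=2, criterion="maxclust")

print(labels)
```

`scipy.cluster.hierarchy.dendrogram(Z, labels=features.index.tolist())` would draw the tree for the comparison in 5.2. Cluster label numbers are arbitrary, so the comparison should look at which countries share a label rather than at the label values themselves.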
6. Discussion
Provide a brief discussion (~300 words) of COVID-19 deaths with respect to countries' economic activity and demographics, based on the results of the data analysis.
Attachment:- Programming for Data Analysis.rar