Develop and implement solutions for processing datasets

Reference no: EM133005762

CIS7031 Programming for Data Analysis - Cardiff University

The aim of this module is to help students acquire skills for job roles such as Data Scientist, Data Modeller and Data Analyst. Students taking this module will have the opportunity to understand and implement various statistical and computational techniques for analysing datasets using industry-standard software and programming languages.

Learning Outcome 1: Critically analyse and evaluate various statistical and computational techniques for analysing datasets and determine the most appropriate technique for a business problem;

Learning Outcome 2: Critically evaluate, develop and implement solutions for processing datasets and solving complex problems in various environments using relevant programming paradigms;

Learning Outcome 3: Evaluate and apply key steps and issues involved in data preparation, cleaning, exploring, creating, optimizing and evaluating models;

Learning Outcome 4: Evaluate and apply aspects of data science applications and their use.

Production Planning Analytics Data Challenge

Overview

Production planning is one of the major activities carried out by the planning department of every garment factory. This dataset contains six months of production planning data and the actual production data for the same period. The planning data is in the "Plan" folder, and the actual production data is in the "Production Quantities" folder.

"Plan" Folder
There are 49 files, which give production planning data for four line sections (LC Sec 1, LC Sec 2, LC Sec 3, LC Sec 4) for different periods. For example, the file name LC Sec 1- 01.02-01.12 indicates planning data for line section 1 for the period 01.02 to 01.12.
"Production Quantities" Folder

There are 118 files, which give actual production data for the same six-month period covered by the planning data. Each file holds the actual production data for one day. For example, PR 01.02.2018 - D051 gives the actual production data for 01.02.2018.
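
The two naming conventions described above can be turned into structured metadata before any file is read. The brief asks for R; the sketch below uses Python only to illustrate the logic. The regular expressions are assumptions inferred from the two example names alone, the optional file extension is an assumption, and the date tokens are kept as raw strings because the brief does not say whether 01.02 means day.month or month.day.

```python
import re

# Assumed patterns, inferred only from the two example names in the brief.
# An optional extension such as ".csv" is tolerated (assumption).
PLAN_RE = re.compile(
    r"^LC Sec (\d+)\s*-\s*(\d{2}\.\d{2})\s*-\s*(\d{2}\.\d{2})(?:\.\w+)?$"
)
PROD_RE = re.compile(r"^PR (\d{2}\.\d{2}\.\d{4})\s*-\s*(\w+)(?:\.\w+)?$")


def parse_plan_name(name):
    """Split a plan file name into section number and period tokens."""
    m = PLAN_RE.match(name.strip())
    if m is None:
        return None  # name does not follow the assumed convention
    return {"section": int(m.group(1)), "start": m.group(2), "end": m.group(3)}


def parse_production_name(name):
    """Split a production file name into its date token and trailing code."""
    m = PROD_RE.match(name.strip())
    if m is None:
        return None
    return {"date": m.group(1), "code": m.group(2)}


plan_info = parse_plan_name("LC Sec 1- 01.02-01.12")
prod_info = parse_production_name("PR 01.02.2018 - D051")
```

Returning None for names that do not match makes it easy to list files that break the convention instead of silently mislabelling them.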
The following information is also provided to help you explore the datasets:
• S/O - Sales Order
• LI - Line Item - An item appearing on a single line with unique color, size, etc. LI differs from one to another with the color, size and other features.
• S/O and LI combined form a unique key
• SMV - Standard Minute Value (Standard Time taken to finish a particular product)
• Style - Product
• Efficiency = Standard Hours / Work Hours
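
The composite key and the efficiency formula above can be sketched as follows. This is an illustration in Python rather than R, under stated assumptions: the field names (so, li, smv, qty, work_hours) and the sample rows are made up, and deriving standard hours as SMV × quantity / 60 assumes SMV is given in minutes per piece, which the brief implies but does not state.

```python
def make_key(sales_order, line_item):
    """Combine S/O and LI into one hashable key (whitespace trimmed)."""
    return (str(sales_order).strip(), str(line_item).strip())


def duplicate_keys(rows):
    """Return keys seen more than once; an empty list means the key is unique."""
    seen, dupes = set(), []
    for row in rows:
        key = make_key(row["so"], row["li"])
        if key in seen and key not in dupes:
            dupes.append(key)
        seen.add(key)
    return dupes


def standard_hours(smv_minutes, quantity):
    """Assumption: standard hours = SMV (minutes per piece) * pieces / 60."""
    return smv_minutes * quantity / 60.0


def efficiency(std_hours, work_hours):
    """Efficiency = Standard Hours / Work Hours; None when hours are not positive."""
    if work_hours <= 0:
        return None
    return std_hours / work_hours


# Made-up rows for illustration only; not taken from the dataset.
rows = [
    {"so": "1001", "li": "10", "smv": 12.0, "qty": 100, "work_hours": 25.0},
    {"so": "1001", "li": "20", "smv": 6.0, "qty": 50, "work_hours": 10.0},
]

dupes = duplicate_keys(rows)
effs = [
    efficiency(standard_hours(r["smv"], r["qty"]), r["work_hours"]) for r in rows
]
```

Checking for duplicate keys before merging the plan and production data helps confirm that the S/O and LI pair really is unique in each file.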

As a data scientist, your task is to clean, normalise and transform these data into R-compatible formats and undertake extensive data mining using Machine Learning. The main objective of this data challenge is to develop a Machine Learning model that identifies data patterns and forecasts the actual production based on the plan. Report on any interesting patterns (for example, order patterns) that your data analysis reveals, with visualizations where possible.
In your discussion, provide a critical synopsis of the challenges of data analysis, integration and visualisation you faced during this exercise. State the relevant assumptions you made, with valid justifications.

Assignment Tasks

a. Provide a detailed description of each dataset, its properties and relationships

b. Read data from csv files into the R environment for processing

c. Clean any outliers and exceptional values from the datasets

d. Apply normalization and scaling

e. Merge the datasets

f. Create training and test datasets, if required

g. Train a model on the data

h. Apply different Machine Learning approaches and discuss them

i. Report the accuracy of each model

j. Explore alternative ways of normalization and model building, and their performance

k. Present the patterns identified and their visualizations

l. Provide a detailed comparative analysis of the scaling and Machine Learning approaches: strengths, limitations, uniqueness

m. Relate the comparative analysis to integration, transformation, visualization and data mining

n. Provide a brief discussion of the knowledge gained
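
Tasks d and j ask for normalization and for alternatives to it. As a minimal sketch of two common options, the functions below implement min-max scaling and z-score standardisation. They are written in Python for illustration rather than in R, the input values are made up, and the z-score here divides by the population standard deviation (a sample-based version would use n - 1).

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Centre values on the mean and divide by the population std deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up quantities for illustration only.
quantities = [10, 20, 30]
scaled = min_max_scale(quantities)
standardised = z_score_scale(quantities)
```

Min-max scaling is sensitive to extreme values because the minimum and maximum set the range, so it is usually applied after outliers have been handled (task c); z-scores are less affected by a single extreme value but do not bound the result.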

Attachment:- Programming for Data Analysis.rar




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd