Develop a product from the initial stage of requirement


Reference no: EM132931146

5011CEM Big Data Programming Project - Coventry University

Learning Outcome 1: COMPUTATIONAL THINKING: develop and understand algorithms to solve problems; measure and optimise algorithm complexity; appreciate the limits of what may be done algorithmically in reasonable time or at all.

Learning Outcome 2: PROGRAMMING: create working solutions to a variety of computational and real world problems using multiple programming languages chosen as appropriate for the task.

Learning Outcome 3: DATA SCIENCE: work with (potentially large) datasets; using appropriate storage technology; applying statistical analysis to draw meaningful conclusions; and using modern machine learning tools to discover hidden patterns.

Learning Outcome 4: SOFTWARE DEVELOPMENT: develop a product from the initial stage of requirement / analysis all the way through development to its final stages of testing / evaluation.

Learning Outcome 5: PROFESSIONAL PRACTICE: understand professional practices of the modern IT industry which include those technical (e.g. version control / automated testing) but also social, ethical & legal responsibilities.

Learning Outcome 6: TRANSFERABLE SKILLS: apply a wide variety of degree level transferable skills including time management, team working, written and verbal presentation to both experts and non-experts, and critical reflection on own and others' work.

Learning Outcome 7: ADVANCED WORK: apply the above to advanced topics selected according to the interests of individual students.

Assessment Overview

Over the course of this module you have been introduced to a range of techniques that may be used for programming a big data project. This assessment allows you to pull together these techniques in a realistic scenario to complete a big data analysis project. Below is a realistic project scenario. By using the techniques presented during class you are to carry out the project and write a final project report for your client.

Project Scenario
You have been approached by a client who analyses atmospheric science and climate model data. They have developed a new analysis technique, but it takes too long to run for them to use it. They have asked you to investigate the use of big data techniques to reduce the processing time.

They have a large volume of data to process, and the analysis needs to be repeated frequently. They have the following basic requirements:

1. Current analysis time is approximately 2.5 hours to analyse the climate model output data for a 1-hour time period.

2. The data for a single day of model output is approximately 250 MB. However, they have over 100 years' worth of data to analyse, making a total of over 9 TB.

3. Each day, they need to analyse the new data set for that day, so they wish to complete the analysis of the data for a 24-hour period (25 data sets) in under 2 hours.

4. It is not possible to hold all of this data in memory at one time, so the new process should load only 1 hour of data for processing at a time. If parallel processing is to occur, then 1 hour of data per worker can be loaded as needed.
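
The figures in these requirements imply a rough lower bound on the number of workers. The sketch below is a back-of-envelope estimate in Python (chosen here for illustration; the brief does not name a language). It assumes the work can be divided evenly among workers, and the `efficiency` parameter is an assumption introduced here to model imperfect scaling, not something the client has specified.

```python
import math

# Figures taken from the requirements above.
HOURS_PER_DATASET = 2.5   # current analysis time for one 1-hour data set
DATASETS_PER_DAY = 25     # data sets in one 24-hour period
TARGET_HOURS = 2.0        # required wall-clock time for the daily run


def workers_needed(hours_per_dataset, datasets, target_hours, efficiency=1.0):
    """Estimate the workers needed to finish within target_hours.

    efficiency is a hypothetical parallel efficiency in (0, 1];
    1.0 means ideal scaling with no overhead.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    sequential_hours = hours_per_dataset * datasets
    return math.ceil(sequential_hours / (target_hours * efficiency))


ideal = workers_needed(HOURS_PER_DATASET, DATASETS_PER_DAY, TARGET_HOURS)
derated = workers_needed(HOURS_PER_DATASET, DATASETS_PER_DAY, TARGET_HOURS, 0.8)
print(f"ideal scaling: {ideal} workers, at 80% efficiency: {derated} workers")
```

Note that a single data set already takes 2.5 hours sequentially, which is longer than the 2-hour target. Handing whole data sets to workers cannot meet the target by itself, so some of the speedup would have to come from parallelising the analysis within each data set. Measured timings should replace the ideal-scaling assumption once they are available.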

You have been tasked with investigating the use of parallel processing to achieve the analysis speed required, with the following expectations:

1. Test and compare the processing speed of sequential and parallel processing.

2. Extrapolate your findings to indicate the number of processors required to achieve the target processing time.

3. Test how your code responds to common errors, e.g. data that is text instead of numeric, use of NaN in the data as an error code.

4. Run automated tests that allow your client to set the tests running and return later to see the results, without user intervention.
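
One way these expectations might be approached is sketched below in Python, using only the standard library. It is an illustration under stated assumptions, not the client's method: `analyse_hour` is a hypothetical stand-in for the real analysis, and the function names, the error messages and the choice of `ProcessPoolExecutor` are all choices made here for the example.

```python
import math
import time
import unittest
from concurrent.futures import ProcessPoolExecutor


def validate(values):
    """Return the values as floats, rejecting text and NaN error codes."""
    cleaned = []
    for index, raw in enumerate(values):
        try:
            number = float(raw)  # numeric strings such as "1.5" are accepted
        except (TypeError, ValueError):
            raise ValueError(f"non-numeric value at index {index}: {raw!r}") from None
        if math.isnan(number):
            raise ValueError(f"NaN error code at index {index}")
        cleaned.append(number)
    if not cleaned:
        raise ValueError("no data to analyse")
    return cleaned


def analyse_hour(values):
    """Hypothetical stand-in for the client's analysis of one hour of data."""
    data = validate(values)
    return sum(x * x for x in data) / len(data)


def run_sequential(hours):
    """Analyse each hour of data one after another."""
    return [analyse_hour(hour) for hour in hours]


def run_parallel(hours, workers):
    """Analyse the hours across worker processes, one hour per task."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyse_hour, hours))


def timed(func, *args):
    """Return (result, elapsed seconds) for one call of func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


class AnalysisTests(unittest.TestCase):
    """Checks that can run unattended, e.g. via `python -m unittest`."""

    def test_text_is_rejected(self):
        with self.assertRaises(ValueError):
            analyse_hour([1.0, "abc", 3.0])

    def test_nan_is_rejected(self):
        with self.assertRaises(ValueError):
            analyse_hour([1.0, float("nan")])

    def test_known_result(self):
        self.assertEqual(run_sequential([[1, 2], [3]]), [2.5, 9.0])
```

Comparing `timed(run_sequential, hours)` with `timed(run_parallel, hours, n)` for several values of `n` gives the timings needed for the extrapolation. On platforms that start worker processes by spawning, `run_parallel` should be called from under an `if __name__ == "__main__":` guard. In a real run, passing file names rather than loaded arrays would let each worker load only its own hour of data, in line with the memory requirement.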

Attachment:- project_report_brief.rar

