5011CEM Big Data Programming Project Assignment

Reference no: EM132931129, Length: 2000 words

5011CEM Big Data Programming Project - Coventry University

Learning Outcome 1: COMPUTATIONAL THINKING: develop and understand algorithms to solve problems; measure and optimise algorithm complexity; appreciate the limits of what may be done algorithmically in reasonable time or at all.

Learning Outcome 2: PROGRAMMING: create working solutions to a variety of computational and real-world problems using multiple programming languages chosen as appropriate for the task.

Learning Outcome 3: DATA SCIENCE: work with (potentially large) datasets; using appropriate storage technology; applying statistical analysis to draw meaningful conclusions; and using modern machine learning tools to discover hidden patterns.

Learning Outcome 4: SOFTWARE DEVELOPMENT: develop a product from the initial stage of requirement / analysis all the way through development to its final stages of testing / evaluation.

Learning Outcome 5: PROFESSIONAL PRACTICE: understand professional practices of the modern IT industry which include those technical (e.g. version control / automated testing) but also social, ethical & legal responsibilities.

Learning Outcome 6: TRANSFERABLE SKILLS: apply a wide variety of degree level transferable skills including time management, team working, written and verbal presentation to both experts and non-experts, and critical reflection on own and others' work.

Learning Outcome 7: ADVANCED WORK: apply the above to advanced topics selected according to the interests of individual students.

Assessment Overview

Over the course of this module you have been introduced to a range of techniques that may be used for programming a big data project. This assessment allows you to pull together these techniques in a realistic scenario to complete a big data analysis project. Below is a realistic project scenario. By using the techniques presented during class you are to carry out the project and write a final project report for your client.

Project Scenario
You have been approached by a client who analyses atmospheric science and climate model data. They have developed a new analysis technique, but it takes too long to run for them to use it. They have asked you to investigate the use of big data techniques to reduce the processing time.

They have a large volume of data to process, and the analysis needs to be repeated frequently. They have the following basic requirements:

1. Current analysis time is approximately 2.5 hours to analyse the climate model output data for a 1-hour time period.

2. The data for a single day of model output is approximately 250 MB. However, they have over 100 years' worth of data to analyse, making a total of over 9 TB.

3. Each day, they need to analyse the new data set for that day, so they wish to complete the analysis of the data for a 24-hour period (25 data sets) in under 2 hours.

4. It is not possible to hold all of this in memory at one time, so the new process should load only 1 hour of data for processing at a time. If parallel processing is to occur, then 1 hour of data per worker can be loaded as needed.
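
The figures in requirements 1 and 3 imply a minimum speedup, which can be worked out before any code is timed. The sketch below is illustrative only: the function names are hypothetical, and the first estimate assumes ideal linear speedup, which real systems rarely achieve. The second function assumes the work can only be divided into whole 1-hour data sets.

```python
import math

# Figures taken from the client's requirements.
HOURS_PER_DATASET = 2.5   # current analysis time for one 1-hour data set
DATASETS_PER_DAY = 25     # data sets in the 24-hour period
TARGET_HOURS = 2.0        # required wall-clock time for the whole day


def sequential_hours(n_datasets=DATASETS_PER_DAY, hours_each=HOURS_PER_DATASET):
    """Total time if every data set is analysed one after another."""
    return n_datasets * hours_each


def min_workers_ideal(total_hours, target_hours=TARGET_HOURS):
    """Smallest worker count that meets the target, ASSUMING ideal linear speedup."""
    return math.ceil(total_hours / target_hours)


def wall_hours_whole_datasets(workers, n_datasets=DATASETS_PER_DAY,
                              hours_each=HOURS_PER_DATASET):
    """Wall-clock time when each worker analyses whole data sets, one at a time."""
    rounds = math.ceil(n_datasets / workers)
    return rounds * hours_each


total = sequential_hours()
print(total)                          # 62.5
print(min_workers_ideal(total))       # 32
print(wall_hours_whole_datasets(25))  # 2.5
```

Note what the last line shows: with one worker per data set the day still takes 2.5 hours, because a single data set takes 2.5 hours on its own. Under these assumptions the 2-hour target also needs the analysis of each individual data set to be parallelised or otherwise made faster.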

You have been tasked with investigating the use of parallel processing to achieve the analysis speed required, with the following expectations:

1. Test and compare the processing speed of sequential and parallel processing

2. Extrapolate your findings to indicate the number of processors required to achieve the target processing time.

3. Test how your code responds to common errors, e.g. data that is text instead of numeric, use of NaN in the data as an error code.

4. Run automated tests that allow your client to set the tests running and return later to see the results, without user intervention.
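
One possible shape for expectations 1, 3 and 4 is sketched below. It is a minimal illustration under stated assumptions, not a prescribed solution: each hour of data is assumed to be a plain list of values, the "analysis" is a placeholder mean, and the names `find_problems`, `analyse_or_report` and `timed` are hypothetical. Bad data is reported per hour instead of stopping the run, so a batch can finish without user intervention. A thread pool is used only to keep the sketch portable; for CPU-bound work in Python, `concurrent.futures.ProcessPoolExecutor` offers the same `map` interface.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def find_problems(values):
    """Return a list of data problems in one hour of data (empty if clean)."""
    problems = []
    if not values:
        problems.append("no values")
    for i, v in enumerate(values):
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            problems.append(f"index {i}: non-numeric value {v!r}")
        elif math.isnan(v):
            problems.append(f"index {i}: NaN")
    return problems


def analyse_or_report(values):
    """Return (result, None) for clean data, or (None, message) for bad data."""
    problems = find_problems(values)
    if problems:
        return None, "; ".join(problems)
    # Placeholder analysis (assumption): the mean of the hour's values.
    return sum(values) / len(values), None


def timed(map_fn, hours):
    """Run the analysis over all hours with a map-like function and time it."""
    start = time.perf_counter()
    outcomes = list(map_fn(analyse_or_report, hours))
    return outcomes, time.perf_counter() - start


if __name__ == "__main__":
    # Synthetic stand-in data: 25 hours, two of them deliberately corrupted.
    hours = [[float(h + i) for i in range(1000)] for h in range(25)]
    hours[3][10] = float("nan")
    hours[7][0] = "n/a"

    _, t_seq = timed(map, hours)                  # sequential baseline
    with ThreadPoolExecutor(max_workers=4) as pool:
        outcomes, t_par = timed(pool.map, hours)  # same work, pooled

    failures = [(h, msg) for h, (_, msg) in enumerate(outcomes) if msg]
    print(f"sequential {t_seq:.4f}s, parallel {t_par:.4f}s")
    print(f"failed hours: {failures}")
```

Passing the mapping function in as an argument keeps the timed code path identical for the sequential and parallel runs, so any difference in elapsed time comes from the execution strategy and not from the harness.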

Assignment Brief 2

Learning Outcome 1: DATA SCIENCE: work with (potentially large) datasets; using appropriate storage technology; applying statistical analysis to draw meaningful conclusions; and using modern machine learning tools to discover hidden patterns.

B6: PROFESSIONAL PRACTICE: understand professional practices of the modern IT industry which include those technical (e.g. version control / automated testing) but also social, ethical & legal responsibilities.

B7: TRANSFERABLE SKILLS: apply a wide variety of degree level transferable skills including time management, team working, written and verbal presentation to both experts and non-experts, and critical reflection on own and others' work.

VIVA TASK

The VIVA will take the form of a submission of a recorded presentation of your work.

The recording should be an informal, meeting-like presentation and should be considered as an opportunity to showcase your work. The aim is for you to present your work clearly and effectively to your client.

You are allowed 5 minutes to deliver your main content. You will then answer the questions below, where you are allowed up to 1 minute per answer. Poor timing will affect your grade.

VIVA Questions

Following the presentation of your work, please verbally answer the following questions. Keep your answers brief and concise and take account of the timing indicated for each.

1. You have tested your code using ozone (o3). We have many chemical species to analyse. How would you need to adapt your code to work with carbon monoxide (CO), for example?

2. If we wanted to analyse multiple chemical species at the same time, how would that affect our HPC requirements, e.g. number of processors?

3. One of our measuring instruments uses different text entries for errors, e.g. "Instrument Error", "Communication Error" as an error code, not NaN. How might you adapt your code to check and report errors?
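
As a concrete reference point for questions 1 and 3, the sketch below shows one way a checker could take the species as a parameter and count text error entries alongside NaN. It is an illustration under assumptions, not the expected answer: the record layout (a dict mapping a species key to a list of values), the key names, and the functions `classify` and `error_report` are all hypothetical.

```python
import math
from collections import Counter

# Text error entries quoted in question 3; extend as other instruments require.
KNOWN_ERROR_TEXT = {"Instrument Error", "Communication Error"}


def classify(value):
    """Label one value as 'ok', 'NaN', a known error text, or unrecognised."""
    if isinstance(value, str):
        text = value.strip()
        return text if text in KNOWN_ERROR_TEXT else "unrecognised text"
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return "NaN" if math.isnan(value) else "ok"
    return "unrecognised type"


def error_report(record, species="o3"):
    """Count each kind of problem found in the values for one species."""
    counts = Counter(classify(v) for v in record[species])
    return {kind: n for kind, n in counts.items() if kind != "ok"}


# Hypothetical record layout: species key -> list of values.
record = {
    "o3": [1.0, float("nan"), 2.5],
    "co": [0.2, "Instrument Error", "Communication Error", "Instrument Error", "??"],
}
print(error_report(record))        # {'NaN': 1}
print(error_report(record, "co"))
```

Because the species is an argument and the error texts live in one set, switching from ozone to carbon monoxide or adding a new error entry is a data change and does not require altering the checking logic.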


