Perform a data analysis of a data set

Reference no: EM133180774

Big Data Analytics

Repeat Assignment Worth 60% of Module Grade

1 Description and Submission Format

In this assignment, you are asked to analyse a data set using the R language. Submit a PDF document generated from your R Markdown source.

2 Data set

The data set to be used for the analysis is the Student Performance Data Set.

Tasks

Task 1

Your first task is to perform an exploratory analysis of the data set, which should give you a basic understanding of the data. Load your data from a file, then clean it as far as possible so that further analysis is easier. Finally, explore the data by visualising and summarising it. You should also look at the relationships between variables and check the "strength" of those relationships. Your report should include some of the plots and summaries, with explanations.
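The load, clean, summarise and correlate steps described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the data frame is simulated so that the code runs on its own, and the column names (sex, studytime, absences, G3) as well as the commented-out file name and separator are assumptions, not details given in this brief.

```r
# Synthetic stand-in for the real file; all column names here are assumptions.
set.seed(1)
n <- 200
students <- data.frame(
  sex       = sample(c("F", "M"), n, replace = TRUE),
  studytime = sample(1:4, n, replace = TRUE),
  absences  = rpois(n, 4),
  stringsAsFactors = FALSE
)
students$G3 <- 8 + 1.2 * students$studytime - 0.2 * students$absences +
  rnorm(n, sd = 3)
students$G3[sample(n, 5)] <- NA  # inject a few missing grades to clean up

# With the real data, the block above would be replaced by something like:
# students <- read.csv("student-mat.csv", sep = ";")  # hypothetical path and separator

# Clean: encode categories as factors and drop incomplete rows.
students$sex <- factor(students$sex)
students <- students[complete.cases(students), ]

# Summarise every column.
print(summary(students))

# Strength of the linear relationships between the numeric variables.
numeric_cols <- sapply(students, is.numeric)
cor_matrix <- cor(students[, numeric_cols])
print(round(cor_matrix, 2))

# Correlation plus a significance test for one pair of variables.
print(cor.test(students$studytime, students$G3))

# Two quick base-graphics views: a category split and a scatter plot matrix.
boxplot(G3 ~ sex, data = students, main = "Final grade by sex")
pairs(students[, numeric_cols], main = "Numeric variables")
```

Base graphics are used here only to keep the sketch free of extra packages; any plotting library would serve the same purpose in the report.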

Task 2

The second task is quite open. You have done a preliminary exploration of the data set. At this point you should understand the domain of your data set and have seen how the different attributes of the data look. Your final goal is to report some findings (or the lack of them), and you should provide evidence that they are statistically sound. The following points are just hints of what might be interesting to do or look at:
• Take a look at the plots you created in the first part: what conclusions can be drawn from them? These could be your hypotheses.

• The data contains categorical variables: is there a difference between instances belonging to one category and those belonging to another? Even if you do not see clear differences, you could perform a statistical test to check whether some properties change across categories.
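As one possible way to test the kind of hypothesis hinted at above, the sketch below compares a numeric grade between two categories and checks whether two categorical variables are associated. The data is simulated, and the column names (sex, schoolsup, G3) are assumptions made for illustration; the choice of tests is also only a suggestion, since the brief does not prescribe any.

```r
# Simulated stand-in data; column names are assumptions.
set.seed(42)
n <- 120
students <- data.frame(
  sex       = factor(sample(c("F", "M"), n, replace = TRUE)),
  schoolsup = factor(sample(c("no", "yes"), n, replace = TRUE,
                            prob = c(0.8, 0.2)))
)
students$G3 <- rnorm(n, mean = 11, sd = 3)

# Welch two-sample t-test: does the mean grade differ between the two groups?
welch <- t.test(G3 ~ sex, data = students)
print(welch)

# Rank-based alternative that does not assume normally distributed grades.
rank_test <- wilcox.test(G3 ~ sex, data = students)
print(rank_test)

# Chi-squared test of independence between two categorical variables.
assoc <- chisq.test(table(students$sex, students$schoolsup))
print(assoc)

# Collect the p-values for reporting.
p_values <- c(welch = welch$p.value,
              wilcoxon = rank_test$p.value,
              chi_squared = assoc$p.value)
print(round(p_values, 3))
```

Because the grades here are generated independently of the categories, large p-values are the expected outcome; with the real data, each result would need to be interpreted against the hypothesis it was meant to test.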

Task 3

• Perform linear regression with multiple variables to predict the student grade. Normalize the data and repeat the multiple linear regression on the normalized data. Highlight the difference in prediction accuracy between the original and the normalized data.
• Perform classification to predict an appropriate categorical variable. Normalize the data and repeat the classification on the normalized data. Highlight the difference in prediction accuracy between the original and the normalized data.
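A rough sketch of the raw-versus-normalized comparison for both bullets is given below. Everything specific in it is an assumption: the data is simulated, the predictor names (studytime, absences, failures) and the pass/fail target derived from G3 are hypothetical, z-score standardisation is used as the normalization, and k-nearest neighbours (from the class package, which ships with standard R installations) is just one possible classifier.

```r
library(class)  # provides knn()

# Simulated stand-in data; all column names are assumptions.
set.seed(7)
n <- 300
students <- data.frame(
  studytime = sample(1:4, n, replace = TRUE),
  absences  = rpois(n, 5),
  failures  = sample(0:3, n, replace = TRUE, prob = c(0.7, 0.15, 0.1, 0.05))
)
students$G3 <- 10 + students$studytime - 0.15 * students$absences -
  1.5 * students$failures + rnorm(n, sd = 2.5)
students$pass <- factor(ifelse(students$G3 >= 10, "pass", "fail"))

# 70/30 train/test split.
predictors <- c("studytime", "absences", "failures")
train_idx <- sample(n, size = 210)
train <- students[train_idx, ]
test  <- students[-train_idx, ]

# Standardise predictors using training-set statistics only.
centre <- colMeans(train[predictors])
spread <- sapply(train[predictors], sd)
train_scaled <- train
test_scaled  <- test
train_scaled[predictors] <- as.data.frame(
  scale(train[predictors], center = centre, scale = spread))
test_scaled[predictors] <- as.data.frame(
  scale(test[predictors], center = centre, scale = spread))

# Regression: test-set RMSE on raw and on standardised predictors.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
model_formula <- G3 ~ studytime + absences + failures
rmse_raw <- rmse(test$G3,
                 predict(lm(model_formula, data = train), newdata = test))
rmse_scaled <- rmse(test_scaled$G3,
                    predict(lm(model_formula, data = train_scaled),
                            newdata = test_scaled))
print(c(rmse_raw = rmse_raw, rmse_scaled = rmse_scaled))

# Classification: test-set accuracy of 5-nearest neighbours on both versions.
acc_raw <- mean(knn(train[predictors], test[predictors],
                    cl = train$pass, k = 5) == test$pass)
acc_scaled <- mean(knn(train_scaled[predictors], test_scaled[predictors],
                       cl = train_scaled$pass, k = 5) == test_scaled$pass)
print(c(acc_raw = acc_raw, acc_scaled = acc_scaled))
```

One point worth keeping in mind when interpreting the results: ordinary least squares with an intercept produces the same predictions after a linear rescaling of the predictors, so the two RMSE values should agree up to rounding. Distance-based classifiers such as k-nearest neighbours, by contrast, can change noticeably, because variables with larger ranges otherwise dominate the distance.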

Submission

Write your code in an R Markdown document and present your preliminary data analysis in the form of a report. Do not put all of the plots in the report; decide which are useful and which are interesting to explore. Use multidimensional plots to present multiple variables.

• You can also get up to 10 points for clarity and quality of the report and the source code.
• Acceptable file format: knit your R Markdown document to PDF output. Use the submission link on Moodle to upload your final PDF report.


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd