Complete the analysis of what sorts of people were likely to survive

Reference no: EM13934799

Data Science Project Report

Submit the files listed below in a single ZIP file:

• Titanic.rmd - R Markdown document used to generate your Data Science Project Report. An initial Sample_RMD.rm template file has been provided for you.

• Titanic.html - standalone HTML document (embedded images and code) generated in RStudio using knitr and your Titanic.rmd R Markdown file.

• /data directory with your dataset files

Reproducible Research

For your Data Science Project Report you are expected to meet the criteria of a reproducible research project. Your project report will document your analysis of the Titanic dataset. It will include your initial data exploration, your model building and evaluation, and your final predicted outcomes for the test dataset. For your research to be considered reproducible you must provide:

• The data used for your analysis

• All final code files, with appropriate comments

• A report of your analysis that includes background information explaining the question you are trying to answer, a discussion of the analysis, and the conclusions reached for your project, with appropriate supporting explanations and figures.

To comply with this final requirement, your final report will be a standalone HTML document created in RStudio with the knitr and R Markdown tools. Using knitr with R Markdown allows you to create a report that interweaves your discussion with your code and figures. See R Markdown - Dynamic Documents for R in the list of online resources provided below for further information.
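As a rough illustration of the knitting step, the sketch below renders an R Markdown file to a single HTML file from the R console instead of the RStudio Knit button. It assumes the rmarkdown package is installed and that Titanic.rmd sits in the working directory; the helper name render_report is made up for this example.

```r
# Hypothetical helper: knit an R Markdown file into one standalone HTML file.
# Assumes the rmarkdown package (and the pandoc bundled with RStudio) is available.
render_report <- function(input = "Titanic.rmd", output = "Titanic.html") {
  if (!requireNamespace("rmarkdown", quietly = TRUE)) {
    stop("Install the rmarkdown package first: install.packages('rmarkdown')")
  }
  if (!file.exists(input)) {
    stop("Cannot find ", input, " in ", getwd())
  }
  rmarkdown::render(
    input,
    # self_contained = TRUE embeds images and styles in the HTML file itself.
    output_format = rmarkdown::html_document(self_contained = TRUE),
    output_file = output
  )
}

# Usage, once Titanic.rmd exists:
# render_report()
```

Rendering from a clean R session like this is also a quick check that the report really is reproducible from the files in the ZIP alone.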

Data Analysis Project

This assessment is based on a Kaggle competition. For this assignment you are asked to predict which of the Titanic's passengers survived the disaster. More information on the competition is available at the Kaggle competition site, Titanic: Machine Learning from Disaster (https://www.kaggle.com/c/titanic).

The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. This sensational tragedy shocked the international community and led to better safety regulations for ships.

One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew. Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others, such as women, children, and the upper-class.

In this challenge, we ask you to complete the analysis of what sorts of people were likely to survive. In particular, we ask you to apply the tools of machine learning to predict which passengers survived the tragedy. (Kaggle 2012)

Project Report Outline

Please use the project report outline provided below as a general guide to the specific sections and content that you should include in your project report.

1. Background

Introduce and discuss the background and purpose of your project. What information does the dataset provide? What question(s) are you trying to answer?

2. Exploratory analysis

Conduct exploratory analysis to discover which of the independent variables are most informative. You are required to explore and report on at least four variables. Three of the four must be Age, Sex and Class. You are free to explore and report on any other independent variables in the dataset. Your discussion should include at least one table or figure for each variable, illustrating the relationship between that variable and passenger survival.
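One possible starting point for these tables is sketched below in base R. The column names (Survived, Pclass, Sex, Age) follow the Kaggle training file, but the eight rows are made up for illustration, and the helper survival_rate and the age bands are assumptions of this sketch, not part of the assignment.

```r
# Made-up rows using the Kaggle column names; for the real analysis use
# something like: titanic <- read.csv("data/train.csv")
toy <- data.frame(
  Survived = c(0, 1, 1, 0, 0, 1, 1, 0),
  Pclass   = c(3, 1, 3, 1, 3, 3, 2, 3),
  Sex      = c("male", "female", "female", "female",
               "male", "male", "female", "male"),
  Age      = c(22, 38, 26, 35, 35, NA, 14, 2)
)

# Hypothetical helper: proportion of survivors within each level of a variable.
survival_rate <- function(df, by) {
  tapply(df$Survived, df[[by]], mean)
}

survival_rate(toy, "Sex")
survival_rate(toy, "Pclass")

# Age is continuous and contains missing values, so bin it before tabulating.
# The cut points below are arbitrary choices for this sketch.
toy$AgeGroup <- cut(toy$Age,
                    breaks = c(0, 12, 18, 60, Inf),
                    labels = c("child", "teen", "adult", "senior"))
table(AgeGroup = toy$AgeGroup, Survived = toy$Survived, useNA = "ifany")
```

Keeping missing ages visible in the table (useNA = "ifany") makes it easier to discuss how they are handled later in the report.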

3. Building and evaluating the model

a. Discuss your choice of model. Explain why you've chosen this specific model. What are its strengths? What are its limitations?

b. Evaluate your model. The discussion for the evaluation section should include answers to the following questions: How well does your model predict? Is it overfitting to the training set? Do you trust this model?

c. This section should include at least two tables or figures to summarize or illustrate your discussion.
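The evaluation questions above can be explored with a hold-out split, sketched here in base R. Logistic regression is used only as one example of a model you might choose. The data are simulated, with made-up survival odds, purely so the code runs on its own; the helpers predict_class and accuracy are hypothetical names.

```r
set.seed(42)  # make the simulated data and the split reproducible

# Simulated stand-in for the training file; replace with read.csv("data/train.csv").
n <- 400
sim <- data.frame(
  Sex    = factor(sample(c("female", "male"), n, replace = TRUE)),
  Pclass = sample(1:3, n, replace = TRUE),
  Age    = round(runif(n, 1, 70))
)
# Made-up log-odds of survival, chosen only to give the model something to fit.
log_odds <- 1.5 - 2.5 * (sim$Sex == "male") - 0.8 * (sim$Pclass - 1) - 0.01 * sim$Age
sim$Survived <- rbinom(n, 1, plogis(log_odds))

# Hold out 25% of the rows for validation.
held_out <- sample(n, size = n / 4)
train <- sim[-held_out, ]
valid <- sim[held_out, ]

fit <- glm(Survived ~ Sex + Pclass + Age, data = train, family = binomial)

# Classify a passenger as a survivor when the predicted probability exceeds 0.5.
predict_class <- function(model, newdata) {
  as.integer(predict(model, newdata = newdata, type = "response") > 0.5)
}
accuracy <- function(model, data) {
  mean(predict_class(model, data) == data$Survived)
}

# Table 1: confusion matrix on the held-out rows.
table(Predicted = predict_class(fit, valid), Actual = valid$Survived)

# Table 2: a training accuracy well above the validation accuracy
# would suggest overfitting.
c(train = accuracy(fit, train), validation = accuracy(fit, valid))
```

A single split gives a noisy estimate, so repeating it with different seeds, or using cross-validation, would give a firmer basis for deciding whether to trust the model.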

4. Predicting passenger survival

Finally, use the model you've built to predict the outcomes for the test set and compare these results to your training data. Optionally, I encourage you to submit your predictions to the Kaggle competition site and include your results in your report.
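A minimal sketch of this last step follows, again in base R. The rows standing in for the training and test files are made up, the model is a simple logistic regression chosen only for illustration, and the output path uses a temporary directory. The two-column layout (PassengerId, Survived) is the format the Kaggle competition expects for submissions.

```r
# Made-up rows standing in for data/train.csv and data/test.csv.
train <- data.frame(
  Survived = c(0, 1, 1, 0, 0, 1, 1, 0, 1, 0),
  Sex      = factor(c("male", "female", "female", "female", "male",
                      "male", "female", "male", "female", "male")),
  Pclass   = c(3, 1, 3, 1, 3, 3, 2, 3, 2, 1)
)
test <- data.frame(
  PassengerId = 892:895,  # illustrative IDs
  Sex         = factor(c("male", "female", "male", "female"),
                       levels = levels(train$Sex)),
  Pclass      = c(3, 3, 2, 1)
)

# Example model only; substitute the model you built and evaluated earlier.
fit <- glm(Survived ~ Sex + Pclass, data = train, family = binomial)

# Predicted survival probability for each test passenger, cut at 0.5.
prob <- predict(fit, newdata = test, type = "response")
submission <- data.frame(
  PassengerId = test$PassengerId,
  Survived    = as.integer(prob > 0.5)
)

# Compare the predicted survival rate on the test set with the observed
# rate in the training data.
c(train_observed = mean(train$Survived),
  test_predicted = mean(submission$Survived))

# Write the submission file; row.names = FALSE keeps it to two columns.
out_file <- file.path(tempdir(), "submission.csv")
write.csv(submission, out_file, row.names = FALSE)
```

If the predicted survival rate on the test set differs sharply from the training rate, that is worth discussing in the report, since both files are drawn from the same passenger list.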

5. Conclusions

Discuss the conclusions you've drawn based on your analysis.

List of online resources

• Titanic: Machine Learning from Disaster

Kaggle competition site.

https://www.kaggle.com/c/titanic

• R Markdown - Dynamic Documents for R

https://rmarkdown.rstudio.com/

• Getting Started with R: Kaggle's Titanic Competition

List of four excellent tutorials for using R to compete in the Titanic competition.

https://www.kaggle.com/c/titanic/details/new-getting-started-with-r

• Kaggle and DataCamp R Tutorial on Machine Learning

Interactive tutorial by Kaggle and DataCamp which provides coding exercises to help you predict the passenger survival rates for Kaggle's Titanic competition.

https://www.datacamp.com/courses/kaggle-tutorial-on-machine-learing-the-sinking-of-thetitanic

References

Titanic: Machine Learning from Disaster 2012, Kaggle, viewed 8 Oct 2015, https://www.kaggle.com/c/titanic.

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd