Reference no: EM132627956
Titanic Disaster Survivability Prediction using Machine Learning
Project Location: https://www.kaggle.com/c/titanic
Please also upload the CSV and .ipynb files here.
For each step, explain what the code does, what it achieves, and why you chose particular options (preprocessing steps, model selection, data processing, etc.), then include the code for that step. If a step is not necessary in your case, state why you omitted it.
STEPS:
1. Describe the data used and how you got the data.
2. Initial exploration of the data (print the data, visualize the data). Find the number of attributes and the number of records in the data. Also check whether any attribute has missing data.
3. Find the correlation between the data attributes and the target variable (whether the passenger survived or not) to explore which attributes have the most effect on the output (target variable).
4. Compute any new attributes/features from existing attributes [if necessary]. Explain how these new features contribute to the prediction of the target variable.
5. Select your classification model (try at least three models to find the best one).
6. Perform cross-validation on the training data in all cases and run the test data to decide which model is best.
7. Try techniques such as Grid Search to further fine-tune your model. Examine whether fine-tuning improves your prediction performance.
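
As a starting point, steps 2 to 7 could be sketched roughly as below. This is a minimal outline under stated assumptions, not a required solution: it assumes pandas and scikit-learn are available and that the training CSV uses the column names Survived, Pclass, Sex, Age, SibSp, Parch and Fare (check these against the file you download). The helper names, the FamilySize feature, the three models and the parameter grid are illustrative choices only.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumed feature columns after preprocessing (illustrative, not prescribed).
FEATURES = ["Pclass", "Sex", "Age", "Fare", "FamilySize"]


def explore(df):
    """Steps 2-3: record/attribute counts, missing values, correlation with target."""
    n_records, n_attributes = df.shape
    missing = df.isnull().sum()
    # Correlation is only defined for numeric columns.
    corr = df.select_dtypes("number").corr()["Survived"].drop("Survived")
    return n_records, n_attributes, missing[missing > 0], corr


def add_features(df):
    """Step 4: derive a hypothetical FamilySize feature and make columns numeric."""
    out = df.copy()
    out["FamilySize"] = out["SibSp"] + out["Parch"] + 1  # passenger plus relatives
    out["Sex"] = out["Sex"].map({"male": 0, "female": 1})
    out["Age"] = out["Age"].fillna(out["Age"].median())
    out["Fare"] = out["Fare"].fillna(out["Fare"].median())
    return out


def compare_models(X, y, cv=5):
    """Steps 5-6: mean cross-validated accuracy for three candidate classifiers."""
    models = {
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "knn": make_pipeline(StandardScaler(), KNeighborsClassifier()),
        "random_forest": RandomForestClassifier(random_state=0),
    }
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in models.items()
    }


def tune_random_forest(X, y, cv=5):
    """Step 7: small grid search over two random-forest hyperparameters."""
    grid = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}
    search = GridSearchCV(
        RandomForestClassifier(random_state=0), grid, cv=cv, scoring="accuracy"
    )
    search.fit(X, y)
    return search.best_params_, search.best_score_
```

In a notebook, these helpers would typically be called on a frame loaded with pd.read_csv from the downloaded training file, for example compare_models(data[FEATURES], data["Survived"]) after data = add_features(raw). Comparing the best_score_ from tune_random_forest with the untuned random-forest score from compare_models is one way to examine whether fine-tuning helped. If the test file has no Survived column, a held-out part of the training data can stand in for the final comparison.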