Python data analysis project

Introduction

Use five-fold cross-validation: split the data into five folds and, in each round, train on four folds and test on the remaining one.
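A minimal sketch of the five-fold split, using only the Python standard library. The function name five_fold_indices and the seed value are assumptions made for illustration; they are not part of the assignment. Each row index lands in the test set of exactly one fold and in the training set of the other four.

```python
import random

def five_fold_indices(n_rows, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each row is tested exactly once.

    Hypothetical helper for illustration, not part of the assignment.
    """
    indices = list(range(n_rows))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Striding through the shuffled list deals rows into near-equal folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test_idx = sorted(folds[k])
        train_idx = sorted(
            i for j, fold in enumerate(folds) if j != k for i in fold
        )
        yield train_idx, test_idx

# Example with 10 rows: every round trains on 8 rows and tests on 2.
splits = list(five_fold_indices(n_rows=10))
```

In a real run, n_rows would be the number of passengers in the data file, and the index lists would select the rows used to fit and to evaluate the model in each round.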

 


Data Dictionary description

 

Variable   Definition               Key
survival   Survival                 0 = No, 1 = Yes
pclass     Ticket class             1 = 1st, 2 = 2nd, 3 = 3rd
sex        Sex
age        Age in years
sibsp      # of siblings/spouses
parch      # of parents/children
fare       Passenger fare
embarked   Port of embarkation      C = Cherbourg, Q = Queenstown, S = Southampton

Variable Notes

pclass: A proxy for socio-economic status (SES).
  1st = Upper
  2nd = Middle
  3rd = Lower
age: Age is fractional if less than 1.
sibsp: The dataset defines family relations as follows:
  Sibling = brother, sister, stepbrother, stepsister
  Spouse = husband, wife (mistresses and fiancés were ignored)
parch: The dataset defines family relations as follows:
  Parent = mother, father
  Child = daughter, son, stepdaughter, stepson
  Some children travelled only with a nanny, therefore parch = 0 for them.
 

Analysis requirements:

1. How is the chance of survival related to gender?
2. How is the chance of survival related to age?
3. How is the chance of survival related to socio-economic status?
5. What are the three most important factors related to the chance of survival?
6. What is the average prediction accuracy across the five folds?
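Questions 1 to 3 each compare survival rates across groups, and question 6 averages the accuracy measured on each test fold. The sketch below shows one way to compute both with the standard library. The rows are made up for illustration and are not taken from the dataset; the helper names survival_rate_by and mean_accuracy are likewise assumptions. For age, which is continuous, the same grouping could be applied after binning ages into ranges.

```python
from collections import defaultdict

# Made-up rows for illustration only; NOT taken from the real dataset.
rows = [
    {"sex": "female", "pclass": 1, "survival": 1},
    {"sex": "female", "pclass": 3, "survival": 0},
    {"sex": "male", "pclass": 1, "survival": 1},
    {"sex": "male", "pclass": 3, "survival": 0},
    {"sex": "male", "pclass": 3, "survival": 0},
    {"sex": "female", "pclass": 2, "survival": 1},
]

def survival_rate_by(rows, column):
    """Return {group value: fraction of that group with survival == 1}."""
    totals = defaultdict(int)
    survived = defaultdict(int)
    for row in rows:
        totals[row[column]] += 1
        survived[row[column]] += row["survival"]
    return {key: survived[key] / totals[key] for key in totals}

def mean_accuracy(fold_accuracies):
    """Average the per-fold test accuracies from cross-validation."""
    return sum(fold_accuracies) / len(fold_accuracies)

rate_by_sex = survival_rate_by(rows, "sex")
rate_by_class = survival_rate_by(rows, "pclass")
```

Comparing the rates in rate_by_sex or rate_by_class shows how survival differs between groups; mean_accuracy takes the five per-fold accuracies once a model has been evaluated on each test fold.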