Python data analysis project | Computer Science homework help
Introduction
Use five-fold mode and separate the data into training and testing sets.
Save your time - order a paper!
Get your paper written from scratch within the tight deadline. Our service is a reliable solution to all your troubles. Place an order on any task and we will take care of it. You won’t have to worry about the quality and deadlines
Order Paper Now
Data Dictionary description
Variable Definition Key
survival Survival 0 = No, 1 = Yes
pclass Ticket class 1 = 1st, 2 = 2nd, 3 = 3rd
sex Sex
Age Age in years
sibsp # of siblings/spouses
parch # of parents/children
fare Passenger fare
embarked Port of Embarkation C = Cherbourg, Q = Queenstown, S = Southampton
Variable Notes
pclass: A proxy for socio-economic status (SES)
1st = Upper
2nd = Middle
3rd = Lower
age: Age is fractional if less than 1.
sibsp: The dataset defines family relations in this way...
Sibling = brother, sister, stepbrother, stepsister
Spouse = husband, wife (mistresses and fiancés were ignored)
parch: The dataset defines family relations in this way...
Parent = mother, father
Child = daughter, son, stepdaughter, stepson
Some children travelled only with a nanny, therefore parch=0 for them.
Analysis requirement:
1. How is the survival chance related to the gender?
2. How is the survival chance related to age?
3. How is the survival chance related to socio-economic status?
5. What are the first three most important factors related to the survival chance?
6. What is average prediction accuracy?