Happiness Classification
Happiness Classification Data Science Project
Introduction to Supervised Learning with scikit-learn

We will use the Happiness Classification Dataset to practice the basic steps of building a classification machine learning model with scikit-learn. The dataset comes from a survey in which people rated different metrics of their city on a five-point scale and answered whether they are happy or unhappy.

Project Activities

All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.

All our activities include solutions with explanations on how they work and why we chose them.


Separate the target and the features into two variables.

Store the features in X and the target in y. Both variables should be DataFrames and keep their column headers.

There are several ways to solve this exercise, but the results must be DataFrames to be considered correct.
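
As a rough illustration of this step, the sketch below separates a small made-up table into X and y. The column names (including the target name happy) and the values are assumptions for the example, not the real dataset's schema.

```python
import pandas as pd

# Toy stand-in for the survey data; column names and values are hypothetical.
df = pd.DataFrame({
    "housecost":     [3, 4, 2, 5, 1, 4],
    "schoolquality": [3, 5, 2, 4, 1, 3],
    "policetrust":   [4, 4, 1, 5, 2, 3],
    "happy":         [1, 1, 0, 1, 0, 0],
})

# drop(columns=...) returns a DataFrame with every column except the target.
X = df.drop(columns=["happy"])

# Double brackets select a one-column DataFrame; df["happy"] would be a Series.
y = df[["happy"]]
```

Selecting with double brackets is one way to keep y as a DataFrame with its header; other approaches that produce the same types should work equally well.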


Use train_test_split to split the data into training and testing sets: 80% for training and 20% for testing, with random_state=0.

Setting random_state to a fixed integer makes the split reproducible.

Store the results in the variables X_train, X_test, y_train and y_test.
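
A minimal sketch of the split, again on a made-up ten-row table with hypothetical column names. With ten rows, test_size=0.2 leaves eight rows for training and two for testing.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for the survey data; column names and values are hypothetical.
df = pd.DataFrame({
    "housecost":     [3, 4, 2, 5, 1, 4, 2, 5, 1, 3],
    "schoolquality": [3, 5, 2, 4, 1, 3, 2, 5, 1, 4],
    "happy":         [1, 1, 0, 1, 0, 1, 0, 1, 0, 0],
})
X = df.drop(columns=["happy"])
y = df[["happy"]]

# test_size=0.2 holds out 20% of the rows; random_state=0 fixes the shuffle.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
```

Because the inputs are DataFrames, the four outputs are DataFrames too, and they keep the original row index.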


Logistic Regression

Create an instance of LogisticRegression and store the model in lr.
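
For reference, creating the estimator could look like the following. It uses scikit-learn's default settings; whether the activity expects any non-default parameters is not stated, so treat that as an assumption.

```python
from sklearn.linear_model import LogisticRegression

# Instantiate the classifier with default hyperparameters.
# Nothing is learned yet; fitting happens in a separate step.
lr = LogisticRegression()
```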


Train a LogisticRegression classifier

It's time to train the LogisticRegression model on the training dataset.
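
The training step might be sketched as below, using a made-up table with hypothetical column names. Since y_train is a one-column DataFrame, the example flattens it to a 1-D array before fitting. That is a choice made here to avoid scikit-learn's column-vector warning, not something the activity requires.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy stand-in for the survey data; column names and values are hypothetical.
df = pd.DataFrame({
    "housecost":     [3, 4, 2, 5, 1, 4, 2, 5, 1, 3],
    "schoolquality": [3, 5, 2, 4, 1, 3, 2, 5, 1, 4],
    "happy":         [1, 1, 0, 1, 0, 1, 0, 1, 0, 0],
})
X = df.drop(columns=["happy"])
y = df[["happy"]]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

lr = LogisticRegression()

# fit() learns one coefficient per feature from the training rows only.
# ravel() turns the (n, 1) target into the 1-D shape the estimator expects.
lr.fit(X_train, y_train.values.ravel())
```

After fitting, the learned weights are available in lr.coef_ and the class labels in lr.classes_.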


Make predictions on the test set

Use the trained model to make predictions on the test data. Store the predictions in y_pred.
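
Putting the previous steps together, a prediction run could look like this sketch. The data and column names are made up for illustration, and flattening y_train with ravel() is an assumption of this example.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy stand-in for the survey data; column names and values are hypothetical.
df = pd.DataFrame({
    "housecost":     [3, 4, 2, 5, 1, 4, 2, 5, 1, 3],
    "schoolquality": [3, 5, 2, 4, 1, 3, 2, 5, 1, 4],
    "happy":         [1, 1, 0, 1, 0, 1, 0, 1, 0, 0],
})
X = df.drop(columns=["happy"])
y = df[["happy"]]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

lr = LogisticRegression()
lr.fit(X_train, y_train.values.ravel())

# predict() returns one class label per row of X_test as a NumPy array.
y_pred = lr.predict(X_test)
```

y_pred has the same number of entries as X_test has rows, so it can be compared directly against y_test to evaluate the model.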

Verónica Barraza

This project is part of

Introduction to Supervised Learning with scikit-learn