Blood transfusion service
Data Science Project
Classification in Depth with Scikit-Learn

In this project, you will practice diagnosing overfitting and underfitting in a classification model and explore techniques to mitigate both, strengthening your understanding of model generalization.

Project Activities

All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.

All our activities include solutions with explanations on how they work and why we chose them.

codevalidated

Use train_test_split to split the data into training and testing sets. Hold out 25% of the dataset for testing and set random_state=42.

Store the results in the variables X_train, X_test, y_train, and y_test.
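
As a rough illustration of this split, the sketch below uses a small synthetic table in place of the project's dataset. Only the Monetary (c.c. blood) column name comes from the task text; the other column names, the target column name, and all values are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the project's dataset (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Recency (months)": rng.integers(0, 40, size=n),
    "Frequency (times)": rng.integers(1, 30, size=n),
    "Monetary (c.c. blood)": rng.integers(1, 30, size=n) * 250,
    "target": rng.integers(0, 2, size=n),
})

X = df.drop(columns="target")  # feature matrix
y = df["target"]               # labels

# test_size=0.25 holds out 25% of the rows; random_state=42 makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)
print(X_train.shape, X_test.shape)
```

Note that train_test_split returns the pieces in the order X_train, X_test, y_train, y_test, so the unpacking order matters.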

codevalidated

Log normalization

For this task, apply a log transformation to the Monetary (c.c. blood) feature. Store the transformed feature in a new variable called monetary_log and keep it, together with the other features, in X_train_normed and X_test_normed.
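
One possible way to carry out this step is sketched below. The helper log_normalize, the sample rows, and the Frequency (times) column are hypothetical, and dropping the raw Monetary column after the transformation is an assumption rather than something the task states.

```python
import numpy as np
import pandas as pd

MONETARY = "Monetary (c.c. blood)"

# Hypothetical rows standing in for the real training and testing features.
X_train = pd.DataFrame({"Frequency (times)": [2, 10, 5], MONETARY: [500, 2500, 1250]})
X_test = pd.DataFrame({"Frequency (times)": [4, 1], MONETARY: [1000, 250]})

def log_normalize(X, column=MONETARY):
    """Return a copy of X with `column` replaced by its natural log."""
    X_normed = X.copy()
    X_normed["monetary_log"] = np.log(X_normed[column])
    # Assumption: the raw column is dropped so only the transformed version remains.
    return X_normed.drop(columns=column)

# Apply the same transformation to both splits.
X_train_normed = log_normalize(X_train)
X_test_normed = log_normalize(X_test)
print(X_train_normed.columns.tolist())
```

The natural log is undefined at zero, so if the real column could contain zeros, np.log1p would be a safer variant.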

multiplechoice

Impact of Variable Transformation on Decision Tree

On the previous page, we applied a logarithmic transformation to the numerical features to improve their distribution.

Based on this scenario, select the correct statement:

multiplechoice

Model Performance Evaluation

Evaluate the model's performance on the training and testing sets. Based on this performance, select the correct statement.
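
A minimal sketch of such a comparison follows, assuming a decision tree with default settings and accuracy as the metric. The data is synthetic (generated with make_classification), so the numbers it prints say nothing about the project's dataset.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, noisy binary classification data (hypothetical stand-in).
X, y = make_classification(
    n_samples=400, n_features=4, n_informative=3, n_redundant=1,
    flip_y=0.2, random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# A tree with no depth limit can keep splitting until it memorizes the training set.
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
```

A training score far above the testing score points toward overfitting, while low scores on both sets point toward underfitting.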

multiplechoice

Model Fit Evaluation

Evaluate the model's fit based on its performance and select the correct term that corresponds to the given scenario.

multiplechoice

Validation Curve Evaluation

To assess model performance and find the optimal hyperparameter value, we can plot a validation curve. Based on this concept, select the correct statement:
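
The idea can be sketched with scikit-learn's validation_curve, here sweeping max_depth for a decision tree. The synthetic data, the depth range of 1 to 10, and the 5-fold cross-validation are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier

# Synthetic data standing in for the project's features and labels.
X, y = make_classification(
    n_samples=300, n_features=4, n_informative=3, n_redundant=1,
    random_state=42,
)

depths = np.arange(1, 11)  # candidate max_depth values: 1..10

# One row per depth, one column per cross-validation fold.
train_scores, val_scores = validation_curve(
    DecisionTreeClassifier(random_state=42), X, y,
    param_name="max_depth", param_range=depths,
    cv=5, scoring="accuracy",
)

mean_train = train_scores.mean(axis=1)
mean_val = val_scores.mean(axis=1)

# Pick the depth with the highest mean validation score.
best_depth = depths[np.argmax(mean_val)]
for d, tr, va in zip(depths, mean_train, mean_val):
    print(f"max_depth={d:2d}  train={tr:.3f}  validation={va:.3f}")
print("best max_depth:", best_depth)
```

Plotting mean_train and mean_val against depths gives the usual validation curve: the region where the training score keeps rising while the validation score flattens or drops is where the model starts to overfit.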

multiplechoice

Validation Curve

Based on the following figure, identify the best max_depth hyperparameter to train a decision tree.

[Figure: validation curve]

codevalidated

Compute precision, recall and f1-score using the test dataset

Store the precision, recall, and f1-score of the positive class in the variables precision, recall, and f_1_score.
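
A small sketch of these metrics is shown below. The label lists are hypothetical; in the project, y_pred would come from the trained model's predictions on X_test. The positive class is assumed to be labeled 1.

```python
from sklearn.metrics import f1_score, precision_score, recall_score

# Hypothetical true labels and predictions for eight test samples.
y_test = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 0, 1, 0, 1, 0]

# pos_label=1 reports each metric for the positive class only.
precision = precision_score(y_test, y_pred, pos_label=1)
recall = recall_score(y_test, y_pred, pos_label=1)
f_1_score = f1_score(y_test, y_pred, pos_label=1)

print(precision, recall, f_1_score)  # 0.75 0.75 0.75
```

Here there are 3 true positives, 1 false positive, and 1 false negative, so precision and recall are both 3/4, and their harmonic mean (the f1-score) is also 0.75.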

Author

Verónica Barraza

This project is part of

Classification in Depth with Scikit-Learn