Performance Metric - A Simple Practice
Introduction to Supervised Learning with scikit-learn


During this lab, you will practice the full workflow, from training a model to evaluating its performance, using a real dataset.

Project Activities



Separate the target and the features into two variables.

We will not be working with the 'Unnamed: 32' and 'id' columns.

Store the features in X and the target in y.
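The step above can be sketched as follows. The lab's dataset is not shown here, so this example builds a tiny stand-in DataFrame with made-up values. The target column name "diagnosis" and the two feature columns are assumptions for illustration only; replace them with the names in your data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the lab's dataset (toy values, not real data).
# "diagnosis" as the target column name is an assumption.
df = pd.DataFrame({
    "id": [101, 102, 103, 104],
    "diagnosis": ["M", "B", "B", "M"],
    "radius_mean": [17.9, 11.4, 12.1, 20.6],
    "texture_mean": [10.4, 17.8, 15.2, 21.3],
    "Unnamed: 32": [np.nan] * 4,
})

# Features: every column except the target and the two unused columns.
X = df.drop(columns=["id", "Unnamed: 32", "diagnosis"])

# Target: the label column on its own.
y = df["diagnosis"]

print(list(X.columns))
```

Dropping the unused columns and the target in a single call keeps X limited to the actual predictors.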


Use train_test_split to split the data into training and testing sets: 80% for training and 20% for testing, with random_state=0.

Setting random_state to a fixed integer makes the split reproducible.

Store the results in the variables X_train, X_test, y_train and y_test.
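A minimal sketch of the split, assuming X and y already exist. Here they are generated with make_classification as a synthetic stand-in for the lab's data; only the train_test_split call reflects the activity itself.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the lab's X and y (an assumption for illustration).
X, y = make_classification(n_samples=100, n_features=5, random_state=0)

# test_size=0.2 keeps 20% of the rows for testing and 80% for training.
# random_state=0 makes the shuffle, and therefore the split, reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)  # (80, 5) (20, 5)
```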



Create an instance of KNeighborsClassifier and store the model in knn. Use the default arguments.
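As a sketch, and assuming the instruction means the classifier's default arguments, the instance can be created without passing anything to the constructor:

```python
from sklearn.neighbors import KNeighborsClassifier

# No arguments: scikit-learn's defaults are used (n_neighbors=5 among them).
knn = KNeighborsClassifier()

print(knn.n_neighbors)  # 5
```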


Train a KNeighborsClassifier

It's time to train the KNeighborsClassifier using the training dataset.
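Training could look like the sketch below. The data is again a synthetic stand-in produced by make_classification (an assumption, not the lab's dataset); the relevant line is the call to fit on the training portion only.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data and the same 80/20 split as in the earlier step.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

knn = KNeighborsClassifier()

# fit stores the training samples; the test set is not seen at this stage.
knn.fit(X_train, y_train)

print(knn.n_samples_fit_)  # 80
```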


Make predictions on the test set

Use the trained model to make predictions on the test data. Store the predictions in y_pred.
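A short sketch of the prediction step, using a synthetic stand-in dataset (an assumption) so the example runs on its own:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data, split 80/20 with a fixed seed.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

knn = KNeighborsClassifier()
knn.fit(X_train, y_train)

# predict returns one label per row of X_test.
y_pred = knn.predict(X_test)

print(y_pred.shape)  # (20,)
```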


Evaluation metric

Calculate the F1 score on the testing set. Run the code in a Jupyter Notebook.

Store the result in the variable f1_score_test.
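The evaluation can be sketched with scikit-learn's f1_score, which compares the true test labels with the predictions. The data below is a synthetic stand-in with 0/1 labels (an assumption). If your target holds string labels instead, f1_score needs the pos_label argument set to the label you treat as the positive class.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data, split 80/20 with a fixed seed.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

knn = KNeighborsClassifier()
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

# True labels first, predictions second. With 0/1 labels the default
# positive class (pos_label=1) applies.
f1_score_test = f1_score(y_test, y_pred)

print(f1_score_test)
```

The F1 score is the harmonic mean of precision and recall, so it always falls between 0 and 1.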


Verónica Barraza

This project is part of Introduction to Supervised Learning with scikit-learn.
