Classifying Legendary Pokémon using Naive Bayes and Decision Trees

How many columns of our data are numeric columns?
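
One way to count numeric columns is `select_dtypes` in pandas. The sketch below uses a hypothetical toy frame; apart from "Type 1", the column names and all values are assumptions and not the real dataset.

```python
import pandas as pd

# Hypothetical toy rows; the real dataset has more columns and rows.
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Type 1": ["Fire", "Water", "Psychic"],
    "Attack": [50, 60, 110],
    "Defense": [40, 70, 90],
    "Legendary": [False, False, True],
})

# include="number" keeps integer and float columns only;
# string and boolean columns are left out.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
print(len(numeric_cols), numeric_cols)
```

Note that a boolean flag column is not counted as numeric by this selector, which may or may not match how the activity counts it.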

How many Pokémon are legendary?

Filter the legendary Pokémon from the original dataset and store them in a new variable called legendaries_df.
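
A minimal sketch of the filtering step, assuming the dataset has a boolean column named "Legendary" (the column name and the toy rows are assumptions):

```python
import pandas as pd

# Hypothetical toy data standing in for the original dataset.
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Legendary": [False, True, False, True],
})

# Boolean indexing keeps only the rows where the flag is True;
# .copy() makes legendaries_df independent of df.
legendaries_df = df[df["Legendary"]].copy()
print(len(legendaries_df))
```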

Pokémon stats and legendary status

A quick look at the data reveals clear differences in stats between legendary and non-legendary Pokémon.

What is the average Attack value of Legendary Pokémon?
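
The average can be computed by selecting the legendary rows and taking the column mean. This is a sketch on made-up numbers; the "Attack" and "Legendary" column names are assumptions.

```python
import pandas as pd

# Hypothetical values, not taken from the real dataset.
df = pd.DataFrame({
    "Attack": [50, 70, 110, 130],
    "Legendary": [False, False, True, True],
})

# Select the Attack values of legendary rows, then average them.
legendary_attack = df.loc[df["Legendary"], "Attack"].mean()
print(legendary_attack)
```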

Legendary Pokémon strength

Legendary Pokémon are expected to be sturdier (higher Defense) than normal Pokémon. Can you validate that?

On average, what percentage more Defense points do Legendary Pokémon have compared to normal Pokémon?
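
One reading of this question is the relative difference between the two group means, (legendary mean - normal mean) / normal mean × 100. The sketch below follows that interpretation on made-up values; the column names are assumptions.

```python
import pandas as pd

# Hypothetical values chosen so the arithmetic is easy to follow.
df = pd.DataFrame({
    "Defense": [40, 60, 90, 110],
    "Legendary": [False, False, True, True],
})

legendary_def = df.loc[df["Legendary"], "Defense"].mean()   # 100.0
normal_def = df.loc[~df["Legendary"], "Defense"].mean()     # 50.0

# Relative increase of the legendary mean over the normal mean, in percent.
extra_pct = (legendary_def - normal_def) / normal_def * 100
print(extra_pct)
```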

Which types are most common among Legendary Pokémon?
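
Type frequencies can be tallied with `value_counts`. The frame below is a hypothetical stand-in for `legendaries_df`, with invented rows.

```python
import pandas as pd

# Hypothetical stand-in for legendaries_df.
legendaries_df = pd.DataFrame({
    "Type 1": ["Psychic", "Dragon", "Psychic", "Fire"],
    "Type 2": [None, "Flying", "Flying", "Flying"],
})

# value_counts sorts by frequency (descending) and skips missing values by default.
type1_counts = legendaries_df["Type 1"].value_counts()
type2_counts = legendaries_df["Type 2"].value_counts()
print(type1_counts.head(3))
print(type2_counts.head(3))
```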

Did you find any missing values?
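
A quick way to check is to count missing entries per column. The rows below are invented for illustration.

```python
import pandas as pd

# Hypothetical rows; None marks a missing second type.
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Type 2": ["Flying", None, None],
    "Attack": [50, 60, 70],
})

# isna() flags missing cells; sum() counts them column by column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```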

Did you find any duplicated values?
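
Fully duplicated rows can be counted with `duplicated`, as in this sketch on invented rows:

```python
import pandas as pd

# Hypothetical rows; the third row repeats the first one exactly.
df = pd.DataFrame({
    "Name": ["A", "B", "A"],
    "Attack": [50, 60, 50],
})

# duplicated() marks every repeat of an earlier row as True.
n_duplicate_rows = int(df.duplicated().sum())
deduplicated = df.drop_duplicates()
print(n_duplicate_rows, len(deduplicated))
```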

Which categorical columns would you remove if you want to reduce noise and improve your model accuracy?

Which numerical columns would you remove if you want to reduce noise and improve your model accuracy?

What is the recommended split ratio for train/test data when using scikit-learn?
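
In scikit-learn the ratio is a parameter of `train_test_split`; when neither `test_size` nor `train_size` is given, the function holds out 25% of the samples. The sketch below sets the ratio explicitly on synthetic data; the 0.2 value is only an illustration, not a recommendation from the source.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic features and an imbalanced binary target (16 vs 4 samples).
X = np.arange(40).reshape(20, 2)
y = np.array([0] * 16 + [1] * 4)

# test_size sets the held-out fraction; stratify=y keeps the class
# proportions similar in both parts; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
print(X_train.shape, X_test.shape)
```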

Encoding labels

What kind of values do we expect to get after encoding labels with LabelEncoder?
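
To see what `LabelEncoder` produces, it can be run on a small list. The label values below are invented for illustration.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical string labels.
labels = ["Fire", "Water", "Grass", "Water"]

encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)

# classes_ holds the distinct labels in sorted order; each label is
# replaced by its position in that array.
print(list(encoder.classes_))
print(encoded)
```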

Categorical variables encoding done wrong?

A common mistake is to simply assign a numerical value to each category, which implies an order the categories may not have. Based on previous discussions, is it a correct decision to use LabelEncoder to transform the "Type 1" and "Type 2" variables?
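
For comparison, one-hot encoding gives each category its own indicator column, so no order between categories is implied. This is a sketch of that alternative with `pd.get_dummies` on invented rows; it is not presented as the approach the project itself takes.

```python
import pandas as pd

# Hypothetical rows with one nominal column.
df = pd.DataFrame({
    "Type 1": ["Fire", "Water", "Grass"],
    "Attack": [50, 60, 70],
})

# Each distinct value of "Type 1" becomes a separate indicator column.
one_hot = pd.get_dummies(df, columns=["Type 1"])
print(one_hot.columns.tolist())
```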

Which sentence is not a benefit of data preprocessing?

What is the difference between normalization and standardization?
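
Under the common convention that normalization means min-max rescaling and standardization means centering to zero mean and unit variance, the two can be compared side by side. The sketch assumes that convention and uses made-up values.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# One hypothetical feature column.
values = np.array([[10.0], [20.0], [30.0], [40.0]])

# Min-max scaling maps the smallest value to 0 and the largest to 1.
normalized = MinMaxScaler().fit_transform(values)

# Standard scaling subtracts the mean and divides by the standard deviation.
standardized = StandardScaler().fit_transform(values)

print(normalized.ravel())
print(standardized.ravel())
```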

Evaluating model score

Now that you have predicted, on the test data, whether each Pokémon is legendary or not, use the accuracy_score function to check your model's score.
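
A minimal sketch of this step on synthetic data, assuming the first model is a Gaussian Naive Bayes classifier (the source names Naive Bayes but not the variant, so `GaussianNB` and all values here are assumptions):

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic two-feature data: 80 "normal" and 20 "legendary" samples.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(60, 10, size=(80, 2)),
    rng.normal(110, 10, size=(20, 2)),
])
y = np.array([0] * 80 + [1] * 20)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = GaussianNB().fit(X_train, y_train)
y_pred = model.predict(X_test)

# accuracy_score returns the fraction of test predictions that match the labels.
score = accuracy_score(y_test, y_pred)
print(score)
```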

The score you got is:

What does a confusion matrix measure?
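
For a binary problem, scikit-learn's `confusion_matrix` puts true classes in rows and predicted classes in columns. The labels below are invented to show how the four cells are read.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical true labels and predictions (1 = legendary).
y_true = [0, 0, 0, 1, 1, 0]
y_pred = [0, 0, 1, 1, 0, 0]

cm = confusion_matrix(y_true, y_pred)

# For labels [0, 1], ravel() yields the cells in the order
# true negatives, false positives, false negatives, true positives.
tn, fp, fn, tp = cm.ravel()
print(cm)
print(tn, fp, fn, tp)
```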

Evaluating model score

With your second model ready, use the accuracy_score function to check its score on the test data.

The score you got is:

Which are the most important features in your model?
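
If the model in question is a decision tree (an assumption based on the project title), scikit-learn exposes `feature_importances_` on the fitted estimator. The sketch below uses synthetic data in which the label depends on one feature only, so the ranking is easy to verify; the feature names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Synthetic features; the label is built from "Attack" alone.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "Attack": rng.normal(80, 20, n),
    "Speed": rng.normal(70, 20, n),
})
y = (X["Attack"] > 100).astype(int)

tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Importances are non-negative and sum to 1; pair them with the column
# names and sort so the most influential feature comes first.
importances = pd.Series(
    tree.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances)
```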

Matias Caputti

Project Activities

How many columns of our data are numeric columns?

How many Pokémon are legendary?

Pokémon stats and legendary status

Legendary Pokémon strength

Which types are most common among Legendary Pokémon?

Did you find any missing values?

Did you find any duplicated values?

Which categorical columns would you remove if you want to reduce noise and improve your model accuracy?

Which numerical columns would you remove if you want to reduce noise and improve your model accuracy?

What is the recommended split ratio for train/test data when using scikit-learn?

Encoding labels

Categorical variables encoding done wrong?

Which sentence is not a benefit of data preprocessing?

What is the difference between normalization and standardization?

Evaluating model score

What does a confusion matrix measure?

Evaluating model score

Which are the most important features in your model?

Which is a valid reason for our model to get wrong predictions?
