Discretization and Binning with Fitness data

Use data to uncover hidden exercise patterns and fitness secrets in the 'Exercise and Fitness Metrics' Lab! Explore the world of Pandas as you discretize, bin, and create dummies to analyze the data. Get ready to sharpen your analytical skills and help create a healthier future!

Project Activities

All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.

All our activities include solutions with explanations on how they work and why we chose them.

Classify age into categories

To analyze exercise habits across different age groups, we need to categorize the Age column into meaningful groups. Create three age categories:

  • Young: 0-30
  • Adult: 30-60
  • Senior: above 60

Store the result in a new column called AgeGroup. The output should resemble the provided example series.

activity1-answer
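A minimal sketch of how the age grouping could be done with `pd.cut`. The DataFrame name `df` and the sample ages are assumptions made for illustration; the lab's dataset is not shown here. With the default right-closed intervals, an age of exactly 30 lands in Young and exactly 60 in Adult.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the lab's dataset.
df = pd.DataFrame({"Age": [18, 30, 31, 45, 60, 61, 75]})

# Edges 0, 30, 60, inf produce the intervals (0, 30], (30, 60], (60, inf].
age_bins = [0, 30, 60, np.inf]
age_labels = ["Young", "Adult", "Senior"]

# pd.cut assigns each age to the interval that contains it.
df["AgeGroup"] = pd.cut(df["Age"], bins=age_bins, labels=age_labels)
print(df)
```

The same pattern would likely carry over to the weight categories in the next activity, with the edges 0, 60, 80, and inf.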

Categorize Actual Weight into Weight Categories

To analyze exercise habits across different weight ranges, we need to categorize the Actual Weight into meaningful groups. Create three weight categories:

  • Lightweight: 0-60
  • Normal Weight: 60-80
  • Overweight: above 80

Store the result in a new column called WeightCategory. The output should resemble the provided example series.

activity2-answer

Categorize Exercise Durations into Meaningful Groups

In this activity, we want to categorize exercise durations into three meaningful groups: Sedentary, Moderate, and Active. The bin edges 0, 25, 50, and inf define the duration ranges, which lets us analyze exercise habits across them.

Here are the steps to complete the task:

  1. Create a new column named ExerciseLevel.
  2. Categorize each exercise duration according to the following ranges:

    • Sedentary: duration less than 25 minutes.
    • Moderate: duration between 25 and 50 minutes (inclusive).
    • Active: duration greater than 50 minutes.
  3. Store the categorization result in the ExerciseLevel column.

The output should resemble the provided example series.

activity3-answer
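The steps above can be sketched with `pd.cut` and the edges 0, 25, 50, and inf. The column name `Duration`, the DataFrame name `df`, and the sample values are assumptions. Note that `pd.cut` closes every interval on the same side, so it cannot make both 25 and 50 inclusive for Moderate: this sketch uses `right=False`, which puts 25 in Moderate but 50 in Active. The lab's expected output may use a different closure.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "Duration" is an assumed column name.
df = pd.DataFrame({"Duration": [0, 10, 24, 25, 40, 50, 51, 90]})

duration_bins = [0, 25, 50, np.inf]
level_labels = ["Sedentary", "Moderate", "Active"]

# right=False gives left-closed intervals: [0, 25), [25, 50), [50, inf).
df["ExerciseLevel"] = pd.cut(
    df["Duration"], bins=duration_bins, labels=level_labels, right=False
)
print(df)
```

Dropping `right=False` switches to the default right-closed intervals, where 50 falls in Moderate, 25 falls in Sedentary, and a duration of 0 is left unassigned unless `include_lowest=True` is passed.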

Generate Gender dummy variables with a colon separator

Generate dummy variables for the Gender column and store the result in the variable gender_dummies. Each variable should be prefixed with Gender using a colon : separator.

Note: The resulting dataframe should resemble the example shown below:

activity4-answer
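One way this could look with `pd.get_dummies`, using its `prefix` and `prefix_sep` parameters. The DataFrame name `df` and the sample values are assumptions.

```python
import pandas as pd

# Hypothetical sample data standing in for the lab's dataset.
df = pd.DataFrame({"Gender": ["Male", "Female", "Female", "Male"]})

# prefix + prefix_sep + category name forms each column label,
# for example "Gender:Female".
gender_dummies = pd.get_dummies(df["Gender"], prefix="Gender", prefix_sep=":")
print(gender_dummies)
```

Depending on the pandas version, the indicator columns hold booleans or 0/1 integers; `astype(int)` converts either form to 0/1.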

Create Dummy Variables for Weather Conditions Column and Drop First Categorical Level

Create dummy variables for the "Weather Conditions" column. Each variable should be prefixed with "Weather", using a colon ":" as the separator. Make sure to drop the first categorical level; a row with 0 in all remaining columns then represents that dropped level.

Store the resulting dummy variables in the variable Weather_dummies.

Note: The resulting dataframe should resemble the example given below:

activity5-answer
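A short sketch of dropping the first level with `drop_first=True`. The DataFrame name `df` and the weather values are assumptions. Categories are ordered alphabetically here, so "Cloudy" is the level that gets dropped, and a Cloudy row shows 0 in every remaining column.

```python
import pandas as pd

# Hypothetical sample data standing in for the lab's dataset.
df = pd.DataFrame({"Weather Conditions": ["Sunny", "Rainy", "Cloudy", "Sunny"]})

# drop_first=True removes the first category's column ("Cloudy" here);
# that level is then implied by all-zero rows.
Weather_dummies = pd.get_dummies(
    df["Weather Conditions"], prefix="Weather", prefix_sep=":", drop_first=True
)
print(Weather_dummies)
```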

Categorize Exercises by Intensity Level

Create a new column called ExerciseIntensityCategory to categorize the exercises into four intensity levels based on their exercise intensity values:

  • "Low Intensity" (0-0.25)
  • "Medium Intensity" (0.25-0.50)
  • "Above-Medium Intensity" (0.50-0.75)
  • "High Intensity" (0.75-1)

Assign each exercise to an appropriate intensity level using the quantiles 0, 0.25, 0.5, 0.75, and 1.

Note: Ensure that the resulting column matches the provided example series shown in the image:

activity6-answer
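A possible sketch using `pd.qcut` with an explicit list of quantiles. The DataFrame name `df` and the sample intensity values are assumptions. Unlike `pd.cut`, `pd.qcut` derives the bin edges from the data so that each bin holds roughly the same number of rows.

```python
import pandas as pd

# Hypothetical sample data standing in for the lab's dataset.
df = pd.DataFrame({"Exercise Intensity": [1, 2, 3, 4, 5, 6, 7, 8]})

intensity_labels = [
    "Low Intensity",
    "Medium Intensity",
    "Above-Medium Intensity",
    "High Intensity",
]

# The quantile list marks the quartile boundaries of the column.
df["ExerciseIntensityCategory"] = pd.qcut(
    df["Exercise Intensity"], q=[0, 0.25, 0.5, 0.75, 1], labels=intensity_labels
)
print(df)
```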

Categorize Exercises by Calorie Burn Level

Create a new column called CalorieBurnCategory and assign each exercise to one of three categories based on the CaloriesBurn column:

  • Low Calorie Burn: exercises with a low calorie burn rate
  • Moderate Calorie Burn: exercises with a moderate calorie burn rate
  • High Calorie Burn: exercises with a high calorie burn rate

You should use 3 quantiles to determine the category boundaries. Store the bin edges in the calorie_burn_bin_edges variable.

Note: Your results should look similar to this image:

activity7-answer
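A sketch of one way to get both the categories and the bin edges in a single call, using `pd.qcut` with `retbins=True`. The DataFrame name `df` and the sample calorie values are assumptions.

```python
import pandas as pd

# Hypothetical sample data standing in for the lab's dataset.
df = pd.DataFrame(
    {"CaloriesBurn": [100, 150, 200, 250, 300, 350, 400, 450, 500]}
)

burn_labels = ["Low Calorie Burn", "Moderate Calorie Burn", "High Calorie Burn"]

# q=3 splits the column into three equal-frequency bins;
# retbins=True also returns the array of computed bin edges.
df["CalorieBurnCategory"], calorie_burn_bin_edges = pd.qcut(
    df["CaloriesBurn"], q=3, labels=burn_labels, retbins=True
)
print(df)
print(calorie_burn_bin_edges)
```

Three bins need four edges, so the returned array has one more element than there are labels.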

Categorize exercises into BMI categories with 3 quantiles

Create a new column called BMICategory to categorize exercises as Underweight, Normal Weight, or Overweight based on their BMI. Use 3 quantiles to define the categories. Store the bin edges in the variable bmi_bin_edges.

Note: Your results should look similar to this image:

activity8-answer

Categorize Exercises by HeartRateZone

Create a new column called HeartRateZone and categorize the exercises based on their Heart rate into the Resting Zone, Fat-Burning Zone, or Cardio Zone. Use three quantiles to define the categories. Store the bin edges in the variable heart_rate_bin_edges.

Note: Your results should look similar to this image:

activity9-answer

Analyze the relationship between BMI and exercise intensity or fitness metrics and calculate average `Exercise Intensity` for each `BMICategory`

Calculate the average Exercise Intensity for each BMICategory and store the result in the variable bmi_exercise_frequency. This analysis will help us understand the relationship between BMI and exercise intensity or fitness metrics.

Notes:

  • Please ensure that you have completed Activity 8 before attempting this activity.

  • Your result should look similar to this series:

activity10-answer
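The per-category average could be computed with a `groupby` followed by `mean`, as in this sketch. The DataFrame name `df` and the sample rows are assumptions; in the lab, BMICategory would come from Activity 8 rather than being typed in by hand.

```python
import pandas as pd

# Hypothetical sample data; BMICategory is hard-coded here for illustration.
df = pd.DataFrame(
    {
        "BMICategory": [
            "Underweight", "Normal Weight", "Overweight",
            "Underweight", "Normal Weight", "Overweight",
        ],
        "Exercise Intensity": [2, 5, 6, 4, 7, 10],
    }
)

# Group rows by category, then average the intensity within each group.
bmi_exercise_frequency = df.groupby("BMICategory")["Exercise Intensity"].mean()
print(bmi_exercise_frequency)
```

The result is a Series indexed by BMICategory, so a single value can be read with `bmi_exercise_frequency["Overweight"]`.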

Create a grouped bar chart to visualize the relation between `CalorieBurnCategory` & `Exercise`

Store the resulting chart in the variable calories_exercises_count_chart & the grouped data in the variable exercises_count_data.

Notes:

  • Make sure to complete Activity 7 before attempting this activity.

  • It should be a stacked bar chart with figure size of (10, 6).

  • Your result should look similar to this chart:

activity11-answer
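A rough sketch of how the counts and the chart might be produced. It follows the note asking for a stacked chart; passing `stacked=False` would draw side-by-side bars instead. The DataFrame name `df`, the sample rows, and the choice to count rows per category and exercise are all assumptions.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import pandas as pd

# Hypothetical sample data; CalorieBurnCategory would come from Activity 7.
df = pd.DataFrame(
    {
        "CalorieBurnCategory": ["Low", "Low", "Moderate", "High", "High", "Low"],
        "Exercise": [
            "Exercise 1", "Exercise 1", "Exercise 2",
            "Exercise 1", "Exercise 2", "Exercise 2",
        ],
    }
)

# Count rows per (category, exercise) pair and pivot exercises into columns.
exercises_count_data = (
    df.groupby(["CalorieBurnCategory", "Exercise"]).size().unstack(fill_value=0)
)

# DataFrame.plot returns a matplotlib Axes object.
calories_exercises_count_chart = exercises_count_data.plot(
    kind="bar", stacked=True, figsize=(10, 6)
)
```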

Author

Anurag Verma

What's up, friends! 👋 I'm a computer science student about to finish my last year of college. 🎓 I LOVE writing code! ❤️ It makes me so happy! 😄 Whether I'm goofing in notebooks 📓 or coding in Python 🐍, writing programs is a blast! 💥 When I'm not geeking out over AI 🤖 with my classmates or building neural networks, 🧠 you can find me buried in statistics textbooks. 📚 I know, what a nerd! 🤓 I'm always down to learn new ways to speak human 🫂 and computer 💻. Making tech more fun is my jam! 🍇 If you want a cheery data buddy 😎 who can make difficult things easy-peasy 🥝 and learning a party 🎉, I'm your guy! 🙋‍♂️ Let's chat codes 👨‍💻, numbers 🧮, and machines 🤖 over coffee! ☕ I'd love to meet more techy humans. 💁‍♂️ Can't wait to talk! 🗣️

This project is part of

Data Wrangling with Pandas