Bartender's Blueprint: Series Operations on Cocktail Concoctions
Intro to Pandas for Data Analysis

This project guides you through vectorized operations and data analysis techniques using a cocktail recipe dataset. You'll explore series manipulation, normalization, and standardization methods to analyze data about various cocktails, their ingredients, and preparation techniques. You'll learn to perform ratio and percentage calculations, use aggregation methods, and gain insights into the world of cocktails through the lens of data science.
Project Created by

Vidhi Shah

Project Activities

All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.

All our activities include solutions with explanations on how they work and why we chose them.

codevalidated

Convert Titles to Lowercase!

Convert all the values in the title column to lowercase.

Store your results in a variable named lowercase_titles.

Your result should look something like this:

img1
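
One possible approach, sketched on invented data: the DataFrame name df and the sample rows below are assumptions for illustration, since the real dataset is not shown here. Only the title column name comes from the activity text.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset is not reproduced here.
df = pd.DataFrame({"title": ["Mojito", "Old Fashioned", "Espresso Martini"]})

# .str.lower() lowercases every title in one vectorized call.
lowercase_titles = df["title"].str.lower()

print(lowercase_titles.tolist())
```

The .str accessor leaves missing values as missing, so no separate null check is needed for this step.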

codevalidated

Extract First Word from Each Title

Create a series named first_words that stores the first word from each cocktail title.

Your result should look something like this:

img2
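
A minimal sketch of one way to do this, assuming words are separated by whitespace. The DataFrame name df and the sample titles are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset is not reproduced here.
df = pd.DataFrame({"title": ["Mojito", "Old Fashioned", "Espresso Martini"]})

# Split each title on whitespace, then take element 0 of each resulting list.
first_words = df["title"].str.split().str[0]

print(first_words.tolist())
```

Chaining .str.split() with .str[0] keeps the whole operation on the Series, without an explicit Python loop.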

codevalidated

Calculate the Ingredient Length Ratio

Cocktails vary greatly in their ingredients. Create a new series ingredient_length_ratio by calculating the ratio of the length of each ingredient list (in characters) to the total length of all ingredient lists across all recipes.

This should be done by first calculating the total number of characters in all ingredient lists. Then, for each recipe, divide the length (number of characters) of its ingredient list by the total length.

Your result would look something like this:

img3
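
The two steps described above could be sketched as follows. The DataFrame name df and the sample ingredient strings are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset is not reproduced here.
df = pd.DataFrame({"ingredients": ["rum, mint, lime", "whiskey, sugar", "vodka"]})

# Step 1: character length of each ingredient list, then the grand total.
lengths = df["ingredients"].str.len()
total_length = lengths.sum()

# Step 2: divide every length by the total in one vectorized operation.
ingredient_length_ratio = lengths / total_length

print(ingredient_length_ratio)
```

When no ingredient list is missing, the ratios add up to 1, which makes a quick sanity check.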

multiplechoice

What is the primary advantage of using vectorized operations on a pandas series over applying a function using a loop?
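
To make the comparison concrete, here is a small sketch that computes the same result both ways. The series name and values are invented for illustration.

```python
import pandas as pd

# Hypothetical numeric series, e.g. an ingredient count per recipe.
ingredient_counts = pd.Series([3, 2, 1])

# Loop version: Python visits each element one at a time.
doubled_loop = pd.Series([count * 2 for count in ingredient_counts])

# Vectorized version: one expression applied to the whole series.
doubled_vectorized = ingredient_counts * 2

print(doubled_loop.equals(doubled_vectorized))
```

Both versions produce the same values. The vectorized form states the operation once and leaves the element-by-element work to pandas and NumPy internals.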

codevalidated

Standardize the Recipe Length.

Create a new series recipe_length_standardized by standardizing the length of each recipe's ingredient list.

To do this, first calculate the recipe length as the number of characters in the ingredients string for each recipe.

Then, standardize the length using the formula:

recipe_length_standardized = (recipe_length - mean(recipe_length)) / std(recipe_length)

Here, std stands for the standard deviation of the recipe lengths.

Your result would look something like this:

img5
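
A minimal sketch of the formula above, using invented data and a hypothetical DataFrame named df. Note that pandas' Series.std() computes the sample standard deviation (ddof=1) by default; whether the activity expects that variant is an assumption.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset is not reproduced here.
df = pd.DataFrame({"ingredients": ["rum, mint, lime", "whiskey, sugar", "vodka"]})

# Recipe length, measured as the number of characters in the ingredients string.
recipe_length = df["ingredients"].str.len()

# Subtract the mean and divide by the standard deviation (pandas default: ddof=1).
recipe_length_standardized = (recipe_length - recipe_length.mean()) / recipe_length.std()

print(recipe_length_standardized)
```

After standardizing, the series has a mean of about 0 and a standard deviation of 1, up to floating-point error.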

codevalidated

Find the Glass Popularity Ratio

Not all glass types are equally popular.

Create a series glass_popularity_ratio that computes the ratio of each glass type's usage count to the total number of cocktail entries (rows) in the dataset. This will give insights into how often each glass type is used relative to others.

Note that we are considering all entries, even if some recipes are missing values in the recipe column.

Your result would look something like this:

img6
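
One way this could be computed, sketched on invented data. The DataFrame name df, the glass column name, and the sample values are assumptions.

```python
import pandas as pd

# Hypothetical sample rows, including one entry with no glass recorded.
df = pd.DataFrame({"glass": ["Highball", "Rocks", "Highball", None]})

# value_counts() tallies each glass type; len(df) counts every row,
# including rows with missing values.
glass_popularity_ratio = df["glass"].value_counts() / len(df)

print(glass_popularity_ratio)
```

Because the denominator is the full row count, the ratios add up to less than 1 whenever some rows have no glass recorded.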

codevalidated

Calculate the Garnish Effectiveness Index!

Assume that each garnish adds a certain value to the drink, based on its frequency of use.

Create a new series garnish_effectiveness_index by dividing the frequency of each garnish by the total number of garnishes across all recipes, then multiplying by 100 to create a percentage.

Your result would look something like this:

img7
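
A hedged sketch of this calculation follows. It assumes that "total number of garnishes" means the number of non-missing garnish entries; the activity text leaves this open. The DataFrame name df and the sample values are invented.

```python
import pandas as pd

# Hypothetical sample rows, including one recipe without a garnish.
df = pd.DataFrame(
    {"garnish": ["mint sprig", "orange peel", "mint sprig", "mint sprig", None]}
)

# Frequency of each garnish (missing values are excluded by default).
garnish_counts = df["garnish"].value_counts()

# Share of each garnish among all recorded garnishes, as a percentage.
# Assumption: the total is the sum of these counts, not the row count.
garnish_effectiveness_index = garnish_counts / garnish_counts.sum() * 100

print(garnish_effectiveness_index)
```

If the intended total is the number of rows instead, replace garnish_counts.sum() with len(df).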

codevalidated

Calculate the Ingredient to Garnish Ratio

Create a series ingredient_to_garnish_ratio by dividing the number of ingredients used in each cocktail by the number of garnishes. The number of ingredients is determined by counting the commas in the ingredients string and adding 1.

Similarly, the number of garnishes is calculated by counting the commas in the garnish string and adding 1.

To handle missing garnishes, fill the missing comma counts with -1 before adding 1, so that a recipe with no garnish counts as zero garnishes. To avoid division by zero, add a small value (0.1) to the denominator.

Your result should look something like this:

img8
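
The steps above could be sketched like this. The DataFrame name df and the sample rows are hypothetical, and the 0.1 offset is applied to every denominator as the instructions describe.

```python
import pandas as pd

# Hypothetical sample rows; the last recipe has no garnish.
df = pd.DataFrame(
    {
        "ingredients": ["rum, mint, lime, soda", "whiskey, sugar, bitters", "vodka, espresso"],
        "garnish": ["mint sprig, lime wheel", "orange peel", None],
    }
)

# Number of ingredients = number of commas + 1.
num_ingredients = df["ingredients"].str.count(",") + 1

# Missing garnish gives a missing comma count; filling with -1 and adding 1 yields 0.
num_garnishes = df["garnish"].str.count(",").fillna(-1) + 1

# Add 0.1 to the denominator to avoid division by zero.
ingredient_to_garnish_ratio = num_ingredients / (num_garnishes + 0.1)

print(ingredient_to_garnish_ratio)
```

In this sample the recipe without a garnish ends up with a ratio of 2 / 0.1 = 20, so very large values flag recipes that have no garnish.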

codevalidated

Create the `glass_usage_standardized` Series

Standardize the usage of each glass type by calculating the difference between the glass usage count and the mean glass usage count, then dividing by the standard deviation of the glass usage count.

Store the result in a new series glass_usage_standardized.

Your result would look something like this: img9
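
A minimal sketch of this standardization, assuming "glass usage count" is the per-type tally from value_counts(). The DataFrame name df and the sample values are invented, and Series.std() uses the sample standard deviation (ddof=1) by default.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset is not reproduced here.
df = pd.DataFrame(
    {"glass": ["Highball", "Highball", "Highball", "Rocks", "Rocks", "Coupe"]}
)

# Usage count per glass type: Highball 3, Rocks 2, Coupe 1.
glass_usage = df["glass"].value_counts()

# Standardize the counts: subtract their mean, divide by their standard deviation.
glass_usage_standardized = (glass_usage - glass_usage.mean()) / glass_usage.std()

print(glass_usage_standardized)
```

Here the counts have mean 2 and sample standard deviation 1, so the standardized values are 1, 0, and -1.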

multiplechoice

Which of the following operations is `NOT` a vectorized operation in pandas?

codevalidated

Calculate the `ingredient_intensity_index` Series

The complexity of a cocktail often reflects its ingredient intensity. Create a new series ingredient_intensity_index by calculating the square of the number of characters in the ingredients string for each recipe.

Your result would look something like this: img11
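
One possible sketch, using invented data and a hypothetical DataFrame named df:

```python
import pandas as pd

# Hypothetical sample rows; the real dataset is not reproduced here.
df = pd.DataFrame({"ingredients": ["rum, mint, lime", "vodka"]})

# Character length of each ingredients string, squared element-wise.
ingredient_intensity_index = df["ingredients"].str.len() ** 2

print(ingredient_intensity_index.tolist())
```

The ** operator works element-wise on a Series, so no loop is needed to square each length.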

This project is part of

Intro to Pandas for Data Analysis
