Practice Lab: Merge and Joining data with Pandas


In this lab, you'll explore merging and joining datasets with Pandas. You'll practice different types of joins and combine several dataframes to answer questions about movies.
Project Created by

Mohamed Rawash

Project Activities

All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.

All our activities include solutions with explanations on how they work and why we chose them.


Which parameter of the `pd.merge()` method should we use to handle overlapping column names?


Which parameter should we use to determine the type of merge to be performed?


Which of the following is not a valid option for the `how` parameter?


Which of the following is the default option for the `how` parameter?


Which of the following describes what the `cross` option of the `how` parameter does?


Drop duplicate movies based on `title` and keep the first occurrence

Perform the dropping on the original dataframe movies_df.
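
One way this activity might be approached is sketched below. The small `movies_df` here is made-up stand-in data, not the lab's dataset, and the `genres` values are placeholders.

```python
import pandas as pd

# Toy stand-in for movies_df (assumed columns; the lab data is larger).
movies_df = pd.DataFrame({
    "movieId": [1, 2, 3, 4],
    "title": ["Toy Story (1995)", "Jumanji (1995)", "Toy Story (1995)", "Heat (1995)"],
    "genres": ["Animation", "Adventure", "Animation", "Action"],
})

# Remove rows whose title already appeared earlier, keeping the first one.
# inplace=True modifies movies_df itself instead of returning a copy.
movies_df.drop_duplicates(subset="title", keep="first", inplace=True)

print(movies_df)
```

Reassigning with `movies_df = movies_df.drop_duplicates(subset="title", keep="first")` would have the same effect on the variable.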


Merge `movies_df` & `ratings_df` with an inner join on `movieId`.

Store the resulting dataframe in the variable movies_ratings_df.
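
As a rough illustration of an inner join on a shared key, here is a sketch using made-up data. The column names other than `movieId` are assumptions about the lab's dataframes.

```python
import pandas as pd

# Toy stand-ins for the lab's dataframes (assumed columns).
movies_df = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Toy Story (1995)", "Jumanji (1995)", "Heat (1995)"],
})
ratings_df = pd.DataFrame({
    "userId": [10, 11, 10],
    "movieId": [1, 1, 2],
    "rating": [4.0, 5.0, 3.5],
})

# An inner join keeps only the movieId values present in both dataframes,
# so movie 3 (which has no ratings here) is dropped.
movies_ratings_df = pd.merge(movies_df, ratings_df, how="inner", on="movieId")

print(movies_ratings_df)
```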


Use the merged `movies_ratings_df` dataframe to calculate the average rating for each movie.

Store the result in the variable avg_ratings.

Your result should look like this (title is unique for all movies as we have already dropped duplicates in activity 6):



What is the average rating of the movie `Toy Story (1995)`?
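
The averaging step and the lookup of a single title could be sketched as follows. The ratings are invented, so the number printed here is not the lab's answer, and the lab may expect a dataframe rather than the Series produced below.

```python
import pandas as pd

# Toy stand-in for the merged dataframe (made-up ratings).
movies_ratings_df = pd.DataFrame({
    "title": ["Toy Story (1995)", "Toy Story (1995)", "Jumanji (1995)"],
    "rating": [4.0, 5.0, 3.5],
})

# Group rows by title and average the rating within each group.
# The result is a Series indexed by title.
avg_ratings = movies_ratings_df.groupby("title")["rating"].mean()

# Look up one movie by its title.
toy_story_avg = avg_ratings.loc["Toy Story (1995)"]
print(toy_story_avg)
```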


Merge `movies_df` & `tags_df` with a left join on `movieId`.

Store the resulting dataframe in the variable movies_tags_df.


Use the merged dataframe `movies_tags_df` to select the movies with no tags.

Store the result in the variable movies_with_no_tags.
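
A left join followed by a missing-value filter is one possible approach, sketched here on made-up data. The `tag` column name is an assumption about `tags_df`.

```python
import pandas as pd

# Toy stand-ins for the lab's dataframes (assumed columns).
movies_df = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Toy Story (1995)", "Jumanji (1995)", "Heat (1995)"],
})
tags_df = pd.DataFrame({
    "userId": [10, 11],
    "movieId": [1, 1],
    "tag": ["pixar", "fun"],
})

# A left join keeps every row of movies_df; movies without a matching
# tag get NaN in the columns that come from tags_df.
movies_tags_df = pd.merge(movies_df, tags_df, how="left", on="movieId")

# Movies with no tags are the rows where the tag column is missing.
movies_with_no_tags = movies_tags_df[movies_tags_df["tag"].isna()]

print(movies_with_no_tags)
```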


Merge `tags_df` & `ratings_df` using the movie ID and the user ID

Merge tags_df & ratings_df using an outer join on 'movieId' and 'userId'. Use suffixes '_tags' and '_ratings'.

Store the resulting dataframe in the variable tags_ratings_df.
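
The following sketch shows how an outer join on two keys with suffixes behaves on made-up data. It assumes both dataframes share a non-key `timestamp` column, which is what makes the suffixes visible. The lab's actual columns may differ.

```python
import pandas as pd

# Toy stand-ins (assumed columns; timestamp exists in both on purpose).
tags_df = pd.DataFrame({
    "userId": [10, 11],
    "movieId": [1, 2],
    "tag": ["pixar", "board game"],
    "timestamp": [100, 200],
})
ratings_df = pd.DataFrame({
    "userId": [10, 12],
    "movieId": [1, 2],
    "rating": [4.0, 3.0],
    "timestamp": [150, 300],
})

# An outer join keeps every (movieId, userId) pair found in either
# dataframe. The overlapping timestamp column is split in two and
# renamed with the given suffixes.
tags_ratings_df = pd.merge(
    tags_df,
    ratings_df,
    how="outer",
    on=["movieId", "userId"],
    suffixes=("_tags", "_ratings"),
)

print(tags_ratings_df)
```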

The result should look something like:


Merge `movies_df` dataframe & `tag_counts` series with the left dataframe on `genres` & the right series on its index.

Store the resulting dataframe in the variable movies_tags_counts_df.

  • Note: use the default option of inner join.
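
Merging a dataframe column against a Series index can be sketched as below. How `tag_counts` is built in the lab is not shown, so the Series here, including its `tag_count` name, is a hypothetical stand-in. A Series must have a name to be passed to `pd.merge()`.

```python
import pandas as pd

# Toy stand-in for movies_df (assumed columns).
movies_df = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Toy Story (1995)", "Jumanji (1995)", "Heat (1995)"],
    "genres": ["Animation", "Adventure", "Action"],
})

# Hypothetical Series whose index holds genre labels.
tag_counts = pd.Series({"Animation": 5, "Adventure": 2}, name="tag_count")

# Match the genres column on the left against the Series index on the
# right. With no how= argument, merge performs an inner join.
movies_tags_counts_df = pd.merge(
    movies_df, tag_counts, left_on="genres", right_index=True
)

print(movies_tags_counts_df)
```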

Merge `movies_df` dataframe & `rating_counts` series using `outer` join with the left dataframe on `movieId` & the right series on its index.

Store the resulting dataframe in the variable movies_ratings_counts_df.


Use the `movies_ratings_counts_df` dataframe to select the movies with no ratings.

Store the resulting dataframe in the variable movies_with_no_ratings.
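
The outer merge against a Series index and the follow-up selection might look like this sketch. The `rating_counts` Series and its `rating_count` name are hypothetical, since the lab's construction of it is not shown.

```python
import pandas as pd

# Toy stand-in for movies_df (assumed columns).
movies_df = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Toy Story (1995)", "Jumanji (1995)", "Heat (1995)"],
})

# Hypothetical Series indexed by movie ID; movie 3 has no entry.
rating_counts = pd.Series({1: 2, 2: 1}, name="rating_count")

# An outer join keeps all movies, leaving NaN where no count exists.
movies_ratings_counts_df = pd.merge(
    movies_df, rating_counts, how="outer", left_on="movieId", right_index=True
)

# Movies with no ratings are the rows where the count is missing.
movies_with_no_ratings = movies_ratings_counts_df[
    movies_ratings_counts_df["rating_count"].isna()
]

print(movies_with_no_ratings)
```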


This project is part of

Data Wrangling with Pandas
