Practice Lab: Merging and Joining Data with Pandas

multiplechoice

Which parameter of `pd.merge()` should we use to handle overlapping column names?

multiplechoice

Which parameter should we use to determine the type of merge to be performed?

multiplechoice

Which of the following is not a valid option for the `how` parameter?

multiplechoice

Which of the following is the default option for the `how` parameter?

multiplechoice

Which of the following describes the use of the `cross` option for the `how` parameter?

codevalidated

Drop duplicate movies based on `title` and keep the first occurrence

Perform the drop in place on the original dataframe `movies_df`.
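A minimal sketch of this kind of in-place de-duplication, using a small made-up frame in place of the lab's `movies_df` (the rows and the `genres` values here are illustrative assumptions, not the real data):

```python
import pandas as pd

# Toy stand-in for movies_df; the lab's actual dataset is not reproduced here.
movies_df = pd.DataFrame({
    "movieId": [1, 2, 3, 4],
    "title": ["Movie A", "Movie B", "Movie A", "Movie C"],
    "genres": ["Comedy", "Drama", "Comedy", "Action"],
})

# subset="title" compares rows by title only.
# keep="first" retains the first row seen for each title.
# inplace=True modifies movies_df itself instead of returning a copy.
movies_df.drop_duplicates(subset="title", keep="first", inplace=True)

remaining_ids = movies_df["movieId"].tolist()
print(remaining_ids)
```

Assigning the result back (`movies_df = movies_df.drop_duplicates(...)`) is an equivalent alternative to `inplace=True`.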

codevalidated

Merge `movies_df` & `ratings_df` with an inner join on `movieId`.

Store the resulting dataframe in the variable movies_ratings_df.
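One way an inner join on a shared key could look, sketched on hypothetical toy frames (the column names other than `movieId` are assumptions about the dataset):

```python
import pandas as pd

# Toy stand-ins; only movieId is known from the activity text.
movies_df = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Movie A", "Movie B", "Movie C"],
})
ratings_df = pd.DataFrame({
    "userId": [10, 11, 10],
    "movieId": [1, 1, 4],
    "rating": [4.0, 5.0, 3.0],
})

# how="inner" keeps only movieId values present in BOTH frames,
# so movies 2 and 3 (no ratings) and movie 4 (no movie record) are dropped.
movies_ratings_df = pd.merge(movies_df, ratings_df, how="inner", on="movieId")

print(movies_ratings_df)
```

Since `inner` is the default for `how`, passing it explicitly is optional, but it documents the intent.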

codevalidated

Use the merged `movies_ratings_df` dataframe to calculate the average rating for each movie.

Store the result in the variable avg_ratings.

Your result should look like this (title is unique for all movies as we have already dropped duplicates in activity 6):

activity7a-answer
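A possible approach is a group-by on `title` followed by a mean over `rating`. The sketch below uses made-up rows and assumes the merged frame has `title` and `rating` columns. Whether the lab's validator expects a Series or a DataFrame is not stated here, so treat the return type as an assumption.

```python
import pandas as pd

# Toy stand-in for the merged movies_ratings_df.
movies_ratings_df = pd.DataFrame({
    "movieId": [1, 1, 2],
    "title": ["Movie A", "Movie A", "Movie B"],
    "rating": [4.0, 5.0, 3.0],
})

# Group rows by title and average the rating column within each group.
# The result is a Series indexed by title.
avg_ratings = movies_ratings_df.groupby("title")["rating"].mean()

print(avg_ratings)
```

A single movie's value can then be looked up by label, for example `avg_ratings["Movie A"]`.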

multiplechoice

What is the average rating of the movie `Toy Story (1995)`?

codevalidated

Merge `movies_df` & `tags_df` with a left join on `movieId`.

Store the resulting dataframe in the variable movies_tags_df.
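A left join keeps every row of the left frame, even when the right frame has no match. A small sketch on hypothetical data (the `tag` column name is an assumption):

```python
import pandas as pd

# Toy stand-ins for movies_df and tags_df.
movies_df = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Movie A", "Movie B", "Movie C"],
})
tags_df = pd.DataFrame({
    "movieId": [1, 1, 3],
    "tag": ["funny", "classic", "dark"],
})

# how="left" keeps all movies; movies without tags get NaN in the tag column.
movies_tags_df = pd.merge(movies_df, tags_df, how="left", on="movieId")

print(movies_tags_df)
```

Movie 1 appears twice (one row per tag), while movie 2 appears once with a missing `tag`.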

codevalidated

Use the merged dataframe `movies_tags_df` to select the movies with no tags.

Store the result in the variable movies_with_no_tags.
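After a left join, unmatched rows carry missing values in the columns that came from the right frame, so filtering on those is one way to find them. The sketch below assumes the tag column is named `tag`. The lab's actual column name may differ.

```python
import pandas as pd

# Toy stand-in for a left-joined movies_tags_df; None marks a movie with no tag.
movies_tags_df = pd.DataFrame({
    "movieId": [1, 1, 2, 3],
    "title": ["Movie A", "Movie A", "Movie B", "Movie C"],
    "tag": ["funny", "classic", None, "dark"],
})

# isna() flags rows where no tag was matched during the left join.
movies_with_no_tags = movies_tags_df[movies_tags_df["tag"].isna()]

print(movies_with_no_tags)
```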

codevalidated

Merge `tags_df` & `ratings_df` using the movie ID and the user ID

Merge `tags_df` & `ratings_df` using an outer join on `movieId` and `userId`. Use the suffixes `_tags` and `_ratings`.

Store the resulting dataframe in the variable tags_ratings_df.

The result should look something like:
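As a rough illustration on toy data (the frames below are hypothetical stand-ins, and the shared `timestamp` column is an assumption about the lab's dataset), an outer merge on two keys with custom suffixes could be sketched like this:

```python
import pandas as pd

# Toy stand-ins; both frames share a non-key column named "timestamp".
tags_df = pd.DataFrame({
    "userId": [1, 2],
    "movieId": [10, 20],
    "tag": ["funny", "dark"],
    "timestamp": [100, 200],
})
ratings_df = pd.DataFrame({
    "userId": [1, 3],
    "movieId": [10, 30],
    "rating": [4.0, 2.5],
    "timestamp": [110, 300],
})

# Join on the (movieId, userId) pair; how="outer" keeps rows from both sides.
# suffixes renames the overlapping non-key column so both versions survive.
tags_ratings_df = pd.merge(
    tags_df,
    ratings_df,
    how="outer",
    on=["movieId", "userId"],
    suffixes=("_tags", "_ratings"),
)

print(tags_ratings_df.columns.tolist())
```

Only the overlapping non-key column receives the suffixes. The join keys themselves stay as single `movieId` and `userId` columns.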

codevalidated

Merge the `movies_df` dataframe with the `tag_counts` series, joining on the `genres` column of the left dataframe and on the index of the right series.

Store the resulting dataframe in the variable movies_tags_counts_df.

Note: use the default option of inner join.
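`pd.merge()` accepts a named Series as one of its inputs, and `left_on` can be combined with `right_index=True` to match a column against an index. A sketch under assumptions: how `tag_counts` is built is not shown in this lab text, so the series below, including its name `tag_count`, is hypothetical.

```python
import pandas as pd

# Toy stand-in for movies_df.
movies_df = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Movie A", "Movie B", "Movie C"],
    "genres": ["Comedy", "Drama", "Horror"],
})

# Hypothetical series indexed by genre; a Series must have a name to be merged.
tag_counts = pd.Series({"Comedy": 5, "Drama": 2}, name="tag_count")

# Match the left frame's genres column against the series index.
# how defaults to "inner", so "Horror" (absent from the index) is dropped.
movies_tags_counts_df = pd.merge(
    movies_df, tag_counts, left_on="genres", right_index=True
)

print(movies_tags_counts_df)
```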

codevalidated

Merge the `movies_df` dataframe with the `rating_counts` series using an `outer` join, joining on the `movieId` column of the left dataframe and on the index of the right series.

Store the resulting dataframe in the variable movies_ratings_counts_df.
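The same column-to-index pattern works with `how="outer"`, which keeps unmatched rows from both sides. In this sketch the `rating_counts` series and its name `rating_count` are hypothetical, since the lab text does not show how the series is created.

```python
import pandas as pd

# Toy stand-in for movies_df.
movies_df = pd.DataFrame({
    "movieId": [1, 2],
    "title": ["Movie A", "Movie B"],
})

# Hypothetical series of rating counts, indexed by movie ID.
rating_counts = pd.Series([3, 1], index=[1, 4], name="rating_count")

# Outer join: movie 2 has no count (NaN in rating_count),
# and ID 4 has a count but no movie record (NaN in title).
movies_ratings_counts_df = pd.merge(
    movies_df,
    rating_counts,
    how="outer",
    left_on="movieId",
    right_index=True,
)

print(movies_ratings_counts_df)
```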

codevalidated

Use the `movies_ratings_counts_df` dataframe to select the movies with no ratings.

Store the resulting dataframe in the variable movies_with_no_ratings.
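After an outer join, movies without ratings are the rows whose count column is missing. A minimal sketch, assuming the count column is named `rating_count` (a hypothetical name; use whatever name the series carries in the lab):

```python
import pandas as pd

# Toy stand-in for the outer-joined movies_ratings_counts_df.
movies_ratings_counts_df = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Movie A", "Movie B", "Movie C"],
    "rating_count": [3.0, None, 7.0],
})

# Rows with a missing rating_count had no match on the ratings side.
movies_with_no_ratings = movies_ratings_counts_df[
    movies_ratings_counts_df["rating_count"].isna()
]

print(movies_with_no_ratings)
```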

Mohamed Rawash
