All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.
All our activities include solutions with explanations on how they work and why we chose them.
Merge the DataFrames df_numeric and df_string on their common column Booking_ID using a left join. Store the resulting DataFrame in the df variable.
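One possible way to approach this merge is sketched below. The sample rows and the lead_time values are made up for illustration; the real df_numeric and df_string come from the project data.

```python
import pandas as pd

# Hypothetical sample data standing in for the project's DataFrames
df_numeric = pd.DataFrame({
    "Booking_ID": ["INN00001", "INN00002", "INN00003"],
    "lead_time": [224, 5, 1],
})
df_string = pd.DataFrame({
    "Booking_ID": ["INN00001", "INN00002"],
    "room_type_reserved": ["Room_Type 1", "Room_Type 4"],
})

# A left join keeps every row of df_numeric; bookings with no match
# in df_string get NaN in the columns that come from df_string
df = pd.merge(df_numeric, df_string, on="Booking_ID", how="left")
print(df)
```

Because the join is a left join, the row count of df matches df_numeric as long as Booking_ID is unique in df_string.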
Filter the DataFrame df to display only the rows where the number of adults is greater than 2. Store the resulting DataFrame in the df_n_adults variable.
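A minimal sketch of this filter with boolean indexing follows. The column name no_of_adults is an assumption, since the activity does not name the column, and the rows are made up.

```python
import pandas as pd

# Hypothetical sample; "no_of_adults" is an assumed column name
df = pd.DataFrame({
    "Booking_ID": ["INN00001", "INN00002", "INN00003", "INN00004"],
    "no_of_adults": [2, 3, 1, 4],
})

# The comparison yields a boolean Series; indexing with it keeps the True rows
df_n_adults = df[df["no_of_adults"] > 2]
print(df_n_adults)
```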
Filter the DataFrame df to include only bookings from the arrival year 2018 with a lead time greater than 100. After filtering, sort the results by avg_price_per_room in descending order. Store the resulting DataFrame in the filtered_df variable.
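The two conditions can be combined with & before sorting. The sketch below uses made-up rows and assumes the columns are named arrival_year and lead_time, as the activity wording suggests.

```python
import pandas as pd

# Hypothetical sample rows
df = pd.DataFrame({
    "arrival_year": [2018, 2018, 2017, 2018],
    "lead_time": [150, 50, 200, 120],
    "avg_price_per_room": [90.0, 120.0, 80.0, 110.0],
})

# Each condition needs its own parentheses because & binds tighter than == and >
mask = (df["arrival_year"] == 2018) & (df["lead_time"] > 100)

# Keep the matching rows, then order them from most to least expensive
filtered_df = df[mask].sort_values("avg_price_per_room", ascending=False)
print(filtered_df)
```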
Group the DataFrame df by room_type_reserved and calculate the average avg_price_per_room for each room type. Store the result in avg_price_by_room.
Enter the name of the room type that has the highest average price per room.
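The grouping and the follow-up question could be handled as sketched below. The prices are invented, so the room type printed here says nothing about the answer on the real data; top_room_type is a hypothetical helper name.

```python
import pandas as pd

# Hypothetical sample rows
df = pd.DataFrame({
    "room_type_reserved": ["Room_Type 1", "Room_Type 1", "Room_Type 4", "Room_Type 6"],
    "avg_price_per_room": [80.0, 100.0, 120.0, 180.0],
})

# Mean price per room type, returned as a Series indexed by room type
avg_price_by_room = df.groupby("room_type_reserved")["avg_price_per_room"].mean()

# idxmax() returns the index label (the room type) holding the largest mean
top_room_type = avg_price_by_room.idxmax()
print(avg_price_by_room)
print(top_room_type)
```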
Group the DataFrame df by arrival_month and calculate the count, mean, and median of avg_price_per_room for each month. Store the result in the monthly_stats variable.
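Passing a list of function names to agg() produces one column per statistic. A small sketch with made-up prices:

```python
import pandas as pd

# Hypothetical sample rows
df = pd.DataFrame({
    "arrival_month": [1, 1, 1, 2, 2],
    "avg_price_per_room": [100.0, 110.0, 150.0, 80.0, 100.0],
})

# One row per month, with columns "count", "mean" and "median"
monthly_stats = df.groupby("arrival_month")["avg_price_per_room"].agg(
    ["count", "mean", "median"]
)
print(monthly_stats)
```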
Use groupby() and agg() to calculate the sum of avg_price_per_room and the booking count for each combination of arrival_year and arrival_month. Store the resulting DataFrame in the revenue_summary variable.
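One way to get both statistics in a single pass is named aggregation, sketched below. The output column names total_price and booking_count are hypothetical, and counting Booking_ID is just one reasonable way to count bookings; the activity's expected solution may name or count things differently.

```python
import pandas as pd

# Hypothetical sample rows
df = pd.DataFrame({
    "Booking_ID": ["INN00001", "INN00002", "INN00003", "INN00004", "INN00005"],
    "arrival_year": [2017, 2017, 2018, 2018, 2018],
    "arrival_month": [12, 12, 1, 1, 2],
    "avg_price_per_room": [100.0, 50.0, 80.0, 120.0, 90.0],
})

# Named aggregation: new_column=(source_column, function)
revenue_summary = (
    df.groupby(["arrival_year", "arrival_month"])
    .agg(
        total_price=("avg_price_per_room", "sum"),
        booking_count=("Booking_ID", "count"),
    )
    .reset_index()  # turn the two group keys back into ordinary columns
)
print(revenue_summary)
```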
Generate dummy variables for the type_of_meal_plan column in the DataFrame df. After creating the dummy variables, drop the original type_of_meal_plan column, if it exists. Set drop_first=True. Store the result in df.
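A possible sketch of the encoding step follows, using invented meal plan values. Note that pd.get_dummies() with the columns argument already removes the source column, so the extra drop() with errors="ignore" is only a safeguard matching the "if it exists" wording.

```python
import pandas as pd

# Hypothetical sample rows
df = pd.DataFrame({
    "Booking_ID": ["INN00001", "INN00002", "INN00003", "INN00004"],
    "type_of_meal_plan": ["Meal Plan 1", "Not Selected", "Meal Plan 2", "Meal Plan 1"],
})

# drop_first=True omits the first category (in sorted order) to avoid
# a redundant, perfectly collinear dummy column
df = pd.get_dummies(df, columns=["type_of_meal_plan"], drop_first=True)

# No-op here, because get_dummies already replaced the original column
df = df.drop(columns=["type_of_meal_plan"], errors="ignore")
print(df.columns.tolist())
```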
Create a new column price_category in the DataFrame df by binning the avg_price_per_room into three distinct categories:
Budget - [0 - 180]
Standard - (180 - 360]
Luxury - (360 - 540]
Use the apply() function to create a new column total_nights in the DataFrame df. This column will be the sum of no_of_weekend_nights and no_of_week_nights, providing a total count of nights stayed for each booking.
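The binning and the total_nights column could be sketched as follows, on made-up rows. pd.cut() is one option for the binning; include_lowest=True is what makes the first interval closed at 0, matching [0 - 180].

```python
import pandas as pd

# Hypothetical sample rows
df = pd.DataFrame({
    "avg_price_per_room": [0.0, 95.5, 180.0, 200.0, 400.0],
    "no_of_weekend_nights": [1, 0, 2, 1, 2],
    "no_of_week_nights": [2, 3, 1, 0, 5],
})

# Bin edges 0, 180, 360, 540 give the intervals [0, 180], (180, 360], (360, 540]
df["price_category"] = pd.cut(
    df["avg_price_per_room"],
    bins=[0, 180, 360, 540],
    labels=["Budget", "Standard", "Luxury"],
    include_lowest=True,
)

# axis=1 passes one row at a time to the lambda
df["total_nights"] = df.apply(
    lambda row: row["no_of_weekend_nights"] + row["no_of_week_nights"],
    axis=1,
)
print(df)
```

On a large DataFrame, adding the two columns directly (df["no_of_weekend_nights"] + df["no_of_week_nights"]) gives the same result faster; apply() is used here because the activity asks for it.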
Create a new DataFrame named string_df and store all the string columns in that DataFrame. Use the applymap() function to convert all string values in string_df to lowercase.
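A hedged sketch of this step is shown below with invented rows. Two details depend on the pandas version: string columns use the object dtype in pandas 1.x and 2.x, so both "object" and "string" are selected here to be safe, and applymap() was deprecated in pandas 2.1 in favor of DataFrame.map(), so the sketch falls back to applymap() only when map() is unavailable.

```python
import pandas as pd

# Hypothetical sample rows
df = pd.DataFrame({
    "market_segment_type": ["Online", "Offline"],
    "room_type_reserved": ["Room_Type 1", "Room_Type 4"],
    "lead_time": [224, 5],
})

# Keep only the text columns
string_df = df.select_dtypes(include=["object", "string"])

# Element-wise function application: DataFrame.map on pandas >= 2.1,
# DataFrame.applymap (the method named in the activity) on older versions
elementwise = getattr(string_df, "map", None) or string_df.applymap
string_df = elementwise(str.lower)
print(string_df)
```

str.lower raises a TypeError on missing values, so real data with NaN in a string column would need those handled first.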
Create a bar plot that visualizes the average lead_time for each market_segment_type in df. Set color='orange' to make the bars orange.
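One way to draw this bar plot with the pandas plotting interface is sketched below, on made-up rows. The Agg backend line is only there so the sketch runs without a display; in a notebook it can be dropped.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; optional in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows
df = pd.DataFrame({
    "market_segment_type": ["Online", "Online", "Offline", "Corporate"],
    "lead_time": [100, 60, 40, 10],
})

# Average lead time per market segment
avg_lead_time = df.groupby("market_segment_type")["lead_time"].mean()

# One orange bar per segment
ax = avg_lead_time.plot(kind="bar", color="orange")
ax.set_xlabel("Market segment")
ax.set_ylabel("Average lead time")
plt.tight_layout()
```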
Create a scatter plot to visualize the relationship between lead_time and avg_price_per_room.
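A minimal scatter plot sketch, again on invented values and with the optional non-interactive backend:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; optional in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows
df = pd.DataFrame({
    "lead_time": [5, 40, 120, 224],
    "avg_price_per_room": [130.0, 110.0, 95.0, 80.0],
})

# Each booking becomes one point; the axis labels default to the column names
ax = df.plot(kind="scatter", x="lead_time", y="avg_price_per_room")
ax.set_title("Lead time vs. average price per room")
plt.tight_layout()
```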
Plot a stacked bar chart showing the distribution of canceled vs. not canceled bookings across different market segments.
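The stacked chart could be built from a cross-tabulation, as sketched below. The column name booking_status and its values "Canceled" and "Not_Canceled" are assumptions, since the activity does not name them, and the rows are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; optional in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample; "booking_status" and its labels are assumed
df = pd.DataFrame({
    "market_segment_type": ["Online", "Online", "Online", "Offline", "Offline", "Corporate"],
    "booking_status": ["Canceled", "Canceled", "Not_Canceled",
                       "Not_Canceled", "Canceled", "Not_Canceled"],
})

# Rows are market segments, columns are booking statuses, cells are counts
counts = pd.crosstab(df["market_segment_type"], df["booking_status"])

# stacked=True piles the status counts on top of each other within a segment
ax = counts.plot(kind="bar", stacked=True)
ax.set_ylabel("Number of bookings")
plt.tight_layout()
```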
Plot a line chart showing the number of bookings for each room type over different months of the year.
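For the line chart, one approach is to count bookings per month and room type, then pivot the room types into columns so each becomes its own line. The rows below are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; optional in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows
df = pd.DataFrame({
    "arrival_month": [1, 1, 1, 2, 2, 3],
    "room_type_reserved": ["Room_Type 1", "Room_Type 1", "Room_Type 4",
                           "Room_Type 1", "Room_Type 4", "Room_Type 4"],
})

# size() counts rows per (month, room type); unstack() moves room types
# into columns, and fill_value=0 covers months with no bookings of a type
bookings = (
    df.groupby(["arrival_month", "room_type_reserved"])
    .size()
    .unstack(fill_value=0)
)

# One line per room type, with months on the x-axis
ax = bookings.plot(kind="line", marker="o")
ax.set_xlabel("Arrival month")
ax.set_ylabel("Number of bookings")
plt.tight_layout()
```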