All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.
All our activities include solutions with explanations on how they work and why we chose them.
The acousticness column in the DataFrame represents the acoustic level of each song. To make the column name more descriptive and readable, use the rename() method to change it from acousticness to acoustic_level. Set the inplace parameter to True so that the original DataFrame df is modified directly instead of a renamed copy being returned.
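A minimal sketch of this rename, assuming a small hypothetical df (the two sample rows below are made up for illustration and are not from the project dataset):

```python
import pandas as pd

# Hypothetical sample rows; in the project, df is already loaded.
df = pd.DataFrame({"name": ["Song A", "Song B"], "acousticness": [0.91, 0.12]})

# With inplace=True, rename() modifies df and returns None,
# so the result must not be assigned back to df.
df.rename(columns={"acousticness": "acoustic_level"}, inplace=True)
```

Because the call returns None when inplace=True, writing df = df.rename(..., inplace=True) would overwrite df with None.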
Rename multiple columns in the DataFrame df to make them more descriptive, concise, and easily understandable. Change 'danceability' to 'dance_score', 'duration_ms' to 'duration_milliseconds', 'instrumentalness' to 'instrumental', 'liveness' to 'live_performance', and 'speechiness' to 'speech_presence'. Assign the resulting DataFrame with the renamed columns back to the variable df to update the original DataFrame.
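One way to sketch the multi-column rename is to pass a single mapping to rename() and assign the result back. The sample values below are hypothetical placeholders:

```python
import pandas as pd

# Hypothetical one-row sample with the original column names.
df = pd.DataFrame({
    "danceability": [0.6],
    "duration_ms": [210000],
    "instrumentalness": [0.0],
    "liveness": [0.1],
    "speechiness": [0.05],
})

# Old name -> new name, as listed in the activity.
new_names = {
    "danceability": "dance_score",
    "duration_ms": "duration_milliseconds",
    "instrumentalness": "instrumental",
    "liveness": "live_performance",
    "speechiness": "speech_presence",
}

# Without inplace, rename() returns a new DataFrame, so reassign it to df.
df = df.rename(columns=new_names)
```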
Convert the duration_milliseconds column values to seconds and store the result in a new column named duration_seconds.
Note: The new column is added at the end of df.
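The conversion can be sketched as a division by 1000, since one second is 1000 milliseconds. The sample durations are hypothetical:

```python
import pandas as pd

# Hypothetical sample durations in milliseconds.
df = pd.DataFrame({"duration_milliseconds": [210000, 185500]})

# Assigning to a new column name appends it as the last column.
df["duration_seconds"] = df["duration_milliseconds"] / 1000
```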
Rescale the values in the popularity column by multiplying them by 0.01, and store the rescaled values in a new column named popularity_score.
Note: The new column is added at the end of df.
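A short sketch of the rescaling, assuming popularity holds integer scores (the sample values are hypothetical):

```python
import pandas as pd

# Hypothetical popularity scores.
df = pd.DataFrame({"popularity": [85, 40]})

# Element-wise multiplication; the original popularity column is left unchanged.
df["popularity_score"] = df["popularity"] * 0.01
```

The results are floating-point numbers, so they may differ from the exact decimal values by a tiny rounding error.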
Create a new column is_popular that contains 1 for rows where the popularity value is greater than 70, and 0 otherwise. Convert the boolean result to integer values, where True becomes 1 and False becomes 0. This new column will indicate whether a song is popular or not, with 1 representing popular songs and 0 representing non-popular songs.
Note: The new column is added at the end of df.
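The flag can be sketched as a boolean comparison followed by astype(int). The sample values are hypothetical and include the boundary value 70 to show that the comparison is strict:

```python
import pandas as pd

# Hypothetical popularity scores, including the boundary value 70.
df = pd.DataFrame({"popularity": [85, 70, 40]})

# The comparison yields True/False; astype(int) turns that into 1/0.
df["is_popular"] = (df["popularity"] > 70).astype(int)
```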
Calculate the number of artists for each row by counting the number of commas in the artists column and adding 1, then store the result in a new column named artist_count.
Note: The new column is added at the end of df.
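A possible sketch using the string accessor's count() method. It assumes each artists value is a single string with artists separated by commas; the list-like format and the names below are hypothetical:

```python
import pandas as pd

# Hypothetical artists strings; the exact format in the dataset is assumed.
df = pd.DataFrame({
    "artists": [
        "['Artist A', 'Artist B']",
        "['Artist C']",
        "['Artist D', 'Artist E', 'Artist F']",
    ]
})

# n commas separate n + 1 artists.
df["artist_count"] = df["artists"].str.count(",") + 1
```

This follows the activity's counting rule literally, so an artist name that itself contains a comma would be counted as two artists.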
Convert the duration_seconds column from seconds to minutes and store the result in a new column named duration_minutes.
Note: The new column is added at the end of df.
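As a sketch, the minutes column is the seconds column divided by 60 (sample values are hypothetical):

```python
import pandas as pd

# Hypothetical durations in seconds.
df = pd.DataFrame({"duration_seconds": [210.0, 90.0]})

# 60 seconds per minute.
df["duration_minutes"] = df["duration_seconds"] / 60
```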
Increase the values in the popularity column by adding 10 to each value.
Reduce the values in the speech_presence column by multiplying them by 0.8.
Note: Use df.head().T to view your df. df.head().T provides a compact way to view the initial rows as columns, making it easier to scan the data horizontally.
Decrease the values in the dance_score column by subtracting 0.1 from each value.
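The three in-place adjustments above (popularity, speech_presence, and dance_score) share one pattern: compute the new values and assign them back to the same column. A minimal sketch with hypothetical sample values:

```python
import pandas as pd

# Hypothetical sample values for the three columns being adjusted.
df = pd.DataFrame({
    "popularity": [85, 40],
    "speech_presence": [0.05, 0.50],
    "dance_score": [0.6, 0.9],
})

# Assigning back to the same column name overwrites the existing values.
df["popularity"] = df["popularity"] + 10
df["speech_presence"] = df["speech_presence"] * 0.8
df["dance_score"] = df["dance_score"] - 0.1
```

Unlike the earlier activities, no new column is created here, so the column order of df stays the same.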
Replace the numerical values in the mode column with textual representations, where 0 is replaced with 'Minor' and 1 is replaced with 'Major'.
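One way to sketch this replacement is Series.map() with a dictionary. This is an illustrative choice, not necessarily the project's own solution; the sample values are hypothetical:

```python
import pandas as pd

# Hypothetical mode values (0 or 1).
df = pd.DataFrame({"mode": [0, 1, 1]})

# map() looks up each value in the dictionary.
# Any value missing from the dictionary would become NaN.
df["mode"] = df["mode"].map({0: "Minor", 1: "Major"})
```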
Limit the maximum value in the tempo column to 150 by clipping any values above 150 to 150.
Note: Use df.head().T to view your df. df.head().T provides a compact way to view the initial rows as columns, making it easier to scan the data horizontally.
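The upper limit on tempo can be sketched with Series.clip() and only the upper argument set (sample tempos are hypothetical):

```python
import pandas as pd

# Hypothetical tempo values in beats per minute.
df = pd.DataFrame({"tempo": [120.0, 180.5, 150.0]})

# Values above 150 become 150; everything else is left untouched.
df["tempo"] = df["tempo"].clip(upper=150)
```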
Replace the numerical values in the key column with their corresponding note names. The mappings are:
0 → 'C', 1 → 'C#', 2 → 'D', 3 → 'D#', 4 → 'E', 5 → 'F', 6 → 'F#', 7 → 'G', 8 → 'G#', 9 → 'A', 10 → 'A#', 11 → 'B'
Replace the numerical values in the explicit column with textual representations, where 0 is replaced with 'Not Explicit' and 1 is replaced with 'Explicit'.
Note: Use df.head().T to view your df. df.head().T provides a compact way to view the initial rows as columns, making it easier to scan the data horizontally.
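The key and explicit replacements above can both be sketched with Series.map(). The note_names list and key_names dictionary are hypothetical helper names, and the sample values are made up:

```python
import pandas as pd

# Hypothetical key (0-11) and explicit (0/1) values.
df = pd.DataFrame({"key": [0, 1, 11], "explicit": [0, 1, 0]})

# Position in the list matches the numeric key: 0 -> 'C', ..., 11 -> 'B'.
note_names = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
key_names = dict(enumerate(note_names))

df["key"] = df["key"].map(key_names)
df["explicit"] = df["explicit"].map({0: "Not Explicit", 1: "Explicit"})
```

Building the dictionary with enumerate() avoids typing all twelve pairs by hand, but writing the mapping out explicitly works equally well.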
For rows where the year value is less than 1950, replace the year value with 1950.
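A conditional replacement like this can be sketched with a boolean mask and .loc (the sample years are hypothetical):

```python
import pandas as pd

# Hypothetical release years.
df = pd.DataFrame({"year": [1921, 1950, 2005]})

# Select only the rows with year < 1950 and overwrite their year value.
df.loc[df["year"] < 1950, "year"] = 1950
```

The same result could be obtained with df["year"].clip(lower=1950), since the task amounts to setting a lower bound.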
Limit the tempo column values between 50 and 150. For values exceeding 150, replace them with 150, and for values below 50, replace them with 50.
Note: Use df.head().T to view your df. df.head().T provides a compact way to view the initial rows as columns, making it easier to scan the data horizontally.
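Bounding tempo on both sides can be sketched with Series.clip() using both lower and upper (sample tempos are hypothetical):

```python
import pandas as pd

# Hypothetical tempo values, one below and one above the allowed range.
df = pd.DataFrame({"tempo": [30.0, 120.0, 180.5]})

# Values below 50 become 50, values above 150 become 150.
df["tempo"] = df["tempo"].clip(lower=50, upper=150)
```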