Intro to Pandas DataFrames

In this project you'll learn about the most important structure in pandas: the DataFrame, a tabular structure that is the cornerstone of pandas for data analysis. You'll also learn about its index and the relationship between its columns and Series, which was highlighted earlier.

Project Activities

All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.

All our activities include solutions with explanations on how they work and why we chose them.

Output the first four rows of the df using the `head` function

Now it's your turn! Try outputting the first four rows of the DataFrame using the head function. Store the result in the variable head_first_4.
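As a reminder of how `head` behaves, here is a minimal sketch on a small made-up DataFrame. The name `toy_df`, the company names, and all values are hypothetical placeholders, not the project's actual `df`.

```python
import pandas as pd

# Hypothetical toy data; the project's real df has different contents.
toy_df = pd.DataFrame(
    {
        "Revenue": [100, 200, 300, 400, 500, 600, 700],
        "Employees": [10, 20, 30, 40, 50, 60, 70],
    },
    index=["Acme", "Globex", "Initech", "Umbrella", "Hooli", "Stark", "Wayne"],
)

# head(n) returns the first n rows as a new DataFrame (n defaults to 5).
first_two = toy_df.head(2)
print(first_two)
```

The same call with a different `n` gives a longer or shorter preview.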

Output the last six rows of the df using the tail function

Now try outputting the last six rows of the DataFrame using the tail function. Store the result in the variable tail_last_6.
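`tail` mirrors `head` from the other end of the table. A short sketch, again on an invented DataFrame (`toy_df` and its values are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical toy data, not the project's dataset.
toy_df = pd.DataFrame(
    {"Revenue": [100, 200, 300, 400, 500]},
    index=["Acme", "Globex", "Initech", "Umbrella", "Hooli"],
)

# tail(n) returns the last n rows, keeping their original order.
last_three = toy_df.tail(3)
print(last_three)
```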

Select the column Employees

Let's practice these skills. Select the column Employees and store it in the variable employees_s. You'll notice that the result of this selection is a Series.
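The column-to-Series relationship can be checked directly. The sketch below uses a made-up DataFrame (`toy_df`, its index labels, and its values are hypothetical) and selects a different column than the one in the activity:

```python
import pandas as pd

# Hypothetical toy data, not the project's dataset.
toy_df = pd.DataFrame(
    {"Revenue": [100, 200, 300], "Employees": [10, 20, 30]},
    index=["Acme", "Globex", "Initech"],
)

# Selecting a single column with square brackets returns a Series.
revenue_s = toy_df["Revenue"]

print(type(revenue_s))   # the Series class
print(revenue_s.name)    # the Series keeps the column name
print(revenue_s.index)   # and shares the DataFrame's index
```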

Output the median Employees to the nearest whole number

Now, take it one step further and find the median of the column Employees, rounded to the nearest whole number. Store the result in the variable employees_median.
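One possible way to compute a rounded median of a single column is sketched below. The Series name `headcount` and its values are invented for illustration, and the activity's validator may expect a slightly different form (for example, a float rather than an int).

```python
import pandas as pd

# Hypothetical values, not the project's dataset.
headcount = pd.Series(
    [10.0, 25.2, 30.9, 48.0],
    index=["Acme", "Globex", "Initech", "Umbrella"],
    name="Headcount",
)

# median() on a Series collapses it to one number;
# with an even count it averages the two middle values.
raw_median = headcount.median()

# round() then brings it to the nearest whole number.
rounded_median = int(round(raw_median))
print(raw_median, rounded_median)
```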

Calculate the mean for columns Revenue and Employees

Lastly, let's calculate the mean for the columns Revenue and Employees. Store the result in the variable r_e_mean.

Your result should be a Series, and it should look something like:

Revenue      XXX
Employees    YYY
dtype: float64
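To see why the result is a Series, consider this sketch on a made-up DataFrame (`toy_df` and all of its values are hypothetical). Selecting with a list of column names gives a DataFrame, and `mean` then reduces each column to one value:

```python
import pandas as pd

# Hypothetical toy data, not the project's dataset.
toy_df = pd.DataFrame(
    {
        "Revenue": [100, 200, 300, 400],
        "Employees": [10, 20, 30, 40],
        "Sector": ["A", "B", "A", "B"],
    },
    index=["Acme", "Globex", "Initech", "Umbrella"],
)

# A list of column names selects a sub-DataFrame;
# mean() returns one value per column, indexed by column name.
col_means = toy_df[["Revenue", "Employees"]].mean()
print(col_means)
```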

Select the Revenue, Employees & Sector for the companies Apple, Alphabet and Microsoft

Now let's leverage your .loc selection skills. Your task is to select the columns Revenue, Employees & Sector for the companies Apple, Alphabet and Microsoft. Your result should be stored in a variable index_selection and it should be a DataFrame looking something like:

           Revenue  Employees                Sector
Apple       274515     147000  Consumer Electronics
Alphabet    182527     135301     Software Services
Microsoft   143015     163000     Software Services
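The general shape of a label-based selection with `.loc` is sketched below. It assumes the company names form the DataFrame's index, as the sample output above suggests; `toy_df`, the company names, and the values are hypothetical.

```python
import pandas as pd

# Hypothetical toy data, not the project's dataset.
toy_df = pd.DataFrame(
    {
        "Revenue": [100, 200, 300, 400],
        "Employees": [10, 20, 30, 40],
        "Sector": ["Retail", "Energy", "Software", "Pharma"],
    },
    index=["Acme", "Globex", "Initech", "Umbrella"],
)

# .loc takes row labels first, then column labels.
# Passing lists for both returns a DataFrame in the requested order.
label_subset = toy_df.loc[["Initech", "Acme"], ["Revenue", "Sector"]]
print(label_subset)
```

Rows come back in the order the labels were listed, not in their original order.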

Using Position Selection, select the Revenue, Employees & Sector for the companies Apple, Alphabet and Microsoft

Now it's time to put your iloc skills into practice. Your task is to select the companies in the 2nd, 4th and 6th positions, and the columns in the 1st, 2nd and last positions. Store your result in the variable position_selection.
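Because `.iloc` counts from zero, ordinal positions need shifting by one: the 2nd row is position 1, and the last column can be written as -1. A minimal sketch on made-up data (`toy_df` and its contents are hypothetical, and the positions differ from the activity's):

```python
import pandas as pd

# Hypothetical toy data, not the project's dataset.
toy_df = pd.DataFrame(
    {
        "Revenue": [100, 200, 300, 400],
        "Employees": [10, 20, 30, 40],
        "Sector": ["Retail", "Energy", "Software", "Pharma"],
    },
    index=["Acme", "Globex", "Initech", "Umbrella"],
)

# Rows in the 1st and 3rd positions -> integer positions 0 and 2.
# Columns: the 1st (position 0) and the last (position -1).
pos_subset = toy_df.iloc[[0, 2], [0, -1]]
print(pos_subset)
```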

Author

Maria Durkin

This project is part of

Intro to Pandas for Data Analysis