Intro to Pandas for Data Analysis

Electric vehicles are taking center stage! This project explores a dataset of EVs registered across various locations. Using the sorting and filtering tools in Python's pandas library, we'll analyze the distribution, range, and key characteristics of these vehicles to shed light on the evolving landscape of electric transportation.

input

How many distinct car models are included in this dataset?
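A minimal sketch of counting distinct values with `Series.nunique()`. The toy rows below are invented for illustration and are not taken from the project dataset.

```python
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame({
    "Make": ["TESLA", "TESLA", "NISSAN", "TESLA"],
    "Model": ["MODEL 3", "MODEL Y", "LEAF", "MODEL 3"],
})

# nunique() counts the distinct non-null values in a column.
distinct_models = df["Model"].nunique()
print(distinct_models)
```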

input

I'm curious, how many miles can the car with the longest range go on a single charge based on the data you have?

input

Can you identify the `Make` and `Model` of the car in the data with the earliest model year?

Your final answer should be the make and model of the oldest car, like so: `BMW i3`
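One possible approach, sketched on invented toy data: `idxmin()` returns the index label of the smallest `Model Year`, and `.loc` retrieves that row. On ties, `idxmin()` returns the first matching row.

```python
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame({
    "Make": ["TESLA", "NISSAN", "KIA"],
    "Model": ["MODEL 3", "LEAF", "NIRO"],
    "Model Year": [2019, 2011, 2023],
})

# Locate the row with the earliest model year.
oldest = df.loc[df["Model Year"].idxmin()]

# Combine make and model into a single answer string.
answer = f"{oldest['Make']} {oldest['Model']}"
print(answer)
```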

codevalidated

Sort the DataFrame by `Electric Range` in descending order and select the top five rows.
Save your result in the variable: `top_5_range_vehicles`.
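A short sketch of the sort-then-select pattern, using invented toy rows rather than the project dataset:

```python
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame({
    "Model": ["A", "B", "C", "D", "E", "F", "G"],
    "Electric Range": [215, 84, 330, 259, 25, 291, 238],
})

# Sort from longest to shortest range, then keep the first five rows.
top_5_range_vehicles = df.sort_values("Electric Range", ascending=False).head(5)
print(top_5_range_vehicles)
```

`df.nlargest(5, "Electric Range")` is an alternative that avoids sorting the whole frame.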

input

In this question, you will filter the DataFrame into electric and hybrid vehicles, then calculate the average base MSRP separately for each type, and then compare the results. This analysis helps understand the pricing dynamics between these two types of eco-friendly vehicles.

Your final answer should be either `Electric Vehicles` or `Hybrid Vehicles`.
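The filter-and-average step might look like the sketch below. The column names `Electric Vehicle Type` and `Base MSRP`, the two type labels, and every row of data are assumptions made for illustration; check them against the actual dataset.

```python
import pandas as pd

# Toy data, invented for illustration; column names and labels are assumptions.
df = pd.DataFrame({
    "Electric Vehicle Type": [
        "Battery Electric Vehicle (BEV)",
        "Plug-in Hybrid Electric Vehicle (PHEV)",
        "Battery Electric Vehicle (BEV)",
        "Plug-in Hybrid Electric Vehicle (PHEV)",
    ],
    "Base MSRP": [40000, 30000, 60000, 35000],
})

# Boolean masks split the frame into the two vehicle types.
bev = df[df["Electric Vehicle Type"] == "Battery Electric Vehicle (BEV)"]
phev = df[df["Electric Vehicle Type"] == "Plug-in Hybrid Electric Vehicle (PHEV)"]

# Average base MSRP for each subset.
bev_avg = bev["Base MSRP"].mean()
phev_avg = phev["Base MSRP"].mean()
print(bev_avg, phev_avg)
```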

codevalidated

Filter the DataFrame to include only the rows where vehicles are eligible for CAFV status, then compare their distribution across different makes using the `.value_counts()` method.

Save your final result in the variable: `make_counts`
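A hedged sketch of filtering and then counting by make. The eligibility column name and the label `Clean Alternative Fuel Vehicle Eligible` are assumptions, and the rows are invented toy data.

```python
import pandas as pd

# Assumed column name; verify it against the real dataset.
CAFV_COL = "Clean Alternative Fuel Vehicle (CAFV) Eligibility"

# Toy data, invented for illustration only.
df = pd.DataFrame({
    "Make": ["TESLA", "TESLA", "NISSAN", "BMW", "TESLA"],
    CAFV_COL: [
        "Clean Alternative Fuel Vehicle Eligible",
        "Clean Alternative Fuel Vehicle Eligible",
        "Clean Alternative Fuel Vehicle Eligible",
        "Not eligible due to low battery range",
        "Not eligible due to low battery range",
    ],
})

# Keep only eligible vehicles, then count rows per make (sorted descending).
eligible = df[df[CAFV_COL] == "Clean Alternative Fuel Vehicle Eligible"]
make_counts = eligible["Make"].value_counts()
print(make_counts)
```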

input

Filter the dataset to include only Electric Vehicles, then calculate the average electric range for each vehicle make and print the result. Your answer should be written in all caps, like so: `DOG`.

input

Across the counties included in the dataset, where are electric vehicles most concentrated?

codevalidated

Reset the index of your final output and store it in the variable: `state_top_5`

codevalidated

The `Vehicle location` column contains geographical information about the vehicles. Simplify future analysis and integration by splitting this column. After splitting, ensure the data is of the correct type.
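A sketch of one way to split such a column, assuming its values are strings of the form `POINT (longitude latitude)`. That format, the new column names `Longitude` and `Latitude`, and the sample coordinates are all assumptions; adapt the pattern to whatever the column actually holds.

```python
import pandas as pd

# Toy data, invented for illustration; the POINT (lon lat) format is an assumption.
df = pd.DataFrame({
    "Vehicle location": ["POINT (-122.33 47.61)", "POINT (-120.5 46.6)"],
})

# Named capture groups become the column names of the extracted frame.
pattern = r"POINT \((?P<Longitude>-?[\d.]+) (?P<Latitude>-?[\d.]+)\)"
coords = df["Vehicle location"].str.extract(pattern)

# The extracted pieces are strings, so convert them to floats.
df["Longitude"] = coords["Longitude"].astype(float)
df["Latitude"] = coords["Latitude"].astype(float)
print(df.dtypes)
```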

input

Filter the DataFrame to include only Plug-in Hybrid Electric Vehicles (PHEVs). Then count the occurrences of each unique city. Finally, find the city name (index) with the highest count, revealing the city with the most PHEVs.
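The filter, count, and look-up steps could be sketched as follows. The `Electric Vehicle Type` column name and its labels are assumptions, and the rows are invented toy data.

```python
import pandas as pd

# Toy data, invented for illustration; column name and labels are assumptions.
PHEV = "Plug-in Hybrid Electric Vehicle (PHEV)"
BEV = "Battery Electric Vehicle (BEV)"
df = pd.DataFrame({
    "Electric Vehicle Type": [PHEV, PHEV, PHEV, BEV, BEV, BEV],
    "City": ["SEATTLE", "SEATTLE", "BELLEVUE", "BELLEVUE", "BELLEVUE", "TACOMA"],
})

# Keep only PHEV rows, then count how often each city appears.
phev_city_counts = df[df["Electric Vehicle Type"] == PHEV]["City"].value_counts()

# idxmax() returns the index label (the city name) with the largest count.
top_city = phev_city_counts.idxmax()
print(top_city)
```

The same counts can be passed to `.idxmin()` to get the city with the fewest vehicles instead.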

input

First, filter the dataset to include only Electric Vehicles. Then, count the occurrences of each city in this filtered subset. Using the `.idxmin()` method, determine the city with the lowest count of electric vehicles.

input

Sort the dataset by Postal Code and determine which specific zip code has the highest density of Electric Vehicles.
Your final answer should be an integer, like `5985`.

codevalidated

Filter the DataFrame for vehicles with low electric range (below 50 miles) and see their distribution across cities. Store your final result in the variable: `cities_distribution`.

input

Group by City and see if any city has a particularly high number of Teslas.

codevalidated

Electric vehicles are becoming an increasingly popular choice for drivers. This analysis aims to explore this trend.
Filter the DataFrame for electric vehicles and compute the percentage of production for each year. Save your final output in the variable: `electric_vehicle_percentage_by_year`.
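One reading of "percentage of production for each year" is the share of the filtered vehicles that falls in each model year. The sketch below follows that interpretation, which is an assumption, as are the `Electric Vehicle Type` column name, its labels, and the toy rows.

```python
import pandas as pd

# Toy data, invented for illustration; column name and labels are assumptions.
BEV = "Battery Electric Vehicle (BEV)"
PHEV = "Plug-in Hybrid Electric Vehicle (PHEV)"
df = pd.DataFrame({
    "Electric Vehicle Type": [BEV, BEV, BEV, BEV, PHEV],
    "Model Year": [2020, 2021, 2021, 2022, 2020],
})

electric = df[df["Electric Vehicle Type"] == BEV]

# normalize=True returns fractions that sum to 1; scale to percentages
# and order the result chronologically by model year.
electric_vehicle_percentage_by_year = (
    electric["Model Year"].value_counts(normalize=True).mul(100).sort_index()
)
print(electric_vehicle_percentage_by_year)
```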

input

Filter the DataFrame for models produced after 2023 and calculate which vehicle type has the highest electric range, Electric (BEV) or Hybrid (PHEV).
Your answer should be `BEV` or `PHEV`.

codevalidated

Filter the DataFrame for 2024 vehicles with `Not eligible due to low battery range` and showcase their Make and Model. Store your final result in the variable: `make_and_model`.

codevalidated

Sort the DataFrame by the `Model Year` column to organize records chronologically. Next, concatenate the `Make` and `Model` columns to create a combined identifier. Then, compute the frequency of each unique `Make` and `Model` combination in the dataset to determine which EV models are most commonly represented. Store your final result in the variable: `make_model_counts`.
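A minimal sketch of the sort, concatenate, and count steps on invented toy rows. Joining the two columns with a space is an assumption about the desired identifier format.

```python
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame({
    "Make": ["TESLA", "NISSAN", "TESLA", "KIA"],
    "Model": ["MODEL 3", "LEAF", "MODEL 3", "NIRO"],
    "Model Year": [2022, 2015, 2020, 2023],
})

# Order records chronologically.
df_sorted = df.sort_values("Model Year")

# String columns concatenate element-wise with "+".
make_model = df_sorted["Make"] + " " + df_sorted["Model"]

# Count each combination; value_counts sorts from most to least frequent.
make_model_counts = make_model.value_counts()
print(make_model_counts)
```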

codevalidated

Spot Early Electric Vehicle Adopters.

Filter the DataFrame to include only vehicles manufactured before 2016. From this subset, determine which counties have the highest concentration of these early electric vehicles by counting the number of vehicles per county and sorting them in descending order based on vehicle count. Reset the index of the resulting DataFrame and store it in `sorted_concentration`.
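The steps above can be sketched with `groupby` and `size`. The `County` column name, the `Vehicle Count` output column name, and the toy rows are assumptions made for illustration.

```python
import pandas as pd

# Toy data, invented for illustration; "County" is an assumed column name.
df = pd.DataFrame({
    "County": ["King", "King", "Pierce", "King", "Snohomish"],
    "Model Year": [2013, 2015, 2014, 2020, 2018],
})

# Keep vehicles with a model year before 2016.
early = df[df["Model Year"] < 2016]

# Count vehicles per county, sort descending, and turn the result
# into a DataFrame with a fresh integer index.
sorted_concentration = (
    early.groupby("County")
    .size()
    .sort_values(ascending=False)
    .reset_index(name="Vehicle Count")
)
print(sorted_concentration)
```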

codevalidated

Sort the DataFrame by `Electric Range` in descending order, then filter it to include only models with over 200 miles of electric range. From this subset, calculate the distribution of these high-mileage EVs across different cities by counting their occurrences. Store the resulting distribution in the variable: `city_distribution`.
