Intro to Pandas for Data Analysis

Mersenne primes are a specific type of prime number that is important in computing, number theory, and cryptography. A Mersenne prime is any prime number that can also be written in the form `2^n - 1` for some positive integer `n`.
In this project, you'll need to use Pandas DataFrame vectorized operations and filtering to detect which of the first 1M prime numbers are also Mersenne primes.
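To make the definition concrete, here is a small plain-Python sketch that lists `2^n - 1` for `n` from 1 to 7 and keeps the values that are prime. The helper `is_prime` is a hypothetical name introduced only for this illustration; it is not part of the project.

```python
def is_prime(k: int) -> bool:
    """Trial division; good enough for the small values used here."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True


# Candidates of the form 2^n - 1 for n = 1..7
candidates = {n: 2**n - 1 for n in range(1, 8)}

# Keep only the candidates that are actually prime
mersenne_primes = [m for m in candidates.values() if is_prime(m)]

print(candidates)
print(mersenne_primes)
```

Not every number of the form `2^n - 1` is prime (for example, `2^4 - 1 = 15`), which is why the project checks the candidates against a list of known primes.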

All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.

All our activities include solutions with explanations on how they work and why we chose them.


First, we need to find the first 1M prime numbers. But there's no need to waste time writing functions or waiting for computation. There are lists of prime numbers out there. Let's start by reading the first 1M prime numbers, which are stored in the file `primes1.csv`, into the variable `primes_df`. Your DataFrame should look something like:
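As a rough sketch of this step (not the project's reference solution), the snippet below reads a CSV into `primes_df` with `pd.read_csv`. The layout of `primes1.csv` is not shown here, so the column name `prime` and the in-memory stand-in file are assumptions made only so the example runs on its own.

```python
import io

import pandas as pd

# Stand-in for primes1.csv so the sketch runs anywhere. The real file's
# header is not known here; the column name "prime" is an assumption.
sample_csv = io.StringIO("prime\n2\n3\n5\n7\n11\n")

# With the actual file this would be: primes_df = pd.read_csv("primes1.csv")
primes_df = pd.read_csv(sample_csv)

print(primes_df.head())
```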


Next, generate a list of the first 100,000 sequential integers in a dataframe named `exp_df`. Just that: a single column named `n` containing all integers in the range `1` to `100_000`.

WARNING: Make sure `100_000` IS included in your dataframe.

Your dataframe should look something like:
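One possible way to build it, sketched with `np.arange` (using NumPy here is a choice made for this example, not something the project prescribes). The explicit `int64` dtype is also an assumption, added so the column type is the same on every platform.

```python
import numpy as np
import pandas as pd

# np.arange excludes its stop value, so stop at 100_001 to include 100_000.
exp_df = pd.DataFrame({"n": np.arange(1, 100_001, dtype="int64")})

print(exp_df.head())
print(exp_df.tail())
```

The same off-by-one applies to Python's built-in `range(1, 100_001)`, which would work equally well as the column source.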


We'll now start approaching the Mersenne formula `2^n - 1`.

First, create a new column named `2^n` that contains the result of exactly that expression: `2` raised to the power of each value in the column `n`. You'll need to use vectorized operations to do so.

Your dataframe will look something like:
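A minimal sketch of the vectorized power, assuming `exp_df` was built with a 64-bit integer column `n` as in the previous step. The expression `2 ** exp_df["n"]` applies the exponent to the whole column at once, with no Python loop.

```python
import numpy as np
import pandas as pd

exp_df = pd.DataFrame({"n": np.arange(1, 100_001, dtype="int64")})

# Vectorized power: 2 raised to every value of n in a single expression.
# Note: a 64-bit integer cannot hold 2^n once n reaches 63, so those rows
# wrap around silently instead of holding the true value.
exp_df["2^n"] = 2 ** exp_df["n"]

print(exp_df.head())
```

The overflow on large `n` should not affect this project: the millionth prime is 15,485,863, which is below `2^24`, so only the first couple dozen rows can ever match a value in the primes list.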


Now, create the column `2^n - 1` that completes the formula. Your dataframe should now look something like:
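Sketched under the same assumptions as before (a 64-bit integer column `n`), subtracting a scalar from a column is also a vectorized operation:

```python
import numpy as np
import pandas as pd

exp_df = pd.DataFrame({"n": np.arange(1, 100_001, dtype="int64")})
exp_df["2^n"] = 2 ** exp_df["n"]

# Subtract 1 from every row of the "2^n" column at once.
exp_df["2^n - 1"] = exp_df["2^n"] - 1

print(exp_df.head())
```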


Initialize a new column in `primes_df` named `is_mersenne_prime` that contains ONLY `False` values.

Your dataframe should look something like:
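Assigning a single scalar to a new column broadcasts it to every row. The sketch below uses a tiny stand-in for `primes_df` with an assumed column name `prime`, since the real file's layout is not shown here.

```python
import pandas as pd

# Tiny stand-in for primes_df; the column name "prime" is an assumption.
primes_df = pd.DataFrame({"prime": [2, 3, 5, 7, 11, 13]})

# A scalar assignment fills every row of the new column with False.
primes_df["is_mersenne_prime"] = False

print(primes_df)
```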


Now is the time to flip the switch! Use the values in `exp_df` to correctly identify which values in `primes_df` are ACTUALLY Mersenne primes. You'll need to set the value of `is_mersenne_prime` to `True` for those prime numbers.

**Warning!** If you make a mistake and have modified your `primes_df` dataframe, you'll need to start from scratch. So it might be handy to create a copy of your dataframe first, just as a backup. Example:

```
>>> primes_df_backup = primes_df.copy() # Execute this ONLY ONCE!
>>> # I made a mistake and the activity doesn't work, I want to go back!
>>> primes_df = primes_df_backup.copy() # We're back!
>>> # Try again
```

Your dataframe should look something like:
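One way to approach this final step, sketched with `Series.isin` and `.loc` (this is an illustration, not necessarily the project's own solution). As before, `primes_df` is a small stand-in and the column name `prime` is an assumption.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for primes_df; the column name "prime" is an assumption.
primes_df = pd.DataFrame({"prime": [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]})
primes_df["is_mersenne_prime"] = False

exp_df = pd.DataFrame({"n": np.arange(1, 100_001, dtype="int64")})
exp_df["2^n - 1"] = 2 ** exp_df["n"] - 1

# Boolean mask: True where the prime also appears among the 2^n - 1 values.
is_mersenne = primes_df["prime"].isin(exp_df["2^n - 1"])

# Flip the flag only on the matching rows.
primes_df.loc[is_mersenne, "is_mersenne_prime"] = True

print(primes_df[primes_df["is_mersenne_prime"]])
```

Using `.loc` with a boolean mask modifies `primes_df` in place, which is why keeping a backup copy beforehand can save a reload.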
