All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.
All our activities include solutions with explanations on how they work and why we chose them.
We have already read the two CSVs into the df1 and df2 variables. Now use the itertools.product function to create a dataframe df containing the Cartesian product of the two CSVs, that is, every row of df1 paired with every row of df2. The columns should be named CSV 1 and CSV 2.
As we have 266 rows in df1 and 368 in df2, the resulting df will have 97,888 rows (266 * 368), and it'll look something like:
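For instance, a minimal sketch on two small stand-in frames builds the product and prints the result. The column name `Company` and the company names are assumptions for illustration, not rows from the real CSVs:

```python
from itertools import product

import pandas as pd

# Stand-in frames; in the activity df1 and df2 are already loaded from the CSVs.
# The "Company" column name and the values are hypothetical.
df1 = pd.DataFrame({"Company": ["Acme Corp", "Globex Inc"]})
df2 = pd.DataFrame({"Company": ["ACME Corporation", "Initech", "Globex"]})

# product() yields every (left, right) pair, so len(df1) * len(df2) tuples.
pairs = list(product(df1["Company"], df2["Company"]))
df = pd.DataFrame(pairs, columns=["CSV 1", "CSV 2"])

print(df.shape)  # 2 rows * 3 rows = 6 pairs, 2 columns
print(df)
```

The same pattern should scale to the real data: 266 names paired with 368 names gives the 97,888 rows mentioned above.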

Now apply the fuzz.partial_ratio function to each pair of companies in df to calculate how similar the two names are. The score ranges from 0 to 100, with higher values meaning a closer match. Store the score in a new column named Ratio Score. It'll look similar to:
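As a rough sketch of this step, assuming `fuzz` comes from the `thefuzz` package (the renamed `fuzzywuzzy`), the score can be computed pair by pair. The toy rows are hypothetical, and the `difflib` fallback is only there so the sketch runs without `thefuzz` installed; it is not the same algorithm as `partial_ratio`:

```python
import pandas as pd

try:
    # Assumption: fuzz is provided by thefuzz (formerly fuzzywuzzy).
    from thefuzz import fuzz
    scorer = fuzz.partial_ratio
except ImportError:
    # Stand-in only: a plain difflib similarity scaled to 0-100,
    # NOT the partial_ratio algorithm.
    from difflib import SequenceMatcher

    def scorer(a, b):
        return round(100 * SequenceMatcher(None, a, b).ratio())

# Hypothetical pairs standing in for the 97,888-row product frame.
df = pd.DataFrame(
    [
        ("Acme Corp", "Acme Corp"),
        ("Acme Corp", "Initech"),
        ("Globex Inc", "Globex"),
    ],
    columns=["CSV 1", "CSV 2"],
)

# Score each (CSV 1, CSV 2) pair and store the result in a new column.
df["Ratio Score"] = [scorer(a, b) for a, b in zip(df["CSV 1"], df["CSV 2"])]

print(df)
```

A list comprehension over `zip` is used here instead of `df.apply(..., axis=1)` because it tends to be noticeably faster on a frame with tens of thousands of rows; either form should produce the same column.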

We saw that CSV 1 contains the company AECOM. What is the corresponding value in CSV 2?
CSV 1 contains Starbucks. What is the corresponding name in CSV 2?
CSV 1 contains Pinnacle West Capital Corporation. Is there a match in CSV 2?
CSV 1 contains County of Los Angeles Deferred Compensation Program. How many matching companies appear to be in CSV 2?
CSV 1 contains The Queens Health Systems. Is there a match in CSV 2?
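One possible way to explore questions like these is to filter df on a single CSV 1 company and sort its candidates by Ratio Score. The sketch below uses a hypothetical helper `top_matches` and invented rows and scores, so it does not reveal or reflect the real answers:

```python
import pandas as pd

# Toy scored frame; names and scores are invented placeholders,
# not real partial_ratio output.
df = pd.DataFrame(
    {
        "CSV 1": ["Acme Corp", "Acme Corp", "Acme Corp", "Globex Inc"],
        "CSV 2": ["ACME Corporation", "Initech", "Acme Co.", "Globex"],
        "Ratio Score": [78, 22, 88, 100],
    }
)


def top_matches(scored, company, min_score=0, n=5):
    """Return up to n highest-scoring CSV 2 candidates for one CSV 1 company."""
    mask = (scored["CSV 1"] == company) & (scored["Ratio Score"] >= min_score)
    return scored[mask].sort_values("Ratio Score", ascending=False).head(n)


# Best candidates for one company, highest score first.
print(top_matches(df, "Acme Corp"))

# Counting candidates above a cutoff; the value 75 is an arbitrary assumption.
print(len(top_matches(df, "Acme Corp", min_score=75)))
```

Any cutoff is a judgment call: a high score flags a likely match, but the candidate rows still need to be read by eye before deciding whether two names refer to the same company.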