AI Quiz 2: 15 NumPy & Pandas Data Cleaning Questions

This quiz tests your knowledge of data cleaning with NumPy and Pandas. It focuses on handling missing values, vectorized operations, and DataFrame manipulation, which are core preprocessing techniques in data science.

It is aimed at Python developers who want to improve their data science skills or check their grasp of data cleaning techniques.

1. Why do we need to clean data?

2. Which of the following is an example of wrong data?

3. True or False: Duplicate rows in a DataFrame can cause misleading results.

4. Fill in the blank: Replace missing values (NaN) with 0 in a NumPy array.

import numpy as np
arr = np.array([1, 2, np.nan, 4])
arr[np.isnan(arr)] = ____

5. What does vectorized math mean in NumPy? Give one example.

6. What does dtype=str do when creating a NumPy array?

7. What will the following code output?

import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr + 5)

8. What does np.empty_like(original) create?

9. What is a DataFrame in Pandas?

10. Fill in the missing code: Create a DataFrame from a dictionary.

import pandas as pd
data = {
    "Name": ["Alice", "Bob"],
    "Age": [12, 15]
}
df = pd.____(data)

11. Explain what this line of code does: df["Score"] = df["Score"].fillna(df["Score"].mean())

12. How can we replace missing author names in a DataFrame with "Unknown"?

13. True or False: Pandas automatically ignores NaN values when calculating the mean.

14. Which of the following best describes the advantage of Pandas over NumPy for data cleaning?

15. You scraped this dataset:

Name     Score
Alice    95
Bob      None
Charlie  88
David    None

Write Pandas code to replace missing scores with the average score.
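
To try question 15 yourself, you first need the table loaded into Pandas. The snippet below is one possible starting point, not part of the quiz itself: it assumes the scraped rows arrive as comma-separated text (the `raw` string is an illustrative stand-in for the scraper's output) and that the literal text "None" marks a missing score. The replacement step is left for you to write.

```python
import io

import pandas as pd

# Illustrative stand-in for the scraped text (assumed to be comma-separated).
raw = "Name,Score\nAlice,95\nBob,None\nCharlie,88\nDavid,None\n"

# Treat the literal string "None" as a missing value so Score loads as numeric.
df = pd.read_csv(io.StringIO(raw), na_values=["None"])

print(df)
print(df["Score"].isna().sum())  # how many scores are missing

# Your answer to question 15 goes here.
```

Marking "None" as missing at load time means the Score column is numeric from the start, so no separate conversion is needed before computing an average.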