Chapter 3 Exercises: Python Fundamentals I — Variables, Data Types, and Expressions
How to use these exercises: Work through the sections in order. Parts A and B check your understanding of concepts and basic skills. Part C is all about debugging — finding and fixing broken code. Part D applies your skills to realistic data scenarios. Part E pushes you to combine ideas. Part M mixes in review questions from Chapters 1 and 2.
For every "predict the output" question: Write your prediction before running the code. The learning happens in the predicting, not the running.
Difficulty key: ⭐ Foundational | ⭐⭐ Intermediate | ⭐⭐⭐ Advanced | ⭐⭐⭐⭐ Extension
Part A: Conceptual Understanding ⭐
These questions check whether you've internalized the core ideas. Try to answer from memory before checking the chapter.
Exercise 3.1 — Variables as labels
The chapter introduced a threshold concept: variables are labels pointing to values, not boxes containing values. In your own words, explain the difference. Then explain what happens in memory when you execute these two lines:
a = 100
b = a
How many copies of the number 100 exist? How many labels point to it?
Guidance
One copy of `100` exists in memory. Two labels (`a` and `b`) point to it. If variables were boxes, `b = a` would create a copy of 100 and put it in a second box. But since variables are labels, `b = a` just sticks a second label on the same value. For simple types like integers, this distinction doesn't have practical consequences yet — but it becomes critical when you work with lists and dictionaries in [Chapter 5](../chapter-05-data-structures/index.md), where multiple labels pointing to the same object means changes through one label are visible through the other.
Exercise 3.2 — Type identification
Without running any code, state the type (int, float, str, or bool) of each value:
| Value | Type |
|---|---|
| `42` | |
| `42.0` | |
| `"42"` | |
| `True` | |
| `0` | |
| `""` | |
| `3.14` | |
| `"False"` | |
Answers
| Value | Type |
|-------|------|
| `42` | `int` |
| `42.0` | `float` |
| `"42"` | `str` |
| `True` | `bool` |
| `0` | `int` |
| `""` | `str` (empty string, but still a string) |
| `3.14` | `float` |
| `"False"` | `str` (it's in quotes — a string that happens to spell a boolean keyword) |

The tricky ones: `42.0` is a float (the decimal point makes it so, even though it's a whole number). `"False"` is a string, not a boolean — the quotes make it text. And `""` is a string (an empty one), not nothing.
Exercise 3.3 — Operator precedence
Write the result of each expression. Show your work by indicating which operation Python performs first.
1. `2 + 3 * 4`
2. `(2 + 3) * 4`
3. `10 - 6 / 2`
4. `2 ** 3 + 1`
5. `15 // 4 + 15 % 4`
6. `10 > 5 and 3 + 2 == 5`
Answers
1. `2 + (3 * 4)` = `2 + 12` = `14` — multiplication before addition
2. `(2 + 3) * 4` = `5 * 4` = `20` — parentheses first
3. `10 - (6 / 2)` = `10 - 3.0` = `7.0` — division before subtraction; note the result is a float because `/` returns float
4. `(2 ** 3) + 1` = `8 + 1` = `9` — exponentiation before addition
5. `(15 // 4) + (15 % 4)` = `3 + 3` = `6` — floor division and modulo have the same precedence as multiplication/division, evaluated left to right, but they're independent here
6. `(10 > 5) and ((3 + 2) == 5)` = `True and (5 == 5)` = `True and True` = `True` — arithmetic first, then comparisons, then `and`
Exercise 3.4 — Assignment vs. comparison
Explain the difference between = and == in Python. For each of the following, state whether it's assignment or comparison, and what the result is:
1. `x = 10`
2. `x == 10`
3. `y = x`
4. `y == x`
Guidance
1. **Assignment.** Gives the name `x` the value `10`. No output.
2. **Comparison.** Returns `True` if `x` equals `10`, `False` otherwise. (After line 1, this would return `True`.)
3. **Assignment.** Gives the name `y` the same value that `x` points to.
4. **Comparison.** Returns `True` if `y` and `x` have the same value.

The key: `=` stores a value. `==` asks a question ("are these equal?").
Exercise 3.5 — Truthiness
Without running code, predict what bool() returns for each value:
1. `bool(1)`
2. `bool(0)`
3. `bool(-1)`
4. `bool("")`
5. `bool(" ")`
6. `bool("0")`
7. `bool(0.0)`
8. `bool(None)`
Answers
1. `True` — any nonzero number is truthy
2. `False` — zero is falsy
3. `True` — negative numbers are nonzero, therefore truthy
4. `False` — empty string is falsy
5. `True` — a space is a character, so the string isn't empty
6. `True` — the string "0" is not empty (it contains one character)
7. `False` — zero as a float is still falsy
8. `False` — `None` is falsy
Exercise 3.6 — Immutability of strings
What is wrong with the following code? What does the programmer probably intend, and how would you fix it?
name = "elena"
name.upper()
print(name)
Guidance
The programmer expects `name` to be `"ELENA"` after calling `.upper()`. But string methods return a *new* string — they don't modify the original. `name.upper()` creates `"ELENA"` and then it's immediately discarded because it isn't saved to any variable. The fix:
name = "elena"
name = name.upper()
print(name) # ELENA
Or, if you want to keep the original: `upper_name = name.upper()`.
Part B: Applied Skills ⭐⭐
These exercises ask you to write code. Type every answer into a Jupyter notebook and run it.
Exercise 3.7 — Variable creation
Create variables to store the following information about a dataset. Use descriptive snake_case names. Then print a formatted summary using an f-string.
- The dataset is called "Global Health Observatory"
- It was last updated in 2023
- It has 1,284 rows
- It covers 195 countries
- The average life expectancy across all countries is 73.4 years
Your f-string output should look something like:
Dataset: Global Health Observatory (updated 2023)
Rows: 1,284 | Countries: 195
Average life expectancy: 73.4 years
Guidance
dataset_name = "Global Health Observatory"
last_updated = 2023
row_count = 1284
country_count = 195
avg_life_expectancy = 73.4
print(f"Dataset: {dataset_name} (updated {last_updated})")
print(f"Rows: {row_count:,} | Countries: {country_count}")
print(f"Average life expectancy: {avg_life_expectancy} years")
Note the `:,` format specifier in `{row_count:,}` to add the comma separator.
Exercise 3.8 — Arithmetic with data
A basketball player attempted 82 three-point shots and made 31 of them. Write code to:
- Store the attempts and makes in variables
- Calculate the three-point shooting percentage (makes / attempts)
- Print the result as a percentage with one decimal place using an f-string
Expected output: Three-point percentage: 37.8%
Guidance
three_pt_attempts = 82
three_pt_makes = 31
three_pt_pct = three_pt_makes / three_pt_attempts * 100
print(f"Three-point percentage: {three_pt_pct:.1f}%")
The `:.1f` format specifier means "one decimal place, float format."
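As an alternative sketch (not necessarily the approach the chapter teaches), Python's format mini-language also has a `%` presentation type that multiplies by 100 and appends the percent sign for you. The names `three_pt_ratio` and `pct_text` are introduced here for illustration only.

```python
three_pt_attempts = 82
three_pt_makes = 31

# Keep the raw ratio (about 0.378) and let the format specifier scale it.
three_pt_ratio = three_pt_makes / three_pt_attempts

# ".1%" multiplies by 100, rounds to one decimal place, and adds "%".
pct_text = f"{three_pt_ratio:.1%}"
print(f"Three-point percentage: {pct_text}")  # Three-point percentage: 37.8%
```

Storing the ratio rather than the scaled percentage can be convenient if the value is reused in later calculations.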
Exercise 3.9 — String methods practice
Given the following messy data values (simulating what you might read from a file), clean each one using string methods:
country_raw = " United States "
temp_raw = "98.6 degrees"
code_raw = "us"
- Remove the extra whitespace from `country_raw`
- Extract just the number part from `temp_raw` (hint: use `.split()` and indexing)
- Convert `code_raw` to uppercase
Guidance
country_raw = " United States "
temp_raw = "98.6 degrees"
code_raw = "us"
country_clean = country_raw.strip()
temp_number = temp_raw.split(" ")[0] # Splits into ["98.6", "degrees"], takes first
code_upper = code_raw.upper()
print(f"Country: '{country_clean}'")
print(f"Temperature: {temp_number}")
print(f"Code: {code_upper}")
Output:
Country: 'United States'
Temperature: 98.6
Code: US
Note: `temp_number` is still a string (`"98.6"`). If you wanted to do math with it, you'd need `float(temp_number)`.
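A minimal sketch of that conversion step, reusing `temp_raw` from the exercise; `temp_value` and `temp_plus_one` are names made up for this illustration.

```python
temp_raw = "98.6 degrees"

# Split on the space and keep the first piece; this is still text.
temp_number = temp_raw.split(" ")[0]

# Convert to a float before doing arithmetic.
temp_value = float(temp_number)
temp_plus_one = temp_value + 1

print(type(temp_number).__name__)  # str
print(type(temp_value).__name__)   # float
print(f"{temp_plus_one:.1f}")      # 99.6
```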
Exercise 3.10 — String slicing
A dataset uses patient IDs in the format "HOSP-YYYY-NNNNN" where HOSP is a hospital code, YYYY is the year, and NNNNN is a sequence number. Given:
patient_id = "MGH-2024-00142"
Use slicing to extract:
1. The hospital code ("MGH")
2. The year ("2024")
3. The sequence number ("00142")
4. Convert the year to an integer and add 1 to it
Guidance
patient_id = "MGH-2024-00142"
hospital = patient_id[:3]
year_str = patient_id[4:8]
sequence = patient_id[9:]
print(f"Hospital: {hospital}")
print(f"Year: {year_str}")
print(f"Sequence: {sequence}")
year_int = int(year_str) + 1
print(f"Next year: {year_int}")
Alternative approach using `.split("-")`:
parts = patient_id.split("-")
hospital = parts[0] # "MGH"
year_str = parts[1] # "2024"
sequence = parts[2] # "00142"
Exercise 3.11 — Type conversion chain
Start with the string "3.14159" and perform the following conversions, printing the result and type at each step:
- Convert to a float
- Convert the float to an int
- Convert the int back to a string
- Convert the string to a bool
What value and type do you have at each step?
Guidance
step0 = "3.14159"
print(f"Step 0: {step0} ({type(step0).__name__})")
step1 = float(step0)
print(f"Step 1: {step1} ({type(step1).__name__})")
step2 = int(step1)
print(f"Step 2: {step2} ({type(step2).__name__})")
step3 = str(step2)
print(f"Step 3: {step3} ({type(step3).__name__})")
step4 = bool(step3)
print(f"Step 4: {step4} ({type(step4).__name__})")
Output:
Step 0: 3.14159 (str)
Step 1: 3.14159 (float)
Step 2: 3 (int) ← truncated, not rounded!
Step 3: 3 (str)
Step 4: True (bool) ← "3" is a non-empty string, so it's truthy
Key insights: `int()` truncates toward zero rather than rounding (3.14159 becomes 3, and 3.99 would also become 3, not 4). And `bool("3")` is `True` because any non-empty string is truthy: even `bool("0")` and `bool("False")` are `True`!
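The two behaviors can be checked directly. The sketch below uses sample values chosen for illustration (3.99 and -3.99 are not from the exercise) to contrast truncation with rounding and to show string truthiness.

```python
# int() drops the fractional part (truncates toward zero).
truncated_positive = int(3.99)    # 3
truncated_negative = int(-3.99)   # -3

# round() goes to the nearest integer instead.
rounded_positive = round(3.99)    # 4

# Any non-empty string is truthy, whatever it spells.
zero_text_is_truthy = bool("0")        # True
false_text_is_truthy = bool("False")   # True
empty_text_is_truthy = bool("")        # False

print(truncated_positive, truncated_negative, rounded_positive)
print(zero_text_is_truthy, false_text_is_truthy, empty_text_is_truthy)
```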
Exercise 3.12 — Comparison expressions
Given the following variables, predict whether each comparison returns True or False. Then verify in Python.
a = 10
b = 3.0
c = "10"
d = True
1. `a == 10`
2. `a == c`
3. `a == int(c)`
4. `b > 2 and b < 4`
5. `a != b`
6. `d == 1`
7. `type(a) == type(c)`
Answers
1. `True` — `a` is 10, and 10 equals 10
2. `False` — `a` is an int, `c` is a string. `10 == "10"` is `False` in Python (no automatic type coercion)
3. `True` — `int("10")` is 10, and `10 == 10` is `True`
4. `True` — 3.0 is greater than 2 and less than 4
5. `True` — `10 != 3.0` is `True` (they're different numbers)
6. `True` — `True` is equal to `1` in Python (booleans are a subtype of integers: `True == 1`, `False == 0`)
7. `False` — `type(a)` is `int` and `type(c)` is `str`, so the two types are not equal
Exercise 3.13 — f-string formatting
Write f-strings that produce the following outputs, given the variables below:
population = 8045311
growth_rate = 0.02847
city = "new york"
pi = 3.14159265358979
Target outputs:
1. Population: 8,045,311
2. Growth rate: 2.85%
3. City: New York
4. Pi to 4 decimals: 3.1416
Guidance
print(f"Population: {population:,}")
print(f"Growth rate: {growth_rate * 100:.2f}%")
print(f"City: {city.title()}")
print(f"Pi to 4 decimals: {pi:.4f}")
Notes:
- `:,` adds comma separators
- `:.2f` formats as float with 2 decimal places
- `.title()` capitalizes the first letter of each word
- `:.4f` formats with 4 decimal places (and rounds the last digit)
Exercise 3.14 — Augmented assignment
What is the value of x after each line executes? Track the value step by step.
x = 10
x += 5
x *= 2
x -= 7
x //= 4
x %= 3
Answer
x = 10 → x is 10
x += 5 → x is 15 (10 + 5)
x *= 2 → x is 30 (15 * 2)
x -= 7 → x is 23 (30 - 7)
x //= 4 → x is 5 (23 // 4 = 5, remainder discarded)
x %= 3 → x is 2 (5 % 3 = 2, the remainder)
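One way to verify the trace is to write each augmented assignment in its long form. This is only an equivalent rewrite for checking your work; each line does the same thing as the operator named in its comment.

```python
x = 10
x = x + 5     # same as x += 5   -> 15
x = x * 2     # same as x *= 2   -> 30
x = x - 7     # same as x -= 7   -> 23
x = x // 4    # same as x //= 4  -> 5
x = x % 3     # same as x %= 3   -> 2
print(x)  # 2
```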
Part C: Debugging ⭐⭐
Every exercise in this section contains buggy code. Find the error, identify the error type (NameError, TypeError, SyntaxError, ValueError, or ZeroDivisionError), and fix it.
Exercise 3.15 — Debug this
city_name = "Chicago"
print(City_name)
Answer
**Error:** `NameError: name 'City_name' is not defined`
**Cause:** Python is case-sensitive. The variable was defined as `city_name` (lowercase c) but referenced as `City_name` (uppercase C).
**Fix:** `print(city_name)`
Exercise 3.16 — Debug this
score = "95"
curved_score = score + 5
print(curved_score)
Answer
**Error:** `TypeError: can only concatenate str (not "int") to str`
**Cause:** `score` is a string (`"95"`), not a number. You can't add an integer to a string.
**Fix:** `curved_score = int(score) + 5`
Exercise 3.17 — Debug this
print("The temperature is 72 degrees)
Answer
**Error:** `SyntaxError: unterminated string literal` in Python 3.10 and later (older versions report `SyntaxError: EOL while scanning string literal`)
**Cause:** Missing closing quotation mark before the closing parenthesis.
**Fix:** `print("The temperature is 72 degrees")`
Exercise 3.18 — Debug this
vaccination rate = 0.73
Answer
**Error:** `SyntaxError: invalid syntax`
**Cause:** Variable names cannot contain spaces. Python sees `vaccination` as a variable and then doesn't know what to do with `rate = 0.73`.
**Fix:** `vaccination_rate = 0.73`
Exercise 3.19 — Debug this
total = 100
average = total / 0
Answer
**Error:** `ZeroDivisionError: division by zero`
**Cause:** You can't divide by zero — not in Python, not in math. This often happens when a count variable that's supposed to be the denominator hasn't been properly populated.
**Fix:** This depends on the context. You might add a check: `if denominator != 0: average = total / denominator`. Or you might need to trace back to figure out why the denominator is zero.
Exercise 3.20 — Debug this: multiple errors
This code has three separate errors. Find and fix all of them.
Patient_count = 450
vacc_rate = "0.82"
city = seattle
result = Patient_count * vacc_rate
print(f"In {city}, approximately {result} patients were vaccinated")
Answer
**Error 1 (line 3):** `NameError: name 'seattle' is not defined`. `seattle` needs quotes: `city = "Seattle"`
**Error 2 (the `result` line):** Once the NameError is fixed, this line raises no exception at all, which makes the bug easy to miss. `vacc_rate` is the string `"0.82"`, and multiplying a string by the integer `450` repeats the string 450 times instead of doing arithmetic. (A `TypeError: can't multiply sequence by non-int of type 'float'` appears only when a string is multiplied by a float.) Fix: `vacc_rate = 0.82` (remove the quotes) or convert: `float(vacc_rate)`
**Error 3 (minor):** The variable naming is inconsistent. `Patient_count` uses different casing than the other variables. While not a Python error, convention says use `patient_count`.
Fixed code:
patient_count = 450
vacc_rate = 0.82
city = "Seattle"
result = patient_count * vacc_rate
print(f"In {city}, approximately {result:.0f} patients were vaccinated")
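A short sketch of why a numeric-looking string is risky in multiplication: Python treats `int * str` as string repetition, so no error is raised. The small count and the name `vacc_rate_text` are chosen here only to keep the output readable.

```python
patient_count = 3          # small number so the output stays readable
vacc_rate_text = "0.82"    # a rate that arrived as text

# int * str is legal in Python: it repeats the string.
repeated = patient_count * vacc_rate_text
print(repeated)  # 0.820.820.82

# Converting first gives the arithmetic result that was intended.
vaccinated = patient_count * float(vacc_rate_text)
print(f"{vaccinated:.2f}")  # 2.46
```

Because the first multiplication succeeds silently, checking the type of values read from files before doing math with them is a useful habit.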
Part D: Real-World Application ⭐⭐⭐
These exercises simulate tasks you'd encounter in actual data work.
Exercise 3.21 — BMI calculator
Body Mass Index (BMI) is calculated as weight in kilograms divided by height in meters squared. Write code to:
- Store a weight of 70 kg and a height of 1.75 m in variables
- Calculate the BMI
- Print the result formatted to one decimal place
- Create a boolean variable indicating whether the BMI is in the "normal" range (18.5 to 24.9)
Guidance
weight_kg = 70
height_m = 1.75
bmi = weight_kg / (height_m ** 2)
print(f"BMI: {bmi:.1f}")
is_normal = bmi >= 18.5 and bmi <= 24.9
print(f"Normal range: {is_normal}")
Output:
BMI: 22.9
Normal range: True
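Python also allows comparisons to be chained, which some readers find closer to the math notation. The sketch below shows both forms side by side; the variable names ending in `_with_and` and `_chained` are made up for this comparison.

```python
weight_kg = 70
height_m = 1.75
bmi = weight_kg / (height_m ** 2)

# Two comparisons joined with "and" ...
is_normal_with_and = bmi >= 18.5 and bmi <= 24.9

# ... and the equivalent chained form.
is_normal_chained = 18.5 <= bmi <= 24.9

print(is_normal_with_and, is_normal_chained)  # True True
```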
Exercise 3.22 — Temperature conversion
Write code that converts a temperature from Fahrenheit to Celsius using the formula: C = (F - 32) * 5/9. Use the temperature 98.6 F. Print the result with two decimal places. Then verify your answer by converting back to Fahrenheit: F = C * 9/5 + 32.
Guidance
temp_f = 98.6
temp_c = (temp_f - 32) * 5 / 9
print(f"{temp_f}°F = {temp_c:.2f}°C")
# Verify by converting back
verify_f = temp_c * 9 / 5 + 32
print(f"Verification: {verify_f:.2f}°F")
Output:
98.6°F = 37.00°C
Verification: 98.60°F
Exercise 3.23 — Data summary report
You have the following raw data about a survey. Write code that stores each value, performs calculations, and prints a formatted report.
- Survey name: "Public Transit Satisfaction Survey"
- Total respondents: 2,847
- Satisfied: 1,891
- Unsatisfied: 814
- No response: 142
- Survey start date: "2024-01-15"
- Survey end date: "2024-02-28"
Your report should calculate and display:
- The satisfaction rate as a percentage
- The response rate (respondents who gave an answer / total)
- The start year and month extracted from the date string
Guidance
survey_name = "Public Transit Satisfaction Survey"
total = 2847
satisfied = 1891
unsatisfied = 814
no_response = 142
start_date = "2024-01-15"
end_date = "2024-02-28"
responded = satisfied + unsatisfied
satisfaction_rate = satisfied / responded * 100
response_rate = responded / total * 100
start_year = start_date[:4]
start_month = start_date[5:7]
print(f"=== {survey_name} ===")
print(f"Total respondents: {total:,}")
print(f"Satisfaction rate: {satisfaction_rate:.1f}%")
print(f"Response rate: {response_rate:.1f}%")
print(f"Survey period: {start_year}, month {start_month}")
Exercise 3.24 — Course grade calculation
A student's grade is computed as: homework 30%, midterm 30%, final 40%. Given scores of homework=88, midterm=76, final=91, compute the weighted grade. Then determine whether the student passed (grade >= 60) and whether they earned honors (grade >= 90).
Guidance
homework = 88
midterm = 76
final = 91
grade = homework * 0.30 + midterm * 0.30 + final * 0.40
passed = grade >= 60
honors = grade >= 90
print(f"Weighted grade: {grade:.1f}")
print(f"Passed: {passed}")
print(f"Honors: {honors}")
Output:
Weighted grade: 85.6
Passed: True
Honors: False
Exercise 3.25 — Cleaning messy strings
Imagine you've read these values from a badly formatted spreadsheet. Use string methods to clean each one:
name = " dr. elena RODRIGUEZ "
email = "Elena.Rodriguez@Hospital.ORG"
phone = "555-867-5309"
department = "infectious diseases"
Transform them to produce:
- Name as title case with no extra spaces: "Dr. Elena Rodriguez"
- Email as all lowercase: "elena.rodriguez@hospital.org"
- Phone with no dashes: "5558675309"
- Department capitalized: "Infectious Diseases"
Guidance
name = " dr. elena RODRIGUEZ "
email = "Elena.Rodriguez@Hospital.ORG"
phone = "555-867-5309"
department = "infectious diseases"
name_clean = name.strip().title()
email_clean = email.lower()
phone_clean = phone.replace("-", "")
dept_clean = department.title()
print(f"Name: {name_clean}")
print(f"Email: {email_clean}")
print(f"Phone: {phone_clean}")
print(f"Department: {dept_clean}")
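Spreadsheet values sometimes also contain runs of spaces between words, which `.strip()` does not touch. The sketch below uses an invented value, `name_messy`, and combines `.split()` with `" ".join(...)`, which goes slightly beyond what this chapter covers, so treat it as a preview instead of a required technique.

```python
name_messy = "  dr.   elena    RODRIGUEZ  "  # hypothetical value with inner gaps

# .strip() only trims the ends, so the inner runs of spaces survive.
only_stripped = name_messy.strip().title()

# .split() with no argument splits on any run of whitespace;
# " ".join(...) glues the pieces back together with single spaces.
name_clean = " ".join(name_messy.split()).title()

print(f"'{only_stripped}'")  # inner gaps are still present
print(f"'{name_clean}'")     # 'Dr. Elena Rodriguez'
```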
Part E: Synthesis and Extension ⭐⭐⭐⭐
These problems require combining multiple concepts.
Exercise 3.26 — Data type detective
Without using type(), write expressions that test whether a variable contains a specific type. For example, to check if x is an integer, you could use x == int(x) — but be careful, that doesn't always work.
For each of the following variables, write a boolean expression that evaluates to True:
a = 42
b = 42.0
c = "42"
d = True
Hint: use isinstance() — Python's built-in function for type checking. Look up how it works, or try isinstance(a, int).
Guidance
print(isinstance(a, int)) # True
print(isinstance(b, float)) # True
print(isinstance(c, str)) # True
print(isinstance(d, bool)) # True
# Interesting edge case:
print(isinstance(d, int)) # Also True! bool is a subclass of int
The fact that `isinstance(True, int)` returns `True` is a Python quirk — booleans are technically integers (`True == 1`, `False == 0`).
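A small sketch of what that quirk means in practice, and one possible way to test for integers while excluding booleans. The names `true_sum`, `d_is_plain_int`, and `n_is_plain_int` are illustrative.

```python
d = True
n = 42

# Booleans behave like the integers 1 and 0 in arithmetic.
true_sum = True + True   # 2

# To accept "real" integers only, rule out bool explicitly.
d_is_plain_int = isinstance(d, int) and not isinstance(d, bool)   # False
n_is_plain_int = isinstance(n, int) and not isinstance(n, bool)   # True

print(true_sum, d_is_plain_int, n_is_plain_int)  # 2 False True
```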
Exercise 3.27 — Building a data dictionary
A "data dictionary" is a description of every column in a dataset. Using only variables and f-strings (no lists or dictionaries yet — those come in Chapter 5), create a printed data dictionary for a small dataset with three columns. For each column, store and display:
- Column name
- Data type (as a descriptive string like "numeric" or "text")
- Description
- Example value
Format the output neatly. This is practice for the kind of documentation you'll write alongside every data science project.
Guidance
col1_name = "country"
col1_type = "text"
col1_desc = "Full country name"
col1_example = "Brazil"
col2_name = "year"
col2_type = "numeric (integer)"
col2_desc = "Year of observation"
col2_example = "2023"
col3_name = "vaccination_rate"
col3_type = "numeric (float)"
col3_desc = "Percentage of population vaccinated"
col3_example = "0.73"
print("=== DATA DICTIONARY ===")
print(f"\n{'Column':<20} {'Type':<20} {'Description':<35} {'Example'}")
print("-" * 85)
print(f"{col1_name:<20} {col1_type:<20} {col1_desc:<35} {col1_example}")
print(f"{col2_name:<20} {col2_type:<20} {col2_desc:<35} {col2_example}")
print(f"{col3_name:<20} {col3_type:<20} {col3_desc:<35} {col3_example}")
The `:<20` format specifier left-aligns text in a field 20 characters wide.
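The same mini-language supports right and center alignment as well. A brief sketch, using a 10-character field and square brackets so the padding is visible:

```python
label = "year"

left = f"[{label:<10}]"    # pad on the right
right = f"[{label:>10}]"   # pad on the left
center = f"[{label:^10}]"  # pad on both sides

print(left)    # [year      ]
print(right)   # [      year]
print(center)  # [   year   ]
```

Right alignment is often the more readable choice for numeric columns, since the digits then line up by place value.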
Exercise 3.28 — Floating-point exploration
Investigate floating-point precision by answering these questions with code:
- What does
0.1 + 0.2equal in Python? Is it exactly0.3? - What does
0.1 + 0.2 == 0.3return? - What does
round(0.1 + 0.2, 1) == round(0.3, 1)return? - Try
0.1 + 0.1 + 0.1 - 0.3. Is the result exactly zero? - In one or two sentences, explain why this happens and whether it matters for data science.
Guidance
print(0.1 + 0.2) # 0.30000000000000004
print(0.1 + 0.2 == 0.3) # False
print(round(0.1 + 0.2, 1) == round(0.3, 1)) # True
print(0.1 + 0.1 + 0.1 - 0.3) # 5.551115123125783e-17
This happens because computers store floats in binary (base 2), and some decimal fractions (like 0.1) can't be represented exactly in binary — similar to how 1/3 can't be represented exactly in decimal. For data science, this rarely matters because real-world measurements already have far more uncertainty than one quadrillionth. But it's a gotcha when comparing floats with `==`.
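When two floats do need to be compared, one common option is the standard library's `math.isclose`, which tests whether two values agree within a tolerance. This is a sketch of that alternative, not something the exercise requires.

```python
import math

total = 0.1 + 0.2

exact_match = total == 0.3              # False: the tiny binary error matters
close_match = math.isclose(total, 0.3)  # True: equal within the default tolerance

print(exact_match, close_match)  # False True
```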
Part M: Mixed Review (Chapters 1-2) ⭐
These questions revisit earlier chapters. If you struggle with any of them, revisit the relevant chapter section.
Exercise 3.29 — Data science lifecycle revisited (from Chapter 1)
For each of the Python operations below, identify which stage of the data science lifecycle it most closely corresponds to:
1. `vaccination_rate = vaccinated / total_population`
2. `source_url = "https://data.who.int/"`
3. `print(f"Vaccination rates range from {min_rate} to {max_rate}")`
4. `country = country_raw.strip().lower()`
Lifecycle stages: Ask, Acquire, Clean, Explore, Model, Communicate
Answers
1. **Explore** (or Model, depending on context) — computing a summary statistic from data
2. **Acquire** — recording the source of data
3. **Communicate** — presenting findings in a readable format
4. **Clean** — standardizing text data by removing whitespace and converting to consistent case
Exercise 3.30 — Jupyter workflow (from Chapter 2)
You're working in a Jupyter notebook and encounter this situation: you defined patient_count = 4521 in cell 3, used it in cell 7, and then accidentally deleted cell 3.
- Does cell 7 still work if you run it right now?
- What happens if you restart the kernel and try to run cell 7?
- How would you prevent this kind of problem in the future?