Case Study 1: Maya's Public Health Brief for the City Council
The Setup
Dr. Maya Chen has a problem that no amount of statistics can solve on its own.
She's standing in a conference room at City Hall, laptop connected to the projector, presentation loaded. In twenty minutes, the Riverside County City Council will file in for their monthly public health briefing. They'll be tired, distracted, and thinking about a dozen other agenda items. She has exactly fifteen slides to convince them that ER overcrowding is driven by a web of interconnected factors — not just poverty — and that expanding insurance access and primary care clinics would be more effective than general anti-poverty programs.
Her regression analysis (from Chapter 22) is solid:
- Poverty rate is strongly correlated with ER visit rate ($r = 0.96$, $R^2 = 0.92$)
- But the uninsured rate ($r = 0.94$) and the number of primary care physicians per 1,000 residents ($r = -0.91$) are also strong predictors
- In a multiple regression with all three predictors, poverty rate is no longer significant; its association with ER visits runs through the other two variables
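That last point can look paradoxical: how can a predictor with such a high correlation stop mattering? The simulation below is a minimal sketch of the pattern, not Maya's analysis. The sample size, the coefficients, and the data-generating process are all assumptions invented for illustration; the only thing it borrows from the case is the idea that poverty acts on ER visits solely through insurance and physician supply.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # deliberately large so the pattern is easy to see (the real study has 25)

# Hypothetical process: poverty affects ER visits ONLY via the two mediators.
# Every coefficient here is made up for the demonstration.
poverty = rng.uniform(4, 30, n)
uninsured = 0.9 * poverty + rng.normal(0, 2.0, n)
pcp = 4.2 - 0.12 * poverty + rng.normal(0, 0.3, n)
er = 150 + 8 * uninsured - 40 * pcp + rng.normal(0, 15, n)


def ols(y, *predictors):
    """Least-squares coefficients as [intercept, b1, b2, ...]."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


r = np.corrcoef(poverty, er)[0, 1]
simple = ols(er, poverty)                  # poverty alone
full = ols(er, poverty, uninsured, pcp)    # poverty plus both mediators

# By construction the simple slope should sit near 8*0.9 + (-40)*(-0.12) = 12,
# while the poverty coefficient in the full model should sit near zero.
print(f"correlation of poverty with ER rate: {r:.2f}")
print(f"poverty slope, simple regression:    {simple[1]:.2f}")
print(f"poverty slope, multiple regression:  {full[1]:.2f}")
```

In this toy setup poverty still predicts ER visits very well on its own. It simply carries no information beyond what the two mediators already provide, which is the situation the multiple regression describes.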
The data tells a clear story. But the presentation of that data will determine whether the council makes good policy or bad policy. Maya knows that if she shows the simple scatterplot of poverty vs. ER visits without context, a council member will say, "See? It's poverty. Let's fund anti-poverty programs." That's not wrong, exactly — but it misses the faster, more direct intervention.
She also knows that if she overloads the council with regression tables and p-values, their eyes will glaze over and they'll default to whatever policy was already on the table.
She needs to get this right.
Maya's Communication Strategy
Step 1: The Executive Summary Slide
Maya's first content slide — the one that appears while council members are still settling into their chairs — reads:
ER Overcrowding: What's Driving It and What Will Work Fastest
Finding: Communities with high ER visit rates share two characteristics: low insurance coverage and few primary care physicians. Poverty is correlated with both, but addressing insurance and primary care access directly would reduce ER visits more quickly than general anti-poverty programs.
Recommendation: Pilot a program in 3 communities to expand Medicaid enrollment and recruit primary care physicians. Estimated cost: $2.4M. Estimated ER visit reduction: 15-25%.
This slide contains zero jargon. It states the finding and the recommendation. A council member who read nothing else would have the essential message.
Step 2: The "One Key Chart"
Maya knows that the council will remember at most one chart from her presentation. She makes it count.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
# ============================================================
# MAYA'S KEY CHART: THE ONE THE COUNCIL WILL REMEMBER
# Scatterplot with annotation layer telling the complete story
# ============================================================
np.random.seed(2026)
# Community-level data (from Chapter 22)
communities = pd.DataFrame({
    'community': [f'Community {i+1}' for i in range(25)],
    'poverty_rate': [8.2, 12.5, 15.3, 5.1, 22.6, 18.4, 9.7, 25.1, 11.3,
                     19.8, 6.4, 14.7, 21.3, 7.8, 16.9, 28.4, 10.2, 23.7,
                     13.6, 20.5, 4.3, 17.2, 26.8, 8.9, 15.8],
    'er_rate': [145, 198, 225, 110, 310, 265, 155, 345, 175,
                280, 120, 215, 295, 138, 240, 380, 165, 320,
                200, 290, 95, 250, 365, 148, 230],
    'uninsured_pct': [6.5, 10.2, 14.1, 3.8, 20.5, 16.8, 7.3, 23.2, 9.8,
                      18.1, 5.1, 12.9, 19.7, 6.2, 15.3, 25.8, 8.4, 21.6,
                      11.5, 18.9, 3.2, 15.0, 24.3, 7.1, 13.7],
    'pcp_per_1000': [3.2, 2.5, 2.0, 3.8, 1.3, 1.6, 2.9, 1.0, 2.4,
                     1.5, 3.5, 2.1, 1.4, 3.1, 1.8, 0.8, 2.7, 1.2,
                     2.3, 1.5, 4.0, 1.9, 0.9, 3.0, 2.1]
})
# Color points by physician access level.
# pd.cut bins are right-closed by default, so exactly 1.5 counts as "Low"
# and exactly 2.5 counts as "Medium"; the labels reflect that.
levels = ['Low (≤1.5)', 'Medium (1.5-2.5)', 'High (>2.5)']
communities['pcp_level'] = pd.cut(communities['pcp_per_1000'],
                                  bins=[0, 1.5, 2.5, 5.0],
                                  labels=levels)
# Create the chart
fig, ax = plt.subplots(figsize=(10, 7))
colors = dict(zip(levels, ['#E74C3C', '#F39C12', '#27AE60']))
markers = dict(zip(levels, ['v', 's', 'o']))
# One scatter call per access level so each gets its own color, shape, and legend entry
for level in levels:
    mask = communities['pcp_level'] == level
    ax.scatter(communities.loc[mask, 'poverty_rate'],
               communities.loc[mask, 'er_rate'],
               c=colors[level], marker=markers[level],
               s=100, edgecolors='#333333', linewidth=0.5,
               label=f'PCP Access: {level}', alpha=0.85, zorder=3)
# Regression line
slope, intercept, r_value, p_value, std_err = stats.linregress(
    communities['poverty_rate'], communities['er_rate'])
x_line = np.linspace(3, 30, 100)
y_line = intercept + slope * x_line
ax.plot(x_line, y_line, color='gray', linewidth=1.5, linestyle='--',
        alpha=0.5, zorder=1)
# Key annotations: the two extreme communities that anchor the story
# Highlight Community 16 (highest poverty, highest ER)
ax.annotate('Community 16:\nHighest poverty &\nhighest ER visits\n'
            '(0.8 PCPs per 1,000)',
            xy=(28.4, 380),
            xytext=(20, 395),
            arrowprops=dict(arrowstyle='->', color='#E74C3C', lw=1.5),
            fontsize=9, color='#E74C3C', fontweight='bold',
            bbox=dict(boxstyle='round,pad=0.3', facecolor='white',
                      edgecolor='#E74C3C', alpha=0.9))
# Highlight Community 21 (lowest poverty, lowest ER)
ax.annotate('Community 21:\nLowest poverty,\nhighest PCP access\n'
            '(4.0 per 1,000)',
            xy=(4.3, 95),
            xytext=(8, 70),
            arrowprops=dict(arrowstyle='->', color='#27AE60', lw=1.5),
            fontsize=9, color='#27AE60', fontweight='bold',
            bbox=dict(boxstyle='round,pad=0.3', facecolor='white',
                      edgecolor='#27AE60', alpha=0.9))
# Title and labels
ax.set_title('ER Visit Rates Rise with Poverty — But Physician Access\n'
'Separates Communities with Similar Poverty Levels',
fontsize=13, color='#333333', fontweight='normal', pad=15)
ax.set_xlabel('Poverty Rate (%)', fontsize=12, color='#555555')
ax.set_ylabel('ER Visits per 1,000 Residents', fontsize=12,
              color='#555555')
# Legend
legend = ax.legend(title='Primary Care Physician Access',
                   title_fontsize=10, fontsize=9,
                   loc='upper left', frameon=True, framealpha=0.9,
                   edgecolor='#CCCCCC')
# Clean styling
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['left'].set_color('#CCCCCC')
ax.spines['bottom'].set_color('#CCCCCC')
ax.grid(True, alpha=0.15)
ax.tick_params(colors='#555555')
ax.set_xlim(0, 32)
ax.set_ylim(50, 420)
# Source
ax.text(0.99, -0.1, 'Source: Riverside County Health District, 2024-2025',
        transform=ax.transAxes, fontsize=8, color='gray', ha='right')
plt.tight_layout()
plt.savefig('maya_council_chart.png', dpi=300, bbox_inches='tight')
plt.show()
print("=" * 60)
print("MAYA'S CHART DESIGN DECISIONS")
print("=" * 60)
print()
print("1. SCATTERPLOT: Shows the actual data points, not just a")
print(" summary. The council can see each community.")
print()
print("2. COLOR + SHAPE: Physician access encoded with both color")
print(" AND shape (triangles, squares, circles) for accessibility.")
print()
print("3. ANNOTATIONS: Two communities highlighted — the worst case")
print(" and the best case — to anchor the narrative.")
print()
print("4. TITLE STATES THE FINDING: Not 'Poverty and ER Visits' but")
print(" 'Physician Access Separates Communities with Similar Poverty'")
print()
print("5. REGRESSION LINE: Shown lightly (dashed, gray) to show the")
print(" overall trend without dominating the chart.")
print()
print("6. NO CHARTJUNK: Clean spines, minimal gridlines, clear labels.")
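Because the chart's colors come from `pd.cut`, it helps to know which group a community lands in when its physician rate sits exactly on a bin edge (two communities in the data have exactly 1.5 PCPs per 1,000). The sketch below uses a few hand-picked values, not the full dataset, to compare pandas' default right-closed bins with the `right=False` alternative.

```python
import pandas as pd

bins = [0, 1.5, 2.5, 5.0]
values = [0.8, 1.5, 2.5, 4.0]  # 1.5 and 2.5 sit exactly on a bin edge

# Default (right=True): intervals are (0, 1.5], (1.5, 2.5], (2.5, 5.0]
right_closed = pd.cut(values, bins=bins, labels=False).tolist()

# right=False: intervals are [0, 1.5), [1.5, 2.5), [2.5, 5.0)
left_closed = pd.cut(values, bins=bins, labels=False, right=False).tolist()

print(right_closed)  # [0, 0, 1, 2]
print(left_closed)   # [0, 1, 2, 2]
```

Either convention is defensible for a chart like this one. What matters is that the legend wording matches the convention actually used, so that a community at exactly 1.5 is not described as "below 1.5".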
Step 3: The "Two Audiences" Problem
Maya faces a classic challenge: the city council is not a homogeneous audience. Some members have data backgrounds; most don't. Some want the bottom line; others want to drill into the numbers. Some will read her written brief carefully; others will remember only what she says aloud.
She solves this with the layered approach:
- Layer 1 (what she says aloud): "Poverty is correlated with ER overcrowding. But when we look deeper, the real drivers are insurance access and primary care availability. Communities with similar poverty levels have very different ER rates depending on how many doctors they have."
- Layer 2 (what's on the slide): The annotated scatterplot above, the visual that stays on screen while she speaks.
- Layer 3 (the written brief they take home): A two-page document with the executive summary, the key chart, and a detailed table showing each community's poverty rate, uninsured rate, physician access, and ER visit rate.
- Layer 4 (the appendix): Full regression output, methodology notes, data sources, and caveats. For the one council member who will read it at 11 PM.
# ============================================================
# MAYA'S WRITTEN BRIEF: THE TAKE-HOME DOCUMENT
# ============================================================
# Summary table for the written brief
print("=" * 70)
print("RIVERSIDE COUNTY PUBLIC HEALTH BRIEF")
print("ER Overcrowding: Drivers and Intervention Opportunities")
print("Prepared by: Dr. Maya Chen, County Health Department")
print("Date: March 2026")
print("=" * 70)
print()
print("EXECUTIVE SUMMARY")
print("-" * 70)
print("ER overcrowding in Riverside County correlates strongly with")
print("poverty (r = 0.96), but deeper analysis reveals that insurance")
print("coverage and primary care access are the direct mechanisms.")
print("In a model with all three factors, poverty alone is not")
print("significant — it works through insurance and physician access.")
print()
print("We recommend piloting an intervention in 3 high-ER communities")
print("focused on: (1) expanding Medicaid enrollment and (2) recruiting")
print("primary care physicians. Estimated 15-25% ER visit reduction.")
print()
print("KEY FINDINGS")
print("-" * 70)
print("1. Poverty explains 92% of the variation in ER visit rates")
print(" across communities (R² = 0.92)")
print()
print("2. However, when we control for insurance and physician access,")
print(" poverty is no longer a significant predictor")
print()
print("3. Each additional PCP per 1,000 residents is associated with")
print(" about 43 fewer ER visits per year per 1,000 residents")
print()
print("4. Each 1% increase in uninsured rate is associated with")
print(" about 8 additional ER visits per year per 1,000 residents")
print()
print("WHAT THIS MEANS FOR POLICY")
print("-" * 70)
print("Poverty doesn't directly drive ER visits — it does so through")
print("two intermediaries: lack of insurance and lack of primary care.")
print("Targeting these intermediaries directly would be faster and")
print("more cost-effective than general anti-poverty programs.")
print()
print("LIMITATIONS")
print("-" * 70)
print("• This is an observational study — we cannot prove causation")
print("• 25 communities is a modest sample size")
print("• Other factors (e.g., mental health services, transportation)")
print(" were not included in the model")
print("• A pilot program would help establish causal relationships")
print(" before committing to county-wide implementation")
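The written brief is also meant to carry a detailed community table. One possible way to produce it is sketched below. The helper name `brief_table` and the reader-facing column headings are assumptions, and only three rows (copied from the community data used for the key chart) are included to keep the example short.

```python
import pandas as pd

# Three rows copied from the community data used for the key chart
sample = pd.DataFrame({
    'community': ['Community 1', 'Community 16', 'Community 21'],
    'poverty_rate': [8.2, 28.4, 4.3],
    'er_rate': [145, 380, 95],
    'uninsured_pct': [6.5, 25.8, 3.2],
    'pcp_per_1000': [3.2, 0.8, 4.0],
})


def brief_table(df):
    """Return a reader-friendly table, highest ER visit rate first."""
    readable = {
        'community': 'Community',
        'poverty_rate': 'Poverty (%)',
        'uninsured_pct': 'Uninsured (%)',
        'pcp_per_1000': 'PCPs per 1,000',
        'er_rate': 'ER visits per 1,000',
    }
    return (df.sort_values('er_rate', ascending=False)
              [list(readable)]          # fix the column order for the brief
              .rename(columns=readable)
              .reset_index(drop=True))


table = brief_table(sample)
print(table.to_string(index=False))
```

Sorting by ER visit rate puts the communities of greatest concern at the top of the page, which suits a reader who may only skim the first few rows.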
The Design Decisions: Before and After
Let's look at what Maya didn't do — and why.
What Maya Could Have Done (But Didn't)
import matplotlib.pyplot as plt
import numpy as np
# ============================================================
# THE CHART MAYA REJECTED: REGRESSION TABLE ON A SLIDE
# ============================================================
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
# --- BAD VERSION: Regression table on a slide ---
ax1 = axes[0]
ax1.axis('off')
table_data = [
    ['Variable', 'Coef.', 'SE', 't', 'p-value'],
    ['Intercept', '47.83', '15.21', '3.14', '0.005'],
    ['Poverty (%)', '2.31', '2.45', '0.94', '0.358'],
    ['Uninsured (%)', '7.82', '2.13', '3.67', '0.001'],
    ['PCP per 1K', '-42.56', '11.38', '-3.74', '0.001'],
]
table = ax1.table(cellText=table_data, loc='center', cellLoc='center')
table.auto_set_font_size(False)
table.set_fontsize(10)
table.scale(1, 1.5)
# Style the header row and stripe the body rows
for (row, col), cell in table.get_celld().items():
    if row == 0:
        cell.set_facecolor('#4472C4')
        cell.set_text_props(color='white', fontweight='bold')
    else:
        cell.set_facecolor('#F2F2F2' if row % 2 == 0 else 'white')
ax1.set_title('What Maya REJECTED:\nRegression Table on a Slide',
              fontsize=12, color='#E74C3C', fontweight='bold',
              pad=20)
# --- GOOD VERSION: Key finding, plain language ---
ax2 = axes[1]
ax2.axis('off')
text = ("Two factors predict ER overcrowding:\n\n"
"1. Insurance coverage\n"
" Each 1% increase in uninsured rate →\n"
" ~8 more ER visits per 1,000 residents\n\n"
"2. Primary care access\n"
" Each additional PCP per 1,000 residents →\n"
" ~43 fewer ER visits per 1,000 residents\n\n"
"Poverty is NOT a direct driver —\n"
"it works through these two factors.")
ax2.text(0.1, 0.85, text, transform=ax2.transAxes,
         fontsize=12, verticalalignment='top',
         fontfamily='sans-serif', color='#333333',
         linespacing=1.5)
ax2.set_title('What Maya USED:\nPlain-Language Key Finding',
              fontsize=12, color='#27AE60', fontweight='bold',
              pad=20)
plt.tight_layout()
plt.savefig('maya_rejected_vs_used.png', dpi=150, bbox_inches='tight')
plt.show()
print("The regression table is accurate but useless for this audience.")
print("The plain-language version communicates the same finding.")
print("The table goes in the appendix. The plain language goes on the slide.")
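The translation from a coefficient to a slide-ready sentence is mechanical enough to script. The helper below is a hypothetical sketch (the function name and its arguments are not from the chapter); it rounds each coefficient to a whole number and picks "more" or "fewer" from its sign, using the two significant coefficients from the regression table above.

```python
def plain_language(predictor_change, coef,
                   outcome='ER visits per 1,000 residents'):
    """Turn one regression coefficient into a slide-ready sentence."""
    direction = 'more' if coef > 0 else 'fewer'
    return f'{predictor_change} → ~{abs(coef):.0f} {direction} {outcome}'


# Coefficients taken from the regression table on the rejected slide
uninsured_line = plain_language('Each 1% increase in uninsured rate', 7.82)
pcp_line = plain_language('Each additional PCP per 1,000 residents', -42.56)

print(uninsured_line)
print(pcp_line)
```

Rounding to whole visits is a deliberate choice for this audience: two decimal places suggest a precision that 25 communities cannot support.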
Maya's Communication Principles
| Principle | How Maya Applied It |
|---|---|
| Lead with the punchline | Recommendation on slide 1, before any data |
| One chart to remember | The annotated scatterplot — not five regression plots |
| Layer the detail | Spoken words → slide visual → written brief → appendix |
| No jargon | "Poverty is NOT a direct driver" instead of "poverty was not significant in the multiple regression model ($p = .358$)" |
| Honest uncertainty | "This is an observational study — we cannot prove causation. A pilot would help establish causal relationships." |
| Actionable | Specific recommendation: 3 communities, Medicaid + PCPs, estimated cost, estimated impact |
The Outcome
The council approved a \$2.4 million pilot program in three communities. They didn't ask about degrees of freedom or adjusted $R^2$. They asked: "Which three communities?" and "How soon can we start?"
Maya's analysis was rigorous. But it was her communication that made it useful.
Discussion Questions
- Maya chose to de-emphasize the poverty finding, even though $r = 0.96$ is a striking correlation. Was this the right call? Could she have been accused of downplaying an important social determinant?
- The council's question, "Which three communities?", goes beyond Maya's analysis. How should she respond? What additional analysis would she need?
- Maya included a Limitations section in her written brief. Some colleagues argue that including limitations "gives ammunition to people who want to do nothing." Is this a valid concern? How does Maya balance scientific honesty with advocacy?
- If one council member (a former engineer) asks to see the full regression output during the Q&A, how should Maya handle it?
- Imagine the same data but the opposite policy context: a state legislator arguing against Medicaid expansion wants to use Maya's finding that "poverty, not insurance, drives ER visits." How could the same data be communicated to support a different conclusion? What does this tell us about the ethics of data communication?