Glossary

391 terms from Data, Society, and Responsibility

# A B C D E F G H I K L M N O P R S T U V W X

#

"deepfake"
a portmanteau of "deep learning" and "fake" — originally referred specifically to AI-generated face-swapping technology. It now encompasses any synthetic media produced by AI that realistically depicts people saying or doing things they never actually said or did. → Chapter 18: Generative AI: Ethics of Creation and Deception
"disappearance of disappearance"
the progressive elimination of spaces and activities that are not subject to any form of monitoring. In the panopticon, the prisoner could at least know the boundaries of observation (the cell). In the digital assemblage, there are no clear boundaries. Surveillance extends into private homes … → Chapter 8: Surveillance: From Panopticon to Platform
"ghost work"
the hidden human labor that makes AI systems function — is particularly relevant here. The marketing narrative of generative AI emphasizes machine intelligence. The reality is that this intelligence depends on millions of hours of low-wage human labor, much of it performed under conditions that would … → Chapter 18: Generative AI: Ethics of Creation and Deception
"just transition"
originally developed in the context of environmental policy to describe how workers in fossil fuel industries should be supported through the shift to clean energy — is increasingly applied to AI-driven labor displacement. → Chapter 18: Generative AI: Ethics of Creation and Deception
"liar's dividend"
the ability of bad actors to dismiss authentic evidence as fabricated. A politician caught on video making a racist remark can simply claim, "That's a deepfake." The mere existence of deepfake technology provides a blanket defense against inconvenient truths. → Chapter 18: Generative AI: Ethics of Creation and Deception
"surveillant assemblage"
not a single watchtower but a network of interconnected systems that together produce a comprehensive picture. Your phone tracks your location. Your credit card tracks your purchases. Your fitness tracker monitors your heartbeat. Your email provider scans your correspondence. Your smart TV records your … → Chapter 8: Surveillance: From Panopticon to Platform
"the right to be let alone"
a right rooted not in property but in personality, in what they called "inviolate personality." → Chapter 7: What Is Privacy? Definitions and Debates
0.6875
the selection-rate ratio for Hispanic applicants: 22.0 / 32.0 = **0.6875** — below the 0.8 threshold. **Flagged.** - Asian applicants: 31.0 / 32.0 = **0.9688** — above the 0.8 threshold. Not flagged. (See the sketch below.) → Quiz: Bias in Data, Bias in Machines
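The four-fifths comparison above is mechanical enough to script. A minimal Python sketch, assuming (as the quiz arithmetic implies) that 32.0% is the highest group's selection rate; the "Reference" label is a placeholder, since the excerpt does not name that group:

```python
# Four-fifths (80%) rule: flag any group whose selection rate falls below
# 0.8 of the highest group's rate. Rates follow the quiz figures above;
# the "Reference" label is hypothetical.
selection_rates = {"Reference": 0.320, "Hispanic": 0.220, "Asian": 0.310}

highest = max(selection_rates.values())
for group, rate in selection_rates.items():
    ratio = rate / highest
    status = "Flagged" if ratio < 0.8 else "Not flagged"
    print(f"{group}: {rate:.3f} / {highest:.3f} = {ratio:.4f} -> {status}")
```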
1. System Description
What does the system do? - What decisions does it make or inform? - What data does it use? - Who built it, and who operates it? → Chapter 17: Accountability and Audit
2. Purpose and Necessity
What problem does this system address? - Could the same objective be achieved without algorithmic decision-making? - What is the human alternative, and what are its limitations? → Chapter 28: Privacy Impact Assessments and Ethical Reviews
2. Stakeholder Identification
Who is affected by the system's decisions? - Are there populations that are disproportionately affected? - Have affected communities been consulted? → Chapter 17: Accountability and Audit
3. Bias and Fairness Analysis
Has the system been tested for disparate impact across demographic groups? - Which fairness metrics have been applied (demographic parity, equalized odds, calibration — see Chapter 15)? - What trade-offs between fairness definitions have been accepted, and why? → Chapter 28: Privacy Impact Assessments and Ethical Reviews
3. Risk Assessment
What harms could the system cause? - What is the likelihood and severity of each harm? - Are there disparate impacts across demographic groups? - What happens when the system makes errors? → Chapter 17: Accountability and Audit
4. Mitigation Measures
What steps have been taken to reduce identified risks? - Are there human oversight mechanisms? - Is there a process for appealing algorithmic decisions? → Chapter 17: Accountability and Audit
4. Transparency and Explainability
Can affected individuals obtain a meaningful explanation of how decisions about them were made? - Is the system's logic documented in a way that enables external audit? - Are model cards (Chapter 29) available for the underlying models? → Chapter 28: Privacy Impact Assessments and Ethical Reviews
47.3 tonnes CO2
Transatlantic flights: 47,293 kg CO₂ / 1,600 kg per flight = approximately **29.6 flights** → Quiz: Environmental Data Ethics and Climate
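Made explicit, the conversion is a single division; a minimal sketch using the quiz's own factors (47,293 kg of CO₂ total, 1,600 kg per one-way transatlantic flight):

```python
# Convert total training-run emissions into flight equivalents.
total_kg_co2 = 47_293   # the quiz's 47.3 tonnes, in kilograms
kg_per_flight = 1_600   # the quiz's per-flight emission factor

print(f"{total_kg_co2 / kg_per_flight:.1f} flight equivalents")  # ~29.6
```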
5. Human Oversight
Is there a human decision-maker who can override the system's output? - Under what circumstances is human review mandatory? - Are the humans who oversee the system trained to exercise independent judgment? → Chapter 28: Privacy Impact Assessments and Ethical Reviews
5. Monitoring Plan
How will the system's performance be tracked after deployment? - What metrics will be monitored? - What triggers a reassessment? → Chapter 17: Accountability and Audit
6. Public Reporting
What information about the system will be made publicly available? - How will affected individuals be notified that they are subject to algorithmic decision-making? → Chapter 17: Accountability and Audit
6. Recourse and Redress
Can affected individuals challenge algorithmic decisions? - Is the appeals process accessible, timely, and meaningful? - Does the organization have a process for correcting errors and compensating harm? → Chapter 28: Privacy Impact Assessments and Ethical Reviews
7. Ongoing Monitoring
How will the system be monitored for drift, degradation, and emergent bias? - What triggers a re-assessment? - How frequently is the assessment updated? → Chapter 28: Privacy Impact Assessments and Ethical Reviews
76 working days
the estimated annual burden of actually reading privacy policies: 244 hours of reading, at an estimated economic cost of $781 billion in lost productivity nationally. This figure has only grown as the number of digital services has increased. → Chapter 9: Data Collection and Consent

A

Access Now
*Focus:* Digital rights globally, particularly for vulnerable communities - *Notable work:* RightsCon conference, #KeepItOn campaign (internet shutdowns), digital security helpline - *Website:* [accessnow.org](https://www.accessnow.org) - *Relevance to textbook:* Chapters 32, 37 → Appendix E: Privacy Tools and Resource Directory
adequacy decision
a formal determination by the European Commission that a third country's data protection framework provides a level of protection "essentially equivalent" to that of the GDPR. → Chapter 23: Cross-Border Data Flows and Digital Sovereignty
Advantages:
Addresses the power asymmetry between individual farmers and large technology companies - Enables collective bargaining over data terms - Keeps governance decisions in the hands of the community that generates the data - Creates incentives for data quality (cooperative members benefit from better data) … → Chapter 37: Global South Perspectives on Data Governance
Advisory powers (Article 58(3)):
Issue opinions on processing operations - Advise parliament, government, and other institutions on data protection matters - Approve codes of conduct and certification mechanisms → Chapter 25: Enforcement, Compliance, and the Limits of Law
Against overstatement:
Most people's media diets are shaped more by choice than by algorithms — people actively seek partisan content (Guess, 2021) - People who consume the most partisan content are also the most politically engaged — they seek it out, not just receive it - Algorithmic exposure to diverse viewpoints can … → Chapter 13: How Algorithms Shape Society
Age identification
How does the service determine users' ages? Is the method proportionate and privacy-preserving? 2. **Default settings** — Are privacy-protective settings on by default for young users? 3. **Design audit** — Does the service use dark patterns, engagement-maximizing features, or variable reward schedules … → Chapter 35: Children, Teens, and Digital Vulnerability
AI Ethics Researcher
Academic or industry position focused on fairness, accountability, transparency, and social impact of AI systems - Requires technical AI knowledge combined with social science or humanities training - Employed by universities, think tanks, and technology companies - The interdisciplinary nature of the … → Chapter 40: Your Responsibility — From Knowledge to Action
Algorithmic audit firms
sometimes called "AI audit firms" or "responsible AI consultancies" — offer services ranging from bias testing and fairness assessments to comprehensive algorithmic impact reviews. → Chapter 17: Accountability and Audit
Algorithmic Auditor
Conducts independent assessments of algorithmic systems for bias, fairness, transparency, and compliance - May work for consulting firms, government agencies, or independent audit organizations - Requires statistical knowledge, domain expertise, and understanding of fairness frameworks → Chapter 40: Your Responsibility — From Knowledge to Action
Algorithmic Justice League
*Focus:* Raising awareness about AI bias and promoting equitable and accountable AI - *Founded by:* Joy Buolamwini (co-author of the Gender Shades study) - *Notable work:* "Coded Bias" documentary, AI bias research, public education - *Website:* [ajl.org](https://www.ajl.org) - *Relevance to textbook:* … → Appendix E: Privacy Tools and Resource Directory
algorithmic wage discrimination
the use of data to pay different workers different rates for substantially similar work, based on the platform's assessment of what each worker will accept. → Chapter 33: Labor, Automation, and the Gig Economy
Alternative governance models
community data governance, data cooperatives, regional frameworks — offer paths beyond the corporate and state models that dominate Western data governance. - The **global surveillance supply chain** connects domestic surveillance to global power dynamics, with the same technologies deployed across … → Chapter 37: Global South Perspectives on Data Governance
amplification distinction
differentiating between hosting content and algorithmically promoting it — offers a potential way forward in the publisher/utility/platform debate. → Chapter 31: Misinformation, Disinformation, and Platform Governance
Anticipate potential harms:
Privacy harms (Chapters 7-12) - Bias and fairness harms (Chapters 14-15) - Accountability gaps (Chapter 17) - Power concentration (Chapter 5) - Environmental harms (Chapter 34) - Harms to vulnerable populations — children, marginalized communities, Global South (Chapters 32, 35, 37) 2. **Map the harm** … → Capstone Project 3: Speculative Design
Anticipated future systems:
Autonomous combat drones that can identify, track, and engage human targets without real-time human authorization. - Autonomous submarine and surface vessel systems capable of engaging enemy ships or submarines independently. - Swarm systems — networks of dozens or hundreds of autonomous drones … → Case Study: Autonomous Weapons: The Campaign to Stop Killer Robots
Apply the four themes:
**Power Asymmetry:** Where are the asymmetries in this system? - **Consent Fiction:** Is consent meaningful or theatrical? - **Accountability Gap:** If the system causes harm, who is responsible? → Capstone Project 1: Data Ethics Audit
Arguments against backdoors:
**Technical impossibility of "good-only" backdoors.** Cryptographers have consistently argued that it is technically impossible to build a backdoor that is accessible only to authorized parties. Any mechanism that allows government access also creates a vulnerability that can be exploited by hackers → Chapter 36: National Security, Intelligence, and Democratic Oversight
Arguments against:
Deepens inequality — wealthy people can afford to withhold data; poor people may feel compelled to sell it - Impractical at scale — the average person generates data through hundreds of interactions daily; negotiating each one is impossible - Ignores the relational nature of data — your social media … → Chapter 3: Who Owns Your Data?
Arguments for lawful access:
**Democratic accountability.** Courts issue warrants based on probable cause. Encryption that prevents the execution of lawful warrants undermines democratic governance and the rule of law. - **Public safety.** Encrypted communications are used by terrorists, child predators, and other criminals. … → Chapter 36: National Security, Intelligence, and Democratic Oversight
Arguments for:
Gives individuals a legal basis to control their data - Creates market incentives for responsible data handling (if data has a price, companies must weigh the cost) - Aligns with familiar legal frameworks → Chapter 3: Who Owns Your Data?
Assess compliance
is the system in compliance with applicable law? 3. **Assess the gap between compliance and ethics** — where does the law fall short of ethical requirements? 4. **Evaluate existing governance mechanisms** — does the organization have an ethics program, impact assessments, audit processes? → Capstone Project 1: Data Ethics Audit
At Collection:
For every data field, ask: "Is this *necessary* for the stated purpose?" Not useful, not interesting -- *necessary*. - Distinguish between primary data (needed for the service) and secondary data (useful for other purposes). Collect the primary; scrutinize the secondary. - Use progressive collection … → Chapter 10: Privacy by Design and Data Minimization
At Deletion:
Ensure deletion is real, not just a UI change. Data marked as "deleted" but retained in backups, logs, or shadow databases is not minimized. - Verify deletion: audit your systems to confirm that data scheduled for deletion has actually been removed. → Chapter 10: Privacy by Design and Data Minimization
At Processing:
Apply the principle of least privilege: grant access to data only to those who need it for their specific role. - Use privacy-preserving analytics (Sections 10.5 and 10.6) where possible. → Chapter 10: Privacy by Design and Data Minimization
At Storage:
Implement retention schedules: define how long each type of data will be kept and automate deletion when the period expires. - Store data at the lowest level of identifiability needed for each purpose. If aggregate statistics suffice, don't store individual records. → Chapter 10: Privacy by Design and Data Minimization
audit study
a controlled experiment in which matched test subjects identical in all respects except for a variable of interest (in this case, perceived race) interact with a system to detect differential treatment. → Case Study: Auditing Airbnb: Racial Discrimination in Platform Marketplaces
Auto-play
used by YouTube, Netflix, and TikTok — eliminates the decision point between videos. When one video ends, the next begins automatically. YouTube's recommendation algorithm selects the next video to maximize the probability that you'll keep watching. Research has shown that this system tends to push … → Chapter 4: The Attention Economy
automation bias
the tendency to defer to the machine's recommendation, effectively rubber-stamping rather than genuinely evaluating. → Chapter 19: Autonomous Systems and Moral Machines
autonomy gradient
a gradual increase in autonomy and decrease in protection as children develop — offers a way to navigate this tension: → Chapter 35: Children, Teens, and Digital Vulnerability

B

Backcasting
Starting from a desired future outcome (e.g., "By 2045, all personal data is governed through democratic cooperatives") and working backward to identify the steps, decisions, and institutions required to get there. → Chapter 39: Designing Data Futures — Participation, Imagination, and Hope
behavioral futures markets
otherwise known as the advertising market. Advertisers pay not just for your attention but for the *probability* that you will take a specific action (click, purchase, vote, believe). → Chapter 4: The Attention Economy
Benefits beyond compliance:
**Better design.** The assessment process often reveals design alternatives that are both more privacy-protective and more elegant. Constraints stimulate creativity. - **Stakeholder trust.** Published assessment summaries demonstrate to users, customers, and regulators that the organization has thought … → Chapter 28: Privacy Impact Assessments and Ethical Reviews
Best interests of the child
The best interests of the child should be a primary consideration when designing and developing online services likely to be accessed by children. 2. **Data protection impact assessments** — Undertake a DPIA that considers the specific risks to children. 3. **Age-appropriate application** — Take a risk-based approach … → Case Study: The UK Age Appropriate Design Code in Practice
Biometric authentication
the user provides a fingerprint or iris scan, which is matched against the CIDR database - **Demographic authentication** — the user provides name, date of birth, gender, and address, which are matched against the database - **OTP authentication** — a one-time password sent to the registered mobile number … → Case Study: Aadhaar — India's Digital Identity Experiment
biopower
power exercised not over individual bodies but over *populations* as biological entities. Biopower operates through statistics, demographics, public health measures, and population management. → Chapter 5: Power, Knowledge, and Data
black box problem
and it is not a marginal technical concern. It is a foundational challenge for democratic governance, individual rights, and institutional accountability. When a neural network with millions of parameters denies your loan application, assigns you a risk score, or recommends a medical treatment, no human … → Chapter 16: Transparency, Explainability, and the Black Box Problem
blamelessness
focusing on systemic failures rather than individual blame. This is not about excusing negligence. It is about recognizing that in complex organizations, breaches are almost always the product of *systems* — incentive structures, resource constraints, cultural norms, process gaps — rather than individuals. → Chapter 30: When Things Go Wrong: Breach Response and Crisis Ethics
Bottom-up data cooperatives
Communities that don't wait for legal frameworks but create cooperatives using existing legal structures (consumer cooperatives, mutual aid organizations) and adapt them for data governance. → Chapter 39: Designing Data Futures — Participation, Imagination, and Hope
Building the data foundation
skills, infrastructure, standards → Chapter 22: Data Governance Frameworks and Institutions
business associates
organizations that perform functions involving protected health information (PHI) on behalf of covered entities. VitraMed, as an EHR provider, is a business associate to the clinics it serves. → Chapter 24: Sector-Specific Governance: Finance, Health, Education

C

calibration is approximately satisfied
But the FPR differs significantly (Group A ~0.17 vs. Group B ~0.09): **equalized odds is NOT satisfied** - Demographic parity is also not satisfied (selection rates differ: 0.30 vs. 0.20) → Chapter 15: Fairness — Definitions, Tensions, and Trade-offs
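The per-group quantities behind these checks can be computed directly from predictions and outcomes. A minimal Python sketch over NumPy arrays; the function name and data layout are our own illustration, not the chapter's code:

```python
import numpy as np

def group_fairness_report(y_true, y_pred, groups):
    """Per-group selection rate (demographic parity), TPR and FPR
    (equalized odds requires both to match across groups), and
    precision (a calibration-style check for binary predictions)."""
    report = {}
    for g in np.unique(groups):
        m = groups == g
        t, p = y_true[m], y_pred[m]
        tp = np.sum((p == 1) & (t == 1)); fp = np.sum((p == 1) & (t == 0))
        fn = np.sum((p == 0) & (t == 1)); tn = np.sum((p == 0) & (t == 0))
        report[g] = {
            "selection_rate": p.mean(),
            "tpr": tp / (tp + fn) if tp + fn else float("nan"),
            "fpr": fp / (fp + tn) if fp + tn else float("nan"),
            "precision": tp / (tp + fp) if tp + fp else float("nan"),
        }
    return report
```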
carbon emissions
the greenhouse gases produced by generating the electricity that powers data centers. The carbon impact of a given computation depends on the **carbon intensity** of the electricity grid in the region where the computation occurs. → Chapter 34: Environmental Data Ethics and Climate
Challenges specific to data ethics research:
**Research on social media:** Can researchers ethically analyze public social media posts without the users' consent? The posts are public, but the users did not anticipate their content would be studied. - **Research on algorithmic systems:** Black-box auditing may require creating fake accounts or … → Appendix A: Research Methods for Data Ethics
Challenges:
Requires organizational capacity and governance infrastructure - Must navigate competition law (is collective data bargaining a form of cartel behavior?) - Depends on the willingness of technology companies to negotiate with cooperatives rather than collecting data directly from individual farmers … → Chapter 37: Global South Perspectives on Data Governance
Characteristics:
Specific, prescriptive requirements ("Companies must encrypt personal data using AES-256 or equivalent") - Clear compliance criteria — you either comply or you don't - Enforced through inspections, audits, and penalties → Chapter 20: The Regulatory Landscape: A Global Survey
Check base rates
if they differ across groups, the impossibility theorem applies 3. **Map the trade-offs** — what does each metric prioritize, and what does it sacrifice? 4. **Identify the most harmful errors** — in this specific context, which type of mistake causes the most damage? 5. **Engage stakeholders** — … → Chapter 15: Fairness — Definitions, Tensions, and Trade-offs
Check for recent amendments
data protection law is evolving rapidly - **Seek qualified legal counsel** for compliance decisions affecting real systems and real people → Appendix D: Legal Frameworks Reference -- Comparative Data Protection Law
Chief Data Officer (CDO)
Senior organizational leader responsible for data governance, data quality, and data strategy - Increasingly expected to integrate ethical considerations into data governance programs - Ray Zhao's role at NovaCorp illustrates this pathway - Requires both technical and organizational leadership capabilities → Chapter 40: Your Responsibility — From Knowledge to Action
chilling effect
the tendency of surveillance to suppress lawful behavior, speech, and association --- is one of the most well-documented consequences of surveillance, and one of the most difficult to govern because the harm is *preventive*: people don't do something they would otherwise have done, and the absence of … → Chapter 8: Surveillance: From Panopticon to Platform
Cities
Singapore's "Virtual Singapore" project creates a real-time 3D model of the entire city-state, simulating traffic flows, energy consumption, pedestrian movement, and the impact of proposed construction projects. - **Healthcare** — "Digital twin" models of individual patients are being developed to … → Chapter 38: Emerging Technologies and Anticipatory Governance
Civic Data Trust
an independent entity that would: → Case Study: The Sidewalk Labs Toronto Data Trust
Classify
Is this misinformation (innocent sharing), disinformation (deliberate deception), or malinformation (weaponized truth)? 2. **Trace** — How is it spreading? Through organic sharing, algorithmic amplification, coordinated networks, or cross-platform migration? 3. **Assess impact** — What are the potential … → Chapter 31: Misinformation, Disinformation, and Platform Governance
Clipper Chip
a government-designed encryption chip with a built-in backdoor for law enforcement. The proposal was abandoned after intense opposition from technologists and civil liberties organizations who demonstrated that the backdoor could be exploited by adversaries. - In 2015-2016, the FBI demanded that Apple … → Chapter 8: Surveillance: From Panopticon to Platform
Communities that bore the costs
emotional harm, cultural violation, financial exclusion, loss of control over their own biological narratives 5. **Legal systems that provided inadequate redress** — *Moore v. Regents* denied property rights over tissues; the Havasupai settlement came years after the harm was done → Case Study: Indigenous Genomic Data and the HeLa Cells Legacy
community consent
the idea that communities should have collective decision-making authority over data practices that affect them as communities --- is not well developed in existing law. Indigenous data sovereignty frameworks (discussed in Chapter 3) offer one model, in which communities assert collective governance … → Chapter 9: Data Collection and Consent
Community Data Advocate
Works with communities to build data governance capacity: data literacy programs, community data cooperatives, participatory governance processes - May work for nonprofit organizations, community foundations, or government agencies - Eli's work in Detroit illustrates this role — and it is becoming a … → Chapter 40: Your Responsibility — From Knowledge to Action
community data governance charter
a document that his neighborhood association could adopt as a framework for negotiating with the city about data collection in their community. → Chapter 39: Designing Data Futures — Participation, Imagination, and Hope
Community responses:
**Data cooperatives in East Africa.** Farmer cooperatives in Kenya and Tanzania have developed data governance frameworks that enable collective bargaining over agricultural data. Rather than each individual farmer negotiating with a technology company, the cooperative negotiates collective terms … → Chapter 37: Global South Perspectives on Data Governance
Community technology audits
Neighborhood groups that conduct their own audits of local data systems (surveillance cameras, smart city sensors, predictive policing tools) without waiting for official audit requirements. → Chapter 39: Designing Data Futures — Participation, Imagination, and Hope
completeness
the percentage of non-null values per column 3. Check **consistency** — validate that values in specified columns match expected formats (e.g., email addresses, dates, phone numbers) 4. Check **uniqueness** — detect duplicate rows based on specified key columns 5. Generate a structured quality report (see the sketch below) → Chapter 22: Data Governance Frameworks and Institutions
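A compact sketch of the completeness, uniqueness, and consistency checks, assuming a pandas DataFrame; the `email` column name and the naive email pattern are illustrative assumptions, not the chapter's actual implementation:

```python
import pandas as pd

EMAIL_PATTERN = r"[^@\s]+@[^@\s]+\.[^@\s]+"  # naive email format check

def quality_report(df: pd.DataFrame, key_cols: list) -> dict:
    report = {
        # Completeness: share of non-null values per column.
        "completeness": (1 - df.isna().mean()).round(4).to_dict(),
        # Uniqueness: duplicate rows judged on the specified key columns.
        "duplicate_rows": int(df.duplicated(subset=key_cols).sum()),
    }
    # Consistency: values match an expected format.
    if "email" in df.columns:
        ok = df["email"].dropna().astype(str).str.fullmatch(EMAIL_PATTERN)
        report["invalid_emails"] = int((~ok).sum())
    return report
```

Run against records like those in the entries below, such a check would flag `bob@example` (no domain extension) and `frank obi@mail.com` (embedded space).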
Completeness issues:
One patient record (`P008`) is missing a name — a critical field in health records - One patient (`P005`) has no email address - One patient (`P009`) has no phone number - One patient (`P004`) has no blood type recorded → Chapter 22: Data Governance Frameworks and Institutions
compounding
each bias multiplies the effect of the others: → Chapter 14: Bias in Data, Bias in Machines
comprehensible
the individual can actually understand what they are agreeing to - The choice is **genuine** --- refusal is possible without losing access to essential services - The scope is **specific** --- consent covers a defined purpose, not a blanket authorization - The consequences are **proportionate** --- … → Chapter 9: Data Collection and Consent
consent fatigue
the tendency to stop engaging with consent mechanisms altogether, clicking "accept all" or scrolling past notices without reading them. Research documents this clearly: → Chapter 9: Data Collection and Consent
consent fiction
one of this book's four recurring themes --- moves from the background to center stage. We have encountered it in passing throughout Part 1. Here we dissect it. → Chapter 9: Data Collection and Consent
Consistency issues:
`bob@example` lacks a proper domain extension — likely a data entry error - `frank obi@mail.com` contains a space — invalid email format - `5554567890` is a phone number without dashes — valid number, inconsistent format - `2000-13-01` contains month "13" — an invalid date - `12/15/1975` uses MM/DD/YYYY format … → Chapter 22: Data Governance Frameworks and Institutions
COPPA's limitations are significant:
**The age-13 threshold is arbitrary.** COPPA protects children under 13 but provides no protection for teenagers 13-17 — precisely the age group most actively using social media and most vulnerable to its harms. - **The "actual knowledge" standard creates a loophole.** Platforms that do not verify age … → Chapter 35: Children, Teens, and Digital Vulnerability
Corrective powers (Article 58(2)):
Issue warnings and reprimands - Order compliance with data subject requests - Order controllers or processors to bring processing into compliance - Impose temporary or definitive bans on processing - Order the suspension of cross-border data flows - Impose administrative fines → Chapter 25: Enforcement, Compliance, and the Limits of Law
covered entities
health plans, health care clearinghouses, and health care providers who transmit health information electronically -- and their **business associates** (organizations that handle PHI on behalf of covered entities, such as billing companies, cloud storage providers, and data analytics firms). → Chapter 12: Health Data, Genetic Data, and Biometric Privacy
Critical perspectives:
Public infrastructure controlled by the state is subject to *state* surveillance, not *corporate* surveillance — but it is surveillance nonetheless. The concentration of identity, payment, and document systems in a single state-controlled stack creates surveillance capabilities that even the most powerful … → Chapter 37: Global South Perspectives on Data Governance
Critiques:
The Aether Committee is advisory, not decision-making - Microsoft laid off its entire ethics and society team in 2023, even as it invested billions in OpenAI - The company's massive investment in generative AI (Copilot, Bing Chat) proceeded with limited public ethical review - Critics argue that Microsoft … → Chapter 26: Building a Data Ethics Program
Cross-sector patterns
fiduciary principles, minimum necessary access, breach notification, regulatory lag, and regulatory arbitrage — reveal universal governance principles embedded in sector-specific frameworks. - **Integration** across regulatory frameworks, rather than separate compliance silos, is a best practice for … → Chapter 24: Sector-Specific Governance: Finance, Health, Education
Culture change
from "move fast and break things" to responsible innovation — requires leadership commitment, psychological safety, ethics champion networks, and sustained investment. - **Incentive structures** must be redesigned to reward ethical behavior and penalize ethical shortcuts. If ethics never costs anyth → Chapter 26: Building a Data Ethics Program
Current systems with autonomous functions:
**Missile defense systems** (e.g., the U.S. Phalanx CIWS, Israel's Iron Dome) that automatically detect and intercept incoming projectiles. These operate in environments where the speed of engagement makes human decision-making impossible (milliseconds), and the targets are objects (missiles, rockets) … → Case Study: Autonomous Weapons: The Campaign to Stop Killer Robots

D

Dark patterns
a term coined by UX designer Harry Brignull in 2010 — are user interface design choices that manipulate users into actions they didn't intend or wouldn't choose if fully informed. Unlike persuasive design, which might argue it helps users achieve their goals more easily, dark patterns work *against* the user's interests. → Chapter 4: The Attention Economy
Data & Society Research Institute
*Focus:* Social and cultural implications of data and automation - *Notable work:* Research on media manipulation, platform governance, content moderation, and AI in healthcare - *Website:* [datasociety.net](https://datasociety.net) - *Relevance to textbook:* Chapters 13, 31, 33 → Appendix E: Privacy Tools and Resource Directory
data asymmetry
the platform's monopoly on the information necessary for informed decision-making. - **Algorithmic wage discrimination** uses behavioral data to pay different workers different rates for substantially similar work, based on the platform's prediction of what each worker will accept. - The evidence on … → Chapter 33: Labor, Automation, and the Gig Economy
Data brokers
companies whose primary business is collecting, aggregating, and selling information about individuals -- form the backbone of this economy. → Chapter 11: The Economics of Privacy
Data collection
What data is collected about workers, and is the collection proportionate to legitimate management needs? 2. **Transparency** — Do workers know what data is collected, how it's used, and what decisions it informs? 3. **Access** — Can workers access their own data in a usable format? 4. **Voice** — … → Chapter 33: Labor, Automation, and the Gig Economy
data colonialism
the idea that contemporary data extraction practices reproduce the logic of historical colonialism: powerful actors extracting value from less powerful populations, with minimal compensation, consent, or reciprocal benefit. → Chapter 32: Digital Divide, Data Justice, and Equity
Data Ethics Audit
Conduct a comprehensive ethical audit of a real data system 2. **Policy Brief** — Draft a policy brief on a data governance challenge for a specific audience 3. **Speculative Design** — Design a data governance system for a future technology scenario → Data, Society, and Responsibility
data exhaust
information generated as a side effect of digital activities. → Chapter 1: The Data All Around Us
Data is collected by intermediaries
platforms, corporations, development projects — that provide services (weather information, market prices, credit scoring) in exchange for data access. 3. **Data is aggregated and analyzed** by entities with the computational infrastructure and analytical capability to derive value — commodity traders … → Case Study: Data Governance in African Agriculture
data justice
and they connect directly to the Power Asymmetry, Consent Fiction, and Accountability Gap themes that run throughout this text. → Chapter 32: Digital Divide, Data Justice, and Equity
Data literacy programs
Community-organized programs that build the technical knowledge needed for meaningful participation in data governance, without waiting for formal educational institutions to add data ethics to their curricula. → Chapter 39: Designing Data Futures — Participation, Imagination, and Hope
Data portability
the right to receive your data in a structured, commonly used format and transmit it to another service — is enshrined in the GDPR (Article 20) and has been adopted by several other jurisdictions. → Chapter 3: Who Owns Your Data?
data protection by default
personal data should be processed only to the extent necessary for each specific purpose, and by default should not be made accessible to an indefinite number of people. → Chapter 10: Privacy by Design and Data Minimization
Data Protection Officer (DPO)
Required by the GDPR for organizations that process personal data at scale - Responsible for ensuring organizational compliance with data protection law - Requires knowledge of data protection regulation, risk assessment, and organizational management - Growing demand: every major company operating … → Chapter 40: Your Responsibility — From Knowledge to Action
Data sovereignty
African agricultural data should be governed under African frameworks - **Benefit sharing** — entities that profit from African agricultural data should share benefits with the communities that generated it - **Capacity building** — investment in African data infrastructure, analytical capability, and … → Case Study: Data Governance in African Agriculture
Data stewardship models
centralized, federated, and hybrid — determine how governance responsibility is distributed. Most mature organizations use a hybrid approach. - **Data catalogs** are essential ethical infrastructure: organizations that do not know what data they have cannot govern it responsibly, fulfill data subject … → Chapter 27: Data Stewardship and the Chief Data Officer
data trusts
proposes that personal data be managed by independent trustees who owe duties to the data subjects, rather than by the data collectors themselves. Data trusts would negotiate terms of data use on behalf of individuals, monitor compliance, and take legal action when terms are violated. → Chapter 9: Data Collection and Consent
dataveillance
coined by Roger Clarke in 1988 --- refers to the systematic monitoring of people's actions or communications through the application of information technology to personal data. While the Snowden revelations focused attention on state surveillance, corporate dataveillance had already grown to rival … → Chapter 8: Surveillance: From Panopticon to Platform
Degree programs (as of 2026):
Carnegie Mellon University -- MS in Privacy Engineering - DePaul University -- MS in Data Science with AI Ethics concentration - Georgetown University -- MS in Technology Management with Privacy and Cybersecurity track - New York University -- MS in Data Science; PhD programs at AI Now Institute - … → Appendix E: Privacy Tools and Resource Directory
Demonstrate data governance
showing that training data is representative, that biases have been identified and mitigated, and that the data pipeline meets quality standards 3. **Prepare technical documentation** sufficient for conformity assessment 4. **Implement logging** to enable post-deployment monitoring and incident investigation → Chapter 21: The EU AI Act and Risk-Based Regulation
Design fiction
Creating realistic artifacts (products, advertisements, news articles, policy documents) from a speculative future, to make that future tangible and debatable. A design fiction might be a mock privacy policy from 2040, an advertisement for a neural data cooperative, or a newspaper article about … → Chapter 39: Designing Data Futures — Participation, Imagination, and Hope
Device and account signals (lower weight):
Language preference, country setting, device type - Account age and stated interests (selected during onboarding) → Case Study: TikTok's Recommendation Algorithm
different organizations
most without any deliberate action. → Chapter 1: The Data All Around Us
Difficulty Guide:
⭐ Foundational (5-10 min each) - ⭐⭐ Intermediate (10-20 min each) - ⭐⭐⭐ Challenging (20-40 min each) - ⭐⭐⭐⭐ Advanced/Research (40+ min each) → Exercises: The Data All Around Us
Diffusion models
the architecture behind systems like Stable Diffusion, DALL-E, and Midjourney — learn to generate images by reversing a noise-addition process. During training, the model learns to take a noisy image and predict what the original image looked like. At generation time, the model starts with pure noise … → Chapter 18: Generative AI: Ethics of Creation and Deception
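The generation loop is easier to see in code. A toy sketch of the reverse (denoising) process described above; the stand-in predictor is hypothetical, and real samplers such as DDPM use a learned noise prediction with a calibrated variance schedule:

```python
import numpy as np

def predict_denoised(x, t):
    # Stand-in for the trained network, which would estimate the clean
    # image (or the added noise) from a noisy input at timestep t.
    return 0.9 * x

x = np.random.randn(64, 64)     # generation starts from pure noise
for t in reversed(range(50)):   # step the noise level down toward zero
    x = predict_denoised(x, t)
    if t > 0:                   # re-inject a shrinking amount of noise
        x += 0.1 * (t / 50) * np.random.randn(64, 64)
# x is the final sample; with a real model it would be an image
```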
digital public infrastructure (DPI)
a stack of interconnected digital systems: → Chapter 37: Global South Perspectives on Data Governance
Direct costs:
Forensic investigation and incident response - Customer notification (legally required in most jurisdictions) - Credit monitoring and identity theft protection for affected individuals - Regulatory fines and penalties - Legal fees and settlements → Chapter 11: The Economics of Privacy
Disabling direct messages for users under 16.
**Limiting notifications for younger users** (no notifications after 9 p.m. for users under 16, after 10 p.m. for users 16-17). - **Disabling duets and stitches** (features that allow users to create content alongside strangers' content) for users under 16. - **Default screen time limits** — a 60-minute … → Case Study: The UK Age Appropriate Design Code in Practice
Disinformation (deliberately created to deceive):
State-sponsored campaigns: Russian and Chinese state media outlets and affiliated accounts deliberately spread false claims about the origins of the virus, the safety of Western vaccines, and the effectiveness of non-pharmaceutical interventions (EU DisinfoLab, 2021) - Commercial disinformation: … → Case Study: The Infodemic — COVID-19 Misinformation on Social Media
dual role
contributing to environmental harm through energy consumption and e-waste, while also enabling essential environmental monitoring, climate modeling, and conservation. - **Environmental data justice** demands attention to who bears the costs of data infrastructure — from data center siting to e-waste → Chapter 34: Environmental Data Ethics and Climate
Dyan Gibbens
CEO of Trumbull Unmanned, a drone technology company 2. **Alessandro Acquisti** -- Professor of Information Technology at Carnegie Mellon, a respected privacy researcher 3. **Bubacarr Bah** -- Mathematics professor at the African Institute for Mathematical Sciences 4. **De Kai** -- Computer science … → Case Study: Ethical Review in Tech: Google's AI Ethics Board Controversy

E

ego depletion
the tendency for willpower to diminish with repeated exercise. → Chapter 9: Data Collection and Consent
Electronic Frontier Foundation (EFF)
*Focus:* Digital civil liberties, free speech, privacy, innovation - *Notable work:* Privacy Badger, Certbot, surveillance litigation, policy advocacy - *Website:* [eff.org](https://www.eff.org) - *Relevance to textbook:* Chapters 7, 8, 9, 36 → Appendix E: Privacy Tools and Resource Directory
Electronic Privacy Information Center (EPIC)
*Focus:* Privacy, free expression, democratic values in the information age - *Notable work:* Litigation, regulatory advocacy, privacy research, AI governance - *Website:* [epic.org](https://www.epic.org) - *Relevance to textbook:* Chapters 7, 20, 25 → Appendix E: Privacy Tools and Resource Directory
Emotion recognition
also called "affect detection" --- claims to identify emotional states from facial expressions, though the scientific basis for this technology has been widely criticized by psychologists including Lisa Feldman Barrett, who argues that emotions do not map reliably to specific facial configurations - → Chapter 8: Surveillance: From Panopticon to Platform
Enabling the digital economy
creating regulatory environments that support data-driven innovation and economic growth - **Protecting individual rights** — establishing data protection standards consistent with international human rights frameworks - **Asserting data sovereignty** — ensuring that African data serves African development … → Chapter 37: Global South Perspectives on Data Governance
engagement optimization
algorithmic maximization of time spent and interactions generated - **Dark patterns** are design choices that manipulate users against their interests - **Behavioral surplus** (Zuboff) is the data extracted beyond what is needed for service improvement, used to build prediction products - The social … → Chapter 4: The Attention Economy
Ensure data portability
no single company would have exclusive access to the data generated in the neighborhood → Case Study: The Sidewalk Labs Toronto Data Trust
epistemic injustice
injustice that occurs in a person's capacity as a knower. Two forms are particularly relevant to data governance: → Chapter 5: Power, Knowledge, and Data
epistemic power
the ability to control what is known, by whom, and on what terms. Facebook possessed knowledge about its own effects that no one else had access to. It used this knowledge asymmetry to shape public discourse, deflect regulatory scrutiny, and maintain the information conditions that protected its business model. → Case Study: Facebook and Epistemic Power — The 2021 Whistleblower Documents
epsilon
a parameter that represents the privacy "budget." Smaller epsilon means stronger privacy (more noise, less precision). Larger epsilon means weaker privacy (less noise, more precision). → Chapter 10: Privacy by Design and Data Minimization
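A minimal sketch of how epsilon sets the noise scale in the Laplace mechanism, the simplest differentially private primitive; the function name and example values are ours, for illustration:

```python
import numpy as np

def dp_count(true_count: float, epsilon: float, sensitivity: float = 1.0) -> float:
    """Laplace mechanism: noise scale = sensitivity / epsilon, so a smaller
    epsilon (a tighter privacy budget) means more noise and stronger privacy."""
    return true_count + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

print(dp_count(1000, epsilon=0.1))  # strong privacy: answers vary widely
print(dp_count(1000, epsilon=5.0))  # weak privacy: answers stay near 1000
```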
equal representation in outcomes
the idea that a fair system should produce the same results regardless of group membership. → Chapter 15: Fairness — Definitions, Tensions, and Trade-offs
equalized odds
the requirement that true positive rates and false positive rates be equal across groups. ProPublica's analysis demonstrated a clear violation. → Case Study: ProPublica vs. Northpointe — The COMPAS Fairness Debate
Equivalents:
Transatlantic flights: ~169 one-way flights - Car miles: ~1.29 million miles - Household electricity: ~64 years - Trees to offset: ~12,283 trees for one year → Chapter 34: Environmental Data Ethics and Climate
ethics champion network
trained individuals embedded in product teams who serve as first-line ethical reviewers and cultural ambassadors — extends the committee's reach. → Chapter 26: Building a Data Ethics Program
EU AI Act
the full risk classification framework 7. **Nissenbaum's contextual integrity** -- the original article 8. **Foucault's panopticism chapter** -- the theoretical foundation of surveillance studies 9. **Ostrom's eight principles** -- the foundation for participatory governance → Appendix C: Primary Sources Guide -- Annotated Key Documents
EU-US Privacy Shield
a new framework that attempted to address the Court's concerns by including: - Stronger self-certification requirements for US companies - Written assurances from the US intelligence community regarding proportionality and necessity limitations on surveillance - An ombudsperson mechanism for EU citizens … → Chapter 23: Cross-Border Data Flows and Digital Sovereignty
European Digital Rights (EDRi)
*Focus:* Digital rights in Europe - *Notable work:* Advocacy on GDPR implementation, EU AI Act, Digital Services Act - *Website:* [edri.org](https://edri.org) - *Relevance to textbook:* Chapters 20, 21 → Appendix E: Privacy Tools and Resource Directory
Evaluate transparency and explainability
can the system explain its decisions? 4. **Assess accountability mechanisms** — who is responsible when the system errs? → Capstone Project 1: Data Ethics Audit
Evening (3:00 p.m. - midnight)
Ride-share app records pickup location, destination, route, driver rating - Payment card records purchases — amount, vendor, category, time, location - Streaming service logs what was watched, when viewing started, when it paused, what was skipped - Gaming platform records play sessions, in-game purchases … → Chapter 1: The Data All Around Us
Examples:
The **Open Data Institute** (UK) has piloted data trust frameworks for urban mobility data - **Sidewalk Labs** (a Google subsidiary) proposed a data trust for its Toronto smart city project — though the project was ultimately canceled amid privacy concerns - **MIDATA** in Switzerland operates as a health data cooperative … → Chapter 3: Who Owns Your Data?
exclusion error
legitimate beneficiaries being denied services because Aadhaar-based authentication fails. → Case Study: Aadhaar — India's Digital Identity Experiment
Expand the "Open Doors" policy
when a guest reports discrimination, Airbnb finds them an alternative listing or hotel room at Airbnb's expense. 5. **Increase diversity** within Airbnb's workforce and leadership. 6. **Partner with civil rights organizations** for ongoing monitoring and feedback. → Case Study: Auditing Airbnb: Racial Discrimination in Platform Marketplaces
Experian, Equifax, and TransUnion
the "Big Three" credit bureaus -- operate as data brokers in addition to their credit reporting functions, selling consumer data for marketing, risk assessment, and identity verification. → Chapter 11: The Economics of Privacy
Experiential futures
Creating immersive experiences that allow people to "inhabit" a future scenario. A workshop might simulate living in a city with pervasive ambient intelligence, complete with prop sensors, notification alerts, and governance decision points. → Chapter 39: Designing Data Futures — Participation, Imagination, and Hope
explainability-accuracy trade-off
the observation that the most accurate models (deep neural networks, large ensembles) are often the hardest to explain, while the most interpretable models (linear regression, decision trees) are often less accurate. Consider the following domains: → Exercises: Transparency, Explainability, and the Black Box Problem
External pressure
from researchers, media, and advocacy organizations — was necessary to compel action. Internal accountability mechanisms alone were insufficient → Chapter 17: Accountability and Audit

F

Facebook Papers
revealed something remarkable: Facebook's own researchers had produced extensive internal studies demonstrating that the platform's products caused measurable harm to users, particularly teenagers, and that the company's algorithms amplified divisive and inflammatory content. More remarkable still, … → Case Study: Facebook and Epistemic Power — The 2021 Whistleblower Documents
Failed training runs
models that diverge, crash, or underperform and must be retrained - **Hyperparameter search** — the dozens or hundreds of smaller training runs conducted before the final training - **Inference emissions** — the ongoing energy cost of *running* the trained model, which for widely deployed models … → Chapter 34: Environmental Data Ethics and Climate
fair use
a defense under Section 107 of the Copyright Act that permits certain uses of copyrighted material without authorization. Fair use analysis considers four factors: → Case Study: The AI Art Controversy: Artists vs. Generative Models
False positive rates
the rate at which the system incorrectly matches two different people — were **10 to 100 times higher** for Black and Asian faces compared to white faces in many algorithms. - **Women of color** experienced the highest error rates across nearly all systems tested. - **Accuracy varied dramatically** by … → Case Study: Facial Recognition in Law Enforcement: The Detroit Case
feedback loop
one of the most dangerous properties of algorithmic systems. The algorithm's output (deploy officers here) generates the data (more arrests here) that becomes the algorithm's input (crime is high here), creating a self-fulfilling prophecy that appears to validate itself. We'll formalize this concept … → Chapter 13: How Algorithms Shape Society
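A toy simulation makes the self-fulfilling dynamic concrete: two districts with identical true crime rates, patrols allocated wherever recorded incidents are higher, and incidents recorded only where patrols look. All numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
true_rate = [0.5, 0.5]   # identical underlying crime in both districts
recorded = [11, 9]       # a small, arbitrary initial disparity

for day in range(365):
    # Send the patrol to the district with more recorded incidents...
    target = 0 if recorded[0] >= recorded[1] else 1
    # ...and record new crime only where the patrol is looking.
    recorded[target] += int(rng.random() < true_rate[target])

print(recorded)  # the initial gap has hardened into a lopsided "hot spot"
```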
feedback loops
cycles in which the system's biased output becomes the input for future training, reinforcing and amplifying the original bias over time. → Chapter 14: Bias in Data, Bias in Machines
FERPA does not directly regulate EdTech companies
it regulates schools. If a school fails to impose adequate data protection requirements on its EdTech vendors, FERPA provides limited recourse. - **FERPA predates the data practices** it now governs. It was designed for paper records in filing cabinets, not for real-time behavioral tracking across … → Chapter 35: Children, Teens, and Digital Vulnerability
Findings:
Guests with African American names were accepted approximately 42% of the time, compared to approximately 50% for guests with white names - The discrimination was present among hosts of all races, both genders, and across price points - Hosts with multiple properties (professional hosts) discriminated … → Chapter 17: Accountability and Audit
For each tension, apply all five frameworks:
Utilitarian: Who benefits, who is harmed, and by how much? - Deontological: Are any rights violated? Is anyone treated merely as a means? - Virtue ethics: What character traits does the system's design reflect? - Care ethics: What relationships and vulnerabilities are at stake? - Justice theory: Behind the veil of ignorance … → Capstone Project 1: Data Ethics Audit
For You Page (FYP)
a full-screen, vertically scrolling feed of short-form videos selected by the platform's recommendation algorithm. Unlike Instagram or Facebook, where content primarily comes from accounts you follow, TikTok's FYP can surface videos from any creator — including accounts with zero followers. The following … → Case Study: TikTok's Recommendation Algorithm
Foresight
systematic methods for identifying possible futures and their governance implications - **Engagement** — involving diverse stakeholders (including affected communities) in governance design from the earliest stages - **Integration** — embedding governance considerations into the technology development … → Chapter 38: Emerging Technologies and Anticipatory Governance
Four regulatory approaches
command-and-control, principles-based, co-regulation, and self-regulation — each carries distinctive strengths and weaknesses. - The **US sectoral model** regulates data through sector-specific statutes (HIPAA, FERPA, COPPA, FCRA) enforced by sector regulators and the FTC, leaving significant gaps. … → Chapter 20: The Regulatory Landscape: A Global Survey
four-fifths rule
if the selection rate for a protected group is less than four-fifths (80%) of the rate for the group with the highest selection rate, there is evidence of adverse impact — provides a practical threshold. → Chapter 17: Accountability and Audit
functional morality
the idea that a system can be a moral agent in a functional sense even if it lacks consciousness or subjective experience. On this view, if a system can: → Chapter 19: Autonomous Systems and Moral Machines

G

GDPR (EU/EEA)
**To the supervisory authority:** Within **72 hours** of becoming aware of the breach, unless the breach is "unlikely to result in a risk to the rights and freedoms of natural persons" (Article 33). - **To data subjects:** "Without undue delay" when the breach is "likely to result in a high risk to the rights and freedoms of natural persons" (Article 34). → Chapter 30: When Things Go Wrong: Breach Response and Crisis Ethics
Genuine success stories
This is accurate information shared in good faith — not misinformation, disinformation, or malinformation. - **Fabricated side effect accounts linked to "natural alternative" sales websites** — This is **disinformation**: deliberately created false content designed to deceive users and drive … → Quiz: Misinformation, Disinformation, and Platform Governance
global surveillance supply chain
the network of technology companies, mostly based in the US, Israel, and Europe, that develop surveillance technologies and sell them to governments worldwide: → Chapter 37: Global South Perspectives on Data Governance
Governance infrastructure:
Data governance council formed (representatives from each department, meeting monthly) - Access review process: quarterly reviews of who has access to restricted and confidential assets - Data sharing agreements: template created for all third-party data sharing, requiring governance review - Incident … → Case Study: Building a Data Catalog from Scratch
Governance responses:
The FCC issued a declaratory ruling in February 2024 that AI-generated voices in robocalls violate the Telephone Consumer Protection Act - Several states enacted laws specifically prohibiting AI-generated political content within a specified period before elections - Social media platforms implemented … → Chapter 18: Generative AI: Ethics of Creation and Deception
Government exemptions
the Act includes broad exemptions for government data processing in the interest of "sovereignty and integrity of India," "security of the State," and "public order" → Chapter 37: Global South Perspectives on Data Governance
group fairness
they evaluate whether the system treats *groups* equitably. **Individual fairness**, proposed by Dwork et al. (2012), takes a different approach: it requires that *similar individuals receive similar treatment*. → Chapter 15: Fairness — Definitions, Tensions, and Trade-offs

H

Harvesting behavioral surplus
using data beyond what's needed for service improvement to predict and modify behavior for advertiser benefit — treats users *merely* as means. Their data is extracted for someone else's profit, with no corresponding benefit to the data subject. This violates the categorical imperative. … → Chapter 6: Ethical Frameworks for the Data Age
Health data
Medical records, genetic information, mental health status - **Financial data** — Bank accounts, credit scores, transaction histories - **Biometric data** — Fingerprints, facial geometry, iris scans, voiceprints - **Location data** — GPS coordinates, cell tower pings, IP geolocation - **Children's data** … → Chapter 1: The Data All Around Us
High risk
an AI system used for safety-critical purposes in public infrastructure, affecting physical security. Falls under Annex III as a system used in the management and operation of critical infrastructure. - **(b) Facial recognition against wanted-persons list:** **Unacceptable risk / Prohibited (with narrow exceptions)** … → Quiz: The EU AI Act and Risk-Based Regulation
high-risk AI systems
the category most relevant to autonomous systems — the EU AI Act imposes detailed requirements: → Chapter 19: Autonomous Systems and Moral Machines
hope
not as an emotion but as a political practice. Hope, in this sense, is the disciplined commitment to working toward a better future despite uncertainty about whether that future will be realized. → Chapter 39: Designing Data Futures — Participation, Imagination, and Hope
How it works:
The central team defines data standards, naming conventions, and quality requirements - All data-related decisions (new data collection, sharing, retention changes) require central approval - A unified data catalog documents all organizational data assets - The CDO has authority to enforce standards → Chapter 27: Data Stewardship and the Chief Data Officer

I

Identity layer (Aadhaar)
biometric identity for authentication - **Payments layer (UPI)** — instant mobile payments between bank accounts - **Data layer (DigiLocker)** — secure storage and sharing of official documents - **Consent layer (Account Aggregator)** — framework for financial data sharing with user consent → Case Study: Aadhaar — India's Digital Identity Experiment
Immediate (30 days):
Update privacy policy to reflect current data practices and publish. - Clear the backlog of late DSARs and implement tracking to ensure future compliance within one month. - Initiate DPIAs for both high-risk processing activities; suspend automated recruitment scoring and insurance risk profiling until … → Quiz: Enforcement, Compliance, and the Limits of Law
Immediate consequences:
VitraMed's Series C funding round was delayed by four months. Two potential investors withdrew. The round ultimately closed at a lower valuation. - Twelve clinics terminated their contracts. Over the following year, eight of them returned, citing VitraMed's handling of the breach as evidence of organizational … → Chapter 30: When Things Go Wrong: Breach Response and Crisis Ethics
Immediate remediation:
Marketing CRM migrated to secure platform (2 weeks) - Decommissioned legacy systems permanently deleted (4 weeks, after legal review) - Analytics environment retention policy implemented: copies expire after 12 months unless renewed with documented justification - System C integration project initia → Case Study: Building a Data Catalog from Scratch
Important public interest reasons
**Legal claims** - **Vital interests** of the data subject → Chapter 23: Cross-Border Data Flows and Digital Sovereignty
In support of the concern:
Personalization does reduce exposure to cross-cutting political content (Bakshy, Messing, and Adamic, 2015, working with Facebook data) - YouTube's recommendation algorithm has been documented sending users down "rabbit holes" toward increasingly extreme content (Ribeiro et al., 2020; Ledwich and Za → Chapter 13: How Algorithms Shape Society
In this chapter, you will learn to:
Recognize the data you generate in everyday life — and the data generated *about* you without your action - Distinguish between different types and structures of data - Trace a data point from creation through storage, use, sharing, and eventual deletion - Understand why the sheer volume and variety → Chapter 1: The Data All Around Us
In this chapter, you will:
Integrate the four recurring themes into a unified analytical framework you can carry forward - Witness and evaluate Mira's and Eli's capstone presentations - Develop your own data ethics principles - Explore professional pathways in data ethics, privacy, and governance - Consider the Practitioner's → Chapter 40: Your Responsibility — From Knowledge to Action
incidental collection
the acquisition of US persons' communications that happen to involve foreign intelligence targets. When an American emails or calls someone who is a Section 702 target, both sides of the communication are collected. The American's communication is now in the NSA's database — not because the American → Chapter 36: National Security, Intelligence, and Democratic Oversight
Indirect costs:
Customer churn (the Ponemon data shows an average 2.5% increase in customer turnover following a breach) - Reputational damage and brand erosion - Increased customer acquisition costs (replacing lost customers) - Insurance premium increases - Executive time diverted to crisis management → Chapter 11: The Economics of Privacy
individual rights
privacy rights, data access rights, consent rights — are necessary but not sufficient for data equity. Individual rights assume a level playing field: each person exercises their rights independently, and the aggregate result is fair. → Chapter 32: Digital Divide, Data Justice, and Equity
Infinite scroll
pioneered by Aza Raskin in 2006 — eliminates the natural stopping point that exists in paginated content. When a web page has a "next" button, the button creates a moment of decision: continue or stop. Infinite scroll removes that moment, turning content consumption into a continuous, frictionless f → Chapter 4: The Attention Economy
information fiduciaries
entities that owe duties of care, confidentiality, and loyalty to the individuals whose data they hold, analogous to the duties that doctors, lawyers, and financial advisors owe to their clients. → Chapter 9: Data Collection and Consent
Infrastructure
Who controls the digital infrastructure? What dependencies exist, and what are their governance implications? 2. **Extraction** — How does data flow? Where is value generated, and where does value accumulate? Who bears the costs? 3. **Governance capacity** — What institutional capacity exists for da → Chapter 37: Global South Perspectives on Data Governance
Institutional governance focus
Goes beyond personal awareness to examine how organizations, regulators, and societies manage data responsibly - **Four recurring themes** woven throughout: The Power Asymmetry, The Consent Fiction, The Accountability Gap, and The VitraMed Thread - **Python integration** in 7 chapters plus a consoli → Data, Society, and Responsibility
internal governance
the organizational structures, frameworks, and practices that determine how data is actually managed day-to-day. The DAMA-DMBOK framework, data quality management, and the `DataQualityAuditor` Python class will make the abstract principles of governance concrete and measurable. Good regulation requi → Key Takeaways: Chapter 21 — The EU AI Act and Risk-Based Regulation
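The `DataQualityAuditor` class itself is developed in the textbook's Python material; as a minimal sketch of the shape of the idea, assuming a pandas DataFrame and using illustrative method names that are not necessarily the textbook's actual API:

```python
# Minimal sketch of a data-quality auditor in the spirit of the
# DataQualityAuditor class mentioned above. Method names and rules
# here are illustrative assumptions, not the textbook's actual API.
import pandas as pd

class DataQualityAuditor:
    def __init__(self, df: pd.DataFrame):
        self.df = df

    def completeness(self, column: str) -> float:
        """Fraction of rows with a non-null value in `column`."""
        return 1.0 - self.df[column].isna().mean()

    def validity(self, column: str, rule) -> float:
        """Fraction of non-null values satisfying a validity rule."""
        values = self.df[column].dropna()
        return float(rule(values).mean()) if len(values) else 1.0

# Example: audit a toy customer table.
df = pd.DataFrame({"age": [34, None, 27, 140],
                   "email": ["a@x.org", "b@y.org", None, "c@z.org"]})
auditor = DataQualityAuditor(df)
print(auditor.completeness("email"))                         # 0.75
print(auditor.validity("age", lambda s: s.between(0, 120)))  # ~0.67
```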
intersectionality
the insight that overlapping systems of oppression cannot be understood by examining each axis of disadvantage in isolation. Digital inequality is intersectional by nature. → Chapter 32: Digital Divide, Data Justice, and Equity
Invalidated the Privacy Shield
finding that the framework failed to provide protection "essentially equivalent" to EU law, for the same fundamental reasons as Safe Harbor: US surveillance law gave intelligence agencies access to transferred data without adequate proportionality limits or judicial oversight 2. **Upheld the validit → Chapter 23: Cross-Border Data Flows and Digital Sovereignty
Invested heavily in model development
building sophisticated systems with substantial engineering resources 2. **Underinvested in monitoring** — deploying without systematic, ongoing performance evaluation 3. **Detected failure reactively** — through financial losses, external studies, or ad hoc review rather than through monitoring s → Case Study: When Models Drift: Real-World Deployment Failures
Investigative powers (Article 58(1)):
Order controllers and processors to provide information - Conduct data protection audits - Carry out investigations, including on-site inspections - Obtain access to premises, including data processing equipment → Chapter 25: Enforcement, Compliance, and the Limits of Law

K

Key CCPA/CPRA provisions:
Right to know what personal information is collected - Right to delete personal information - Right to opt out of the sale or sharing of personal information - Right to non-discrimination for exercising privacy rights - Right to correct inaccurate personal information (CPRA) - Right to limit use of → Chapter 20: The Regulatory Landscape: A Global Survey
Key considerations:
*Sampling* — Who is interviewed matters enormously. Interviewing only corporate executives about data governance produces a different picture than interviewing data subjects, community organizers, or frontline workers. - *Semi-structured formats* are most common in data ethics research: the researc → Appendix A: Research Methods for Data Ethics
Key features:
Consent-based framework with "deemed consent" for certain specified purposes - Rights of data principals: access, correction, erasure, grievance redressal, nomination of a representative - Obligations on data fiduciaries (India's term for data controllers) - Data Protection Board of India as the enf → Chapter 20: The Regulatory Landscape: A Global Survey
Key governance conditions for the pilot:
All alternative data sources entered into the data catalog before any data is ingested - The `DataLineageTracker` applied to track each data source's origin, transformations, and access - Disaggregated performance monitoring by race, age, gender, and geography - Consumer disclosure: applicants infor → Case Study: The CDO's Dilemma: Innovation vs. Governance at NovaCorp
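The `DataLineageTracker` referenced in the conditions above is the case study's tool; a minimal sketch of what such a tracker records, with invented class, field, and job names, might look like:

```python
# Sketch of per-source lineage tracking in the spirit of the
# DataLineageTracker above. All names here are illustrative
# assumptions, not the case study's actual implementation.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class LineageRecord:
    dataset: str
    origin: str
    events: list = field(default_factory=list)

    def log(self, action: str, actor: str) -> None:
        """Append a timestamped transformation or access event."""
        stamp = datetime.now(timezone.utc).isoformat()
        self.events.append((stamp, action, actor))

record = LineageRecord(dataset="alt_credit_signals",
                       origin="vendor:telco_topup_history")
record.log("normalized currency fields", actor="etl_job_17")
record.log("read for pilot model training", actor="risk_model_v2")
print(record.events)
```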
Key incidents:
**The Biden robocall (January 2024):** An AI-generated voice clone of President Biden was used in robocalls to New Hampshire voters, encouraging them to skip the primary election. The calls were convincing enough that many recipients did not realize they were synthetic. The perpetrator, a political → Chapter 18: Generative AI: Ethics of Creation and Deception
Key provisions:
**Parent/student rights:** Parents (and students over 18) have the right to inspect education records, request corrections, and consent to disclosures - **Directory information exception:** "Directory information" (name, address, phone number, enrollment dates, degrees received) can be disclosed wit → Chapter 24: Sector-Specific Governance: Finance, Health, Education
Key questions for analysis:
Apple's epsilon values reportedly range from 1 to 8 per data type, with a daily per-user budget of 1 to 4. Are these values sufficient for meaningful privacy protection? - Apple's model is "local" differential privacy (noise added on device). How does this differ from "global" differential privacy ( → Chapter 10: Privacy by Design and Data Minimization
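To make the local/global distinction concrete, here is a minimal sketch of randomized response, the simplest local-DP mechanism, where the perturbation happens on the user's device before anything is reported (the parameter values are illustrative, not Apple's):

```python
# Local differential privacy via randomized response: each device
# perturbs its own answer before reporting, so the collector never
# sees the true value. (Illustrative sketch, not Apple's mechanism.)
import math, random

def randomized_response(true_bit: int, epsilon: float) -> int:
    """Report the true bit with probability e^eps / (e^eps + 1)."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1)
    return true_bit if random.random() < p_truth else 1 - true_bit

# With epsilon = 1, each report is truthful only ~73% of the time;
# an aggregator can debias population counts but learns little about
# any individual. Global DP would instead add noise once, at the
# curator's side, after collecting exact values.
print([randomized_response(1, epsilon=1.0) for _ in range(10)])
```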
Key systems:
**EHR System A** (Epic): Used by 28 clinics, covering approximately 250,000 patients - **EHR System B** (Cerner/Oracle Health): Used by 12 clinics, covering approximately 95,000 patients - **EHR System C** (eClinicalWorks): Used by 5 rural clinics, covering approximately 35,000 patients - **Data war → Case Study: Building a Data Catalog from Scratch

L

Law has inherent limits
regulatory lag, technical complexity, jurisdictional boundaries, the inability to mandate values, and the persistence of structural power. - **Self-regulation** has a poor track record in data protection but can be effective with aligned incentives, credible regulatory backstops, independent verific → Chapter 25: Enforcement, Compliance, and the Limits of Law
Lawfulness, fairness, and transparency
Data must be processed lawfully, fairly, and transparently. 2. **Purpose limitation** — Data must be collected for specified, explicit, and legitimate purposes. 3. **Data minimization** — Only data that is adequate, relevant, and limited to what is necessary may be collected. 4. **Accuracy** — Data → Chapter 20: The Regulatory Landscape: A Global Survey
leapfrogging
the idea that developing countries can skip stages of technological development that wealthy countries went through — has been a recurring theme in development discourse. The classic example is mobile phones: many African countries went directly from minimal landline infrastructure to widespread mob → Chapter 37: Global South Perspectives on Data Governance
Legislative responses:
The **Kids Online Safety Act (KOSA)**, introduced in the US Senate in 2022 and passed by the Senate in 2024, would require platforms to provide minors with options to protect their information, disable addictive product features, and opt out of personalized recommendations. - **Utah's Social Media Regulation Act (2023 → Chapter 35: Children, Teens, and Digital Vulnerability
Legitimate aim
What security objective does the surveillance serve? Is the objective genuine and specific, or vague and expansive? 2. **Necessity** — Could the objective be achieved through less intrusive means? What evidence supports the claim that mass collection is necessary? 3. **Proportionality** — Do the pri → Chapter 36: National Security, Intelligence, and Democratic Oversight
Limitations of congressional oversight:
**Classification constraints.** Committee members receive classified briefings but often cannot discuss what they learn with the full Congress, the public, or even their own staff. Senator Ron Wyden spent years warning about mass surveillance in cryptic public statements ("I want to deliver a warnin → Chapter 36: National Security, Intelligence, and Democratic Oversight
Limitations of IRBs for data ethics research:
IRBs were designed primarily for biomedical and behavioral research and may lack expertise in evaluating computational research involving large datasets, algorithmic systems, or online platforms. - Research involving publicly available data (social media posts, government records) may not require IR → Appendix A: Research Methods for Data Ethics
Limitations of LIME:
The explanation depends on how perturbations are generated — different perturbation strategies can produce different explanations - The local approximation may not be faithful to the model's actual behavior if the decision boundary is complex in that region - Does not guarantee consistency — explain → Chapter 16: Transparency, Explainability, and the Black Box Problem
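A minimal usage sketch, assuming the open-source `lime` and `scikit-learn` packages and invented toy data, shows where the perturbation dependence comes from:

```python
# LIME on a toy tabular classifier. The explainer perturbs the
# instance and fits a local linear surrogate; different random
# perturbations can yield different attributions, which is the
# consistency limitation noted above. Data and names are invented.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # toy decision boundary
model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    X, feature_names=["f0", "f1", "f2", "f3"], mode="classification")
exp = explainer.explain_instance(X[0], model.predict_proba, num_features=4)
print(exp.as_list())  # (feature condition, weight) pairs
```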
Limitations of SHAP:
Computationally expensive — exact Shapley values require evaluating all feature subsets, which grows exponentially with the number of features - Approximations (KernelSHAP, TreeSHAP) are faster but introduce their own assumptions - Feature interactions can be missed — Shapley values attribute import → Chapter 16: Transparency, Explainability, and the Black Box Problem
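For tree ensembles, the exponential cost is avoided by the TreeSHAP approximation; a minimal sketch, assuming the open-source `shap` and `scikit-learn` packages and invented data:

```python
# TreeSHAP computes exact Shapley values in polynomial time for tree
# ensembles, sidestepping the exponential subset enumeration noted
# above. The data here is invented for illustration.
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = X[:, 0] - 2 * X[:, 2] + rng.normal(scale=0.1, size=500)
model = RandomForestRegressor(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:3])  # one row per instance,
print(np.round(shap_values, 3))             # one column per feature
```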
Limitations:
Can justify sacrificing minorities for the benefit of majorities ("tyranny of the majority") - Requires quantifying values (privacy, dignity, autonomy) that resist quantification - Ignores distributive fairness — an outcome where 100 people gain $1 and 1 person loses $100 has the same utilitarian va → Chapter 6: Ethical Frameworks for the Data Age
linkage attack
is the fundamental threat to anonymization. It demonstrates that whether data is identifiable depends not on the data itself but on what *other* data is available to an attacker. → Chapter 10: Privacy by Design and Data Minimization
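A minimal sketch of the mechanics, with invented records: joining an "anonymized" release to a public auxiliary dataset on shared quasi-identifiers re-identifies the sensitive rows.

```python
# Linkage attack sketch: the "anonymized" medical release carries no
# names, but joining on quasi-identifiers (ZIP, birth date, sex)
# against a public voter roll restores them. Records are invented.
import pandas as pd

anonymized = pd.DataFrame({
    "zip": ["02138", "02139"],
    "birth_date": ["1965-07-21", "1990-01-02"],
    "sex": ["F", "M"],
    "diagnosis": ["hypertension", "asthma"],
})
voter_roll = pd.DataFrame({
    "name": ["J. Doe", "A. Smith"],
    "zip": ["02138", "02139"],
    "birth_date": ["1965-07-21", "1990-01-02"],
    "sex": ["F", "M"],
})
# The merge succeeds because the auxiliary data shares the
# quasi-identifiers: exactly the dependence the entry describes.
print(voter_roll.merge(anonymized, on=["zip", "birth_date", "sex"]))
```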
locked room
the neural network's 2 million parameters create technical opacity that even the data science team cannot fully penetrate. The model is not deliberately hidden; it is genuinely too complex for humans to interpret at the level of individual weights and interactions. However, if the hospital refuses t → Quiz: Transparency, Explainability, and the Black Box Problem
Long-term (6-12 months):
Restructure DPO reporting: the DPO should report to the board, not the CEO, to ensure independence. - Implement a governance review cycle: quarterly audit of privacy policy currency, DSAR response times, DPIA status, and retention compliance. - Engage an external auditor for annual GDPR compliance a → Quiz: Enforcement, Compliance, and the Limits of Law

M

manipulative design
design techniques that manipulate users into granting permissions they would not grant if the choice were presented fairly. → Chapter 9: Data Collection and Consent
Map power dynamics
who benefits most? Who bears the most risk? Who has decision-making authority? 3. **Elicit values** — for each stakeholder group, what values are at stake? (Privacy, autonomy, safety, innovation, equity, efficiency, dignity) 4. **Identify value conflicts** — where do stakeholder values conflict? The → Capstone Project 3: Speculative Design
Measure
Calculate the energy consumption and carbon emissions of training and inference using tools like CarbonEstimator 2. **Compare** — Evaluate alternative configurations (different hardware, regions, model sizes) and their environmental tradeoffs 3. **Contextualize** — Express emissions in familiar equi → Chapter 34: Environmental Data Ethics and Climate
Medical BCIs
enabling paralyzed patients to control prosthetic limbs or type text through thought alone (e.g., BrainGate, Synchron) - **Consumer neurotechnology** — EEG headbands marketed for meditation, focus training, and sleep improvement (e.g., Muse, Emotiv) - **Research BCIs** — high-resolution neural recor → Chapter 38: Emerging Technologies and Anticipatory Governance
medical ethics
and specifically in the aftermath of atrocity. → Chapter 9: Data Collection and Consent
Medium-term (90 days):
Develop a specific data retention schedule with defined timeframes for each data category and processing purpose. - Implement automated DSAR tracking and deadline alerting. - Conduct a comprehensive data inventory to identify any additional high-risk processing requiring DPIAs. - Provide GDPR traini → Quiz: Enforcement, Compliance, and the Limits of Law
Metadata
data about data — can be as revealing as the data itself. - **Datafication** transforms qualitative human experiences into quantitative data points. - **Data exhaust** is the information generated as a byproduct of digital activities, often without the user's knowledge. - The **data lifecycle** (col → Chapter 1: The Data All Around Us
Metadata management
including data catalogs and data lineage — is the infrastructure that makes governance operationally possible. - **Maturity models** provide a framework for assessing current capabilities and planning governance improvements. → Chapter 22: Data Governance Frameworks and Institutions
Midday (9:00 a.m. - 3:00 p.m.)
Learning management system (LMS) records login time, pages viewed, time on each page, quiz attempts - Campus library system logs book checkouts and database searches - Social media platforms record posts, likes, shares, scroll time, ad impressions - Text messages transit through carrier servers with → Chapter 1: The Data All Around Us
Mira's proposal included:
A five-member board with two external members (a bioethicist and a patient advocate) - Review authority over all predictive analytics products and data sharing agreements - A confidential reporting channel for employees and clinicians - Integration of ethical risk assessment into VitraMed's product → Chapter 26: Building a Data Ethics Program
Misinformation (shared without intent to deceive):
Well-meaning individuals sharing preliminary studies that were later retracted or contradicted - Parents forwarding alarming but inaccurate claims about children and COVID because they wanted to protect their families - Users sharing satire or speculation that was misinterpreted as factual → Case Study: The Infodemic — COVID-19 Misinformation on Social Media
missing data
the systematic absence of data about marginalized populations, which renders those populations invisible to data-driven decision-making. → Chapter 32: Digital Divide, Data Justice, and Equity
Mood logs and journal entries
mental health data. Not covered by HIPAA (MindWell is not a covered entity). No sector-specific federal protection. - **Voice recordings** — biometric data (voiceprint) and potentially health data (voice analysis can reveal emotional states). Covered by BIPA in Illinois (voiceprint is explicitly lis → Quiz: Health Data, Genetic Data, and Biometric Privacy
Moral Machine experiment
an online platform that presented users with variations of the trolley problem applied to autonomous vehicles and asked them to choose which outcome they preferred. The experiment collected over 40 million decisions from users in 233 countries and territories, making it one of the largest studies of → Chapter 19: Autonomous Systems and Moral Machines
moral pluralism
the view that multiple ethical frameworks each capture genuine moral truths, and that practical ethics requires navigating among them. → Chapter 6: Ethical Frameworks for the Data Age
Morning (6:00 a.m. - 9:00 a.m.)
Sleep tracker records sleep stages, duration, heart rate, blood oxygen levels - Phone alarm logs wake time; screen usage tracking begins - Smart meter records electricity usage spike (shower, lights, coffee maker) - Bathroom scale (if smart) records weight, BMI, body fat percentage - Streaming music → Chapter 1: The Data All Around Us
Motor intention
what movements the user intends to make (the primary use case) - **Cognitive state** — attention levels, mental effort, fatigue - **Emotional indicators** — neural correlates of stress, frustration, excitement, sadness - **Potentially, thought patterns** — as resolution and decoding improve, neural → Case Study: Neuralink and Neural Data Governance

N

naive optimism
the belief that technology will solve its own problems, that market forces will produce ethical outcomes, that good intentions are sufficient. This textbook has, we hope, dismantled that belief thoroughly. The second is **sophisticated despair** — the belief that the problems are so deep, the power → Chapter 39: Designing Data Futures — Participation, Imagination, and Hope
New enforcement:
**California Privacy Protection Agency (CPPA):** The CPRA created a dedicated enforcement agency — the first state-level data protection authority in the United States. The CPPA was given investigative and enforcement powers, a dedicated budget, and a board appointed through a process designed to en → Case Study: California's CCPA/CPRA: The American Experiment
New rights:
**Right to Correct:** Consumers could request the correction of inaccurate personal information — a right the CCPA had lacked. - **Right to Limit Use of Sensitive Personal Information:** A new category of "sensitive personal information" (including precise geolocation, racial or ethnic origin, healt → Case Study: California's CCPA/CPRA: The American Experiment
non-consensual intimate imagery (NCII)
sexually explicit images or videos that realistically depict real people who did not consent to the creation or distribution of such content. Studies estimate that the vast majority (over 90%) of deepfake content online is non-consensual pornography, and the overwhelming majority of targets are wome → Chapter 18: Generative AI: Ethics of Creation and Deception

O

Obfuscation
described by Finn Brunton and Helen Nissenbaum in their 2015 book — refers to the deliberate production of misleading data to confuse surveillance systems. Browser extensions that generate random search queries, strategies for polluting advertising profiles, and community practices of data noise all → Chapter 5: Power, Knowledge, and Data
Open-source governance tools
Developers who create free, open-source tools for data governance — consent management platforms, algorithmic audit tools, cooperative management systems — making participatory governance infrastructure available to anyone. → Chapter 39: Designing Data Futures — Participation, Imagination, and Hope
Organizational challenges:
Appointment of a Data Protection Officer (required because VitraMed processes health data at scale) - Development of DPIA processes for high-risk processing activities - Staff training across the organization - Establishment of a relationship with an EU supervisory authority (lead authority under th → Chapter 20: The Regulatory Landscape: A Global Survey
Organizational ethics programs
the subject of Chapter 26 - **Professional norms** — ethical standards within data science, engineering, and product management communities - **Market pressure** — consumers choosing privacy-respecting products and services - **Civil society** — organizations like DataRights Alliance that advocate, → Chapter 25: Enforcement, Compliance, and the Limits of Law
Override frequency increasing
a signal that human decision-makers were compensating for model limitations rather than flagging them for review → Case Study: When Models Drift: Real-World Deployment Failures

P

pacing problem
is not an accident. It reflects structural features of how technology develops and how governance operates: → Chapter 38: Emerging Technologies and Anticipatory Governance
panopticon
Jeremy Bentham's 18th-century prison design in which a central watchtower allows a single guard to observe all prisoners, while the prisoners cannot tell whether they are being watched at any given moment. → Chapter 5: Power, Knowledge, and Data
Partnership on AI
*Focus:* Multi-stakeholder collaboration on AI's impact on society - *Members:* Technology companies, civil society organizations, academic institutions - *Notable work:* Research on fair, transparent, and accountable AI; guidelines for synthetic media - *Website:* [partnershiponai.org](https://www. → Appendix E: Privacy Tools and Resource Directory
persistent identification
the ability to know who you are, wherever you are, without your knowledge or consent. → Chapter 12: Health Data, Genetic Data, and Biometric Privacy
Phase 1: Foundation (Months 1-6)
Established the Executive Data Governance Council with CFO sponsorship - Hired a Data Governance Program Manager and two Data Quality Analysts - Conducted an initial data landscape assessment — inventorying all 87 databases - Defined the top-20 critical data elements (customer ID, account balance, t → Chapter 22: Data Governance Frameworks and Institutions
Phase 2: Operationalization (Months 7-18)
Appointed 12 Domain Data Stewards across business units - Implemented a commercial data catalog (Alation) and began populating it - Deployed automated data quality monitoring for the top-20 critical data elements - Established data quality dashboards visible to executive leadership - Developed and p → Chapter 22: Data Governance Frameworks and Institutions
Phase 3: Maturation (Months 19-36)
Expanded quality monitoring to cover 200+ data elements - Implemented data lineage tracking for regulatory reporting data flows - Achieved passing marks on regulatory data management exams - Measured ROI: $3.1 million in annual savings from reduced rework and regulatory findings - Began integrating → Chapter 22: Data Governance Frameworks and Institutions
Pillar 1: Privacy by Design (from Part 2)
All new products undergo a Data Protection Impact Assessment before development begins, not after deployment - Data minimization is the engineering default — every data field must be justified - Post-quantum encryption for all health data, implemented immediately - Dynamic consent model for continuo → Chapter 39: Designing Data Futures — Participation, Imagination, and Hope
Pillar 3: Community Engagement (from Parts 6-7)
A Patient Data Advisory Council — modeled on Eli's Community Data Council — composed of patients, healthcare workers, community health advocates, and privacy experts - Quarterly community meetings in the geographic areas where VitraMed's products are deployed - An annual health equity report, examin → Chapter 39: Designing Data Futures — Participation, Imagination, and Hope
Pillar 5: Organizational Culture (from Part 5)
Ethics training for all employees, not as annual compliance but as ongoing professional development - Protection for internal dissenters who raise ethical concerns (whistleblower protections) - Executive compensation tied in part to governance metrics (audit results, community feedback, equity outco → Chapter 39: Designing Data Futures — Participation, Imagination, and Hope
Platform responses:
Meta introduced "teen accounts" on Instagram in 2024, with restrictions on content recommendations, messaging, and notifications. - TikTok limited screen time for users under 18 to 60 minutes per day (with the ability to override). - YouTube established a separate YouTube Kids platform with curated → Chapter 35: Children, Teens, and Digital Vulnerability
pluralistic approach
different governance mechanisms for different contexts, informed by the nature of the data, the power dynamics involved, and the values at stake. → Chapter 3: Who Owns Your Data?
Policy Analyst (Technology/Data Governance)
Analyzes technology policy for government agencies, advocacy organizations, or international bodies - Drafts policy proposals, evaluates regulatory impact, and advises legislators - Requires knowledge of technology, law, and political processes - Sofia Reyes's career path at DataRights Alliance illu → Chapter 40: Your Responsibility — From Knowledge to Action
Potential leapfrogging opportunities:
**Skip the notice-and-consent paradigm.** Western data governance is built on individual notice and consent — a model that we've repeatedly identified as a Consent Fiction (Chapters 9, 31, 33). Global South countries developing data governance from scratch could design frameworks based on collective → Chapter 37: Global South Perspectives on Data Governance
prefigurative politics
the idea, articulated by movements from the Spanish anarchists to the civil rights movement to Occupy Wall Street, that the means of change should embody the ends. If you want a democratic data governance system, you must practice democratic data governance now, not after the revolution. → Chapter 39: Designing Data Futures — Participation, Imagination, and Hope
Premium discount determination
maps to healthcare/insurance domain; stakes are **high** because it directly affects the cost of essential coverage, and members with lower scores pay effectively more. - **Preventive care program enrollment** — maps to healthcare domain; stakes are **high** because access to preventive care can aff → Quiz: How Algorithms Shape Society
Principles:
Data governance decisions are made by the communities most affected by those decisions - Data is treated as a collective resource to be governed collectively, not as an individual asset to be managed through individual consent - Governance structures draw on community norms, cultural values, and exi → Chapter 37: Global South Perspectives on Data Governance
Prioritize findings
which issues are most urgent? Most impactful? 2. **Develop actionable recommendations** — specific, feasible changes the organization could implement 3. **Categorize recommendations** by timeline: - Immediate (0-3 months): quick fixes, policy changes - Medium-term (3-12 months): system redesign, gov → Capstone Project 1: Data Ethics Audit
PRISM
A program through which the NSA obtained data directly from the servers of nine major technology companies (Microsoft, Yahoo, Google, Facebook, PalTalk, YouTube, Skype, AOL, and Apple). The data included emails, chat messages, videos, photos, stored data, VoIP conversations, file transfers, and soci → Chapter 8: Surveillance: From Panopticon to Platform
Privacy characteristics of centralized models:
The government holds a database of who was near whom and when. - The data can potentially be repurposed for non-public-health purposes (law enforcement, immigration enforcement, political surveillance). - The system requires trust in the government's commitment to purpose limitation. - Contact data → Case Study: Privacy Norms in Crisis — COVID-19 Contact Tracing
Privacy characteristics of decentralized models:
No central database of contact events exists. - The government or public health authority never learns who was exposed to whom. - The system is resistant to repurposing because the data needed for surveillance is never collected. - Individual users retain control — they must actively choose to uploa → Case Study: Privacy Norms in Crisis — COVID-19 Contact Tracing
Privacy Engineer
Designs and implements privacy-preserving technologies: differential privacy, federated learning, consent management systems, data minimization architectures - Requires computer science background with specialization in privacy-enhancing technologies - Growing field as privacy-by-design requirements → Chapter 40: Your Responsibility — From Knowledge to Action
Privacy-focused browsers and search engines
Firefox with privacy extensions, Brave browser, DuckDuckGo search — reduce tracking by blocking third-party cookies, fingerprinting, and data collection. → Chapter 8: Surveillance: From Panopticon to Platform
private right of action
the ability of individuals to sue. Most privacy statutes rely on enforcement by government agencies (the FTC, state attorneys general, data protection authorities). Government enforcement is inherently limited by resources, political will, and competing priorities. → Chapter 12: Health Data, Genetic Data, and Biometric Privacy
proactive provenance
embedding information about a content's origin at the time of creation, rather than trying to detect it after the fact. → Chapter 18: Generative AI: Ethics of Creation and Deception
Prohibited secondary uses:
Decisions detrimental to individuals based on their health data - Advertising or marketing - Increasing insurance premiums - Developing products or services that could be harmful - Making decisions about employment based on health data → Chapter 24: Sector-Specific Governance: Finance, Health, Education
Proponents of centralization argued:
Public health authorities need identifiable data to conduct effective contact tracing — not just anonymous notifications. - Centralized systems allow epidemiologists to analyze contact patterns, identify super-spreader events, and allocate resources. - Democratic governments can be trusted with cont → Case Study: Privacy Norms in Crisis — COVID-19 Contact Tracing
Proponents of decentralization argued:
Centralized databases create irresistible targets for repurposing. History demonstrates that data collected for one purpose is routinely used for others. - Many populations — immigrants, political dissidents, marginalized communities — have well-founded reasons not to trust government databases. - A → Case Study: Privacy Norms in Crisis — COVID-19 Contact Tracing
provenance
establishing where content came from and how it was created — becomes a governance priority. → Chapter 18: Generative AI: Ethics of Creation and Deception
provider
it developed and trained the AI system. The city's transit authority is the **deployer** — it will operate the system in its transit network. Under the Act, TransitGuard bears obligations for conformity assessment, technical documentation, data governance, accuracy, and robustness. The transit autho → Quiz: The EU AI Act and Risk-Based Regulation
psychological safety
the confidence that raising concerns will not result in retaliation, marginalization, or career damage. Research by Amy Edmondson at Harvard Business School demonstrates that teams with high psychological safety report errors more quickly, learn more effectively, and make better decisions. → Chapter 26: Building a Data Ethics Program
Public advocacy
raising awareness of the threat through media campaigns, publications, and public events - **Diplomatic engagement** — participating in CCW discussions and lobbying governments to support a preemptive ban - **Academic and technical engagement** — partnering with AI researchers and ethicists to build → Case Study: Autonomous Weapons: The Campaign to Stop Killer Robots
public good
it is non-rivalrous (my enjoyment of privacy doesn't diminish yours) and, to a degree, non-excludable (the social benefits of a privacy-respecting culture accrue to everyone, not just those who actively protect their own privacy). → Chapter 11: The Economics of Privacy
public-key cryptography
the mathematical foundation of secure communication on the internet. When you connect to your bank's website, send an encrypted message, or authenticate a digital signature, you rely on algorithms (like RSA and elliptic curve cryptography) that are secure because classical computers cannot solve cer → Chapter 38: Emerging Technologies and Anticipatory Governance
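A minimal sketch of the signing operation the entry describes, using the widely available Python `cryptography` package (the key size and padding are illustrative defaults, not a recommendation):

```python
# RSA sign/verify sketch with the `cryptography` package. Security
# rests on the classical hardness of factoring the modulus; a
# sufficiently large quantum computer running Shor's algorithm
# would break it, which is the threat the entry points to.
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
message = b"transfer $100 to account 42"
signature = private_key.sign(message, padding.PKCS1v15(), hashes.SHA256())

# verify() raises InvalidSignature if the message or signature was
# tampered with; otherwise it returns None.
private_key.public_key().verify(signature, message,
                                padding.PKCS1v15(), hashes.SHA256())
print("signature verified")
```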
Purpose limitation
traffic sensors collect traffic data, not faces, not conversations, not license plate numbers. 2. **Edge processing** — data is processed at the sensor and only aggregate counts are transmitted. No raw video leaves the camera. 3. **Automatic deletion** — raw data is overwritten every 24 hours. Onl → Chapter 10: Privacy by Design and Data Minimization
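A minimal sketch of that edge-processing pattern, with invented class and field names: raw frames stay in a bounded on-device buffer, and only aggregate counts ever leave the sensor.

```python
# Edge-processing sketch: raw data lives only in a bounded on-device
# buffer (old frames are overwritten), and report() exposes aggregate
# counts only. All names are illustrative, not a real product's API.
from collections import deque

class EdgeTrafficSensor:
    def __init__(self, buffer_size: int = 86_400):  # ~24h at 1 frame/s
        self.raw_frames = deque(maxlen=buffer_size)

    def ingest(self, frame: dict) -> None:
        self.raw_frames.append(frame)  # oldest frame silently expires

    def report(self) -> dict:
        # Only this aggregate is transmitted; raw frames never leave.
        return {"vehicle_count": sum(f["vehicles"] for f in self.raw_frames)}

sensor = EdgeTrafficSensor()
sensor.ingest({"vehicles": 3})
sensor.ingest({"vehicles": 5})
print(sensor.report())  # {'vehicle_count': 8}
```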

R

**No-logs policy:** Choose providers that have been independently audited to verify they do not log user activity (e.g., Mullvad, Proton VPN, IVPN) - **Jurisdiction:** The provider's legal jurisdiction determines which government requests for data it must comply with - **Open-source:** Providers tha → Appendix E: Privacy Tools and Resource Directory
Reformed Governance Framework
not an abstract policy document but a concrete, implementable plan for how VitraMed should govern data going forward, incorporating everything she'd learned. → Chapter 39: Designing Data Futures — Participation, Imagination, and Hope
regulatory arbitrage
structuring activities to fall under the jurisdiction with the weakest requirements. → Chapter 24: Sector-Specific Governance: Finance, Health, Education
relationship of vulnerability
they are sick, they depend on the system for care, and they have limited power over how their data is used. - A care ethics perspective would ask not "does the aggregate benefit justify the risk?" (utilitarian) or "did the patient consent?" (deontological) but "are we taking responsibility for the v → Chapter 6: Ethical Frameworks for the Data Age
Representation
Who is in the data? Who is missing? What are the consequences of absence? 2. **Access** — Who can access and use the data system? What barriers exist? 3. **Benefit** — Who benefits from the data system? Are benefits equitably distributed? 4. **Harm** — Who is harmed? Are harms disproportionately bor → Chapter 32: Digital Divide, Data Justice, and Equity
Reputable providers (as of 2026):
**Mullvad VPN** — Swedish jurisdiction; accepts cash payment; independently audited; no email required for account creation - **Proton VPN** — Swiss jurisdiction; by the makers of ProtonMail; free tier available; independently audited - **IVPN** — Gibraltar jurisdiction; open-source; independentl → Appendix E: Privacy Tools and Resource Directory
Research centers:
Berkman Klein Center for Internet & Society (Harvard) - Center for Information Technology Policy (Princeton) - Citizen Lab (University of Toronto) - Future of Humanity Institute (Oxford) - Leverhulme Centre for the Future of Intelligence (Cambridge) - Montreal AI Ethics Institute → Appendix E: Privacy Tools and Resource Directory
Respect persons
treat data subjects as people, not resources 2. **Seek fairness** — examine for bias and refuse to accept disparate impact as acceptable 3. **Practice transparency** — make systems understandable to those affected 4. **Accept accountability** — do not deflect responsibility when harm occurs 5. **Con → Key Takeaways: Chapter 40 — Your Responsibility: From Knowledge to Action
Results after one year:
47 repositories cataloged - Subject access requests fulfillable in 5 business days (down from "unable to fulfill") - 8 data repositories identified as containing data past its retention period — subsequently deleted - 3 unauthorized data sharing agreements discovered and terminated - 1 data quality → Chapter 27: Data Stewardship and the Chief Data Officer
right not to know
the idea that individuals have a right *not* to receive information about their future health if they do not want it. VitraMed's current informed consent process does not address predictive findings. → Chapter 38: Emerging Technologies and Anticipatory Governance
Right to explanation
addresses the algorithmic data asymmetry (workers experience outputs but cannot see logic). 2. Aggregate earnings data = **Right to collective data** — addresses the earnings data asymmetry (workers know only their own earnings, preventing comparison and collective bargaining). 3. Human review for d → Quiz: Labor, Automation, and the Gig Economy
Risk-based classification:
*Unacceptable risk (prohibited)*: Social scoring by public authorities; subliminal manipulation; exploitation of vulnerabilities; real-time remote biometric identification in public spaces (with limited law enforcement exceptions) - *High risk*: AI in critical infrastructure, education, employment, → Appendix D: Legal Frameworks Reference -- Comparative Data Protection Law

S

Section 1: Project Overview
**Project name:** VitraMed Predictive Health Analytics (PHA) Platform v2.0 - **Processing purpose:** Predict patient risk for chronic conditions to enable preventive care interventions - **Legal basis:** Legitimate interest (patient health improvement); consent (for data sharing with insurance partn → Chapter 28: Privacy Impact Assessments and Ethical Reviews
Section 2: Necessity and Proportionality
Is the processing necessary to achieve the stated purpose? - Could the same purpose be achieved with less data, less intrusive methods, or without personal data? - Is the data collection proportionate to the benefit sought? - Have data minimization principles been applied? → Chapter 28: Privacy Impact Assessments and Ethical Reviews
Section 5: Stakeholder Consultation
Have data subjects or their representatives been consulted? (If not, why not?) - Has the Data Protection Officer been consulted? - Has the ethics committee reviewed this assessment? - Have technical security experts reviewed the safeguards? → Chapter 28: Privacy Impact Assessments and Ethical Reviews
Section 6: Decision and Sign-Off
[ ] Risks are acceptable after mitigation — proceed - [ ] Risks remain high — escalate to ethics committee/DPO - [ ] Risks cannot be mitigated — prior consultation with supervisory authority required - [ ] Risks cannot be mitigated — do not proceed → Chapter 28: Privacy Impact Assessments and Ethical Reviews
sector-specific governance
the specialized frameworks that govern data in finance, health, and education. These sectors present unique cross-border challenges: health data crossing borders for clinical trials and telemedicine, financial data flowing through global payment networks, and education data accompanying students acr → Key Takeaways: Chapter 23 — Cross-Border Data Flows and Digital Sovereignty
Sector-Specific Requirements
**HIPAA (health data):** Breach notification to affected individuals within 60 days. Breaches affecting 500+ individuals must be reported to HHS and the media. The HHS "Wall of Shame" publicly lists healthcare breaches. - **GLBA (financial data):** Financial institutions must notify customers of bre → Chapter 30: When Things Go Wrong: Breach Response and Crisis Ethics
several thousand
and that excludes the continuous background data collection by their phone's operating system, advertising trackers embedded in apps and websites, and the passive sensors (cameras, WiFi sniffers, smart meters) that Jordan passed without interaction. → Case Study: A Day in Data — Mapping Jordan's Digital Footprint
Shapley values
a method for fairly distributing the "payout" of a cooperative game among players — to feature attribution. → Chapter 16: Transparency, Explainability, and the Black Box Problem
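The game-theoretic idea is small enough to compute by brute force; a sketch with an invented three-player game, averaging each player's marginal contribution over all join orders:

```python
# Brute-force Shapley values for an invented three-player game.
# Each player's share is its marginal contribution to the coalition,
# averaged over all 3! = 6 orders in which players could join.
from itertools import permutations

players = ("A", "B", "C")
v = {frozenset(): 0, frozenset("A"): 1, frozenset("B"): 1,
     frozenset("C"): 2, frozenset("AB"): 4, frozenset("AC"): 3,
     frozenset("BC"): 3, frozenset("ABC"): 6}

shapley = {p: 0.0 for p in players}
for order in permutations(players):
    seen = frozenset()
    for p in order:
        shapley[p] += (v[seen | {p}] - v[seen]) / 6
        seen = seen | {p}

print(shapley)  # attributions sum to v(ABC) = 6, the total "payout"
```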
Singapore: Personal Data Protection Act (PDPA)
covers all personal data MediLink processes, including patient health data and payment information. - **India: Digital Personal Data Protection Act (DPDPA)** — covers personal data of Indian patients, with specific requirements for health data and cross-border transfers. - **United Kingdom: UK GDPR → Quiz: The Regulatory Landscape: A Global Survey
Six types of bias
historical, representation, measurement, aggregation, evaluation, and deployment — can enter at different stages of the ML pipeline. Each requires different interventions. - The **bias pipeline** traces how bias enters at every stage from problem formulation through deployment. Biases introduced ear → Chapter 14: Bias in Data, Bias in Machines
Skills Applied:
Identifying data types (personal, metadata, data exhaust, sensitive categories) in real-world scenarios - Tracing data through the lifecycle (collection, storage, processing, analysis, sharing, retention, deletion) - Analyzing stakeholder interests and power asymmetries in data flows - Distinguishin → Case Study: A Day in Data — Mapping Jordan's Digital Footprint
society
asking how data systems interact with the deepest challenges of our time: misinformation, inequality, labor transformation, environmental crisis, children's vulnerability, national security, and global justice. → Part 6: Society, Justice, and Emerging Frontiers
Sousveillance
a term coined by Steve Mann — refers to the monitoring of authorities *by* the public, rather than the monitoring of the public *by* authorities. Body cameras on police officers, citizen journalism documenting government actions, and leak platforms like WikiLeaks all represent forms of sousveillance → Chapter 5: Power, Knowledge, and Data
Sovereignty declaration
Who has authority over community data? > 2. **Governance body** — How is the community represented in governance decisions? > 3. **Consent mechanism** — How are data collection activities approved? > 4. **Purpose limitation** — What restrictions apply to data use and sharing? > 5. **Equity audit** — → Chapter 39: Designing Data Futures — Participation, Imagination, and Hope
splinternet
fragmentation of the global internet — is a growing concern as more countries implement digital sovereignty measures. → Chapter 23: Cross-Border Data Flows and Digital Sovereignty
stakeholder deliberation
a process in which the people affected by the system have a voice in determining what fairness means in their context. → Chapter 15: Fairness — Definitions, Tensions, and Trade-offs
Standard Contractual Clauses
pre-approved contractual templates adopted by the European Commission that impose GDPR-level data protection obligations on the data recipient. → Chapter 23: Cross-Border Data Flows and Digital Sovereignty
Stanford Institute for Human-Centered AI (HAI)
*Focus:* Advancing AI research, education, policy, and practice for the benefit of humanity - *Notable work:* Annual AI Index Report; research on AI governance, foundation models, and AI in healthcare - *Website:* [hai.stanford.edu](https://hai.stanford.edu) - *Relevance to textbook:* Chapters 18, 2 → Appendix E: Privacy Tools and Resource Directory
Step 1: Describe the Practice
What data is being collected, used, or shared? - What is the stated purpose? - Who are the data subjects? - What decisions will be made based on this data? → Chapter 26: Building a Data Ethics Program
Step 2: Stakeholder Mapping
Who benefits from this practice? - Who bears the risks? - Who has been consulted? - Who is absent from the decision-making process? - Are there vulnerable populations involved? → Chapter 26: Building a Data Ethics Program
Step 3: Apply each framework:
**Utilitarian:** What are the consequences? Who benefits, who is harmed, and by how much? - **Deontological:** What duties and rights are involved? Is anyone being treated merely as a means? - **Virtue ethics:** What would a person of practical wisdom do? What character traits does this situation de → Chapter 6: Ethical Frameworks for the Data Age
Step 3: Ethical Analysis (Multi-Framework)
*Consequences:* What are the best-case and worst-case outcomes? How likely is each? - *Rights and dignity:* Does this practice respect the autonomy and dignity of data subjects? Is consent meaningful? - *Fairness:* Would this practice be acceptable behind Rawls's veil of ignorance? Does it dispropor → Chapter 26: Building a Data Ethics Program
Step 5: Equivalents.
Transatlantic flights: 60.32 / 1600 = 0.04 flights (a small fraction) - Car miles: 60.32 / 0.21 = 287 miles - Household electricity: 60.32 / 4200 = 0.014 years (about 5 days) - Trees to offset: 60.32 / 22 = 2.7 trees for one year → Chapter 34: Environmental Data Ethics and Climate
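The arithmetic above reduces to one division per equivalence; as code (assuming, from context, that the 60.32 figure and the divisors above are in kg CO2e):

```python
# The Step 5 conversions as code. The per-unit factors are the
# divisors used above; the kg CO2e units are assumed from context.
emissions_kg = 60.32
factors = {
    "transatlantic flights": 1600,           # kg CO2e per flight
    "car miles": 0.21,                       # kg CO2e per mile
    "years of household electricity": 4200,  # kg CO2e per year
    "tree-years to offset": 22,              # kg CO2e per tree per year
}
for name, per_unit in factors.items():
    print(f"{emissions_kg / per_unit:,.2f} {name}")
# 0.04 flights, 287.24 car miles, 0.01 years (~5 days), 2.74 trees
```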
Step 5: Mitigation and Monitoring
What safeguards will be implemented? - How will the practice be monitored for unintended consequences? - What triggers a re-review? → Chapter 26: Building a Data Ethics Program
Steps, heart rate, calories burned
personal health/biometric data (clearly personal, as it relates to an identifiable individual via their watch assignment) - **Sleep duration** — personal health data (clearly personal and sensitive, as it reveals behavioral patterns) - **GPS location on campus** — personal location data (clearly per → Quiz: The Data All Around Us
Strategic challenges:
Whether to establish EU-based data processing infrastructure or rely on cross-border transfer mechanisms - How to reconcile HIPAA compliance (US operations) with GDPR compliance (EU operations) where requirements conflict - Whether the cost of compliance — estimated at $1.2 million in the first year → Chapter 20: The Regulatory Landscape: A Global Survey
Streak mechanism
This is a variable reward schedule combined with loss aversion. The streak creates an artificial cost to not using the app each day, exploiting children's fear of losing accumulated progress. It functions as a trigger in Fogg's model. 2. **Confirmshaming** — "Quit now? Your streak will be sad" uses → Quiz: The Attention Economy
Strengthened provisions:
The definition of "sharing" was expanded to cover cross-context behavioral advertising, closing a loophole that had allowed data exchanges to escape the "sale" opt-out. - Purpose limitation and data minimization principles were added, requiring businesses to collect only personal information that wa → Case Study: California's CCPA/CPRA: The American Experiment
Strengths in this context:
Produces a diverse student body that "looks like" the applicant pool - Addresses the historical exclusion of underrepresented groups - Aligns with the educational mission of exposure to diverse perspectives → Case Study: Fairness in College Admissions Algorithms
Strengths of LIME:
Works with any model (model-agnostic) - Produces intuitive, feature-attribution explanations - Can explain individual predictions in concrete terms → Chapter 16: Transparency, Explainability, and the Black Box Problem
Strengths of SHAP:
Mathematically grounded in a framework with desirable properties (consistency, local accuracy, efficiency) - Can provide both local explanations (for individual predictions) and global explanations (by aggregating local Shapley values) - More theoretically principled than LIME — Shapley values have → Chapter 16: Transparency, Explainability, and the Black Box Problem
Strengths:
Provides a clear, systematic decision procedure - Focuses on real-world consequences, not abstract principles - Demands consideration of all affected parties - Well-suited to policy analysis where trade-offs are explicit → Chapter 6: Ethical Frameworks for the Data Age
Structural concerns:
**One-sided proceedings.** In a traditional court, both sides present arguments. In the FISC, only the government appears. There is no adversary to challenge the government's claims, question its evidence, or present alternative interpretations of the law. - **Approval rate.** Between 1979 and 2023, → Chapter 36: National Security, Intelligence, and Democratic Oversight
Structural features:
**Eleven judges** serve on the FISC, appointed by the Chief Justice of the Supreme Court from among sitting federal district court judges. - **Non-adversarial proceedings.** Only the government presents arguments. The surveillance target is not represented, does not know the proceeding exists, and c → Case Study: The FISA Court — Secret Justice
Structural limitations:
Fact-checking is inherently reactive — it can only address claims that have already spread - The scale mismatch is enormous: fact-checking organizations employ hundreds of people; platforms generate billions of pieces of content - Fact-checking organizations may be perceived as politically biased, r → Chapter 31: Misinformation, Disinformation, and Platform Governance
Structure:
**Aether Committee:** A cross-company advisory committee with working groups on fairness, reliability and safety, privacy and security, inclusiveness, transparency, and accountability - **Office of Responsible AI:** An operational team that develops and implements responsible AI governance processes → Chapter 26: Building a Data Ethics Program
Suggested Issues:
Should your city adopt a facial recognition moratorium? - Should your state pass a comprehensive data privacy law? What should it include? - How should your university govern student learning analytics data? - Should there be a federal algorithmic accountability law in the United States? - How shoul → Capstone Project 2: Policy Brief
Suggested Systems:
Your university's learning management system (LMS) and its data practices - A ride-sharing or delivery platform's data collection and algorithmic management - A municipal surveillance system (cameras, sensors, license plate readers) - A health app's data practices (fitness tracker, menstrual trackin → Capstone Project 1: Data Ethics Audit
Sunset clauses
regulations that automatically expire after a set period unless affirmatively renewed, forcing periodic reassessment - **Monitoring requirements** — mandating that deployers of emerging technologies continuously track impacts and report to regulators - **Trigger mechanisms** — defining thresholds (e → Chapter 38: Emerging Technologies and Anticipatory Governance
Systemic costs:
Erosion of public trust in digital services - Increased costs of security for the entire industry (each breach raises the baseline) - Identity theft and fraud affecting individuals for years after the initial breach → Chapter 11: The Economics of Privacy
Systemic lessons:
**Third-party risk is your risk.** Target was breached through a vendor. Organizations must extend their security perimeter to include all third parties with network access. - **Alerts without action are useless.** Target's monitoring system worked — it detected the malware. The organizational proce → Chapter 30: When Things Go Wrong: Breach Response and Crisis Ethics

T

Technical challenges:
Data architecture redesign to implement GDPR-required data minimization, purpose limitation, and storage limitation - Implementation of consent management platforms for EU data subjects - Development of data portability capabilities (Article 20) - Establishment of data processing records (Article 30 → Chapter 20: The Regulatory Landscape: A Global Survey
technological determinism
the belief that technology develops according to its own internal logic and that society must adapt to whatever technology produces. → Chapter 38: Emerging Technologies and Anticipatory Governance
Technology companies
initially embarrassed by their cooperation with PRISM — began implementing stronger encryption, including end-to-end encryption for messaging services, and publishing transparency reports detailing government data requests. - **International relations** were strained. Germany was particularly angere → Case Study: The Snowden Revelations and Mass Surveillance
Teleperformance
a French company operating in 91 countries - **Majorel** (now part of Teleperformance) - **Samasource** (now Sama) — operating in Kenya and Uganda → Case Study: Content Moderation at Scale — The Human Cost
TEMPORA
The UK's Government Communications Headquarters (GCHQ) program that tapped over 200 fiber optic cables, each carrying data at 10 gigabits per second, and stored the content for three days and the metadata for thirty days — creating a rolling buffer of virtually all internet traffic passing through → Chapter 8: Surveillance: From Panopticon to Platform
Ten fingerprints
all ten fingers, both hands - **Two iris scans** — both eyes - **A facial photograph** → Case Study: Aadhaar — India's Digital Identity Experiment
Tensions in this context:
If average academic preparation differs across groups (due to school quality, not ability), admitting equal proportions may mean admitting some students who are less prepared — potentially setting them up for difficulty if the university does not invest in support programs - The base rate data shows → Case Study: Fairness in College Admissions Algorithms
The agent
Who creates and distributes the content? Individuals, organized groups, states, automated bots? 2. **The message** — What form does it take? Fabricated content, manipulated images, misleading headlines, imposter sites? 3. **The interpreter** — How does the audience receive and reinterpret it? What p → Chapter 31: Misinformation, Disinformation, and Platform Governance
The case against (aggregate costs):
**Chilling effects on public expression.** If anyone can be identified from any photograph, participation in public life — attending protests, political rallies, religious gatherings, or simply walking down the street — becomes a potentially surveilled activity. The chilling effect on First Amendmen → Case Study: Should Clearview AI Exist? An Ethical Analysis
The case against Aadhaar:
**Surveillance infrastructure.** Aadhaar creates a comprehensive identification system that can be used to track individuals across services. When linked to bank accounts, mobile phones, tax records, and government services, Aadhaar enables a level of state surveillance that would be technically imp → Chapter 37: Global South Perspectives on Data Governance
The case for Aadhaar:
**Financial inclusion.** Aadhaar has enabled hundreds of millions of previously "unbanked" Indians to open bank accounts, receive direct benefit transfers, and access formal financial services. - **Benefit delivery efficiency.** Direct benefit transfers linked to Aadhaar have reduced corruption and → Chapter 37: Global South Perspectives on Data Governance
The centralized model's information flow:
**Type:** Proximity data (who was near whom, when, for how long) — far more granular than traditional contact tracing's "who were you with?" - **Subject:** All app users, not just diagnosed individuals - **Sender:** Smartphone (automatically, continuously) - **Recipient:** Government health authorit → Case Study: Privacy Norms in Crisis — COVID-19 Contact Tracing
The concerns:
**Consent complexity.** Customers must understand and manage which third parties have access to their financial data — adding to consent fatigue (Chapter 9) - **Security risks.** More parties with access to data means a larger attack surface - **Inequality.** Sophisticated consumers may benefit from → Chapter 24: Sector-Specific Governance: Finance, Health, Education
The decentralized model's information flow:
**Type:** Anonymous rotating Bluetooth identifiers - **Subject:** All app users — but anonymously - **Sender:** Smartphone (automatically) - **Recipient:** Other nearby smartphones (no central recipient) - **Transmission principle:** Exposure notification without identification → Case Study: Privacy Norms in Crisis — COVID-19 Contact Tracing
The EU Digital Services Act (2022)
includes provisions requiring platforms to protect minors, drawing on AADC principles. - **Australia's Online Safety Act (2021)** — established online safety expectations for social media services, informed by the AADC model. → Case Study: The UK Age Appropriate Design Code in Practice
The evidence is promising:
Google's "prebunking" campaign, tested in collaboration with researchers at Cambridge and Bristol universities, showed short videos explaining manipulation techniques (emotional language, scapegoating, false dichotomies) to millions of users on YouTube. The intervention increased users' ability to i → Chapter 31: Misinformation, Disinformation, and Platform Governance
The Foreign Intelligence Surveillance Court (FISC)
a secret court composed of 11 federal judges, appointed by the Chief Justice of the United States, authorized to approve surveillance orders for foreign intelligence purposes. - **Procedural requirements** — applications for surveillance orders must demonstrate probable cause that the target is a "f → Chapter 36: National Security, Intelligence, and Democratic Oversight
The mean
virtues lie between extremes. Courage is between cowardice and recklessness. Transparency is between secrecy and oversharing. → Chapter 6: Ethical Frameworks for the Data Age
The moratorium model
banning a technology until governance is in place (as several cities have done with facial recognition) - **The default-deny approach** — requiring affirmative approval before a data-intensive technology may be deployed → Chapter 38: Emerging Technologies and Anticipatory Governance
The Naive Technicist
Believes data is neutral, technology is progress, and the right technical solution will resolve any ethical concern. This was Mira at the beginning — trusting the system because the system produced useful outputs. The naive technicist is not malicious. They are simply unaware of the social dimension → Chapter 40: Your Responsibility — From Knowledge to Action
The Power Asymmetry
Who collects data, who is collected upon, and who decides? 2. **The Consent Fiction** — When is consent meaningful, and when is it theater? 3. **The Accountability Gap** — When data systems cause harm, who is responsible? 4. **The VitraMed Thread** — How do data ethics challenges compound as an orga → Part 1: Foundations — Data, Power, and the Digital Self
The Pragmatic Insider
Works within institutions to push them toward better practices, accepting incremental progress over revolutionary change. This is Ray Zhao — genuinely trying to build ethical data governance at NovaCorp while navigating budget constraints, board pressure, and competitive dynamics. The pragmatic insi → Chapter 40: Your Responsibility — From Knowledge to Action
The Principled Practitioner
Brings ethical analysis to technical work, building governance considerations into design from the beginning, speaking up when systems cause harm, and maintaining standards even when it is costly. This is what Mira has become. → Chapter 40: Your Responsibility — From Knowledge to Action
The promises:
Increased competition and innovation in financial services - Better products for consumers (e.g., apps that aggregate accounts across multiple banks, automated savings tools, price comparison services) - Reduced switching costs — customers can more easily move between banks when their data is portab → Chapter 24: Sector-Specific Governance: Finance, Health, Education
The recognition that Indigenous knowledge systems
including oral histories, ecological knowledge, and cultural practices — constitute data that requires governance according to Indigenous values and protocols, not Western intellectual property frameworks. → Chapter 32: Digital Divide, Data Justice, and Equity
The Righteous Critic
Identifies every flaw, challenges every system, distrusts every institution. This was Eli at the beginning — so clear-eyed about injustice that he sometimes couldn't see pathways to change. The righteous critic is essential for diagnosis but insufficient for treatment. → Chapter 40: Your Responsibility — From Knowledge to Action
The Strategic Advocate
Combines systemic critique with strategic action, building coalitions, designing alternatives, and applying pressure where it matters. This is Sofia Reyes — and it is what Eli has become. The strategic advocate sees the system as it is and works to change it. → Chapter 40: Your Responsibility — From Knowledge to Action
The structural view
which both Eli and Sofia found most persuasive — holds that the total number of jobs matters less than the *distribution* of benefits. Even if generative AI creates aggregate economic growth, the gains may be concentrated among AI companies, their investors, and highly skilled workers who use AI as → Chapter 18: Generative AI: Ethics of Creation and Deception
This is a Python chapter
Parts B and C include programming tasks using the `DataQualityAuditor` class. Estimated completion time: 4-5 hours. → Exercises: Data Governance Frameworks and Institutions
Three governance strategies
the precautionary principle, adaptive governance, and regulatory sandboxes — offer complementary approaches to governing under uncertainty. → Chapter 38: Emerging Technologies and Anticipatory Governance
Tier 1: Executive Data Governance Council
**Composition:** CDO (chair), CIO, CISO, CFO, heads of major business units, Chief Privacy Officer, General Counsel - **Frequency:** Quarterly - **Authority:** Sets data strategy, approves policies, allocates resources, resolves escalated disputes - **Accountability:** Reports to the CEO and/or boar → Chapter 22: Data Governance Frameworks and Institutions
Tier 2: Data Governance Working Committee
**Composition:** Data governance program manager (chair), domain data stewards, representatives from IT, legal, compliance, and key business units - **Frequency:** Monthly - **Authority:** Develops and implements policies approved by Tier 1, manages data quality programs, oversees data stewardship - → Chapter 22: Data Governance Frameworks and Institutions
Tier 3: Domain Data Stewardship Teams
**Composition:** Subject matter experts within specific data domains (customer data, financial data, product data, etc.) - **Frequency:** Weekly or bi-weekly - **Authority:** Manage data quality within their domain, define business rules, maintain data definitions, resolve operational data issues - → Chapter 22: Data Governance Frameworks and Institutions
Tier 3: Institutional Wellness Insights (optional)
Covers: contributing aggregated, de-identified mood and engagement data to Ashbrook's Student Success Initiative. - Data practices: explained in plain language, including: what "aggregate" means, what the Student Success Initiative will see, and an explicit statement that no individual-level data wi → Quiz: Data Collection and Consent
Tier 4: Research Participation (optional)
Covers: sharing de-identified data with named research partners for approved studies. - Data practices: the specific research partner (including any corporate sponsors) is disclosed by name. The research purpose is described. The de-identification method is explained. Students can consent to specifi → Quiz: Data Collection and Consent
Trace the bias pipeline
where in the system did the bias likely enter? 6. **Check for feedback loops** — does the system's output influence its future input? 7. **Consider the human consequences** — what does this bias mean for real people in real situations? → Chapter 14: Bias in Data, Bias in Machines
trolley problem
the thought experiment in which a runaway trolley is headed toward five people, and you can divert it to a side track where it will kill one person instead. Should you act? → Chapter 19: Autonomous Systems and Moral Machines

U

UDHR Article 12
one paragraph, foundational 2. **OECD Privacy Guidelines** — the eight principles that started it all 3. **GDPR Articles 1-11** — the core principles of the most influential modern law 4. **CARE Principles** — a concise alternative framework 5. **Rawls's veil of ignorance** (from secondary sources → Appendix C: Primary Sources Guide — Annotated Key Documents
Underrepresented communities generate less data
fewer clicks, fewer searches, fewer digital transactions. 2. **Algorithms trained on incomplete data perform worse** for these communities — less accurate recommendations, less relevant services, more biased predictions. 3. **Worse algorithmic performance reduces the value** of digital services for → Chapter 32: Digital Divide, Data Justice, and Equity
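The three steps above close into a self-reinforcing loop. A stylized simulation makes the dynamic visible; the functional form and coefficients are illustrative assumptions, not empirical estimates:

```python
# Stylized feedback loop: communities that generate less data get worse
# models, find the service less useful, use it less, and so generate even
# less data. The coefficients below are illustrative assumptions.
def step(data):
    accuracy = data / (data + 5.0)       # model quality saturates with data
    return data * (1 + 0.5 * accuracy)   # usage, and new data, track quality

majority, minority = 10.0, 1.0           # assumed initial data volumes
for year in range(1, 6):
    majority, minority = step(majority), step(minority)
    print(f"year {year}: data ratio = {majority / minority:.1f}")
# The ratio grows each year: an initial data gap widens on its own.
```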
Uniqueness issues:
Patient `P003` (Carla Davis) appears twice — a duplicate record. This could lead to fragmented medical histories, duplicate billing, or conflicting treatment records. → Chapter 22: Data Governance Frameworks and Institutions
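Uniqueness checks of this kind are straightforward to automate. A minimal sketch using pandas; the column names and records are illustrative and are not the chapter's actual dataset:

```python
import pandas as pd

# Illustrative records; P003 appears twice, as in the example above.
patients = pd.DataFrame({
    "patient_id": ["P001", "P002", "P003", "P003"],
    "name": ["Ana Brooks", "Ben Cho", "Carla Davis", "Carla Davis"],
})

# A uniqueness audit flags every patient_id that occurs more than once.
dupes = patients[patients.duplicated(subset="patient_id", keep=False)]
print(dupes)
# Downstream risk: fragmented histories, duplicate billing, conflicting
# treatment records; flagged IDs should be routed to a steward for merging.
```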
Upstream collection
The NSA's tapping of undersea fiber optic cables to collect communications data in transit. This program captured communications not just of foreign targets but of millions of Americans whose data happened to flow through monitored infrastructure. The technical term for this is "incidental collectio → Chapter 8: Surveillance: From Panopticon to Platform
User interaction signals (strongest weight):
Which videos you watch to completion (completion rate is a critical metric) - Which videos you re-watch - Which videos you share, like, comment on, or save - Which videos you skip within the first second (a strong negative signal) - How long you pause on a video before scrolling - Whether you follow → Case Study: TikTok's Recommendation Algorithm
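A heavily simplified sketch of how signals like these might be combined into a single ranking score. The linear form and every weight below are invented for illustration; the production system is a learned model, not a hand-tuned formula:

```python
# Hypothetical linear scoring over the interaction signals listed above.
# All weights are invented for illustration only.
WEIGHTS = {
    "completed": 4.0,      # watched to the end (strong positive)
    "rewatched": 3.0,
    "shared": 2.5,
    "saved": 2.0,
    "commented": 1.5,
    "liked": 1.0,
    "skipped_fast": -4.0,  # skipped within the first second (strong negative)
}

def score(signals: dict[str, bool]) -> float:
    """Sum the weights of whichever signals fired for this video."""
    return sum(w for name, w in WEIGHTS.items() if signals.get(name))

print(score({"completed": True, "liked": True}))   # 5.0
print(score({"skipped_fast": True}))               # -4.0
```

Even this toy version shows why completion rate dominates: a single completion outweighs several weaker positive signals combined.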

V

Variable reward schedules
the unpredictable timing of likes, comments, and notifications — exploit the same neurological mechanisms targeted by gambling products. When applied to adolescent users, these mechanisms interact with developmental vulnerability in ways that adult users may not experience. → Chapter 35: Children, Teens, and Digital Vulnerability
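The mechanism named here is the variable-ratio reinforcement schedule from operant conditioning. A minimal simulation, with an arbitrary assumed reward probability, shows why the spacing between rewards is unpredictable:

```python
import random

random.seed(1)

# Variable-ratio schedule: each check of the app is rewarded (a like, a
# comment, a notification) with probability p, so the gap between rewards
# is unpredictable. p is an arbitrary illustrative value.
p = 0.3
gaps, since_last = [], 0
for _ in range(1000):
    since_last += 1
    if random.random() < p:
        gaps.append(since_last)
        since_last = 0

print("mean checks between rewards:", sum(gaps) / len(gaps))
print("min/max gap:", min(gaps), max(gaps))  # highly variable spacing
```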
veil of ignorance
not knowing your own position in society. You don't know whether you'll be rich or poor, black or white, data-literate or not, healthy or sick. → Chapter 6: Ethical Frameworks for the Data Age
Very Large Online Platforms (VLOPs)
those with more than 45 million monthly active EU users — must additionally: - Conduct annual systemic risk assessments covering the risks of dissemination of illegal content, impacts on fundamental rights, and impacts on civic discourse and electoral processes - Implement risk mitigation measures a → Case Study: Section 230 vs. the EU DSA — Two Approaches to Platform Liability
Video information signals (moderate weight):
Captions, hashtags, and sounds used - Content of the video itself (analyzed by computer vision and natural language processing) - Trending status of sounds and effects → Case Study: TikTok's Recommendation Algorithm
VitraMed
Health-tech startup founded by Mira's father, Vikram Chakravarti - Started as an electronic health records (EHR) optimization tool for small clinics - Growing into a predictive health analytics platform - Timeline across the book: → Continuity Tracking Document

W

What businesses know that residents do not:
Businesses choose to install cameras and consent to the partnership. Residents of the neighborhood — who are surveilled as a consequence — have no equivalent choice. → Case Study: Detroit's Project Green Light — Surveillance and Community Power
What critics note:
**Limited scope.** Model cards are published for selected services, not all AI products. The most controversial applications may lack documentation. - **Static documentation.** Model cards represent a snapshot at publication time. If the model is updated, the card may lag behind, creating a gap betw → Chapter 29: Responsible AI Development
What Google's model cards do well:
**Disaggregated performance reporting.** Google's face detection model card reports performance separately by skin tone (using the Fitzpatrick skin type scale), age, and gender — revealing, for example, that detection accuracy is lower for darker skin tones. - **Explicit intended use and limitations → Chapter 29: Responsible AI Development
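Disaggregated reporting of this kind can be reproduced for any classifier. A minimal sketch with synthetic evaluation results; the subgroup labels and values below are illustrative, not Google's data:

```python
import pandas as pd

# Synthetic per-example evaluation results; columns and values are
# illustrative only, grouped by assumed Fitzpatrick skin-type bands.
results = pd.DataFrame({
    "skin_tone": ["I-II", "I-II", "III-IV", "III-IV", "V-VI", "V-VI"],
    "correct":   [True,   True,   True,     False,    True,   False],
})

# A model card's disaggregated table is just accuracy grouped by subgroup,
# so weaker performance for any group is surfaced rather than averaged away.
print(results.groupby("skin_tone")["correct"].mean())
```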
What remains uncertain:
**Causation vs. correlation.** Most studies are correlational. It is possible that the causal arrow runs in the opposite direction: adolescents who are already depressed or anxious may use social media more as a coping mechanism. Longitudinal studies offer some evidence of bidirectional effects, but → Chapter 35: Children, Teens, and Digital Vulnerability
What the evidence shows:
Fact-check labels reduce the likelihood of sharing labeled content by approximately 10-25% (Clayton et al., 2020) - Corrections are more effective when they come from sources the audience considers credible (Walter et al., 2020) - The "continued influence effect" means that even after correction, in → Chapter 31: Misinformation, Disinformation, and Platform Governance
What the police know that the public does not:
The full extent of facial recognition use, including how often it is deployed, what databases are searched, and the accuracy rates for different demographic groups - The criteria analysts use to flag individuals or incidents as suspicious - How long footage and associated data (facial recognition re → Case Study: Detroit's Project Green Light — Surveillance and Community Power
What the public knows:
That cameras exist (the green light makes this visible) - The general claim that the program reduces crime - Limited statistical data released by the police department → Case Study: Detroit's Project Green Light — Surveillance and Community Power
Who Owns Your Data?
moves from history to one of the most contested questions in the contemporary data landscape: the question of ownership. Who has rights over the data generated by your body, your behavior, your creative work, and your digital life? The answers depend on legal tradition, data type, and theory of owne → Chapter 2: Key Takeaways

X

XKeyscore
A search system that allowed NSA analysts to search through vast databases of collected emails, chats, and browsing histories. One NSA training slide boasted that XKeyscore covered "nearly everything a typical user does on the internet." Analysts could search by name, email address, IP address, lang → Chapter 8: Surveillance: From Panopticon to Platform