Glossary

488 terms from AI & Machine Learning for Business

# A B C D E F G H I J K L M N O P Q R S T U V W Y Z

#

"primarily mid-market accounts"
No breakdown of customer additions by segment in the source data. 4. **"a 12% increase from the prior year"** — R&D spending percentage is given (7.4% of revenue) but the year-over-year R&D growth rate is not in the source data. 5. **"supporting the launch of three new AI-powered product features"** → Answers to Selected Exercises
"The Hype-Reality Gap"
Distinguishing genuine AI capability from marketing hype 2. **"Human-in-the-Loop"** — The boundary between human judgment and algorithmic recommendation 3. **"Data as a Strategic Asset"** — Data quality, provenance, and governance underpin every AI system 4. **"The Build-vs-Buy Decision"** — Custom → AI & Machine Learning for Business — Continuity Tracker
$29,300
**0.5:** (60 x $480) + (80 x -$20) + (40 x -$500) = $28,800 - $1,600 - $20,000 = **$7,200** - **0.7:** (30 x $480) + (20 x -$20) + (70 x -$500) = $14,400 - $400 - $35,000 = **-$21,000** → Answers to Selected Exercises
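The threshold arithmetic above can be reproduced with a small helper. The per-outcome dollar values ($480, -$20, -$500) come from the exercise; the interpretation of the three count buckets (e.g. true positives, false positives, missed cases) is an assumption, since the entry does not label them.

```python
# Sketch reproducing the answer key's expected-profit arithmetic.
# Payoffs are from the exercise; the meaning of each count bucket is assumed.
PAYOFFS = (480, -20, -500)  # dollars per outcome type

def expected_profit(counts, payoffs=PAYOFFS):
    """Sum of count * payoff across the three outcome types."""
    return sum(c * p for c, p in zip(counts, payoffs))

print(expected_profit((60, 80, 40)))  # threshold 0.5 -> 7200
print(expected_profit((30, 20, 70)))  # threshold 0.7 -> -21000
```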
(a)
Precision = TP / (TP + FP) = 180 / (180 + 1,000) = 180 / 1,180 = 15.3%. → Chapter 11 Quiz: Model Evaluation and Selection
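The same calculation, directly from the confusion counts given in the answer:

```python
# Precision = TP / (TP + FP) with the counts from this answer.
tp, fp = 180, 1_000
precision = tp / (tp + fp)
print(f"{precision:.1%}")  # 15.3%
```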
(a) True
A stationary series has constant statistical properties (mean, variance, autocorrelation structure) over time. → Chapter 16 Quiz: Time Series Forecasting
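A rough way to illustrate the idea (this is an informal check, not a formal stationarity test such as ADF): a stationary series should show a similar mean and variance in its first and second halves. The `flat` and `trend` series below are hypothetical examples.

```python
import statistics

def halves_summary(series):
    """Return (mean, stdev) for each half of the series."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return ((statistics.mean(first), statistics.stdev(first)),
            (statistics.mean(second), statistics.stdev(second)))

flat = [5, 6, 5, 4, 5, 6, 5, 4]   # roughly constant mean and variance
trend = [1, 2, 3, 4, 5, 6, 7, 8]  # mean shifts between halves

print(halves_summary(flat))   # halves look alike: consistent with stationarity
print(halves_summary(trend))  # second-half mean is much higher: nonstationary
```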
(b)
The model may simply be predicting the majority class for all instances, which would also achieve 96% accuracy. Additional metrics (precision, recall, F1, AUC) are needed. → Chapter 11 Quiz: Model Evaluation and Selection
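A quick demonstration of why accuracy alone can mislead, using a hypothetical 4% positive rate: a model that always predicts the majority class hits 96% accuracy with zero recall.

```python
# Majority-class baseline: 96% accuracy, 0% recall on a 96/4 class split.
n, positives = 1_000, 40                      # hypothetical 4% positive rate
preds = [0] * n                               # "always predict negative"
labels = [1] * positives + [0] * (n - positives)

accuracy = sum(p == y for p, y in zip(preds, labels)) / n
recall = sum(p == 1 and y == 1 for p, y in zip(preds, labels)) / positives
print(accuracy, recall)  # 0.96 0.0
```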
(b) False
The MA component in ARIMA models the relationship between the current value and *past forecast errors* (residuals), not a simple moving average of past values. → Chapter 16 Quiz: Time Series Forecasting
(c)
In medical screening for serious diseases, missing a true case (false negative) can be life-threatening. High recall ensures most cases are detected, even at the cost of some false positives that can be resolved through follow-up testing. → Chapter 11 Quiz: Model Evaluation and Selection
(d)
The chapter discusses AUC, expected profit, interpretability, compute costs, and operations team trust. F1 score comparison was not cited as a reason for choosing Model B. → Chapter 11 Quiz: Model Evaluation and Selection
1. Churn prediction model.
*Function:* Predicts which customers are likely to stop purchasing, triggering retention campaigns (personalized offers, outreach) - *EU AI Act classification:* **Minimal risk.** The model predicts customer behavior for marketing purposes. It does not make consequential decisions about individuals. → Chapter 28: AI Regulation — Global Landscape
1. Compute Costs
**Training compute:** GPU/CPU time for training and retraining models. Costs scale with model complexity, dataset size, and retraining frequency. - **Inference compute:** CPU/GPU time for serving predictions. For real-time models, this cost scales directly with traffic. For batch models, it scales w → Chapter 12: From Model to Production — MLOps
1. Data Costs
Data acquisition (purchasing third-party data, API costs) - Data labeling (manual labeling, crowdsourcing, active learning) - Data storage (cloud storage, data warehouses) - Data quality (cleaning, validation, deduplication) → Chapter 6: The Business of Machine Learning
1. Detect
An alert fires or a stakeholder reports anomalous behavior. Determine: Is this a data issue, a model issue, or an infrastructure issue? → Chapter 12: From Model to Production — MLOps
1. Job displacement fear.
*Mitigation:* Be honest about what will change. Invest in reskilling. Create transition plans. Identify new roles that AI creates, not just roles it modifies. Recall from Chapter 38 that every major technology transition has created more jobs than it destroyed — but the transition period is real and → Chapter 39: Capstone — AI Transformation Plan
1. Predictive Performance
AUC, F1, expected profit at optimal threshold. This is necessary but not sufficient. A model that does not predict well cannot create business value. But a model that predicts well while failing on the other dimensions can still destroy value. → Chapter 11: Model Evaluation and Selection
1. Product Description Generator
Template with placeholders for product name, category, features, price, and available colors/sizes - Brand voice guidelines embedded in the prompt - Few-shot examples of approved descriptions - Temperature: 0.6 (enough variety for A/B testing) → Chapter 19: Prompt Engineering Fundamentals
1. Technical risks:
Model performance falls short of requirements - Data quality issues discovered after deployment - Integration with legacy systems proves more complex than expected - Vendor lock-in limits future flexibility - *Mitigation:* POC-first approach. Modular architecture. Multi-cloud strategy. Rigorous data → Chapter 39: Capstone — AI Transformation Plan
1. The Optimizer
Uses AI primarily for cost reduction and efficiency. Deploys predictive maintenance, process automation, and yield optimization. Captures incremental value from existing operations. Most common archetype; lowest strategic risk but also lowest upside. → Chapter 31: AI Strategy for the C-Suite
1.931
identical DCG in this case because positions 1 and 3 have symmetric contributions in this particular configuration. However, in general, ranking the most relevant items earlier is always preferred because the logarithmic discount penalizes lower positions more heavily. → Answers to Selected Exercises
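The logarithmic discount mentioned here can be sketched as follows. The relevance lists are hypothetical (the exercise's actual data is not given in this entry); the point is that earlier positions are discounted less, so ranking the most relevant item last lowers DCG.

```python
import math

def dcg(relevances):
    """DCG with the standard log2(position + 1) discount."""
    return sum(rel / math.log2(pos + 1)
               for pos, rel in enumerate(relevances, start=1))

print(round(dcg([3, 2, 1]), 3))  # best item first
print(round(dcg([1, 2, 3]), 3))  # lower: the most relevant item is ranked last
```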
18 percent reduction in overstock costs
worth approximately $2.2 million annually. → Chapter 8: Supervised Learning — Regression
2. Compute Costs
Training compute (GPU/TPU time for model training) - Experimentation compute (dozens or hundreds of training runs during development) - Inference compute (running the model in production on new data) - Storage and networking costs → Chapter 6: The Business of Machine Learning
2. Customer Email Responder
Templates for five common scenarios (order inquiry, complaint, return request, product question, compliment) - Tone calibrated to scenario severity (empathetic for complaints, warm for compliments) - Constraints preventing the model from making promises the company cannot keep - Temperature: 0.3 (co → Chapter 19: Prompt Engineering Fundamentals
2. Interpretability
Can the model's decisions be explained to stakeholders, customers, and regulators? Logistic regression coefficients are directly interpretable. Decision tree rules are inspectable. Deep neural network activations are opaque. When Chapter 25 introduces fairness metrics, interpretability becomes a reg → Chapter 11: Model Evaluation and Selection
2. Loss of professional autonomy.
*Mitigation:* Design AI systems as *decision support*, not *decision replacement*. Involve professionals (physicians, engineers, underwriters) in AI design and validation. The human-in-the-loop principle is not just a governance mechanism — it is a change management tool. People accept AI more readi → Chapter 39: Capstone — AI Transformation Plan
2. Organizational risks:
Insufficient executive sponsorship (the "air cover" evaporates after a leadership change) - Talent shortages — inability to hire or retain AI talent - Change resistance exceeds expectations - Shadow AI proliferates despite governance efforts (Chapter 22) - *Mitigation:* Multiple executive sponsors. → Chapter 39: Capstone — AI Transformation Plan
2. Product recommendation engine.
*Function:* Recommends products to customers based on browsing history, purchase history, and similar customer behavior - *EU AI Act classification:* **Limited risk.** Recommendation systems are not listed in Annex III (high-risk) but may trigger transparency requirements. If the system uses profili → Chapter 28: AI Regulation — Global Landscape
2. Storage Costs
Model artifacts (each version of each model) - Feature store data (online and offline) - Training data and evaluation datasets - Prediction logs (for monitoring and retraining) - Experiment tracking logs → Chapter 12: From Model to Production — MLOps
2. The Differentiator
Uses AI to create customer-facing differentiation. Deploys personalization, recommendation engines, and intelligent customer service. Creates competitive advantage through superior customer experience. Moderate risk; requires strong data assets and customer relationships. → Chapter 31: AI Strategy for the C-Suite
2. Triage
Assess severity. Is the model serving incorrect predictions? Is it serving no predictions? Is the impact limited to a subset of users? What is the business impact? → Chapter 12: From Model to Production — MLOps
2.30x
the true cost is 2.3 times the initial development estimate. This is consistent with the chapter's finding that TCO multipliers typically range from 2x to 4x. → Answers to Selected Exercises
23 percent increase in average order value
from $67 to $82, driven by cross-category recommendations that surfaced complementary products customers would not have found through browsing. - **15 percent increase in items per basket** — from 2.3 to 2.8, as the "You might also like" module on product pages drove add-on purchases. - **47 per → Chapter 10: Recommendation Systems
23.08%
only 23% of flagged transactions are actually fraud. → Answers to Selected Exercises
3. Competitive Analysis Template
Role: Athena's competitive intelligence analyst - Structured output with standardized sections (positioning, strengths, weaknesses, strategic implications) - Context includes Athena's current competitive position - Temperature: 0.2 (factual consistency matters) → Chapter 19: Prompt Engineering Fundamentals
3. Customer service chatbot.
*Function:* Handles initial customer inquiries, resolves common issues, escalates complex cases to human agents - *EU AI Act classification:* **Limited risk.** The chatbot must disclose to EU customers that they are interacting with an AI system. This is a transparency requirement, not a full confor → Chapter 28: AI Regulation — Global Landscape
3. Distrust of algorithmic decisions.
*Mitigation:* Invest in explainability (Chapter 26). Show stakeholders *why* the model made a recommendation, not just *what* it recommended. Build trust incrementally — start with low-stakes decisions and expand as confidence grows. → Chapter 39: Capstone — AI Transformation Plan
3. Ethical risks:
Algorithmic bias produces unfair outcomes (Chapter 25) - Lack of explainability undermines trust (Chapter 26) - AI systems make decisions that violate organizational values - *Mitigation:* Bias testing in the ML pipeline. Explainability requirements in governance tiers. Ethics review board with dive → Chapter 39: Capstone — AI Transformation Plan
3. Infrastructure Costs
ML platform licensing (SageMaker, Vertex AI, Databricks) - Monitoring tools (Arize, Evidently) - Container orchestration (Kubernetes cluster management) - Networking (data transfer between services) → Chapter 12: From Model to Production — MLOps
3. Latency and Scalability
How fast does the model produce predictions, and how does that speed change at scale? A model that takes 200ms per prediction is fine for batch processing (score all customers overnight) but too slow for real-time applications (approve a credit card transaction in under 100ms). At Athena's scale — 2 → Chapter 11: Model Evaluation and Selection
3. Mitigate
Take immediate action to limit damage. Options include: - Roll back to the previous model version - Pause predictions and serve a default value - Disable the feature that depends on the model - Route to a rule-based fallback system → Chapter 12: From Model to Production — MLOps
3. Optimize inference.
Model compression (reduce model size without significant accuracy loss) - Model quantization (use lower-precision numbers for inference) - Batching (group multiple inference requests to amortize overhead) - Caching (cache predictions for frequently seen inputs) - ONNX conversion (optimized inference → Chapter 12: From Model to Production — MLOps
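The caching bullet above can be sketched with `functools.lru_cache`. Here `score` is a hypothetical stand-in for an expensive model call; repeated requests with the same features skip inference entirely.

```python
from functools import lru_cache

CALLS = {"n": 0}

def score(features):
    """Placeholder for real model inference; counts how often it actually runs."""
    CALLS["n"] += 1
    return sum(features) / len(features)

@lru_cache(maxsize=10_000)
def cached_prediction(features: tuple) -> float:
    # expensive inference happens only on a cache miss
    return score(features)

cached_prediction((0.2, 0.5, 0.9))
cached_prediction((0.2, 0.5, 0.9))  # served from cache; score() is not re-run
print(CALLS["n"])  # 1
```

Note that inputs must be hashable (hence the tuple), and caching only pays off when the same inputs recur frequently.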
3. Talent Costs
Data scientists (typical US salary range: $130,000-$170,000, 2025) - ML engineers (typical US salary range: $150,000-$200,000, 2025) - Data engineers (typical US salary range: $120,000-$160,000, 2025) - AI/ML product managers (typical US salary range: $140,000-$180,000, 2025) - Recruiting and retention (signing bonuses, equity → Chapter 6: The Business of Machine Learning
3. The Innovator
Uses AI to create new products, services, or revenue streams. Deploys AI-native offerings that would not exist without machine learning. Higher risk; requires entrepreneurial culture and tolerance for experimentation. → Chapter 31: AI Strategy for the C-Suite
4. Diagnose
Investigate the root cause. Common causes: upstream data change, data pipeline failure, feature engineering bug, model drift, infrastructure failure. → Chapter 12: From Model to Production — MLOps
4. Fairness and Compliance
Does the model treat different demographic groups equitably? Does it comply with relevant regulations? We will explore fairness metrics in depth in Chapter 25, but the evaluation starts here. A model that achieves high AUC overall but performs significantly worse for certain customer segments is a l → Chapter 11: Model Evaluation and Selection
4. HR screening model (rebuilt post-Chapter 25).
*Function:* Assists in screening job applications by scoring candidates against role requirements - *EU AI Act classification:* **High risk.** Employment-related AI systems are explicitly listed in Annex III. The model requires full conformity assessment, including risk management system, data gover → Chapter 28: AI Regulation — Global Landscape
4. Infrastructure Costs
ML platforms (SageMaker, Vertex AI, Databricks) - Experiment tracking tools (MLflow, Weights & Biases) - Feature stores (Feast, Tecton) - Monitoring tools - Development environments → Chapter 6: The Business of Machine Learning
4. Monitor and eliminate waste.
Identify and shut down unused model endpoints - Archive old model versions and their associated artifacts - Review prediction logs — are you generating predictions nobody uses? - Right-size your feature store — are you computing and storing features that no model consumes? → Chapter 12: From Model to Production — MLOps
4. Regulatory risks:
New AI regulations impose requirements the organization is not prepared for (Chapter 28) - Privacy violations from AI systems processing personal data (Chapter 29) - Industry-specific compliance failures (FDA for healthcare AI, SR 11-7 for financial services AI) - *Mitigation:* Regulatory monitoring → Chapter 28: AI Regulation — Global Landscape
4. Social Media Caption Writer
Separate templates for Instagram, LinkedIn, and email newsletter - Platform-specific constraints (character limits, hashtag conventions, emoji usage) - Brand voice adjusted by platform (more playful on Instagram, more professional on LinkedIn) - Temperature: 0.7 (creative variety for content calenda → Chapter 19: Prompt Engineering Fundamentals
4. Talent Costs
ML engineers (the most significant ongoing cost) - Data scientists (for model improvement and retraining) - On-call time (opportunity cost and compensation for off-hours support) → Chapter 12: From Model to Production — MLOps
4. The Transformer
Uses AI to fundamentally redesign the business model. Moves from product-based to platform-based, from inventory-based to prediction-based, from human-delivered to AI-delivered. Highest risk and highest potential reward. Ping An's transformation from traditional insurer to AI-powered financial platf → Chapter 31: AI Strategy for the C-Suite
5. Fix
Implement the fix. This may be a pipeline code change, a data source update, a model retraining, or an infrastructure repair. → Chapter 12: From Model to Production — MLOps
5. Maintenance Costs
Ongoing monitoring and incident response - Periodic retraining (new data, updated features) - Model updates (bug fixes, performance improvements) - Compliance and audit activities → Chapter 6: The Business of Machine Learning
5. Organizational Fit
Does the team have the skills to maintain the model? Does the infrastructure support it? Will the business unit actually *use* the predictions? The most sophisticated model in the world creates zero value if the operations team does not trust it, the engineering team cannot deploy it, or the executi → Chapter 11: Model Evaluation and Selection
5. Software and Licensing
**AI platform licenses**: Commercial AutoML platforms, enterprise AI suites - **Open-source support**: Enterprise support contracts for open-source tools - **Vendor model APIs**: Per-call costs for cloud AI services (e.g., OpenAI, Anthropic, Google AI) → Chapter 34: Measuring AI ROI
6. Organizational Costs
**Change management**: Training end users, redesigning workflows, managing resistance - **Governance and compliance**: Building governance frameworks, conducting audits, maintaining documentation (see Chapter 27) - **Executive time**: The opportunity cost of leadership attention devoted to AI initia → Chapter 34: Measuring AI ROI
6. Review
Conduct a blameless post-mortem. Document: What happened? When was it detected? What was the impact? What was the root cause? What will prevent recurrence? → Chapter 12: From Model to Production — MLOps
7. Opportunity Costs
**Foregone projects**: Every AI project you pursue is a project you don't pursue; the opportunity cost is the value of the next-best alternative - **Technical debt**: Shortcuts taken during development that increase future maintenance costs (recall the "hidden technical debt" concept from Chapter 6) → Chapter 34: Measuring AI ROI
8. Risk Costs
**Model failure costs**: The financial impact when models produce wrong predictions (e.g., a demand forecasting error leading to excess inventory) - **Regulatory and legal risk**: Potential fines, lawsuits, or compliance failures - **Reputational risk**: Brand damage from AI failures (biased hiring → Chapter 34: Measuring AI ROI
90.0%
the model catches 90% of actual fraud. → Answers to Selected Exercises
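The counts behind this recall are not stated in the entry. One hypothetical combination consistent with it (reusing the TP = 180 from the precision answer, with FN = 20 chosen to produce 90%) is:

```python
# Recall = TP / (TP + FN). TP = 180 matches the precision entry;
# FN = 20 is an assumed value that yields the stated 90% recall.
tp, fn = 180, 20
recall = tp / (tp + fn)
print(f"{recall:.1%}")  # 90.0%
```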
350
attrition only → Chapter 38 Exercises: AI, Society, and the Future of Work

A

A model card
a standardized document describing the model's purpose, performance, limitations, training data, and ethical considerations (Mitchell et al., 2019) 2. **A data sheet** — documentation of the training data's provenance, composition, collection methodology, and known biases (Gebru et al., 2021) 3. **A → Chapter 6: The Business of Machine Learning
A typical model container includes:
The serialized model artifact - The serving code (a Flask/FastAPI app, or a model-serving framework like TensorFlow Serving or Triton) - All Python dependencies (with pinned versions) - The operating system environment → Chapter 12: From Model to Production — MLOps
Accessing items
Lists are zero-indexed, meaning the first item is at position 0: → Chapter 3: Python for the Business Professional
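A minimal illustration (the `colors` list is hypothetical):

```python
colors = ["red", "green", "blue"]
print(colors[0])   # red  -- the first item is at index 0
print(colors[2])   # blue -- the third item is at index 2
print(colors[-1])  # blue -- negative indices count from the end
```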
action
looking up data, making calculations, calling APIs, sending notifications, updating databases. This is the domain of AI agents. → Chapter 21: AI-Powered Workflows
Adoption metrics:
Number of business units actively using CoE services - Number of models in production across the organization - Platform utilization rates (compute, storage, tool usage) → Chapter 32: Building and Managing AI Teams
Advanced Prompt Engineering
Chain-of-thought prompting · Tree-of-thought · Prompt chaining · Structured outputs (JSON mode) · Self-consistency · Constitutional AI concepts · Evaluation and testing · *Code: `PromptChain`* → AI & Machine Learning for Business
Advantages of regression trees:
No assumptions about the functional form - Automatically capture interactions and nonlinear patterns - Easy to interpret and explain to business stakeholders - Handle both numerical and categorical features natively → Chapter 8: Supervised Learning — Regression
Advantages:
Full control over model behavior, features, and optimization targets - Can encode unique domain knowledge and proprietary data - Competitive differentiation — your model reflects your data and your strategy - No vendor lock-in or per-prediction pricing - Customizable to exact business requirements → Chapter 6: The Business of Machine Learning
Agglomerative (bottom-up)
Start with every data point as its own cluster. At each step, merge the two closest clusters. Repeat until everything is in one giant cluster. This is the far more common approach. → Chapter 9: Unsupervised Learning
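The bottom-up procedure can be sketched in miniature for 1-D points (a toy illustration, not a production clustering routine; real implementations use linkage criteria beyond the simple centroid distance assumed here):

```python
def agglomerate(points):
    """Record each merge of a bottom-up (agglomerative) pass over 1-D points."""
    clusters = [[p] for p in points]  # every point starts as its own cluster
    merges = []
    while len(clusters) > 1:
        centroid = lambda c: sum(c) / len(c)
        # pick the pair of clusters whose centroids are closest
        i, j = min(
            ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
            key=lambda ab: abs(centroid(clusters[ab[0]]) - centroid(clusters[ab[1]])),
        )
        merged = clusters[i] + clusters[j]
        merges.append(sorted(merged))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

print(agglomerate([1, 2, 10, 11, 30]))
# [[1, 2], [10, 11], [1, 2, 10, 11], [1, 2, 10, 11, 30]]
# the two tight pairs merge first; the outlier 30 joins last
```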
AI for Marketing and Customer Experience
Personalization at scale · AI-powered chatbots · Customer journey analytics · Attribution modeling · Dynamic pricing · The "creepy line" · NK's marketing AI project at Athena → AI & Machine Learning for Business
AI generates specific resistance patterns
fear of job loss, distrust of algorithmic outputs, data scientist vs. domain expert tension, inertia, and the trust deficit — each requiring targeted responses. The most important principle: resistance is information, not obstruction. It tells you what the change process is missing. → Chapter 35: Change Management for AI
AI Governance Frameworks
What is AI governance · NIST AI Risk Management Framework · ISO/IEC 42001 · Ethics committees and review boards · AI impact assessments · Athena's governance structure → AI & Machine Learning for Business
AI Product Management
The AI PM role · Managing probabilistic products · User research for AI features · Setting expectations · Iterating on AI products · Failure modes · NK's first AI product launch → AI & Machine Learning for Business
AI Regulation — Global Landscape
EU AI Act (risk tiers, compliance) · US regulatory approach · China's AI regulations · UK, Canada, Singapore · Industry self-regulation · Compliance strategies · Athena's regulatory response → AI & Machine Learning for Business
AI Strategy for the C-Suite
AI strategy frameworks · Competitive dynamics · First-mover vs. fast-follower · CEO and board responsibilities · AI governance at the board level · Athena's strategic pivot → AI & Machine Learning for Business
AI translators
people who can bridge the gap between technical AI teams and business stakeholders (the role NK is growing into) 2. **ML engineers** — people who can take models from notebooks to production (the gap between experimentation and deployment) 3. **Data engineers** — people who can build and maintain th → Case Study 1: McKinsey's AI ROI Research — What Separates AI Leaders from Laggards
AI, Society, and the Future of Work
Job displacement research · Augmentation vs. automation · The skills premium · Inequality and AI · Democratic governance of AI · Athena's workforce transformation → AI & Machine Learning for Business
AI-Powered Workflows
RAG (Retrieval-Augmented Generation) · Embeddings and vector databases · AI agents and tool use · Workflow orchestration · LangChain/LlamaIndex overview · Athena's knowledge base · *Code: RAG pipeline* → AI & Machine Learning for Business
AI-specific requirements include:
**Performance requirements:** "The model must achieve at least X% precision and Y% recall on the test set, as measured by [specific metric]." - **Coverage requirements:** "The model must generate predictions for at least Z% of users/transactions/items." - **Latency requirements:** "Predictions must → Chapter 33: AI Product Management
All GPAI providers must:
Maintain up-to-date technical documentation - Provide information and documentation to downstream providers integrating the model into their systems - Comply with EU copyright law, including transparency about training data - Publish a sufficiently detailed summary of training data content → Chapter 28: AI Regulation — Global Landscape
Annual Review
Re-administer the **AI Maturity Self-Assessment** (Template 15) to measure progress. - Update the **AI Strategy One-Pager** (Template 12) based on results and evolving priorities. → Appendix B: Templates and Worksheets
Areas of likely convergence:
Transparency requirements for AI systems (most jurisdictions moving toward mandatory disclosure) - High-risk classification for AI in employment, credit, healthcare, and law enforcement - Requirements for human oversight in consequential AI decisions - Labeling of AI-generated content - Documentatio → Chapter 28: AI Regulation — Global Landscape
Areas of persistent divergence:
Content regulation (fundamental disagreement between Western emphasis on free expression and Chinese emphasis on state control) - Enforcement mechanisms (fines, criminal penalties, regulatory shutdown) - GPAI/foundation model regulation (significant disagreement on how to regulate general-purpose sy → Chapter 28: AI Regulation — Global Landscape
Assumes spherical clusters
K-means works best when clusters are roughly round (in feature space). It struggles with elongated, irregular, or nested shapes. - **Sensitive to initialization** — Different random starting positions can yield different final clusters. The standard mitigation is to run K-means multiple times with d → Chapter 9: Unsupervised Learning
Athena Prompt Library
a shared repository of tested, optimized prompts for common marketing tasks. Each prompt in the library would include: → Chapter 19: Prompt Engineering Fundamentals
Automated checks included:
Model performance above minimum thresholds on standard metrics - No performance degradation on protected subgroups (fairness checks) - Inference latency within acceptable bounds - Model size within deployment constraints - Feature availability confirmed in the online feature store - No training-serv → Case Study 2: Booking.com — 150 Teams, One ML Platform
automatic prompt tuning
is becoming a standard practice in organizations with large prompt portfolios. → Chapter 20: Advanced Prompt Engineering
AWS Considerations:
Breadth can be overwhelming — the sheer number of services creates decision fatigue (as Tom's spreadsheet demonstrates) - Generative AI positioning is evolving; Bedrock launched later than Azure's OpenAI integration - Console and documentation quality varies across services - Pricing complexity is l → Chapter 23: Cloud AI Services and APIs
AWS IoT Greengrass
runs ML models on edge devices connected to AWS - **Azure IoT Edge** — deploys Azure ML models to edge devices - **Google Coral** — edge TPU hardware and software for local ML inference → Chapter 23: Cloud AI Services and APIs
AWS Strengths for AI:
Broadest service portfolio — if a managed AI service exists for a use case, AWS probably has one - Deepest integration with Amazon's own ML research (Alexa, Amazon.com recommendations, logistics) - Largest partner and third-party tool ecosystem - Most mature infrastructure layer — widest selection o → Chapter 23: Cloud AI Services and APIs
Azure Considerations:
AI strategy is heavily coupled to OpenAI partnership — concentration risk if the partnership evolves - Some AI services feel less mature than AWS equivalents (Azure ML vs. SageMaker) - Enterprise agreement pricing can be opaque - Migration from Azure-specific services can be more complex than expect → Chapter 23: Cloud AI Services and APIs
Azure OpenAI Service
enterprise-grade access to GPT-4, DALL-E, and other OpenAI models with Microsoft's security, compliance, and governance layers - **Azure Machine Learning** — a complete MLOps platform for building, training, deploying, and managing custom models - **Copilot Studio** — tools for building custom AI ag → Case Study 1: Satya Nadella's Microsoft — A Case Study in AI-Era Leadership
Azure Sovereign Cloud
physically and logically separated cloud infrastructure meeting EU data sovereignty requirements - **Google Distributed Cloud** — Google Cloud services running in customer-controlled locations - **AWS European Sovereign Cloud** — dedicated AWS infrastructure in Europe with data residency guarantees → Chapter 23: Cloud AI Services and APIs
Azure Strengths for AI:
Exclusive access to OpenAI's models with enterprise security, compliance, and data privacy guarantees - Deepest integration with enterprise productivity tools (Microsoft 365 Copilot) - Strong hybrid cloud story through Azure Arc (extending Azure services to on-premises and edge) - GitHub Copilot int → Chapter 23: Cloud AI Services and APIs

B

Behavioral metrics:
Number of AI project proposals generated through Track 2 post-workshop projects: over 1,200 in the first three years - Number of proposals that progressed to funded pilots: approximately 200 - Internal AI tool adoption rates in divisions with high training completion vs. low training completion → Case Study 2: JPMorgan's AI Training Program — Upskilling 60,000 Employees
Benefits (from the chapter):
Defect detection 3 weeks faster: estimated $2.1M in avoided returns (one-time, but expect recurring) - Support ticket routing: 31 hours/week saved x $35/hour agent cost - Product team speed: estimated 2 months faster to market on eco-friendly line → Chapter 14 Exercises: NLP for Business
Bias in AI Systems
Sources of bias (data, algorithmic, human) · Historical bias · Representation bias · Measurement bias · The hiring AI scandal · Athena's HR screening discovery · *Code: `BiasDetector`* → AI & Machine Learning for Business
Booking in-store services
scheduling makeovers, consultations, and beauty classes through conversational interfaces - **Product discovery** — answering questions like "What's a good moisturizer for dry skin under $40?" with personalized product suggestions - **Beauty tutorials** — delivering step-by-step tutorials based on t → Case Study 1: Sephora's AI-Powered Beauty Experience — Personalization Done Right
Brand safety
Does the output align with the company's voice and values? - **Legal safety** — Does the output avoid making claims that could create liability? - **Data safety** — Does the output avoid revealing sensitive information (internal pricing, customer PII, unreleased plans)? - **Factual safety** — Does t → Chapter 20: Advanced Prompt Engineering
Build and Deploy Phase
Evaluate platforms using the **Vendor/Platform Evaluation Scorecard** (Template 5). - Staff the team using the **AI Team Hiring Plan** (Template 11). - Conduct the **AI Ethics Impact Assessment** (Template 6) before deployment. - Create a **Model Card** (Template 7) for each model entering productio → Appendix B: Templates and Worksheets
Build internally when:
AI is core to your competitive strategy and you need to own the capability. - You have ongoing, recurring AI needs that justify permanent headcount. - Your data is sensitive and you cannot share it externally. - You need to iterate rapidly and continuously — the overhead of managing a consulting eng → Appendix D: Frequently Asked Questions
Build when:
The AI model is a core source of competitive advantage - The problem requires custom architectures or novel approaches - You need full control over the model, data pipeline, and deployment environment - Regulatory requirements demand complete transparency and auditability - You have the data science → Chapter 22: No-Code / Low-Code AI
Building and Managing AI Teams
AI roles and responsibilities · Team structures (centralized, embedded, hub-and-spoke) · Recruiting and retaining AI talent · Upskilling programs · The AI Center of Excellence · Athena's AI CoE → AI & Machine Learning for Business
Business implications:
**Fraud detection.** Adversaries can craft transactions that evade fraud detection models by slightly modifying transaction attributes (timing, amounts, merchant categories) to fall outside the model's learned fraud patterns. - **Content moderation.** Spammers and bad actors can modify text, images, → Chapter 29: Privacy, Security, and AI
business unit strategy
how each business competes within its market. This is where Porter's competitive strategy frameworks apply: cost leadership, differentiation, or focus. → Chapter 31: AI Strategy for the C-Suite
Buy when:
The problem is well-solved by existing vendors (standard CRM, marketing, or operations use cases) - Speed of deployment is the primary constraint - The organization lacks data science capabilities and does not plan to build them - The AI capability is not a source of competitive differentiation - Th → Chapter 22: No-Code / Low-Code AI

C

Capstone — AI Transformation Plan
Guided capstone project · Industry selection · AI maturity assessment · Use case prioritization · Technology selection · Governance plan · Implementation roadmap · *Code: `AIMaturityAssessment` + `TransformationRoadmapGenerator`* → AI & Machine Learning for Business
Certifications that signal genuine capability:
**AWS Machine Learning Specialty / Azure AI Engineer / Google Cloud Professional ML Engineer.** These cloud certifications are rigorous, recognized by employers, and demonstrate hands-on capability. Most valuable if you work in technical or technical-adjacent roles. - **Stanford / MIT / Wharton exec → Appendix D: Frequently Asked Questions
Certifications to approach with caution:
Short courses (under 10 hours) that promise "AI certification." The signal-to-noise ratio is poor, and employers increasingly recognize that a weekend course does not confer expertise. - Vendor-specific certifications for tools you do not use. Certifications are most valuable when they align with th → Appendix D: Frequently Asked Questions
Change Management for AI
ADKAR model applied to AI · Resistance patterns · The "last mile" problem · Communication strategies · Workforce planning and reskilling · Athena's change management journey → AI & Machine Learning for Business
Channel preference
online vs. in-store vs. mobile purchase ratios - **Category affinity** — what product categories does the customer favor? - **Engagement metrics** — email open rates, app usage, loyalty program activity - **Return behavior** — what percentage of purchases are returned? - **Price sensitivity** — does → Chapter 9: Unsupervised Learning
Characteristics:
Models trained in Jupyter notebooks - Manual deployment (copy model file to server, update code) - No automated testing - No monitoring (model performance assumed stable) - Data scientists and engineers work in separate silos - Retraining is manual and infrequent - Each model deployment is a bespoke effort → Chapter 12: From Model to Production — MLOps
Churn risk scoring
using the classification techniques from Chapter 7, she builds a model that predicts each member's probability of disengaging within the next 90 days. The model uses features including: days since last purchase, trend in purchase frequency, email engagement decline, reward point accumulation rate, a → Chapter 24: AI for Marketing and Customer Experience
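A model of this shape can be sketched in a few lines with scikit-learn. Everything below — the feature names, the synthetic data, and the logistic regression choice — is an illustrative stand-in, not the book's actual pipeline:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: one row per loyalty member. Column order
# mirrors the features named above (all values are randomly generated).
rng = np.random.default_rng(42)
X = np.column_stack([
    rng.integers(0, 365, 500),   # days_since_last_purchase
    rng.normal(0, 1, 500),       # purchase_frequency_trend
    rng.normal(0, 1, 500),       # email_engagement_decline
    rng.normal(0, 1, 500),       # reward_point_accumulation_rate
])
y = rng.integers(0, 2, 500)      # 1 = disengaged within 90 days (synthetic)

model = LogisticRegression(max_iter=1000).fit(X, y)
# predict_proba gives a probability per member; column 1 is the churn risk.
churn_risk = model.predict_proba(X)[:, 1]
```

The output is a score between 0 and 1 per member, which is what lets marketing rank members by risk rather than just flag them.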
Civil society organizations
including Access Now, the European Digital Rights initiative (EDRi), Algorithm Watch, and the AI Now Institute — consistently pushed for stronger protections, broader prohibitions, and more robust enforcement. They were generally better organized than in previous tech regulation debates, having le → Case Study 1: The EU AI Act — From Proposal to Law
Claude, GPT-4, and Gemini
the frontier multimodal models — can generate code in response to natural language prompts, explain existing code, identify bugs, write tests, and translate between programming languages. They function as conversational coding assistants, available for everything from quick syntax lookups to architecture → Chapter 18: Generative AI — Multimodal
clean, single-source data
**Well-defined target variables** with **established feature patterns** - **Speed is the primary constraint** (rapid prototyping, feasibility validation) - **Business impact is moderate** (the cost of a 1-2% accuracy gap is manageable) - **Resources are limited** (no data science team, or the data s → Case Study 1: DataRobot vs. Hand-Coded ML — A Head-to-Head Comparison
Clinical decision support
AI-assisted diagnosis for radiology (chest X-rays, mammograms) 2. **Readmission prediction** — Identify patients at high risk of 30-day readmission at discharge 3. **Revenue cycle optimization** — Automated coding accuracy review and denial prediction 4. **Patient no-show prediction** — Predict appointment no-shows → Chapter 39: Capstone — AI Transformation Plan
Clinical validation
demonstrating safety and efficacy through rigorous clinical studies - **Post-market surveillance** — ongoing monitoring of model performance in real-world conditions - **Algorithmic transparency** — regulators increasingly expect explainability for AI diagnostic systems → Chapter 15: Computer Vision for Business
Closed (proprietary) models
including OpenAI's GPT-4 and successors, Anthropic's Claude, and Google's Gemini — are accessible only through APIs. Users can prompt the model and receive outputs, but cannot examine the model's architecture, weights, or training data. The provider controls pricing, access, and the model's capabilities. → Chapter 37: Emerging AI Technologies
Cloud AI Services and APIs
AWS AI/ML services · Azure AI · Google Cloud AI · Pricing models and cost optimization · Vendor lock-in risks · Multi-cloud strategies · Athena's cloud AI architecture → AI & Machine Learning for Business
Clustering
grouping similar data points together (K-means, hierarchical clustering, DBSCAN) 2. **Dimensionality reduction** — compressing high-dimensional data into fewer dimensions while preserving essential structure (PCA, t-SNE, UMAP) 3. **Anomaly detection** — identifying data points that don't fit the nor → Chapter 9: Unsupervised Learning
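As a minimal illustration of the first category, K-means on a synthetic dataset with two obvious groups (all coordinates are invented for the sketch):

```python
import numpy as np
from sklearn.cluster import KMeans

# Two loose groups of 2-D points, centered near (0, 0) and (5, 5).
rng = np.random.default_rng(0)
points = np.vstack([rng.normal(0, 0.5, (50, 2)),
                    rng.normal(5, 0.5, (50, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
# labels_ assigns each point to a cluster; cluster_centers_ holds the
# two centroids K-means converged to.
```

Note that K-means requires you to choose the number of clusters up front; DBSCAN (below) infers it from density instead.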
Collaborative filtering
recommending products that similar customers have purchased, using the techniques described in Chapter 10 - **Content-based filtering** — matching product attributes (ingredients, textures, scents, color families) to stated and inferred customer preferences - **Contextual signals** — adjusting recom → Case Study 1: Sephora's AI-Powered Beauty Experience — Personalization Done Right
combinatorial creativity
the ability to generate novel combinations of existing elements. Large language models produce text that is, in a meaningful sense, new: no human has written that exact sequence of words before. Image generation models create visual compositions that did not previously exist. Music generation models → Chapter 38: AI, Society, and the Future of Work
Communication
the ability to express intent clearly and unambiguously 2. **Systems thinking** — understanding how the model interprets and processes instructions 3. **Iterative design** — testing, measuring, and refining until outputs meet quality standards → Chapter 19: Prompt Engineering Fundamentals
Completion metrics:
Track 1 completion rate: exceeded 80% across targeted employee populations by 2024 - Track 2 completion rate: approximately 70% of targeted managers - Track 3 certifications earned: over 5,000 → Case Study 2: JPMorgan's AI Training Program — Upskilling 60,000 Employees
Computer Vision for Business
Image as data · CNN intuition · Transfer learning · Object detection · Retail applications (shelf analytics, visual search) · Quality inspection · Athena's in-store analytics → AI & Machine Learning for Business
Concept drift
changes in the underlying data distribution — is inevitable. Customer preferences shift. Product catalogs evolve. Seasonal patterns change. Competitive dynamics alter user behavior. The model, trained on historical data, becomes increasingly stale. → Chapter 33: AI Product Management
Conference attendance
at least one major conference per year (NeurIPS, ICML, KDD for researchers; Strata, MLConf for practitioners) - **Learning budgets** — $2,000 to $5,000 per person per year for courses, books, and certifications - **Research time** — 10 to 20 percent of time allocated to learning, experimentation, an → Chapter 32: Building and Managing AI Teams
Configure when:
The problem is somewhat standard but requires customization for your data and context - You need to empower domain experts to build and iterate on models - The use case is valuable but does not justify a full data science engagement - You want to prototype quickly before deciding whether to invest i → Chapter 22: No-Code / Low-Code AI
Conformity Assessment (Article 43)
[ ] Determine the applicable conformity assessment procedure (self-assessment or third-party assessment) - [ ] For Annex III systems (stand-alone high-risk): self-assessment is generally permitted, except for biometric identification systems (which require third-party assessment by a notified body) → Appendix F: AI Regulation Reference
Content adaptation
converting complex documents into simplified or alternative formats. → Chapter 18: Generative AI — Multimodal
Content areas varied by sub-track:
**Analytics track:** Python, SQL, data visualization, basic statistical modeling, using internal ML platforms - **Engineering track:** ML engineering, model deployment, MLOps practices, working with the firm's internal ML infrastructure - **Quantitative track:** Advanced ML (deep learning, NLP, time → Case Study 2: JPMorgan's AI Training Program — Upskilling 60,000 Employees
Content areas:
How data flows through the organization — from customer interactions to data warehouses to analytical outputs - What AI and ML are, in plain language, with financial services examples - How to interpret model outputs — understanding that a "90% confidence score" on a fraud alert does not mean "definitely fraud" → Case Study 2: JPMorgan's AI Training Program — Upskilling 60,000 Employees
Content creation
writing, coding, image generation, video production — was transformed overnight. - **Knowledge work** — research, analysis, summarization, translation — became dramatically more productive. - **Customer service** — AI-powered agents could handle increasingly complex interactions. - **Software develo → Chapter 1: The AI-Powered Organization
context window
the maximum amount of text the model can process in a single prompt. Early versions of Copilot used models with context windows of approximately 8,000 tokens (roughly 6,000 words of code). Later versions expanded to 16,000 tokens and beyond, but the challenge remained: a real-world software project → Case Study 1: GitHub Copilot — Prompt Engineering for Code
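The word-to-token ratio implied above (roughly 6,000 words per 8,000 tokens, or 0.75 words per token) gives a quick sizing heuristic. This is only a rule of thumb; exact counts come from the model's tokenizer:

```python
# Rough sizing heuristic from the ratio above: tokens ≈ words / 0.75.
# A real tokenizer from the model provider gives exact counts.
def estimate_tokens(word_count: int) -> int:
    return round(word_count / 0.75)

print(estimate_tokens(6_000))  # → 8000
```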
Continuous Deployment for ML:
Models that pass all tests are automatically registered in the model registry - Approved models are automatically deployed to staging, then production (with manual gates where appropriate) - Deployment includes automated rollback if post-deployment health checks fail → Chapter 12: From Model to Production — MLOps
Continuous Integration for ML:
Every code change (model code, feature engineering code, pipeline code) triggers automated tests - Data validation tests confirm that incoming data meets expected schemas and distributions - Feature engineering tests verify that feature computations produce expected outputs - Model training tests co → Chapter 12: From Model to Production — MLOps
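A data validation test of the kind described might look like the following sketch. The schema, column names, and bounds are assumptions for illustration, not the book's pipeline:

```python
import pandas as pd

# Expected schema for an incoming batch (illustrative assumption).
EXPECTED_COLUMNS = {"customer_id": "int64", "spend": "float64"}

def validate(df: pd.DataFrame) -> list[str]:
    """Return a list of schema/quality problems; empty list means pass."""
    errors = []
    for col, dtype in EXPECTED_COLUMNS.items():
        if col not in df.columns:
            errors.append(f"missing column: {col}")
        elif str(df[col].dtype) != dtype:
            errors.append(f"{col}: expected {dtype}, got {df[col].dtype}")
    if "spend" in df.columns and (df["spend"] < 0).any():
        errors.append("spend contains negative values")
    return errors

batch = pd.DataFrame({"customer_id": [1, 2], "spend": [19.99, -5.0]})
print(validate(batch))  # flags the negative spend value
```

In CI, a test like this would fail the build before a model is ever retrained on bad data.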
Copy.ai
General marketing and sales content, focused on go-to-market workflows - **Writer** — Enterprise-focused, emphasizing brand governance and compliance - **Writesonic** — SEO-focused content generation with built-in search optimization - **Typeface** — Enterprise content platform with brand-specific m → Case Study 2: Jasper AI and the Marketing Prompt Revolution
Core functionality
The AI feature works, even if performance is modest. 2. **Fallback strategy** — When the AI fails, there's a non-AI alternative that prevents a broken experience. 3. **Feedback mechanism** — Users can signal approval or disapproval, feeding the learning loop. 4. **Monitoring** — The team can measure → Chapter 33: AI Product Management
Core point
Has at least `min_samples` neighbors within `eps` distance. These are the interior of a cluster. 2. **Border point** — Within `eps` distance of a core point, but doesn't have enough neighbors to be a core point itself. These are on the edge of a cluster. 3. **Noise point** — Not within `eps` distance of any core point. → Chapter 9: Unsupervised Learning
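scikit-learn's DBSCAN makes these point types visible: noise points receive the label -1, and core points are listed in `core_sample_indices_`. A toy sketch with invented coordinates:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Four tightly packed points plus one distant outlier.
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
                   [10.0, 10.0]])

db = DBSCAN(eps=0.5, min_samples=3).fit(points)
print(db.labels_)               # the outlier is labeled -1 (noise)
print(db.core_sample_indices_)  # indices of the core points
```

With `eps=0.5` each of the four close points has at least three neighbors (sklearn counts the point itself), so all four are core points in one cluster; the fifth point has no neighbors and becomes noise.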
corporate strategy
the highest-level choices about which businesses to be in, how to allocate capital across them, and how the portfolio creates value. For a diversified company, this is the domain of the CEO and the board. → Chapter 31: AI Strategy for the C-Suite
Cost components:
**Input tokens:** The text you send to the model (prompts, context, documents) - **Output tokens:** The text the model generates (typically 2-4x more expensive per token than input) - **Fine-tuning costs:** One-time training costs if you customize a model - **Infrastructure costs:** For on-premise d → Chapter 17: Generative AI — Large Language Models
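These components combine into a back-of-envelope cost model. The per-token prices below are placeholders consistent with the "output costs 2-4x input" pattern — actual rates vary by provider and model and change frequently:

```python
# Placeholder prices (USD per 1,000 tokens) — check your provider's rate card.
PRICE_PER_1K_INPUT = 0.003
PRICE_PER_1K_OUTPUT = 0.012  # assumed ~4x the input price

def monthly_cost(requests: int, input_tokens: int, output_tokens: int) -> float:
    """Estimated monthly API spend for a given request volume."""
    per_request = (input_tokens / 1000 * PRICE_PER_1K_INPUT
                   + output_tokens / 1000 * PRICE_PER_1K_OUTPUT)
    return requests * per_request

# 100,000 requests/month, 1,500 input and 400 output tokens each:
print(round(monthly_cost(100_000, 1_500, 400), 2))  # → 930.0
```

The asymmetry matters in practice: trimming verbose outputs often saves more than trimming prompts.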
Costs:
Development: 2 data scientists x 4 months x $12,000/month fully loaded = ? - Infrastructure: GPU computing at $2,500/month ongoing - Maintenance: 0.5 FTE data scientist at $10,000/month ongoing → Chapter 14 Exercises: NLP for Business
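The arithmetic behind these cost lines, worked out with the development figure as a one-time cost and the others annualized:

```python
# Worked arithmetic for the cost lines above.
development = 2 * 4 * 12_000            # one-time: 2 data scientists x 4 months
infrastructure_annual = 2_500 * 12      # GPU computing, ongoing
maintenance_annual = 0.5 * 10_000 * 12  # 0.5 FTE data scientist, ongoing

print(development, infrastructure_annual, maintenance_annual)
# → 96000 30000 60000.0
```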
Criminal charges
Boeing agreed to a $2.5 billion deferred prosecution agreement in January 2021, acknowledging that two former employees had deceived the FAA. In 2024, the Department of Justice moved to revoke the deferred prosecution agreement, potentially exposing Boeing to additional criminal liability. - **Regul → Case Study 2: The Boeing 737 MAX MCAS — What Happens Without AI Governance
Culture:
Decisions are made by intuition and seniority, not data - The merchandising team has resisted previous attempts to introduce data-driven buying decisions, arguing that "retail is an art, not a science" - Store managers have high autonomy and low trust in corporate initiatives, particularly technology initiatives → Chapter 1: The AI-Powered Organization
Current state of the technology:
Fully homomorphic encryption (FHE), which supports arbitrary computations, has been theoretically possible since Craig Gentry's breakthrough paper in 2009. But it remains computationally expensive — operations on encrypted data are 10,000x to 1,000,000x slower than the same operations on plaintext. → Chapter 29: Privacy, Security, and AI
Customer churn prediction
a well-defined binary classification problem with clean tabular data 2. **Product demand forecasting** — a time series problem with multiple seasonal patterns and external factors 3. **Customer support ticket routing** — an NLP classification problem requiring text processing and multi-class categorization → Case Study 1: DataRobot vs. Hand-Coded ML — A Head-to-Head Comparison
Customer engagement level
customers who read newsletters are likely already more engaged with the brand, and this engagement (not the newsletter itself) drives higher purchasing. (2) **Customer tenure** — longer-tenured customers are more likely both to subscribe to a newsletter and to purchase more, creating a spurious relationship → Answers to Selected Exercises

D

Daily (5-10 minutes):
Scan one curated newsletter. *The Batch* (Andrew Ng) and *TLDR AI* provide high-signal summaries of the week's developments. Pick one and read it consistently. → Appendix D: Frequently Asked Questions
Dana Whitfield (38)
Senior Marketing Analyst, 8 years at Greenfield. Dana is the team's Excel power user. She has built a complex ecosystem of interconnected spreadsheets that the entire marketing department relies on for weekly and monthly reporting. She can write VLOOKUP formulas in her sleep and has created elaborat → Case Study 2: From Spreadsheet to Script — A Marketing Team's Python Journey
Data and Data Governance (Article 10)
[ ] Use training, validation, and testing datasets that are relevant, sufficiently representative, and as free of errors as possible - [ ] Ensure datasets are appropriate for the intended geographic, behavioral, and functional context - [ ] Implement data governance practices covering data collectio → Appendix F: AI Regulation Reference
data architecture
the structural design of data systems and their relationships. While this chapter focuses on strategy rather than implementation, understanding the major architecture patterns is essential for making informed strategic choices. → Chapter 4: Data Strategy and Data Literacy
data flywheel
a self-reinforcing cycle: → Case Study 1: Amazon's Recommendation Engine — The Store That Knows You
Data Infrastructure:
The company runs on a 15-year-old point-of-sale (POS) system that stores data in a proprietary format incompatible with modern analytics tools - Customer data is split across four systems: the POS, the e-commerce platform, the loyalty program (run by a third-party vendor), and the email marketing platform → Chapter 1: The AI-Powered Organization
data pipeline
the end-to-end process by which raw data is transformed into business value. → Chapter 2: Thinking Like a Data Scientist
Data Pipeline:
Daily POS data from 340 stores, cleaned and validated within 24 hours - External data feeds: promotional calendar, holiday calendar, weather forecasts, local event schedules - Historical sales data enriched with promotional indicators, price change flags, and markdown schedules → Chapter 16: Time Series Forecasting
Data Strategy and Data Literacy
What is data strategy · Data governance fundamentals · Data quality dimensions · Data silos and integration · The CDO role · Building a data-literate organization · Athena's data landscape → AI & Machine Learning for Business
Day of month
captures paycheck cycles (spending often spikes on the 1st and 15th) - **Week of year** — captures annual seasonality - **Holiday flags** — binary indicators for major holidays - **Days until next holiday** — captures pre-holiday shopping behavior - **Is payday** — for businesses where spending correlates with paydays → Chapter 8: Supervised Learning — Regression
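Most of these features fall straight out of a date column with pandas. The dates and holiday list below are invented for the sketch; a real pipeline would use a full holiday calendar:

```python
import pandas as pd

# Calendar features derived from a date column (illustrative data).
df = pd.DataFrame({"date": pd.to_datetime(
    ["2025-01-01", "2025-01-15", "2025-07-04"])})
holidays = pd.to_datetime(["2025-01-01", "2025-07-04"])

df["day_of_month"] = df["date"].dt.day
df["week_of_year"] = df["date"].dt.isocalendar().week
df["is_holiday"] = df["date"].isin(holidays)
df["is_payday"] = df["day_of_month"].isin([1, 15])  # assumed 1st/15th cycle
print(df)
```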
Definition: DataFrame
A two-dimensional, size-mutable, and potentially heterogeneous tabular data structure provided by the pandas library. Think of it as a spreadsheet in your code — rows are records, columns are fields, and the whole thing is programmable. → Chapter 3: Python for the Business Professional
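A minimal example of the spreadsheet analogy — named columns, mixed types, and a derived column computed across all rows at once (the data is invented):

```python
import pandas as pd

# Rows are records, columns are named fields, types can differ per column.
sales = pd.DataFrame({
    "store": ["Downtown", "Mall", "Airport"],
    "units": [120, 95, 60],
    "price": [19.99, 19.99, 24.99],
})
# A derived column, computed for every row without writing a loop.
sales["revenue"] = sales["units"] * sales["price"]
print(sales)
```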
Definition: Python
A high-level, general-purpose programming language created by Guido van Rossum in 1991. Known for its readable syntax and vast ecosystem of third-party libraries. As of 2025, Python is the most widely used programming language in the world (TIOBE Index, IEEE Spectrum, Stack Overflow Developer Survey → Chapter 3: Python for the Business Professional
Definition: Variable
A named reference to a value stored in memory. Variables let you reuse values throughout your code without retyping them. If the underlying value changes, you update it in one place. → Chapter 3: Python for the Business Professional
Desire
addressing fear of replacement is the single most important change management investment for this team. → Answers to Selected Exercises
differencing
computing the change from one period to the next rather than modeling the raw values. → Chapter 16: Time Series Forecasting
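In pandas, first differencing is a one-liner via `Series.diff()` (toy numbers for illustration):

```python
import pandas as pd

# Each value becomes the change from the previous period;
# the first value is NaN because it has no prior period.
sales = pd.Series([100, 110, 125, 135], name="monthly_sales")
diffed = sales.diff()
print(diffed.tolist())  # → [nan, 10.0, 15.0, 10.0]
```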
Dimension 1: Routine vs. Non-Routine
*Routine tasks* follow explicit, codifiable rules. They can be described as a series of if-then procedures. Data entry, invoice processing, assembly line operations, and basic bookkeeping are routine tasks. - *Non-routine tasks* require flexibility, judgment, and adaptation to novel situations. Nego → Chapter 38: AI, Society, and the Future of Work
Dimension 2: Cognitive vs. Manual
*Cognitive tasks* involve information processing, analysis, communication, and decision-making. - *Manual tasks* involve physical interaction with the environment. → Chapter 38: AI, Society, and the Future of Work
Direct cost savings:
300 hours/year of analyst time recovered: approximately $22,500 (at a blended cost of $75/hour) - Reduced IT support requests for data exports: approximately $5,000/year - Eliminated need for two Excel add-in licenses: $1,200/year → Case Study 2: From Spreadsheet to Script — A Marketing Team's Python Journey
Direct Value:
Revenue increase from AI-enabled capabilities (new revenue, upsell, retention) - Cost savings from process automation and optimization - Risk reduction from improved compliance and fraud detection → Chapter 39: Capstone — AI Transformation Plan
Disadvantages of regression trees:
A single tree is prone to overfitting (deep trees memorize training data) - Predictions are "stepped" — a tree cannot predict values outside the range of the target values seen in training - Sensitive to small changes in training data (high variance) → Chapter 8: Supervised Learning — Regression
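The extrapolation limitation is easy to demonstrate: a tree trained on y = 2x caps its predictions at the largest target it saw during training (sketch with synthetic data):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Train on x = 0..9 with targets y = 2x (so targets span 0..18).
X = np.arange(10).reshape(-1, 1)
y = 2 * X.ravel()

tree = DecisionTreeRegressor(random_state=0).fit(X, y)
# Far outside the training range, the tree just returns its rightmost
# leaf value — it cannot extrapolate the linear trend.
print(tree.predict([[100]]))  # → [18.]
```

A linear regression fit to the same data would predict 200 here; whether that extrapolation is a feature or a bug depends on the problem.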
Disadvantages:
Requires specialized talent (expensive and scarce) - Longer time to initial deployment - Full ownership of maintenance, monitoring, and retraining - Higher upfront investment - Risk of building something that doesn't work → Chapter 6: The Business of Machine Learning
disciplined information diet
a deliberate approach to continuous learning that balances depth with breadth, signal with noise, and proactive exploration with focused execution. → Chapter 40: Leading in the AI Era
Discovery Phase
the precarious period between announcing an AI transformation and understanding what one actually requires. CEO Grace Chen has committed $45 million and hired Ravi Mehta as VP of Data & AI. But Ravi quickly discovers that Athena's data infrastructure is fragmented, its organizational silos are deep, → Part 1: Foundations of AI for Business
Diverge
Generate multiple plausible strategies (the "branches" of the tree) 2. **Evaluate** — Assess each branch against consistent criteria 3. **Converge** — Select the best option with reasoned justification → Chapter 20: Advanced Prompt Engineering
Divisive (top-down)
Start with everything in one cluster. At each step, split the most heterogeneous cluster. Repeat until every point is its own cluster. → Chapter 9: Unsupervised Learning
Documentation
Is there a data dictionary, codebook, or schema description? Poorly documented data leads to incorrect assumptions. → Appendix E: Data Sources Guide
domain expertise
your expertise — it borrows the ability to ask the right questions, interpret results in context, and translate analytical findings into business action. This is the piece that's hardest to automate and most often undervalued. → Chapter 2: Thinking Like a Data Scientist
Domain knowledge
You must understand the subject matter well enough to evaluate whether the model's output is correct. An LLM can generate a financial analysis, but only a person with financial literacy can judge whether the analysis makes sense. - **Specificity** — Vague prompts produce vague outputs. The ability t → Chapter 19: Prompt Engineering Fundamentals

E

Effective length specifications:
"Respond in exactly three sentences." - "Maximum 200 words." - "Write a one-paragraph summary (four to six sentences)." - "Provide a detailed analysis of 800-1,000 words." → Chapter 19: Prompt Engineering Fundamentals
Efficiency metrics:
Time from project initiation to model deployment (should decrease over time) - Reuse rate of shared components (features, pipelines, model templates) - Infrastructure cost per model served → Chapter 32: Building and Managing AI Teams
Emerging AI Technologies
Agentic AI · Edge AI and on-device inference · Quantum computing reality check · Neuromorphic computing · Hardware economics · Open-source vs. closed models · The competitor threat to Athena → AI & Machine Learning for Business
enterprise prompt governance
the organizational practices, policies, and technical controls that ensure prompts are reliable, secure, and compliant at scale. → Chapter 20: Advanced Prompt Engineering
eps (epsilon)
the radius of the neighborhood around each point - **min_samples** — the minimum number of points within that radius for a point to be considered a "core point" → Chapter 9: Unsupervised Learning
Ethical considerations
Does the data contain sensitive attributes (race, gender, health status)? If so, have appropriate protections been applied? Could your analysis cause harm to the populations represented? See Chapters 25–30 for frameworks. → Appendix E: Data Sources Guide
Ethical grounding
A personal framework for navigating bias, fairness, privacy, and the societal impact of AI decisions. The willingness to slow down or halt a project that delivers business value but causes harm. (Chapters 25–30) → Answers to Selected Exercises
exact matching
the database checks each record against precise conditions. → Chapter 21: AI-Powered Workflows
Example — Testing a product description generator:
Simple product: "Blue cotton t-shirt, S-XXL, $29.99" - Complex product: "Wireless noise-cancelling headphones with 30-hour battery, Bluetooth 5.3, USB-C charging, available in black, navy, and forest green, compatible with iOS and Android, $199.99" - Edge case: Product with very limited information: → Chapter 19: Prompt Engineering Fundamentals
Examples at Athena:
Nightly churn scores for all active customers, consumed by the marketing team each morning - Weekly demand forecasts for every SKU, consumed by the supply chain planning system - Monthly customer segment assignments, consumed by the CRM → Chapter 12: From Model to Production — MLOps
Examples of format specifications:
"Format your response as a markdown table with columns: Category, Finding, Recommendation." - "Return your analysis as valid JSON with the following structure: {category: string, sentiment: string, confidence: float}." - "Write your response as three bullet points, each no more than two sentences." → Chapter 19: Prompt Engineering Fundamentals
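When you specify a JSON structure in the prompt, it is worth validating the model's reply programmatically before using it downstream. A sketch (the sample reply string is invented):

```python
import json

# Check a model's reply against the structure specified in the prompt.
reply = '{"category": "shipping", "sentiment": "negative", "confidence": 0.87}'

parsed = json.loads(reply)  # raises ValueError if the reply isn't valid JSON
expected_types = {"category": str, "sentiment": str, "confidence": float}
problems = [k for k, t in expected_types.items()
            if k not in parsed or not isinstance(parsed[k], t)]
print(problems)  # an empty list means the reply matches the structure
```

Production systems typically retry the prompt (or fall back) when `problems` is non-empty, since models occasionally deviate from the requested format.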
Exercise 2.1
List the six phases of CRISP-DM in order. For each phase, write one sentence describing its primary purpose. → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.10
For each of the following variables, identify the measurement scale (nominal, ordinal, interval, ratio) and explain why you chose that classification: - (a) Customer Net Promoter Score (0–10 scale) - (b) Product category (Electronics, Clothing, Home & Garden, etc.) - (c) Temperature in a warehouse ( → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.11
A SaaS company reports: "Our average customer lifetime value (CLV) is $14,200." Describe a scenario in which this single number could be highly misleading. What additional information about the distribution would you want, and how would it change your strategic recommendations? → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.12
Map the five stages of the data pipeline to a specific business scenario of your choosing (e.g., a ride-sharing app, an online retailer, a hospital system). For each stage, identify one thing that could go wrong and describe the downstream consequences. → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.13
Read the following claim and evaluate it critically, using concepts from this chapter: → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.14
The Athena Retail Group case in this chapter illustrated how three departments each had data supporting their preferred explanation for a customer satisfaction decline. This is sometimes called the "advocacy trap" — using data to advocate for a predetermined conclusion rather than to investigate obj → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.15
A product manager presents the following A/B test results: → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.16
Consider this scenario: A city government notices that neighborhoods with more police officers have higher crime rates. A city council member proposes reducing police presence in high-crime neighborhoods, arguing that "the data clearly shows police presence causes crime." → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.17
A venture capital firm uses the following model to evaluate startup investments: they score each company on 50 different metrics, then invest in companies that score in the top 10% across the most metrics. After two years, they notice that their portfolio companies perform no better than the market → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.18
Research Tyler Vigen's "Spurious Correlations" project (tylervigen.com). Find three correlations that you find particularly amusing or instructive. For each, identify the most likely explanation (coincidence, shared confounder, or methodological artifact). → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.19
Find a real business case study where an organization's failure can be traced to a correlation/causation error. Write a 500-word analysis describing: (a) what the organization believed, (b) what the data actually showed, (c) what went wrong in their reasoning, and (d) what they should have done diff → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.2
Define the following terms in your own words, providing a business example for each: - (a) Confounding variable - (b) Spurious correlation - (c) Confirmation bias - (d) Regression to the mean - (e) Dark data → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.20
Research one of the following alternative data science methodologies and compare it to CRISP-DM: - (a) TDSP (Team Data Science Process) by Microsoft - (b) OSEMN (Obtain, Scrub, Explore, Model, Interpret) - (c) KDD (Knowledge Discovery in Databases) → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.21
Investigate how a company you admire (or your current employer) handles the "last mile" problem. Through publicly available information (case studies, blog posts, interviews, annual reports) or your own experience, describe: What practices do they use to translate analytical insights into action? Wh → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.22
"In business, acting on correlation is often rational even when causation hasn't been established." Argue both sides of this statement. Under what circumstances is acting on correlation justified? When is it dangerous? How should the stakes of the decision influence the standard of evidence required → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.23
The chapter argues that the data science mindset emphasizes comfort with uncertainty, while business culture prizes confidence and clear answers. Is this tension resolvable? How should a data scientist communicate probabilistic findings to an executive who demands a yes-or-no recommendation? → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.24
Consider the statement: "Data preparation consumes 60–80% of data science project time." Some argue this is a problem to be solved through automation. Others argue it's actually where the most important intellectual work happens. Take a position and defend it. → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.25
A colleague argues: "We don't need a formal process like CRISP-DM. Our best insights come from smart people playing with data — exploring freely, following their intuition, and seeing what emerges." How would you respond? Is there a role for unstructured exploration in data science? How would you ba → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.26
You are the newly hired head of analytics at a regional grocery chain with 85 stores. Customer complaints have risen 20% over the past six months, and the CEO wants answers. Apply the complete CRISP-DM framework to this problem: → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.27
Select an industry you're interested in (healthcare, financial services, manufacturing, retail, education, etc.). For that industry: → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.28
Design a "data literacy assessment" for non-technical managers at your organization (or a hypothetical one). Create 10 questions that test the concepts covered in this chapter — not memorization, but applied understanding. Include an answer key with explanations for why the correct answer is correct → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.3
Explain the difference between structured and unstructured data. Give three examples of each that might exist in a retail company's data ecosystem. → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.4
Name the four types of business questions in the analytics maturity framework. For each type, provide an example question that a hospital administrator might ask. → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.5
Describe the four measurement scales (nominal, ordinal, interval, ratio). For each scale, identify one variable from an employee database that would be measured at that scale. → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.6
What is the "last mile" problem in analytics? Identify three reasons why analytical insights often fail to drive organizational action. → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.7
A marketing director presents the following finding: "Customers who read our email newsletter purchase 40% more than customers who don't." She recommends investing $500,000 in expanding the newsletter program to all customers. → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.8
You are a data scientist at a mid-size software company. The CEO tells you: "We need to use AI to improve sales." Apply the Business Understanding phase of CRISP-DM to transform this vague directive into a well-defined data science problem. Write: (a) A specific, measurable business objective. (b) T → Chapter 2 Exercises: Thinking Like a Data Scientist
Exercise 2.9
Classify each of the following business questions as descriptive, diagnostic, predictive, or prescriptive: - (a) What was our customer acquisition cost by channel last quarter? - (b) Why did renewal rates decline among enterprise clients? - (c) Which of our current customers are most likely to upgra → Chapter 2 Exercises: Thinking Like a Data Scientist
EXPECTED OUTCOMES (3-YEAR)
`[Outcome 1: e.g., "$15M in AI-driven revenue or cost savings"]` - `[Outcome 2: e.g., "10 AI models in production across 4 business units"]` - `[Outcome 3: e.g., "AI maturity level advanced from Level 2 to Level 4"]` - `[Outcome 4: e.g., "AI governance framework fully operational"]` → Appendix B: Templates and Worksheets
Exploratory Data Analysis
The EDA mindset · Descriptive statistics for managers · Visualization best practices · matplotlib and seaborn · Distribution analysis · Correlation analysis · Telling stories with data · *Code: `EDAReport`* → AI & Machine Learning for Business
external regressors
variables external to the time series that influence its behavior. Prophet (as we saw above with the `promo` variable) and ARIMA (through ARIMAX, the extension with exogenous variables) can incorporate external regressors to improve forecast accuracy. → Chapter 16: Time Series Forecasting

F

Facial landmark detection
identifying the precise contours of lips, eyes, cheeks, and jawline to ensure virtual products are applied accurately - **Color science modeling** — adjusting how a product's color appears based on the user's skin tone, which varies dramatically under different lighting conditions - **Real-time rend → Case Study 1: Sephora's AI-Powered Beauty Experience — Personalization Done Right
Fairness
AI systems should treat all people fairly 2. **Reliability and Safety** — AI systems should perform reliably and safely 3. **Privacy and Security** — AI systems should be secure and respect privacy 4. **Inclusiveness** — AI systems should empower everyone and engage people 5. **Transparency** — AI systems should be understandable 6. **Accountability** — people should be accountable for AI systems → Case Study 1: Microsoft's Responsible AI Standard — Governance at Scale
Fairness criteria:
Relevance score must not differ by more than 10 percentage points across age groups, gender groups, or geographic regions - No individual product category should dominate more than 30% of recommendations for any user segment → Chapter 33: AI Product Management
Fairness, Explainability, and Transparency
Fairness definitions (and their conflicts) · Disparate impact · SHAP values · LIME · Model cards · Datasheets for datasets · Athena's explainability initiative · *Code: `ExplainabilityDashboard`* → AI & Machine Learning for Business
False
K-means is sensitive to initialization. Different starting centroids can produce different final clusters. This is why scikit-learn's implementation runs the algorithm multiple times (n_init parameter) and selects the best result. → Chapter 9 Quiz: Unsupervised Learning
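A minimal sketch of this behavior using scikit-learn's `KMeans` (the toy data and parameter values are illustrative, not from the chapter). Setting `n_init` explicitly runs the algorithm from multiple random starting centroids and keeps the run with the lowest inertia:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Toy data: three well-separated Gaussian clusters
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# n_init=10 runs k-means from 10 random initializations and keeps
# the run with the lowest inertia (within-cluster sum of squares)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

print(km.inertia_)           # inertia of the best of the 10 runs
print(len(set(km.labels_)))  # 3 clusters found
```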
False.
AutoML platforms can identify some data quality issues but do not eliminate the need for human data quality review. The chapter explicitly warns that AutoML makes data quality "invisible" rather than "optional." → Chapter 22 Quiz: No-Code / Low-Code AI
Feasibility (1-10):
Data readiness — availability, quality, accessibility (0-3 points) - Technical complexity — model complexity, integration difficulty (0-3 points) - Organizational readiness — skills, culture, change management effort (0-2 points) - Regulatory/ethical risk — compliance requirements, bias risk (0-2 points) → Chapter 39: Capstone — AI Transformation Plan
Five recurring themes
the Hype-Reality Gap, Human-in-the-Loop, Data as Strategic Asset, Build vs. Buy, and Responsible Innovation — provide a framework for analyzing AI decisions throughout the textbook and throughout your career. → Chapter 1: The AI-Powered Organization
For AI product managers, evaluate:
Problem prioritization — given five potential AI projects, how do they evaluate and rank them? - Stakeholder management — how do they handle conflicting priorities between data scientists and business leaders? - Technical literacy — they don't need to build models, but they need to understand trade-offs → Chapter 32: Building and Managing AI Teams
For classification (e.g., churn prediction):
Value of a true positive = value of the intervention (e.g., retained revenue) minus the cost of the intervention - Cost of a false positive = cost of the intervention applied to a non-churner - Cost of a false negative = lost revenue from a missed churner - Net model value = (TP x value_per_TP) - (FP x cost_per_FP) - (FN x cost_per_FN) → Chapter 6: The Business of Machine Learning
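A worked arithmetic sketch of the net-value calculation (all counts and dollar figures below are hypothetical, chosen for illustration only):

```python
# Hypothetical churn-model confusion counts and unit economics
TP, FP, FN = 120, 300, 40
value_per_TP = 500   # retained revenue minus intervention cost
cost_per_FP = 50     # intervention wasted on a non-churner
cost_per_FN = 450    # revenue lost from a missed churner

net_value = TP * value_per_TP - FP * cost_per_FP - FN * cost_per_FN
print(net_value)  # 60,000 - 15,000 - 18,000 = 27000
```

Note that even with 300 false positives, the model is net-positive here because each wasted intervention is cheap relative to a retained customer.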
For data scientists, evaluate:
Problem framing ability — give a vague business scenario and assess how the candidate structures the problem - Statistical reasoning — can they explain their modeling choices and their limitations? - Communication — can they explain a technical concept to a non-technical stakeholder? (Have a non-tec → Chapter 32: Building and Managing AI Teams
For each factor, compute the "partial score"
the contribution of that factor to the overall score, holding all other factors at their observed values. 3. **Compare each partial score to the population average** — a factor that is significantly below the population average for its category is a negative contributor. 4. **Rank the negative contributors** → Case Study 2: FICO's Explainable AI Journey — Making Credit Scoring Transparent
For ML engineers, evaluate:
System design — given a model and a set of requirements, how would they design the production system? - Code quality — not whiteboard algorithms, but production-quality code with tests, documentation, and error handling - Debugging — give them a broken ML pipeline and ask them to diagnose and fix it → Chapter 32: Building and Managing AI Teams
For regression (e.g., demand forecasting):
Under-prediction cost = lost sales, stockouts, missed revenue opportunities - Over-prediction cost = excess inventory, carrying costs, markdowns - Asymmetric costs should inform the loss function used in training → Chapter 6: The Business of Machine Learning
Foundations Phase
the period where initial pilot projects prove (or disprove) the value of ML for specific business problems. Ravi Mehta's team tackles their top three use cases: churn prediction, product recommendations, and demand forecasting. Not all go smoothly. The churn model achieves impressive accuracy but al → Part 2: Core Machine Learning for Business
From Model to Production — MLOps
The deployment gap · MLOps overview · Model serving patterns · Monitoring and drift detection · Feature stores · CI/CD for ML · MLOps maturity model · Athena's MLOps journey → AI & Machine Learning for Business
functional strategy
how each function (marketing, operations, finance, HR) supports the business unit's competitive position. → Chapter 31: AI Strategy for the C-Suite
Further Reading:
Turing, A. M. (1950). "Computing Machinery and Intelligence." *Mind*, 59(236), 433--460. - Copeland, B. J. (2004). *The Essential Turing*. Oxford University Press. → Appendix G: Key Studies and Cases

G

GANDALF
Google, Amazon, Netflix, DBS, Apple, LinkedIn, Facebook. The idea was not to compete with these technology companies but to learn from their operating models: their approach to data, experimentation, customer experience, and speed. → Case Study 1: DBS Bank — The World's Best Digital Transformation
GANDALF Days
hackathon-like events where teams across the bank identified problems, built prototypes, and tested solutions. Critically, these were not IT-only events. Business teams, operations staff, and support functions participated. The program produced thousands of experiments, many of which failed — and th → Case Study 1: DBS Bank — The World's Best Digital Transformation
Gauge reading
automatically monitoring analog pressure and temperature gauges → Chapter 15: Computer Vision for Business
GCP Considerations:
Smaller market share means fewer third-party integrations and a smaller community - Enterprise sales motion historically less mature than AWS or Azure (though rapidly improving) - Narrower service portfolio compared to AWS - Google has a reputation for deprecating products — enterprise customers wor → Chapter 23: Cloud AI Services and APIs
GCP Strengths for AI:
Deepest AI research heritage — many foundational ML innovations originated at Google - TPUs (Tensor Processing Units) — custom AI accelerators that offer price/performance advantages for specific workloads - Strongest data and analytics integration (BigQuery, Dataflow, Looker) — excellent for organi → Chapter 23: Cloud AI Services and APIs
Generative AI — Large Language Models
The transformer story · How LLMs are trained · Capabilities and limitations · Hallucination · Major providers (OpenAI, Anthropic, Google, Meta) · Fine-tuning vs. prompting · Enterprise deployment · *Code: OpenAI API patterns* → AI & Machine Learning for Business
Generative AI — Multimodal
Beyond text · Image generation (DALL-E, Midjourney, Stable Diffusion) · Audio and speech · Video generation · Code generation · Intellectual property issues · Business applications and risks → AI & Machine Learning for Business
golden record
a single, authoritative representation of each entity that serves as the definitive version across the organization. Creating a golden record requires two capabilities: → Chapter 4: Data Strategy and Data Literacy
Google TPUs
custom-designed tensor processing units, now in their fifth generation, offering price/performance advantages for specific workloads - **AWS Trainium and Inferentia** — custom chips designed for ML training (Trainium) and inference (Inferentia) at lower cost than NVIDIA GPUs - **Azure Maia** — Microsoft's custom AI accelerator chip → Chapter 23: Cloud AI Services and APIs
Governance culture
built through leadership commitment, training, incentives, psychological safety, and workflow embedding — is what transforms governance from a compliance exercise into an organizational value. → Chapter 27: AI Governance Frameworks
Governance metrics:
Percentage of models reviewed before deployment - Number of bias incidents detected and resolved - Regulatory compliance rate → Chapter 32: Building and Managing AI Teams
Governance operating models
centralized, federated, and hybrid — determine how governance authority is distributed across the organization. Most mature organizations converge on hybrid models. → Chapter 27: AI Governance Frameworks
Governance:
No data governance framework exists - No one has the title "data owner" for any dataset - Privacy compliance is reactive — the legal team reviews data practices only when a specific question arises - There is no AI use policy; several employees are using ChatGPT to draft customer communications with → Chapter 1: The AI-Powered Organization
Graduated rollout strategies:
**Shadow mode:** The AI system runs alongside the existing system, generating predictions that are logged but not shown to users. This allows the team to evaluate performance in production without risk. - **Internal dogfooding:** Employees use the AI feature before customers. This surfaces obvious p → Chapter 33: AI Product Management

H

hallucination
the model's tendency to generate statements that are fluent, confident, and false. → Chapter 17: Generative AI — Large Language Models
High-demand roles:
**AI Product Manager.** Manages AI-powered products and features. Requires understanding of AI capabilities and limitations plus traditional product management skills. Median US compensation in 2025: $160K-$210K (Chapter 33). - **AI Strategy / Transformation Lead.** Develops and executes organizatio → Appendix D: Frequently Asked Questions
Hire consultants when:
You need to move fast and lack internal capability. - The project is a one-time initiative (an AI strategy assessment, a proof of concept, a vendor evaluation). - You need specialized expertise you will not need long-term (computer vision for a specific manufacturing quality problem). - You want to → Appendix D: Frequently Asked Questions
Hiring Decision Thresholds:
**Strong Hire:** Weighted score >= 4.0, no competency below 3 - **Hire:** Weighted score >= 3.5, no competency below 2 - **No Hire:** Weighted score < 3.5 or any competency at 1 → Appendix B: Templates and Worksheets
How to address it:
Be honest about which roles will change and which will not. Vague reassurances ("AI won't replace anyone") erode trust when employees can see that their tasks are being automated. - Distinguish between *task automation* and *job elimination.* A customer service representative whose routine inquiries → Chapter 35: Change Management for AI
Human override capability
supply chain managers could override model predictions for categories with obviously disrupted patterns - **External data integration** — incorporating real-time data on COVID case counts, government restriction announcements, and mobility data (from aggregated cell phone signals) as new features - → Case Study 1: Walmart's Demand Forecasting — Scale, Speed, and Strawberry Pop-Tarts
Human Oversight (Article 14)
[ ] Design the system to enable effective human oversight during the period of use - [ ] Enable overseers to: fully understand system capabilities and limitations, properly monitor operation, remain aware of automation bias, correctly interpret output, decide not to use the system or override/reverse the output → Appendix F: AI Regulation Reference
Human review included:
A review by a member of the central ML team for models in high-impact contexts (search ranking, pricing) - A review of the A/B test design — specifically the hypothesis, the success metrics, the sample size calculation, and the expected minimum detectable effect - For models affecting user experienc → Case Study 2: Booking.com — 150 Teams, One ML Platform
Hybrid approach
supplement collaborative filtering with content-based features (product descriptions, categories), so recommendations can be made based on item attributes when interaction data is too sparse. Trade-off: requires maintaining a content feature pipeline alongside the collaborative system. → Answers to Selected Exercises
Hyperparameter tuning
grid search, random search, and Bayesian optimization — systematically optimizes model configuration. Random search is usually more efficient than grid search; Bayesian optimization is more efficient still. → Chapter 11: Model Evaluation and Selection
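A toy sketch of why random search often beats grid search on the same trial budget (the "validation score" function and parameter ranges are invented stand-ins for a real model evaluation):

```python
import itertools
import random

# Toy stand-in for a real validation score; peak near lr=0.03, depth=6
def validation_score(lr, depth):
    return -(lr - 0.03) ** 2 - 0.001 * (depth - 6) ** 2

# Grid search: evaluate every combination of a small fixed grid (9 trials)
grid = list(itertools.product([0.001, 0.01, 0.1], [2, 4, 8]))
best_grid = max(grid, key=lambda p: validation_score(*p))

# Random search: same 9-trial budget, but sampled from continuous ranges,
# so trials can land closer to the true optimum than any fixed grid point
random.seed(0)
trials = [(random.uniform(0.001, 0.1), random.randint(2, 10)) for _ in range(9)]
best_random = max(trials, key=lambda p: validation_score(*p))

print(best_grid, best_random)
```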
hyperparameters
settings that are not learned from data but must be chosen by the practitioner. A random forest has the number of trees, the maximum depth of each tree, the minimum number of samples per leaf. A logistic regression has the regularization strength. A neural network has the learning rate, the number o → Chapter 11: Model Evaluation and Selection

I

Impact (1-10):
Revenue potential or cost savings (0-3 points) - Strategic alignment with organizational priorities (0-3 points) - Competitive differentiation (0-2 points) - Scale of affected stakeholders (0-2 points) → Chapter 39: Capstone — AI Transformation Plan
Impact metrics:
Business value generated by AI projects (measured in collaboration with business units) - Number of AI project proposals from Tier 2 graduates - Employee AI literacy scores (pre- and post-training assessments) → Chapter 32: Building and Managing AI Teams
Implications for business AI:
A recommendation engine that reveals too much about its scoring logic can leak customer purchase patterns. - A credit scoring model that provides detailed explanations (as required by some regulations) may reveal enough about its decision boundary to enable inference about other applicants. - There → Chapter 29: Privacy, Security, and AI
Important caveats for t-SNE:
**Distances between clusters are not meaningful** — two clusters that appear far apart in a t-SNE plot may not actually be very different. t-SNE preserves local structure but distorts global structure. - **Cluster sizes are not meaningful** — a large cluster in t-SNE space may not contain more points than a small one → Chapter 9: Unsupervised Learning
Incident Summary
What happened, when, and what was the impact? b) **Timeline** — A minute-by-minute or hour-by-hour timeline of the incident from trigger to resolution c) **Root Cause Analysis** — Use the "5 Whys" technique to trace from symptom to root cause d) **Contributing Factors** — What organizational or proc → Chapter 12 Exercises: From Model to Production — MLOps
Indirect value:
Faster time-to-insight: Promotional analysis that took 2 days now takes 2 hours, enabling the team to evaluate more options and make better decisions - Reduced error rate: No more formula drift or copy-paste errors in reports - Improved employee satisfaction: Team members reported spending more time → Case Study 2: From Spreadsheet to Script — A Marketing Team's Python Journey
Industry Applications of AI
Financial services · Healthcare · Manufacturing · Retail (beyond Athena) · Professional services · Education · Public sector · Cross-industry patterns → AI & Machine Learning for Business
Integration testing
verifying that all components of the system work together correctly in a staging environment that mirrors production. - **Smoke testing** — running the system on a small set of simulated trades immediately after deployment to verify basic functionality. - **Consistency checking** — confirming that a → Case Study 2: The $440 Million Knight Capital Disaster — When Model Deployment Goes Wrong
interleaving experiments
a technique where the old and new models each select some ads for the same query, and the results are compared directly. Interleaving is more statistically efficient than a standard A/B test because each query serves as its own control, reducing variance and requiring fewer impressions to detect small effects → Case Study 1: Google's Ad Click Prediction — Optimizing for Billions of Micro-Decisions
Internally homogeneous
customers within a segment are similar to each other - **Externally heterogeneous** — customers in different segments are meaningfully different from each other - **Actionable** — each segment suggests a different business strategy → Chapter 9: Unsupervised Learning
Interquartile Range (IQR) method
Calculate the IQR (Q3 - Q1) for each feature. Points below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR are flagged. More robust to non-normal distributions than the z-score method. → Chapter 9: Unsupervised Learning
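A minimal sketch of the IQR rule on a single feature (the sample values are invented; a real application would loop over each feature):

```python
import numpy as np

values = np.array([12, 14, 15, 15, 16, 17, 18, 19, 20, 95])  # 95 is an outlier

q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # the standard 1.5 * IQR fences

outliers = values[(values < lower) | (values > upper)]
print(outliers)  # flags 95
```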
Investment:
Marcus's time building initial infrastructure: approximately 80 hours (one-time) - Team training time: approximately 40 hours total across all team members (one-time) - Ongoing learning: approximately 2 hours/person/month (ongoing) → Case Study 2: From Spreadsheet to Script — A Marketing Team's Python Journey
It attracts the wrong kind of attention
regulators, plaintiffs' lawyers, and journalists look for companies whose AI claims exceed their AI reality. 3. **It corrodes trust** — employees, customers, and investors who feel they were oversold on AI become skeptical of future AI initiatives, even legitimate ones. → Chapter 31: AI Strategy for the C-Suite

J

James Ortiz (44)
VP of Marketing. James is results-oriented and not particularly technical. He cares about the accuracy and timeliness of the insights his team delivers, not the tools they use to produce them. He will support a change if someone can show him it will make the team faster and more reliable. → Case Study 2: From Spreadsheet to Script — A Marketing Team's Python Journey

K

Keiko Tanaka (29)
Marketing Coordinator, 2 years at Greenfield. Keiko handles campaign execution and basic reporting. She has no coding experience and is anxious about any change that might make her feel incompetent. → Case Study 2: From Spreadsheet to Script — A Marketing Team's Python Journey
Key design decisions:
**Input validation:** Check for missing or malformed features before running the model - **Response format:** Include confidence scores, model version, and request IDs for traceability - **Error handling:** Return meaningful error messages for invalid requests, model failures, or timeout scenarios - → Chapter 12: From Model to Production — MLOps
Key Functions:
`make_classification()` — Binary or multi-class classification with controllable separability - `make_regression()` — Linear regression targets with specified noise - `make_blobs()` — Gaussian clusters for clustering exercises - `make_moons()` / `make_circles()` — Non-linearly separable classes for → Appendix E: Data Sources Guide
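A short usage sketch of these generators (the parameter values are illustrative, not prescribed by the appendix):

```python
from sklearn.datasets import make_classification, make_moons

# Binary classification with controllable separability (class_sep)
X, y = make_classification(n_samples=200, n_features=5, n_informative=3,
                           class_sep=1.5, random_state=42)
print(X.shape, set(y))   # (200, 5) {0, 1}

# Non-linearly separable "two moons" pattern
Xm, ym = make_moons(n_samples=100, noise=0.1, random_state=42)
print(Xm.shape)          # (100, 2)
```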
Key Models:
`GaussianCopulaSynthesizer` — Fast, parametric; best for well-behaved numerical data - `CTGANSynthesizer` — GAN-based; handles mixed data types and complex distributions - `TVAESynthesizer` — VAE-based; often faster convergence than CTGAN - **Example:** → Appendix E: Data Sources Guide
Key orchestration concepts:
**Directed Acyclic Graphs (DAGs):** Pipelines are defined as DAGs — a series of steps with dependencies but no circular references - **Scheduling:** Pipelines can be triggered by time (run nightly), by event (new data arrives), or manually - **Retry logic:** Failed steps can be retried automatically → Chapter 12: From Model to Production — MLOps
Key principles for ML on-call:
**Clear ownership:** Every production model has a designated owner and an on-call rotation - **Runbooks:** Step-by-step procedures for responding to common alerts (high null rate, prediction distribution shift, latency spike) - **Escalation paths:** If the on-call engineer cannot resolve the issue w → Chapter 12: From Model to Production — MLOps
Key questions to ask:
Does the provider use your data to train future models? (Most enterprise agreements now include opt-out provisions, but read the terms carefully.) - Where is the data processed? (Relevant for GDPR, data residency requirements, and industry-specific regulations.) - Is the data encrypted in transit and at rest? → Chapter 17: Generative AI — Large Language Models
Key syntax rules:
The condition (`quarterly_revenue >= quarterly_target`) is followed by a colon `:` - The indented block below the `if` runs only when the condition is `True` - The `else` block runs when the condition is `False` - **Indentation matters.** Python uses indentation (4 spaces, by convention) to define code blocks → Chapter 3: Python for the Business Professional
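The rules above in a runnable form (the revenue and target figures are made-up examples):

```python
quarterly_revenue = 1_250_000
quarterly_target = 1_000_000

if quarterly_revenue >= quarterly_target:   # condition ends with a colon
    status = "Target met"                   # indented block runs when True
else:
    status = "Target missed"                # runs when the condition is False

print(status)  # Target met
```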
Key Takeaways
The 10–15 most important points, organized by theme 2. **Exercises** — 15–40 problems ranging from recall questions to open-ended analysis 3. **Quiz** — 15–25 multiple-choice and short-answer questions for self-assessment 4. **Case Study 1** — A detailed scenario with discussion questions (typically → How to Use This Book

L

Lagging Indicators:
Sustained usage (not just initial trial but continued use over months) - Business outcome improvement (the metrics from Chapter 34's ROI framework) - Employee sentiment trends (improving, stable, or declining over time) - Override rate trends (decreasing toward a healthy equilibrium) → Chapter 35: Change Management for AI
Language and framework context
information about the programming language, imported libraries, and code patterns already used in the project. If the project uses Python with pandas and follows a specific naming convention, Copilot's prompt encodes this context. → Case Study 1: GitHub Copilot — Prompt Engineering for Code
latent factors
hidden dimensions that explain the patterns in the ratings matrix. → Chapter 10: Recommendation Systems
Layer 1: Data Tests (Foundation)
Schema validation (expected columns, data types) - Completeness checks (null value rates within acceptable bounds) - Distribution checks (feature distributions haven't shifted dramatically) - Freshness checks (data is not stale) - Referential integrity checks (foreign keys resolve correctly) → Chapter 12: From Model to Production — MLOps
Layer 2: Feature Tests
Feature computation correctness (unit tests for feature engineering code) - Feature value range validation (features within expected bounds) - Training-serving consistency (features computed the same way in both environments) → Chapter 12: From Model to Production — MLOps
Layer 3: Model Tests
Minimum performance thresholds (accuracy, precision, recall, AUC above baseline) - Performance on critical subgroups (no dramatic degradation for any segment) - Inference speed within latency requirements - Model size within deployment constraints → Chapter 12: From Model to Production — MLOps
Layer 4: Integration Tests
End-to-end pipeline execution (data ingestion through prediction output) - API contract tests (request/response formats match specification) - Load tests (system handles expected traffic volume) - Failover tests (system handles component failures gracefully) → Chapter 12: From Model to Production — MLOps
Leading in the AI Era
The AI-ready leader profile · Continuous learning · Building AI intuition · NK and Tom's journeys · Athena's transformation complete · NK hired as Director of AI Strategy · Closing reflection → AI & Machine Learning for Business
Leading Indicators:
Training completion and assessment scores - Manager communication quality (are managers talking about AI in team meetings?) - Help desk ticket volume (high volume early = engagement; high volume late = problems) - Feedback submission rate (employees providing input on the AI system) → Chapter 35: Change Management for AI
Less effective:
"Be concise." (How concise?) - "Keep it short." (How short?) - "Write a comprehensive response." (How long is comprehensive?) → Chapter 19: Prompt Engineering Fundamentals
License
Can you legally use the data for your intended purpose? Academic use, commercial use, and redistribution rights vary by dataset. Check before you build. → Appendix E: Data Sources Guide
Lilli
an internal generative AI tool named after Lillian Dombrowski, the firm's first professional staff member hired in 1935. Lilli was designed not as a general-purpose chatbot but as a specialized research and analysis assistant for consultants. → Case Study 2: McKinsey's AI-Augmented Consulting — Prompt Chains in Professional Services
Limitations of federated learning:
**Communication overhead.** Sending model updates back and forth requires bandwidth. For large models, this can be prohibitive. - **Data heterogeneity.** Different devices have different data distributions. A user who texts primarily in medical terminology trains a very different local model than a → Chapter 29: Privacy, Security, and AI
List comprehensions
A compact way to create lists. This is one of Python's most powerful features: → Chapter 3: Python for the Business Professional
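A brief sketch of the two most common patterns, transform and filter (the revenue figures are hypothetical):

```python
monthly_revenue = [120_000, 95_000, 143_000, 88_000]

# Transform: apply an operation to every element
with_tax = [r * 1.08 for r in monthly_revenue]

# Filter while transforming: keep only months above 100K
strong_months = [r for r in monthly_revenue if r > 100_000]

print(strong_months)  # [120000, 143000]
```

Each comprehension replaces a three-line `for` loop with `append`, which is why they appear so often in data-wrangling code.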
LLM capabilities are genuine and broad
text generation, summarization, translation, code generation, analysis, and classification. But the most valuable business applications are often the unglamorous ones: drafting, extracting, and classifying at scale. → Chapter 17: Generative AI — Large Language Models
LLM limitations are equally real
hallucination (confident fabrication), knowledge cutoffs, reasoning failures, sycophancy, and prompt injection. These are not bugs to be patched but consequences of the architecture. → Chapter 17: Generative AI — Large Language Models
local differential privacy
the strongest variant — in which noise is added on the user's device before any data is transmitted to Apple. This means Apple's servers never receive unperturbed individual data. → Case Study 2: Apple's Differential Privacy — Privacy-Preserving AI at Scale
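Apple's production mechanism is more elaborate, but the classic textbook illustration of local differential privacy is randomized response: each device perturbs its own answer before transmission, yet the aggregator can still de-bias the population rate. A minimal sketch (the 75% truth probability and the 30% true rate are invented for illustration):

```python
import random

def randomized_response(true_answer: bool, p_truth: float = 0.75) -> bool:
    """Perturb a sensitive yes/no answer on-device before sending it."""
    return true_answer if random.random() < p_truth else not true_answer

# The server sees only noisy answers, but because
#   observed_rate = p_truth * true_rate + (1 - p_truth) * (1 - true_rate),
# it can invert the formula to estimate the population rate.
random.seed(0)
true_answers = [True] * 300 + [False] * 700          # true rate = 0.30
noisy = [randomized_response(a) for a in true_answers]
observed = sum(noisy) / len(noisy)
estimated_true_rate = (observed - 0.25) / 0.5        # invert with p_truth=0.75
print(round(estimated_true_rate, 2))                 # close to the true 0.30
```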
Long-term holdback groups
a small percentage of users who permanently remain on the old model, allowing Google to measure the cumulative impact of all improvements over time. → Case Study 1: Google's Ad Click Prediction — Optimizing for Billions of Micro-Decisions

M

Mahalanobis distance
A multivariate extension that accounts for correlations between features. It measures how far a point is from the center of the data distribution, normalized by the distribution's shape. Effective when features are correlated (as they often are in business data). → Chapter 9: Unsupervised Learning
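A small numpy sketch (synthetic correlated data; feature names are hypothetical). A point that violates the correlation structure gets a much larger Mahalanobis distance than one that follows it, even if both are similar Euclidean distances from the mean:

```python
import numpy as np

rng = np.random.default_rng(42)
# Two positively correlated features (e.g., order value and order frequency)
X = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=500)

mean = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))

def mahalanobis(x):
    d = x - mean
    return float(np.sqrt(d @ cov_inv @ d))  # sqrt of d' * Sigma^-1 * d

print(mahalanobis(np.array([0.5, 0.5])))   # along the correlation: small
print(mahalanobis(np.array([2.0, -2.0])))  # against the correlation: large
```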
Manage
Provides a common vocabulary and structured approach to AI risk - Designed to be adaptable across sectors, organization sizes, and AI maturity levels - Increasingly referenced in procurement requirements, industry standards, and state-level legislation → Chapter 28: AI Regulation --- Global Landscape
Manufacturing
detecting equipment failures before they occur (predictive maintenance). **Healthcare** — identifying unusual patient outcomes that may indicate medical errors. **Finance** — detecting insider trading through unusual trading patterns. **Retail** — spotting inventory shrinkage or pricing errors. **Cy → Chapter 9: Unsupervised Learning
Marcus Chen (26)
Marketing Analyst, 1 year at Greenfield. Marcus graduated from a business analytics program where he learned basic Python and R. He has been quietly frustrated that Greenfield's analytics stack is entirely Excel-based. He sees the limitations daily but has not felt empowered to push for change. → Case Study 2: From Spreadsheet to Script — A Marketing Team's Python Journey
Marketing automation
email sequences triggered by specific behaviors (abandoned cart, time since last purchase, browsing activity) - **Programmatic advertising** — algorithmic buying and selling of ad inventory in real-time auctions - **Customer data platforms (CDPs)** — unified repositories of customer data drawn from → Chapter 24: AI for Marketing and Customer Experience
Mathematical reasoning
financial calculations, unit economics, statistical analysis - **Multi-step logic** — policy evaluation, contract analysis, compliance checking - **Causal reasoning** — diagnosing why metrics changed, root cause analysis - **Decision-making** — comparing options with multiple criteria → Chapter 20: Advanced Prompt Engineering
matrix factorization
a close relative of PCA — to compress the massive user-song interaction matrix into a lower-dimensional space. Imagine a matrix with 600 million rows (users) and 100 million columns (songs), where each cell indicates how many times user *i* played song *j*. This matrix is almost entirely empty (sparse). → Case Study 1: Spotify's Discover Weekly — Clustering Taste in a Sea of Music
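A tiny numpy sketch of the idea, using truncated SVD on a made-up 4x4 play-count matrix (Spotify's actual system operates at vastly larger scale with implicit-feedback methods; this only illustrates the compression step):

```python
import numpy as np

# Tiny hypothetical play-count matrix: rows = users, columns = songs
R = np.array([[5, 3, 0, 0],
              [4, 0, 0, 1],
              [0, 0, 4, 5],
              [0, 1, 5, 4]], dtype=float)

# Truncated SVD: keep k=2 latent "taste" factors
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2
user_factors = U[:, :k] * s[:k]   # each user as a point in taste space
song_factors = Vt[:k, :].T        # each song in the same space

# Reconstructing from the factors fills empty cells with estimated affinities
R_hat = user_factors @ song_factors.T
print(R_hat.round(1))
```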
Measuring AI ROI
ROI frameworks for AI · Cost taxonomy · Direct vs. indirect value · Time-to-value · When to kill AI projects · Portfolio management · *Code: `AIROICalculator`* → AI & Machine Learning for Business
Measuring failure modes
not just average quality, but the distribution of quality and the severity of the worst outputs → Chapter 17: Generative AI — Large Language Models
Methods for generating synthetic data:
**Statistical modeling.** Fit probability distributions to the real data and sample from those distributions. - **Generative adversarial networks (GANs).** Train a GAN on the real data to generate synthetic records that are statistically indistinguishable from real ones (see Chapter 18 for GAN fundamentals). → Chapter 29: Privacy, Security, and AI
Mitigation strategies:
Implement retry logic with exponential backoff (see the Python section below) - Use request queuing to smooth traffic spikes - Maintain fallback models (e.g., route overflow traffic to a smaller, faster model) - Negotiate higher rate limits through enterprise agreements → Chapter 17: Generative AI — Large Language Models
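The retry-with-exponential-backoff strategy can be sketched as follows (the flaky call and the `RuntimeError` are hypothetical stand-ins for a rate-limited API client):

```python
import random
import time

def call_with_backoff(request_fn, max_retries=5, base_delay=1.0):
    """Retry request_fn on RuntimeError (standing in for a rate-limit error),
    doubling the wait each attempt, with jitter to avoid synchronized retries."""
    for attempt in range(max_retries):
        try:
            return request_fn()
        except RuntimeError:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)

# Hypothetical flaky call: fails twice, then succeeds.
attempts = {"n": 0}
def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

result = call_with_backoff(flaky_call, base_delay=0.01)
print(result)  # → ok
```

The jitter matters in production: without it, many clients that were rate-limited at the same moment retry at the same moment, re-triggering the limit.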
ML Platform team
a dedicated engineering team whose mission was to build and maintain shared ML infrastructure that all squads could use. → Case Study 1: Spotify's ML Guild — Scaling AI Expertise Across Squads
Model Evaluation and Selection
Confusion matrix · Precision, recall, F1 · ROC/AUC · Cost-sensitive evaluation · Cross-validation · Hyperparameter tuning · A/B testing for models · Business impact translation · *Code: `ModelEvaluator`* → AI & Machine Learning for Business
Model registry records
a catalog of all production models with metadata including purpose, owner, risk tier, data sources, performance metrics, known limitations, deployment date, and review schedule - **Impact assessment records** — completed assessments for all models above the documentation-only tier - **Review and approval records** → Chapter 27: AI Governance Frameworks
Modeling:
Prophet models at the category-region level (approximately 200 primary forecast series) - Holt-Winters models as a secondary method for ensemble weighting - Hierarchical disaggregation to store-SKU level using historical proportions - Ensemble of Prophet and Holt-Winters, weighted by walk-forward cross-validation → Chapter 16: Time Series Forecasting
Monthly (2-4 hours):
Attend one virtual or in-person event: a webinar, meetup, or conference talk. The MLOps Community, AI Product Institute, and local AI meetups provide accessible entry points. - Complete one hands-on tutorial or mini-project. Kaggle competitions, Hugging Face tutorials, and cloud provider workshops keep skills current. → Appendix D: Frequently Asked Questions
Much faster
UMAP scales better to large datasets - **Better preservation of global structure** — distances between clusters are somewhat more meaningful than in t-SNE - **More consistent** — results are more reproducible across runs - **Can be used for general-purpose dimensionality reduction**, not just visual → Chapter 9: Unsupervised Learning

N

Name
The resource or dataset - **Source** — Where to find it (described by platform and path rather than full URL, since web addresses change) - **Description** — What it contains and why it matters - **Size** — Approximate scale - **Format** — File types and structures - **Access** — Licensing, registration, and usage restrictions → Appendix E: Data Sources Guide
Name and description
What the prompt does and when to use it 2. **The prompt template** — With variable placeholders for customization 3. **Parameter settings** — Recommended temperature, max tokens, etc. 4. **Example inputs and outputs** — Showing what good results look like 5. **Version history** — What changed and why → Chapter 19: Prompt Engineering Fundamentals
Net value = $136,000 - $7,200 - $24,000 = $104,800
At 0.50: TP value = 260 x $400 = $104,000; FP cost = 200 x $15 = $3,000; FN cost = 140 x $400 = $56,000. **Net value = $104,000 - $3,000 - $56,000 = $45,000** - At 0.70: TP value = 160 x $400 = $64,000; FP cost = 60 x $15 = $900; FN cost = 240 x $400 = $96,000. **Net value = $64,000 - $900 - $96,000 = -$32,900** → Answers to Selected Exercises
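The threshold arithmetic above can be checked with a small helper. The counts are taken from the exercise (340/480/60 at 0.30 follow from the stated dollar totals):

```python
def net_value(tp, fp, fn, tp_value=400, fp_cost=15, fn_cost=400):
    """Net business value of a classification threshold: value captured
    from true positives, minus the cost of chasing false positives and
    the revenue lost to false negatives."""
    return tp * tp_value - fp * fp_cost - fn * fn_cost

print(net_value(340, 480, 60))   # threshold 0.30 → 104800
print(net_value(260, 200, 140))  # threshold 0.50 → 45000
print(net_value(160, 60, 240))   # threshold 0.70 → -32900
```

Raising the threshold cuts false-positive costs but loses far more in missed true positives, which is why the lowest threshold wins here.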
Neural Networks Demystified
The neuron analogy · Layers and weights · Activation functions · Training (gradient descent intuition) · CNNs vs. RNNs vs. Transformers · GPU economics · When deep learning is worth it → AI & Machine Learning for Business
NLP for Business
Text as data · Tokenization and embeddings · Sentiment analysis · Named entity recognition · Topic modeling · The transformer revolution · Athena's review analysis · *Code: `ReviewAnalyzer`* → AI & Machine Learning for Business
No-Code / Low-Code AI
The democratization of AI · AutoML platforms · Drag-and-drop ML · When no-code works (and when it doesn't) · Vendor evaluation framework · Shadow AI risks · Athena's citizen data science program → AI & Machine Learning for Business

O

Open-weight models
including Meta's Llama series, Mistral's models, Alibaba's Qwen, and Google's Gemma --- make the model weights publicly available. Users can download the model, run it on their own hardware, fine-tune it on their own data, and modify it as they see fit. (Strictly speaking, most "open-source" AI models are better described as open-weight, since their training data and code are usually not released.) → Chapter 37: Emerging AI Technologies
Operate and Govern Phase
Establish oversight with the **AI Governance Charter** (Template 9). - Prepare for problems with the **AI Incident Response Playbook** (Template 14). - Manage organizational adoption with the **Change Management Plan** (Template 10). → Appendix B: Templates and Worksheets
Organizational practices:
Conduct Privacy Impact Assessments (PIAs) for AI projects that process personal data. - Implement data governance frameworks that classify data by sensitivity level (Chapter 4). - Ensure your AI ethics review process (Q35) includes privacy evaluation. - Train employees on data handling practices specific to AI systems. → Appendix D: Frequently Asked Questions
Outcome metrics:
Reduction in compliance incidents related to AI misuse (measured by the compliance team) - Time-to-adoption for new AI tools (measured by IT deployment teams) - Employee confidence scores on AI-related survey questions (measured annually) → Case Study 2: JPMorgan's AI Training Program — Upskilling 60,000 Employees
Output:
Daily forecasts at SKU-store level, with 80% prediction intervals - Weekly forecast summary dashboards for regional planners - Monthly scenario analysis (optimistic / expected / conservative) for supply chain leadership - Automated alerts when forecast accuracy degrades below threshold → Chapter 16: Time Series Forecasting
overfitting
the single most common and most expensive mistake in machine learning. And it's a problem that is not limited to polynomial regression. Any model that is too complex relative to the amount of training data will overfit. → Chapter 8: Supervised Learning — Regression

P

parameters
inputs the function expects. - The triple-quoted string (`"""..."""`) is a **docstring** — a brief description of what the function does. This is a best practice. - `return margin` sends the result back to the caller. → Chapter 3: Python for the Business Professional
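A minimal example function illustrating each of these pieces (the specific margin calculation is an illustrative sketch, not the book's exact code):

```python
def gross_margin(revenue, cost):
    """Return gross margin as a fraction of revenue."""  # docstring: what it does
    margin = (revenue - cost) / revenue
    return margin  # sends the result back to the caller

# revenue and cost are the parameters; 500_000 and 320_000 are the arguments.
print(gross_margin(500_000, 320_000))  # → 0.36
```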
Performance criteria:
Relevance score >= 70% (measured over 30-day rolling window, N >= 10,000 recommendations) - Latency: P95 response time < 200ms - Coverage: Recommendations generated for >= 95% of active loyalty members → Chapter 33: AI Product Management
Performance monitoring
Is the model's accuracy degrading? Are precision and recall shifting? 2. **Data drift monitoring** — Has the input data distribution changed since training? 3. **Fairness monitoring** — Are outcomes equitable across protected groups? (Chapter 25) 4. **Operational monitoring** — Latency, throughput, and error rates → Chapter 39: Capstone — AI Transformation Plan
Personalized audio experiences
music or soundscapes tailored to specific contexts, events, or customer segments. → Chapter 18: Generative AI — Multimodal
Phase 1: Build foundations (3-6 months).
Complete this textbook, including the Python exercises and the capstone. - Take Andrew Ng's Machine Learning Specialization or a comparable online course. - Build basic Python proficiency through daily practice — even 30 minutes a day compounds rapidly. - Start using AI tools (ChatGPT, Claude, Copilot) in your daily work. → Appendix D: Frequently Asked Questions
Phase 1: Quick Wins (Months 1-6)
Deploy 2-3 high-feasibility use cases with clear, measurable ROI - Establish foundational data infrastructure - Hire core AI team (or engage implementation partner) - Draft and publish initial AI governance policies - Launch AI literacy program for leadership - *Objective: Demonstrate value. Build organizational momentum.* → Chapter 39: Capstone — AI Transformation Plan
Phase 2: Apply in your current role (3-6 months).
Identify an AI opportunity in your current job and volunteer to lead or co-lead it. - Partner with your organization's data science team on a project. Your business context is valuable to them — they need someone who understands the problem domain. - Build an internal AI use case analysis or strategy proposal. → Appendix D: Frequently Asked Questions
Phase 2: Foundation (Months 7-12)
Deploy 3-4 additional use cases including first medium-complexity applications - Implement ML platform and MLOps pipeline - Establish AI Center of Excellence (or equivalent structure) - Formalize governance framework with risk tiers and review boards - Expand AI literacy to middle management and key contributors → Chapter 39: Capstone — AI Transformation Plan
Phase 3: Position for the transition (3-6 months).
Update your resume and LinkedIn to emphasize AI project experience, not just certifications. - Network with AI professionals — attend meetups, join communities, have informational conversations. - Target roles that value the business-technical bridge: AI product manager, AI strategy consultant, AI program manager. → Appendix D: Frequently Asked Questions
Phase 3: Scale (Months 13-18)
Deploy strategic bet use cases that leverage Phase 1-2 infrastructure - Implement advanced capabilities (real-time inference, edge deployment) - Develop internal AI talent through advanced training programs - Begin GenAI integration into business workflows - Establish model monitoring and retraining processes → Chapter 39: Capstone — AI Transformation Plan
Phase 4: Optimize (Months 19-24)
Optimize existing models for performance and cost - Deploy most complex use cases (clinical decision support, autonomous systems) - Establish AI innovation pipeline for continuous opportunity identification - Conduct comprehensive governance audit - Build external AI brand (thought leadership, partnerships) → Chapter 39: Capstone — AI Transformation Plan
Post-deployment verification
automated tests that confirm the new model is running correctly on all instances. - **Automated rollback** — if verification fails, the system automatically reverts to the previous version. → Case Study 2: The $440 Million Knight Capital Disaster — When Model Deployment Goes Wrong
Post-Market Monitoring (Article 72)
[ ] Establish a post-market monitoring system proportionate to the nature and risk of the AI system - [ ] Actively and systematically collect, document, and analyze relevant data on performance throughout the system's lifetime - [ ] Use post-market monitoring findings to update the risk management system → Appendix F: AI Regulation Reference
Practical considerations:
Setting max tokens too low truncates responses mid-sentence — a common source of frustration. - Setting max tokens too high does not force the model to write longer responses; it merely allows it to. - For cost control in API-based deployments, max tokens directly affects pricing (you pay per token generated). → Chapter 19: Prompt Engineering Fundamentals
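A simple worst-case cost estimate shows why the max-tokens cap matters for budgeting. The per-1K-token prices here are hypothetical placeholders, not any vendor's actual rates:

```python
def estimate_cost(input_tokens, max_output_tokens,
                  price_in_per_1k=0.003, price_out_per_1k=0.015):
    """Worst-case cost of one API call. Output billing is bounded by the
    max-tokens cap, so lowering the cap bounds the spend per request."""
    return (input_tokens / 1000) * price_in_per_1k \
         + (max_output_tokens / 1000) * price_out_per_1k

# A 2,000-token prompt with output capped at 1,000 tokens:
print(round(estimate_cost(2000, 1000), 4))  # → 0.021
```

Multiplied across millions of requests, a generous cap that is never needed still inflates the worst-case budget you must provision for.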
Practical guidance:
**Prefer AI tools with known, licensed training data.** Adobe Firefly, for example, is trained on Adobe Stock, Creative Commons content, and public domain material — and Adobe provides indemnification to enterprise customers. This does not eliminate all risk, but it substantially reduces it. → Case Study 1: Getty Images vs. Stability AI — The Copyright Battle That Could Shape AI's Future
Printed text
menus, signs, instructions, mail, labels, expiration dates - **Product identification** — distinguishing between similar containers (shampoo vs. conditioner, sugar vs. salt) in a kitchen or store - **Currency** — US bills are identical in size and texture regardless of denomination - **Social cues** → Case Study 2: Seeing AI by Microsoft — Computer Vision as Accessibility
Priority Tiers:
**Tier 1 (Score 4.0--5.0):** Pursue immediately. Allocate resources and assign executive sponsor. - **Tier 2 (Score 3.0--3.9):** Develop further. Invest in a proof of concept or data readiness initiative. - **Tier 3 (Score 2.0--2.9):** Monitor. Revisit when conditions improve (better data, lower risk). → Appendix B: Templates and Worksheets
Privacy advantages:
No real individuals are represented in the data, so re-identification risk is eliminated (in theory). - Synthetic data can be shared freely — with vendors, partners, researchers, or the public — without privacy concerns. - Developers can work with realistic data without accessing production systems. → Chapter 29: Privacy, Security, and AI
Privacy constraints
You need data that mimics real patient, customer, or financial records without exposing actual individuals (Ch. 29) - **Class imbalance** — You need more examples of a rare class (fraud, equipment failure) to train a balanced model (Ch. 7, Ch. 11) - **Prototyping** — You want to build and test a pipeline before real data is available → Appendix E: Data Sources Guide
Privacy, Security, and AI
Privacy in the age of AI · Differential privacy · Federated learning · Adversarial attacks · Model inversion · Data breach response · Athena's data breach crisis → AI & Machine Learning for Business
Privacy-preserving techniques:
**Data minimization.** Collect only the data you need for the specific AI application. Resist the "collect everything, figure out uses later" approach — it creates privacy liability without guaranteed AI value. - **Anonymization and pseudonymization.** Remove or mask personal identifiers before using the data. → Appendix D: Frequently Asked Questions
Priya Sharma (31)
Digital Marketing Manager, 3 years at Greenfield. Priya manages the e-commerce and paid media channels. She spends significant time manually downloading data from Google Analytics, Meta Ads Manager, and Shopify, then consolidating it in Excel for her weekly channel performance report. → Case Study 2: From Spreadsheet to Script — A Marketing Team's Python Journey
Product affinity scoring
using the recommendation engine approach from Chapter 10, she generates per-member product affinity scores that predict which product categories and specific items each member is most likely to purchase next. → Chapter 24: AI for Marketing and Customer Experience
Product recommendations
how the recommendation engine works, what data it uses, and how customers can adjust their preferences; (2) **Pricing** --- confirmation that Athena does not use AI for individualized dynamic pricing (a decision made after the red team's price-steering finding); (3) **Customer service** --- disclosure of when customers are interacting with an AI system → Chapter 30: Responsible AI in Practice
profit curve
one of the most underused and most powerful evaluation visualizations in applied ML. → Chapter 11: Model Evaluation and Selection
Programming experience
Python is taught from scratch in Chapter 3 - **Linear algebra or calculus** — Mathematical concepts are explained by intuition, not formulas - **Prior AI/ML knowledge** — Chapter 1 starts from the beginning - **A computer science degree** — This book is written for business students - **Access to expensive software** — Free tools are used throughout → Prerequisites
Project Initiation Phase
Write an **AI Project Proposal** (Template 1) for each selected initiative. - Complete the **Data Readiness Assessment** (Template 4) to validate that the data foundation exists. - Build the **AI ROI Business Case** (Template 8) to secure funding. - Complete the **ML Project Canvas** (Template 3) to scope the project. → Appendix B: Templates and Worksheets
Prompt Engineering Fundamentals
What is prompt engineering · Prompt anatomy · Zero-shot vs. few-shot · Role-based prompting · Iterative refinement · Common pitfalls · Athena's prompt library · *Code: `PromptBuilder`* → AI & Machine Learning for Business
Prompt-based app builders
platforms like Relevance AI, Flowise, and Stack AI — allow users to build AI workflows by connecting LLM calls, data sources, and actions through visual or prompt-based interfaces. These tools occupy the space between Custom GPTs (simple) and full-code frameworks like LangChain (complex). They are particularly useful for rapid prototyping. → Chapter 22: No-Code / Low-Code AI
Proximity
code near the cursor is more relevant than code far away - **Semantic relevance** — code that uses similar variable names, function calls, or patterns is prioritized - **Import relationships** — files that are imported or referenced by the current file receive higher priority - **Recency** — recently edited files receive higher priority → Case Study 1: GitHub Copilot — Prompt Engineering for Code
proxy variables
features that are highly correlated with protected characteristics. Zip code, for example, is often a proxy for race and income. → Chapter 26: Fairness, Explainability, and Transparency
Purchase frequency
number of transactions in the last 12 months - **Recency** -- days since last purchase - **Average order value** -- mean spending per transaction - **Return rate** -- percentage of items returned - **Channel mix** -- proportion of purchases online vs. in-store - **Tenure** -- months since first purchase → Chapter 7: Supervised Learning -- Classification
Python for the Business Professional
Why Python · Setting up your environment · Jupyter notebooks · Variables, data types, and control flow · pandas essentials · Reading and cleaning data · Your first sales analysis · *Code: pandas workflow* → AI & Machine Learning for Business

Q

Quality
How much preprocessing is required? Some datasets are analysis-ready; others require significant cleaning. Factor this time into your project plan. → Appendix E: Data Sources Guide
Quality concerns:
Synthetic data may not capture rare events, outliers, or edge cases that are critical for model performance. - If the generation process is imperfect, the synthetic data may introduce biases not present in the real data — or fail to preserve biases that should be detected and addressed. - There is a → Chapter 29: Privacy, Security, and AI
Quarter 1-2: Level 0 to Level 1
Deploy MLflow for experiment tracking and model registry - Build automated training pipeline (Airflow) for the churn model - Implement comprehensive monitoring (data quality, prediction distribution, business metrics) - Establish data contracts with upstream data engineering teams - Hire second ML engineer → Chapter 12: From Model to Production — MLOps
Quarter 3-4: Level 1 Solidification
Migrate recommendation engine (Chapter 10) and demand forecaster (Chapter 8) to the automated pipeline - Implement feature store (Feast) for shared features across models - Add automated data validation (Great Expectations) to all pipelines - Build champion-challenger framework for model retraining → Chapter 12: From Model to Production — MLOps
Quarter 5-6: Level 1 to Level 2
Implement CI/CD for model code (automated testing on every code change) - Add canary deployment capability - Build self-service model deployment for data scientists (deploy to staging with one command) - Implement automated retraining triggered by monitoring alerts - Integrate MLOps metrics into executive dashboards → Chapter 12: From Model to Production — MLOps
Quarterly (1-2 days):
Conduct a personal "landscape review." What new tools have launched? What has your industry adopted? What skills should you develop next? - Update your personal AI learning plan (Chapter 40 provides a template). → Appendix D: Frequently Asked Questions
query
"What am I looking for?" 2. A **key** — "What do I contain?" 3. A **value** — "What information do I carry?" → Chapter 17: Generative AI — Large Language Models
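A minimal scaled dot-product attention sketch in NumPy matching the query/key/value roles described above (toy random vectors, not a real model's weights):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: each query scores every key, the
    scores become weights via softmax, and the output is the weighted
    average of the corresponding values."""
    # "What am I looking for?" scored against "What do I contain?"
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Softmax over keys turns scores into weights that sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Blend of "What information do I carry?"
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))  # 3 queries, dimension 4
K = rng.normal(size=(5, 4))  # 5 keys
V = rng.normal(size=(5, 4))  # 5 values, one per key
out = attention(Q, K, V)
print(out.shape)  # → (3, 4)
```

Each output row is a mixture of the value vectors, weighted by how well that query matched each key.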
question of who is liable
the AI developer, the deployer, the data provider, or the user --- remains unsettled in most jurisdictions → Chapter 28: AI Regulation --- Global Landscape

R

Recency
When was the data collected? A dataset from 2010 may not reflect current patterns, especially in fast-moving domains like social media or e-commerce. → Appendix E: Data Sources Guide
Recommendation Systems
Why recommendations matter · Collaborative filtering · Content-based filtering · Hybrid approaches · Cold start problem · Evaluation metrics · Athena's product recommendations · *Code: `RecommendationEngine`* → AI & Machine Learning for Business
Record-Keeping and Logging (Article 12)
[ ] Design the system to automatically record events (logs) relevant to identifying risks and enabling post-market monitoring - [ ] Ensure logging captures: periods of use, reference database against which input data was checked, input data for which the system produced a match, identification of the natural persons involved in the verification of the results → Appendix F: AI Regulation Reference
Regression metrics
R-squared, MAE, RMSE, MAPE — each captures a different aspect of continuous prediction quality. Choose the metric that best reflects how the business experiences prediction errors. → Chapter 11: Model Evaluation and Selection
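These metrics can be computed directly. The demand numbers below are illustrative:

```python
import numpy as np

actual = np.array([100.0, 150.0, 80.0, 120.0])
predicted = np.array([110.0, 140.0, 95.0, 115.0])

errors = predicted - actual
mae = np.abs(errors).mean()                    # average miss, in units
rmse = np.sqrt((errors ** 2).mean())           # penalizes large misses more
mape = (np.abs(errors) / actual).mean() * 100  # average miss, in percent

print(round(mae, 2), round(rmse, 2), round(mape, 1))  # → 10.0 10.61 9.9
```

Note how RMSE exceeds MAE whenever errors are uneven: the single 15-unit miss pulls RMSE up more than it pulls MAE.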
code from other open tabs and files in the same project that may be relevant. If the developer is writing a function that calls another function defined in a different file, Copilot attempts to include that function's definition in the prompt. → Case Study 1: GitHub Copilot — Prompt Engineering for Code
Relevance
Does the dataset align with your business question? A technically interesting dataset that does not map to a real business problem will produce a weak project. → Appendix E: Data Sources Guide
Reliability criteria:
System uptime >= 99.5% - Fallback to popularity-based recommendations activates within 500ms of model timeout - Model retraining runs weekly without manual intervention → Chapter 33: AI Product Management
Repository metadata
file names, directory structure, and project configuration files that provide additional signals about the project's purpose and architecture. → Case Study 1: GitHub Copilot — Prompt Engineering for Code
Reproducibility
Can someone else access the same data to verify your results? APIs with changing data are harder to reproduce from than static CSV downloads. Consider saving a snapshot. → Appendix E: Data Sources Guide
Responsible AI in Practice
Operationalizing responsible AI · Red-teaming · Bias bounties · Inclusive design · Sustainability and AI's carbon footprint · Responsible AI maturity model · Athena's responsible AI program → AI & Machine Learning for Business
Results:
22% improvement in forecast accuracy (WMAPE) compared to the previous moving-average approach - $6.1 million annual reduction in inventory carrying costs (from reduced safety stock) - 31% reduction in stockout incidents for top-200 SKUs - Supply chain team adoption rate: 94% of planners using the new system → Chapter 16: Time Series Forecasting
Retrieval-Augmented Generation (RAG)
a technique that grounds the LLM's responses in Athena's actual documents rather than relying on the model's general knowledge. Instead of asking the LLM "What is Athena's return policy?" and hoping it generates the right answer, a RAG system retrieves the actual return policy document and provides it to the model as context for generating the answer → Chapter 17: Generative AI — Large Language Models
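A toy sketch of the retrieve-then-prompt pattern. Word overlap stands in for the vector-embedding retrieval a real RAG system would use, the documents are hypothetical, and the final prompt would go to an actual LLM:

```python
# Hypothetical document store (a real system would hold many documents
# indexed by embeddings in a vector database).
docs = {
    "return_policy": "Items may be returned within 30 days with a receipt.",
    "shipping": "Standard shipping takes 3-5 business days.",
}

def retrieve(question):
    """Pick the document sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(docs.values(),
               key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question):
    """Ground the model in the retrieved document, not its memory."""
    context = retrieve(question)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("Can items be returned for a refund"))
```

The key design point: the model answers from supplied context, so the answer changes when the policy document changes, with no retraining.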
Risk Management System (Article 9)
[ ] Establish a documented risk management process covering the entire AI system lifecycle - [ ] Identify and analyze known and reasonably foreseeable risks to health, safety, and fundamental rights - [ ] Evaluate risks based on post-market monitoring data - [ ] Adopt risk mitigation measures, prioritizing elimination or reduction of risks through design → Appendix F: AI Regulation Reference

S

safety stock
extra inventory above the forecast level — to buffer against under-prediction. → Chapter 8: Supervised Learning — Regression
sample
a subset of the larger population — and draw conclusions about the whole from the part. → Chapter 2: Thinking Like a Data Scientist
Scenario 1: Churn Prediction (Athena Retail Group)
**True Positive:** Model predicts customer will churn, and she does. Athena sends a retention offer. Good outcome — the offer costs $20 but saves a customer worth $500/year. - **False Positive:** Model predicts customer will churn, but she was never going to leave. Athena sends a $20 retention offer that wasn't needed → Chapter 11: Model Evaluation and Selection
Scenario 2: Fraud Detection (Financial Services)
**True Positive:** Model flags a fraudulent transaction. The bank blocks it. Loss prevented. - **False Positive:** Model flags a legitimate transaction. The customer's card is declined. Customer calls in frustration. Cost: customer service time ($15) plus potential customer dissatisfaction. - **True Negative:** Model correctly passes a legitimate transaction → Chapter 11: Model Evaluation and Selection
Scenario 3: Medical Screening
**True Positive:** Model identifies a disease. Patient receives treatment early. - **False Positive:** Model says a healthy patient has the disease. Patient undergoes unnecessary additional tests and experiences anxiety. - **True Negative:** Model correctly identifies a healthy patient. No unnecessary intervention → Chapter 11: Model Evaluation and Selection
Seafarers
officers and crew operating vessels, with traditions dating back centuries - **Terminal workers** — crane operators, straddle carrier drivers, stevedores - **Logistics professionals** — operations coordinators, customs brokers, freight forwarders - **Corporate staff** — finance, HR, marketing, strategy → Case Study 2: Maersk's AI-Powered Supply Chain — Transforming the World's Largest Container Shipping Company
SECTION 4: Guardrails and Constraints
[ ] Input validation: `[How inputs are validated before reaching the prompt]` - [ ] Output filtering: `[Any post-processing or filtering applied to outputs]` - [ ] Content moderation: `[Moderation layer, if applicable]` - [ ] PII handling: `[How PII in inputs/outputs is managed]` - [ ] Rate limiting → Appendix B: Templates and Worksheets
Segment assignment
using the clustering approach from Chapter 9, she identifies seven behavioral segments within the loyalty base: Enthusiasts (high engagement, high spend), Routine Buyers (consistent but modest spend), Bargain Hunters (activated primarily by discounts), Lapsed VIPs (formerly high-value, now declining) → Chapter 24: AI for Marketing and Customer Experience
Segmented evaluation
performance broken down by key dimensions (city, time of day, customer segment), ensuring that a model that performed well on average wasn't hiding poor performance on specific subgroups - **Feature importance analysis** — identifying which features contributed most to predictions - **Baseline comparison** — performance relative to simple baseline models → Case Study 1: Uber's Michelangelo — Building an ML Platform at Scale
Semi-supervised learning
Use unsupervised clustering to propagate labels. If you have a small set of labeled examples and a large set of unlabeled data, cluster the data and propagate the labels from labeled points to their cluster-mates. This is particularly useful when labeling is expensive (medical images, legal documents) → Chapter 9: Unsupervised Learning
Serious delinquency
The consumer has one or more accounts with payments 90+ days past due. 2. **Proportion of balances to credit limits is too high** — The consumer is using 85 percent of their available revolving credit. 3. **Length of time accounts have been established** — The consumer's oldest account is only 3 years old → Case Study 2: FICO's Explainable AI Journey — Making Credit Scoring Transparent
Significant media coverage
the campaign was covered by major business, technology, and marketing publications, generating an estimated $50+ million in earned media value. - **Consumer engagement metrics** exceeded benchmarks for traditional digital campaigns, with average time-on-platform of over seven minutes (compared to industry averages) → Case Study 2: Coca-Cola's AI-Generated Advertising — A Campaign Study in Human-AI Creative Collaboration
Simple factual retrieval
"What is Athena's headquarters city?" Adding reasoning steps just adds noise. - **Creative generation** — Writing a marketing tagline does not benefit from explicit reasoning. - **Tasks where speed matters more than accuracy** — Conversational interfaces where users expect instant responses. → Chapter 20: Advanced Prompt Engineering
Size
Is the dataset large enough for your chosen method? Deep learning generally needs thousands of examples; classical ML can work with hundreds. Is it small enough to work with given your compute resources? → Appendix E: Data Sources Guide
So what
why does this matter? What's the insight? 3. **Now what** — what action does this suggest? → Chapter 5: Exploratory Data Analysis
Solutions:
**Popularity-based fallback.** Show the most popular items overall or within broad categories. This is better than nothing but not personalized. - **Demographic defaults.** If basic demographic information is available (location, age, gender from account creation), use it to initialize recommendations → Chapter 10: Recommendation Systems
Stakeholder testing
consulting with key stakeholders (employees, civil society, academic partners) before finalizing appointments, not after - **Scenario planning** --- anticipating potential controversies, preparing responses, and establishing decision protocols before the public announcement - **Onboarding** --- ensuring appointees are prepared before launch → Case Study 1: Google's AI Ethics Board — 10 Days from Launch to Dissolution
Step 1: Choose K
Decide how many clusters you want. (We'll come back to *how* you choose K. For now, imagine someone tells you K = 3.) → Chapter 9: Unsupervised Learning
Step 2: Drop random anchors
Place K points randomly in your data space. These are the initial *centroids* — the centers of your clusters. Think of them as flags dropped blindly onto a map. → Chapter 9: Unsupervised Learning
Step 5: Repeat Steps 3 and 4
With the centroids in new positions, some data points are now closer to a different centroid. Reassign them. Recalculate centroids. Repeat until no points change clusters — the algorithm has *converged*. → Chapter 9: Unsupervised Learning
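The loop described in these steps can be sketched as a compact K-means implementation (toy two-dimensional data with three obvious blobs):

```python
import numpy as np

def kmeans(points, k=3, iters=20, seed=0):
    """Steps 1-5 above: drop K random centroids, assign each point to its
    nearest centroid, move each centroid to the mean of its points, and
    repeat until no centroid moves."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)]  # Step 2
    for _ in range(iters):
        # Step 3: assign each point to the closest centroid
        dists = np.linalg.norm(points[:, None] - centroids[None, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 4: move each centroid to the center of its assigned points
        new_centroids = np.array([
            points[labels == i].mean(axis=0) if np.any(labels == i) else centroids[i]
            for i in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # converged: no points would change clusters
        centroids = new_centroids
    return labels, centroids

# Three well-separated blobs the algorithm should be able to recover.
rng = np.random.default_rng(1)
points = np.vstack([rng.normal(c, 0.3, size=(30, 2))
                    for c in ([0, 0], [5, 5], [0, 5])])
labels, centroids = kmeans(points, k=3)
print(centroids.shape)  # → (3, 2)
```

In practice you would use `sklearn.cluster.KMeans`, which adds smarter initialization (k-means++) and multiple restarts, since random anchors can converge to poor local solutions.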
Strategic clarity
knowing which AI investments create competitive advantage and which are table stakes - **Organizational design** — building team structures that bridge technical and business expertise - **Measurement discipline** — quantifying value without either inflating claims or dismissing genuine impact → Part 6: AI Strategy and Organizational Transformation
Strategic Planning Phase
Start with the **AI Maturity Self-Assessment** (Template 15) to understand where you are. - Use the **AI Strategy One-Pager** (Template 12) to define where you are going. - Apply the **AI Use Case Prioritization Matrix** (Template 2) to decide which initiatives to pursue first. → Appendix B: Templates and Worksheets
Strategic vision
The ability to connect AI capabilities to business strategy, build organizational capacity for AI adoption, and communicate AI's value and limitations to boards, regulators, and the public. (Chapters 31–39) → Answers to Selected Exercises
Strategy Phase
the period where individual AI projects coalesce into an organizational capability. Ravi Mehta establishes Athena's AI Center of Excellence. Grace Chen presents an AI strategy to the board. The CFO demands ROI measurement that satisfies investors. The Chief People Officer launches an upskilling program. → Part 6: AI Strategy and Organizational Transformation
Strengths:
Flexibility to adapt to rapidly evolving technology - Lower compliance costs for businesses (no new bureaucratic requirements) - Leverages existing regulatory expertise in sector-specific contexts - Attractive to AI companies seeking a less burdensome regulatory environment → Chapter 28: AI Regulation --- Global Landscape
Strong data foundations
the organizations that had invested in data quality and infrastructure before the generative AI wave arrived were best positioned to ride it. 3. **Human-in-the-loop design** — not as an afterthought but as a core architectural principle. 4. **Iterative deployment** — starting with internal, low-stakes applications → Case Study 1: ChatGPT's First Year in Business — Hype, Hope, and Hard Lessons
Subject to:
Total budget constraint - Talent constraint (limited ML engineers, data scientists) - Risk tolerance (maximum acceptable portfolio risk) - Strategic alignment (minimum investment in priority areas) → Chapter 34: Measuring AI ROI
Supervised Learning — Classification
Classification overview · Logistic regression by intuition · Decision trees · Random forests · Gradient boosting · Feature engineering · Athena's churn prediction · *Code: `ChurnClassifier`* → AI & Machine Learning for Business
Supervised Learning — Regression
Regression overview · Linear regression · Multiple regression · Regularization (Ridge/Lasso intuition) · Time series basics · Athena's demand forecasting · *Code: `DemandForecaster`* → AI & Machine Learning for Business
sycophancy
agreeing with the user's stated position even when that position is incorrect. If you tell the model "I believe our revenue target of $50 million is achievable" and ask for analysis, the model is more likely to produce supporting arguments than to challenge your assumption. → Chapter 17: Generative AI — Large Language Models

T

Talent:
The "analytics team" consists of two business analysts who primarily build reports in Excel and Tableau - No one at the company has experience building or deploying machine learning models - The IT department is consumed by system maintenance and has no bandwidth for new data initiatives - The data → Chapter 1: The AI-Powered Organization
Technical Documentation (Article 11)
[ ] Prepare technical documentation before the system is placed on the market - [ ] Include: general description of the system, detailed description of elements and development process, monitoring and control specifications, and detailed information about the system's purpose - [ ] Document: intended purpose → Appendix F: AI Regulation Reference
Technical fluency
Not expert-level coding, but the ability to ask the right questions of technical teams, evaluate AI vendor claims, and understand the difference between a proof-of-concept and a production system. (Chapters 1–12) → Answers to Selected Exercises
The "last mile" problem
the gap between a deployed model and a model that is actually used — is the most common cause of AI project failure. Closing it requires co-design with users, friction reduction, feedback loops, and the paradox of override: giving users the power to reject recommendations increases adoption. → Chapter 35: Change Management for AI
The "regulation channels innovation" argument:
GDPR catalyzed a global privacy technology industry worth an estimated $15 billion by 2025 - Companies in regulated industries (pharmaceuticals, financial services, aviation) innovate continuously within regulatory frameworks - Regulation creates demand for compliance tools, audit services, and governance expertise → Chapter 28: AI Regulation --- Global Landscape
The "regulation kills innovation" argument:
The EU has no tech company valued at over $100 billion. The US has many. This gap predates GDPR and the AI Act, but regulation may widen it. - Compliance costs create barriers to entry that disproportionately affect startups - Regulatory uncertainty chills investment: VCs are more cautious about funding AI startups → Chapter 28: AI Regulation --- Global Landscape
The AI-Powered Organization
The AI landscape today · A brief history of AI · AI maturity models · The Athena Retail Group story · Defining AI, ML, DL, and GenAI · Why business leaders must understand AI · Chapter roadmap → AI & Machine Learning for Business
The Beauty Profile
a voluntary questionnaire that asks about skin type, skin tone, hair type, beauty concerns, and product preferences — gives customers explicit control over what information they share. Completing the profile is incentivized (better recommendations) but not required. Customers can update or delete the profile at any time. → Case Study 1: Sephora's AI-Powered Beauty Experience — Personalization Done Right
The Business of Machine Learning
The ML project lifecycle · Framing business problems as ML problems · Success metrics vs. model metrics · Common failure modes · Build vs. buy · Team composition · Athena's first ML initiative → AI & Machine Learning for Business
The current file's content
the code above and below the cursor position, providing immediate context for what the developer is working on. → Case Study 1: GitHub Copilot — Prompt Engineering for Code
The developer's comment or function signature
the most immediate instruction, which acts as the "instruction" component of the prompt. A comment like `# Calculate the compound annual growth rate given initial value, final value, and number of years` tells Copilot exactly what to generate. → Case Study 1: GitHub Copilot — Prompt Engineering for Code
The entire process
from page load to ad display — happens in under 200 milliseconds. → Chapter 24: AI for Marketing and Customer Experience
The Full-Stack ML Team Model
Cross-functional teams that include a data scientist, an ML engineer, a data engineer, and a product manager - Each team owns a set of models end-to-end, from development through production - Requires broader skills but eliminates handoffs → Chapter 12: From Model to Production — MLOps
The governance gap
the distance between AI deployment and AI oversight — is one of the most significant organizational risks in the current business landscape. Closing it requires deliberate investment in governance infrastructure. → Chapter 27: AI Governance Frameworks
The hybrid model typically works as follows:
**Central function responsibilities:** Policy development, standard setting, model registry maintenance, ethics committee support, training and education, regulatory monitoring, reporting to senior leadership - **Business unit responsibilities:** Initial risk classification of AI projects, conducting impact assessments → Chapter 27: AI Governance Frameworks
The Hype-Reality Gap
How has your ability to distinguish hype from reality evolved? Provide a specific example. 2. **Human-in-the-Loop** — Where should humans stay in the loop, and where is full automation appropriate? What principles guide this boundary? 3. **Data as a Strategic Asset** — What does it mean, practically → Chapter 40 Exercises: Leading in the AI Era
The ML Engineer Bridge Model
ML engineers sit between data scientists and operations, translating models into production systems - Reduces handoff friction but creates a new bottleneck (the ML engineers) - Works well for small to medium teams → Chapter 12: From Model to Production — MLOps
The ML Platform Model
A central ML platform team builds shared infrastructure (pipelines, feature store, monitoring, model serving) - Product-aligned ML teams use the platform to build and deploy models - The platform team provides tools and standards; the product teams provide domain expertise and business context - Bes → Chapter 12: From Model to Production — MLOps
The model's accuracy
that it produces reliable outputs. - **The model's fairness** — that it does not systematically disadvantage certain groups (a concern Athena learned viscerally in Chapter 25). - **The organization's intent** — that AI is being deployed to improve the work, not to surveil, control, or eliminate workers. → Chapter 35: Change Management for AI
The OECD AI Principles
inclusive growth, human-centered values, transparency, robustness, and accountability — provide the policy foundation that many national governance frameworks build upon. → Chapter 27: AI Governance Frameworks
The opportunity
Why AI matters for this organization, now 2. **The current state** — Maturity assessment summary (one line: "We are a Developing-stage organization with a governance gap") 3. **The plan** — What you will do, in how many phases, at what cost 4. **The expected return** — Risk-adjusted ROI projection → Chapter 39: Capstone — AI Transformation Plan
The relationships between clusters matter
for example, understanding that customer Segment A is more similar to Segment B than to Segment C - **You need a visual representation** of cluster structure to communicate to stakeholders - **Your dataset is small to medium** (hierarchical clustering is computationally expensive for large datasets) → Chapter 9: Unsupervised Learning
The risks are real:
**Data leakage:** Employees paste proprietary data, customer information, or trade secrets into consumer AI tools. This data may be used for model training or stored in ways that violate privacy regulations. - **Quality and accuracy:** AI-generated work may contain errors, hallucinations, or biases → Appendix D: Frequently Asked Questions
The Traditional Siloed Model (Anti-Pattern)
Data scientists build models → hand off to engineers → engineers deploy → operations monitors - Each handoff is a communication bottleneck and a source of lost context - Nobody owns the model end-to-end → Chapter 12: From Model to Production — MLOps
Thinking Like a Data Scientist
The data science mindset · Structured vs. unstructured data · The CRISP-DM framework · Hypothesis-driven analysis · Correlation vs. causation · Business questions that data can answer · From insight to action → AI & Machine Learning for Business
Tier 1: AI for Everyone
**Audience:** All employees - **Format:** Online, self-paced, 4 to 8 hours - **Learning objectives:** What AI can and cannot do, how AI is being used at the organization, how to interact with AI-powered tools, basic data literacy, ethical considerations - **Success metric:** Completion rate, post-course assessment scores → Chapter 32: Building and Managing AI Teams
Tier 1: Essential for every business professional.
**AI literacy.** Understanding what AI can and cannot do, at a level sufficient to evaluate proposals, ask informed questions, and avoid being misled by hype or vendor marketing. This entire textbook builds this skill. - **Prompt engineering.** The ability to use LLMs effectively is becoming as fundamental as spreadsheet proficiency. → Appendix D: Frequently Asked Questions
Tier 2: AI for Managers
**Audience:** Managers and senior leaders - **Format:** Workshop, 1 to 2 days (instructor-led) - **Learning objectives:** How to identify AI opportunities in their domain, how to frame problems for the AI team, how to evaluate AI project proposals, how to manage AI-augmented teams, how to communicate about AI with their teams → Chapter 32: Building and Managing AI Teams
Tier 3: AI Builder
**Audience:** Power users — analysts, engineers, and domain experts who will work hands-on with AI tools - **Format:** Intensive program, 4 to 8 weeks (blended: online + project-based) - **Learning objectives:** Python for data analysis, basic ML concepts, using no-code/low-code AI platforms (Chapter 22) → Chapter 32: Building and Managing AI Teams
Tier 3: Differentiating for leadership roles.
**AI strategy.** Connecting AI capabilities to business strategy and competitive advantage (Chapter 31). - **Change management for AI.** Leading the organizational transformation that AI requires (Chapter 35). - **AI product thinking.** Designing products and services that leverage AI effectively (Chapter 33). → Appendix D: Frequently Asked Questions
Time Series Forecasting
Time series components · ARIMA by intuition · Facebook Prophet · LSTM for sequences · Ensemble forecasting · Forecast uncertainty · Athena's supply chain forecasting · *Code: Prophet workflow* → AI & Machine Learning for Business
Tool use
the ability to invoke external functions: search the web, query a database, execute code, send an email, call an API - **Memory** --- maintaining context across a long sequence of actions, including what has been tried, what failed, and what succeeded - **Planning and replanning** --- the ability to break a goal into steps and adjust course when a step fails → Chapter 37: Emerging AI Technologies
tornado chart
a visualization that ranks variables by their impact on NPV. In Athena's churn prediction case, the analysis reveals that NPV is most sensitive to annual value (the benefits) and least sensitive to operations costs or discount rate. This tells the team where to focus their measurement effort: validating the annual value estimate → Chapter 34: Measuring AI ROI
translation invariance
the ability to recognize a pattern regardless of its exact position in the image. A cereal box shifted a few pixels to the left or right should still be recognized as a cereal box. → Chapter 15: Computer Vision for Business
Trend
the long-term direction. Is demand growing, shrinking, or stable over months and years? 2. **Seasonality** — repeating patterns at fixed intervals. Daily patterns (weekday vs. weekend), monthly patterns (beginning of month vs. end of month), annual patterns (summer vs. winter). 3. **Residual** — what remains after trend and seasonality are removed: the random, unpredictable fluctuation → Chapter 8: Supervised Learning — Regression
True
A negative silhouette score means the point is, on average, closer to points in a neighboring cluster than to points in its own cluster, suggesting misassignment. → Chapter 9 Quiz: Unsupervised Learning
True.
This is a direct quote from the chapter's caution about AutoML risks. → Chapter 22 Quiz: No-Code / Low-Code AI
Typical signs:
Employees use ChatGPT on personal accounts for work tasks - Data is siloed in departmental spreadsheets and legacy systems - No one has the title "Chief AI Officer" or "VP of Data" - AI is discussed in strategy meetings as something "we should look into" → Chapter 1: The AI-Powered Organization

U

underfitting
occurs when a model is too simple to capture the true pattern, performing poorly on both training and test data. → Chapter 8: Supervised Learning — Regression
Unscented lotion
Pregnant women in the second trimester often switch to unscented products as their skin becomes more sensitive and their sense of smell heightens. - **Mineral supplements** — Particularly calcium, magnesium, and zinc, which are commonly recommended during pregnancy. - **Extra-large bags of cotton balls** → Case Study 1: Target's Pregnancy Prediction — When Data Science Gets Too Good
Unsupervised as exploration
Before building a churn model, cluster customers to understand the natural segments. The clusters might suggest different churn dynamics for different groups, leading to segment-specific models rather than one-size-fits-all. → Chapter 9: Unsupervised Learning
Unsupervised as monitoring
Use anomaly detection to monitor deployed models. If the distribution of incoming data shifts (more anomalies than usual), it may signal concept drift — the model's predictions may be degrading. → Chapter 9: Unsupervised Learning
Unsupervised as preprocessing
Use PCA to reduce dimensions before training a supervised classifier. Use clustering to create a "segment" feature that improves supervised model performance. → Chapter 9: Unsupervised Learning
Unsupervised Learning
Clustering overview · K-means · Hierarchical clustering · DBSCAN · Dimensionality reduction (PCA) · Anomaly detection · Athena's customer segmentation · *Code: `CustomerSegmenter`* → AI & Machine Learning for Business
Unusual transaction chains
money flowing through a sequence of accounts in a pattern consistent with laundering - **Account takeover patterns** — an existing account suddenly connecting to new devices, new merchants, and new geographic locations simultaneously → Case Study 2: PayPal's Anomaly Detection — Finding Fraud Without Labels
Use closed models when:
You need frontier-level capability (the most sophisticated reasoning, longest context windows, multimodal processing) - You want minimal infrastructure burden - Your volume is moderate (thousands, not millions, of queries per day) - You are prototyping and need speed to market → Chapter 37: Emerging AI Technologies
Use open models when:
Data privacy is a hard requirement (regulated industries, sensitive data) - You need deep customization for a specific domain or task - Your inference volume is high enough that API costs become significant - You need to run AI in environments without reliable internet connectivity - Vendor dependence is a concern (open models reduce lock-in) → Chapter 37: Emerging AI Technologies
User experience criteria:
"Why was this recommended?" explanation available for 100% of recommendations - "Not for me" feedback button processes within 1 second and updates recommendations within the next session - Opt-out available for users who prefer non-personalized experience → Chapter 33: AI Product Management

V

V = $370,055/year
the project needs to generate at least approximately $370,000 in annual value to break even over five years. → Answers to Selected Exercises
Version number
Increment on every change - **Change log** — What changed and why - **Author** — Who made the change - **Test results** — Pass rate before and after the change - **Rollback plan** — How to revert if the new version underperforms → Chapter 20: Advanced Prompt Engineering
Virtual try-on data
which products customers try on virtually, how long they spend, which they save — feeds back into the personalization engine with the customer's knowledge and consent. → Case Study 1: Sephora's AI-Powered Beauty Experience — Personalization Done Right
Volume commitments
commit to a minimum annual spend in exchange for per-unit discounts - **Multi-year terms** — longer commitments yield deeper discounts (but increase lock-in) - **Service credits** — negotiate SLA credits for downtime or performance failures - **Training and support** — request included training credits → Chapter 23: Cloud AI Services and APIs

W

walk-forward validation
and it is fundamentally different from the random k-fold cross-validation used in standard machine learning. Here, we train on 365 days, forecast the next 30 days, then slide the window forward by 90 days and repeat. This mimics real forecasting: you always train on the past and predict the future, never the reverse. → Chapter 16: Time Series Forecasting
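The windowing scheme described here can be sketched in a few lines of Python (an illustrative helper, not the chapter's code; the 365/30/90-day figures come from the entry above):

```python
def walk_forward_splits(n_days, train_len=365, horizon=30, step=90):
    """Yield (train, test) index ranges that slide forward through time."""
    start = 0
    while start + train_len + horizon <= n_days:
        train = range(start, start + train_len)                       # always the past
        test = range(start + train_len, start + train_len + horizon)  # always the future
        yield train, test
        start += step  # slide the window forward by 90 days

# Two years of daily data produces four leak-free evaluation folds
splits = list(walk_forward_splits(n_days=730))
```

Unlike random k-fold splits, every fold here respects time order, so no future information leaks into training.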
Weaknesses:
Lack of legal enforceability creates uncertainty about actual protections - Fragmented across multiple regulators with varying interpretations, resources, and enforcement enthusiasm - No clear mechanism for addressing cross-cutting AI risks that don't fit neatly into a single sector - Potential regulatory gaps where no single regulator has clear authority → Chapter 28: AI Regulation --- Global Landscape
Weekly (30-60 minutes):
Read one long-form article or research summary. Harvard Business Review, MIT Sloan Management Review, and Stratechery cover AI from a business strategy perspective. ArXiv summaries (via Papers With Code) cover technical advances. - Experiment with one new AI tool or feature. Try a new prompt technique → Appendix D: Frequently Asked Questions
What a model registry tracks:
Model artifacts (serialized model files) - Model metadata (algorithm, hyperparameters, training data hash, training date) - Performance metrics (accuracy, precision, recall, AUC on test data) - Lineage (which data, which code, which pipeline produced this model) - Stage (development, staging, production, archived) → Chapter 12: From Model to Production — MLOps
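As a sketch, one registry entry might look like the following (the field names are illustrative assumptions, not any particular tool's schema):

```python
from dataclasses import dataclass

@dataclass
class ModelRecord:
    name: str
    version: int
    algorithm: str
    hyperparameters: dict
    training_data_hash: str     # ties the model to an exact data snapshot
    metrics: dict               # test-set performance, e.g. {"auc": 0.91}
    lineage: dict               # code commit, pipeline run, data sources
    stage: str = "development"  # development -> staging -> production -> archived

record = ModelRecord(
    name="churn-classifier",
    version=3,
    algorithm="gradient_boosting",
    hyperparameters={"n_estimators": 300, "max_depth": 4},
    training_data_hash="sha256:9f2c...",
    metrics={"auc": 0.91, "recall": 0.74},
    lineage={"commit": "a1b2c3", "pipeline_run": 57},
)
record.stage = "staging"  # promotion is an explicit, auditable state change
```

Treating stage changes as recorded state transitions is what makes a registry auditable: you can always answer "which model was in production on this date, and how was it trained?"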
What agents can do reliably:
Research and information synthesis (gathering information from multiple sources, summarizing, and producing structured reports) - Code generation and debugging (writing, testing, and iterating on software --- see Case Study 1 on AI coding agents) - Structured data workflows (extracting data from documents and entering it into structured systems) → Chapter 37: Emerging AI Technologies
What agents struggle with:
Tasks requiring genuine judgment about ambiguous situations - High-stakes decisions where errors have serious consequences (medical, legal, financial) - Tasks that require understanding organizational context, politics, or unwritten norms - Long-horizon planning (more than 10-15 sequential steps tends to compound errors) → Chapter 37: Emerging AI Technologies
What has been demonstrated:
Quantum algorithms for specific linear algebra operations (like solving linear systems via the HHL algorithm) that are theoretically faster than classical equivalents - Quantum kernel methods for small classification problems - Quantum-inspired classical algorithms that borrow concepts from quantum computing → Chapter 37: Emerging AI Technologies
What has not been demonstrated:
Any quantum machine learning algorithm outperforming classical methods on a practically useful problem at a meaningful scale - Quantum advantage for training neural networks - Scalable quantum hardware capable of running the algorithms that theorists have designed → Chapter 37: Emerging AI Technologies
What It Took:
3 months of data pipeline development - 2 months of model development and validation - 6 months of phased rollout (one region at a time) - 1 full-time data engineer, 1 data scientist, and 0.5 FTE from the supply chain planning team - Total cost: approximately $450,000 in the first year (primarily personnel) → Chapter 16: Time Series Forecasting
What this means for business leaders:
Focus your risk management on the near-term, concrete risks that affect your organization and stakeholders today. - Support and comply with regulatory frameworks designed to manage AI risks at a societal level (Chapter 28). - Stay informed about the long-term debate without letting it paralyze near- → Appendix D: Frequently Asked Questions
What to recommend
product selections ranked by affinity score, filtered by inventory availability and margin targets - **How to frame it** — messaging tone and content adapted to segment (an Enthusiast receives "new arrivals" framing; a Lapsed VIP receives "we miss you" framing; a Bargain Hunter receives "exclusive markdowns" framing) → Chapter 24: AI for Marketing and Customer Experience
When to conduct one:
Before deploying any AI system that makes or influences decisions affecting people (hiring, lending, pricing, content moderation, resource allocation). - When using AI in high-risk domains as defined by the EU AI Act (Chapter 28). - When processing sensitive personal data for AI/ML purposes. - Befor → Appendix D: Frequently Asked Questions
When to use batch prediction:
The business does not need real-time answers (daily customer churn scores, weekly demand forecasts) - The prediction set is finite and known in advance (score all current customers, forecast all SKUs) - Latency requirements are lenient (predictions needed within hours, not milliseconds) - The organization prefers a simple, scheduled job over an always-on service → Chapter 12: From Model to Production — MLOps
When to use edge deployment:
Network connectivity is unreliable or unavailable - Latency requirements are extreme (autonomous vehicle decisions in milliseconds) - Privacy requirements prohibit sending data to the cloud - The model is small enough to run on constrained hardware → Chapter 12: From Model to Production — MLOps
When to use it:
Exploratory or low-volume use cases - Tasks where general knowledge is sufficient - When requirements change frequently (prompts are easy to update; fine-tuning is not) - When budget or technical expertise is limited → Chapter 17: Generative AI — Large Language Models
When to use real-time inference:
The prediction must be made at the moment of interaction (product recommendations during a browsing session, fraud detection at the point of transaction) - The input data is unique to the request and not known in advance (a specific customer's current shopping cart) - Low latency is critical (sub-se → Chapter 12: From Model to Production — MLOps
When to use serverless inference:
Traffic is sporadic or unpredictable (infrequent but time-sensitive predictions) - The model is small enough to load quickly (cold start must be acceptable) - The team wants to minimize infrastructure management → Chapter 12: From Model to Production — MLOps

Y

y(t) = g(t) + s(t) + h(t) + error
**g(t)** is the trend — either linear or logistic growth, with automatic detection of changepoints where the growth rate shifts - **s(t)** is seasonality — modeled using Fourier series (fancy sine and cosine waves that capture periodic patterns) - **h(t)** is the holiday/event effect — additive bumps around known holidays and events → Chapter 16: Time Series Forecasting
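A toy illustration of the additive structure (plain Python with made-up components, not a fitted Prophet model):

```python
import math

def g(t):  # trend: simple linear growth
    return 100 + 0.5 * t

def s(t):  # seasonality: one Fourier pair with a 7-day period
    return 10 * math.sin(2 * math.pi * t / 7) + 3 * math.cos(2 * math.pi * t / 7)

def h(t):  # holiday/event effect: an additive bump on known event days
    return 40 if t in (24, 25) else 0

# y(t) = g(t) + s(t) + h(t)  (the error term is omitted here)
y = [g(t) + s(t) + h(t) for t in range(60)]
```

Because the components add, each one can be inspected separately, which is what makes Prophet-style forecasts comparatively easy to explain to business stakeholders.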
Yes
Probably not churning, but check further: have they made more than 3 purchases this year? **Yes** --> Low risk. **No** --> Medium risk. If the initial answer was **No** --> possibly churning: is their average order value above $75? **Yes** --> Medium risk (high-value customer going quiet -- investigate). **No** --> High risk. → Chapter 7: Supervised Learning -- Classification
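The branches above can be written as plain rules. This is a sketch under two stated assumptions: the excerpt does not name the first split question (taken here as recent purchase activity), and the outcome of the last branch is truncated in the text (taken here as high risk).

```python
def churn_risk(recent_purchase: bool, purchases_this_year: int,
               avg_order_value: float) -> str:
    if recent_purchase:            # "Yes": probably not churning, but check further
        return "low" if purchases_this_year > 3 else "medium"
    if avg_order_value > 75:       # "No": possibly churning
        return "medium"            # high-value customer going quiet -- investigate
    return "high"                  # assumed outcome of the truncated final branch
```

A decision tree learned from data has exactly this shape: nested threshold checks that a non-technical reader can trace by hand.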

Z

Z-score method
Calculate the mean and standard deviation of each feature. Points more than 2 or 3 standard deviations from the mean on any feature are flagged as potential anomalies. Simple, interpretable, but assumes normally distributed data. → Chapter 9: Unsupervised Learning