Chapter 14 Cheat Sheet — Doing the Right Thing

One-Line Summaries

Concept                         One-Liner
Algorithmic bias:               Models inherit historical discrimination; accuracy does not equal fairness
Feedback loops:                 Model output becomes training data → biases self-reinforce
Proxy discrimination:           Discrimination through correlated variables (ZIP code for race) even without protected features
Fairness definitions:           Multiple incompatible mathematical definitions; choosing between them is a values question
Right to explanation:           GDPR Art. 22 restricts solely automated high-stakes decisions; Arts. 13-15 require meaningful information about the logic involved, which interpretable models make easier to provide
Contextual integrity:           Privacy = appropriate information flow by context; not just secrecy
Consent fiction:                Informed consent at internet scale is practically impossible; architecture must do the work
Data as power:                  Behavioral data concentration = unprecedented institutional knowledge asymmetry
Industrial Revolution analogy:  Individual ethics insufficient; structural regulation necessary and historically validated
Purpose limitation:             Must be architectural (access controls, retention limits), not just policy-stated

Types of Algorithmic Bias — Quick Reference

Historical bias:        Training data reflects past discrimination
  Example:              Amazon hiring model trained on male-dominated past hires
  Fix:                  Reweight data; audit for disparate impact; choose different labels

Representation bias:    Underrepresented groups have insufficient training examples
  Example:              Facial recognition worst on dark-skinned women (up to 34.7% error vs 0.8% for light-skinned men)
  Fix:                  Diversify data collection; measure error rates by group

Measurement bias:       Feature or label is a flawed proxy that correlates with a protected characteristic
  Example:              ZIP code → race → credit scoring discrimination
  Fix:                  Audit feature correlations; consider removing correlated features

Aggregation bias:       One model applied to heterogeneous groups without adaptation
  Example:              Single global model applied to all demographics equally
  Fix:                  Segment models or add demographic context features

Feedback loop bias:     Model output used as future training labels
  Example:              Predictive policing → more arrests in predicted areas → model reinforced
  Fix:                  Break loop; use independent ground truth for labels

Deployment bias:        Model used in context it was not built or tested for
  Example:              Risk tool (e.g. COMPAS) validated on one population, then used in other jurisdictions for sentencing
  Fix:                  Validate in deployment context; restrict use to validated contexts
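
The "audit feature correlations" fix listed under measurement bias can be sketched roughly as below. This is a minimal illustration, not a complete audit: the function name `proxy_strength` and the toy records are hypothetical, and it only checks how much better one categorical feature lets you guess the protected attribute than guessing the majority group would.

```python
from collections import Counter, defaultdict

def proxy_strength(feature_values, protected_values):
    """Return (baseline, with_feature) accuracy for guessing the protected attribute.

    baseline:     always guess the overall most common protected value.
    with_feature: guess the most common protected value within each feature value.
    A large gap between the two suggests the feature can act as a proxy.
    """
    if len(feature_values) != len(protected_values) or not feature_values:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(protected_values)
    baseline = Counter(protected_values).most_common(1)[0][1] / n
    by_feature = defaultdict(Counter)
    for f, p in zip(feature_values, protected_values):
        by_feature[f][p] += 1
    # For each feature value, count how many rows the majority guess gets right.
    correct = sum(c.most_common(1)[0][1] for c in by_feature.values())
    return baseline, correct / n

# Toy data (made up for illustration): ZIP code vs. a protected group label.
zips = ["10001", "10001", "10001", "20002", "20002", "20002", "30003", "30003"]
groups = ["A", "A", "A", "B", "B", "A", "B", "A"]
baseline, with_zip = proxy_strength(zips, groups)
print(baseline, with_zip)
```

Removing the feature is only one option; a feature that is a strong proxy may also carry legitimate signal, so the result is an input to a decision rather than the decision itself.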

Feedback Loop Anatomy

Feedback Loop Example: Predictive Policing

  [Historical crime data]
           │
           ▼
  [Train model: predict where crime will occur]
           │
           ▼
  [Deploy: send more police to predicted areas]
           │
           ▼
  [Police discover more crime in those areas]
           │
           ▼
  [New crime data added to training set]
           │
           └──────────────────────────────────▶ (loop repeats, amplified)

How to break the loop:
  ├─ Use independent ground truth (victim reports, not police reports)
  ├─ Random exploration (don't always follow model predictions)
  ├─ Human review before feeding output back as training labels
  └─ Measure and monitor group-level outcomes over time
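
The loop and one way of breaking it can be sketched with a toy simulation. Everything here is an assumption made for illustration: two areas with identical true crime rates, a history that already over-represents area 0, a "model" that simply sends 80% of patrols to the area with the most recorded crime, and victim reports that arrive independently of where patrols go.

```python
def simulate(rounds, independent_labels):
    """Return area 0's share of recorded crime after `rounds` retraining cycles."""
    true_rate = [0.1, 0.1]     # identical underlying crime rates (assumption)
    recorded = [60.0, 40.0]    # historical data already over-represents area 0
    patrols = 100
    reports_per_round = 50     # victim-report volume per area, unrelated to patrols
    for _ in range(rounds):
        # "Model": rank areas by recorded crime; send 80% of patrols to the top one.
        top = 0 if recorded[0] >= recorded[1] else 1
        allocation = [0.2 * patrols, 0.2 * patrols]
        allocation[top] = 0.8 * patrols
        for area in (0, 1):
            if independent_labels:
                # Labels come from victim reports, so patrol placement is irrelevant.
                new = reports_per_round * true_rate[area]
            else:
                # Labels come from what police find, which scales with patrols sent.
                new = allocation[area] * true_rate[area]
            recorded[area] += new
    return recorded[0] / sum(recorded)

with_loop = simulate(20, independent_labels=False)
loop_broken = simulate(20, independent_labels=True)
print(round(with_loop, 3), round(loop_broken, 3))
```

Under these assumptions area 0 starts at a 60% share of recorded crime; with the loop intact the share climbs toward 80%, while independent labels pull it back toward the true 50%.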

Privacy Regulations at a Glance

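GDPR (EU/EEA)
  Core rights:                Access, erasure, portability, explanation
  Key technical requirement:  Answer erasure requests within one month; DPIA for high-risk processing
  Max fine:                   €20M or 4% of global annual turnover, whichever is higher

CCPA/CPRA (California)
  Core rights:                Know, delete, opt out of sale or sharing
  Key technical requirement:  Honor opt-out requests; disclose sharing
  Max fine:                   $7,500 per intentional violation

LGPD (Brazil)
  Core rights:                Similar to GDPR
  Key technical requirement:  Data minimization, purpose limitation
  Max fine:                   2% of revenue in Brazil, capped at R$50M per violation

DPDP Act 2023 (India; replaces the withdrawn PDPB)
  Core rights:                Access, correction, erasure
  Key technical requirement:  Consent notices and breach notification; transfers barred only to government-restricted countries
  Max fine:                   Up to ₹250 crore

EU AI Act (EU)
  Core rights:                Human oversight for high-risk AI
  Key technical requirement:  Conformity assessment, bias testing, logging
  Max fine:                   €35M or 7% of global annual turnover (prohibited AI practices)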

EU AI Act: Risk Tiers

UNACCEPTABLE RISK (banned):
  ├─ Social scoring by governments
  ├─ Real-time remote biometric ID in public spaces by law enforcement (narrow exceptions)
  ├─ Emotion recognition in workplace/schools
  └─ AI that manipulates people's behavior covertly

HIGH RISK (regulated — most relevant for data engineers):
  ├─ Employment: CV screening, performance evaluation, promotion decisions
  ├─ Credit scoring and financial decisions
  ├─ Criminal justice: recidivism prediction, risk assessment
  ├─ Education: scoring exams, admission, monitoring students
  └─ Critical infrastructure management

High-risk requirements:
  ✓ Conformity assessment before deployment
  ✓ Technical documentation (training data, architecture, testing)
  ✓ Bias testing disaggregated by demographic group
  ✓ Human oversight capability
  ✓ Logging of decisions for audit
  ✓ Transparency disclosure to affected individuals

LIMITED RISK (transparency required):
  └─ Chatbots, deepfakes: must disclose it is AI

MINIMAL RISK (no requirements):
  └─ Spam filters, recommendation systems (mostly)
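
For the "logging of decisions for audit" requirement, one possible shape for a log entry is sketched below. The schema is hypothetical: the EU AI Act does not prescribe these field names, and `decision_record`, `credit-v3` and the feature names are made up for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone

def decision_record(model_version, features, decision, reviewer=None):
    """Build one audit-log entry for an automated decision (hypothetical schema)."""
    # Serialize inputs in a canonical key order so equal inputs give equal hashes.
    canonical = json.dumps(features, sort_keys=True, separators=(",", ":"))
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "input_sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        "decision": decision,
        "human_reviewer": reviewer,  # None means no human has reviewed it yet
    }

rec = decision_record("credit-v3", {"income": 52000, "tenure_months": 18}, "refer")
print(json.dumps(rec, indent=2))
```

Hashing the inputs lets an auditor confirm what the model saw without copying raw features into the log, but a hash of low-entropy inputs can be guessed by brute force, so it should not be treated as anonymization.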

Ethical Decision Framework for Data Engineers

Step 1: MINIMALISM TEST
  Q: Do we have a specific, documented use for this data?
  Q: Could we achieve the same goal with less data?
  If no clear purpose → do not collect

Step 2: CONTEXTUAL INTEGRITY TEST
  Q: In what context was this data originally shared?
  Q: Does the proposed new use respect those contextual norms?
  Medical data shared for care, reused for insurance pricing → violation
  Location shared for navigation, reused for advertising → likely violation

Step 3: DISPARATE IMPACT TEST
  Q: What are error rates disaggregated by demographic group?
  Q: Does any group bear a higher false positive/negative rate?
  Q: What are the consequences of errors for the individual?
  Run before deployment; run on an ongoing basis

Step 4: POWER ASYMMETRY TEST
  Q: Does this system create power asymmetries?
  Q: Do surveilled people have meaningful recourse?
  Q: Could this system be turned against the people it serves?

Step 5: REGRET TEST
  Q: Would I be comfortable if the affected people saw exactly what we do?
  Q: Would I be proud explaining this to a journalist covering algorithmic harm?
  If "no" to either → redesign
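
The Step 3 questions about disaggregated error rates can be answered with a few lines of code. The sketch below assumes binary labels and predictions (1 = positive) and a group label per record; the function name `error_rates_by_group` and the toy data are hypothetical.

```python
from collections import defaultdict

def error_rates_by_group(y_true, y_pred, groups):
    """Return {group: {"fpr": ..., "fnr": ...}} from binary labels and predictions."""
    counts = defaultdict(lambda: {"fp": 0, "tn": 0, "fn": 0, "tp": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            key = "tp" if pred == 1 else "fn"
        else:
            key = "fp" if pred == 1 else "tn"
        counts[group][key] += 1
    rates = {}
    for group, c in counts.items():
        negatives = c["fp"] + c["tn"]
        positives = c["fn"] + c["tp"]
        rates[group] = {
            # None marks a rate that is undefined because the group has no such cases.
            "fpr": c["fp"] / negatives if negatives else None,
            "fnr": c["fn"] / positives if positives else None,
        }
    return rates

# Toy data (made up): same base rate in both groups, different error profiles.
y_true = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1]
groups = ["A"] * 6 + ["B"] * 6
rates = error_rates_by_group(y_true, y_pred, groups)
print(rates)
```

In this toy data group A has twice the false positive rate of group B even though overall accuracy looks reasonable. Running the same check on production predictions at a regular interval covers the "ongoing basis" part of the step.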

Surveillance Capitalism Data Flow

User visits website / uses app
          │
          ▼
First-party data collected (login, purchases, clicks)
          │
          ▼
Third-party pixels/SDKs fire (Google Analytics, Meta Pixel, etc.)
          │
          ├──▶ Advertising network receives: user ID, URL, timestamp, device
          │
          ▼
Real-time bidding (RTB): user profile broadcast to 500+ ad buyers
          │
          └──▶ Highest bidder's ad shown (~100ms)
          
Data broker layer:
  ├─ Purchases behavioral data from apps, loyalty programs, public records
  ├─ Aggregates: name, address, income, health, political affiliation
  └─ Sells to insurers, employers, law enforcement, political campaigns

User knowledge: essentially none
Company knowledge: comprehensive behavioral profile updated continuously

Fairness Definitions (and Why They Conflict)

Demographic Parity:    Equal positive prediction rate across groups
  P(ŷ=1 | group=A) = P(ŷ=1 | group=B)

Equalized Odds:        Equal true positive AND false positive rates
  P(ŷ=1 | Y=1, group=A) = P(ŷ=1 | Y=1, group=B)  [equal TPR]
  P(ŷ=1 | Y=0, group=A) = P(ŷ=1 | Y=0, group=B)  [equal FPR]

Predictive Parity:     Equal positive predictive value
  P(Y=1 | ŷ=1, group=A) = P(Y=1 | ŷ=1, group=B)

Counterfactual:        Same prediction if protected attribute were different

KEY INSIGHT (Chouldechova 2017):
  Predictive parity and equalized odds CANNOT BOTH BE SATISFIED when base rates
  differ between groups (unless predictions are perfect), so adding demographic
  parity cannot make all three hold either
  
  This is not a technical limitation to overcome.
  It is a values question: which errors harm people more?
  That is a political and ethical decision, not a model parameter.
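
The conflict follows from the confusion-matrix definitions alone. Writing p for a group's base rate P(Y=1), fixing the positive predictive value (PPV) and the true positive rate (TPR) forces FPR = p/(1-p) × (1-PPV)/PPV × TPR. The sketch below just evaluates that identity for two hypothetical groups; the function names and numbers are illustrative.

```python
def implied_fpr(base_rate, ppv, tpr):
    """False positive rate forced by a given base rate, PPV and TPR.

    Derived from PPV = TP / (TP + FP), TP = TPR * p * N and FP = FPR * (1 - p) * N.
    """
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * tpr

def selection_rate(base_rate, ppv, tpr):
    """P(y_hat = 1): true positives as a share of the group, divided by PPV."""
    return tpr * base_rate / ppv

# Two groups with the SAME PPV and TPR but different base rates.
fpr_a = implied_fpr(base_rate=0.5, ppv=0.8, tpr=0.6)
fpr_b = implied_fpr(base_rate=0.2, ppv=0.8, tpr=0.6)
sel_a = selection_rate(base_rate=0.5, ppv=0.8, tpr=0.6)
sel_b = selection_rate(base_rate=0.2, ppv=0.8, tpr=0.6)
print(fpr_a, fpr_b, sel_a, sel_b)
```

With equal PPV (predictive parity) and equal TPR, the false positive rates come out at about 0.15 and 0.0375, so equalized odds fails; the selection rates also differ (about 0.375 vs 0.15), so demographic parity fails too.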

Key Case Studies

COMPAS (2016)
  System:   Recidivism prediction in US courts
  Harm:     About 2x the false high-risk rate for Black defendants vs white defendants
  Lesson:   Aggregate accuracy hides demographic disparate impact

Amazon hiring (2018)
  System:   Resume screening ML
  Harm:     Systematically downrated women’s resumes
  Lesson:   Historical training data encodes historical discrimination

Gender Shades (2018)
  System:   Commercial facial recognition APIs
  Harm:     Up to 34.7% error on dark-skinned women vs 0.8% on light-skinned men
  Lesson:   Representation bias in training data = unequal service quality

PredPol (ongoing)
  System:   Predictive policing
  Harm:     Over-policing in communities of color
  Lesson:   Feedback loops amplify rather than measure crime

Cambridge Analytica (2018)
  System:   Psychological targeting
  Harm:     Political manipulation using Facebook data
  Lesson:   Data collected for one purpose repurposed without consent

YouTube radicalization
  System:   Content recommendation
  Harm:     Algorithmic progression toward extreme content
  Lesson:   Engagement optimization creates harmful feedback loops

Data Minimalism Principles

COLLECTION:
  Don't collect: data you don't have a specific documented use for
  Don't collect: more precision than needed (city vs exact GPS)
  Don't collect: in raw form if aggregated form is sufficient

RETENTION:
  Define retention period at design time, not retrospectively
  Automate deletion (TTL, deletion pipelines)
  Log retention ≠ endless retention; plan for right-to-erasure

SHARING:
  Default to not sharing
  Document every third-party data flow (required by GDPR)
  Evaluate each third-party's privacy practices before enabling pixel/SDK

STORAGE:
  Store user IDs, not names/emails, in event logs and analytics
  PII should live in one authoritative store, referenced elsewhere by ID
  Separation enables erasure propagation: delete one store, not thousands
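
Several of these principles can be enforced in code rather than policy. The sketch below is a toy in-memory version under stated assumptions: `pii_store`, `event_log`, the 90-day TTL and the one-decimal location precision are all hypothetical choices, and a real system would use a database with access controls.

```python
from datetime import datetime, timedelta, timezone

pii_store = {}   # the single authoritative PII store: user_id -> name/email
event_log = []   # analytics events carry only user_id, never name or email

def coarsen(lat, lon, decimals=1):
    """Reduce location precision; one decimal place of latitude is roughly 11 km."""
    return round(lat, decimals), round(lon, decimals)

def record_event(user_id, kind, lat, lon, now):
    """Store an event with a coarse location and an ID reference only."""
    event_log.append({"user_id": user_id, "kind": kind,
                      "location": coarsen(lat, lon), "at": now})

def erase_user(user_id):
    """Erasure touches one store; events no longer resolve to a name or email."""
    pii_store.pop(user_id, None)

def purge_expired(now, ttl=timedelta(days=90)):
    """Retention limit enforced in code: drop events older than the TTL."""
    event_log[:] = [e for e in event_log if now - e["at"] <= ttl]

now = datetime(2026, 1, 1, tzinfo=timezone.utc)
pii_store["u42"] = {"name": "Ada Example", "email": "ada@example.com"}
record_event("u42", "checkout", 40.7128, -74.0060, now - timedelta(days=120))
record_event("u42", "login", 40.7128, -74.0060, now)
purge_expired(now)   # the 120-day-old event is dropped
erase_user("u42")    # the remaining event keeps only a pseudonymous ID
print(event_log, pii_store)
```

Events that keep a user ID are pseudonymous rather than anonymous, so whether they must also be deleted on an erasure request is a legal question this sketch does not settle.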

Professional Responsibility Framing

The "just following specs" argument:
  Engineer  → "I just implemented what was specified"
  PM        → "I just defined what the business needed"  
  Executive → "I just approved what the team built"
  No one made the discriminatory decision individually.
  The system did.

Why this is insufficient:
  Civil engineers: professionally licensed; legally liable for structural safety
  Doctors: professional oath; cannot "just follow orders" against patient welfare
  Data engineers: no comparable framework yet — but the harms are at comparable scale

What professional responsibility means in practice:
  1. Raise ethical concerns during design, not after deployment
  2. Require disparate impact analysis before deploying classification models
  3. Document known limitations and failure modes
  4. Push back on use cases that violate contextual integrity
  5. Support, rather than resist, regulatory accountability frameworks

Quick Revision Time: 8 minutes
Interview Prep: 20 minutes
Last Updated: 2026-05-29