Compliance for ML Systems
ML compliance has become a real engineering workstream. SOC 2, HIPAA, GDPR, and the EU AI Act all touch how you train, store, and serve models. Here is what they actually require.
Regulatory map
The regulations that touch ML systems in 2026:
- EU AI Act, risk-tiered framework for AI systems sold in or used in the EU. Applies extraterritorially.
- GDPR, data protection. Implications for ML: the (contested) right to explanation, consent for training data, deletion rights affecting trained models.
- US Executive Orders and state laws, patchwork. CCPA in California; emerging AI rules in NY, IL, and elsewhere. SEC and FTC have AI-specific guidance.
- Sectoral rules, HIPAA for healthcare, SOX for financial controls, FCRA for credit decisions, EEOC-enforced anti-discrimination rules for hiring decisions.
- Industry standards, NIST AI RMF, ISO/IEC 42001 (AI management systems). Voluntary but increasingly expected.
The territorial scope. The EU AI Act and GDPR apply if EU users are affected. US state laws apply if residents of that state are affected. Sectoral rules apply by industry. Map which regulations apply by reviewing your user base, your data sources, and your operational locations.
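The mapping exercise can start as a simple lookup. The sketch below is illustrative only: `SystemProfile`, the region codes, and the sector names are hypothetical, and the rules are a coarse first pass to review with counsel, not a legal determination.

```python
from dataclasses import dataclass, field


@dataclass
class SystemProfile:
    """Facts about one ML system that drive regulatory scope (hypothetical schema)."""
    name: str
    user_regions: set[str] = field(default_factory=set)  # e.g. {"EU", "US-CA"}
    sectors: set[str] = field(default_factory=set)        # e.g. {"healthcare"}
    processes_personal_data: bool = False


# Assumption: a coarse sector-to-rule table taken from the list in this article.
SECTOR_RULES = {
    "healthcare": "HIPAA",
    "financial-controls": "SOX",
    "credit": "FCRA",
    "hiring": "EEOC-enforced hiring rules",
}


def applicable_regulations(profile: SystemProfile) -> list[str]:
    """Return a first-pass list of regulations to review for this system."""
    regs = []
    if "EU" in profile.user_regions:
        regs.append("EU AI Act")
        if profile.processes_personal_data:
            regs.append("GDPR")
    if "US-CA" in profile.user_regions and profile.processes_personal_data:
        regs.append("CCPA")
    for sector in sorted(profile.sectors):
        if sector in SECTOR_RULES:
            regs.append(SECTOR_RULES[sector])
    return regs
```

Running this over every system in your inventory gives the regulation-to-system map; systems that return an empty list still deserve a manual check.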
The risk-tier framing (EU AI Act). Different obligations per tier. A small set of practices is prohibited outright. Most consumer AI is minimal-risk (no specific obligations) or limited-risk (transparency obligations only). Specific use cases (hiring, credit, education, medical, law enforcement) are high-risk with substantial compliance burden.
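The GDPR ML implications. Article 22: individuals have the right not to be subject to solely automated decisions with legal or similarly significant effects; where such decisions are permitted, safeguards include the right to obtain human intervention. Article 17: the right to erasure ("right to be forgotten"). Does erasure require model retraining? Legal interpretation is still evolving; a conservative reading requires deletion-aware ML processes.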
The sectoral overlay. Healthcare ML: HIPAA + Medical Device regulations + EU AI Act high-risk. Financial ML: SOX + FCRA + bank regulators + EU AI Act. The overlapping rules require multi-disciplinary compliance work.
Engineering controls
The technical capabilities you need:
- Data lineage, for any model output, what training data influenced it.
- Audit logs, every model decision with input, output, model version, timestamp.
- Bias monitoring, track outcome distributions across demographic groups.
- Model versioning, every deployed model is reproducible from code + data.
- Explanation generation, feature attributions per decision (where regulations require).
- Deletion / opt-out, process for removing user data and updating models.
- Human-review checkpoints, for high-stakes decisions, escalation paths to humans.
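The human-review checkpoint in the list above can be sketched as a routing rule. Everything here is an assumption for illustration: the set of high-stakes use cases, the `Decision` fields, and the policy itself (escalate adverse or low-confidence high-stakes decisions) would come from your own risk assessment.

```python
from dataclasses import dataclass

# Assumption: which use cases count as high-stakes is a policy decision.
HIGH_STAKES_USE_CASES = {"hiring", "credit", "medical"}


@dataclass
class Decision:
    use_case: str
    favorable: bool    # did the model decide in the person's favour?
    confidence: float  # model confidence in [0, 1]


def route(decision: Decision, min_confidence: float = 0.9) -> str:
    """Return "human" when the decision should be escalated, else "auto"."""
    if decision.use_case not in HIGH_STAKES_USE_CASES:
        return "auto"
    # Escalate adverse outcomes and anything the model is unsure about.
    if not decision.favorable or decision.confidence < min_confidence:
        return "human"
    return "auto"
```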
The data-lineage detail. Track which datasets were used to train which models. Track which transformations applied. When auditors ask "what data trained this model that produced this decision", you must be able to answer.
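A minimal sketch of a lineage registry that can answer the auditor's question, assuming each audit-logged decision carries a `model_version` field. `TrainingRun` and `LineageRegistry` are hypothetical names, and the lineage here is dataset-level, not per training example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingRun:
    model_version: str
    dataset_ids: tuple[str, ...]
    transformations: tuple[str, ...]  # applied in order


class LineageRegistry:
    """In-memory registry; a real one would live in a database."""

    def __init__(self) -> None:
        self._runs: dict[str, TrainingRun] = {}

    def register(self, run: TrainingRun) -> None:
        self._runs[run.model_version] = run

    def lineage_for_decision(self, decision: dict) -> TrainingRun:
        """What data and transformations trained the model behind this decision?"""
        return self._runs[decision["model_version"]]

    def models_trained_on(self, dataset_id: str) -> list[str]:
        """Reverse lookup: which model versions used this dataset?"""
        return sorted(
            version
            for version, run in self._runs.items()
            if dataset_id in run.dataset_ids
        )
```

The reverse lookup matters as much as the forward one: it is what you need when a dataset turns out to be tainted or subject to a deletion request.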
The audit-log detail. Per-decision logs with input features, output, model version, timestamp. Retention requirements vary by regulation; 7+ years is typical for regulated decisions. Storage and indexing cost money; budget for it.
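A per-decision record could look like the sketch below, written as append-only JSON Lines. The field names and the file-based storage are assumptions; a production system would more likely write to a write-once store with access controls, since the input features may themselves be personal data.

```python
import json
from datetime import datetime, timezone


def audit_record(features: dict, output, model_version: str, now=None) -> str:
    """Serialise one decision as a single JSON line."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return json.dumps(
        {
            "timestamp": timestamp,
            "model_version": model_version,
            "input": features,
            "output": output,
        },
        sort_keys=True,
    )


def append_audit_record(path: str, record: str) -> None:
    """Append the record; existing lines are never rewritten."""
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(record + "\n")
```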
The bias-monitoring detail. Define protected groups (gender, race, age, depends on jurisdiction and use case). Track outcome distributions per group. Alert on disparate impact. Document the methodology; auditors will ask.
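One common way to alert on disparate impact is to compare each group's favourable-outcome rate with the best-off group's rate. The sketch below uses the four-fifths (0.8) threshold from US employment-selection guidance as a default; treat that threshold, and the choice of metric, as assumptions to confirm for your jurisdiction and use case.

```python
from collections import defaultdict


def selection_rates(outcomes) -> dict:
    """outcomes: iterable of (group, favorable) pairs, favorable being a bool."""
    totals = defaultdict(int)
    favorable_counts = defaultdict(int)
    for group, favorable in outcomes:
        totals[group] += 1
        favorable_counts[group] += int(favorable)
    return {group: favorable_counts[group] / totals[group] for group in totals}


def disparate_impact_alerts(outcomes, threshold: float = 0.8) -> dict:
    """Return {group: ratio} for groups whose rate ratio falls below threshold."""
    rates = selection_rates(outcomes)
    if not rates:
        return {}
    best = max(rates.values())
    if best == 0:
        return {}  # no group received a favourable outcome; nothing to compare
    return {
        group: rate / best
        for group, rate in rates.items()
        if rate / best < threshold
    }
```

Small groups make these ratios noisy, so a documented minimum sample size per group is worth adding before wiring this to an alert.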
The model-versioning detail. Every deployed model has a version. The version is reproducible: same code + same data + same compute → same model. Without reproducibility, you can't certify anything about the model.
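A version identifier can be derived from the inputs that determine the model, so that two builds from the same inputs get the same ID. This is a sketch under assumptions: `code_commit`, `dataset_hashes`, and `config` are hypothetical inputs, and the ID only certifies the inputs; bit-identical weights additionally need fixed random seeds and deterministic training.

```python
import hashlib
import json


def model_version_id(code_commit: str, dataset_hashes: list[str], config: dict) -> str:
    """Derive a stable version ID from code, data, and training config."""
    payload = json.dumps(
        {
            "code": code_commit,
            "data": sorted(dataset_hashes),  # dataset order should not matter
            "config": config,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
```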
The explanation-generation detail. SHAP values, feature importance, decision tree paths. Specific to model architecture. Stored alongside decisions; available on demand. Required by some regulations; nice-to-have for many.
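To show what a stored per-decision attribution looks like without depending on a SHAP library, the sketch below uses the simplest case: a linear model, where each feature's contribution relative to a baseline is weight times (value minus baseline). For a linear model with independent features this matches the SHAP values; for other architectures you would swap in an appropriate explainer. All names are hypothetical.

```python
def linear_attributions(weights: dict, features: dict, baseline: dict) -> dict:
    """Per-feature contribution to (score - baseline score) for a linear model."""
    return {
        name: weights[name] * (features[name] - baseline[name])
        for name in weights
    }


def explained_decision(decision_id: str, weights: dict, features: dict, baseline: dict) -> dict:
    """Bundle the attributions with the decision so they can be stored together."""
    attributions = linear_attributions(weights, features, baseline)
    return {
        "decision_id": decision_id,
        "attributions": attributions,
        # Features sorted by absolute contribution, largest first.
        "top_features": sorted(attributions, key=lambda n: abs(attributions[n]), reverse=True),
    }
```

The attributions sum to the difference between the decision's score and the baseline score, which gives a cheap consistency check before the record is stored.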
The deletion / opt-out detail. When a user opts out, remove their data from the training set and mark their account for non-collection. Whether you must also retrain the model is a matter of legal interpretation; the conservative interpretation says yes.
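A deletion-aware process, following the conservative reading, might look like this sketch. The row schema, the `opted_out` set, and the mapping from model version to contributing users are all assumptions for illustration.

```python
def process_opt_out(
    user_id: str,
    training_rows: list[dict],
    opted_out: set[str],
    model_training_users: dict[str, set[str]],
) -> tuple[list[dict], list[str]]:
    """Handle one opt-out request.

    Returns the training rows with the user's data removed, and the model
    versions that were trained on that user's data and so are candidates
    for retraining under the conservative interpretation.
    """
    remaining = [row for row in training_rows if row["user_id"] != user_id]
    opted_out.add(user_id)  # future collection jobs should skip this user
    stale_models = sorted(
        version
        for version, users in model_training_users.items()
        if user_id in users
    )
    return remaining, stale_models
```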
Paperwork
Compliance is paperwork-heavy. Required documents (varying by regulation):
- Risk assessments per high-risk system.
- Conformity assessments (EU AI Act).
- Data Protection Impact Assessments (GDPR).
- Model cards documenting capabilities, limitations, intended use.
- System cards for the larger production system.
- Post-market monitoring plans and reports.
- Incident logs and reporting evidence.
The risk-assessment template. For each high-risk system: who's affected, what could go wrong, mitigations, residual risk, sign-offs. A standard template across systems keeps assessments consistent. Risk assessments are living documents; update them when the system changes.
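Keeping the template as structured data makes incomplete assessments easy to spot. A minimal sketch, with field names chosen to mirror the sections listed above (the class itself is hypothetical):

```python
from dataclasses import dataclass, field, fields


@dataclass
class RiskAssessment:
    system: str
    affected_parties: list[str] = field(default_factory=list)
    failure_modes: list[str] = field(default_factory=list)  # what could go wrong
    mitigations: list[str] = field(default_factory=list)
    residual_risk: str = ""
    sign_offs: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """Names of sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]
```

A check that fails the release when `missing_sections()` is non-empty for a high-risk system is one way to keep the document from drifting behind the code.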
The DPIA reality. Required under GDPR Article 35 when processing is likely to result in a high risk to individuals, which covers much large-scale ML on personal data. Documents legal basis, necessity, risks, mitigations. Often outsourced to specialised legal/compliance teams; producing them in-house takes specific training.
The model-card practice. Document each deployed model: what it does, what data trained it, evaluation results, known limitations. Public model cards (the Hugging Face format) are a baseline; internal model cards add proprietary detail.
The system-card practice. Higher-level than model cards; describes the system that uses the models. Intended use, deployment context, monitoring practices, incident response. System-level technical documentation is required for high-risk systems, and a system card is a practical way to organise it; useful for any system.
The post-market monitoring. Required by EU AI Act for high-risk systems. Periodic reviews of: model performance drift, bias drift, incident rates, user complaints. Documented; submitted on regulator request.
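A periodic review can be reduced to comparing current metrics against the values recorded at launch. In this sketch the metric names and tolerances are assumptions; the point is that the thresholds are written down before the review, not chosen after seeing the numbers.

```python
def review_findings(baseline: dict, current: dict, tolerances: dict) -> dict:
    """Return {metric: change} for metrics that moved more than their tolerance."""
    findings = {}
    for metric, tolerance in tolerances.items():
        change = current[metric] - baseline[metric]
        if abs(change) > tolerance:
            findings[metric] = round(change, 6)
    return findings
```

For example, with a baseline accuracy of 0.90, a current accuracy of 0.84, and a tolerance of 0.02, the review reports accuracy as a finding, while a complaint rate that moved within its tolerance is not reported.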
The EU AI Act effects
The EU AI Act has had global effects similar to GDPR. Companies serving EU markets implement compliance globally because per-region differences are operationally complex. The "Brussels effect" extends EU rules worldwide. Even US-only companies often comply because their cloud providers, customers, or partners require it.
The global standardisation. Companies build once for EU compliance; ship globally. EU rules become the de facto global rules. Non-EU jurisdictions can deviate but face friction; many simply align.
The compliance investment. Major SaaS companies have reportedly allocated $5M-$50M annually for EU AI Act compliance. The investment is recurring, not one-time. Smaller companies face a proportionally higher cost relative to revenue.
The chilling effect. Some companies decline to offer specific AI features in the EU. Others delay launches until compliance is sorted. The market segmentation is real; mitigation cost varies by feature.
The auditor ecosystem. Some conformity assessments require an accredited third-party assessor (a notified body); others can be done as internal self-assessment. The auditor ecosystem is still building; demand exceeds supply. Long lead times for high-risk system certifications. Plan accordingly.
Common antipatterns
Treating compliance as one-time. Ongoing operations: monitoring, incidents, audits. Build the process.
Skipping audit logs early. You cannot log past decisions retroactively. Build logging in from day one.
No per-decision explanations. Required by some regulations for high-stakes decisions. Add SHAP or similar before regulators ask.
Compliance as a separate team. Engineering is the implementation; without engineering buy-in, compliance is paperwork without controls. Embed compliance into engineering practice.
What to do this week
Three moves. (1) Map applicable regulations to your specific systems. The map surfaces gaps. (2) For one high-risk system, write the risk assessment. The exercise reveals what's missing operationally. (3) Audit your audit logs. If they don't exist, start recording today; if they exist, verify retention and indexing meet regulatory requirements.