Building a CTEM Program for AI-Driven Environments
Traditional threat management wasn't built for systems that learn, drift, and fail in ways their creators never anticipated. Here's how to adapt.
The Problem with Traditional Threat Models
Continuous Threat Exposure Management (CTEM) has become a cornerstone of modern cybersecurity strategy. Gartner's framework—scoping, discovery, validation, prioritization, and mobilization—gives organizations a disciplined approach to understanding their attack surface and addressing vulnerabilities before adversaries exploit them. For conventional IT infrastructure, it works. For AI systems, it doesn't.
The disconnect isn't subtle. Traditional CTEM assumes that vulnerabilities are discoverable through scanning, testable through penetration testing, and addressable through patching. AI systems violate all three assumptions. A machine learning model doesn't have CVEs you can look up in a database. Its vulnerabilities emerge from the interaction between training data, model architecture, and the adversarial environment it encounters in production. A model that performs flawlessly in testing can fail catastrophically when deployed—not because something changed in the code, but because the world changed around it.
For Saudi organizations operating under the National Cybersecurity Authority's Essential Cybersecurity Controls and SDAIA's AI governance framework, this creates a gap that conventional security programs don't fill. NCA's Control 5.3 explicitly requires "adversarial robustness testing" for AI systems in government and critical infrastructure contexts. But what does continuous threat management actually look like when your attack surface includes a fraud detection model that might be manipulated through carefully crafted transaction patterns, or a customer service chatbot that could be prompted into revealing sensitive information?
This lab note maps CTEM principles to AI-specific threat landscapes, with practical guidance for Saudi organizations building programs that address how machine learning systems actually fail.
Why AI Requires a Different CTEM Approach
The fundamental difference between traditional software vulnerabilities and AI vulnerabilities is this: traditional software fails when its logic is incorrect. AI fails when its assumptions are violated.
A buffer overflow vulnerability exists because a programmer made a specific mistake—the code doesn't properly validate input length. You can scan for it, you can test for it, and once you patch it, it's gone. An adversarial vulnerability in an image classifier exists not because anyone made a mistake, but because the model learned to recognize patterns that can be subtly manipulated. Add imperceptible noise to an image, and a model that correctly identifies a stop sign might classify it as a speed limit sign. The model is working exactly as designed. The problem is that what it learned doesn't generalize to adversarial conditions.
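To make that concrete: the fast gradient sign method (FGSM) is one common way such perturbations are generated. The sketch below assumes a PyTorch image classifier and an illustrative perturbation budget; it is a minimal demonstration of the idea, not a description of any particular system.

```python
# Minimal FGSM sketch (assumes a PyTorch classifier that returns logits).
# The perturbation budget `epsilon` and the model itself are illustrative placeholders.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, label, epsilon=0.03):
    """Return a copy of x nudged in the direction that most increases the model's loss."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), label)
    loss.backward()
    # Small, sign-only steps per pixel: visually negligible, but aimed at the decision boundary.
    perturbed = x_adv + epsilon * x_adv.grad.sign()
    return perturbed.clamp(0.0, 1.0).detach()
```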
This has three implications for threat management:
Vulnerabilities are emergent rather than discrete. You can't enumerate AI vulnerabilities in advance because they emerge from the model's learned representations, which depend on training data that changes over time. A model retrained on new data has new vulnerabilities—or at least, potentially different ones.
Testing must be continuous, not periodic. A penetration test that validates model robustness in January tells you nothing about model robustness in March if the model has been retrained or if the input distribution has shifted. The threat surface evolves.
Defenses are probabilistic, not deterministic. You can't "patch" an adversarial vulnerability in the way you patch a buffer overflow. You can make attacks harder, more expensive, or less likely to succeed. You cannot eliminate them.
For Saudi organizations, this means that CTEM programs built around quarterly vulnerability scans and annual penetration tests provide false confidence when applied to AI systems. A different approach is required.
CTEM Phase 1: Scoping for AI Threats
Traditional CTEM scoping identifies the assets to protect and the threat actors who might target them. For AI systems, scoping requires understanding not just what systems exist, but how they fail.
Start with an AI system inventory that goes beyond deployment tracking. For each model in production, document:
- Decision impact classification: What decisions does this model influence, and what are the consequences of incorrect decisions? A recommendation engine suggesting products carries different risk than a model screening loan applications or prioritizing maintenance interventions on critical infrastructure.
- Data sensitivity: What data does the model process, and what regulatory frameworks apply? Models processing personal data trigger PDPL obligations. Models processing health information implicate additional sectoral requirements. Models making decisions about individuals may fall under SDAIA's high-risk AI classification.
- Adversarial exposure: Who benefits from manipulating this model's outputs? Fraud detection models are targeted by fraudsters. Content moderation systems are targeted by actors seeking to spread disinformation. Trading algorithms are targeted by competitors and market manipulators. Understanding motivation shapes threat modeling.
- Dependency mapping: What data pipelines feed this model? What third-party models or APIs does it depend on? A compromise in a data pipeline or a malicious update from a vendor becomes a compromise of your model.
For Saudi organizations in regulated sectors, this inventory should align with SDAIA's risk classification framework and NCA's cybersecurity controls. The scoping exercise produces a prioritized list of AI assets requiring continuous monitoring—not just the most critical models, but the models with the highest combination of criticality, exposure, and regulatory sensitivity.
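As an illustration, an inventory record for one production model might look like the following sketch. The field names, enumerations, and example values are assumptions for the purpose of the example, not a SDAIA or NCA schema.

```python
# Hypothetical inventory record for one production model; fields and enums are
# illustrative, not a SDAIA/NCA-mandated schema.
from dataclasses import dataclass, field
from enum import Enum

class DecisionImpact(Enum):
    LOW = "low"            # e.g. product recommendations
    HIGH = "high"          # e.g. loan screening
    CRITICAL = "critical"  # e.g. maintenance prioritization on critical infrastructure

@dataclass
class ModelInventoryRecord:
    name: str
    decision_impact: DecisionImpact
    data_sensitivity: list[str]        # e.g. ["personal_data_PDPL", "health"]
    adversarial_exposure: str          # who benefits from manipulating outputs
    upstream_dependencies: list[str] = field(default_factory=list)  # pipelines, vendor models, APIs
    sdaia_risk_class: str = "unclassified"

record = ModelInventoryRecord(
    name="fraud-detection-v4",
    decision_impact=DecisionImpact.HIGH,
    data_sensitivity=["personal_data_PDPL"],
    adversarial_exposure="fraudsters probing transaction patterns",
    upstream_dependencies=["payments-feature-pipeline", "vendor-kyc-api"],
)
```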
CTEM Phase 2: Discovery—Finding AI Vulnerabilities
Discovery in traditional CTEM involves vulnerability scanning and attack surface mapping. For AI systems, discovery requires different techniques that address how models actually fail.
Adversarial robustness testing replaces vulnerability scanning. Rather than checking for known CVEs, you probe the model for susceptibility to adversarial manipulation. This includes:
- Evasion testing: Can inputs be perturbed to cause misclassification? For image models, this means generating adversarial examples. For text models, this means crafting prompts that bypass safety filters or cause harmful outputs. For tabular data models, this means identifying the smallest changes to input features that flip predictions.
- Data poisoning detection: Has the training data been compromised? This requires analyzing training data for anomalous patterns that might indicate deliberate manipulation—labels that have been flipped, samples that cluster suspiciously, features that show unusual correlations with target variables.
- Model extraction detection: Is someone attempting to steal your model through API queries? Monitor for query patterns consistent with model extraction attacks: high volumes of systematically varied inputs, queries that probe decision boundaries, unusual access patterns from single sources (a simple query-pattern monitor is sketched after this list).
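The query-pattern monitoring mentioned above can start simply. The sketch below tracks per-client query volume and input diversity over a sliding window; the window size, thresholds, and input-hashing scheme are assumptions to tune against real traffic.

```python
# Minimal per-client query monitor; window size and thresholds are illustrative assumptions.
import time
from collections import defaultdict, deque

class ExtractionMonitor:
    def __init__(self, window_seconds=3600, max_queries=5000, max_unique_ratio=0.95):
        self.window = window_seconds
        self.max_queries = max_queries
        self.max_unique_ratio = max_unique_ratio   # near-all-unique inputs can indicate systematic sweeps
        self.history = defaultdict(deque)          # client_id -> deque of (timestamp, input_hash)

    def record(self, client_id, input_hash):
        """Log one query and drop entries that have aged out of the window."""
        now = time.time()
        q = self.history[client_id]
        q.append((now, input_hash))
        while q and q[0][0] < now - self.window:
            q.popleft()

    def suspicious(self, client_id):
        """Flag clients whose recent traffic looks like boundary probing or bulk extraction."""
        q = self.history[client_id]
        if len(q) < 100:                           # not enough traffic to judge
            return False
        unique_ratio = len({h for _, h in q}) / len(q)
        return len(q) > self.max_queries or unique_ratio > self.max_unique_ratio
```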
Distribution drift detection addresses a vulnerability category unique to ML systems. Models degrade when production data diverges from training data—not because anyone attacked the model, but because the world changed. Statistical tests comparing training and production distributions, combined with performance monitoring on labeled samples, reveal when models are operating outside their validated domain.
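A minimal version of that statistical comparison, assuming numeric tabular features and a retained training sample, might look like this; the significance level is an assumption to calibrate against your alert budget.

```python
# Minimal per-feature drift check using a two-sample Kolmogorov-Smirnov test.
# Assumes `train` and `prod` are (n_samples, n_features) numeric arrays.
import numpy as np
from scipy.stats import ks_2samp

def drifted_features(train: np.ndarray, prod: np.ndarray, feature_names, alpha=0.01):
    """Return features whose production distribution diverges from the training baseline."""
    flagged = []
    for i, name in enumerate(feature_names):
        statistic, p_value = ks_2samp(train[:, i], prod[:, i])
        if p_value < alpha:
            flagged.append({"feature": name, "ks_statistic": statistic, "p_value": p_value})
    return flagged
```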
For Saudi organizations, NCA's Control 5.3 requires documented adversarial robustness testing with specific metrics: false positive rates under adversarial conditions, performance degradation thresholds, and recovery mechanisms. Discovery is not optional; it is a compliance requirement.
CTEM Phase 3: Validation—Testing AI Exploitability
Traditional validation confirms that discovered vulnerabilities are exploitable. For AI systems, validation means demonstrating that identified weaknesses translate to real-world impact.
Red team exercises for AI go beyond conventional penetration testing. AI red teams probe not just technical vulnerabilities but the human-AI interaction layer. Can users be manipulated into trusting model outputs they should question? Can the model be prompted into generating harmful content? Can decision processes be gamed through repeated experimentation?
For Saudi organizations, validation should include:
- Regulatory scenario testing: Does the model produce outputs that violate PDPL requirements, SDAIA ethical guidelines, or sector-specific regulations? Generate test cases specifically designed to trigger prohibited behaviors.
- Fail-safe verification: When the model encounters adversarial inputs or operates outside its validated domain, does it fail safely? Does it flag uncertainty, refuse to predict, or escalate to human review—or does it confidently generate wrong answers? (A minimal check of this behavior is sketched after this list.)
- Incident simulation: Walk through the detection-to-response timeline. If an adversarial attack were underway, would monitoring detect it? How long would detection take? What would containment involve?
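For the fail-safe check, a minimal sketch might look like the following, assuming a classifier that exposes predict_proba and an organizational confidence floor; the threshold and the out-of-domain probe set are placeholders.

```python
# Minimal fail-safe verification sketch; the confidence floor and probe samples are assumptions.
import numpy as np

CONFIDENCE_FLOOR = 0.80  # assumed organizational threshold, not a standard value

def guarded_decision(model, x):
    """Return the model's class decision, or 'ESCALATE' when confidence is below the floor."""
    probs = model.predict_proba(x.reshape(1, -1))[0]
    return int(np.argmax(probs)) if probs.max() >= CONFIDENCE_FLOOR else "ESCALATE"

def verify_fail_safe(model, out_of_domain_samples):
    """Validation check: inputs outside the validated domain must never be decided automatically."""
    failures = [i for i, x in enumerate(out_of_domain_samples)
                if guarded_decision(model, x) != "ESCALATE"]
    return failures  # an empty list means the fail-safe held for every probe
```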
Document validation results with the same rigor you'd apply to traditional penetration testing. For regulated entities, this documentation is evidence of due diligence.
CTEM Phase 4: Prioritization—Ranking AI Risks
Traditional prioritization uses CVSS scores and exploitability metrics. AI vulnerabilities don't have CVSS scores. Prioritization requires a framework that accounts for AI-specific factors.
Consider four dimensions:
Exploitability: How difficult is this vulnerability to exploit? Some adversarial attacks require deep technical expertise and intimate knowledge of the model. Others—prompt injection attacks on large language models, for example—can be executed by anyone with basic knowledge and trial-and-error persistence.
Impact: What happens if this vulnerability is exploited? Impact assessment should consider both direct consequences (incorrect predictions, data exposure) and secondary effects (regulatory penalties, reputational damage, erosion of trust in AI systems).
Detectability: How likely is exploitation to be detected? Vulnerabilities that produce detectable anomalies are less risky than those that operate silently. A model that has been subtly poisoned to produce biased outputs might operate for months before anyone notices.
Recoverability: How quickly can the organization respond? Vulnerabilities in models with no rollback capability, no fallback systems, and no human-in-the-loop processes carry higher risk than those in systems with graceful degradation.
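One way to operationalize these four dimensions is a simple composite score. The sketch below assumes 1-5 ordinal scales and equal weights; both are illustrative choices to adapt to your own risk methodology.

```python
# Minimal prioritization sketch; scales and equal weighting are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class AIVulnerability:
    name: str
    exploitability: int   # 1 = expert-only attack, 5 = trivial trial-and-error
    impact: int           # 1 = negligible, 5 = regulatory or safety consequences
    detectability: int    # 1 = exploitation is obvious, 5 = could run silently for months
    recoverability: int   # 1 = instant rollback or fallback, 5 = no recovery path

    @property
    def risk_score(self) -> float:
        return (self.exploitability + self.impact + self.detectability + self.recoverability) / 4.0

def remediation_roadmap(vulns: list[AIVulnerability]) -> list[AIVulnerability]:
    """Rank vulnerabilities for the remediation roadmap, highest composite risk first."""
    return sorted(vulns, key=lambda v: v.risk_score, reverse=True)
```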
Prioritization outputs a ranked remediation roadmap. Not every vulnerability requires immediate action, but every vulnerability requires a documented decision about response timing and approach.
CTEM Phase 5: Mobilization—Responding to AI Threats
Mobilization translates prioritized risks into action. For AI systems, mobilization includes both technical remediation and governance updates.
Technical responses to AI vulnerabilities differ from traditional patching:
- Input filtering and output validation: Rather than modifying the model, add layers that intercept adversarial inputs before they reach the model and validate outputs before they're acted upon. This doesn't eliminate vulnerabilities but adds defense in depth (a minimal wrapper is sketched after this list).
- Model retraining: For vulnerabilities stemming from training data issues, retraining on corrected or augmented data may be appropriate. Retraining itself introduces risk—new models have new behaviors—and requires its own validation pipeline.
- Fallback mechanisms: For high-risk decision categories, implement human-in-the-loop requirements or deterministic fallback systems that operate when model confidence falls below thresholds or when adversarial inputs are suspected.
- Monitoring enhancement: When a vulnerability is discovered, enhance monitoring to detect exploitation attempts. The same techniques used in discovery—adversarial detection, drift monitoring, query pattern analysis—become ongoing controls.
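The input filtering and output validation layer can be as simple as a wrapper around the model. The sketch below assumes a tabular classifier, a per-feature training envelope, and an allowed-action set; all of these are placeholders rather than a prescribed design.

```python
# Minimal defense-in-depth wrapper; bounds, allowed actions, and thresholds are assumptions.
import numpy as np

class GuardedModel:
    def __init__(self, model, lower_bounds, upper_bounds, allowed_actions, confidence_floor=0.6):
        self.model = model
        self.lower = np.asarray(lower_bounds)      # per-feature envelope observed during training
        self.upper = np.asarray(upper_bounds)
        self.allowed_actions = set(allowed_actions)
        self.confidence_floor = confidence_floor

    def predict(self, x):
        x = np.asarray(x, dtype=float)
        # Input filtering: refuse inputs outside the training envelope before inference.
        if np.any(x < self.lower) or np.any(x > self.upper):
            return {"action": "reject_input", "decision": None}
        probs = self.model.predict_proba(x.reshape(1, -1))[0]
        decision = int(np.argmax(probs))
        # Output validation: only release decisions that are both allowed and sufficiently confident.
        if decision not in self.allowed_actions or probs.max() < self.confidence_floor:
            return {"action": "hold_for_review", "decision": decision}
        return {"action": "accept", "decision": decision}
```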
Governance responses ensure that lessons learned are incorporated into policy:
- Update risk registers with AI-specific vulnerabilities and their mitigation status.
- Revise model deployment checklists to include tests for newly discovered vulnerability categories.
- Brief incident response teams on AI-specific failure modes and containment strategies.
- Document regulatory implications and update compliance mappings.
For Saudi organizations, mobilization must account for NCA and SDAIA reporting requirements. Certain AI vulnerabilities—particularly those involving potential data breaches or adversarial attacks on critical infrastructure—may trigger notification obligations.
Threat Modeling for ML Pipelines
AI systems are not isolated artifacts. They're embedded in data pipelines, served through APIs, consumed by downstream applications, and monitored by operational systems. Each integration point is an attack surface.
Data pipeline threats include data poisoning (injecting malicious samples into training data), data exfiltration (extracting sensitive information through the pipeline), and data integrity attacks (modifying data in transit). Defenses include data provenance tracking, anomaly detection on incoming data, and access controls that limit who can contribute to training datasets.
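A minimal sketch of provenance tracking, assuming file-based training batches and an append-only ledger file, might look like this; the ledger format is illustrative, not a specific tool's API.

```python
# Minimal data-provenance sketch: fingerprint each training batch and record its lineage
# so later audits can tell whether the data a model was trained on has been altered.
import hashlib
import json
import time
from pathlib import Path

def fingerprint(path: Path) -> str:
    """SHA-256 of a training-data file, computed in chunks to handle large batches."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def record_provenance(batch_path: Path, source: str, ledger_path: Path) -> None:
    """Append a ledger entry for this batch (the ledger format here is illustrative)."""
    entry = {
        "batch": str(batch_path),
        "sha256": fingerprint(batch_path),
        "source": source,
        "recorded_at": time.time(),
    }
    with ledger_path.open("a") as ledger:
        ledger.write(json.dumps(entry) + "\n")
```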
Model serving threats include model extraction (reconstructing the model through API queries), model inversion (extracting training data from model outputs), and denial of service (overwhelming the model with adversarial queries). Defenses include rate limiting, query logging and analysis, and differential privacy techniques that add noise to model outputs.
Downstream consumption threats include over-reliance (users trusting model outputs without verification), misuse (applying models to use cases they weren't designed for), and cascade failures (errors in one model propagating to dependent systems). Defenses include clear documentation of model limitations, output confidence indicators, and circuit breakers that isolate failing components.
Adversarial Monitoring: Continuous Vigilance
CTEM is continuous by definition. For AI systems, this means monitoring that evolves as the threat landscape evolves.
Baseline establishment: Before you can detect anomalies, you need to know what normal looks like. Establish baselines for model performance metrics, input distributions, output patterns, and query volumes. These baselines become the reference points against which deviations are measured.
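A minimal baseline capture for a tabular classifier, assuming numeric features and class predictions, might look like the following; the choice of statistics is a starting point, not an exhaustive reference set.

```python
# Minimal baseline capture; the statistics chosen here are an illustrative starting point.
import json
import numpy as np

def capture_baseline(features: np.ndarray, predictions: np.ndarray, path: str) -> dict:
    """Persist reference statistics that later monitoring can compare against."""
    labels, counts = np.unique(predictions, return_counts=True)
    baseline = {
        "feature_means": features.mean(axis=0).tolist(),
        "feature_stds": features.std(axis=0).tolist(),
        "prediction_distribution": {
            str(label): float(count) / len(predictions) for label, count in zip(labels, counts)
        },
        "sample_size": int(len(features)),
    }
    with open(path, "w") as f:
        json.dump(baseline, f, indent=2)
    return baseline
```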
Multi-layer monitoring: Monitor at the model level (accuracy, drift, confidence distributions), the input level (distribution shift, adversarial signatures, anomalous queries), and the system level (latency, throughput, error rates). No single layer catches everything.
Alert calibration: The challenge with AI monitoring is distinguishing genuine threats from statistical noise. False positive rates that are acceptable for intrusion detection may be unacceptable for AI monitoring if they overwhelm analysts and desensitize them to real threats. Calibrate thresholds based on your organization's risk tolerance and response capacity.
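One hedged way to do that calibration: derive the alert threshold from the empirical score distribution on known-benign traffic, so the expected false positive volume matches the response capacity. The anomaly score itself is assumed to come from whichever drift or adversarial detector you already run.

```python
# Minimal alert-threshold calibration; the anomaly score source is assumed to exist upstream.
import numpy as np

def calibrate_threshold(benign_scores: np.ndarray, tolerable_fp_rate: float) -> float:
    """Pick the threshold that would alert on roughly `tolerable_fp_rate` of benign traffic."""
    return float(np.quantile(benign_scores, 1.0 - tolerable_fp_rate))

# Example: with roughly 200 benign-scored events per day and one tolerable false alert per day,
# tolerable_fp_rate would be about 1 / 200 = 0.005.
```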
Threat intelligence integration: Track emerging AI attack techniques through security research, vendor advisories, and sector-specific intelligence sharing. A vulnerability discovered in similar models elsewhere is a vulnerability you should test for, even if you haven't observed exploitation attempts.
Incident Response for AI Threats
When adversarial activity is detected, the response follows the same phases as traditional incident response—containment, investigation, remediation, recovery—with AI-specific considerations at each stage.
Containment for AI systems often means moving to human-in-the-loop operation or falling back to deterministic systems rather than simply shutting down the model. The operational impact of disabling an AI system may exceed the risk of continued operation with enhanced monitoring, depending on the severity and nature of the threat.
Investigation requires ML expertise. Understanding why a model is behaving unexpectedly—whether due to adversarial inputs, data poisoning, distribution drift, or a technical failure—requires someone who can interrogate model behavior, not just system logs.
Remediation may involve model rollback, input filtering, retraining, or architectural changes. Document the root cause, the remediation approach, and the validation that confirms the remediation is effective.
Recovery includes updating monitoring to detect recurrence, sharing lessons learned (appropriately sanitized) with relevant communities, and incorporating findings into threat models for related systems.
For Saudi organizations, certain AI incidents—particularly those involving adversarial attacks on critical infrastructure or personal data exposure—trigger notification obligations to NCA and potentially SDAIA. Incident response playbooks should include templates for these notifications and clear criteria for when they're required.
The SDAIA/NCA Alignment Imperative
Saudi organizations don't operate in a regulatory vacuum. Building a CTEM program for AI systems means building one that satisfies both security requirements and governance requirements.
NCA's Essential Cybersecurity Controls require AI-specific security measures for systems in government and critical infrastructure contexts. Control 5.3 mandates adversarial robustness testing with documented metrics. Control requirements around supply chain security apply to AI vendors and third-party models. Incident reporting requirements apply to AI system compromises.
SDAIA's AI Ethics Framework creates governance obligations that intersect with security. Transparency requirements mean you need to understand and be able to explain model behavior—impossible if you don't monitor for adversarial manipulation. Fairness requirements mean you need to detect when models are producing biased outputs—whether from adversarial manipulation or other causes. Accountability requirements mean you need clear ownership of AI security alongside AI governance.
The convergence is real: good AI security is a prerequisite for good AI governance. You cannot assure stakeholders that your AI systems operate ethically if you cannot assure them that your AI systems operate securely.
A Practical Starting Point
For organizations beginning this journey, start with three actions:
Inventory your AI attack surface. You cannot protect what you don't know exists. Document every model in production, the decisions it influences, the data it processes, and the adversaries who might benefit from manipulating it.
Implement adversarial monitoring. Even basic monitoring—tracking input distributions, output patterns, and query volumes—provides early warning of many attack categories. You don't need perfect detection; you need detection that's good enough to catch the most likely threats.
Run a tabletop exercise. Walk through an AI incident scenario from detection to recovery. Who would be involved? What decisions would need to be made? What information would you need that you don't currently have? The gaps revealed by exercises are the gaps you should prioritize closing.
AI systems are becoming integral to how Saudi organizations operate, make decisions, and serve their stakeholders. The threats against these systems are real, evolving, and increasingly sophisticated. A CTEM program adapted for AI's unique characteristics isn't a luxury—it's a necessity for organizations that want to reap AI's benefits without exposing themselves to its risks.
PeopleSafetyLab is the AI safety lab for the Arab world. We help organizations protect people at work, at home, and everywhere AI touches their lives. Learn more at peoplesafetylab.com.