Threat Model
This document describes the AgentWeave SDK threat model: what assets we protect, who we defend against, what attack vectors we consider, and how we mitigate threats.
Table of Contents
- Threat Model
Assets to Protect
Primary Assets
- Agent Identity Credentials (SVIDs)
- X.509 certificates issued by SPIRE
- Private keys associated with identities
- SPIFFE IDs (identity URIs)
- Impact if compromised: Impersonation attacks, unauthorized access
- Authorization Policies
- OPA policy files (Rego)
- Policy data (allowlists, capabilities)
- Impact if compromised: Unauthorized access, privilege escalation
- Agent Communication
- Request/response payloads
- Task parameters and results
- Context data passed between agents
- Impact if compromised: Data leakage, tampering
- Configuration Data
- Agent configurations (YAML)
- SPIRE connection details
- OPA endpoints
- Impact if compromised: System misconfiguration, security bypass attempts
- Audit Logs
- Authorization decisions
- Request traces
- Security events
- Impact if compromised: Loss of forensic evidence, compliance violations
Secondary Assets
- Agent Availability
- Uptime and responsiveness
- Resource capacity
- Impact if compromised: Denial of service
- System Integrity
- Agent binary and dependencies
- Configuration immutability
- Impact if compromised: Code execution, backdoors
Trust Boundaries
Boundary 1: Agent Process
1
2
3
4
5
6
7
8
9
10
11
12
13
┌─────────────────────────────────────────────┐
│ Agent Process (Trusted) │
│ • Agent code │
│ • AgentWeave SDK │
│ • Python runtime │
└─────────────────────────────────────────────┘
│ SPIFFE Workload API
│ (Unix domain socket)
┌─────────────▼─────────────────────────────┐
│ SPIRE Agent (Trusted) │
│ • Issues SVIDs │
│ • Validates workload identity │
└───────────────────────────────────────────┘
Crossing this boundary:
- Agent fetches SVIDs via Unix socket (local only)
- Attestation proves agent identity to SPIRE
- Trust established through platform-specific attestation
Boundary 2: Network Communication
1
2
3
4
5
6
7
8
9
10
11
12
13
14
┌─────────────────┐ ┌─────────────────┐
│ Agent A │ │ Agent B │
│ (Trusted) │ │ (Untrusted) │
└────────┬────────┘ └────────┬────────┘
│ │
│ mTLS Handshake │
│ • Verify peer SVID │
│ • Validate trust domain │
│◄───────────────────────────────────►│
│ │
│ Encrypted Communication │
│ • TLS 1.3 │
│ • Mutual authentication │
│◄───────────────────────────────────►│
Crossing this boundary:
- mTLS handshake with peer verification
- Trust domain validation
- All data encrypted in transit
- Authorization check before processing request
Boundary 3: Authorization Decision
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
┌─────────────────────────────────────────────┐
│ Agent Runtime (Trusted) │
│ │
│ Request received ──────┐ │
│ │ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ OPA Authorization│ │
│ │ (Policy Engine) │ │
│ └──────────────────┘ │
│ │ │
│ ▼ │
│ Allow / Deny Decision │
│ │ │
│ ┌───────────────┴────────────┐ │
│ ▼ ▼ │
│ Execute Capability Log & Reject │
└─────────────────────────────────────────────┘
Crossing this boundary:
- Every capability invocation checked
- Default deny policy enforced
- Authorization context evaluated
- Decision logged for audit
Boundary 4: External Systems
1
2
3
4
5
6
7
8
9
10
11
┌─────────────────┐
│ Agent │
└────────┬────────┘
│
│ (Application-level)
▼
┌─────────────────┐
│ External API │
│ Database │
│ Service │
└─────────────────┘
Crossing this boundary:
- Agent is responsible for authentication to external systems
- AgentWeave provides secure context but not external auth
- Application must implement additional security controls
Threat Actors
Tier 1: Opportunistic Attackers
- Motivation: Casual exploitation, automated attacks
- Capabilities: Script kiddies, automated scanners
- Mitigations: Default security, secure defaults, input validation
Tier 2: Skilled Attackers
- Motivation: Targeted data theft, service disruption
- Capabilities: Exploit development, social engineering
- Mitigations: Defense in depth, monitoring, incident response
Tier 3: Advanced Persistent Threats (APT)
- Motivation: Long-term access, espionage, sabotage
- Capabilities: Zero-day exploits, supply chain attacks, insider threats
- Mitigations: Comprehensive defense, threat hunting, security audits
Tier 4: Insider Threats
- Motivation: Malicious insiders, compromised credentials
- Capabilities: Legitimate access, insider knowledge
- Mitigations: Least privilege, audit logging, separation of duties
STRIDE Analysis
Spoofing Identity
Threats:
- Attacker impersonates legitimate agent
- Stolen or forged credentials
- SVID replay attacks
Mitigations:
- ✅ SPIFFE cryptographic identity (cannot forge without private key)
- ✅ Platform attestation prevents identity theft
- ✅ Short-lived SVIDs (1 hour TTL) limit exposure
- ✅ Automatic rotation prevents replay
- ✅ Mutual TLS verifies both parties
- ✅ Trust domain validation prevents cross-domain attacks
Residual Risk: Low - requires compromise of SPIRE server or node
Tampering with Data
Threats:
- Man-in-the-middle (MITM) attacks
- Message modification in transit
- Request payload tampering
Mitigations:
- ✅ TLS 1.3 encryption protects data in transit
- ✅ Perfect forward secrecy (PFS) prevents decryption of past sessions
- ✅ Message integrity via TLS MAC
- ✅ Mutual authentication prevents MITM
- ✅ Input validation at application layer
Residual Risk: Low - requires breaking TLS 1.3 cryptography
Repudiation
Threats:
- Agent denies performing action
- Caller denies making request
- No proof of authorization decision
Mitigations:
- ✅ Comprehensive audit logging with SPIFFE IDs
- ✅ Request traces with distributed trace IDs
- ✅ Authorization decisions logged
- ✅ Immutable audit logs (when using write-once storage)
- ✅ Timestamps and correlation IDs
Residual Risk: Low - full audit trail maintained
Information Disclosure
Threats:
- Eavesdropping on communication
- Log exposure with sensitive data
- Error messages leak system info
- Unauthorized access to capabilities
Mitigations:
- ✅ TLS 1.3 encryption for all communication
- ✅ No plaintext fallback possible
- ✅ Authorization enforced before data access
- ✅ Secure error handling (no stack traces to callers)
- ✅ Audit logs configurable to redact sensitive fields
- ✅ Network policies restrict communication paths
Residual Risk: Medium - application must protect data in use
Denial of Service (DoS)
Threats:
- Request flooding
- Resource exhaustion
- SVID rotation failures
- OPA policy evaluation overload
Mitigations:
- ✅ Rate limiting per capability
- ✅ Circuit breakers prevent cascading failures
- ✅ Request timeouts
- ✅ Resource limits (memory, CPU)
- ✅ Connection limits
- ⚠️ Application-level DoS protection required
Residual Risk: Medium - distributed DoS harder to mitigate
Elevation of Privilege
Threats:
- Bypass authorization checks
- Exploit policy logic bugs
- Gain unauthorized capabilities
- Privilege escalation through federation
Mitigations:
- ✅ Default deny authorization
- ✅ OPA policy enforcement cannot be bypassed
- ✅ Least privilege by default
- ✅ Trust domain isolation
- ✅ Capability-level granularity
- ✅ No security bypass flags or configs
- ✅ Policy testing framework
Residual Risk: Low - requires policy misconfiguration
Attack Vectors
1. Network-Based Attacks
MITM Attack on Agent Communication
Attack: Attacker intercepts traffic between agents
Mitigations:
- mTLS required for all communication
- Peer verification prevents MITM
- Network policies restrict routing
- No plaintext fallback
Likelihood: Very Low Impact: High Risk: Low
TLS Downgrade Attack
Attack: Force use of weaker TLS version
Mitigations:
- TLS 1.3 enforced (min version)
- No version negotiation allowed
- Cipher suite hardening
- Secure renegotiation disabled
Likelihood: Very Low Impact: High Risk: Low
2. Identity-Based Attacks
SVID Theft
Attack: Steal SVID from agent process
Mitigations:
- SVIDs stored in memory only
- Platform attestation required for issuance
- Short TTL (1 hour)
- Process isolation
- Read-only root filesystem
Likelihood: Low Impact: High Risk: Medium
SPIRE Server Compromise
Attack: Compromise SPIRE server to issue fraudulent SVIDs
Mitigations:
- SPIRE server should run in isolated environment
- Database encryption
- Server authentication
- Monitoring and alerting
- Regular security audits
Likelihood: Low Impact: Critical Risk: High (external to SDK)
3. Authorization Attacks
Policy Bypass
Attack: Circumvent OPA policy evaluation
Mitigations:
- Policy enforcement in SDK code path (cannot bypass)
- No configuration to disable authorization
- Complete mediation (all requests checked)
- Default deny
Likelihood: Very Low Impact: Critical Risk: Low
Policy Logic Bugs
Attack: Exploit bugs in Rego policies
Mitigations:
- Policy testing framework
- Example policies provided
- Policy review best practices
- OPA built-in functions (audited)
Likelihood: Medium Impact: High Risk: Medium
4. Application-Level Attacks
Malicious Capability Implementation
Attack: Developer writes capability that leaks data
Mitigations:
- ⚠️ Application responsibility
- Code review required
- Context isolation per request
- Audit logging of all invocations
Likelihood: Medium Impact: High Risk: Medium (application-level)
Dependency Vulnerabilities
Attack: Exploit vulnerability in Python dependencies
Mitigations:
- Regular dependency scanning
- Pin dependency versions
- Security update process
- Minimal dependencies
Likelihood: Medium Impact: Varies Risk: Medium
5. Operational Attacks
Configuration Tampering
Attack: Modify agent configuration to weaken security
Mitigations:
- Configuration validation at startup
- Immutable config (ConfigMap in K8s)
- No flags to disable security
- Config file checksums
Likelihood: Low Impact: High Risk: Medium
Log Tampering
Attack: Modify audit logs to hide attacks
Mitigations:
- Write-once storage (recommended)
- Send logs to SIEM immediately
- Log signing (optional)
- Access controls on log files
Likelihood: Medium Impact: Medium Risk: Medium
Residual Risks
Even with all mitigations, some risks remain:
1. SPIRE Infrastructure Compromise
Risk: If SPIRE server is compromised, attacker can issue arbitrary SVIDs
Acceptance Rationale:
- SPIRE security is external to SDK
- Follow SPIRE production deployment guide
- Use HA setup with monitoring
- Regular security audits
2. Policy Misconfiguration
Risk: Overly permissive OPA policies grant excessive access
Acceptance Rationale:
- Policy testing is operator responsibility
- Provide policy examples and best practices
- Policy review tooling recommended
- Start with restrictive policies, expand as needed
3. Application Logic Vulnerabilities
Risk: Bugs in capability implementation compromise security
Acceptance Rationale:
- Application code is outside SDK scope
- SDK provides secure primitives
- Audit logging enables detection
- Secure coding guidelines provided
4. Distributed Denial of Service
Risk: Coordinated attack from many sources overwhelms agent
Acceptance Rationale:
- DDoS mitigation is infrastructure concern
- Use cloud provider DDoS protection
- Rate limiting reduces impact
- Circuit breakers prevent cascading failures
5. Zero-Day Exploits
Risk: Unknown vulnerabilities in dependencies (Python, OpenSSL, etc.)
Acceptance Rationale:
- Keep dependencies updated
- Subscribe to security advisories
- Incident response plan required
- Defense in depth limits blast radius
6. Insider Threats with Legitimate Access
Risk: Malicious insider with valid credentials abuses access
Acceptance Rationale:
- Least privilege limits damage
- Comprehensive audit logging enables detection
- Separation of duties recommended
- Anomaly detection can flag suspicious behavior
Assumptions
The threat model makes these assumptions:
- SPIRE is Securely Deployed
- SPIRE server is protected and monitored
- Platform attestation is trustworthy
- Node agents are authentic
- Platform is Trusted
- Kubernetes/OS is not compromised
- Container runtime is secure
- Host OS kernel is trustworthy
- Network Infrastructure
- Network stack is reliable
- DNS is not poisoned
- Time synchronization is accurate (for cert validation)
- Cryptography is Sound
- TLS 1.3 implementation is correct
- X.509 certificate validation works
- Random number generators are secure
- Operator Competence
- Security best practices followed
- Regular updates applied
- Monitoring and alerting configured
- Incident response plan exists
Threat Model Maintenance
This threat model should be reviewed:
- Quarterly: Routine review of new threats
- When: New features are added to SDK
- When: New attack vectors are published
- When: Security incidents occur
References
- STRIDE Threat Modeling
- NIST SP 800-154: Guide to Data-Centric Threat Modeling
- SPIFFE Threat Model
- OPA Security
- OWASP Threat Modeling
Next Steps
- Review Best Practices for operational security
- See Audit Logging for detection and response
- Check Compliance for regulatory requirements