A fair question from every issuer we talk to: "the agent reports its own intent — why would you trust that?" We don't. Here is exactly which signals are independently verified, which are attestations, and how attestations are tested.
Intent context is an attestation, not a credential. Every intent claim is scored against the agent's observed behavioral baseline. An agent that misrepresents its reasoning produces measurable divergence between stated intent and transaction behavior — and divergence is exactly what the anomaly gate detects. Agents that lie drift; drift gets caught.
| Signal | Source | How it is verified |
|---|---|---|
| Agent identity | API key binding + registry | Independently verified. Credentials are bound to a registered agent; requests are authenticated per call. Where network agent-identity standards are present (Trusted Agent Protocol, Agentic Tokens), DTP consumes them as stronger identity inputs — verified network identity hardens baseline attribution at gate 1. Identity cannot be attested into existence. |
| Mandate bounds | Platform data (principal-configured) | Independently enforced. Amount, MCC, geography, time-of-day and frequency limits live in our registry — the agent never supplies its own limits. |
| Velocity | Observed transaction stream | Independently computed from what the agent actually did, atomically, under concurrency. |
| Merchant / MCC | Network and acquirer data | Independently verified against network-provided merchant identifiers — not the agent's description of the merchant. |
| Behavioral baseline | Accumulated decision history | Independently measured. Built from every prior authorization: amounts, categories, timing, reasoning patterns, confidence calibration. |
| intent_context | Agent attestation | Tested, not trusted. Scored against the behavioral baseline across six EDQS dimensions. Stated intent that diverges from observed behavior raises the anomaly score and degrades the KYA Score — which gates future authority. |
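The "Mandate bounds" row can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `REGISTRY`, a `Mandate` record, and a `check_mandate` helper; none of these names come from DTP, and the real registry covers more dimensions (time-of-day, frequency). The point it shows is that limits are looked up platform-side by agent ID, and anything limit-like inside the request is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mandate:
    """Hypothetical principal-configured limits held by the platform."""
    max_amount_cents: int
    allowed_mccs: frozenset
    allowed_countries: frozenset


# Hypothetical platform-side registry; the agent cannot write to it.
REGISTRY = {
    "agent-123": Mandate(50_000, frozenset({"5411", "5812"}), frozenset({"US"})),
}


def check_mandate(agent_id, request):
    """Return (allowed, reason). Limits come only from REGISTRY."""
    mandate = REGISTRY.get(agent_id)
    if mandate is None:
        return False, "unknown_agent"
    # Any "limits" field the agent placed in the request is never read.
    if request["amount_cents"] > mandate.max_amount_cents:
        return False, "amount_exceeds_mandate"
    if request["mcc"] not in mandate.allowed_mccs:
        return False, "mcc_not_allowed"
    if request["country"] not in mandate.allowed_countries:
        return False, "country_not_allowed"
    return True, "ok"


# The agent claims a generous limit for itself; the claim has no effect.
request = {
    "amount_cents": 60_000,
    "mcc": "5411",
    "country": "US",
    "limits": {"max_amount_cents": 10**9},
}
decision = check_mandate("agent-123", request)
```

Because the check never reads the request's own `limits` field, an agent cannot widen its authority by attesting to a larger mandate.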
A dishonest attestation has to keep being dishonest. One fabricated reasoning_summary may pass; a pattern of them shifts the agent's vocabulary fingerprint, decouples stated confidence from outcomes, and collapses the declared alternatives-considered distribution. Those are three of the six dimensions the anomaly gate scores on every single transaction. The cost of sustained misrepresentation is a falling KYA Score — reduced spending authority, step-up challenges, and eventually RED-zone suspension.
A manipulated agent shows reasoning artifacts and behavioral discontinuities mid-session. The behavioral overlay (gate 5) scores degradation and injection indicators before mandate enforcement runs; session-level fingerprinting catches the shift at the first anomalous transaction.
Stolen API credentials produce transactions that authenticate correctly but diverge immediately from the bound agent's baseline — timing, merchant mix, amount distribution. Divergence triggers the anomaly gate independent of any attestation.
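As a rough illustration of attestation-independent divergence, the sketch below scores a new transaction amount against an agent's historical amounts with a plain z-score. This is a stand-in technique chosen for brevity, not the scoring the anomaly gate actually uses; the function name `amount_zscore` and the sample amounts are hypothetical, and a real baseline would also cover timing and merchant mix.

```python
import statistics


def amount_zscore(baseline_amounts, new_amount):
    """Distance of new_amount from the baseline mean, in standard deviations.

    Needs at least two baseline points (statistics.stdev requirement).
    """
    mean = statistics.fmean(baseline_amounts)
    stdev = statistics.stdev(baseline_amounts)
    if stdev == 0:
        # Degenerate baseline: any different amount is maximally divergent.
        return 0.0 if new_amount == mean else float("inf")
    return abs(new_amount - mean) / stdev


# Hypothetical history for the bound agent: small, steady purchases.
baseline = [20.0, 22.0, 19.0, 21.0, 20.0, 23.0, 18.0]

typical = amount_zscore(baseline, 21.0)    # close to the baseline
stolen = amount_zscore(baseline, 500.0)    # far outside the baseline
```

Nothing in this calculation consults what the caller says about itself, which is why correctly authenticated but stolen credentials still stand out.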
Slow manipulation designed to retrain the baseline is bounded by mandate ceilings (platform-held, not attested) and monitored by drift detection across sessions — gradual deviation accumulates in the EDQS trend even when each step is individually plausible.
Sustained misrepresentation is covered above: attestations are tested on every transaction, and dishonesty is priced in falling authority. The economics of lying to the anomaly gate are negative.
An agent decomposes a forbidden or over-cap purchase into many individually compliant transactions. Mitigations: cumulative daily/monthly caps enforced alongside per-transaction limits; amount-distribution telemetry (denomination clustering, kurtosis shifts) flags structuring patterns; merchant-affinity concentration surfaces repeated same-merchant decomposition. Residual risk acknowledged: cross-merchant splitting below all caps — an active red-team scenario in the pre-registered adversarial evaluation.
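The first mitigation, cumulative caps enforced alongside per-transaction limits, can be sketched as follows. The `CapTracker` class and the cap values are hypothetical, and this single-threaded version omits the atomicity a production velocity counter would need under concurrency.

```python
from collections import defaultdict


class CapTracker:
    """Enforces a per-transaction cap and a cumulative daily cap together."""

    def __init__(self, per_txn_cap, daily_cap):
        self.per_txn_cap = per_txn_cap
        self.daily_cap = daily_cap
        self._spent = defaultdict(int)  # (agent_id, day) -> approved cents

    def authorize(self, agent_id, day, amount_cents):
        if amount_cents > self.per_txn_cap:
            return False
        key = (agent_id, day)
        # The running total is what defeats splitting: each slice may be
        # individually compliant, but the sum still has to fit the cap.
        if self._spent[key] + amount_cents > self.daily_cap:
            return False
        self._spent[key] += amount_cents
        return True


# An agent splits a 30,000-cent purchase into four 7,500-cent slices.
tracker = CapTracker(per_txn_cap=10_000, daily_cap=25_000)
results = [tracker.authorize("agent-123", "2025-01-01", 7_500) for _ in range(4)]
# results == [True, True, True, False]: the fourth slice breaches the daily cap.
```

Declined transactions do not count toward the running total in this sketch, so a rejected slice does not consume headroom for later legitimate purchases.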
A legitimate, uncompromised agent is steered by an adversarial merchant surface — manipulated listings, dark-pattern checkout flows, injected instructions in product content. The agent’s identity is valid and its mandate is satisfied, so identity-layer controls pass. Detection lives in intent coherence (stated purpose vs. merchant category drift), merchant-affinity anomalies (new-merchant rate spikes), and reasoning-quality signals. This is the threat class where decision-quality evaluation does work that identity and consent layers structurally cannot.
Any score that gates money will be farmed: an adversary runs clean transactions to build trust before exploiting it — the synthetic-identity bust-out pattern applied to agents. Mitigations: trust promotion requires sustained consistency across all score components, not volume alone; promotion velocity is itself a monitored signal; effective limits scale gradually with tier so the value extractable at each tier is bounded relative to the cost of reaching it; demotion is asymmetric (fast down, slow up). Stated honestly: warming resistance is a design property, not yet an adversarially validated one — it is a named arm of the red-team protocol.
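Asymmetric demotion (fast down, slow up) can be illustrated with a simple update rule that moves a trust score toward each new observation at different rates depending on direction. The function `update_trust` and the rate values are illustrative assumptions, not the KYA Score formula.

```python
def update_trust(score, observation, up_rate=0.02, down_rate=0.5):
    """Move score toward observation (both in [0, 1]).

    Gains are applied at up_rate and losses at down_rate, so trust is
    slow to earn and quick to lose.
    """
    rate = up_rate if observation >= score else down_rate
    return score + rate * (observation - score)


start = 0.5

# Ten clean transactions in a row (observation 1.0) raise the score slowly.
warmed = start
for _ in range(10):
    warmed = update_trust(warmed, 1.0)

# A single bad transaction (observation 0.0) then cuts it sharply.
after_incident = update_trust(warmed, 0.0)
```

With these rates, one bad observation leaves the score below where it started before the ten clean ones, which is the property that makes warming expensive relative to what it unlocks.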
Coordinated multi-agent abuse patterns are analyzed asynchronously after the decision (step 9), across the tenant's full agent population.
Full scoring methodology: EDQS Research Framework · Protocol architecture: DTP whitepaper · Terms: Glossary