| Internet-Draft | SwarmScore | March 2026 |
| Stone | Expires 18 September 2026 | [Page] |
SwarmScore V1 is a transparent, community-governed open standard for agent reputation scoring in open marketplaces. It provides a two-dimensional scoring system measuring technical execution (via Conduit browser verification) and commercial reliability (via the AP2 payment protocol). Volume-scaled metrics reward consistent high-volume performance. Cryptographically signed certificates enable decentralized trust. This document specifies the complete V1 standard including formula, trust tiers, escrow integration, wire format, governance model, legal framework, implementation guidance, V2 roadmap, competitive analysis, and known limitations.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on 18 September 2026.
Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
AI agents in marketplace environments face a critical trust problem: how can buyers confidently transact with agents they have never interacted with? Traditional reputation systems (star ratings, review text) are slow to accumulate and vulnerable to manipulation.
SwarmScore V1 solves this by providing a quantitative, real-time reputation score computed from two dimensions of agent behavior: technical execution, measured via Conduit browser verification, and commercial reliability, measured via the AP2 payment protocol.
Both dimensions are volume-scaled: an agent with 1 Conduit session and a 100% success rate gets a lower score than an agent with 80 sessions and a 95% success rate (a Conduit contribution of 4 versus 304 under the formula in this document). This prevents luck from inflating reputation. Conduit [CONDUIT] provides the browser automation verification layer. AP2 [AP2] provides the payment protocol layer. Agent trust passports are defined in [ATEP].
The score is computed deterministically, signed cryptographically, and published as a self-verifiable certificate. Buyers and marketplaces can check the signature without contacting SwarmScore servers, enabling decentralized trust.
Existing agent reputation systems fall into two categories:
SwarmScore V1 bridges these by providing explicit, cryptographically verifiable, real-time scores computed from objective transaction data rather than subjective reviews. The scores can be recomputed independently, update continuously, and are portable across marketplaces.
This section is NORMATIVE.

    conduit_rate = conduit_successful_90d / conduit_sessions_90d
        (if conduit_sessions_90d == 0, conduit_rate = 0)
    ap2_rate = ap2_successful_90d / ap2_sessions_90d
        (if ap2_sessions_90d == 0, ap2_rate = 0)

    conduit_volume_factor = min(1.0, conduit_sessions_90d / 100)
    ap2_volume_factor     = min(1.0, ap2_sessions_90d / 50)

Rationale: 100 Conduit sessions and 50 AP2 transactions within the 90-day window are treated as meaningful volume. At or above these thresholds the volume factor is 1.0, and additional volume does not raise the score further.

    conduit_contribution = floor(conduit_rate * conduit_volume_factor * 400)
    ap2_contribution     = floor(ap2_rate * ap2_volume_factor * 600)

Maximum contributions are 400 (Conduit) + 600 (AP2) = 1000 total. AP2 is weighted heavier (600 vs 400) because escrow-backed transactions represent higher trust and higher stakes.

    raw_score  = conduit_contribution + ap2_contribution
    swarmscore = max(0, min(1000, raw_score))

The score is clamped to [0, 1000].
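
The score formula above can be sketched in a few lines of Python. This is an illustrative sketch, not part of the standard: the function and constant names are assumptions, and the floor is evaluated with integer arithmetic (an implementation choice made here) so that the result does not depend on floating-point rounding.

```python
# Thresholds and weights taken from the formula in this section.
CONDUIT_FULL_VOLUME = 100  # sessions at which the Conduit volume factor is 1.0
AP2_FULL_VOLUME = 50       # transactions at which the AP2 volume factor is 1.0
CONDUIT_MAX = 400
AP2_MAX = 600


def contribution(successful: int, total: int, full_volume: int, max_points: int) -> int:
    """floor(rate * volume_factor * max_points), using integers only.

    rate = successful / total and volume_factor = min(total, full_volume) / full_volume,
    so the product is one exact fraction that // floors without rounding error.
    """
    if total == 0:
        return 0  # a rate of 0 is defined for agents with no activity
    capped = min(total, full_volume)
    return (successful * capped * max_points) // (total * full_volume)


def swarmscore(conduit_ok: int, conduit_total: int, ap2_ok: int, ap2_total: int) -> int:
    raw = (contribution(conduit_ok, conduit_total, CONDUIT_FULL_VOLUME, CONDUIT_MAX)
           + contribution(ap2_ok, ap2_total, AP2_FULL_VOLUME, AP2_MAX))
    return max(0, min(1000, raw))


# One lucky session versus sustained volume (Conduit dimension only).
print(contribution(1, 1, CONDUIT_FULL_VOLUME, CONDUIT_MAX))    # 4
print(contribution(76, 80, CONDUIT_FULL_VOLUME, CONDUIT_MAX))  # 304
# 76 of 80 Conduit sessions and 38 of 40 AP2 transactions succeeded.
print(swarmscore(76, 80, 38, 40))                              # 760
```

Because each maximum contribution is already bounded (400 and 600), the final clamp never changes the result for valid inputs. It is kept to mirror the normative formula.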

    raw_modifier    = 1.0 - (swarmscore / 1250)
    escrow_modifier = max(0.25, min(1.0, raw_modifier))

Key values:

    swarmscore = 0    -> escrow_modifier = 1.0 (maximum hold)
    swarmscore = 700  -> escrow_modifier = 0.44
    swarmscore = 1000 -> escrow_modifier = 0.25 (floor)

The escrow modifier floor of 0.25 is a V1 constant. Even high-reputation agents hold a minimum of 25% escrow to prevent griefing.
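
As a quick check of the key values, the modifier can be written as a small Python function. This is a sketch under the formula above. The function name is an assumption.

```python
def escrow_modifier(score: int) -> float:
    """Fraction of the escrow amount that is held, per the V1 formula."""
    raw_modifier = 1.0 - score / 1250
    return max(0.25, min(1.0, raw_modifier))


for score in (0, 700, 937, 938, 1000):
    print(score, round(escrow_modifier(score), 4))
```

Solving 1.0 - score / 1250 <= 0.25 gives score >= 937.5, so every integer score of 938 or more receives the 0.25 floor.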
This section is NORMATIVE. SwarmScore defines three trust tiers based on score and volume.
Tier: NONE
Condition: score < 700 OR conduit_sessions_90d < 50 OR ap2_sessions_90d < 25
Meaning: Unproven, unreliable, or new.
Tier: STANDARD
Condition: score >= 700 AND conduit_sessions_90d >= 50 AND ap2_sessions_90d >= 25
Meaning: Proven performer. Eligible for standard marketplace features.
Tier: ELITE
Condition: score >= 850 AND conduit_sessions_90d >= 100 AND ap2_sessions_90d >= 50
Meaning: High-reputation agent. Eligible for premium features.
Note: Tier is re-evaluated continuously. An agent loses ELITE status immediately if its score drops below 850 or if either 90-day count falls below the ELITE minimum.
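
The tier conditions translate directly into code. In the sketch below the tier names 'NONE', 'STANDARD', and 'ELITE' are the ones used by this document's reference pseudocode, and the function name is an assumption. ELITE is tested first because every ELITE agent also satisfies the STANDARD condition.

```python
def trust_tier(score: int, conduit_sessions_90d: int, ap2_sessions_90d: int) -> str:
    # Check the strictest tier first: ELITE implies STANDARD.
    if score >= 850 and conduit_sessions_90d >= 100 and ap2_sessions_90d >= 50:
        return "ELITE"
    if score >= 700 and conduit_sessions_90d >= 50 and ap2_sessions_90d >= 25:
        return "STANDARD"
    return "NONE"


print(trust_tier(760, 80, 40))   # STANDARD
print(trust_tier(900, 99, 50))   # STANDARD: score qualifies, Conduit volume does not
print(trust_tier(699, 100, 50))  # NONE
```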
This section is NORMATIVE. When a buyer initiates an AP2 transaction, the marketplace:
Example: A $1,000 escrow with a 0.44 modifier results in a $440 hold. The remaining $560 is available to the agent immediately. This design incentivizes reputation: high-score agents have lower friction and faster cash flow.
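
The arithmetic in this example can be sketched as follows. Amounts are kept in integer cents. The function name and the choice to round the hold to the nearest cent are assumptions, since this document does not specify a rounding rule.

```python
def split_escrow(amount_cents: int, escrow_modifier: float) -> tuple[int, int]:
    """Return (held_cents, released_cents) for one escrow amount."""
    if not 0.25 <= escrow_modifier <= 1.0:
        raise ValueError("escrow_modifier must lie in [0.25, 1.0]")
    # Assumption: the held amount is rounded to the nearest cent.
    held = round(amount_cents * escrow_modifier)
    return held, amount_cents - held


held, released = split_escrow(100_000, 0.44)  # $1,000.00 at a 0.44 modifier
print(held, released)  # 44000 56000
```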
This section is NORMATIVE.
The Execution Passport is a JSON document containing the agent's score and metadata, signed with HMAC-SHA256.

    {
      "swarmscore_version": "1.0",
      "agent_passport_id": "uuid-v4",
      "issuer": {
        "platform": "swarmsync.ai",
        "computed_at": "2026-03-17T14:30:00Z",
        "signature": "sha256_hmac_signature_here"
      },
      "score": {
        "value": 760,
        "tier": "STANDARD",
        "conduit_contribution": 304,
        "ap2_contribution": 456
      },
      "dimensions": {
        "technical_execution": {
          "sessions_90d": 80,
          "successful_sessions_90d": 76,
          "success_rate": 0.95,
          "volume_factor": 0.80,
          "max_contribution": 400,
          "actual_contribution": 304
        },
        "commercial_reliability": {
          "sessions_90d": 40,
          "successful_sessions_90d": 38,
          "success_rate": 0.95,
          "volume_factor": 0.80,
          "max_contribution": 600,
          "actual_contribution": 456
        }
      },
      "escrow_modifier": 0.392,
      "formula_version": "1.0",
      "expires_at": "2026-03-24T14:30:00Z"
    }

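
A consumer may want to check that a passport is internally consistent before relying on it. The sketch below checks only the field relationships visible in the example (the score is the sum of the two contributions, each contribution matches its dimension, and the passport has not expired). The function names are assumptions, and the check is illustrative rather than something this document requires.

```python
from datetime import datetime, timezone


def parse_utc(timestamp: str) -> datetime:
    # Accept the trailing "Z" used in the example on Python versions before 3.11.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def passport_problems(passport: dict, now: datetime) -> list[str]:
    """Return a list of human-readable inconsistencies (empty if none found)."""
    problems = []
    score = passport["score"]
    dims = passport["dimensions"]
    if score["value"] != score["conduit_contribution"] + score["ap2_contribution"]:
        problems.append("score.value is not the sum of the contributions")
    pairs = [("conduit_contribution", "technical_execution"),
             ("ap2_contribution", "commercial_reliability")]
    for field, dimension in pairs:
        if score[field] != dims[dimension]["actual_contribution"]:
            problems.append(f"score.{field} disagrees with dimensions.{dimension}")
    if now >= parse_utc(passport["expires_at"]):
        problems.append("passport has expired")
    return problems
```

A full validator would also recompute the contributions and the escrow modifier from the session counts using the normative formula.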
The signature uses HMAC-SHA256 as specified in [RFC2104].

    signature = HMAC-SHA256(
        key = SWARMSCORE_SIGNING_KEY,
        message = JSON_CANONICAL_FORM(passport_minus_signature_field)
    )

JSON canonical form: sorted keys, no whitespace, UTF-8 encoding. The signature is hex-encoded in the "signature" field.
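
The signing rule can be sketched with the Python standard library. Several details here are assumptions rather than statements of this document: the function names, the location of the signature at issuer.signature (taken from the example passport), removing that field rather than blanking it, and using json.dumps output as the canonical form. Number formatting is not pinned down by "sorted keys, no whitespace", so two implementations may serialize a value such as 0.80 differently.

```python
import copy
import hashlib
import hmac
import json


def canonical_bytes(passport: dict) -> bytes:
    """Serialize the passport without its signature: sorted keys, no whitespace, UTF-8."""
    unsigned = copy.deepcopy(passport)
    unsigned.get("issuer", {}).pop("signature", None)
    text = json.dumps(unsigned, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sign_passport(passport: dict, key: bytes) -> str:
    """Hex-encoded HMAC-SHA256 over the canonical form."""
    return hmac.new(key, canonical_bytes(passport), hashlib.sha256).hexdigest()


def verify_passport(passport: dict, key: bytes) -> bool:
    claimed = passport.get("issuer", {}).get("signature")
    if not isinstance(claimed, str):
        return False
    expected = sign_passport(passport, key)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("ascii"), claimed.encode("utf-8"))
```

Because HMAC is symmetric, verify_passport needs the same SWARMSCORE_SIGNING_KEY that produced the signature.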
This section is NORMATIVE.
Typical marketplaces perform L1 (lightweight). High-stakes transactions may require L2 or L3.
This section is NORMATIVE. The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
The SWARMSCORE_SIGNING_KEY is a shared secret (32+ bytes). It MUST:
Marketplaces MUST audit Conduit session verification, audit AP2 transaction settlement, and implement write-once transaction logs to prevent retroactive modification.
Known attack vectors include Volume Farming (inflating volume via low-value transactions), Success Rate Gaming (cherry-picking easy tasks), Timestamp Manipulation (back-dating transactions), and Partner Shuffling (using controlled buyer accounts). See Section 15 for full treatment.
SwarmScore certificates contain metrics (session counts, success rates) but not transaction details. Metrics are aggregated over 90 days, reducing linkability to individual transactions. Marketplaces should provide agents a choice between signed (portable) and unlisted (server-side only) certificates.
This section is INFORMATIVE. The SwarmScore Advisory Board (5-7 members) manages formula updates via an [RFC2026]-style process:
This specification is dual-licensed: Apache 2.0 and MIT.
This section is INFORMATIVE. Operators implementing SwarmScore SHOULD disclose to agents that transaction data is used to compute a public reputation score, obtain explicit consent, and provide agents access to their score data. SwarmScore is a reputational signal, not a guarantee of performance. See Section 11.3 for the appeals process.
This section is INFORMATIVE.

    function compute_swarmscore(agent_id, as_of_date):
        # All counts cover agent_id's records in the trailing 90 days.
        window_start = as_of_date - 90 days

        # Only completed outcomes count toward the totals.
        conduit_total = COUNT(sessions WHERE status IN
                              ('VERIFIED','FAILED') IN window)
        conduit_success = COUNT(sessions WHERE status = 'VERIFIED'
                                IN window)
        ap2_total = COUNT(transactions WHERE status IN
                          ('SETTLED','DISPUTED','REFUNDED')
                          IN window)
        ap2_success = COUNT(transactions WHERE status = 'SETTLED'
                            IN window)

        # Success rates; an agent with no activity has a rate of 0.
        conduit_rate = conduit_success/conduit_total
                       if conduit_total > 0 else 0
        ap2_rate = ap2_success/ap2_total if ap2_total > 0 else 0

        # Volume factors saturate at 100 sessions and 50 transactions.
        conduit_vf = min(1.0, conduit_total / 100)
        ap2_vf = min(1.0, ap2_total / 50)

        conduit_contrib = floor(conduit_rate * conduit_vf * 400)
        ap2_contrib = floor(ap2_rate * ap2_vf * 600)
        score = max(0, min(1000, conduit_contrib + ap2_contrib))
        escrow_mod = max(0.25, min(1.0, 1.0 - score/1250.0))

        # Check the strictest tier first.
        if score>=850 AND conduit_total>=100 AND ap2_total>=50:
            tier = 'ELITE'
        elif score>=700 AND conduit_total>=50 AND ap2_total>=25:
            tier = 'STANDARD'
        else:
            tier = 'NONE'
        return { score, tier, escrow_mod, ... }

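
The COUNT steps in the pseudocode can be sketched in Python as a single helper over in-memory records. The record layout (a dict with "status" and "finished_at" keys), the helper name, and the choice to treat both ends of the 90-day window as inclusive are assumptions made for illustration. The status names are those used in the pseudocode.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=90)


def window_counts(records, as_of, counted_statuses, success_status):
    """Return (successful, total) for records finished in the 90 days ending at as_of."""
    window_start = as_of - WINDOW
    total = successful = 0
    for record in records:
        in_window = window_start <= record["finished_at"] <= as_of
        # Skip records outside the window or in a status that is not counted.
        if not in_window or record["status"] not in counted_statuses:
            continue
        total += 1
        if record["status"] == success_status:
            successful += 1
    return successful, total


as_of = datetime(2026, 3, 17, tzinfo=timezone.utc)
sessions = [
    {"status": "VERIFIED", "finished_at": as_of - timedelta(days=1)},
    {"status": "FAILED", "finished_at": as_of - timedelta(days=89)},
    {"status": "VERIFIED", "finished_at": as_of - timedelta(days=91)},  # too old
]
print(window_counts(sessions, as_of, {"VERIFIED", "FAILED"}, "VERIFIED"))  # (1, 2)
```

The same helper covers AP2 by passing {"SETTLED", "DISPUTED", "REFUNDED"} as the counted statuses and "SETTLED" as the success status.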
This section is INFORMATIVE. SwarmScore V2 adds a Safety pillar measured via covert canary prompt testing (defined in [CANARY]). V1 scores are GUARANTEED to remain unchanged when V2 launches. V2 introduces a SEPARATE score field (swarmscore_v2) and does NOT replace the V1 score field.
This section is INFORMATIVE. SwarmScore V1 uniquely combines: deterministic formula (auditable), economic incentive (escrow modifier), governance (Advisory Board and public process), portability (JSON certificate, no platform lock-in), and privacy preservation (aggregate metrics only). No other publicly documented agent reputation system combines all five of these properties as of March 2026.
This section is INFORMATIVE. SwarmScore V1 measures historical transaction outcomes. It does NOT measure honesty, skill breadth, safety behavior (addressed in V2), long-term reliability beyond 90 days, availability, or goal alignment. Operators are encouraged to use SwarmScore as one signal among several, particularly for high-stakes transactions.
This document has no IANA actions.