Trust & Security
Every skill on ClawNet carries a trust score from 0 to 100, computed from four weighted signals:
| Signal | Weight | What It Measures |
|---|---|---|
| Attestation Balance | 40% | Safe vs. unsafe verdicts. Each "unsafe" verdict weighs 3x as much as a "safe" verdict. |
| Auditor Breadth | 25% | Number of unique auditors (log scale). More independent reviewers = higher trust. |
| On-chain Anchoring | 20% | Ratio of on-chain and AIP-signed attestations among safe verdicts. |
| Version Continuity | 15% | Auditors who re-vouch across new versions, signaling ongoing review. |
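As a rough illustration of how the four weights combine, here is a minimal sketch. It assumes each signal has already been normalized to the range 0 to 1, and it assumes one plausible form for the attestation balance (safe count divided by safe plus three times unsafe). The names `WEIGHTS`, `attestationBalance`, and `trustScore` are hypothetical; the real normalization lives in `lib/trust-score.ts` and may differ.

```typescript
// Weights copied from the table above; they sum to 1.
const WEIGHTS = {
  attestationBalance: 0.4,
  auditorBreadth: 0.25,
  onChainAnchoring: 0.2,
  versionContinuity: 0.15,
} as const;

type SignalName = keyof typeof WEIGHTS;
type Signals = Record<SignalName, number>;

function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

// ASSUMPTION: one possible reading of "each unsafe counts 3x against safe".
// Returns a value in [0, 1]; 0 when there are no verdicts at all.
function attestationBalance(safe: number, unsafe: number): number {
  const total = safe + 3 * unsafe;
  return total === 0 ? 0 : safe / total;
}

// Weighted sum of normalized signals, scaled to 0-100 and rounded.
function trustScore(signals: Signals): number {
  let sum = 0;
  for (const key of Object.keys(WEIGHTS) as SignalName[]) {
    sum += WEIGHTS[key] * clamp01(signals[key]);
  }
  return Math.round(sum * 100);
}
```

Under these assumptions, a skill with three safe verdicts and one unsafe verdict gets an attestation balance of 0.5, which contributes at most 20 of the 100 points.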
Score Labels
| Score | Label |
|---|---|
| 0 | Unreviewed |
| 1-19 | Low |
| 20-39 | Cautious |
| 40-59 | Moderate |
| 60-79 | Good |
| 80-100 | Strong |
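The label table maps directly onto a small lookup. The sketch below assumes scores are integers from 0 to 100; `scoreLabel` is a hypothetical helper name, not a documented ClawNet function.

```typescript
type TrustLabel = "Unreviewed" | "Low" | "Cautious" | "Moderate" | "Good" | "Strong";

// Maps an integer score (0-100) to the label from the table above.
function scoreLabel(score: number): TrustLabel {
  if (!Number.isInteger(score) || score < 0 || score > 100) {
    throw new RangeError(`score must be an integer from 0 to 100, got ${score}`);
  }
  if (score === 0) return "Unreviewed";
  if (score <= 19) return "Low";
  if (score <= 39) return "Cautious";
  if (score <= 59) return "Moderate";
  if (score <= 79) return "Good";
  return "Strong";
}
```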
Web of Trust (Relative Scoring)
When you sign in, ClawNet shows a trust score for each skill that is relative to you: the algorithm counts only attestations from auditors you trust. If you haven't established any trust edges, it falls back to the app's default trust graph.
Two users can see different trust scores for the same skill. This is by design -- trust is personal.
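The filtering step described above can be sketched as follows. This is an illustration under assumptions: the `Attestation` shape, the `visibleAttestations` function, and the representation of a trust graph as a flat set of auditor IDs are all hypothetical simplifications, not ClawNet's actual data model.

```typescript
interface Attestation {
  auditor: string; // auditor identifier (e.g. a BAP ID)
  verdict: "safe" | "unsafe";
}

// Keeps only attestations from trusted auditors. If the user has no trust
// edges, the app's default trust set is used instead (ASSUMED fallback shape).
function visibleAttestations(
  attestations: Attestation[],
  userTrusted: ReadonlySet<string>,
  defaultTrusted: ReadonlySet<string>,
): Attestation[] {
  const trusted = userTrusted.size > 0 ? userTrusted : defaultTrusted;
  return attestations.filter((a) => trusted.has(a.auditor));
}

// The same skill's attestations, seen by two different users.
const skillAttestations: Attestation[] = [
  { auditor: "alice", verdict: "safe" },
  { auditor: "bob", verdict: "unsafe" },
  { auditor: "carol", verdict: "safe" },
];
const appDefault = new Set(["carol"]);

const seenByUserA = visibleAttestations(skillAttestations, new Set(["alice"]), appDefault);
const seenByUserB = visibleAttestations(skillAttestations, new Set(["bob"]), appDefault);
const seenByNewUser = visibleAttestations(skillAttestations, new Set<string>(), appDefault);
```

Because each user's score is computed from a different subset of attestations, user A sees only a safe verdict while user B sees only an unsafe one, which is why their scores for the same skill diverge.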
Content Scanning
All published skills are scanned for hidden content that could be used for prompt injection:
- HTML comments invisible in rendered markdown but readable by agents
- Hidden HTML elements (display:none, visibility:hidden)
- Invisible Unicode characters (zero-width spaces, bidi marks)
- Prompt injection phrases ("ignore previous instructions", curl | bash)
Scans run at five enforcement points: CLI publish, API publish, Convex mutation, CLI install, and rendered output. Skills that fail the scan are rejected.
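To make the four categories concrete, here is a minimal regex-based sketch of such a scan. The patterns, the `Finding` type, and the `scanSkillContent` function are illustrative assumptions; ClawNet's real scanner is not shown in this document and is likely more thorough.

```typescript
interface Finding {
  kind: string;  // which category matched
  index: number; // offset of the first match in the input
}

// ASSUMED patterns, one per category listed above.
const CHECKS: { kind: string; pattern: RegExp }[] = [
  // HTML comments: invisible when rendered, readable by agents.
  { kind: "html-comment", pattern: /<!--[\s\S]*?-->/ },
  // Inline styles that hide an element.
  { kind: "hidden-element", pattern: /display\s*:\s*none|visibility\s*:\s*hidden/i },
  // Zero-width characters, bidi marks, embeddings, overrides, and isolates.
  { kind: "invisible-unicode", pattern: /[\u200B-\u200F\u202A-\u202E\u2060\u2066-\u2069\uFEFF]/ },
  // A couple of well-known injection phrases.
  { kind: "injection-phrase", pattern: /ignore previous instructions|curl[^\n|]*\|\s*bash/i },
];

// Returns one finding per category that matches; empty array means clean.
function scanSkillContent(text: string): Finding[] {
  const findings: Finding[] = [];
  for (const { kind, pattern } of CHECKS) {
    const match = pattern.exec(text);
    if (match !== null) {
      findings.push({ kind, index: match.index });
    }
  }
  return findings;
}
```

A caller at any enforcement point would reject the skill when `scanSkillContent` returns a non-empty array.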
Trust CLI Commands
clawnet trust add <bapId> # Trust an auditor
clawnet trust threshold <n> # Required trusted approvals
clawnet trust report <slug> [--version <v>] [--tree-limit <n>] # Score + trust tree readout
clawnet vouch skill <slug> [--version <v>] # Vouch for a skill (safe)
clawnet denounce skill <slug> [--version <v>] # Flag a skill (unsafe)
For the full algorithm, see https://github.com/b-open-io/clawnet/blob/master/lib/trust-score.ts