Scoring Engine
5.1 Site Classification
Before scoring, OTR classifies each site into one of three categories. Classification determines which weight formula and signal set apply.
Three Site Categories
| Category | Description | Scoring |
|---|---|---|
| ecommerce | Online retail, product sales, marketplaces | COLD Ecommerce weights |
| saas | Software-as-a-Service, cloud platforms, developer tools | COLD SaaS weights |
| non_commerce | Non-commercial sites (government, nonprofit, media, education) | Not scored (score = 0, badge = UNRATED) |
How Classification Works
Classification uses a multi-signal confidence system, not single-signal triggers. The system calculates a commerce intent score from multiple signals:
| Signal | Points | Condition |
|---|---|---|
| E-commerce platform | 4 | Shopify, WooCommerce, Magento, etc. detected |
| Payment processor | 3 | Stripe, PayPal, Square, etc. detected |
| Product schema | 3 | Schema.org Product markup found |
| Cart/checkout URL | 2 | /cart, /checkout, /basket URL patterns |
| Pricing page | 2 | /pricing page with paid tiers |
| Commerce links | 2 | /shop, /store, /products URL patterns |
| SaaS signals | 2-4 | Login/signup, API docs, dashboard, status page, changelog, integrations (3+ required) |
| Software schema | 1 | Schema.org SoftwareApplication |
| Live chat | 1 | Live chat widget detected |
| Reservation system | 1 | Reservation/booking system detected |
| Loyalty program | 1 | Loyalty program detected |
Exclusion Factors
Certain site characteristics raise the threshold, making it harder to be classified as commercial:
| Factor | Threshold Increase | Condition |
|---|---|---|
| Government domain | +2 | .gov, .edu, .mil, .int domains |
| Financial industry | +4 | Wikidata financial industry classification |
| Donation/fundraising | +4 | Fundraising platform detected |
| News/media | +3 | Wikidata media industry classification |
| Nonprofit/public entity | +3 | Wikidata nonprofit classification |
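The interaction between the two tables above can be sketched as follows. This is a minimal illustration, not OTR's implementation: the dictionary keys, the function names, and `base_threshold` are hypothetical, the base threshold value is not stated in this chapter, and both the additive stacking of exclusion factors and the `>=` comparison are assumptions.

```python
# Points per signal, copied from the signal table above (key names are hypothetical).
SIGNAL_POINTS = {
    "ecommerce_platform": 4,
    "payment_processor": 3,
    "product_schema": 3,
    "cart_checkout_url": 2,
    "pricing_page": 2,
    "commerce_links": 2,
    "software_schema": 1,
    "live_chat": 1,
    "reservation_system": 1,
    "loyalty_program": 1,
}

# Threshold increases, copied from the exclusion-factor table above.
THRESHOLD_INCREASE = {
    "government_domain": 2,
    "financial_industry": 4,
    "donation_fundraising": 4,
    "news_media": 3,
    "nonprofit_public_entity": 3,
}


def commerce_intent_score(signals, saas_points=0):
    """Sum the points of all detected signals.

    saas_points is the 2-4 point SaaS contribution. How it is derived from
    the individual SaaS signals is not specified here, so it is passed in.
    """
    return sum(SIGNAL_POINTS[name] for name in signals) + saas_points


def is_commercial(signals, exclusions, base_threshold, saas_points=0):
    """Compare the intent score against the (possibly raised) threshold.

    Assumption: exclusion factors stack additively and a score equal to
    the threshold passes.
    """
    threshold = base_threshold + sum(THRESHOLD_INCREASE[name] for name in exclusions)
    return commerce_intent_score(signals, saas_points) >= threshold
```

With a hypothetical base threshold of 3, a site with only a payment processor would pass on its own, but the same site on a fundraising platform would need 7 points and would not.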
Ecommerce vs SaaS Subtype
Once a site passes the commerce threshold, a dual-score competition determines whether it is ecommerce or SaaS:
- Ecommerce score signals: platform fingerprint (+35), product schema (+25), cart URL (+20), commerce links (+10), Wikidata retail QID (+25), retail label match (+20)
- SaaS score signals: SoftwareApplication schema (+30), pricing page with paid tiers (+25), SaaS HTML signals (+10/+20/+25 by count), Wikidata software label (+15), Wikidata SaaS description (+10)
Rules:
- ecommerceScore ≥ 20 AND ecommerceScore > saasScore → ecommerce
- saasScore ≥ 20 AND saasScore > ecommerceScore → saas
- Both below 20, but payment processor detected with no product signals → saas (payment without products indicates subscription billing)
- Both below 20, no special conditions → ecommerce (conservative default)
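The four rules above translate into a short decision function. The sketch below is illustrative only: the function and parameter names are hypothetical, and the listed rules do not say what happens when both scores are equal and at least 20, so that case is assumed to fall back to the same conservative default.

```python
def classify_subtype(ecommerce_score, saas_score,
                     payment_processor=False, product_signals=False,
                     minimum=20):
    """Pick "ecommerce" or "saas" using the dual-score rules above."""
    if ecommerce_score >= minimum and ecommerce_score > saas_score:
        return "ecommerce"
    if saas_score >= minimum and saas_score > ecommerce_score:
        return "saas"
    if ecommerce_score < minimum and saas_score < minimum:
        # Payment without product signals suggests subscription billing.
        if payment_processor and not product_signals:
            return "saas"
        return "ecommerce"  # conservative default
    # Only reachable on a tie at or above the minimum. The rules do not
    # cover this case; defaulting to ecommerce is an assumption.
    return "ecommerce"
```

For example, a site with a platform fingerprint and a cart URL (35 + 20 = 55) and only a pricing page on the SaaS side (25) resolves to ecommerce under the first rule.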
Confidence and Self-Correction
If classification confidence is below 30%, the domain is flagged for review. Subsequent rescans with additional data (Wikidata, GLEIF) can correct the classification automatically.
5.2 Three Scoring Modes
OTR operates in three scoring modes, each with a different weight formula:
COLD Ecommerce (Default for E-commerce Sites)
| Dimension | Weight | What It Measures |
|---|---|---|
| V — Verification | 40% | Identity verification (SSL, GLEIF, Wikidata, domain age) |
| G — Governance | 20% | Business credentials (legal entity, policies, compliance) |
| S — Security | 15% | Site security (DNSSEC, DMARC, SPF, security headers) |
| D — Data Quality | 15% | Structured data (Schema.org, llms.txt, product data) |
| T — Transparency | 10% | Policy transparency (privacy, terms, refund policies) |
| F — Fulfillment | 0% | Not used in COLD mode (requires merchant authorization) |
COLD SaaS (For SaaS/Software Sites)
| Dimension | Weight | Difference from Ecommerce |
|---|---|---|
| V — Verification | 37% | Slightly lower (SaaS identity often clear from product) |
| G — Governance | 23% | Higher (compliance matters more for B2B software) |
| S — Security | 20% | Higher (security is core to SaaS trust) |
| T — Transparency | 15% | Higher (SLA, uptime, changelog are expected) |
| D — Data Quality | 5% | Lower weight, but uses specialized SaaS D-signals |
| F — Fulfillment | 0% | Not used in COLD mode |
AUTH Mode (Authorized Merchants)
| Dimension | Weight | Difference from COLD |
|---|---|---|
| V — Verification | 10% | Much lower (merchant identity already verified) |
| S — Security | 10% | Baseline check only |
| G — Governance | 10% | Baseline check only |
| T — Transparency | 5% | Minimal weight |
| D — Data Quality | 25% | Higher (merchant provides richer data) |
| F — Fulfillment | 40% | Dominant factor (actual transaction performance) |
AUTH mode applies when is_merchant_authorized = true in the database. This happens after a merchant completes the OTR authorization flow.
Why F dominates in AUTH: Once a merchant is verified and authorized, the most important signal is how well they actually fulfill orders. Shipping records, refund rates, customer complaint rates, and response times become the primary trust indicators.
The OTR-ID format reflects the mode: C prefix for COLD mode, A prefix for AUTH mode.
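To make the three weight tables concrete, the sketch below applies them to a set of per-dimension scores. The mode keys (`COLD_ECOMMERCE`, `COLD_SAAS`, `AUTH`) and function names are hypothetical labels chosen for this example, and any rounding OTR applies to the total is not specified here, so none is performed.

```python
# Weights in percent, copied from the three tables above.
WEIGHTS = {
    "COLD_ECOMMERCE": {"V": 40, "G": 20, "S": 15, "D": 15, "T": 10, "F": 0},
    "COLD_SAAS":      {"V": 37, "G": 23, "S": 20, "T": 15, "D": 5,  "F": 0},
    "AUTH":           {"V": 10, "S": 10, "G": 10, "T": 5,  "D": 25, "F": 40},
}


def total_score(mode, dimension_scores):
    """Weighted total (0-100) from per-dimension scores (0-100).

    Dimensions missing from dimension_scores count as 0.
    """
    weights = WEIGHTS[mode]
    return sum(w * dimension_scores.get(dim, 0) for dim, w in weights.items()) / 100


def otr_id_prefix(mode):
    """C prefix for the COLD modes, A prefix for AUTH."""
    return "A" if mode == "AUTH" else "C"
```

The same dimension scores can produce different totals depending on the mode: with V=80, G=70, S=60, D=50, T=90, the ecommerce weights give 71.5 and the SaaS weights give 73.7.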
5.3 Non-Commerce Sites
Sites classified as non_commerce do not receive a trust score, but are still verified:
- Trust score: 0
- Badge: UNRATED
- OTR-ID status: NOT_APPLICABLE (no OTR-ID issued)
- Verification: Identity signals still collected (GLEIF, Wikidata, SSL, Google Web Risk)
- API output: Identity, safety, and entity data are returned. Dimensions and signals are not.
- Frontend: Displays “Verification & Security Only” with detected identity signals.
Parked Domains
Domains detected as parked or listed for sale receive special treatment:
- Site status: PARKED
- Trust score: 0, no backfill or scoring performed
- Detection: 15+ registrar patterns (GoDaddy, Sedo, Afternic, Dan.com, Namecheap, Porkbun, etc.) plus fingerprint analysis
- Frontend: Displays “Parked Domain” instead of “Verification & Security Only”
5.4 Trust Badges
Per-dimension scores (0-100) are weighted to produce a total score (0-100). Badges are assigned based on the total:
| Badge | Score Range | Meaning |
|---|---|---|
| PLATINUM | 90-100 | Top-tier trust |
| GOLD | 80-89 | Highly trustworthy |
| SILVER | 70-79 | Moderately trustworthy |
| BRONZE | 60-69 | Basic trust |
| UNRATED | 0-59 | Insufficient trust (or non-commerce) |
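The badge table maps directly onto a threshold lookup. In this minimal sketch the names are hypothetical, and fractional totals are assumed to be compared against each range's lower bound (so 89.5 would be GOLD); the table itself only lists integer ranges.

```python
# Lower bound of each badge range, highest first (from the table above).
BADGE_FLOORS = [(90, "PLATINUM"), (80, "GOLD"), (70, "SILVER"), (60, "BRONZE")]


def badge_for(total_score):
    """Return the badge whose range contains total_score."""
    for floor, badge in BADGE_FLOORS:
        if total_score >= floor:
            return badge
    return "UNRATED"
```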
What Badges Mean for AI Agents
Trust badges influence how AI agents make recommendation decisions:
- PLATINUM / GOLD: AI agents prioritize these merchants, ranking them higher among similar products
- SILVER: AI agents recommend normally
- BRONZE: AI agents may recommend but will flag the trust level
- UNRATED: AI agents may skip these merchants or deprioritize them
5.5 SaaS D-Dimension: 16 Specialized Signals
When a site is classified as saas, the D (Data Quality) dimension uses a completely different signal set from ecommerce. Instead of product data quality, SaaS D measures infrastructure maturity across 5 sub-dimensions with 16 signals, capped at 100 points total.
D1. API Documentation (max 25 points)
| Signal | Points | Condition |
|---|---|---|
| OpenAPI/Swagger spec | +10 | OpenAPI or Swagger specification detected |
| API endpoint count | +3/+5/+8 | ≥3 endpoints: +3, ≥10: +5, ≥20: +8 |
| Authentication docs | +7 | API authentication/authorization documentation found |
D2. SLA & Reliability (max 20 points)
| Signal | Points | Condition |
|---|---|---|
| SLA page | +8 | Dedicated SLA page exists |
| Uptime commitment | +3/+5/+7 | ≥99.0%: +3, ≥99.5%: +5, ≥99.9%: +7 |
| Status page | +5 | statuspage.io or equivalent status page |
D3. Pricing (max 25 points)
| Signal | Points | Condition |
|---|---|---|
| Pricing page | +6 | Dedicated pricing page exists |
| Paid tier count | +3/+5/+7 | 1 tier: +3, ≥2: +5, ≥3: +7 |
| Extractable prices | +7 | Machine-readable pricing data found |
| Free trial/Freemium | +5 | Free trial or freemium tier available |
D4. Security Compliance (max 15 points)
| Signal | Points | Condition |
|---|---|---|
| Security certification | +7 | SOC2, ISO 27001, or GDPR badge detected |
| Security page | +5 | /security or /trust page exists |
| Data Processing Agreement | +3 | DPA page or link found |
D5. Developer Ecosystem (max 15 points)
| Signal | Points | Condition |
|---|---|---|
| Developer docs | +6 | /docs or /developers page exists |
| Changelog | +4 | /changelog or /releases page exists |
| SDK count | +3/+5 | ≥1 SDK: +3, ≥3 SDKs: +5 |
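The five tables above can be combined into one additive score. The sketch below is an illustration under stated assumptions: the input keys (`openapi_spec`, `api_endpoint_count`, and so on) are hypothetical names for the scan results, and tiered signals are assumed to award only the highest tier reached rather than accumulating.

```python
def tiered(value, tiers):
    """Points for the highest tier whose minimum is met, else 0.

    tiers is a list of (minimum, points) pairs.
    """
    points = 0
    for minimum, tier_points in tiers:
        if value >= minimum:
            points = max(points, tier_points)
    return points


def saas_d_score(s):
    """SaaS D-dimension score (0-100) from a dict of scan results."""
    def flag(key, points):
        return points if s.get(key) else 0

    d1 = (flag("openapi_spec", 10)
          + tiered(s.get("api_endpoint_count", 0), [(3, 3), (10, 5), (20, 8)])
          + flag("auth_docs", 7))
    d2 = (flag("sla_page", 8)
          + tiered(s.get("uptime_percent", 0.0), [(99.0, 3), (99.5, 5), (99.9, 7)])
          + flag("status_page", 5))
    d3 = (flag("pricing_page", 6)
          + tiered(s.get("paid_tier_count", 0), [(1, 3), (2, 5), (3, 7)])
          + flag("extractable_prices", 7)
          + flag("free_trial", 5))
    d4 = (flag("security_certification", 7)
          + flag("security_page", 5)
          + flag("dpa", 3))
    d5 = (flag("developer_docs", 6)
          + flag("changelog", 4)
          + tiered(s.get("sdk_count", 0), [(1, 3), (3, 5)]))
    return min(100, d1 + d2 + d3 + d4 + d5)
```

Because the sub-dimension maxima (25 + 20 + 25 + 15 + 15) already sum to 100, the final cap only acts as a safeguard.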
5.6 How Each Dimension Score Is Calculated
Each dimension score is the weighted sum of all signals within that dimension, normalized to 0-100, where:
- signal_value = 1 (detected), 0 (not found), or negative (penalty signal triggered)
- signal_weight = relative weight of each signal within the dimension
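The exact normalization is not spelled out in this section. One plausible reading, assuming the weighted sum is divided by the total weight of the dimension's signals and then scaled to 100, is:

```latex
\[
\text{dimension\_score}
  = 100 \times
    \frac{\sum_{i} \text{signal\_value}_i \cdot \text{signal\_weight}_i}
         {\sum_{i} \text{signal\_weight}_i}
\]
```

Under this reading, a dimension in which every signal is detected scores 100, and penalty signals pull the numerator down. Whether a negative result is clamped to 0 is likewise an assumption.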
5.7 Signal Detection Status Semantics
OTR uses four statuses for each signal:
| Status | Meaning | Effect on Score |
|---|---|---|
| detected | Signal found and present | Positive contribution |
| not_found | Scanned, but signal is absent | No contribution (or penalty) |
| not_scanned | Signal has not been scanned yet | Excluded from scoring |
| fetch_failed | Scan attempted but failed (timeout, network error, etc.) | Excluded from scoring (no penalty) |
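The effect of the four statuses on a dimension score can be sketched as below. This is not OTR's code: the `Signal` class and its fields are hypothetical, "excluded from scoring" is interpreted as removing the signal from both the weighted sum and the normalizing weight total, and clamping the result to 0-100 is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    weight: float
    status: str                    # detected | not_found | not_scanned | fetch_failed
    not_found_value: float = 0.0   # 0, or negative if absence is penalized


def dimension_score(signals):
    """Weighted, normalized score over signals that were actually scanned.

    Returns None when no signal in the dimension has been scanned yet.
    """
    # not_scanned and fetch_failed are excluded: they neither add nor penalize.
    scored = [s for s in signals if s.status in ("detected", "not_found")]
    total_weight = sum(s.weight for s in scored)
    if total_weight == 0:
        return None
    weighted = sum(
        s.weight * (1.0 if s.status == "detected" else s.not_found_value)
        for s in scored
    )
    return max(0.0, min(100.0, 100.0 * weighted / total_weight))
```

With this interpretation, a timeout on a third-party lookup does not lower the score; the dimension is simply computed from the remaining signals.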
5.8 Score Lifecycle
A domain’s trust score is not calculated once and frozen. It has a complete lifecycle:
Initial Scan
When a domain is first scanned:
- DNS Scan: Check all DNS records (DNSSEC, DMARC, SPF, etc.)
- HTML Scan: Crawl the homepage and key pages; check structured data, policy pages, etc.
- Fingerprint Detection: Identify the technology stack (2,438 fingerprints, 975 OTR-relevant)
- Site Classification: Multi-signal classification into ecommerce, saas, or non_commerce
- Third-Party Queries: GLEIF, Wikidata, Finnhub, SEC, WebRisk
- Score Calculation: Apply the correct weight formula and calculate per-dimension and total scores
Periodic Rescans
Scored domains are periodically rescanned:
- Temporal rescan: At preset intervals based on domain priority
- Change-triggered rescan: When domain indicators change
- Manual rescan: Domain owners can request an immediate rescan
Score and Classification Changes
After each rescan, if signal changes are detected, scores update immediately. Classification can also change if new data shifts the commerce or subtype scores.
| Change | Effect |
|---|---|
| DNSSEC configured | S dimension increases |
| SSL certificate expired | V dimension decreases |
| Schema.org markup added | D dimension increases |
| Privacy policy removed | T dimension decreases |
| SaaS signals detected | May reclassify from ecommerce to saas |
5.9 Data Source Fault Tolerance
OTR depends on multiple third-party data sources (GLEIF, Wikidata, Finnhub, SEC, WebRisk). If a source becomes temporarily unavailable:
- Signals dependent on it are marked not_scanned and excluded from scoring
- The system uses circuit breakers: after 7 consecutive failures, it pauses that source for 24 hours
- Once the source recovers, signals are automatically re-collected on the next rescan
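The circuit-breaker behavior described above (7 consecutive failures, then a 24-hour pause) follows a well-known pattern. The class below is a minimal sketch of that pattern, not OTR's implementation: the class and method names are hypothetical, and resetting the failure count once the pause expires is an assumption.

```python
import time


class SourceCircuitBreaker:
    """Pauses a data source after repeated consecutive failures."""

    FAILURE_LIMIT = 7
    PAUSE_SECONDS = 24 * 60 * 60

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so the pause can be tested
        self._failures = 0
        self._paused_until = None

    def allow_request(self):
        """True if the source may be queried right now."""
        if self._paused_until is None:
            return True
        if self._clock() >= self._paused_until:
            # Pause expired: close the circuit and start counting afresh.
            self._paused_until = None
            self._failures = 0
            return True
        return False

    def record_success(self):
        self._failures = 0

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.FAILURE_LIMIT:
            self._paused_until = self._clock() + self.PAUSE_SECONDS
```

A caller would check `allow_request()` before each third-party query and, when it returns False, mark the dependent signals not_scanned instead of attempting the call.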
5.10 Self-Assessment: Understanding Your Score
When reviewing your domain’s score:
- Check which scoring mode applies to you (ecommerce, saas, or non_commerce)
- Identify your lowest-scoring dimension — that is your priority for improvement
- Check signals showing not_found; those are areas you can address
- For ecommerce sites: V accounts for 40%, so focus there first. D is the easiest to improve (llms.txt + Schema.org)
- For SaaS sites: S (20%) and G (23%) are weighted higher. Invest in security certifications, compliance documentation, and API documentation for D
- For sites classified as non_commerce that should be commercial: ensure your site has visible commerce signals (pricing page, payment processing, product listings)
Next Chapter: REST API Reference — Complete technical documentation for the OTR query API