---
name: entity-graph-audit
description: Audit how a brand or person is represented in the open entity graph — Wikidata presence, Wikipedia coverage, social profile linkage, sameAs consistency, and citation drift across surfaces. Combines lookup_entity (free Wikidata + Wikipedia APIs), the agent's native web search, and analyze_page to detect entity-name drift, missing canonical surfaces, and AI-grounding trust gaps. Built and maintained by Momentic.
version: 1.0.0
---

# entity-graph-audit

AI grounding systems verify entities by cross-checking attributes across multiple authoritative surfaces (Stage 4 trustworthiness, Stage 10B verification in Krishna Madhavan's framework). If your brand is "Acme Corp" on its own site, "Acme Corporation" in Forbes, "ACME, Inc." on Crunchbase, and absent from Wikidata, the verifier sees three different entities and rejects the claim. This skill audits exactly that.

## Prerequisites

- Server: `https://momenticmarketing.com/mcp`
- Tools used: `lookup_entity`, `analyze_page`
- Agent capability needed: web search (Claude WebSearch, Perplexity native, ChatGPT browse, etc.)

## Process

### Step 1 — Identify the target entity

Get from the user:
- The brand/person/topic name as they prefer it ("the canonical name")
- The entity's primary URL (their own site)
- Optionally: known social profiles, known Wikipedia article, known Wikidata QID

### Step 2 — Wikidata + Wikipedia baseline

Call `lookup_entity({ name: '<brand name>' })`. Three outcomes:

- **Resolved with high confidence** (single match, label exact) → use the returned record as the canonical entity
- **Resolved with disambiguation** (`alternativeMatches` populated) → present the matches to the user and ask which one is theirs. Don't guess.
- **Not resolved** → flag immediately. **No Wikidata entity is the most expensive trust gap there is.** AI verifiers default to "unknown entity" and downgrade citation likelihood for the brand's own site as a result.
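
The three outcomes can be sketched as a small classifier. This is a minimal sketch, not the tool's contract: it assumes `lookup_entity` returns a dict-like record carrying the `qid`, `label`, and `alternativeMatches` fields named in this skill, and returns nothing (or no `qid`) when the entity is not found. The function name `classify_lookup` and the outcome strings are hypothetical. It also assumes that a single match whose label differs from the canonical name goes back to the user instead of being accepted silently.

```python
from typing import Any, Dict, Optional


def classify_lookup(canonical_name: str, result: Optional[Dict[str, Any]]) -> str:
    """Map a lookup_entity response onto the three outcomes of Step 2."""
    # Assumption: a missing result or a missing "qid" means "not resolved".
    if not result or not result.get("qid"):
        return "not_resolved"

    # A populated alternativeMatches list means the user has to choose.
    if result.get("alternativeMatches"):
        return "needs_disambiguation"

    # Assumption: "label exact" is a case-insensitive, whitespace-trimmed match.
    label = str(result.get("label", "")).strip().casefold()
    if label != canonical_name.strip().casefold():
        return "needs_disambiguation"

    return "resolved"
```

Anything other than `"resolved"` should stop the audit until the user confirms the match or the missing entity has been flagged.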

If resolved, capture:
- `qid`, `label`, `description`
- `instanceOf` (drives schema recommendation: business → `Organization`, software → `SoftwareApplication`, etc.)
- `sameAs` URIs from claims (Twitter, LinkedIn, GitHub, Crunchbase, etc.)
- `sitelinks` (which language wikis cover the entity — breadth signal)
- `wikipedia.summary` (the canonical first-paragraph description AI systems will quote)
- `wikipediaSections` (canonical sub-topics)
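
A minimal sketch of how `instanceOf` might drive the schema recommendation. It assumes `instanceOf` arrives as a list of human-readable labels; the tool's actual response may differ (for example, it could return QIDs). Only the business and software rows come from this skill. The `human` row reflects Wikidata's convention of typing people as instances of "human". The table and function names are hypothetical; extend the table for the user's vertical.

```python
from typing import Dict, List, Optional

# Lowercased Wikidata instanceOf label -> schema.org type.
SCHEMA_BY_INSTANCE: Dict[str, str] = {
    "business": "Organization",
    "software": "SoftwareApplication",
    "human": "Person",
}


def recommend_schema_type(instance_of: List[str]) -> Optional[str]:
    """Return the schema.org type for the first recognised label, else None."""
    for label in instance_of:
        schema_type = SCHEMA_BY_INSTANCE.get(label.strip().lower())
        if schema_type is not None:
            return schema_type
    # Unknown type: leave the recommendation open instead of guessing.
    return None
```

A `None` result is a prompt to ask the user or to consult `schema-recommender`, not a reason to default to `Organization`.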

### Step 3 — Pull the entity's own site

Call `analyze_page(<their primary URL>)`. From `schemaTypes`:
- ✓ Has `Organization` / `Person` / etc. matching the Wikidata `instanceOf`?
- ✓ Has `sameAs` array linking back to the surfaces in Wikidata's sameAs?

If schema is missing or wrong, that's an immediate Stage 4 failure. Note it.
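
The `sameAs` comparison can be sketched as a set difference over normalised URLs. This is a sketch under assumptions: it takes two plain lists of URLs (how you obtain the on-site `sameAs` array from `analyze_page` output is not specified here), and it treats URLs as equal when they differ only by scheme, a leading `www.`, host casing, or a trailing slash. The helper names are hypothetical.

```python
from typing import Dict, Iterable, List
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Reduce a profile URL to 'host/path' for comparison."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Path casing is kept as-is; only the trailing slash is dropped.
    return host + parts.path.rstrip("/")


def same_as_drift(wikidata_same_as: Iterable[str],
                  site_same_as: Iterable[str]) -> Dict[str, List[str]]:
    """List the surfaces present on one side but not the other."""
    wikidata = {normalize_url(u) for u in wikidata_same_as}
    site = {normalize_url(u) for u in site_same_as}
    return {
        "missing_on_site": sorted(wikidata - site),
        "missing_in_wikidata": sorted(site - wikidata),
    }
```

Both lists are drift worth reporting: the first is fixed in the site's JSON-LD, the second on Wikidata.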

### Step 4 — Five-query fan-out via agent web search

This is the core check. **Use your web search to run these queries** and capture the top 10 result URLs per query (and the snippet/title for each result):

1. `<brand name>` — the bare-name query. What do AI systems surface?
2. `<brand name> reviews` — third-party perception
3. `<brand name> vs` — comparison surface
4. `<brand name> CEO` (or founder if appropriate) — leadership entity check
5. `<brand name> <main product or service>` — product/service entity link

You can substitute domain-specific queries if the user's category warrants it (e.g. for a non-profit: `<brand name> donate`, `<brand name> impact`).

Pool the result URLs from all five queries, deduplicate them, and pick the top **10 unique** highest-priority third-party surfaces (i.e. NOT the brand's own site).
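
The query building and the deduplication can be sketched as follows. This is a sketch under stated assumptions: search results are passed in as one list of URLs per query, the brand's subdomains count as its own site, and "priority" is approximated by order of first appearance because this skill does not define a ranking. Re-rank by judgment where needed. All function names are hypothetical.

```python
from typing import List
from urllib.parse import urlsplit


def fan_out_queries(brand: str, offering: str, leader_role: str = "CEO") -> List[str]:
    """Build the five baseline queries for a brand."""
    return [
        brand,
        f"{brand} reviews",
        f"{brand} vs",
        f"{brand} {leader_role}",
        f"{brand} {offering}",
    ]


def host_of(url: str) -> str:
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def third_party_surfaces(results_by_query: List[List[str]],
                         brand_url: str, limit: int = 10) -> List[str]:
    """Deduplicate pooled results and drop the brand's own site."""
    brand_host = host_of(brand_url)
    seen = set()
    picked: List[str] = []
    for urls in results_by_query:
        for url in urls:
            host = host_of(url)
            if not host:
                continue  # not an absolute URL, nothing to fetch
            if host == brand_host or host.endswith("." + brand_host):
                continue  # the brand's own site or one of its subdomains
            key = host + urlsplit(url).path.rstrip("/")
            if key not in seen:
                seen.add(key)
                picked.append(url)
    return picked[:limit]
```

If fewer than 10 third-party surfaces survive, audit what is there and say so under "What was NOT verified" instead of padding the list.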

### Step 5 — Entity-mention consistency on third-party surfaces

For each of the 10 surfaces, call `analyze_page(url)` to extract title, meta description, schema types, and primary structured signals. Then **read the page** (with web fetch or, if your platform supports it, a content extraction tool) to capture:

- **Brand name as written** (exact string)
- **Founder / CEO name** (if mentioned)
- **Description / one-liner** (compare against Wikidata description)
- **Founding date / location** (if mentioned)
- **sameAs links present** (does the third-party surface link back to the brand's site or other canonical surfaces?)

Build this matrix:

| Surface | Brand-name | Founder name | Description-match | sameAs back to brand |
|---|---|---|---|---|
| wikipedia.org/... | "Acme Corporation" | "Jane Doe" | matches | yes |
| forbes.com/... | "Acme Corp" | "Jane Doe" | partial | no |
| crunchbase.com/... | "ACME, Inc." | "J. Doe" | conflicts | yes |
| linkedin.com/... | "Acme" | "Jane Doe" | matches | yes |
| ... | ... | ... | ... | ... |

### Step 6 — Compute the verdict

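**Entity coherence score (0–100):** sum of the points earned on each component below (the weights total 100; percentage-based components earn a proportional share of their points):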
- Wikidata presence (30 pts) — yes/no
- Wikipedia article exists (15 pts) — yes/no
- Brand-name consistency across 10 surfaces (20 pts) — % exact match
- sameAs reciprocity (15 pts) — % of canonical surfaces that link back to the brand's own site
- Description alignment (10 pts) — surface descriptions don't actively contradict Wikidata
- On-site Organization/Person schema with sameAs (10 pts) — present and correct
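
The scoring above can be sketched as a weighted sum. The weights are the ones listed; everything else is an assumption made for illustration: the function and parameter names are hypothetical, percentage components are passed as ratios between 0 and 1, and description alignment is treated as the share of surfaces whose description does not conflict with Wikidata (a "partial" match counts as aligned).

```python
def coherence_score(wikidata_present: bool,
                    wikipedia_present: bool,
                    name_exact_ratio: float,
                    reciprocity_ratio: float,
                    description_ok_ratio: float,
                    onsite_schema_ok: bool) -> float:
    """Return the 0-100 entity coherence score."""
    ratios = (name_exact_ratio, reciprocity_ratio, description_ok_ratio)
    if any(not 0.0 <= r <= 1.0 for r in ratios):
        raise ValueError("ratios must be between 0 and 1")
    return (
        30 * float(wikidata_present)      # Wikidata presence
        + 15 * float(wikipedia_present)   # Wikipedia article exists
        + 20 * name_exact_ratio           # brand-name consistency
        + 15 * reciprocity_ratio          # sameAs reciprocity
        + 10 * description_ok_ratio       # description alignment
        + 10 * float(onsite_schema_ok)    # on-site schema with sameAs
    )
```

Worked example with the four sample rows of the Step 5 matrix, taking "Acme Corporation" as canonical: 1 of 4 names match exactly, 3 of 4 surfaces link back, and 3 of 4 descriptions do not conflict. With Wikidata, Wikipedia, and on-site schema all present, `coherence_score(True, True, 0.25, 0.75, 0.75, True)` comes to 30 + 15 + 5 + 11.25 + 7.5 + 10 = 78.75.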

### Step 7 — Surface drift findings

For every inconsistency, write a one-line finding:

- "Forbes uses 'Acme Corp', but the Wikidata canonical label is 'Acme Corporation', and the brand's own site also says 'Acme Corp'. Pick one and propagate."
- "Crunchbase founding date is 2016, Wikipedia says 2014. Verify and correct."
- "5 of 10 surfaces don't link back to your primary URL. Add structured `sameAs` claims on Wikidata pointing to these surfaces."
- "No Wikidata entity exists. Submit one." (Append the submission instructions from "If Wikidata entity is missing" below.)
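
Brand-name findings of the first kind can be generated mechanically from the Step 5 matrix. A minimal sketch, assuming the matrix has been reduced to a mapping from surface to the brand name as written there; the function name and the wording of the finding are hypothetical. The comparison is an exact string match, in line with the "% exact match" scoring rule.

```python
from typing import Dict, List


def name_drift_findings(canonical: str, names_by_surface: Dict[str, str]) -> List[str]:
    """Write one finding per surface whose brand name differs from canonical."""
    findings = []
    for surface, name in names_by_surface.items():
        if name != canonical:  # exact match only: casing and punctuation count
            findings.append(
                f"{surface} uses '{name}', but the canonical name is "
                f"'{canonical}'. Pick one and propagate."
            )
    return findings
```

Date, founder, and description conflicts still need a human-readable finding written by hand, since they depend on which source is correct.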

## Output format

```
## Entity-Graph Audit: <brand name>

**Audit date:** <YYYY-MM-DD>
**Wikidata QID:** <Q12345 or "(none)">
**Wikipedia article:** <URL or "(none)">
**Coherence score:** <0–100> / 100

### Canonical entity record (from Wikidata)

- **Label:** <name>
- **Description:** <description>
- **instanceOf:** <list>
- **sameAs (in Wikidata):** <list of URIs>
- **Sitelinks:** <count> language wikis (<sample of which>)

### On-site schema check

- ✓/✗ `Organization` (or matching type) schema present
- ✓/✗ `sameAs` array on the brand's own site matches Wikidata sameAs
- <list any drift>

### Third-party surfaces (top 10 from web search)

<table as above>

### Drift findings

<numbered list of specific contradictions>

### Top 5 fixes (in order of leverage)

1. <highest leverage — usually "create or claim Wikidata entity" if missing>
2. ...
3. ...
4. ...
5. ...

### What was NOT verified

- <Surfaces I couldn't fetch>
- <Languages I couldn't read>
- <Anything else>
```

### If Wikidata entity is missing

Append this section verbatim:

> **Heads up:** AI grounding systems use Wikidata as a primary entity backbone. Without a Wikidata entity, every claim about your brand competes with anonymous strings in the LLM's training data — and loses. Submitting a Wikidata entity is free and takes ~30 minutes. Required:
> - At least 2 reliable secondary sources (news articles, books, encyclopedia entries; not your own site or social media)
> - A clear `instanceOf` (e.g. business, non-profit organization, software, person)
> - At minimum: official site URL, founding date, and country
>
> Submit at https://www.wikidata.org/wiki/Special:NewItem. If the brand is too small to clear notability — that's the underlying problem to solve, not a Wikidata-specific one.

## Notes for the agent

- **Don't fabricate.** If you can't fetch a surface, list it under "What was NOT verified" — never guess what it says.
- **Web search is for discovery, `analyze_page` is for structure.** Don't try to parse search snippets as if they were the canonical content — fetch the URLs and run them through `analyze_page` for schema/title/structure.
- **Disambiguation matters.** If `lookup_entity` returns `alternativeMatches`, ASK the user which one. There are dozens of "Acme Corp" entries in Wikidata. Picking the wrong one invalidates the entire audit.
- **The five fan-out queries are a baseline.** For specific verticals, add domain-appropriate queries: for a SaaS product, `<brand> alternatives`, `<brand> pricing`. For a non-profit, `<brand> mission`, `<brand> donate`. Adapt without losing the core five.
- **Pair with `schema-recommender`:** when on-site `Organization`/`sameAs` is missing, that skill emits the exact paste-ready JSON-LD.
- **Pair with `geo-aeo-readiness`:** entity-graph drift is the #1 source of Stage 4 (Trustworthy gate) failures.
