High name similarity (0.94) with only descriptor drift; likely one canonical parent (Flying Monkey Statue) with variant subheads/stubs.
naming drift
Path-style links are mixed with bare-name links across NPC pages, creating inconsistent naming conventions and increasing drift risk ([[npcs/Vivian.md|Vivian]] vs [[Vivian]]).
Transcript-import artifacts present inside link targets, e.g. nested target fragments like locations/[[npcs/Arden.md and factions/Cult of [[npcs/Set.md (malformed target strings observed at scale).
Unresolved name variants (likely aliases or missing canonical pages): Archontean, Thorcin, Ioannes, Vael, Larel, Sortian, Thothian (appear as broken link targets in multiple files).
weak-evidence claims
sessions/Session 2 - Halfling Rent-Seekers.md has high uncertainty density (>=12 weak-confidence markers: “unclear/possibly/maybe/might be/unsure/inaudible/?”).
NPC entries that quote long session snippets appear to include partially ingested transcript blocks; these should be treated as low-confidence secondary evidence until recap/blog corroboration.
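The marker density noted above can be approximated with a short scan. This is a minimal sketch, not the scanner the batch job actually runs: the marker list is copied from this report, while the function name `count_uncertainty_markers` and the matching rules (case-insensitive whole words, every literal `?` counted once) are assumptions.

```python
import re

# Word/phrase markers quoted in this report; "?" is counted separately below.
WORD_MARKERS = ["unclear", "possibly", "maybe", "might be", "unsure", "inaudible"]

# Whole-word, case-insensitive match for any word/phrase marker.
_WORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(m) for m in WORD_MARKERS) + r")\b",
    re.IGNORECASE,
)


def count_uncertainty_markers(text: str) -> int:
    """Return the number of weak-confidence markers found in text."""
    # Assumption: each literal question mark counts as one marker.
    return len(_WORD_RE.findall(text)) + text.count("?")
```

A file would then be flagged when its count reaches whatever threshold the QA pass uses. Counting every `?` will overcount pages that contain ordinary questions, so the totals are best read as a ranking signal rather than a precise measure.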
structure issues
Large malformed-link footprint detected: 795 nested-link/malformed targets across vault markdown during this batch scan.
Broken link targets (top recurring): Archontean (20), Thorcin (18), Ioannes (16), The Living Wheelbarrow (12), Vael (8), plus several session/file-path variants without .md normalization.
Representative corruption pattern confirmed in npcs/Vivian.md history line containing [[locations/[[npcs/Arden.md|Arden]] Vul.md|...]].
suggested changes
Add a link-normalization cleanup pass (safe, mechanical) to repair nested-target patterns like [[foo/[[bar]]...]] before semantic QA decisions.
Standardize on one vault link style for entities (recommended: [[Entity Name]] + aliases in frontmatter) and reserve path links for disambiguation only.
Create/confirm canonical pages or aliases for high-frequency unresolved ethnonyms/titles (Archontean, Thorcin, Thothian) and principal names (Ioannes, Vael).
Consolidate Flying Monkey Statue variants into one family page with variant subheads; keep lightweight redirect/stub pages if needed.
Flag transcript-heavy sections with confidence tags (confidence: low|medium|high) to separate canon statements from uncertain raw capture text.
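The link-normalization pass suggested above could look roughly like the following. It is a sketch under stated assumptions, not an agreed repair rule: it assumes that a link nested inside another link's body should collapse to the text it would have displayed, and that unclosed links are left untouched for manual review. The names `flatten_nested_links` and `_display_text` are hypothetical.

```python
def _display_text(inner: str) -> str:
    """Text an inner link would render: its alias if present, else the page name."""
    if "|" in inner:
        return inner.split("|", 1)[1]
    name = inner.rsplit("/", 1)[-1]
    return name[:-3] if name.endswith(".md") else name


def flatten_nested_links(text: str) -> str:
    """Replace wiki links nested inside another link's body with their display text."""
    stack = [[]]  # stack[0] collects top-level text; deeper entries are open links
    i = 0
    while i < len(text):
        if text.startswith("[[", i):
            stack.append([])
            i += 2
        elif text.startswith("]]", i) and len(stack) > 1:
            inner = "".join(stack.pop())
            if len(stack) > 1:
                stack[-1].append(_display_text(inner))  # nested link: flatten
            else:
                stack[-1].append("[[" + inner + "]]")   # top-level link: keep
            i += 2
        else:
            stack[-1].append(text[i])
            i += 1
    # Unclosed links cannot be repaired safely; re-emit them unchanged.
    while len(stack) > 1:
        inner = "".join(stack.pop())
        stack[-1].append("[[" + inner)
    return "".join(stack[0])
```

Applied to the confirmed pattern from npcs/Vivian.md, `[[locations/[[npcs/Arden.md|Arden]] Vul.md|...]]` becomes `[[locations/Arden Vul.md|...]]`. Well-formed links such as `[[Vivian]]` pass through unchanged, which is what keeps the pass mechanical and safe to rerun.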
Batch Run — 2026-03-13 19:19 UTC (cron dd4fc190, analysis-only)
Mixed target styles remain widespread ([[Entity]], [[folder/Entity.md|Entity]], and malformed nested forms), increasing false duplicate signals during QA.
Many path-like session references are not normalized (e.g., raw sessions/Session 34a ... targets), causing avoidable unresolved-link noise.
weak-evidence claims
High-uncertainty file remains the primary weak-evidence hotspot: sessions/Session 2 - Halfling Rent-Seekers.md (9+ uncertainty markers in latest pass).
Transcript-derived assertions in character pages remain weakly grounded when they include long quoted blocks from recording-note style text without recap/blog corroboration.
Classification guidance for these claims: default to hold-for-review unless corroborated in recap/session canon pages.
Structural takeaway: link-shape corruption is broad enough that semantic QA decisions are being masked by syntax noise.
suggested changes
Run a mechanical link-shape repair pass first (nested [[...[[...]]...]] and path-target normalization) before deeper entity reconciliation.
Add/confirm canonical alias mappings for top unresolved targets (Archontean, Thorcin, Thothian, Ioannes, Vael, Larel, Sortian) to reduce repeated false-positive “new entity” detections.
Introduce a lightweight Session Reference Index page listing canonical session filenames + aliases (24a, 24b, etc.) for safer auto-linking.
Apply hold-for-review tags to transcript-heavy claims lacking recap corroboration, especially in early-session notes with explicit uncertainty markers.
After syntax cleanup, rerun vault-wide QA and promote only unambiguous low-risk merges (starting with Flying Monkey Statue variant consolidation).
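The Session Reference Index idea above can be sketched as a lookup from a short session key (2, 8a, 34c) to the exact vault filename. Everything here is illustrative: the function names are hypothetical, and the key pattern assumes titles start with "Session", a number, and an optional single letter.

```python
import re

# Matches the session key at the start of a title, e.g. "8a" in "Session 8a - ...".
_KEY_RE = re.compile(r"^Session (\d+[a-z]?)\b", re.IGNORECASE)


def session_key(target: str):
    """Return the lowercase session key ("8a", "16") or None if there is none."""
    name = target.rsplit("/", 1)[-1]  # drop any folder prefix
    match = _KEY_RE.match(name)
    return match.group(1).lower() if match else None


def build_session_index(filenames):
    """Map each session key to its exact vault filename."""
    index = {}
    for path in filenames:
        key = session_key(path)
        if key is not None:
            index[key] = path
    return index


def normalize_session_target(target: str, index) -> str:
    """Return the canonical filename for a session link, or the target unchanged."""
    return index.get(session_key(target), target)


index = build_session_index([
    "sessions/Session 2 - Halfling Rent-Seekers.md",
    "sessions/Session 8a - Never Trust a Scorpion.md",
])
print(normalize_session_target("Session 8a", index))
```

Combined titles that cover two sessions would resolve only by their first key under this pattern, so they would need an explicit alias entry instead.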
Batch Run — 2026-03-13 19:29 UTC (cron dd4fc190, analysis-only)
duplicates/aliases
Confirmed cross-type duplicate page title:
npcs/Angry scary ghost.md
monsters/Angry scary ghost.md
Candidate canonical match: same encounter/session scope (Session 6) with overlapping identity.
Decision label: merge-into-existing (or rename-to-canonical if monster taxonomy requires separate naming).
Systemic namespace collision pages (expected but noisy for automation):
Session-link naming remains inconsistent (Session 34a..., Session 34c..., Session 42b...) with missing normalization to exact filenames, increasing false broken-link counts.
Lore-note references (Recording 2026-02-13, Recording 2026-02-06, Recording 2026-01-30) are heavily linked by shorthand names rather than canonical page/file targets.
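The cross-type duplicate reported in this batch (the same title under npcs/ and monsters/) is the kind of collision a normalized-title grouping can surface. The sketch below is one plausible way to do it, not the detector used for this report; `normalized_title` and `cross_folder_duplicates` are hypothetical names, and the normalization (lowercase, collapsed whitespace) is an assumption.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def normalized_title(path: str) -> str:
    """Lowercase file stem with whitespace collapsed, e.g. "angry scary ghost"."""
    return " ".join(PurePosixPath(path).stem.lower().split())


def cross_folder_duplicates(paths):
    """Return {normalized title: sorted paths} for titles found in more than one folder."""
    groups = defaultdict(set)
    for path in paths:
        groups[normalized_title(path)].add(path)
    return {
        title: sorted(members)
        for title, members in groups.items()
        if len({PurePosixPath(p).parent for p in members}) > 1
    }


print(cross_folder_duplicates([
    "npcs/Angry scary ghost.md",
    "monsters/Angry scary ghost.md",
    "npcs/Vivian.md",
]))
```

Structural pages that legitimately repeat in every folder would show up in the same grouping, so they should be filtered out before the results are read as merge candidates.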
weak-evidence claims
Terms like Vael, Larel, and Sortian currently appear mostly as unresolved references; absent clear canonical pages, classify incoming transcript-derived additions using these tokens as hold-for-review until corroborated.
The Living Wheelbarrow appears as a repeated unresolved target (12 refs) but has no clear canonical anchor in this pass; treat as a weak-evidence alias candidate pending source cross-check.
structure issues
Frontmatter quality issue is widespread: 307 markdown files currently contain duplicate tag entries (e.g., repeated npc in many NPC pages).
Duplicate-tag issue is low semantic risk but high maintenance noise for tooling that relies on deduplicated metadata.
Broken-link leaderboard continues to mix true missing pages with alias/name-style drift, indicating the need for an alias index before further semantic reconciliation.
suggested changes
Add a QA pre-filter that excludes structural docs (Index, README) from duplicate-entity scans.
Resolve Angry scary ghost cross-type duplication by selecting one canonical page and converting the other to a stub/alias.
Create a canonical alias map for high-frequency unresolved names (Archontean, Thorcin, Ioannes, Mithric, Vael, Larel, Sortian, Thothian).
Run a safe metadata cleanup pass to deduplicate frontmatter tags arrays (mechanical, low-risk, broad payoff).
Normalize session references to exact vault filenames (including lettered sessions like 34a/34c/42b) before next semantic QA batch.
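The tag cleanup suggested above reduces to an order-preserving dedupe once the frontmatter has been parsed. A minimal sketch follows, assuming the tags are already available as a list of strings (reading and writing the YAML itself is left to whatever tooling the vault uses); `dedupe_tags` is a hypothetical name.

```python
def dedupe_tags(tags):
    """Remove repeated tags, keeping the first occurrence and the original order."""
    # dict preserves insertion order, so this keeps the first copy of each tag.
    return list(dict.fromkeys(tags))


def has_duplicate_tags(tags) -> bool:
    """True if the list contains at least one repeated tag."""
    return len(set(tags)) != len(tags)


print(dedupe_tags(["npc", "human", "npc"]))
```

The comparison is case-sensitive, so `npc` and `NPC` would be kept as two distinct tags. Whether those should be merged is a naming decision rather than a mechanical one.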
Batch Run — 2026-03-13 19:39 UTC (cron dd4fc190, analysis-only)
duplicates/aliases
Cross-type duplicate still unresolved (high-confidence merge candidate):
Additional weak-evidence concentration detected in transcript-source material:
lore/recording-notes/Recording 2025-04-04.md (9)
sessions/Session 1 - First Visit to the Ruins of Arden Vul.md (8)
Reconciliation guidance unchanged: classify transcript-derived additions from these pages as hold-for-review unless recap/canon corroborates.
structure issues
Orphan-link surface is large: 127 content pages currently show zero inbound wiki links in this pass.
Representative orphans include: npcs/Jarnno the False.md, npcs/Domo Gribble.md, npcs/Voice of Thoth.md, npcs/Egill Flat-nose.md, npcs/Bastet.md.
While some may be legitimate edge entities, this scale suggests index/alias coverage gaps rather than purely intentional isolation.
suggested changes
Resolve Angry scary ghost as the next unambiguous, low-risk semantic reconciliation (single canonical page + stub on the deprecated path).
Add alias/index entries for the top unresolved drift names (Archontean, Thorcin, Ioannes, Lacrymosa, Thothian) before running deeper duplicate detection.
Build a mechanical normalizer for session-title links to exact existing filenames (particularly split-letter sessions like 34a/34c/42b).
Generate an “orphan triage” checklist from the 127 zero-inbound pages, starting with NPCs that are expected to be discoverable via index pages.
Keep transcript-heavy claims in hold-for-review until corroborated by recap/session canon sources.
Batch Run — 2026-03-13 19:49 UTC (cron dd4fc190, analysis-only)
QA report files themselves are now appearing in some automated weak-evidence/broken-link scans; these should be excluded from reconciliation metrics to avoid self-noise.
suggested changes
Reclassify Magae and Irthuin as lore-canonical entities; convert npcs/* versions to lightweight stubs or merge content into lore/* pages.
Keep Angry scary ghost queued as next semantic merge after link-shape cleanup.
Add a QA scanner exclusion list for vault/notes/* to prevent report text from polluting weak-evidence and broken-link counts.
Continue mechanical nested-link normalization first, then rerun semantic reconciliation so duplicate/alias decisions are based on clean link targets.
Add explicit alias entries for high-frequency unresolved ethnonyms/titles (Archontean, Thorcin, Thothian) before next batch.
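The scanner exclusion list suggested above can be expressed as a small set of glob patterns checked before any file is counted. This is a sketch only: the pattern list is an assumption built from the paths and structural docs named in this report (both the `vault/notes/` and `notes/` spellings are included because the report uses both), and `is_excluded` and `filter_scannable` are hypothetical names.

```python
from fnmatch import fnmatchcase

# Assumed patterns: QA report notes plus structural docs (Index, README).
EXCLUDE_PATTERNS = [
    "vault/notes/*",
    "notes/*",
    "Index.md",
    "README.md",
    "*/Index.md",
    "*/README.md",
]


def is_excluded(path: str, patterns=EXCLUDE_PATTERNS) -> bool:
    """True if path matches any exclusion glob (note: "*" also matches "/")."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def filter_scannable(paths, patterns=EXCLUDE_PATTERNS):
    """Keep only the paths that should count toward QA metrics."""
    return [path for path in paths if not is_excluded(path, patterns)]


print(filter_scannable([
    "notes/Arden Vul Vault-Wide QA Batch Report.md",
    "npcs/Vivian.md",
    "README.md",
]))
```

Keeping the patterns in one list means the weak-evidence, broken-link, and duplicate scans can share the same filter instead of each carrying its own exceptions.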
Batch Run — 2026-03-13 19:59 UTC (cron dd4fc190, analysis-only)
duplicates/aliases
Duplicate set remains stable and focused (3 semantic clusters):
Irthuin/Magae → rename-to-canonical (prefer lore/* canon) with stubs at legacy NPC paths.
naming drift
Top unresolved targets are still concentrated in a small recurring set:
Archontean (20), Thorcin (18), Ioannes (16), The Living Wheelbarrow (12), Vael (8), Lacrymosa (7), Larel (7), Sortian (7), Thothian (6).
Session-title drift remains a major false-positive source:
Session 34a - Hunting the Thane (9)
Session 42b - Neferet and the Wraiths (7)
Session 34c - Burglary and Death (7)
Session 35 - The Scepter - Flute of the Goblins (6)
Additional unresolved proper nouns worth alias triage this pass: Obsidian Gates (6), Order of the Azure Shield (5), Kerbog Khan (4), Huge Green Dragon (4).
weak-evidence claims
sessions/Session 1 - First Visit to the Ruins of Arden Vul.md (12)
lore/recording-notes/Recording 2025-04-04.md (9)
sessions/Session 8a - Never Trust a Scorpion.md (8)
sessions/Session 6 - Good Ghost, Bad Ghost.md (8)
sessions/Session 21 - The Library of Thoth.md (8)
sessions/Session 16 - Random Scorpion Teleport to the Hall of Judgment.md (8)
Reconciliation policy remains: transcript-derived additions from these pages default to hold-for-review without recap/canon corroboration.
structure issues
The duplicate landscape is now clearly semantic rather than volumetric (only 3 non-structural normalized-title clusters detected vault-wide).
Broken-link/top-missing counts remain dominated by alias/session-link normalization gaps rather than clear absent-content gaps.
Arden.txt appears as a repeated unresolved target (5 refs), indicating a likely path/artifact leakage into wiki links.
suggested changes
Execute the three unambiguous duplicate reconciliations in one controlled pass (Angry scary ghost, Irthuin, Magae) with canonical stubs to preserve backlinks.
Build a small alias map for the new secondary unresolved set (Obsidian Gates, Order of the Azure Shield, Kerbog Khan, Huge Green Dragon) after the primary drift names.
Add a session-link normalizer rule for lettered session titles and long hyphenated names (especially Session 35 title variant).
Treat Arden.txt references as structural contamination; normalize/remove as a mechanical cleanup prior to further semantic QA.
Keep strict hold-for-review handling for claims sourced from the seven high-uncertainty files listed above.
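The alias map suggested above can be kept as plain data and applied at classification time. The sketch below assumes a canonical-page-to-aliases mapping and case-insensitive matching; `build_alias_index` and `resolve` are hypothetical names, and the single mapping in the usage example is a placeholder for illustration, not a canon decision.

```python
def _norm(name: str) -> str:
    """Case-insensitive, whitespace-collapsed key for a link target."""
    return " ".join(name.lower().split())


def build_alias_index(alias_map):
    """alias_map: canonical page -> list of aliases. Returns normalized name -> canonical."""
    index = {}
    for canonical, aliases in alias_map.items():
        for name in [canonical, *aliases]:
            index[_norm(name)] = canonical
    return index


def resolve(target: str, index):
    """Return the canonical page for target, or None so the caller can hold it for review."""
    return index.get(_norm(target))


# Placeholder mapping for illustration only.
index = build_alias_index({"Archontean Empire": ["Archontean"]})
print(resolve("archontean", index))
print(resolve("Kerbog Khan", index))
```

Returning `None` for unknown names, rather than guessing, keeps unmapped targets in the hold-for-review path instead of silently creating new entities.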
Batch Run — 2026-03-13 20:09 UTC (cron dd4fc190, analysis-only)
duplicates/aliases
Non-structural duplicate set is unchanged (still only 3 clusters):
This stability suggests the remaining QA risk is mostly link corruption/alias drift, not discovery of new duplicate entities.
naming drift
Recurring unresolved names remain concentrated and stable:
Archontean (20), Thorcin (18), Ioannes (16), The Living Wheelbarrow (12), Vael (8), Larel (7), Sortian (7).
Session-title target drift persists in raw-target links (lettered/long-title sessions):
Session 34a - Hunting the Thane (9), Session 42b - Neferet and the Wraiths (7), Session 34c - Burglary and Death (7), Session 35 - The Scepter - Flute of the Goblins (6).
Corruption-heavy path fragments continue to dominate missing-target counts (e.g., locations/[[npcs/Arden, factions/Cult of [[npcs/Set), indicating syntax drift is still the primary upstream issue.
weak-evidence claims
Weak-confidence marker scan (QA-note files are not treated as canon evidence) still flags early/mid campaign sessions as the highest-risk sources:
sessions/Session 2 - Halfling Rent-Seekers.md (9)
sessions/Session 16 - Random Scorpion Teleport to the Hall of Judgment.md (7)
sessions/Session 6 - Good Ghost, Bad Ghost.md (6)
sessions/Session 19 - The Pool of Donkey Ears.md (5)
sessions/Session 8a - Never Trust a Scorpion.md (5)
Claims sourced primarily from these pages should remain hold-for-review unless corroborated by cleaner recap/canon pages.
structure issues
Malformed nested-link footprint increased slightly: 153 files with nested/malformed wiki links this pass.
Highest malformed-link density:
locations/Great Cavern.md (50)
pcs/Ioannes Grammatikos Byzantios.md (33)
npcs/Thoth.md (32)
pcs/Vallium Halcyon.md (31)
sessions/Session 32 - Fast Exploration.md (27)
Missing-target leaderboard remains dominated by malformed fragments rather than clear absent canonical pages, confirming structure cleanup should precede semantic reconciliation.
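A per-file density ranking like the one above can be approximated by counting link openers that occur while another link is still open. This is a sketch, and its counts will not necessarily match the report's figures, since the exact definition of a malformed target used by the scan is not stated here. It assumes wiki links never span lines, so nesting depth resets at each line break. The function names are hypothetical.

```python
from collections import Counter


def count_nested_openers(text: str) -> int:
    """Count "[[" openers that appear while another wiki link is still open."""
    nested = 0
    for line in text.splitlines():
        depth = 0  # assumption: links do not span lines, so reset per line
        i = 0
        while i < len(line):
            if line.startswith("[[", i):
                if depth > 0:
                    nested += 1
                depth += 1
                i += 2
            elif line.startswith("]]", i):
                depth = max(depth - 1, 0)
                i += 2
            else:
                i += 1
    return nested


def malformed_leaderboard(pages, top=5):
    """pages maps path -> text; returns [(path, count)] for the worst files."""
    counts = Counter({path: count_nested_openers(text) for path, text in pages.items()})
    return [(path, n) for path, n in counts.most_common(top) if n > 0]
```

Resetting depth per line keeps a single unclosed `[[` from inflating the count for every later link in the same file.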
suggested changes
Keep semantic reconciliation queue focused on the 3 stable duplicate clusters; defer broader merge actions until link-shape cleanup reduces false matches.
Prioritize mechanical repair rules for top corruption signatures (locations/[[npcs/..., factions/...[[npcs/..., double-embedded NPC names) before next entity-level QA pass.
Add scanner-level exclusions for vault/notes/* in weak-evidence tallies so QA report text does not self-inflate uncertainty metrics.
Build a targeted alias table for persistent unresolved names (Archontean, Thorcin, Ioannes, The Living Wheelbarrow, Vael, Larel, Sortian) and apply during reconciliation classification.
After structural normalization, rerun duplicate detection and promote only unambiguous low-risk merges/renames.
Batch Run — 2026-03-13 20:19 UTC (cron dd4fc190, analysis-only)
duplicates/aliases
Duplicate set remains tight and unchanged (3 non-structural clusters):
Drift note: some unresolved names (e.g., Arden Vul, Cult of Set, Thoth) likely represent alias/namespace choices rather than absent content; they should be normalized through alias mapping rather than treated as new-entity proposals.
weak-evidence claims
sessions/Session 1 - First Visit to the Ruins of Arden Vul.md (9)
sessions/Session 16 - Random Scorpion Teleport to the Hall of Judgment.md (8)
sessions/Session 6 - Good Ghost, Bad Ghost.md (7)
sessions/Session 25 - Looking for the Back Door to the Forum of Set.md (7)
sessions/Session 28 - Teleport Rugs and Baboons.md (7)
Reconciliation classification guidance unchanged: transcript-derived claims from these files default to hold-for-review unless corroborated by stronger recap/canon pages.
Structural corruption is still large enough to produce misleading missing-target leaderboards, so deeper semantic reconciliation continues to be partially blocked by syntax noise.
suggested changes
Keep duplicate reconciliation queue constrained to the 3 stable, high-confidence clusters; avoid broad merge sweeps until malformed-link cleanup reduces false matches.
Prioritize mechanical cleanup for top corruption signatures (locations/[[npcs/..., factions/...[[npcs/..., double-embedded proper names) as the next low-risk/high-impact maintenance pass.
Add/expand alias mappings for high-volume unresolved canonical concepts (Arden Vul, Cult of Set, Thoth, Archontean/Archontean Empire) to reduce naming-drift false positives.
Maintain strict hold-for-review on claims sourced mainly from the six highest-uncertainty session files listed above.
After link-shape normalization, rerun vault-wide QA and only then promote additional low-risk semantic renames/merges.
Batch Run — 2026-03-13 20:29 UTC (cron dd4fc190, analysis-only)
duplicates/aliases
Semantic duplicate set remains exactly three clusters (stable for 50+ minutes):
weak-evidence claims
sessions/Session 1 - First Visit to the Ruins of Arden Vul.md (12)
lore/recording-notes/Recording 2025-04-04.md (9)
sessions/Session 21 - The Library of Thoth.md (8)
sessions/Session 16 - Random Scorpion Teleport to the Hall of Judgment.md (8)
sessions/Session 6 - Good Ghost, Bad Ghost.md (8)
sessions/Session 8a - Never Trust a Scorpion.md (8)
Reconciliation policy remains: transcript-derived additions from these sources default to hold-for-review unless corroborated by recap/canon pages.
structure issues
Malformed nested-link footprint remains severe: 784 malformed targets across 149 files this pass.
Highest malformed-link density files:
locations/Great Cavern.md (50)
pcs/Ioannes Grammatikos Byzantios.md (33)
pcs/Vallium Halcyon.md (31)
sessions/Session 32 - Fast Exploration.md (27)
npcs/Thoth.md (26)
Structural corruption still dominates broken-link signal, limiting confidence for broader semantic reconciliation.
suggested changes
Keep semantic reconciliation limited to the 3 stable duplicate clusters until malformed-link counts materially drop.
Triage lore/The Archontean Calendar.md vs lore/Arden Vul The Archontean Calendar.md as a targeted merge candidate, preferring the cleaner canonical page and preserving unique facts.
Prioritize mechanical cleanup of highest-impact malformed-link files (Great Cavern, Ioannes Grammatikos Byzantios, Vallium Halcyon, Session 32, Thoth).
Build/expand alias mappings for persistent unresolved names (Archontean, Thorcin, Ioannes, The Living Wheelbarrow, Vael, Lacrymosa, Larel, Sortian).
Maintain strict hold-for-review treatment for claims sourced from top uncertainty files until corroboration exists.
Batch Run — 2026-03-13 20:49 UTC (cron dd4fc190, analysis-only)
duplicates/aliases
Non-structural duplicate set remains exactly 3 clusters (stable):
weak-evidence claims
sessions/Session 1 - First Visit to the Ruins of Arden Vul.md (12)
lore/recording-notes/Recording 2025-04-04.md (9)
sessions/Session 21 - The Library of Thoth.md (8)
sessions/Session 16 - Random Scorpion Teleport to the Hall of Judgment.md (8)
sessions/Session 6 - Good Ghost, Bad Ghost.md (8)
sessions/Session 8a - Never Trust a Scorpion.md (8)
Any transcript-derived additions sourced primarily from these files should remain hold-for-review unless corroborated by recap/canon pages.
structure issues
Malformed nested-link footprint remains severe and flat: 800 malformed targets across 152 files this pass.
Highest malformed-link density files currently:
locations/Great Cavern.md (50)
pcs/Ioannes Grammatikos Byzantios.md (33)
pcs/Vallium Halcyon.md (31)
npcs/Thoth.md (27)
sessions/Session 32 - Fast Exploration.md (27)
sessions/Session 26 - The Scouring of the Shire.md (26)
sessions/Session 24b - The Set Cult Strikes Back, Larel's Stuff, and the Hall of Shrines.md (25)
pcs/Vaelethron 'Vael' Sunshadow.md (22)
suggested changes
Keep semantic reconciliation constrained to the 3 stable duplicate clusters until malformed-link counts decline materially.
Prioritize mechanical cleanup in the eight highest-density malformed-link files to maximize signal gain for the next QA pass.
Add alias mappings for persistent unresolved names (Archontean, Thorcin, Ioannes, The Living Wheelbarrow, Vael, Lacrymosa, Larel, Sortian) before broad merge decisions.
Add a session-title linker normalizer for letter-suffixed sessions and long hyphenated titles to reduce recurring broken-target noise.
Maintain strict hold-for-review for claims sourced from the listed high-uncertainty files until corroboration is available.
Batch Run — 2026-03-13 20:59 UTC (cron dd4fc190, analysis-only)
duplicates/aliases
Semantic duplicate set remains stable (3 confirmed clusters):
Archontean Calendar split now looks like a high-confidence duplicate pair after content check:
lore/The Archontean Calendar.md
lore/Arden Vul The Archontean Calendar.md
The second page appears transcript-derived and malformed-link contaminated; recommendation remains merge-into-existing with only unique GM-note details retained.
naming drift
Top unresolved targets continue to cluster around the same names/titles:
Archontean (20), Thorcin (18), Ioannes (16), The Living Wheelbarrow (12), Vael (8), Lacrymosa (7), Larel (7), Sortian (7).
Session-title references are still drifting as raw titles rather than canonical filenames:
Session 34a - Hunting the Thane (9)
Session 42b - Neferet and the Wraiths (7)
Session 34c - Burglary and Death (7)
Session 35 - The Scepter - Flute of the Goblins (6)
weak-evidence claims
Highest uncertainty-marker files this pass:
sessions/Session 2 - Halfling Rent-Seekers.md (9)
sessions/Session 16 - Random Scorpion Teleport to the Hall of Judgment.md (7)
sessions/Session 6 - Good Ghost, Bad Ghost.md (6)
sessions/Session 8a - Never Trust a Scorpion.md (5)
sessions/Session 31 - I Want to Believe.md (5)
sessions/Session 25 - Looking for the Back Door to the Forum of Set.md (5)
Keep transcript-derived additions from these pages at hold-for-review unless recap/canon corroborates.
Additional drift indicator: The Archontean Calendar.md appears as an unresolved target (23 refs), consistent with title/namespace inconsistency around the Archontean Calendar pages.
Current marker scan highlights uncertainty-heavy session notes:
sessions/Session 2 - Halfling Rent-Seekers.md (9)
sessions/Session 16 - Random Scorpion Teleport to the Hall of Judgment.md (7)
sessions/Session 6 - Good Ghost, Bad Ghost.md (6)
notes/Arden Vul Vault-Wide QA Batch Report.md now appears in weak-evidence counts due to natural-language wording in the report itself; this is meta-noise, not canon risk.
structure issues
Nested/malformed wiki-link footprint remains high and steady: 807 malformed link targets across 151 files.
Highest-density files this run:
locations/Great Cavern.md (50)
pcs/Ioannes Grammatikos Byzantios.md (33)
pcs/Vallium Halcyon.md (31)
npcs/Thoth.md (27)
sessions/Session 32 - Fast Exploration.md (27)
Structural interpretation: syntax corruption is still the primary blocker; semantic reconciliation quality is capped until link-shape cleanup runs first.
suggested changes
Keep semantic duplicate decisions queued (no change) and prioritize a mechanical nested-link repair pass first.
Add scanner exclusions for vault/notes/* when computing weak-evidence and broken-link leaderboards to reduce self-generated report noise.
After mechanical cleanup, rerun reconciliation and then execute unambiguous low-risk merges in this order:
Angry scary ghost merge
Irthuin canonicalization to lore/
Magae canonicalization to lore/
Add/confirm alias mappings for top clean unresolved names (Archontean, Thorcin, Ioannes, The Living Wheelbarrow) once malformed-target noise is reduced.
Batch Run — 2026-03-13 21:39 UTC (cron dd4fc190, analysis-only)
duplicates/aliases
Semantic duplicate cluster remains unchanged and high-confidence:
Top malformed target fragments are still parser-hostile nested forms (locations/[[npcs/Arden.md, factions/Cult of [[npcs/Set.md, etc.), continuing to mask semantic QA signals.
suggested changes
Continue to defer semantic merges until after mechanical nested-link repair; syntax noise remains the primary reconciliation blocker.
Keep the queued low-risk semantic actions in the same order after cleanup:
Angry scary ghost merge
Irthuin canonicalization to lore/
Magae canonicalization to lore/
Add explicit alias/path normalization for top unresolved clean names (Archontean, Thorcin, Ioannes, The Living Wheelbarrow, Vael).
Run a dedicated session-link normalizer for lettered and long-title session pages (34a/34c/35/42b) to reduce recurring broken-link noise.
Keep vault/notes/* excluded from weak-evidence and malformed-link leaderboards to avoid report-induced metric drift.
Batch Run — 2026-03-13 21:49 UTC (cron dd4fc190, analysis-only)
No new non-structural same-title duplicates surfaced in this batch.
naming drift
Top unresolved clean-name targets (still recurring):
Arden Vul (128), Gosterwick (117), Cult of Set (43), Archontean Empire (26), Archontean (20), Narsileon (19).
Long-form session-title targets continue to appear as raw unresolved links (e.g., Session 8b and 9 - Muirasso's Tomb and the Broken Head at 39 refs), suggesting title/alias normalization drift rather than net-new entity creation.
Persistent entity-name drift remains visible around known canonical PCs/NPCs (Wicktrimmer, Demma, Thoth) due to mixed link styles and malformed nested targets.
weak-evidence claims
sessions/Session 1 - First Visit to the Ruins of Arden Vul.md (12)
lore/recording-notes/Recording 2025-04-04.md (9)
sessions/Session 8a - Never Trust a Scorpion.md (8)
sessions/Session 6 - Good Ghost, Bad Ghost.md (8)
Reconciliation policy remains: transcript-heavy claims from these files should default to hold-for-review unless corroborated in stronger recap/canonical sources.
structure issues
Malformed nested-link surface remains the dominant blocker: 793 malformed targets across 149 files (excluding vault/notes/*).
Highest malformed-link density this batch:
locations/Great Cavern.md (50)
pcs/Ioannes Grammatikos Byzantios.md (33)
pcs/Vallium Halcyon.md (31)
npcs/Thoth.md (27)
sessions/Session 32 - Fast Exploration.md (27)
Most frequent malformed target fragments are still parser-hostile nested forms: