Phase 05 · Deep-DAG composition

The Archivist uses two packaged deep-DAGs:

book-search-fanout — the full 4-source scout cluster (extract query, decide tools, 4 parallel scouts, rank, merge, record, gate, recall). Placed three times in the parent: on-topic-search, author-search, and similar-search.
compose-retry-loop — the compose / validate / retry / respond terminal. Placed once as compose-loop; every successful search branch converges on it.

The parent DAG references both deep-DAGs by name via .deepDAG(placementName, dagName, routes, options). Each placement has its own stateMapping.output that copies the deep-DAG's writes back into the named parent state fields.
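To make the copy-back step concrete, here is a minimal sketch of what an output mapping does conceptually. `applyOutputMapping`, the `State` alias, and the key direction (parent field on the left, child field on the right) are assumptions made for illustration; this is not Dagonizer's implementation.

```typescript
// Hypothetical model of stateMapping.output; not the Dagonizer implementation.
type State = Record<string, unknown>;
// Assumed direction: key = parent field, value = child field.
type OutputMapping = Record<string, string>;

function applyOutputMapping(parent: State, child: State, mapping: OutputMapping): State {
  const next: State = { ...parent }; // leave the caller's parent object untouched
  for (const parentField of Object.keys(mapping)) {
    const childField = mapping[parentField];
    if (childField !== undefined && childField in child) {
      next[parentField] = child[childField];
    }
  }
  return next;
}

const parentState: State = { query: 'books about tides', shortlist: [] };
const childState: State = {
  query: 'books about tides',
  shortlist: ['Title A'],
  scratch: 'internal to the child',
};

// Only 'shortlist' is listed, so 'scratch' never reaches the parent.
const merged = applyOutputMapping(parentState, childState, { shortlist: 'shortlist' });
```

Listing the fields explicitly keeps the child's intermediate state out of the parent, which is why each placement in this phase names every field it wants back.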

Flow

Deep-DAG: the packaged fan-out cluster

/**
 * BookSearchFanoutDAG — reusable query-extract + 4-source parallel scout cluster.
 *
 * Internal flow:
 *
 *   bsf-extract-query
 *     └─ success ──► bsf-decide-tools
 *   bsf-decide-tools
 *     └─ (tools | no-tools) ──► book-search-fan-out (parallel, combine: collect)
 *          ├─ bsf-ol       (OpenLibrary)
 *          ├─ bsf-gb       (Google Books)
 *          ├─ bsf-subject  (Subject search)
 *          └─ bsf-wiki     (Wikipedia enrichment)
 *     └─ bsf-rank-candidates
 *     └─ bsf-merge-candidates
 *          ├─ ranked ──► bsf-record-findings
 *          └─ empty  ──► bsf-no-results (collects error → deep-DAG exits error)
 *     └─ bsf-record-findings
 *     └─ bsf-has-citations-gate
 *          ├─ pass ──► bsf-recall-past-visits ──► END (success)
 *          └─ fail ──► bsf-no-results (collects error → deep-DAG exits error)
 *
 * Outputs:
 *   success — query extracted, candidates found, ranked, recorded, and recalled
 *   error   — no candidates after merge, or citations gate failed;
 *             signalled via collectError on childState so executeDeepDAG
 *             routes the parent to its 'error' branch
 *
 * Molecular import pattern:
 *   import { BookSearchFanoutDAG, registerBookSearchFanoutNodes } from './deepdags/BookSearchFanoutDAG.ts';
 *   registerBookSearchFanoutNodes(dispatcher);
 *   dispatcher.registerDAG(BookSearchFanoutDAG);
 *
 * The deep-DAG reads `state.query` and writes `state.terms`, `state.toolPlan`,
 * `state.candidates`, `state.shortlist`, `state.priorContext`, and (on the
 * no-results path) `state.failureCause`. Each parent placement copies these
 * fields back through `stateMapping.output`, so every intent branch in the
 * parent DAG sees the fields it expects.
 *
 * Three placements of this DAG replace three inlined fan-out clusters in
 * the parent `the-archivist` DAG. One definition, three usages:
 *   on-topic-search  — general web book search
 *   author-search    — author body-of-work search
 *   similar-search   — recommend-similar fan-out
 *
 * Reviews and describe branches are inlined in the parent because they use
 * distinct post-scout steps (rankByRating and pickBestMatch respectively).
 */

import type { ArchivistState }    from '../ArchivistState.ts';
import { decideTools }       from '../nodes/decideTools.ts';
import { extractQuery }      from '../nodes/extractQuery.ts';
import { hasCitationsGate }  from '../nodes/hasCitationsGate.ts';
import { mergeCandidates }   from '../nodes/mergeCandidates.ts';
import { rankCandidates }    from '../nodes/rankCandidates.ts';
import { recallPastVisits }  from '../nodes/recallPastVisits.ts';
import { recordFindings }    from '../nodes/recordFindings.ts';
import {
  openLibraryScout,
  googleBooksScout,
  subjectScout,
  wikipediaScout,
} from '../nodes/scouts.ts';
import type { ArchivistServices } from '../services.ts';

import type { NodeInterface, Dagonizer } from '@noocodex/dagonizer';
import { DAGBuilder } from '@noocodex/dagonizer/builder';
import type { DAG } from '@noocodex/dagonizer/entities';

/**
 * Internal terminal node that collects a recoverable error and exits.
 *
 * Used when the fan-out cluster finds no usable candidates — either
 * because merge produced an empty shortlist, or because the citations
 * gate found nothing written in the state graph. Collecting the error
 * causes `executeDeepDAG` to route the parent placement to its `error`
 * branch so the parent can dispatch to its own empty-result handling.
 */
const bsfNoResults: NodeInterface<ArchivistState, 'no-results', ArchivistServices> = {
  'name':    'bsf-no-results',
  'outputs': ['no-results'],
  async execute(state, context) {
    context.services.logger.warn('book-search-fanout: no candidates found — routing error to parent');
    if (state.failureCause.trim().length === 0) {
      // No cause was accumulated by scouts — synthesise a generic one.
      state.failureCause = 'No candidates found after searching all available sources. ';
    }
    state.collectError({
      'code':        'NO_CANDIDATES',
      'message':     'book-search-fanout found no usable candidates after merge and gate',
      'operation':   'bsf-no-results',
      'recoverable': true,
      'timestamp':   new Date().toISOString(),
    });
    return { 'output': 'no-results' };
  },
};

/**
 * The `book-search-fanout` DAG — one packaged unit that any parent DAG
 * can reference via `.deepDAG('placement-name', 'book-search-fanout', routes)`.
 */
export const BookSearchFanoutDAG: DAG = new DAGBuilder('book-search-fanout', '1.0')

  // ── 1. extract-query ─────────────────────────────────────────────────────
  // LLM parses the raw visitor question into structured search terms.
  // Writes state.terms for the scouts and decide-tools to consume.
  .node('bsf-extract-query', extractQuery, {
    'success': 'bsf-decide-tools',
  })

  // ── 2. decide-tools ──────────────────────────────────────────────────────
  // LLM decides which external sources to invoke. Both outputs route into
  // the parallel fan-out — each scout gates internally on state.toolPlan.
  .node('bsf-decide-tools', decideTools, {
    'tools':    'book-search-fan-out',
    'no-tools': 'book-search-fan-out',
  })

  // ── 3. book-search-fan-out ───────────────────────────────────────────────
  // All four scouts run concurrently. combine:'collect' waits for all four
  // and merges their state mutations. Each scout writes to state.candidates.
  .parallel('book-search-fan-out', ['bsf-ol', 'bsf-gb', 'bsf-subject', 'bsf-wiki'], 'collect', {
    'success': 'bsf-rank-candidates',
    'error':   'bsf-rank-candidates',
  })
  .node('bsf-ol',      openLibraryScout, { 'success': null, 'empty': null })
  .node('bsf-gb',      googleBooksScout, { 'success': null, 'empty': null })
  .node('bsf-subject', subjectScout,     { 'success': null, 'empty': null })
  .node('bsf-wiki',    wikipediaScout,   { 'success': null, 'empty': null })

  // ── 4. rank-candidates ───────────────────────────────────────────────────
  // LLM-driven relevance scoring. Always routes 'ranked' — even an empty
  // set — so merge can soft-gate on zero candidates.
  .node('bsf-rank-candidates', rankCandidates, {
    'ranked': 'bsf-merge-candidates',
  })

  // ── 5. merge-candidates ──────────────────────────────────────────────────
  // Cross-source dedupe via CanonicalId, top-5. Routes 'empty' to
  // bsf-no-results which collects an error so executeDeepDAG routes the
  // parent to its 'error' branch.
  .node('bsf-merge-candidates', mergeCandidates, {
    'ranked': 'bsf-record-findings',
    'empty':  'bsf-no-results',
  })

  // ── 6. record-findings ───────────────────────────────────────────────────
  // Deterministic RDF write — same input always produces the same triples.
  .node('bsf-record-findings', recordFindings, {
    'recorded': 'bsf-has-citations-gate',
  })

  // ── 7. has-citations-gate ────────────────────────────────────────────────
  // SPARQL ASK over the per-run state graph. Symbolic fence for the LLM.
  // 'fail' routes to bsf-no-results so the parent receives 'error'.
  .node('bsf-has-citations-gate', hasCitationsGate, {
    'pass': 'bsf-recall-past-visits',
    'fail': 'bsf-no-results',
  })

  // ── 8. recall-past-visits ────────────────────────────────────────────────
  // Injects prior-session context (prior queries + shortlisted titles) into
  // state.priorContext. Terminal node — deep-DAG exits cleanly → 'success'.
  .node('bsf-recall-past-visits', recallPastVisits, {
    'recalled': null,
  })

  // ── 9. bsf-no-results ────────────────────────────────────────────────────
  // Internal error-signal node. Collects a recoverable error so
  // executeDeepDAG routes the parent placement to its 'error' branch.
  .node('bsf-no-results', bsfNoResults, {
    'no-results': null,
  })

  .build();

/**
 * Register all nodes used by `BookSearchFanoutDAG` onto a dispatcher.
 *
 * Call this before `dispatcher.registerDAG(BookSearchFanoutDAG)`. Accepts
 * any `Dagonizer`-compatible dispatcher to allow consumers to use their
 * own subclass while still pulling in the molecular node set.
 *
 * @example
 * ```ts
 * registerBookSearchFanoutNodes(dispatcher);
 * dispatcher.registerDAG(BookSearchFanoutDAG);
 * ```
 */
export function registerBookSearchFanoutNodes(
  dispatcher: Dagonizer<ArchivistState, ArchivistServices>,
): void {
  for (const node of [
    extractQuery,
    decideTools,
    openLibraryScout,
    googleBooksScout,
    subjectScout,
    wikipediaScout,
    rankCandidates,
    mergeCandidates,
    recordFindings,
    hasCitationsGate,
    recallPastVisits,
    bsfNoResults,
  ]) {
    dispatcher.registerNode(node);
  }
}
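The `collect` combine in step 3 can be pictured as "start every branch, wait for all of them, keep whatever each one wrote". The sketch below models that with plain promises. `Scout`, `runCollect`, the stub scouts, and the rule that a rejected scout yields the `error` output are all hypothetical; the real dispatcher's merge semantics may differ.

```typescript
// Hypothetical model of a 'collect' fan-out; names are illustrative only.
interface Candidate { source: string; title: string }
interface FanoutState { candidates: Candidate[]; failureCause: string }
type Scout = (state: FanoutState) => Promise<'success' | 'empty'>;

async function runCollect(scouts: Scout[], state: FanoutState): Promise<'success' | 'error'> {
  let failed = false;
  // Start every scout at once. A rejected scout is recorded rather than
  // rethrown, so the remaining scouts still get to finish.
  await Promise.all(scouts.map((scout) =>
    scout(state).catch((err: unknown) => {
      failed = true;
      state.failureCause += String(err) + ' ';
      return 'empty' as const;
    }),
  ));
  return failed ? 'error' : 'success';
}

// Stub scouts standing in for the four real sources.
const okScout: Scout = async (state) => {
  state.candidates.push({ source: 'stub-a', title: 'Title A' });
  return 'success';
};
const emptyScout: Scout = async () => 'empty';
const brokenScout: Scout = async () => { throw new Error('stub-c unavailable'); };
```

In the DAG above both `success` and `error` route to `bsf-rank-candidates`, so one failing source does not discard what the other scouts found.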

Parent DAG: the deep-DAG placements

The #deepdag-placements region shows the four .deepDAG(...) calls in context: the three placements of book-search-fanout and the one placement of compose-retry-loop, alongside the find-reviews and describe-book branches that remain inlined:

// ── on-topic branch ──────────────────────────────────────────────────────
// Deep-DAG placement: book-search-fanout handles extract-query, decide-tools,
// all four scouts, rank-candidates, merge, record, gate, and recall.
// One packaged cluster — first of three placements of the same deep-DAG.
// stateMapping.output copies the fields the deep-DAG writes back to the
// parent state so compose-loop and group-by-year can read them.
.deepDAG('on-topic-search', 'book-search-fanout', {
  'success': 'compose-loop',
  'error':   'compose-empty',
}, {
  'stateMapping': {
    'output': {
      'terms':         'terms',
      'toolPlan':      'toolPlan',
      'candidates':    'candidates',
      'shortlist':     'shortlist',
      'priorContext':  'priorContext',
      'failureCause':  'failureCause',
    },
  },
})

// ── lookup-author branch ─────────────────────────────────────────────────
// Deep-DAG placement: same book-search-fanout cluster, second placement.
// After success, group-by-year sorts results chronologically before the
// compose loop — author surveys read better in publication-timeline order.
.deepDAG('author-search', 'book-search-fanout', {
  'success': 'group-by-year',
  'error':   'compose-empty',
}, {
  'stateMapping': {
    'output': {
      'terms':         'terms',
      'toolPlan':      'toolPlan',
      'candidates':    'candidates',
      'shortlist':     'shortlist',
      'priorContext':  'priorContext',
      'failureCause':  'failureCause',
    },
  },
})
// group-by-year is author-branch-specific: sorts shortlist chronologically.
.node('group-by-year', groupByYear, {
  'ordered': 'compose-loop',
})

// ── find-reviews branch ───────────────────────────────────────────────────
// Inlined — uses rankByRating (deterministic, rating-weighted) in place of
// rankCandidates (LLM-driven). The Google Books scout carries notes.rating /
// notes.ratingsCount; rankByRating weights those for reviews-style output.
.node('reviews-extract', extractQuery, {
  'success': 'reviews-decide-tools',
})
.node('reviews-decide-tools', decideTools, {
  'tools':    'reviews-fan-out',
  'no-tools': 'reviews-fan-out',
})
.parallel('reviews-fan-out', ['reviews-ol', 'reviews-gb', 'reviews-subject', 'reviews-wiki'], 'collect', {
  'success': 'reviews-rank',
  'error':   'reviews-rank',
})
.node('reviews-ol',      openLibraryScout, { 'success': null, 'empty': null })
.node('reviews-gb',      googleBooksScout, { 'success': null, 'empty': null })
.node('reviews-subject', subjectScout,     { 'success': null, 'empty': null })
.node('reviews-wiki',    wikipediaScout,   { 'success': null, 'empty': null })
.node('reviews-rank',    rankByRating,     { 'ranked': 'reviews-merge' })
.node('reviews-merge',   mergeCandidates,  { 'ranked': 'reviews-record', 'empty': 'compose-empty' })
.node('reviews-record',  recordFindings,   { 'recorded': 'reviews-gate' })
.node('reviews-gate',    hasCitationsGate, { 'pass': 'reviews-recall', 'fail': 'compose-empty' })
.node('reviews-recall',  recallPastVisits, { 'recalled': 'compose-loop' })

// ── describe-book branch ─────────────────────────────────────────────────
// Inlined — uses pickBestMatch to narrow multi-hit results to the top-3
// title-similar candidates before merge. Ensures the composer receives the
// specific book the visitor named, not arbitrary top-5 hits.
.node('describe-extract',      extractQuery,     { 'success': 'describe-decide-tools' })
.node('describe-decide-tools', decideTools,      { 'tools': 'describe-fan-out', 'no-tools': 'describe-fan-out' })
.parallel('describe-fan-out', ['describe-ol', 'describe-gb', 'describe-subject', 'describe-wiki'], 'collect', {
  'success': 'describe-pick',
  'error':   'compose-empty',
})
.node('describe-ol',      openLibraryScout, { 'success': null, 'empty': null })
.node('describe-gb',      googleBooksScout, { 'success': null, 'empty': null })
.node('describe-subject', subjectScout,     { 'success': null, 'empty': null })
.node('describe-wiki',    wikipediaScout,   { 'success': null, 'empty': null })
.node('describe-pick',   pickBestMatch,    { 'picked': 'describe-merge' })
.node('describe-merge',  mergeCandidates,  { 'ranked': 'describe-record', 'empty': 'compose-empty' })
.node('describe-record', recordFindings,   { 'recorded': 'describe-gate' })
.node('describe-gate',   hasCitationsGate, { 'pass': 'describe-recall', 'fail': 'compose-empty' })
.node('describe-recall', recallPastVisits, { 'recalled': 'compose-loop' })

// ── recommend-similar branch ─────────────────────────────────────────────
// recommendSimilar seeds state.terms from prior-run shortlist memory.
// 'seeded' routes to the book-search-fanout deep-DAG — third placement of
// the same packaged cluster. 'empty' routes to the decline terminal.
.node('recommend-similar', recommendSimilar, {
  'seeded': 'similar-search',
  'empty':  'compose-empty',
})

// Deep-DAG placement: same book-search-fanout, third and final placement.
.deepDAG('similar-search', 'book-search-fanout', {
  'success': 'compose-loop',
  'error':   'compose-empty',
}, {
  'stateMapping': {
    'output': {
      'terms':         'terms',
      'toolPlan':      'toolPlan',
      'candidates':    'candidates',
      'shortlist':     'shortlist',
      'priorContext':  'priorContext',
      'failureCause':  'failureCause',
    },
  },
})

// ── compose-loop — shared compose/validate deep-DAG ─────────────────────
// All branches that successfully find candidates converge here.
// composeResponse → validateResponse (retry loop, bounded by state.attempts.compose).
// One deep-DAG definition serves every convergent branch.
// stateMapping.output copies the compose loop's writes back to the parent.
//
// Fan-in policy: 'success' routes to the shared respond-to-visitor terminal
// at the parent level — the deep-DAG produces state.draft and exits cleanly;
// exactly ONE respond-to-visitor fires per run regardless of branch count.
// 'error' (retry budget exhausted) falls through to compose-empty so the
// visitor always receives an in-character response rather than a silent drop.
.deepDAG('compose-loop', 'compose-retry-loop', {
  'success': 'respond-to-visitor',
  'error':   'compose-empty',
}, {
  'stateMapping': {
    'output': {
      'draft':    'draft',
      'approved': 'approved',
      'attempts': 'attempts',
    },
  },
})
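The compose-loop comments describe a compose / validate cycle that gives up once its retry budget is spent. A minimal sketch of that control flow, assuming a numeric `maxAttempts` budget (the source only says the loop is bounded by `state.attempts.compose`); `composeWithRetry`, `Compose`, and `Validate` are hypothetical names, not the nodes inside compose-retry-loop.

```typescript
// Hypothetical model of a bounded compose/validate loop.
interface ComposeState {
  draft: string;
  approved: boolean;
  attempts: { compose: number };
}

type Compose = (state: ComposeState) => Promise<string>;
type Validate = (draft: string) => boolean;

async function composeWithRetry(
  state: ComposeState,
  compose: Compose,
  validate: Validate,
  maxAttempts: number, // assumed budget parameter
): Promise<'success' | 'error'> {
  while (state.attempts.compose < maxAttempts) {
    state.attempts.compose += 1;
    state.draft = await compose(state);
    state.approved = validate(state.draft);
    if (state.approved) return 'success';
  }
  // Budget exhausted: in the parent DAG this output routes to compose-empty.
  return 'error';
}
```

Because the counter lives on the state, it survives the copy-back through `stateMapping.output` (`attempts` is one of the three mapped fields).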

What it demonstrates

.deepDAG(name, dagName, routes, options) — the placement references the deep-DAG by its registered name. The parent and child run in the same dispatcher; the child shares the same node registry.
stateMapping.output — after the deep-DAG completes, the dispatcher copies the listed fields from the child's final state back into the parent state. Fields not listed stay isolated.
One definition, three placements: book-search-fanout is registered once and placed three times with distinct placement names. Each placement declares its own routes: 'success' goes to compose-loop (on-topic-search, similar-search) or group-by-year (author-search), and 'error' goes to compose-empty in all three.
Errors bubble up — anything the child collects via state.collectError reaches the parent's error accumulator automatically. The executeDeepDAG router uses child-state errors to decide the 'error' output.
registerBookSearchFanoutNodes / registerComposeRetryLoopNodes — each deep-DAG module exports a helper that registers exactly the nodes it needs. Call both before registering the parent DAG.
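The error-bubbling rule can be sketched in a few lines. The error fields mirror the `collectError` call in `bsf-no-results`; `SketchState` and `finishPlacement` are hypothetical stand-ins for the real state class and for whatever `executeDeepDAG` does internally.

```typescript
// Hypothetical model of error bubbling from a child DAG to its placement.
interface CollectedError {
  code: string;
  message: string;
  operation: string;
  recoverable: boolean;
  timestamp: string;
}

class SketchState {
  readonly errors: CollectedError[] = [];
  collectError(error: CollectedError): void {
    this.errors.push(error);
  }
}

// After the child finishes: copy its errors to the parent, then pick the
// placement output from whether the child collected anything.
function finishPlacement(parent: SketchState, child: SketchState): 'success' | 'error' {
  for (const error of child.errors) {
    parent.collectError(error);
  }
  return child.errors.length > 0 ? 'error' : 'success';
}

const demoParent = new SketchState();
const demoChild = new SketchState();
demoChild.collectError({
  code: 'NO_CANDIDATES',
  message: 'no usable candidates',
  operation: 'bsf-no-results',
  recoverable: true,
  timestamp: new Date().toISOString(),
});
const demoOutput = finishPlacement(demoParent, demoChild);
```

This is why `bsf-no-results` can route to `null` inside the child: the collected error, not the node's own output name, is what selects the parent's `error` branch.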

See this in action in the Archivist live demo.

Composing the same flow via `DAGDeriver.subDAGs`

The DAGBuilder .deepDAG(...) path above is the deterministic authoring journey. The same DeepDAGNode placement can be produced declaratively via the DAGDeriver subDAGs annotation when the surrounding flow is agent-style (operations declare dependencies; topology emerges):

DAGDeriver.derive({
  name: 'parent',
  version: '1',
  entrypoint: 'prepare',
  contracts: [
    { name: 'prepare',       hardRequired: ['input'],         produces: ['intermediate'], outputs: ['success'] },
    { name: 'invoke-plugin', hardRequired: ['intermediate'],  produces: ['childResult'],  outputs: ['success', 'error'] },
    { name: 'finalize',      hardRequired: ['childResult'],   produces: ['final'],        outputs: ['success'] },
  ],
  annotations: {
    subDAGs: {
      'invoke-plugin': {
        dag:     'plugin:transform',
        outputs: ['success', 'error'],
        stateMapping: {
          input:  { intermediate: 'intermediate' },
          output: { childResult:  'childResult' },
        },
      },
    },
  },
});

The contract's produces ↔ hardRequired still drives topology; the subDAGs annotation swaps the rendered placement from SingleNode to DeepDAGNode.
Every port in subDAG.outputs auto-wires to the next derived stage. terminals overrides individual ports if the error path needs a different target.
Sub-DAG references resolve at registerDAG time; the dispatcher's existing cycle check rejects self-referential subDAGs.
A runnable demonstration ships in examples/derive.ts (npm run example:derive).
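To illustrate how produces ↔ hardRequired can determine stage order, the sketch below orders the three contracts from the example by repeatedly picking a stage whose requirements are already available. `deriveOrder` and its `initial` parameter are hypothetical and deliberately simplified; this is not DAGDeriver's algorithm, and it ignores outputs, terminals, and annotations.

```typescript
// Hypothetical sketch: order stages from produces/hardRequired alone.
interface Contract {
  name: string;
  hardRequired: string[];
  produces: string[];
}

function deriveOrder(contracts: Contract[], entrypoint: string, initial: string[]): string[] {
  const available = initial.slice();
  const pending = contracts.slice();
  const order: string[] = [];
  while (pending.length > 0) {
    // Find the first stage whose required fields have all been produced.
    let picked: Contract | undefined;
    for (const candidate of pending) {
      if (candidate.hardRequired.every((field) => available.indexOf(field) !== -1)) {
        picked = candidate;
        break;
      }
    }
    if (picked === undefined) {
      throw new Error('unsatisfiable contracts: ' + pending.map((c) => c.name).join(', '));
    }
    pending.splice(pending.indexOf(picked), 1);
    order.push(picked.name);
    for (const field of picked.produces) available.push(field);
  }
  if (order[0] !== entrypoint) throw new Error('entrypoint is not the first runnable stage');
  return order;
}

// The three contracts from the example, deliberately listed out of order.
const sketchContracts: Contract[] = [
  { name: 'finalize',      hardRequired: ['childResult'],  produces: ['final'] },
  { name: 'invoke-plugin', hardRequired: ['intermediate'], produces: ['childResult'] },
  { name: 'prepare',       hardRequired: ['input'],        produces: ['intermediate'] },
];
const derivedOrder = deriveOrder(sketchContracts, 'prepare', ['input']);
```

Under this model the subDAGs annotation does not move `invoke-plugin` in the order; it only changes what is rendered at that position.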

See Authoring DAGs for the decision matrix between the imperative .deepDAG() path and the declarative subDAGs annotation.
