Architecture

Your Engineering Deserves a Pipeline, Not a Chatbot

Why the five-stage AI pipeline in Cairn exists — and what it replaces

Greg · March 2026 · 10 min read

Open any MBSE vendor's 2025 release notes and search for "AI." You'll find the same story everywhere: a chat sidebar that answers questions about your model, a quality checker that flags passive voice in requirements, maybe an assistant that generates test cases from natural-language descriptions. These are useful features. They are also, architecturally, identical to bolting ChatGPT onto a spreadsheet and calling it intelligent.

The MBSE industry has adopted AI the way a government agency adopts new technology: cautiously, superficially, and in a way that changes nothing fundamental about how the tool works. The model is still manually decomposed. The architecture is still manually drawn. The requirements are still manually traced to verification. The AI is an advisor, whispering suggestions from the margins. It never touches the model. It never proposes structural changes. It never does engineering.

A chatbot that can discuss your system model is not the same thing as an AI that can operate on it. The difference is the difference between a consultant who writes a memo and an engineer who submits a pull request.

Chat Is the Wrong Interface for Model Operations

When an engineer asks an AI to "decompose the Power subsystem into functional assemblies," the expected output is not a paragraph of prose. It is a set of specific, structured model mutations: three new nodes with names, descriptions, and parent-child relationships; six new interfaces connecting them; twelve requirements allocated to the new structure. These mutations must be consistent with the existing model, reversible if wrong, and reviewable before they take effect.

A chat interface cannot do this. It can describe what it would change, in English, and hope the engineer translates that description into manual edits without error. This is the equivalent of a code review conducted entirely in prose, without a diff viewer, without line-level comments, without the ability to click "approve" or "reject." Software engineering abandoned that workflow twenty years ago. Systems engineering hasn't started.

The Chatbot Approach (AI as advisor):
- AI describes changes in natural language
- Engineer manually applies suggestions
- No structural diff, no undo, no audit trail
- Context window sees entire model (wasteful)
- One monolithic AI call per interaction

The Pipeline Approach (AI as operator):
- AI produces structured model mutations
- Engineer reviews and accepts/rejects each
- Full diff, instant undo, complete history
- Context scoped to relevant nodes only
- Staged pipeline: route → assemble → execute → validate → review

Five Stages, One Contract

The alternative to a monolithic chat call is a pipeline — a sequence of focused stages, each with a clear mandate and minimal context. The metaphor: instead of handing one engineer the entire specification binder and saying "figure it out," a project lead reads the table of contents, pulls the relevant chapters, and hands them to the right specialist.

The Agent Pipeline
01 Router: classify intent, scope to relevant nodes (LLM · fast)
02 Context: fetch exactly what the specialist needs (Code · instant)
03 Specialist: domain expert generates structured changes (LLM · capable)
04 Validator: check consistency, enforce schema (Code · strict)
05 Review: engineer approves, modifies, or rejects (UI · human)

Each stage is independent — no conversational threading between them. Each LLM call receives exactly the context it needs via its system prompt and assembled payload, nothing more. The Router uses a small, fast model to classify intent in under a second. The Specialist uses a capable model with domain-specific instructions. The Validator is deterministic code, not an LLM at all. This means stages can be retried, parallelized, or swapped without side effects.
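As a minimal sketch of those stage boundaries — all function and field names here are illustrative assumptions, not Cairn's actual API — the pipeline's shape is a chain of small, independently testable steps, where only two stages would involve an LLM at all:

```python
# Hypothetical sketch of the five-stage pipeline. The LLM stages (route,
# run_specialist) are stubbed with deterministic logic; in a real system
# they would be model calls. Names and fields are assumptions.
from dataclasses import dataclass


@dataclass
class Intent:
    specialist: str      # e.g. "architect"
    node_ids: list       # scoped node IDs, never the whole model


def route(prompt: str) -> Intent:
    # Stage 1 (Router): a small, fast model classifies intent.
    # Stubbed: any "decompose" request goes to the architect specialist.
    specialist = "architect" if "decompose" in prompt.lower() else "author"
    return Intent(specialist=specialist, node_ids=["power-subsystem"])


def assemble_context(intent: Intent) -> dict:
    # Stage 2 (Context): deterministic code fetches exactly the scoped nodes.
    return {"nodes": intent.node_ids}


def run_specialist(intent: Intent, context: dict) -> dict:
    # Stage 3 (Specialist): a capable model with domain instructions
    # produces a structured ChangeSet. Stubbed with a canned result.
    return {
        "specialist": intent.specialist,
        "operations": [{"op": "create", "name": "Battery Management"}],
    }


def validate(changeset: dict) -> dict:
    # Stage 4 (Validator): deterministic schema checks, no LLM involved.
    assert all("op" in o for o in changeset["operations"])
    return changeset


def pipeline(prompt: str) -> dict:
    intent = route(prompt)
    context = assemble_context(intent)
    changeset = validate(run_specialist(intent, context))
    return changeset  # Stage 5 (Review): handed to the engineer


cs = pipeline("Decompose the Power subsystem")
```

Because each stage is a plain function of its inputs, retrying the Specialist or swapping the Router's model is a local change — no other stage notices.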

The critical innovation is not the pipeline itself — it's the contract between stages. Every specialist, regardless of domain, produces the same output structure: a ChangeSet.

The ChangeSet: Git Diffs for System Models

In software engineering, the pull request solved a fundamental governance problem: how do you let multiple contributors modify a shared codebase without chaos? The answer was a structured diff — a machine-readable description of exactly what changed, presented in a reviewable format, with the ability to approve, comment, or reject.

Systems engineering models have no equivalent. When an AI chatbot suggests "you should decompose the Power subsystem into Battery Management, Distribution, and Thermal Regulation," there is no structured artifact representing that suggestion. No diff. No review interface. No undo if it was wrong. The suggestion exists as text in a chat log — ephemeral, unstructured, and disconnected from the model it describes.

A ChangeSet is the structured artifact that's been missing. It is a self-describing, atomic transaction containing every operation the AI proposes: nodes to create, requirements to allocate, interfaces to add, traces to establish. Each operation carries complete before-and-after snapshots — not partial patches — so the review interface can render exactly what changed, undo is trivial (swap "after" back to "before"), and the history log can reconstruct any past model state.
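A sketch of why full snapshots make undo trivial — again with assumed field names, not Cairn's real schema — is just a swap of "after" for "before":

```python
# Illustrative ChangeSet operation carrying complete before/after
# snapshots. Field names are assumptions based on the article's
# description, not Cairn's actual data model.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Operation:
    kind: str                # "create" | "update" | "delete"
    node_id: str
    before: Optional[dict]   # full snapshot; None for creates
    after: Optional[dict]    # full snapshot; None for deletes


def apply(model: dict, op: Operation) -> None:
    # Applying an operation installs the "after" snapshot.
    if op.after is None:
        model.pop(op.node_id, None)
    else:
        model[op.node_id] = op.after


def undo(model: dict, op: Operation) -> None:
    # Undo is trivial: restore the "before" snapshot. No patch
    # inversion, no replay — the full prior state travels with the op.
    if op.before is None:
        model.pop(op.node_id, None)
    else:
        model[op.node_id] = op.before


model = {"power": {"description": "Power subsystem"}}
op = Operation(
    kind="update",
    node_id="power",
    before=dict(model["power"]),
    after={"description": "Decomposed into management and distribution"},
)
apply(model, op)
undo(model, op)
```

Partial patches would need inverse operations to undo; complete snapshots make every past model state directly reconstructible from the history log.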

ChangeSet · AI-GENERATED · ARCHITECT
3 operations · just now
CREATE Battery Management Assembly: new node under Power Subsystem · cell monitoring, charge balancing, thermal cutoff logic
CREATE Power Distribution Assembly: new node under Power Subsystem · 48V bus routing, load switching, fuse protection
UPDATE Power Subsystem: description updated to reflect decomposition into management and distribution functions

This is not a mockup of a future feature. This is the architecture. Every AI interaction — whether decomposing a system, writing requirements, defining interfaces, or generating state machines — flows through the same pipeline and produces the same reviewable ChangeSet. The engineer is never surprised by a model change they didn't approve. The history log records every AI-proposed change alongside every human edit, with the original prompt, the specialist that produced it, and the version of the system prompt used. Full auditability, forever.
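A plausible shape for one history-log entry — every field name here is a hypothetical, inferred from the article's list of what gets recorded — might be a single append-only JSON line:

```python
# Hypothetical audit-log entry for one AI-proposed ChangeSet. Fields
# mirror what the article says is recorded (original prompt, specialist,
# system-prompt version); the exact schema is an assumption.
import datetime
import json


def audit_entry(changeset_id: str, prompt: str, specialist: str,
                prompt_version: str, accepted: bool) -> dict:
    return {
        "changeset_id": changeset_id,
        "prompt": prompt,                  # the engineer's original request
        "specialist": specialist,          # which specialist produced it
        "prompt_version": prompt_version,  # system-prompt version used
        "accepted": accepted,              # outcome of human review
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }


entry = audit_entry(
    "cs-042",
    "Decompose the Power subsystem",
    "architect",
    "2026-03-01",
    accepted=True,
)
log_line = json.dumps(entry)  # one append-only line per decision
```

An append-only line per decision is what makes "full auditability, forever" cheap: the log is never rewritten, only extended.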

The Wrapper Problem and the Way Out

The criticism leveled at most AI-powered tools today is that they are "thin wrappers" — a UI skin over an API call, adding no structural value that couldn't be replicated in an afternoon. For chatbot-style AI integrations, this criticism is largely correct. The chat interface adds convenience but not capability. The model could hallucinate an incorrect requirement, and nothing in the architecture would catch it before it entered the model.

01 · Structural Depth: the pipeline is not a wrapper — it's a governance layer
The Router scopes context so the Specialist never reasons over irrelevant data. The Validator enforces schema compliance, referential integrity, and naming conventions before the human ever sees the proposal. The ChangeSet carries complete before/after snapshots so undo is instant and history is lossless. These are not features of the LLM — they are features of the architecture around it.

02 · Model Independence: swap the model, keep the governance
Because the ChangeSet is the universal contract — not the model's output format — the LLM provider can be swapped without rewriting the application. The pipeline validates the same schema regardless of whether it was produced by Claude, GPT, Gemini, or a fine-tuned open-source model. The governance layer outlives any single model generation.
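Both points reduce to the same mechanism: a deterministic validator that checks the ChangeSet contract without caring which model produced it. A sketch, with illustrative rules and field names (the naming convention shown is an invented example):

```python
# Sketch of a deterministic validator enforcing the ChangeSet contract.
# Rules and field names are illustrative assumptions; the point is that
# none of this depends on which LLM produced the ChangeSet.
VALID_KINDS = {"create", "update", "delete"}


def validate_changeset(changeset: dict, existing_ids: set) -> list:
    """Return a list of violations; an empty list means it passes."""
    errors = []
    for i, op in enumerate(changeset.get("operations", [])):
        # Schema compliance: only known operation kinds are allowed.
        if op.get("kind") not in VALID_KINDS:
            errors.append(f"op {i}: unknown kind {op.get('kind')!r}")
        # Referential integrity: updates/deletes must target real nodes.
        if op.get("kind") in {"update", "delete"} and \
                op.get("node_id") not in existing_ids:
            errors.append(
                f"op {i}: references missing node {op.get('node_id')!r}")
        # Naming convention (invented example): new nodes use Title Case.
        name = op.get("name", "")
        if op.get("kind") == "create" and not name.istitle():
            errors.append(f"op {i}: name {name!r} violates naming convention")
    return errors


cs = {"operations": [
    {"kind": "create", "node_id": "bm", "name": "Battery Management Assembly"},
    {"kind": "update", "node_id": "missing-node"},
]}
violations = validate_changeset(cs, existing_ids={"power-subsystem"})
```

The first operation passes every check; the second is caught before any human ever reviews it. Swapping Claude for GPT changes nothing in this code — only the contract matters.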

This is the difference between building on quicksand and building on bedrock. The LLM is a powerful but unreliable component — it hallucinates, it varies between runs, it improves unpredictably with each new release. The pipeline absorbs that unreliability. The Router constrains scope. The Specialist operates on focused context. The Validator catches errors. The human reviews the result. No single point of failure. No unreviewed mutation. No silent corruption of the engineering model.

The MBSE tools that will matter in five years are not the ones that added a chat widget to their sidebar. They are the ones that rethought the relationship between human judgment and machine capability — treating AI not as an oracle to be consulted, but as an engineer to be supervised, through the same structured review processes that engineering teams already trust with their most critical decisions.

Cairn is the AI engineering workbench for systems that matter.

Sign up and start building for free.