Every major engineering software vendor shipped an AI feature in 2025. Ansys added a copilot. MathWorks launched MATLAB Copilot. Siemens put an assistant inside NX. SOLIDWORKS introduced three AI companions with names — AURA, LEO, MARIE — as if naming them would make them more capable. Open any vendor's release notes and you'll find the same promise: your existing tool, now with intelligence. What you won't find is a tool that actually reasons about engineering.
Meanwhile, the other end of the spectrum has exploded. Symbolab has 300 million users. Photomath was acquired by Google. MathGPT, Mathway, and a dozen other apps will solve your calculus homework with a photo. If you're a sophomore studying for a dynamics exam, AI has you covered. If you're a practicing engineer sizing a heat exchanger to meet ASME thermal design standards, you're on your own.
Two Extremes, No Center
The top end of the market is enterprise incumbents who have bolted LLM copilots onto software designed in the 1990s and 2000s. These tools require existing licenses that cost $1,000 to $3,500 per seat per year. They primarily answer documentation questions and generate boilerplate code. They do not perform engineering calculations, run trade studies, verify unit consistency, or propose design changes. They are reference librarians, not engineers.
The bottom end is consumer AI math tools targeting students. They excel at step-by-step problem solving up through calculus, but they have no concept of units, material properties, engineering standards, or the iterative multi-variable reasoning that characterizes real engineering work. A tool that can integrate a polynomial is not a tool that can size a pressure vessel.
The gap isn't accidental. Enterprise vendors have no incentive to build downmarket — their business model is high-touch sales to procurement departments, not self-serve signups from individual engineers. Student tools have no incentive to build upmarket — engineering domain knowledge is hard to encode and the audience is smaller. And the foundation model providers (OpenAI, Anthropic, Google) build general-purpose tools that are good at everything and excellent at nothing domain-specific.
The result is that a senior mechanical engineer with twenty years of experience and a $150,000 salary uses the same AI tool to reason about a thermal management system as a high school student uses to check their algebra. Not because the engineer lacks sophistication, but because nothing better exists.
What Engineers Actually Do (and What AI Can't Help With)
Ask a mechanical engineer what they did last Tuesday and you'll hear something like: ran a trade study comparing three bearing types for a high-speed application, sized a cooling channel using Dittus-Boelter, checked a bolted joint against VDI 2230, pulled material properties for 17-4 PH at elevated temperature, and wrote up the analysis in a report for Thursday's design review. Five tasks. Five different tools. Five different contexts that have to be manually stitched together.
Now take one of those tasks, the cooling channel, and suppose the engineer opens a general-purpose chatbot for help. Notice what happens. The engineer uses AI exactly once, at the beginning, to recall a formula. The actual engineering work (applying that formula with correct units, checking the Reynolds number to verify the turbulent flow assumption, pulling temperature-dependent fluid properties, comparing the result against a design allowable with an appropriate safety factor) happens entirely outside the AI. The AI is a search engine with better prose. It contributes nothing to the engineering reasoning.
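To make the steps that happen outside the AI concrete, here is a minimal Python sketch of the cooling channel calculation. It computes the Reynolds and Prandtl numbers, refuses to apply Dittus-Boelter outside its commonly quoted validity range (Re above roughly 10,000, Pr between about 0.6 and 160), and only then returns a heat transfer coefficient. The `Fluid` class, the function name, and the water properties (approximate textbook values near room temperature) are illustrative assumptions, not design data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fluid:
    """Fluid properties at the bulk temperature, all in SI units."""
    density: float        # kg/m^3
    viscosity: float      # Pa*s (dynamic viscosity)
    conductivity: float   # W/(m*K)
    specific_heat: float  # J/(kg*K)


def dittus_boelter_h(fluid, velocity, diameter, heating=True):
    """Return (h, Re, Pr) for turbulent flow in a smooth circular tube.

    velocity is in m/s, diameter in m, h in W/(m^2*K).
    Raises ValueError when the correlation's assumptions are not met.
    """
    reynolds = fluid.density * velocity * diameter / fluid.viscosity
    prandtl = fluid.specific_heat * fluid.viscosity / fluid.conductivity

    # Verify the turbulent-flow assumption before trusting the formula.
    if reynolds < 1.0e4:
        raise ValueError(f"Re = {reynolds:.0f} is below the turbulent range")
    if not 0.6 <= prandtl <= 160.0:
        raise ValueError(f"Pr = {prandtl:.2f} is outside the usual range")

    exponent = 0.4 if heating else 0.3  # fluid being heated vs. cooled
    nusselt = 0.023 * reynolds**0.8 * prandtl**exponent
    return nusselt * fluid.conductivity / diameter, reynolds, prandtl


# Approximate properties of liquid water near room temperature (illustrative).
water = Fluid(density=997.0, viscosity=8.55e-4,
              conductivity=0.613, specific_heat=4179.0)

h, reynolds, prandtl = dittus_boelter_h(water, velocity=2.0, diameter=0.010)
print(f"Re = {reynolds:.0f}, Pr = {prandtl:.2f}, h = {h:.0f} W/(m^2*K)")
```

A real tool would also look up the properties at the actual bulk temperature and compare the result against a design allowable with a safety factor; those steps are left out of this sketch.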
This is not a criticism of ChatGPT or Claude. General-purpose LLMs are extraordinary tools. But they have specific, well-documented failure modes that make them dangerous for engineering without guardrails: they reverse-engineer solutions from pattern-matching rather than performing physics-informed reasoning, they silently drop or convert units, they hallucinate material properties, and they present incorrect answers with the same confidence as correct ones. A 2024 study showed GPT-4 ignoring physical context like tensorial order and dimensional constraints in favor of algebraic shortcuts. Another study found that LLMs would confidently cite nonexistent functions and fabricated references when generating MATLAB code.
What Engineers Want vs. What They Get
The gap becomes visible when you compare what engineers ask for with what the market provides. Industry surveys consistently show the same pattern: data processing, financial calculations, and scientific data analysis receive the lowest AI satisfaction scores among technical professionals. Engineers want reasoning tools. They get suggestion boxes.
The frustration isn't that AI doesn't work. It's that AI works well enough to be tantalizing — well enough that engineers use it daily for brainstorming and drafting — but fails precisely at the moment the engineering judgment matters. The LLM will confidently tell you the thermal conductivity of 6061-T6 aluminum, but if you ask for the same property at 300°C with a citation to MMPDS, you'll get a hallucination dressed in the language of authority. The engineer catches this — because engineers always verify — but catching it costs the same time the AI was supposed to save.
Why Bolting AI Onto Legacy Software Doesn't Work
There's a simple test for whether a product is AI-native or AI-bolted-on: remove the AI features and ask whether the product works the same way. If the answer is yes — if the product is fundamentally the same tool with or without the AI sidebar — then the AI is a feature, not an architecture. This describes virtually every enterprise engineering tool that shipped an AI copilot in 2025.
The distinction matters because the architecture determines what the AI can do. A chat widget layered onto a 30-year-old desktop CAD application can answer questions about the documentation and suggest parameter values. It cannot propose structural changes to the model, validate those changes against engineering constraints, present them for human review, and track the decision history. Those capabilities require the AI to be integrated into the data model, not floating above it in a sidebar.
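The propose, validate, review, and track loop described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the model is a plain dictionary, constraints are named predicate functions, and every name here (`Proposal`, `validate`, `review`, `wall_thickness_mm`) is hypothetical rather than taken from any vendor's product.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    """A single AI-suggested parameter change, plus its audit trail."""
    parameter: str
    new_value: float
    status: str = "proposed"
    history: list = field(default_factory=list)


def validate(model, proposal, constraints):
    """Check the change on a copy of the model; return violated constraint names."""
    candidate = {**model, proposal.parameter: proposal.new_value}
    violations = [name for name, ok in constraints.items() if not ok(candidate)]
    proposal.status = "failed_validation" if violations else "awaiting_review"
    proposal.history.append((proposal.status, violations))
    return violations


def review(model, proposal, approved, reviewer):
    """Record the human decision; only an approved proposal changes the model."""
    if proposal.status != "awaiting_review":
        raise ValueError("only validated proposals can be reviewed")
    proposal.status = "accepted" if approved else "declined"
    proposal.history.append((proposal.status, reviewer))
    if approved:
        model[proposal.parameter] = proposal.new_value


# Hypothetical model and constraint, for illustration only.
model = {"wall_thickness_mm": 10.0}
constraints = {"min_wall": lambda m: m["wall_thickness_mm"] >= 8.0}

too_thin = Proposal("wall_thickness_mm", 6.0)
thin_violations = validate(model, too_thin, constraints)  # fails the check

acceptable = Proposal("wall_thickness_mm", 9.0)
validate(model, acceptable, constraints)
review(model, acceptable, approved=True, reviewer="lead engineer")
```

The design point is that the model is never mutated by the suggestion itself: a change reaches the model only after it passes validation and a named reviewer accepts it, and each step leaves an entry in `history`.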
The companies building from scratch have a structural advantage here. Zoo, the AI-native CAD startup, designed their modeling language (KCL) specifically so that LLMs could read and write it natively — because they didn't have thirty years of file format decisions to work around. ThunderGraph, an emerging MBSE startup, builds system models incrementally using an AI graph agent that constructs elements one at a time, then validates the graph through automated traversal. Neither company is trying to add AI to an existing product. They're building the product around the AI.
This is where the opportunity lives. Not in competing with Ansys on simulation fidelity or with SOLIDWORKS on geometric modeling — those are decades-deep technical moats. The opportunity is in the workflows that sit between and around those tools: the calculations, the trade studies, the decision documentation, the design rationale that currently lives in spreadsheets and PowerPoint files and dies the moment someone leaves the team.
Three Workflows Nobody Has Built
If you audit the daily work of a mechanical, systems, or aerospace engineer, three workflows consume enormous time, produce enormous value, and have exactly zero purpose-built AI tooling.
The engineering calculation. Not a homework problem but a multi-step, multi-variable calculation governed by a standard (ASME, API, MIL-STD), using temperature-dependent material properties, requiring dimensional analysis at every step, and producing a documented result suitable for a design review. The closest existing option is pairing ChatGPT with Wolfram Alpha, which requires manual orchestration between two separate subscriptions and has no concept of engineering standards or calculation documentation.
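As one illustration of what dimensional analysis at every step means in software, the Python sketch below carries SI base-dimension exponents alongside each value, so that adding a pressure to a length fails loudly instead of silently producing a number. The `Quantity` class is a hypothetical toy (mature libraries such as Pint handle this far more completely), and the thin-wall hoop stress formula, stress = pressure × radius / thickness, is the textbook approximation, not a code-compliant ASME sizing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A value in SI base units plus the exponents of (kg, m, s)."""
    value: float
    dim: tuple

    def __add__(self, other):
        # Addition only makes sense for identical dimensions.
        if self.dim != other.dim:
            raise TypeError(f"cannot add dimensions {self.dim} and {other.dim}")
        return Quantity(self.value + other.value, self.dim)

    def __mul__(self, other):
        # Multiplying quantities adds their dimension exponents.
        dim = tuple(a + b for a, b in zip(self.dim, other.dim))
        return Quantity(self.value * other.value, dim)

    def __truediv__(self, other):
        # Dividing quantities subtracts their dimension exponents.
        dim = tuple(a - b for a, b in zip(self.dim, other.dim))
        return Quantity(self.value / other.value, dim)


PRESSURE = (1, -1, -2)  # Pa = kg / (m * s^2)
LENGTH = (0, 1, 0)      # m

# Illustrative inputs only.
pressure = Quantity(2.0e6, PRESSURE)  # 2 MPa internal pressure
radius = Quantity(0.5, LENGTH)        # 0.5 m
thickness = Quantity(0.01, LENGTH)    # 10 mm wall

hoop_stress = pressure * radius / thickness

# The result must come out as a stress (same dimensions as pressure).
assert hoop_stress.dim == PRESSURE
print(f"hoop stress = {hoop_stress.value / 1e6:.0f} MPa")
```

An expression such as `pressure + radius` raises `TypeError` here, which is the behavior a general-purpose chatbot cannot guarantee when it manipulates bare numbers.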
The trade study. Engineers compare design alternatives using structured methodologies — Pugh matrices, AHP, weighted scoring — but every tool for this is either a blank spreadsheet template or a whiteboard with sticky notes. No AI tool suggests evaluation criteria based on the application domain, helps populate performance ratings from specification data, runs sensitivity analysis on criterion weights, or generates the documented decision rationale that design reviews require. The Pugh matrix was published in 1981. Forty-five years later, the state of the art for running one is a spreadsheet.
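For reference, weighted scoring and a basic weight sensitivity check are simple to express in code, which is part of why the absence of tooling is striking. The Python sketch below is a minimal illustration: the criteria, weights, and 1 to 5 ratings are made-up placeholders, and perturbing one weight at a time by a fixed factor is just one simple way to probe sensitivity.

```python
def weighted_scores(weights, ratings):
    """Return the normalized weighted score of each alternative."""
    total = sum(weights.values())
    return {
        alt: sum(weights[c] * scores[c] for c in weights) / total
        for alt, scores in ratings.items()
    }


def best(weights, ratings):
    """Return the name of the highest-scoring alternative."""
    scores = weighted_scores(weights, ratings)
    return max(scores, key=scores.get)


def weight_sensitivity(weights, ratings, factor=1.5):
    """Scale each weight up and down by `factor`, one criterion at a time.

    Returns the baseline winner and, for each criterion whose perturbation
    changes the winner, the sorted list of alternatives that take over.
    """
    baseline = best(weights, ratings)
    flips = {}
    for criterion in weights:
        for scale in (factor, 1.0 / factor):
            perturbed = dict(weights)
            perturbed[criterion] = weights[criterion] * scale
            winner = best(perturbed, ratings)
            if winner != baseline:
                flips.setdefault(criterion, set()).add(winner)
    return baseline, {c: sorted(w) for c, w in flips.items()}


# Placeholder data: three unnamed alternatives rated 1 (poor) to 5 (good).
weights = {"speed": 5, "cost": 3, "life": 4}
ratings = {
    "A": {"speed": 5, "cost": 2, "life": 3},
    "B": {"speed": 3, "cost": 5, "life": 3},
    "C": {"speed": 2, "cost": 4, "life": 4},
}

baseline, flips = weight_sensitivity(weights, ratings)
print(baseline, flips)
```

With these placeholder numbers the baseline choice is sensitive to the speed and cost weights but not to the life weight, which is exactly the kind of finding a design review wants documented next to the decision.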
The decision record. When an engineer chooses a bearing type, a cooling architecture, or a sensor technology, the reasoning behind that choice matters as much as the choice itself. In two years, someone will ask why this bearing was selected, or whether the runner-up option should be reconsidered given new requirements. The answer is almost always lost — buried in an email thread, an outdated slide deck, or the memory of an engineer who has since changed jobs. No tool captures design decisions as first-class, queryable, traceable objects connected to the system model they affect.
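A rough Python sketch of what a first-class, queryable decision record could look like follows. Every field name, identifier, and example entry is hypothetical; the point is only that the chosen option, the rejected alternatives, the rationale, and the links to requirements and model elements live in one structured object that can be queried later.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DecisionRecord:
    """One design decision, linked to the model element it affects."""
    decision_id: str
    element: str          # identifier of the affected system-model element
    chosen: str
    rejected: tuple       # runner-up options that were considered
    rationale: str
    requirements: tuple   # requirement IDs the decision depended on
    decided_on: date


class DecisionLog:
    """An in-memory collection of decisions with two simple queries."""

    def __init__(self):
        self._records = []

    def add(self, record):
        self._records.append(record)

    def for_element(self, element):
        """Why was this element designed the way it is?"""
        return [r for r in self._records if r.element == element]

    def affected_by(self, requirement):
        """Which decisions should be revisited if this requirement changes?"""
        return [r for r in self._records if requirement in r.requirements]


# Hypothetical entry, for illustration only.
log = DecisionLog()
log.add(DecisionRecord(
    decision_id="DEC-001",
    element="spindle.front_bearing",
    chosen="option A",
    rejected=("option B", "option C"),
    rationale="Highest weighted score in the trade study.",
    requirements=("REQ-012",),
    decided_on=date(2025, 1, 15),
))

to_revisit = log.affected_by("REQ-012")
```

Two years later, the question of whether the runner-up should be reconsidered becomes a query (`affected_by` on the changed requirement) rather than a search through email threads.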
Why the Middle Will Be Built From Scratch
The missing middle will not be filled by enterprise vendors adding features downmarket. Their architecture doesn't support it — desktop-native, license-gated, file-based tools cannot become browser-native, self-serve, cloud-computed engineering reasoning platforms through incremental updates. It will not be filled by consumer AI tools adding engineering knowledge upmarket. Unit-aware, standards-referenced, governance-tracked computation requires a fundamentally different data model than homework step-solving.
The tools that fill this gap will be built from scratch, by people who understand both the engineering domain and the AI architecture, and who build the governance layer first rather than bolting it on later. They will be web-native because engineers work across devices and shouldn't need IT to install software. They will be affordable because the individual engineer and small team market only works at tens of dollars per month, not thousands per year. And they will treat AI output with the appropriate engineering skepticism — not as truth to be accepted, but as proposals to be reviewed.
The barbell is unstable. The middle will be built. The question is whether it will be built by the incumbents who created the gap, by general-purpose AI companies who don't understand the domain, or by engineers who know what's missing because they've lived without it.