00 / 08  ·  VISION

The road to end‑to‑end AI
for mechanical engineering

We have geometry generators. We have simulation tools. We have optimization algorithms. None of them talk to each other. This is what needs to change.

01 / 08  ·  THE PROBLEM

Today, every step still needs a human

The mechanical engineering loop hasn't fundamentally changed in 40 years. An expert sits at every handoff.

Concept · Engineer manually describes requirements — dimensions, loads, constraints (Human)
CAD · Expert models the geometry by hand; hours to weeks per part (Human)
Simulation · FEA/CFD setup, meshing, boundary conditions; specialist required (Human)
DFM Review · Manufacturing engineer checks tolerances, undercuts, wall thickness (Human)
Iteration · Repeat from the CAD step; typically 5–20 cycles before sign-off (Human)
02 / 08  ·  WHERE WE ARE

AI is arriving — but in isolated islands

Each layer is getting AI tooling independently. Nothing connects top to bottom.

Text → CAD · 13+ models exist; results are inconsistent; no benchmark. This is where we're focused. (Active now)
CAD → Sim · Autodesk Fusion AI, Ansys AI, SimScale — AI-assisted setup, but not end-to-end (Early)
Topo Opt · Generative design (Fusion 360, nTopology) — well-established but siloed (Mature)
DFM · Rule-based tools exist (Boothroyd Dewhurst, DFMPro) — not AI-native (Pre-AI)
Assembly · No meaningful AI for multi-part assembly generation yet (Unsolved)
03 / 08  ·  THE CORE PROBLEM

You can't improve what
you can't measure

The text-to-CAD field has 13+ competing models, four different output formats, and no shared benchmark. Papers can't be compared. Progress is invisible.

What exists today · Static paper benchmarks. Each paper tests on its own dataset with its own metrics. Results die with the paper.
What's needed · A living leaderboard. Fixed prompt set. Automatic metrics + human preference votes. New models can submit anytime.
"Lack of comprehensive evaluation frameworks" — identified as the field's most critical gap in the 2025 LLMs for CAD survey (173 papers reviewed).
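"Automatic metrics + human preference votes" implies an aggregation scheme for the votes. A minimal sketch of the preference side, assuming an Elo-style update of the kind chat-model arenas popularized; the model names and K-factor below are illustrative, not from the source:

```python
from collections import defaultdict

def expected_score(ra, rb):
    """Expected win probability of A against B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rb - ra) / 400))

def update_ratings(votes, k=32, base=1000.0):
    """Fold a stream of pairwise preference votes into Elo ratings.

    votes: iterable of (winner, loser) model-name pairs, in vote order.
    """
    ratings = defaultdict(lambda: base)
    for winner, loser in votes:
        e_win = expected_score(ratings[winner], ratings[loser])
        delta = k * (1 - e_win)
        ratings[winner] += delta  # winner gains
        ratings[loser] -= delta   # loser loses the same amount (zero-sum)
    return dict(ratings)

# Hypothetical votes: each tuple is (preferred model, other model).
votes = [("model-a", "model-b"), ("model-a", "model-c"), ("model-b", "model-c")]
ratings = update_ratings(votes)
leaderboard = sorted(ratings, key=ratings.get, reverse=True)
```

Because each update is zero-sum, new models can join at the base rating anytime without distorting existing scores, which is what "open submissions" requires.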
04 / 08  ·  THE STACK

What full-stack AI for mechanical
engineering actually requires

L1 · Valid geometry from text — ~80% solved
L2 · Dimensional accuracy + constraints — ~35% solved
L3 · Manufacturability (DFM-aware output) — ~5% solved
L4 · Physics-valid under load — ~10% solved
L5 · Multi-part assemblies — ~2% solved
L6 · Full product from specification — 0% solved

Estimates based on current SOTA across the 173 papers reviewed. L1 validity: the best models achieve ~80–93% valid geometry on simple shapes.

05 / 08  ·  MISSING PIECES

The three gaps nobody has closed

Manufacturability · No model checks whether a generated part can actually be made. Wall thickness, undercuts, tolerances, process-specific constraints — all ignored. A generated part that looks correct may be physically impossible to manufacture.
Cross-model eval · Sequence-based models (Text2CAD), code-based (CAD-Coder), and B-rep-direct (BrepGen) have never been compared on the same benchmark. We don't know which paradigm wins, or when.
Academic vs. commercial · Zoo, AdamCAD, and CADGPT have never appeared in any academic benchmark table, and academic SOTA models never appear in commercial comparisons. Nobody has done both.
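To make the manufacturability gap concrete, here is roughly what the missing check looks like at its simplest: a rule-based wall-thickness screen. A sketch only, with common rules of thumb as limits (not values from any tool named above); extracting thickness samples from the geometry itself, e.g. by ray casting or a medial-axis transform, is a separate and harder step not shown here:

```python
# Illustrative minimum wall thickness per process, in mm.
# These are typical rules of thumb, not authoritative limits.
MIN_WALL_MM = {
    "injection_molding": 1.0,
    "cnc_milling": 0.8,
    "fdm_printing": 1.2,
    "sheet_metal": 0.5,
}

def check_wall_thickness(measured_walls_mm, process):
    """Return the wall-thickness samples that violate the process minimum.

    measured_walls_mm: thickness samples taken from the generated geometry.
    process: key into MIN_WALL_MM selecting the manufacturing process.
    """
    limit = MIN_WALL_MM[process]
    return [t for t in measured_walls_mm if t < limit]

# A part with one 0.6 mm wall fails the injection-molding screen.
violations = check_wall_thickness([2.4, 0.6, 1.1], "injection_molding")
```

Even this trivial screen is absent from today's text-to-CAD pipelines; a generated part sails through with walls no molder would accept.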
06 / 08  ·  CAD ARENA

The benchmark that drives progress

The history of ML is clear: ImageNet didn't just measure vision, it created it. SWE-bench didn't just measure coding agents, it shaped their development. A good benchmark is a forcing function.

200 benchmark prompts
13+ models evaluated
4 difficulty tiers
Open submissions
First benchmark to compare academic and commercial models side by side on the same fixed prompt set. Automatic validity + geometry metrics. Human preference voting. Living leaderboard.
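As a sketch of what the "automatic validity + geometry metrics" half could compute: sample point clouds from the generated and reference parts and score their symmetric Chamfer distance. This assumes NumPy and uniformly sampled surface points; real pipelines would add validity checks (watertightness, self-intersection) that are out of scope here:

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point clouds a (N,3) and b (M,3)."""
    # Pairwise squared distances via broadcasting -> shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    # Mean nearest-neighbor distance in both directions.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

# Illustrative check with synthetic clouds standing in for sampled parts.
rng = np.random.default_rng(0)
ref = rng.random((256, 3))              # points from the reference part
gen_good = ref.copy()                   # a perfect generation
gen_off = ref + np.array([0.1, 0.0, 0.0])  # systematically offset generation

assert chamfer_distance(ref, gen_good) == 0.0  # identical clouds score 0
assert chamfer_distance(ref, gen_off) > 0.0    # offset clouds score worse
```

A fixed prompt set plus a metric like this gives the automatic half of the leaderboard; human preference votes cover what geometry metrics miss.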
07 / 08  ·  THE VISION

Where this ends up

A mechanical engineer describes what they need in plain language. The system generates geometry, checks it against manufacturing constraints, runs simulation, optimizes the design, and outputs a production-ready file.

Not a CAD copilot. A CAD engineer.

We're at step one: getting geometry generation right and measurable. But step one has to be done properly for the rest to follow.