Languages · Safety-critical · Physical AI

ReflexScript: a language for safe robot reflexes

AI has eaten the bits world — search, writing, design, code, customer service — and is generating unprecedented economic value behind a computer screen. But the world's actual problems live in atoms. Factories, vehicles, hospitals, fields, kitchens. The prize for getting AI into the physical world is unfathomably bigger than what we've seen so far. After fifteen years shipping robots — at iRobot, Embark, OMRON and now Think Circuits — I'm convinced of two things at once: physical AI is genuinely happening, and the layer that's missing is the one underneath it — the one that lets you make a safety case. So I've been building it.

Why a new language?

Jensen Huang has been hammering this point in every keynote: every motor is a future robot, every robot needs a brain, and the trillion-dollar tail of AI lives in the physical world. He's right. The opportunity is the size of every appliance, vehicle, and machine that currently runs on hand-written firmware. Vision-language-action models (VLAs) are the most exciting research direction in robotics in a decade.

But that's the research story. The product story is what I spent most of my time on at iRobot, Embark, and OMRON, and the gap between the two is the entire problem. No matter how clever the algorithm at the top, getting it out the door meant proving, line by line, that the system would behave safely on every input the field could throw at it. Demo-grade is not ship-grade. And the four properties you need to ship a system that touches the physical world are properties a black box does not have, no matter how smart it is:

  • Determinism. Same inputs, same outputs, same timing. Every cycle.
  • Explainability. A reviewer, an auditor, or a courtroom can read the artifact and say why it did what it did.
  • Observability. The internal state is visible from outside the box, at runtime, without changing what the box does.
  • Stability. Bounded behavior across the full operating envelope — no quiet drift, no edge cases that surface only in production.

A VLA can be brilliant and still fail all four. That's not a critique of VLAs; it's a job description for something else. ReflexScript is that something else. It's a small, statically analyzable DSL for the layer that has to be deterministic, explainable, observable, and stable: the reflex layer between the learned model and the actuator. It compiles to MISRA-compliant C and to synthesizable SystemVerilog from the same source, so the same contract holds whether the reflex runs on an MCU, a Linux PREEMPT_RT box, or a piece of silicon.

The point isn't to compete with VLAs. The point is to enable them — to give the smart layer on top a trustworthy surface to act through, so it can actually be deployed in the real world without the safety case collapsing back into "hope nobody got close to the robot." The reasoning model emits intent; the reflex enforces what the world is allowed to do regardless of intent. That's the seam I think every physical-AI product is going to need.

Where a VLA is the wrong tool

Three places come up over and over in product review, and no amount of additional model capacity makes them go away.

A warehouse AMR or a collaborative arm has to keep a stand-off from a human every cycle, on a known timing budget. The interesting work — recognising the human, predicting their intent, planning a polite path around them — is good VLA territory. The decision "if any range sensor reads below threshold, the velocity command is clamped to a deceleration profile that stops before contact" is not. It needs to execute at a fixed rate, with a verified worst case, against a known-good fault response.

reflex avoid @(rate(500Hz), wcet(60us), bounded) {
  input:  ranges: i16[m][8]
  output: v_cmd:  i16[mps]
  loop {
    let min_d: i16[m] = 32767
    for i in 0..7 {
      if (ranges[i] < min_d) { min_d = ranges[i] }
    }
    v_cmd = clamp(min_d - 200, 0, 300)
  }
}

What the smart layer can't promise: that v_cmd is bounded by sensor reading × deceleration profile on every cycle. The reflex can — the compiler proves the budget and the branch coverage before it ships.

An aerial vehicle's inner attitude loop runs at 1 kHz and has roughly a millisecond to turn three gyro readings into four motor PWM outputs. A planner stalling for 500 ms is a crash. The control law itself is twenty lines; the engineering effort is in proving it closes the loop on time, every time, on the target MCU.

reflex attitude_inner @(rate(1000Hz), wcet(80us), bounded) {
  input:  gyro: i16[radps][3], cmd: i16[radps][3]
  output: pwm:  u16[4]
  state:  i_err: i32[3] = [0, 0, 0]
  loop {
    for ax in 0..2 {
      let e = cmd[ax] - gyro[ax]
      i_err[ax] = clamp(i_err[ax] + e, -10000, 10000)
    }
    // mixer: PI terms + base thrust → 4 motors
  }
}

What the smart layer can't promise: that the loop closes at 1 kHz with a verified worst case of 80 µs on this MCU. The compiler can: the bound is computed against the target's instruction-timing model, and the build is refused if the reflex doesn't fit.
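To make the bounded-state part of that loop concrete, here is a hand-written C sketch of the integrator update only. It is not the compiler's output, and it leaves the mixer out because the listing above does not show it. The names clamp_i32, integrate_rate_error, AXES, and I_ERR_LIMIT are hypothetical.

```c
#include <stdint.h>

#define AXES 3
#define I_ERR_LIMIT 10000  /* same bound as the clamp in the reflex above */

/* Saturate v to the closed interval [lo, hi]. */
static int32_t clamp_i32(int32_t v, int32_t lo, int32_t hi)
{
    if (v < lo) { return lo; }
    if (v > hi) { return hi; }
    return v;
}

/* One cycle of the integrator update: accumulate the rate error on each
 * axis and saturate it. The loop always runs exactly AXES iterations. */
static void integrate_rate_error(const int16_t cmd[AXES],
                                 const int16_t gyro[AXES],
                                 int32_t i_err[AXES])
{
    for (int ax = 0; ax < AXES; ax++) {
        /* Widen to 32 bits first so the i16 difference cannot overflow. */
        int32_t e = (int32_t)cmd[ax] - (int32_t)gyro[ax];
        i_err[ax] = clamp_i32(i_err[ax] + e, -I_ERR_LIMIT, I_ERR_LIMIT);
    }
}
```

Because the integrator is saturated on every cycle, its magnitude never exceeds 10000 regardless of how long the error persists, which is the kind of invariant a static analysis can rely on.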

The e-stop contract is a Boolean implication: if any of {door, scanner, watchdog, dual e-stop buttons} trips, motors disable and brakes engage. That has to be true on every input combination, not just the ones the programmer thought of. ReflexScript's safety block makes the contract part of the source, and the compiler verifies it exhaustively. The example below wires in the door, the watchdog, and the two e-stop buttons; a scanner would add one more input term.

reflex e_stop @(rate(2000Hz), wcet(30us), bounded) {
  input:  estop_a: bool, estop_b: bool,
          door_closed: bool, watchdog_ok: bool
  output: motors_enable: bool, brake_engage: bool

  safety {
    require: {
      (!estop_a || !estop_b || !door_closed || !watchdog_ok)
        -> (motors_enable == false),
      motors_enable == false -> brake_engage == true,
    }
  }

  loop {
    let fault = !estop_a || !estop_b || !door_closed || !watchdog_ok
    motors_enable = !fault
    brake_engage  =  fault
  }
}

What the smart layer can't promise: that the implication holds on every reachable input. The compiler can — the safety block is discharged exhaustively before the binary exists.
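With four Boolean inputs, "exhaustively" means sixteen cases, which is small enough to enumerate by hand. The C sketch below mirrors the loop body of the reflex and checks both implications from the safety block on every combination. It illustrates the idea only and makes no claim about how the ReflexScript compiler discharges the proof; e_stop_step, estop_out_t, and count_contract_violations are hypothetical names.

```c
#include <stdbool.h>

typedef struct {
    bool motors_enable;
    bool brake_engage;
} estop_out_t;

/* Mirrors the loop body of the e_stop reflex above. */
static estop_out_t e_stop_step(bool estop_a, bool estop_b,
                               bool door_closed, bool watchdog_ok)
{
    bool fault = !estop_a || !estop_b || !door_closed || !watchdog_ok;
    estop_out_t out = { .motors_enable = !fault, .brake_engage = fault };
    return out;
}

/* Enumerate all 2^4 input combinations and count violations of the two
 * implications in the safety block. Returns 0 when the contract holds. */
static unsigned count_contract_violations(void)
{
    unsigned violations = 0U;
    for (unsigned bits = 0U; bits < 16U; bits++) {
        bool estop_a     = (bits & 1U) != 0U;
        bool estop_b     = (bits & 2U) != 0U;
        bool door_closed = (bits & 4U) != 0U;
        bool watchdog_ok = (bits & 8U) != 0U;
        estop_out_t out = e_stop_step(estop_a, estop_b, door_closed, watchdog_ok);

        bool any_fault = !estop_a || !estop_b || !door_closed || !watchdog_ok;
        /* An implication p -> q is violated exactly when p && !q. */
        if (any_fault && out.motors_enable) { violations++; }
        if (!out.motors_enable && !out.brake_engage) { violations++; }
    }
    return violations;
}
```

Each added Boolean input doubles the number of cases, so brute-force enumeration like this only stays practical for small input spaces.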

How ReflexScript delivers it

The shape of the language falls out of those four properties. If everything has to be statically decidable, a lot of features I might have wanted are gone — and the ones that survive earn their keep.

  • No dynamic allocation, no recursion, no unbounded loops. Every loop has a compile-time iteration bound. Determinism falls out.
  • WCET, stack, and memory are first-class compile-time facts. Computed against the target ISA's instruction-timing model, not benchmarked. Stability falls out.
  • Units and dimensions are first-class. i16[m], i16[mps], i16[radps] — you cannot add a metre to a metre-per-second, and the type error is on the line that did it. Explainability falls out.
  • safety blocks with implication syntax. Write estop_active -> brake_engage == true and have it verified exhaustively over the reachable input space, or by Monte Carlo when the space is too large.
  • Same source, two targets. MISRA-compliant C for the MCU/Linux side, synthesizable SystemVerilog for the FPGA side. The reflex's contract doesn't change with the substrate. Observability is wired in: every input, output, and state field is named and exposed.
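The rate and wcet numbers in the decorators also allow a quick back-of-envelope check. The C sketch below derives the period implied by a declared rate and the CPU share implied by a declared worst case. It is plain arithmetic on the declared values, not a model of the compiler's WCET pass; reflex_budget_t, period_us, utilization_permille, and total_utilization_permille are hypothetical names.

```c
#include <stdint.h>

typedef struct {
    uint32_t rate_hz;  /* value from the rate(...) decorator */
    uint32_t wcet_us;  /* value from the wcet(...) decorator */
} reflex_budget_t;

/* Period in microseconds implied by the declared rate. */
static uint32_t period_us(reflex_budget_t b)
{
    return 1000000U / b.rate_hz;
}

/* CPU share in parts per thousand:
 * (wcet_us * rate_hz / 1e6) * 1000 = wcet_us * rate_hz / 1000. */
static uint32_t utilization_permille(reflex_budget_t b)
{
    return (b.wcet_us * b.rate_hz) / 1000U;
}

/* Sum of the shares of n reflexes, in parts per thousand. */
static uint32_t total_utilization_permille(const reflex_budget_t *b, uint32_t n)
{
    uint32_t total = 0U;
    for (uint32_t i = 0U; i < n; i++) {
        total += utilization_permille(b[i]);
    }
    return total;
}
```

Plugging in the three reflexes from earlier gives 30, 80, and 60 parts per thousand for avoid (500 Hz, 60 µs), attitude_inner (1000 Hz, 80 µs), and e_stop (2000 Hz, 30 µs). If all three shared one core, which is an assumption made here for illustration, they would claim 17% of it in the worst case.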

What it looks like

Here's an obstacle-avoidance reflex in full — eight range measurements in, forward and angular velocity commands out. The decorator says: 500 Hz, under 60 µs, 256-byte stack, bounded.

reflex avoid @(rate(500Hz), wcet(60us), stack(256bytes), bounded) {
  input:  ranges: i16[m][8], v_meas: i16[mps]
  output: v_cmd: i16[mps], w_cmd: i16[radps]
  state:  last_w: i16[radps] = 0

  loop {
    let min_d: i16[m] = 32767
    for i in 0..7 {
      if (ranges[i] < min_d) { min_d = ranges[i] }
    }

    v_cmd = clamp(min_d - 200, 0, 300)
    if      (min_d < 150) { w_cmd = 120 }
    elif    (min_d < 250) { w_cmd =  60 }
    else                  { w_cmd =   0 }

    last_w = w_cmd
  }

  tests {
    test near_wall inputs: { ranges = [100,100,100,100,100,100,100,100], v_meas = 0 },
                  expect: { v_cmd = 0, w_cmd = 120 }
    test clear    inputs: { ranges = [1000,1000,1000,1000,1000,1000,1000,1000], v_meas = 0 },
                  expect: { v_cmd = 300, w_cmd = 0 }
  }
}

If the loop body is too expensive on the target ISA, you get a build error, not a missed deadline in the field. The inline tests block runs one cycle of the reflex against fixed inputs and checks outputs — same harness in C and Verilog.
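For readers who think in C, the two inline test vectors can be replayed against a hand-written equivalent of the loop body. This is a sketch, not the compiler's generated code. The names avoid_step, avoid_out_t, and clamp_i16 are hypothetical, and the v_meas input and last_w state are left out because neither affects the outputs in the listing above.

```c
#include <stdint.h>

#define N_RANGES 8

typedef struct {
    int16_t v_cmd;  /* forward velocity command */
    int16_t w_cmd;  /* angular velocity command */
} avoid_out_t;

/* Saturate a widened value to [lo, hi] and narrow it back to 16 bits. */
static int16_t clamp_i16(int32_t v, int16_t lo, int16_t hi)
{
    if (v < lo) { return lo; }
    if (v > hi) { return hi; }
    return (int16_t)v;
}

/* One cycle of the obstacle-avoidance logic. */
static avoid_out_t avoid_step(const int16_t ranges[N_RANGES])
{
    /* Fixed-bound scan for the closest range reading. */
    int16_t min_d = INT16_MAX;
    for (int i = 0; i < N_RANGES; i++) {
        if (ranges[i] < min_d) { min_d = ranges[i]; }
    }

    avoid_out_t out;
    /* Speed falls with proximity and is saturated to [0, 300]. */
    out.v_cmd = clamp_i16((int32_t)min_d - 200, 0, 300);

    /* Turn harder the closer the nearest obstacle is. */
    if (min_d < 150) {
        out.w_cmd = 120;
    } else if (min_d < 250) {
        out.w_cmd = 60;
    } else {
        out.w_cmd = 0;
    }
    return out;
}
```

Feeding it eight readings of 100 yields v_cmd = 0 and w_cmd = 120, and eight readings of 1000 yields v_cmd = 300 and w_cmd = 0, matching the near_wall and clear expectations in the tests block.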

Try it in the browser

The whole compiler — lexer, parser, type-checker, WCET pass, code generator — is built to WebAssembly and embedded below. Pick an example on the left (the safety/ and robotics/ folders have the substantial ones), edit it, hit Compile, and the generated C or Verilog appears as a new tab. Diagnostics carry the license banner plus warnings, errors, and (with Strict WCET ticked on the right) the worst-case timing report. Ctrl/Cmd + Enter compiles without leaving the editor.


Muscle memory: what this unlocks past safety

Safety is the floor. The interesting ceiling is everything else the same layer unlocks.

Think about how you hold a wet glass versus a fragile egg. The reasoning is the same (don't spill, don't crush), but the actuation is millisecond-fast and entirely below your conscious attention. You don't plan harder for the egg; you delegate the moment-to-moment control to a layer that runs much faster than your prefrontal cortex, and you don't burn attention on it. The brain is a hierarchy of these. The planning layer can be slow and approximate because the layer underneath is fast and exact.

A humanoid robot will need the same architecture. The grasp controller for a wet glass, which has to pick the right grip force, the right finger trajectory, and the right thumb counter-pressure in the milliseconds before the glass slips, isn't a job for a VLA. It's a job for code that runs on the MCU bolted to the hand, or on a piece of silicon that is the hand. And there will be many of those reflexes: one per skill, one per actuator, eventually generated by the higher layer learning to push computation downward.

The pitch for ReflexScript past today is that it's the target language for that downward push. The high-level model decides which reflex to summon; ReflexScript is what gets compiled, verified, and burned into the actuator — MCU, FPGA, or, eventually, custom silicon. Robots writing their own muscle memory, not just for safety but for the energy efficiency and the capability that come from running the right code in the right place. The reflex layer stops being something humans write by hand. It becomes part of what gets learned.

Products built on this stack

ReflexScript is the language. Two products sit on it, at opposite ends of the audience.

About the project. ReflexScript is a Think Circuits LLC / Mirror AI research artifact. The WebAssembly build above is an evaluation release — the license banner in the IDE is the canonical copy of the terms. Email me if you'd like to talk about it: nerd256@gmail.com.