This is real CSS from one of our example apps:
@keyframes cube-spin {
  0% { transform: rotateX(-18deg) rotateY(24deg) rotateZ(0deg); }
  45% { transform: rotateX(44deg) rotateY(188deg) rotateZ(5deg); }
  100% { transform: rotateX(342deg) rotateY(384deg) rotateZ(0deg); }
}
.cube-app {
  background-image: linear-gradient(135deg, #12130f 0%, #171b14 45%, #251a15 100%);
  perspective: 155vh;
  perspective-origin: 50% 30%;
}
It spins a 3D cube — perspective projection, backface culling, translucent glass faces, a grid floor drawn with repeating linear gradients — at 42 frames per second on an AMOLED display.
It runs on an ESP32-S3: a microcontroller that costs about three dollars. No GPU. No operating system. No browser. No JavaScript engine. 512 KB of RAM.
CSS was not supposed to be here. Not on a tiny microcontroller, not on a watch-sized screen, not in a place with no DOM engine and no comforting heap of desktop memory to hide inside. And yet here it is.
We have stared at that cube longer than we would like to admit. It is the demo that hooked us, because everything we believe about this craft is sitting right there in its smooth rotation. This is the first post on the gea blog, so let us back up and explain what on earth is going on.
The runtime was always the problem
It started with an ESP32-S3 board, and one project led to the next. The obvious path was an embedded JavaScript SDK: write JS, ship an interpreter engine to the chip, run your app on top. We even got one running on hardware it didn't support yet, adding the device support ourselves. Fine. That is the fun part.
Then we tried to actually build something, and the performance was dire. Interpreting JavaScript on a chip like this is just slow — the stock firmware literally paints the screen top to bottom over a few seconds. That is not a UI; that is a progress bar pretending to be one. You feel the interpreter sitting between your intent and the silicon, taxing every frame.
And here is the thing we could not unsee: if the runtime is the problem, remove the runtime.
We refused to learn a new mental model
What we wanted was almost insultingly simple. Write TypeScript and CSS. Have it run native. Build an embedded app the exact way you build for the web — no new paradigm to memorize at the altar of the hardware. You already know the language. That should be enough.
That conviction already had a name on the web: geaJS, our frontend framework — the same mental model, a direct competitor to React and friends — with the motto "JavaScript is enough." geastack came after it, built on top of geaJS, to carry that model onto hardware.
But a motto is just a mood until something compiles. So we sat down and wrote a TypeScript-to-C++ compiler. The first results were astonishing enough that we kept pulling the thread, and it grew into geatsc — it takes any TypeScript app and compiles it to native C++ ahead of time. No interpreter, no VM left on the device. Your JSX and your CSS become machine code the chip runs directly, the way it always wanted to.
Which brings us back to the cube, and to what the project — gea — has become since.
What gea is
gea is a framework for building apps for small devices — smartwatch-sized AMOLED screens, round rotary dials, 7-inch panels — the way you'd build them for the web. You write TSX components with reactive state, you style them with plain CSS files, and the toolchain compiles the whole thing to native C++. The binary that gets flashed to the chip contains your app's logic as machine code, plus a runtime that is closer in spirit to a game engine than to a browser.
A modern browser engine is tens of millions of lines of code. Our entire style and layout engine is about six thousand lines of C++. The reason that's possible — and the reason any of this fits on a microcontroller — is a decision we made early: don't interpret CSS, compile it.
We did not port a browser to a chip. That would be the boring impossible version. We taught the chip just enough CSS to be useful.
The part we didn't ship
There is no CSS file on the chip. There is no parser sitting in a hot loop reading selectors. When you build a gea app, we let the normal frontend toolchain do what it is good at: CSS files are bundled, imports are resolved, assets are copied, and the final stylesheet is collected. Then the compiler — geatsc, with a gea plugin — walks that CSS and folds every rule into the generated program. What comes out the other end looks like this:
gea::embedded::ui::StyleSheet::instance()
    .registerRule("cube-app", "background-color", "#12130f");
gea::embedded::ui::StyleSheet::instance()
    .registerKeyframeRule("cube-spin", 0, "transform",
                          "rotateX(-18deg) rotateY(24deg) rotateZ(0deg)");
Those registrations execute exactly once, at boot. Values are parsed into typed storage on the spot: colors become the panel's native RGB565, angles become tenths of degrees in an int16_t, scales become permille. From that moment on, CSS is not text anymore. It's data. The device does not need to understand CSS syntax. It only needs to apply the right value at the right time.
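To make the "text becomes data" step concrete, here is a minimal sketch of the three conversions named above. The helper names (packRgb565, parseHexColor, degreesToTenths, scaleToPermille) are hypothetical and are not taken from the gea runtime; the sketch only assumes the storage formats the paragraph describes.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical helpers, not the gea API. Each one converts a CSS value into
// the compact typed form described in the text.

// Pack 8-bit channels into RGB565 (5 bits red, 6 bits green, 5 bits blue).
constexpr std::uint16_t packRgb565(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return static_cast<std::uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

// Value of one hex digit, or -1 if the character is not a hex digit.
inline int hexDigit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Parse "#rrggbb" into RGB565. Any other shape yields std::nullopt.
inline std::optional<std::uint16_t> parseHexColor(std::string_view text) {
    if (text.size() != 7 || text[0] != '#') return std::nullopt;
    int channel[3] = {0, 0, 0};
    for (int i = 0; i < 3; ++i) {
        const int hi = hexDigit(text[1 + 2 * i]);
        const int lo = hexDigit(text[2 + 2 * i]);
        if (hi < 0 || lo < 0) return std::nullopt;
        channel[i] = hi * 16 + lo;
    }
    return packRgb565(static_cast<std::uint8_t>(channel[0]),
                      static_cast<std::uint8_t>(channel[1]),
                      static_cast<std::uint8_t>(channel[2]));
}

// Degrees to tenths of a degree, rounded. An int16_t holds roughly +/-3276 degrees.
inline std::int16_t degreesToTenths(double degrees) {
    const long tenths = std::lround(degrees * 10.0);
    assert(tenths >= INT16_MIN && tenths <= INT16_MAX);
    return static_cast<std::int16_t>(tenths);
}

// Scale factor to permille: scale(1) is stored as 1000.
inline std::int16_t scaleToPermille(double scale) {
    return static_cast<std::int16_t>(std::lround(scale * 1000.0));
}
```

After this step the string "#12130f" is a single 16-bit word and "384deg" is the integer 3840, so nothing downstream needs a tokenizer.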
Every node in the UI tree carries a ComputedStyle: a fixed-layout struct of about 520 bytes. Not a hash map, not a property bag, but a plain struct with every supported property pre-allocated as a typed field. Looking up border-radius is a struct field read. Setting opacity is writing a uint8_t. There is no per-property allocation, ever, which matters a lot on a chip where the largest free block of internal SRAM at render time can be 8 kilobytes.
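As an illustration of what "a plain struct with typed fields" can look like, here is a heavily reduced sketch. The field names and the struct name ComputedStyleSketch are assumptions made for this example; the real ComputedStyle is described as about 520 bytes and covers far more properties.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Illustrative only: names and field selection are assumptions, and this is
// a small fraction of what a full style struct would carry.
struct ComputedStyleSketch {
    std::uint16_t backgroundColor = 0;             // RGB565
    std::uint16_t textColor = 0xFFFF;              // RGB565
    std::uint8_t  opacity = 255;                   // 0 = transparent, 255 = opaque
    std::uint8_t  borderRadius[4] = {0, 0, 0, 0};  // one entry per corner, in pixels
    std::int16_t  rotateTenths[3] = {0, 0, 0};     // X, Y, Z in tenths of a degree
    std::int16_t  scalePermille = 1000;            // 1000 means scale(1)
};

// No pointers, no heap: the struct can be copied with memcpy and laid out statically.
static_assert(std::is_trivially_copyable<ComputedStyleSketch>::value,
              "style must stay a plain block of bytes");
static_assert(std::is_standard_layout<ComputedStyleSketch>::value,
              "style must keep a fixed layout");

// Setting opacity from a percentage is one integer write, with no allocation.
inline void fadeTo(ComputedStyleSketch& style, unsigned percent) {
    assert(percent <= 100);
    style.opacity = static_cast<std::uint8_t>(percent * 255u / 100u);
}
```

Because every property has a fixed offset, reading or writing one is ordinary field access, which is the property the paragraph above relies on.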
What "running CSS" actually means
It would be easy to claim CSS support by handling color and width and calling it a day. We didn't want a toy subset; we wanted the CSS you actually write. At the same time, a microcontroller cannot afford to be general the way a desktop browser can, so every feature has to earn its place: does it help people build real embedded apps, can it be represented compactly, can it be rendered predictably? Here's what the runtime understands today:
- Flexbox, implemented from scratch in about 1,300 lines: direction, wrap, grow/shrink, gap, justify-content, align-items, the lot. Plus a small grid mode (up to 8 tracks per axis), absolute/fixed positioning, and z-index.
- Selectors, matched at runtime against the live tree: classes, element names, descendant and direct-child combinators, ::before/::after, active states. Parsed selectors are cached so matching never re-tokenizes a string.
- Animations and transitions: @keyframes with delay, duration, iteration counts, direction, fill modes, and cubic-bézier easing. A small animation engine ticks once per frame and interpolates colors, angles, and scalars directly into the typed style fields.
- Transforms in 2D and 3D: rotate/rotateX/Y/Z, translate, scale, transform-origin, perspective, perspective-origin, backface-visibility. This is how the cube is a cube and not six flat rectangles pretending.
- Custom properties: var(--accent) with fallbacks, plus calc(), min(), max(), and clamp(). Variables are dependency-tracked: when --accent changes, only the nodes that actually reference it recompute, not the whole tree.
- Media queries, evaluated at runtime against the real panel: min-width, orientation, aspect-ratio, resolution. On the web this is for resizable windows; here it's how a single app ships to a 410×502 portrait AMOLED, a 480×480 round dial, and a 1280×800 panel without three stylesheets.
- The paint vocabulary you'd expect: linear and radial gradients with multiple stops, border-radius per corner, box-shadow (including inset), opacity, filter: blur(), overflow, text-overflow: ellipsis, vw/vh/vmin/vmax units.
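The dependency tracking mentioned for custom properties can be pictured with a small sketch. VariableTracker and its methods are hypothetical names, and the standard containers are used only for brevity; a runtime that avoids per-property allocation would presumably use fixed-size tables instead.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical tracker, not gea's implementation. It maps each custom
// property name to the ids of the nodes whose styles read it through var().
class VariableTracker {
public:
    // Record that `nodeId` reads `name`, e.g. because of `color: var(--accent)`.
    void addDependency(const std::string& name, std::uint16_t nodeId) {
        dependents_[name].push_back(nodeId);
    }

    // Store a new value and return the nodes that must recompute.
    // Writing the same value again produces no work.
    std::vector<std::uint16_t> set(const std::string& name, const std::string& value) {
        const auto current = values_.find(name);
        if (current != values_.end() && current->second == value) return {};
        values_[name] = value;
        const auto deps = dependents_.find(name);
        return deps == dependents_.end() ? std::vector<std::uint16_t>{} : deps->second;
    }

private:
    std::unordered_map<std::string, std::string> values_;
    std::unordered_map<std::string, std::vector<std::uint16_t>> dependents_;
};
```

Changing --accent in this sketch returns only the nodes registered against --accent, so a tree of several hundred nodes is not walked for a single variable write.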
Text gets the same treatment as everything else: fonts are rasterized at build time into grayscale atlases with per-pixel coverage, so glyphs come out anti-aliased on device without ever touching a TTF parser at runtime.
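For readers who have not worked with coverage atlases, this is roughly what drawing one glyph pixel involves: the 8-bit coverage value from the atlas blends the text color over whatever is already in the framebuffer. The function names are hypothetical and the rounding choice is ours; the sketch assumes RGB565 on both sides, as described earlier in the post.

```cpp
#include <cassert>
#include <cstdint>

// Blend one channel. Coverage 0 keeps the background, 255 replaces it,
// and values in between mix the two with rounding.
constexpr unsigned blendChannel(unsigned bg, unsigned fg, unsigned coverage) {
    return (fg * coverage + bg * (255u - coverage) + 127u) / 255u;
}

// Blend an RGB565 text color over an RGB565 background pixel using the
// 8-bit coverage value stored in a grayscale glyph atlas.
constexpr std::uint16_t blendGlyphPixel(std::uint16_t bg, std::uint16_t fg,
                                        std::uint8_t coverage) {
    return static_cast<std::uint16_t>(
        (blendChannel((bg >> 11) & 0x1Fu, (fg >> 11) & 0x1Fu, coverage) << 11) |
        (blendChannel((bg >> 5) & 0x3Fu, (fg >> 5) & 0x3Fu, coverage) << 5) |
        blendChannel(bg & 0x1Fu, fg & 0x1Fu, coverage));
}
```

Anti-aliasing falls out of the intermediate coverage values along glyph edges; no font file is consulted at this point.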
It is not "all of CSS." It is CSS as an embedded UI language.
Keyframes were the moment it felt real
Static styles are useful, but animation is where it starts to feel like CSS. A watch hand rotating smoothly. A loading surface pulsing. A cube turning on a tiny AMOLED screen.
Keyframes are parsed during the build and registered as animation data. At runtime, the animation engine samples the active keyframe values and applies them through the same style system as everything else. A small app can say:
@keyframes spin {
  from { transform: rotate(0deg); }
  to { transform: rotate(360deg); }
}
.cube {
  animation: spin 4s linear infinite;
}
And the chip does the rest. That distinction is the whole project in miniature: preserve the expressive surface, compile away the expensive parts.
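Sampling registered keyframes can be sketched in a few lines. This version is an assumption-laden simplification: it handles a single angle, interpolates linearly between stops, and ignores easing, direction, and fill modes. AngleKeyframe and sampleAngle are hypothetical names; the stop values are the rotateY angles from the cube-spin keyframes at the top of this post, stored as tenths of a degree.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One keyframe stop for a single angle, already in typed form.
struct AngleKeyframe {
    std::uint16_t offsetPermille;  // 0 is 0%, 1000 is 100%
    std::int16_t tenths;           // angle in tenths of a degree
};

// Linearly sample a list of stops sorted by offset.
// `progressPermille` is the position inside one iteration, 0..1000.
inline std::int16_t sampleAngle(const AngleKeyframe* frames, std::size_t count,
                                std::uint16_t progressPermille) {
    assert(count >= 1);
    if (progressPermille <= frames[0].offsetPermille) return frames[0].tenths;
    for (std::size_t i = 1; i < count; ++i) {
        const AngleKeyframe& a = frames[i - 1];
        const AngleKeyframe& b = frames[i];
        if (progressPermille <= b.offsetPermille) {
            // Integer math only: widen to 32 bits so the product cannot overflow.
            const std::int32_t span = b.offsetPermille - a.offsetPermille;
            const std::int32_t local = progressPermille - a.offsetPermille;
            const std::int32_t delta = std::int32_t(b.tenths) - a.tenths;
            return static_cast<std::int16_t>(a.tenths + delta * local / span);
        }
    }
    return frames[count - 1].tenths;  // past the last stop
}

// rotateY stops of cube-spin: 24deg at 0%, 188deg at 45%, 384deg at 100%.
constexpr AngleKeyframe kCubeSpinY[] = {{0, 240}, {450, 1880}, {1000, 3840}};
```

Halfway through the first segment (22.5% of the animation) the sketch yields 1060, that is 106 degrees, midway between 24 and 188. The result would be written straight into a typed style field.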
From a class change to photons
Supporting the properties is half the story. The other half is doing something about them 42 times a second on a 240 MHz core.
The renderer is a retained display list. Layout runs, and the tree is recorded into a flat command buffer — fill this rounded rect, blit this glyph run, fill this transformed quad with a gradient. When state changes, we don't redraw the screen; we track dirty regions (up to 32 coalesced rectangles) and replay only the commands that intersect them, in original paint order.
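A minimal sketch of bounded dirty-region tracking, under our own assumptions: the class and method names are hypothetical, only the cap of 32 comes from the text, and the coalescing policy (merge into the first overlapping rectangle, fold into the last one when full) is a deliberately simple stand-in for whatever the renderer really does.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Rect {
    std::int16_t x, y, w, h;
};

inline bool intersects(const Rect& a, const Rect& b) {
    return a.x < b.x + b.w && b.x < a.x + a.w &&
           a.y < b.y + b.h && b.y < a.y + a.h;
}

// Smallest rectangle that contains both inputs.
inline Rect unite(const Rect& a, const Rect& b) {
    const int left = std::min<int>(a.x, b.x);
    const int top = std::min<int>(a.y, b.y);
    const int right = std::max<int>(a.x + a.w, b.x + b.w);
    const int bottom = std::max<int>(a.y + a.h, b.y + b.h);
    return Rect{static_cast<std::int16_t>(left), static_cast<std::int16_t>(top),
                static_cast<std::int16_t>(right - left),
                static_cast<std::int16_t>(bottom - top)};
}

class DirtyRegions {
public:
    static constexpr std::size_t kMaxRects = 32;  // cap quoted in the text

    void add(const Rect& r) {
        assert(r.w > 0 && r.h > 0);
        for (std::size_t i = 0; i < count_; ++i) {
            if (intersects(rects_[i], r)) {  // coalesce overlapping damage
                rects_[i] = unite(rects_[i], r);
                return;
            }
        }
        if (count_ < kMaxRects) {
            rects_[count_++] = r;
            return;
        }
        // Out of slots: grow the last rectangle. This repaints more than
        // strictly needed but never misses damaged pixels.
        rects_[count_ - 1] = unite(rects_[count_ - 1], r);
    }

    // True if a command with these bounds must be replayed this frame.
    bool touches(const Rect& commandBounds) const {
        for (std::size_t i = 0; i < count_; ++i) {
            if (intersects(rects_[i], commandBounds)) return true;
        }
        return false;
    }

    std::size_t count() const { return count_; }
    void clear() { count_ = 0; }

private:
    std::array<Rect, kMaxRects> rects_{};
    std::size_t count_ = 0;
};
```

During replay, the renderer would walk the command buffer in paint order and skip every command for which touches() is false.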
CSS animations get an even shorter path. When a frame changes only a transform (which is exactly what a @keyframes rotation does), there's no reason to re-record the display list at all. The renderer validates that every retained command is re-projectable (about 14 microseconds), then pushes the existing commands' corner points through the new transform matrix and re-sorts them by depth. For the cube, that's roughly 2.7 ms instead of a full layout-and-record pass. The display list survives; only its geometry moves.
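The shape of that fast path can be sketched as follows. This is not the gea renderer: the types, the single-axis rotation, the float math, and the average-depth sort are all assumptions chosen to keep the example short. What it does show is the idea from the paragraph above: retained quads keep their untransformed corners, and a new frame only re-runs the transform, the perspective divide, and a depth sort.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 {
    float x, y, z;
};

// A retained quad: untransformed corners plus the latest projected result.
struct Quad {
    int id;
    std::array<Vec3, 4> local;
    std::array<Vec3, 4> projected;
    float depth;
};

// Rotate a point about the Y axis.
inline Vec3 rotateY(const Vec3& p, float radians) {
    const float c = std::cos(radians);
    const float s = std::sin(radians);
    return Vec3{p.x * c + p.z * s, p.y, -p.x * s + p.z * c};
}

// CSS-style perspective: a point at depth z is scaled by d / (d - z),
// so points nearer the viewer (larger z) appear larger.
inline Vec3 project(const Vec3& p, float perspective) {
    assert(perspective > p.z);
    const float k = perspective / (perspective - p.z);
    return Vec3{p.x * k, p.y * k, p.z};
}

// Re-project every retained quad for a new angle, then restore paint order.
inline void reproject(std::vector<Quad>& quads, float radians, float perspective) {
    for (Quad& q : quads) {
        float zSum = 0.0f;
        for (std::size_t i = 0; i < q.local.size(); ++i) {
            const Vec3 world = rotateY(q.local[i], radians);
            q.projected[i] = project(world, perspective);
            zSum += world.z;
        }
        q.depth = zSum / 4.0f;
    }
    // Painter's order: farthest first, so nearer translucent faces blend on top.
    std::sort(quads.begin(), quads.end(),
              [](const Quad& a, const Quad& b) { return a.depth < b.depth; });
}

// Helper for the example: a 2x2 square facing the viewer at depth z.
inline Quad makeFace(int id, float z) {
    Quad q{};
    q.id = id;
    q.local = {{{-1.0f, -1.0f, z}, {1.0f, -1.0f, z}, {1.0f, 1.0f, z}, {-1.0f, 1.0f, z}}};
    return q;
}
```

With a front face at z = 1 and a back face at z = -1, a half turn swaps which one is painted last, and no layout or recording step runs in between.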
Then the pixels have to leave the chip. The framebuffer (410×502 at 16 bits per pixel, 411 KB) doesn't fit in the ESP32-S3's 512 KB of internal SRAM alongside everything else, so it lives in external PSRAM. A DMA engine streams dirty rows from PSRAM straight into the QSPI peripheral in 32–64-row chunks, two chunks in flight at a time, while the CPU is already working on the next frame. The panel link runs at 80 MHz quad-SPI, which is about 40 MB/s: enough for a full-screen flush in ~10 ms, or a ceiling of roughly 97 fps if you redraw everything, every frame.
A frame of the spinning cube breaks down like this on device:
| Phase | Time |
|---|---|
| Replay (rasterize the translucent faces) | ~13 ms |
| Flush (DMA to the panel) | ~6 ms |
| Reproject (transform fast path) | ~2.7 ms |
| Layout, dirty tracking, overhead | ~2 ms |
That's ~24 ms, 42 fps, with alpha-blended faces compositing over a gradient backdrop.
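The arithmetic behind those figures is easy to check. The sketch below only restates numbers that appear in this post (the four phase timings, the 410×502 16-bit framebuffer, and the 80 MHz quad-SPI link); the constant and function names are ours.

```cpp
#include <cassert>
#include <cstdint>

// Per-phase timings for one cube frame, in milliseconds, from the table.
constexpr double kReplayMs = 13.0;
constexpr double kFlushMs = 6.0;
constexpr double kReprojectMs = 2.7;
constexpr double kOverheadMs = 2.0;

constexpr double frameMs() {
    return kReplayMs + kFlushMs + kReprojectMs + kOverheadMs;
}

constexpr double framesPerSecond() {
    return 1000.0 / frameMs();
}

// Framebuffer: 410 x 502 pixels at 2 bytes per pixel.
constexpr std::uint32_t kFramebufferBytes = 410u * 502u * 2u;
static_assert(kFramebufferBytes == 411640u, "about 411 KB");

// Quad-SPI moves 4 bits per clock, so 80 MHz carries 40 million bytes per second.
constexpr double kLinkBytesPerSecond = 80e6 * 4.0 / 8.0;

// Time to push every pixel of the framebuffer over the panel link.
constexpr double fullFlushMs() {
    return kFramebufferBytes / kLinkBytesPerSecond * 1000.0;
}
```

The phases sum to 23.7 ms, which is just over 42 frames per second, and a full-screen flush comes out a little above 10 ms. The cube's ~6 ms flush is lower because only dirty rows are sent.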
The wall we hit (and it wasn't the one we expected)
Here's the part that surprised us. We assumed the bottleneck would be arithmetic — all that per-pixel blending and 3D math on a microcontroller. So we did the classic things: SIMD blend loops processing eight pixels per op, sine/cosine lookup tables, splitting rasterization across both cores.
The two-core split helped. The SIMD blend was within measurement noise of the scalar loop.
The renderer isn't compute-bound. It's memory-bound. Every framebuffer write and every node read goes over the bus to external PSRAM, and that bandwidth — shared between the CPU, the cache, and the DMA engine feeding the display — is the actual frame-rate ceiling. Once we understood that, a whole class of "optimizations" went in the bin, and the ones that mattered became obvious: touch fewer bytes. Dirty rectangles, the reproject path, backdrop caches that let opaque backgrounds skip re-rasterization, a line-break cache so scrolling text never re-measures glyphs — every win on this hardware is a story about bytes not moved.
On these devices, style is not separate from performance. A border radius is not just a design choice. A transform eventually becomes pixels moving over SPI and DMA, and the design language and the hardware reality have to meet each other directly. That has been the fun part.
It's not a browser. It's a bridge.
There's no :hover, because these screens have fingers, not mice. The UI tree is capped at 512 nodes, grid stops at 8 tracks per axis, and animations interpolate colors, angles, and scalars rather than arbitrary properties. These are honest constraints, and on a screen the size of a watch face, none of them have bitten us. What's left is, we think, the subset of CSS you actually reach for at 410×502 pixels, and it behaves the way the spec says it should, because we kept the semantics and threw away the interpreter.
We are not trying to make microcontrollers pretend to be laptops. We are trying to let developers bring a familiar mental model to hardware that usually demands a totally different one. Write components. Style them with CSS. Preview them in a browser-hosted simulator. Compile them into native code. Run them on the device. That loop is the product — and because the runtime doesn't know about any one specific board, with display details living behind platform backends, the same app can target the simulator, ESP32 displays, and other native shells without becoming a board-specific rewrite.
Why bother
Because the status quo for embedded UI is hand-positioning widgets from C structs — duplicating colors across headers, rebuilding firmware to move text three pixels, inventing a worse version of stylesheets one enum at a time. And the status quo for nice embedded UI is shipping an entire web runtime and a Linux board to draw a thermostat. Neither felt right. The web's authoring model — components, classes, the cascade, keyframes — is the best UI-description language we have. The web's runtime cost is the only reason it stays out of small hardware.
A compiler dissolves that trade-off. You write CSS; the chip runs machine code.
What comes next
There is still a long road ahead: more selectors, more properties, better layout coverage, more precise browser parity where it matters, and more ruthless embedded-specific optimization where parity would just be waste. This post is the first of a series. Next up: how the TypeScript side works — reactive stores compiled to plain C++ structs, with no garbage collector invited — and a deeper dive into the reproject trick that makes 3D transforms nearly free.
So yes, we taught a $3 chip to run CSS. Not all of CSS. Not browser CSS. Not the infinite cathedral of the web platform. We did not make a three-dollar chip pretend to run CSS — we taught it to run CSS for real, and then it had no trouble spinning a cube at 42 frames per second.
If you've ever wanted display: flex on something you can solder, stick around.