Hardware · May 21, 2026 · 7 min read

Why the DGX Spark is the most important desktop AI computer in a decade

A petaflop-class AI computer with 128GB of coherent memory that sits on a desk and sips wall power is not an incremental workstation — it's a category reset. The NVIDIA DGX Spark moves serious local AI from the data center to the room you're already sitting in.

For most of the deep-learning era, the answer to "where do I run a real model?" has been the same: somewhere else. A rack you don't own, in a building you can't enter, billed by the token through a queue. The NVIDIA DGX Spark — the GB10-based desktop machine previously previewed as Project DIGITS — quietly deletes that assumption. It puts a coherent CPU+GPU AI computer on a desk, plugs it into a normal outlet, and lets it run models that used to demand a server room.

That sounds like a spec bump. It isn't. It's a change in who gets to do serious local AI at all.

From "you need a data center" to "you need a desk"

The DGX Spark is built around the NVIDIA Grace Blackwell GB10 superchip: a 20-core Arm Grace CPU and a Blackwell GPU on a single package, joined by NVLink-C2C, a coherent interconnect that lets the two halves share memory without copying data back and forth across a slow bus. The result is up to roughly one petaFLOP of AI performance (FP4, with sparsity) in a box the size of a small-form-factor desktop.

The number that matters more than the FLOPs is the footprint. This machine runs from a standard wall outlet. No 240V drop, no rack, no acoustic blast from a row of fans. The barrier to running large-model inference locally drops from "lease space in a colocation facility" to "clear a corner of your desk." When the cost of entry falls that far, the set of people who can build with serious models stops being labs with infrastructure budgets and starts being anyone with a workbench.

128GB of unified memory is the real unlock

Consumer GPUs hit a wall long before they run out of compute: they run out of memory. A 24GB or 32GB card cannot hold a large model's weights: a 70-billion-parameter model at 16-bit precision needs about 140GB for the weights alone. The workarounds (aggressive quantization, offloading layers to system RAM over PCIe, splitting across multiple cards) all trade quality or speed for the privilege of fitting.

The DGX Spark ships with 128GB of unified LPDDR5X shared coherently between the Grace CPU and the Blackwell GPU. There is no separate "VRAM" budget to ration against "system RAM." The model lives in one pool that both processors address directly. That is the difference between a machine that can demonstrate a model and a machine that can actually host one.

128 GB: unified coherent memory
~1 PFLOP: AI performance (FP4, sparse)
~200B: parameters on a single unit

Concretely, a single DGX Spark targets models up to around 200 billion parameters — a tier that was, until recently, firmly server-class. The unified memory is what makes that practical instead of theoretical.
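To see why the memory pool, not the FLOPs, sets the ceiling, here is a back-of-the-envelope sketch of weight size versus precision. It counts weights only (no KV cache, activations, or OS overhead), and it assumes the 200-billion-parameter figure refers to 4-bit weights, which the FP4 performance number suggests but the article does not state outright. The function names are illustrative.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in decimal GB."""
    # 1 billion parameters at 8 bits each is 1 GB, so scale by bits / 8.
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: int, memory_gb: float) -> bool:
    """True if the weights alone fit in the given memory pool."""
    return weight_footprint_gb(params_billion, bits_per_weight) <= memory_gb


# 128 GB is the unified pool described above; the precisions are assumptions.
for bits in (16, 8, 4):
    size = weight_footprint_gb(200, bits)
    print(f"200B at {bits}-bit: {size:.0f} GB, fits in 128 GB: {fits(200, bits, 128)}")
```

Under these assumptions a 200B model only fits at 4 bits per weight, with the remaining memory left for everything else the runtime needs. Real headroom depends on context length and the serving stack.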

Power, noise, and the privacy that comes with ownership

A desktop AI computer is only useful if it can actually live on a desk. The DGX Spark draws from a standard wall outlet and fits in the footprint of a small-form-factor box, which means it belongs in an office, a studio, or a spare room rather than a dedicated, cooled, power-conditioned closet. That ordinariness is the point: the hardware disappears into the environment instead of dictating it.

And because the work happens locally, the data stays local too. Your model weights, your prompts, your customers' inputs: none of it has to leave the room to be useful. For anyone working with proprietary code, regulated records, or unreleased research, "the inference never crosses the building's wall" is not a nice-to-have; it's the entire reason to run locally in the first place.

Ownership changes the math

When you rent compute, every experiment has a meter running and your data sits on someone else's disk. When the machine is yours and it's in your room, the marginal cost of trying one more idea is the electricity to run it — and the privacy is structural, not contractual.
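The "electricity to run it" claim is easy to put a number on. The sketch below is the plain power-times-time-times-price calculation. Both the average draw and the electricity price are placeholder assumptions, not measurements or published specifications, so substitute your own meter reading and tariff.

```python
def energy_cost(avg_watts: float, hours: float, price_per_kwh: float) -> float:
    """Electricity cost of a run: power (kW) x time (h) x price per kWh."""
    return avg_watts / 1000 * hours * price_per_kwh


# Placeholder inputs, NOT measured values or published specs.
ASSUMED_AVG_WATTS = 200.0      # hypothetical average draw under load
ASSUMED_PRICE_PER_KWH = 0.15   # hypothetical tariff, in your local currency

cost = energy_cost(ASSUMED_AVG_WATTS, hours=8, price_per_kwh=ASSUMED_PRICE_PER_KWH)
print(f"8-hour run: {cost:.2f} (in the currency of the assumed price)")
```

This covers only the marginal cost of a run. The purchase price of the machine is a separate, fixed cost that this calculation deliberately leaves out.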

One desk today, two for tomorrow

The ceiling isn't fixed at a single box, either. The DGX Spark includes ConnectX networking, which lets two units be linked directly so that one model can be spread across both. Two linked Sparks push the addressable model size up to roughly 405 billion parameters, the kind of frontier-class workload most people still assume requires a cluster. That this scales by adding a second desktop, rather than a second data-center contract, is exactly the reframing this hardware represents.
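A rough check of the two-unit figure, as a sketch: split the weights evenly across the linked units and see what each one has left. It assumes 4-bit weights and a perfectly even split, neither of which the article specifies, and it ignores KV cache and interconnect overhead. The `LinkedUnits` class is hypothetical, not part of any NVIDIA tooling.

```python
from dataclasses import dataclass


@dataclass
class LinkedUnits:
    """Hypothetical model of N identical units sharing one model's weights."""
    units: int
    memory_per_unit_gb: float = 128.0

    def per_unit_share_gb(self, params_billion: float, bits_per_weight: int) -> float:
        # Total weight size in decimal GB, divided evenly across the units.
        total_gb = params_billion * bits_per_weight / 8
        return total_gb / self.units

    def headroom_gb(self, params_billion: float, bits_per_weight: int) -> float:
        # Memory left on each unit after its share of the weights (may be negative).
        return self.memory_per_unit_gb - self.per_unit_share_gb(params_billion, bits_per_weight)


for n in (1, 2):
    cluster = LinkedUnits(units=n)
    print(f"{n} unit(s): {cluster.per_unit_share_gb(405, 4):.2f} GB per unit, "
          f"headroom {cluster.headroom_gb(405, 4):.2f} GB")
```

With these assumptions a 405B model overflows a single unit but leaves each of two linked units some room to spare, which is consistent with the two-unit claim above.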

The broader shift is the one worth watching. For a decade, the gravity of AI pulled everything toward the center — bigger clusters, more centralized capacity, thinner access doled out through APIs. A self-contained Grace Blackwell desktop pulls in the other direction. It puts real, frontier-adjacent capability back in the hands of individuals and small teams, on their own terms, in their own rooms. The most important thing about the DGX Spark isn't any single number on its spec sheet — it's that it makes "run it yourself" a serious answer again.

