New: The global GB10 compute marketplace is live. Lend your Grace Blackwell and keep 85% of every session. Become a provider →
NVIDIA Grace Blackwell GB10 · Available Now

Private LLM inference on real hardware.

Pay-per-hour access to a local GB10 superchip running 70B+ parameter models. No queue. No cloud markup. OpenAI-compatible API — drop in your session token and go.

inference.py
from openai import OpenAI

client = OpenAI(
    base_url="https://gb10.studio/v1",
    api_key="st_your_session_token",
)

response = client.chat.completions.create(
    model="GB10",
    messages=[{
        "role": "user",
        "content": "Explain CUDA memory coalescing",
    }],
    stream=True,
)

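for chunk in response:
    # Some streamed chunks (such as the role-only first chunk) carry no text.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()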
128 GB Unified Memory
1 PFLOP FP4 Performance
NVLink-C2C CPU–GPU Interconnect
70B+ Parameter Models
$1.00 Per Hour
01

OpenAI-Compatible API

Change the base URL and API key — nothing else changes. Works with LangChain, LlamaIndex, Cursor, and any OpenAI SDK client out of the box.
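For clients that do not use an SDK, the request an OpenAI-style client sends can be sketched with the Python standard library alone. This is a minimal sketch, assuming the service follows the usual OpenAI conventions of a `/chat/completions` route under the base URL and a Bearer token in the `Authorization` header; `build_chat_request` is a hypothetical helper name, and the request is built but not sent.

```python
import json
import urllib.request

# Base URL and model name are taken from the example on this page.
BASE_URL = "https://gb10.studio/v1"


def build_chat_request(token: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request.

    Assumption: the route and auth header follow OpenAI conventions.
    """
    body = {
        "model": "GB10",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("st_your_session_token", "Explain CUDA memory coalescing")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a live session token.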

02

No Queue. No Sharing.

Reserve a slot, start a session, get the full chip. Not a slice of a shared cluster. Not a rate-limited API wrapper. The whole machine is yours.

03

Pay by the Minute

Pre-load credits, run inference, stop when done. Billed per minute, rounded up to the next whole minute. No monthly commit. No egress fees. No surprise invoices.
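The per-minute billing can be worked through in a few lines. This is a rough sketch, assuming the $1.00 hourly rate quoted on this page, a per-minute price of one sixtieth of the hourly rate, and rounding of the final charge to the nearest cent; the function names and the cent-rounding rule are illustrative assumptions, not the service's published billing logic.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hourly rate quoted on this page.
HOURLY_RATE = Decimal("1.00")


def billed_minutes(session_seconds: int) -> int:
    """Round a session length up to the next whole minute."""
    return (session_seconds + 59) // 60


def session_cost(session_seconds: int, hourly_rate: Decimal = HOURLY_RATE) -> Decimal:
    """Cost in dollars for one session.

    Assumption: the per-minute price is hourly_rate / 60 and the total
    is rounded to the nearest cent.
    """
    cost = hourly_rate * billed_minutes(session_seconds) / Decimal(60)
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# A 45 minute 10 second session is billed as 46 minutes.
print(billed_minutes(45 * 60 + 10), session_cost(45 * 60 + 10))
```

Under these assumptions a full hour costs exactly the hourly rate, and a session that runs one second past a minute boundary is charged for the extra minute.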

Global Compute Marketplace · Now Open

Own a GB10?
Put it to work.

Your Grace Blackwell sits idle most of the day. List it on the GB10 Studio marketplace and earn from every minute someone else runs inference on it. You set the rate and your hours — we handle billing, payouts, and the customer.

Read the announcement
85% Revenue share — yours
  1. Apply & get verified

    Tell us about your GB10. A $5 application fee keeps the marketplace serious; another $5 is held in escrow against chargebacks.

  2. List your slot

    Point us at your HTTPS endpoint, set an hourly rate and weekly availability. Your slot token is encrypted at rest and rotates on demand.

  3. Earn & cash out

    Keep 85% of every session served. Watch earnings accrue live and request a payout once you clear $25.
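The provider arithmetic from the steps above can be sketched as follows. This assumes the 85% share applies to the price of each session, that the $25 payout threshold is inclusive, and that the application fee and escrow hold are tracked separately; `provider_earnings` and `can_request_payout` are hypothetical names used only for illustration.

```python
from decimal import Decimal

PROVIDER_SHARE = Decimal("0.85")      # revenue share stated on this page
PAYOUT_THRESHOLD = Decimal("25.00")   # minimum balance for a payout


def provider_earnings(session_revenues: list[Decimal]) -> Decimal:
    """Provider's share of a list of session prices, in dollars."""
    total = sum(session_revenues, Decimal("0"))
    return (total * PROVIDER_SHARE).quantize(Decimal("0.01"))


def can_request_payout(balance: Decimal) -> bool:
    """Assumption: a balance of exactly $25.00 already qualifies."""
    return balance >= PAYOUT_THRESHOLD


balance = provider_earnings([Decimal("10.00"), Decimal("20.00")])
print(balance, can_request_payout(balance))
```

With $30.00 of sessions served, the provider's share under these assumptions is $25.50, which clears the payout threshold.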