NVIDIA Grace Blackwell GB10 · Available Now

Private LLM
inference on
real hardware.

Pay-per-hour access to a local GB10 superchip running 70B+ parameter models. No queue. No cloud markup. OpenAI-compatible API — drop in your session token and go.

inference.py
from openai import OpenAI

client = OpenAI(
    base_url="https://gb10.studio/v1",
    api_key="st_your_session_token",
)

response = client.chat.completions.create(
    model="GB10",
    messages=[{
        "role": "user",
        "content": "Explain CUDA memory coalescing",
    }],
    stream=True,
)

for chunk in response:
    # delta.content is None on the final chunk; end="" keeps tokens on one line
    print(chunk.choices[0].delta.content or "", end="")
128 GB Unified Memory
1 PFLOP FP8 Performance
NVLink-C2C CPU–GPU Interconnect
70B+ Parameter Models
$1.00 Per Hour
01

OpenAI-Compatible API

Change the base URL and API key — nothing else changes. Works with LangChain, LlamaIndex, Cursor, and any OpenAI SDK client out of the box.
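Compatibility comes from the wire format: every OpenAI-style client sends the same JSON to the same `/chat/completions` path. A minimal sketch of that request using only the Python standard library — the token is a placeholder, and `chat` is an illustrative helper, not part of any SDK:

```python
import json
import urllib.request

BASE_URL = "https://gb10.studio/v1"
API_KEY = "st_your_session_token"  # placeholder — use your own session token

def chat(prompt: str) -> urllib.request.Request:
    """Build the raw /chat/completions request that any
    OpenAI-compatible client emits — which is why swapping
    the base URL and key is the only change needed."""
    payload = json.dumps({
        "model": "GB10",
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
```

Send it with `urllib.request.urlopen(chat("hello"))`, or let any SDK build the identical request for you.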

02

No Queue. No Sharing.

Reserve a slot, start a session, get the full chip. Not a slice of a shared cluster. Not a rate-limited API wrapper. The whole machine is yours.

03

Pay by the Minute

Pre-load credits, run inference, stop when done. Billed per minute rounded up. No monthly commit. No egress fees. No surprise invoices.
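The arithmetic is simple enough to check yourself — a sketch of per-minute billing at the listed $1.00/hour rate (`session_cost` is an illustrative helper, not part of the service API):

```python
import math

RATE_PER_HOUR = 1.00  # the listed hourly rate

def session_cost(seconds: int) -> float:
    """Bill per minute, rounding a partial minute up."""
    minutes = math.ceil(seconds / 60)
    return round(minutes * RATE_PER_HOUR / 60, 4)

# A 90-second session bills as 2 minutes; a full hour bills as exactly $1.00.
```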