For developers·Powered by Gonka inference

Your fastest path to production for AI apps.

Cheap inference is built into every container. Define your stack in YAML, fork a block, deploy with one command. Zero ops, zero surprises.

Inference & computeFreeright now
CLI1 lineto deploy
BlocksOpen sourceforkable
gonkablocks.yamlvalid
# A chat backed by Gonka inference
name: aurora-chat
type: session
model: qwen3-235b
inputs:
system_prompt:
"You are a precise, concise research assistant."
resources:
cpu: 1
memory_mb: 1024
$ gonkablocks deploydeploys to /chat in 12s
Click, click, done

From terminal to running tool in minutes.

Same idea Render popularised, with free inference baked into every container so you don't have to wire up API keys or worry about per-token cost.

1

Pick a deployment type

Tools, workflows, chats, live APIs, cron, or workers — same platform, six runtimes.

2

Connect code or fork a block

Push your repo with a gonkablocks.yaml, or fork any public block and edit from the browser.

3

Gonkablocks does the rest

Image build, scoped key, public URL, autoscaling, logs, metrics, billing — wired by default.

Deployment types

Whatever you're shipping, there's a runtime for it.

Six block types share one platform: same observability, same secrets system, same free inference baked in.

Tools

One-shot jobs. Take inputs, run a container, return outputs. Perfect for translation, summarization, image generation.

type: job

Workflows

Visual DAGs that wire blocks together. Outputs flow as inputs. Run any block as a node — no glue code.

type: workflow

Chats

Long-lived sessions with shared state, streamed token-by-token. OpenAI-compatible /v1/chat/completions API.

type: session

Live APIs

Always-on HTTP endpoints fronted by autoscaled containers. Per-route auth, scoped keys, audit logs.

type: service

Cron

Scheduled runs of any block. Use it for daily summaries, periodic ingest, drift checks. Same observability as one-off runs.

type: job + cron

Workers

Persistent background processes. Watch a queue, listen on a webhook, keep a model warm.

type: worker
Infrastructure as code

Define your stack in one file.

Wire up multiple blocks, model defaults, autoscaling, and cron schedules in a single gonkablocks.yaml. Validated on every push. Version-controlled with your code. Forkable in one click.

  • Validated

    Manifests are linted server-side before any container starts. Bad input schemas fail fast, not at run time.

  • Version-controlled

    Lives in your repo, reviews on PRs, rolls back like normal code. No clicking through dashboards to undo a mistake.

  • Forkable

    Any public block on the platform can be forked into your account by someone reading the manifest. The community remix loop, by default.

gonkablocks.yamlvalid
# A research workflow plus a chat fronted by a live API
blocks:
- name: research-job
type: workflow
schedule: "0 7 * * *"
cron: daily
- name: aurora-chat
type: session
model: qwen3-235b
autoscale:
min: 0
max: 8
- name: public-api
type: service
route: /v1/agents
auth: scoped-key
One repo, three blocks, one deploy.
Why inference is built in

Free, multi-model, no keys.

Other platforms make you BYO inference key. We bundle a decentralized GPU network into the runtime so a fresh container is already wired to talk to a model — and you don't pay for any of it while we're getting started.

Free for everyone, right now

Every run, every model call — sponsored by Gonka Labs. We'll let you know if/when this changes; for now there's nothing to wire up, no card to enter.

Open weights, real models

Qwen3 235B today, Kimi K2.6 next. Same OpenAI-compatible API surface across every model.

No API keys to manage

Each run gets a scoped key minted on the fly. No .env handling, no rotation, no leaks to grep for.

Local-first dev loop

One command from terminal to running tool.

gonkablocks deploy packages your block, builds the image, mints a scoped key, and gives you a public URL. No browser steps.

Install the CLI
$ curl -fsSL https://blocks.gonka.gg/install.sh | sh

Requires Node.js 20+. Two binaries get installed: gonkablocks and the short alias gbk.

Built-in primitives

The boring stuff, handled.

The platform ships with the things you'd otherwise have to wire up yourself.

Sandboxed Docker

Every block runs isolated. Optional gVisor runtime for full hard-multitenancy.

OpenAI-compatible

Drop-in /v1/chat/completions and /v1/embeddings. Use any client SDK you already have.

Streaming + logs

Server-sent stdout, structured run events, and per-call inference traces in one viewer.

Secrets vault

Per-user encrypted secrets, auto-matched to block input names, never exposed to client code.

Per-run scoped keys

Each run gets a fresh sk-run-* key bounded by a spend cap. Zero standing credentials in containers.

First-class CLI

`gonkablocks deploy`, `exec`, `connect`. Same primitives as the dashboard, scriptable.

Deploy your first block in five minutes.

gonkablocks deploy — that's the whole onboarding.