The fastest transcription framework for builders and tinkerers.

A minimal macOS runtime. Swift services, Bun CLI, TypeScript SDK. On-device, measurable, yours.

Operator view
$ vox perf --client raycast
p50=132ms p95=197ms 35x realtime
$ vox transcribe /tmp/sample.wav
done: 127ms (35x realtime)

Built for products with more than one caller.

Warm, local inference

Swift services host Parakeet locally so apps can warm the model ahead of speech instead of paying a cold-start tax on the first command.
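A minimal sketch of that warm-ahead pattern: trigger a warm-up when the user is about to speak (say, on hotkey press), but skip it if the model was warmed recently. The SDK method name (`client.warm`) and the TTL are assumptions for illustration, not the actual vox API.

```typescript
const WARM_TTL_MS = 60_000; // assume the model stays hot for ~60s after a warm-up

// Pure decision: warm only if we have never warmed, or the last warm has expired.
function shouldWarm(lastWarmAt: number | null, now: number, ttlMs = WARM_TTL_MS): boolean {
  return lastWarmAt === null || now - lastWarmAt >= ttlMs;
}

let lastWarmAt: number | null = null;

// Hypothetical hook: called when the push-to-talk hotkey goes down,
// ahead of any audio, so the first command never pays the cold start.
async function onHotkeyDown(client: { warm(): Promise<void> }): Promise<void> {
  const now = Date.now();
  if (shouldWarm(lastWarmAt, now)) {
    lastWarmAt = now;
    await client.warm(); // hypothetical SDK call that loads Parakeet in the daemon
  }
}
```

Keeping the TTL check client-side avoids spamming the daemon with redundant warm requests on every keystroke.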

Instrumentation first

Every transcription carries stage timings and dimensions for clientId, route, and modelId so operators can inspect real latency instead of guessing.

Multi-client runtime

One daemon serves menu bar apps, Raycast commands, browser extensions, editor integrations, and the CLI without each reinventing the runtime.

One protocol surface

A Bun CLI for operator workflows and a TypeScript SDK for embedded app integrations, both speaking the same local WebSocket JSON-RPC protocol.
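As one sketch of what a client on that surface might send, here is JSON-RPC 2.0 request framing over a local WebSocket. The port, the method name ("transcribe"), and the params shape are assumptions for illustration, not the documented vox protocol.

```typescript
// JSON-RPC 2.0 request envelope.
type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
};

let nextId = 0;

// Build a request with a unique id so replies can be matched to calls.
function buildRequest(method: string, params?: Record<string, unknown>): JsonRpcRequest {
  return { jsonrpc: "2.0", id: ++nextId, method, params };
}

// Hypothetical usage: open a socket to the local daemon, send a
// transcribe request, and resolve when the reply with the same id arrives.
async function transcribe(path: string): Promise<unknown> {
  const ws = new WebSocket("ws://127.0.0.1:8765"); // assumed daemon address
  await new Promise((resolve) => (ws.onopen = resolve));
  const req = buildRequest("transcribe", { path });
  const reply = new Promise((resolve) => {
    ws.onmessage = (ev) => {
      const msg = JSON.parse(String(ev.data));
      if (msg.id === req.id) resolve(msg.result);
    };
  });
  ws.send(JSON.stringify(req));
  return reply;
}
```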

Measure what operators actually care about.

Stage timings for every transcription. Slice by client, route, and model.

performance.jsonl
{"clientId":"vox-cli","totalMs":127}
{"clientId":"raycast","inferenceMs":268}
{"clientId":"browser-ext","audioDurationMs":5110}
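The slicing above can be sketched in a few lines: group totalMs by clientId and compute p50/p95 per client. Field names follow the sample lines; the nearest-rank percentile method is our choice for the sketch, not necessarily what `vox perf` uses.

```typescript
type PerfEvent = { clientId: string; totalMs?: number };

// Nearest-rank percentile over an ascending-sorted array.
function percentile(sorted: number[], p: number): number {
  const idx = Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1);
  return sorted[Math.max(0, idx)];
}

// Parse jsonl lines, keep events that carry totalMs, and report
// per-client p50/p95 latency.
function sliceByClient(jsonl: string): Map<string, { p50: number; p95: number }> {
  const groups = new Map<string, number[]>();
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue;
    const ev = JSON.parse(line) as PerfEvent;
    if (typeof ev.totalMs !== "number") continue;
    const arr = groups.get(ev.clientId) ?? [];
    arr.push(ev.totalMs);
    groups.set(ev.clientId, arr);
  }
  const out = new Map<string, { p50: number; p95: number }>();
  for (const [client, values] of groups) {
    values.sort((a, b) => a - b);
    out.set(client, { p50: percentile(values, 50), p95: percentile(values, 95) });
  }
  return out;
}
```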

Start local, export when you need to.

Up and running in four commands.

Clone, install, build, verify. The doctor command confirms the daemon, model, and backend are healthy on your Mac.

$ git clone https://github.com/arach/vox.git
$ cd vox && bun install
resolved 148 packages
$ bun run build
sdk, cli, daemon built
$ vox doctor
daemon: running
backend: parakeet
ready