Documentation

How Tokenbase collects AI coding telemetry source-side, normalizes it per agent, and rolls it up by team, without proxying model traffic or uploading source by default.

Getting started

Three steps from an empty console to team-level spend and usage.

1. Run one command on the machine. It installs the collector (a small read-only agent), links it to your workspace, and starts syncing:

curl -fsSL https://tokenbasehq.com/install.sh | sh

The collector is read-only and code-signed.

2. Approve the device. The installer opens a browser tab where you approve the device. If no tab opens (for example over SSH or on a headless machine), go to tokenbasehq.com/activate and enter the code shown in your terminal.

3. See your numbers. Tokenbase scans your existing session history. The first sync can take a few minutes; the dashboard fills in as it catches up.

After your first sync: map developers to teams in the console so usage rolls up by team. Solo setups roll up on their own.

Roll out to many machines

For more than a machine or two, no one wants to approve each one in a browser. Make one key and reuse it everywhere.

In the console, open Connections, create an enrollment key, and turn on Reusable so the same key works on every machine. Then run this on each one. It installs, links, and starts the collector, with nothing to approve:

curl -fsSL https://tokenbasehq.com/install.sh | TOKENBASE_AUTH_KEY=tbek_… sh

Drop that command into however you set machines up: your device-management tool (Jamf, Intune), an Ansible playbook, or the image your CI builds from.

Two switches are available if you need them. TOKENBASE_NO_SERVICE=1 installs and links the collector without starting it in the background (for images that run the collector themselves). TOKENBASE_NO_ENROLL=1 installs only, so you can link the machine later with tokenbase login.

The installer is safe to run again. It checks the current state first and only does the work that is missing, so wrapping it in Ansible, Chef, or Puppet will not churn. Pass --check for a dry run that reports what would change without touching anything.
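When the script is piped from curl, one way to hand it a flag such as --check is the shell's standard `sh -s --` form, which forwards everything after `--` to the script read from standard input. The following is a minimal sketch, assuming the installer accepts --check when passed this way. The helper names fetch_installer and run_installer are hypothetical and not part of Tokenbase.

```shell
# Hypothetical helpers for illustration; not part of the Tokenbase CLI.

# Fetch the install script (URL taken from the docs above).
fetch_installer() {
  curl -fsSL https://tokenbasehq.com/install.sh
}

# Pipe the script into sh and forward any arguments to it.
# `sh -s -- ARGS` reads the script from stdin and sets ARGS as its
# positional parameters.
run_installer() {
  fetch_installer | sh -s -- "$@"
}

# Example calls (assumes the script accepts --check this way):
#   run_installer --check   # dry run: report what would change
#   run_installer           # do only the work that is missing
```

In a configuration-management run, the dry run can go first so the report can be reviewed before anything is changed.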

On a headless server, a per-user background service stops at logout and will not start at boot. After the install, register a boot-persistent system service instead:

sudo tokenbase service install --system

Security and enterprise notes

Verify what you run. Before installing, the script checks the download's signature and checksum against a key built into the collector, and the macOS build is notarized by Apple. To serve the binaries from your own copy, which is useful on networks with no direct internet access, point TOKENBASE_INSTALL_BASE at an internal mirror.
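A mirrored install might look like the sketch below. It rests on several assumptions: the mirror URL is a placeholder, the mirror is assumed to host a copy of install.sh next to the binaries, and the helper function names are hypothetical. Only TOKENBASE_INSTALL_BASE comes from the text above.

```shell
# Placeholder mirror location; replace with your own internal host.
mirror_base="https://mirror.internal.example/tokenbase"

# Assumption: the mirror also serves a copy of install.sh at this path.
fetch_installer_from_mirror() {
  curl -fsSL "$mirror_base/install.sh"
}

# Run the script with TOKENBASE_INSTALL_BASE set only for the sh process,
# so the binaries are fetched from the mirror too.
install_from_mirror() {
  fetch_installer_from_mirror | TOKENBASE_INSTALL_BASE="$mirror_base" sh
}
```

Setting the variable as a prefix on `sh` scopes it to that one command rather than exporting it into the rest of the session.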

Keep the key out of logs. Rather than putting the enrollment key in the command, set TOKENBASE_AUTH_KEY_FILE to a path your secret manager mounts. The key is read from the file, so it never appears on a command line, in shell history, or in your tooling's logs:

curl -fsSL https://tokenbasehq.com/install.sh | TOKENBASE_AUTH_KEY_FILE=/run/secrets/tokenbase-key sh

Rotate and revoke. Keys expire automatically (30 days by default; set a longer window when you create one). Once a rollout is done, rotate a key in Connections to mint a replacement and retire the old secret, or revoke it outright. A reusable key is a standing credential, so treat it like one.

Proxies. Behind a proxy, the installer follows the usual HTTPS_PROXY setting.

What leaves the machine. Metadata only by default: token counts, models, hashed paths, and the machine name your device reports at enrollment, never your source or message text. Run tokenbase preview on a machine to see the exact bytes before anything is sent.

The collector

A signed, device-scoped binary tails the local session logs your AI coding agents already write: read-only, incremental, checkpointed by offset. It enrolls as a device with its own scoped credential and never depends on a live browser session. It is never in the model request path.
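To illustrate what "read-only, incremental, checkpointed by offset" means in practice, here is a minimal sketch of the general technique. It is not the collector's code: the function name, the checkpoint file, and its format are hypothetical, and it relies on `head -c`, which is widely available but not required by POSIX.

```shell
# read_new_bytes LOG CHECKPOINT
# Prints the bytes appended to LOG since the offset stored in CHECKPOINT,
# then stores the new offset. LOG is only ever read, never modified.
read_new_bytes() {
  log=$1
  checkpoint=$2

  # Last committed offset, or 0 on the first run.
  offset=0
  if [ -f "$checkpoint" ]; then
    offset=$(cat "$checkpoint")
  fi

  # Snapshot the current size in bytes (tr strips BSD wc padding).
  size=$(wc -c < "$log" | tr -d ' ')

  # If the log shrank (rotated or truncated), start again from the top.
  if [ "$size" -lt "$offset" ]; then
    offset=0
  fi

  # Nothing new since the last checkpoint.
  if [ "$size" -eq "$offset" ]; then
    return 0
  fi

  # tail -c +N starts at byte N (1-based); head -c caps the read at the
  # size snapshot so bytes appended mid-read are left for the next pass.
  # The checkpoint is written only if the pipeline exits successfully.
  tail -c "+$((offset + 1))" "$log" | head -c "$((size - offset))" &&
    printf '%s\n' "$size" > "$checkpoint"
}
```

Calling `read_new_bytes session.log session.ckpt` repeatedly emits each appended byte once. A fuller version would likely advance the checkpoint only to the last complete line, so that a half-written record is not consumed early.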

Metadata-only is the default. A local preview shows exactly what would be sent before anything leaves the machine.

How each agent is read

There's one reader per agent, each versioned and tested against captured real session logs. Every reader reports how much of an agent's data it covers and flags anything it doesn't recognize; if one reader breaks, the rest keep working. v1 reads Claude Code and Codex CLI fully. For Copilot CLI, real token counts and premium-request usage are read from the session log, while cost is estimated, since Copilot bills by premium requests rather than tokens. Cursor support is in beta: its local storage was worked out by inspection, so its coverage is labeled rather than assumed.
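The failure isolation described above (one reader breaking without stopping the others) can be sketched as a loop that runs each reader in its own subshell. This only illustrates the pattern. The reader names and their output format are made up and do not reflect Tokenbase's internals.

```shell
# Stand-in readers; names and output are hypothetical.
read_agent_a() { echo "agent-a sessions=3"; }
read_agent_b() { echo "unrecognized log format" >&2; return 1; }
read_agent_c() { echo "agent-c sessions=1"; }

# run_readers READER...
# Runs every reader, collects the names of those that fail, and keeps going.
run_readers() {
  failed=""
  for reader in "$@"; do
    # The subshell contains the failure: a non-zero exit (or an `exit`
    # inside the reader) cannot abort the loop.
    if ! ( "$reader" ); then
      failed="$failed $reader"
    fi
  done
  # Report failures on stderr so normal output stays clean.
  if [ -n "$failed" ]; then
    echo "readers failed:$failed" >&2
  fi
}
```

Running `run_readers read_agent_a read_agent_b read_agent_c` prints the results from the first and third readers on standard output and names the failed reader on standard error.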

Confidence tiers

Tier A is factual (straight from logs): tokens, models, session counts, tool calls, cache reuse, model mix, and estimated cost. Tier B is inferred (heuristics on local data): abandonment, retry loops, and likely task type. Tier B output is always labeled as inferred, never presented as fact.

Deployment modes

Four modes are available: SaaS metadata-only (default), SaaS redacted-text (opt-in), VPC / on-prem, and local-only pilot. In zero-egress modes, raw transcripts never leave the device, and analysis runs against a local or VPC model.

Run it on your own logs in metadata-only mode.
