Superlog onboarding
Wire OpenTelemetry traces, logs, and metrics into the user's project so telemetry streams to Superlog. Cover every app and service in the repo — not just the one the user is currently sitting in.
Prefer native OpenTelemetry APIs and the framework's documented bootstrap over custom helper layers. If a specific stack stumps you, search the OTel docs for that language; don't guess.
Before editing, read the applicable companion skills:
- for general OTel taste.
- for Python services.
- for FastAPI services.
- for LiveKit agents.
- for Next.js/Vercel apps.
- for Expo / React Native apps.
- for Supabase Edge Functions.
- for any other language (Go, Java/Kotlin, Ruby, Rust, .NET/C#, PHP, Elixir, plain Node, …) — use this as the fallback when none of the above match.
Step 0 — Endpoint and key handling
The OTLP endpoint is always https://intake.superlog.sh and goes inline in the bootstrap code — it's not a secret, no env-var indirection needed.
The ingest API key starts with and is project-scoped + write-only — it can only ingest events into one project, can't read anything, can't change settings. Treat it like a Sentry DSN, a PostHog public key, or a Datadog RUM client token: inline it directly in the OTel bootstrap source alongside the endpoint. No files, no deploy-target wiring, no process.env.OTEL_EXPORTER_OTLP_HEADERS. The user deploys their code and events flow.
Two paths, no questions asked:
Key in the prompt
If the invoking prompt already contains a key, validate the prefix and inline it in every bootstrap file you write. Done — move on to Step 1.
No key
Kick off the device flow immediately, then keep working in parallel — don't block install on signup.
POST https://api.superlog.sh/oauth/device with Content-Type: application/json and body . Response includes , , verification_uri_complete (a https://superlog.sh/activate?code=…&flow=skill URL), (seconds), and (poll interval seconds).
- Open verification_uri_complete in the user's default browser ( / / ). Print the URL too so they can copy it if the open command silently fails. Tell the user briefly what's happening: signup is open in their browser, the key flows back here automatically, and you're going to keep working.
- While the user signs up, do not block. Keep going with Steps 1–4. Inline the literal sentinel in the bootstrap source as a placeholder — Superlog's ingest accepts it from anyone (returns 200 without forwarding anywhere), so the user's app can boot and exercise the OTel bootstrap path while signup is in flight.
- At Step 5, poll POST https://api.superlog.sh/oauth/token with every seconds. = , keep waiting. returns {ingest_key, project_id, user, org, flow:"skill"}. = expired.
- On 200: walk the source files you wrote and replace the literal with the real . Never print the key back to chat (transcripts get logged). The web page already confirms hand-off to the user.
- On 410 / user closed the tab: leave in place and tell the user to sign up at https://superlog.sh/ and swap it later.
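The polling loop described above can be sketched as follows. This is a minimal illustration, not the skill's implementation: `fetch_token` stands in for the real POST to https://api.superlog.sh/oauth/token and is injected so the loop is testable without network access, and any status other than 200/410 is treated as "authorization pending" (the source elides the exact pending status code).

```python
import time

def poll_for_token(fetch_token, interval=5, timeout=600):
    """Poll the token endpoint until signup finishes, expires, or times out.

    fetch_token() returns (status_code, payload); it is injected so the
    loop can be exercised without hitting the network.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status, payload = fetch_token()
        if status == 200:
            # Signup finished: the payload carries the real ingest key.
            return payload["ingest_key"]
        if status == 410:
            # Device code expired or the user closed the tab.
            return None
        # Any other status: authorization still pending, keep waiting.
        time.sleep(interval)
    # Poll timeout: caller leaves the sentinel in place and tells the user.
    return None
```

On a 200 the caller then walks the bootstrap files and swaps the sentinel for the returned key, per the steps above.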
Step 1 — Map every app/service in the repo
Before instrumenting anything, enumerate what's here. Check workspace manifests ( , root , , Cargo workspace, Python workspace setups, and conventions). Identify each service: web frontend, API, workers, background jobs, CLIs, sample/demo apps, mobile apps, Supabase and/or server functions. Mobile and serverless/edge functions are in scope; do not skip them merely because they are client-side or short-lived. Skip pure type/config packages with no runtime entry point. Do not skip any runnable services or leave them "out of scope": instrument absolutely everything in this run; there may be no follow-up.
Show the user the list before you start, so they can correct it.
Step 2 — For each service, install native OTel and bootstrap
Use the language's native OpenTelemetry SDK. Don't reach for vendor wrappers or hand-rolled helpers when an official package exists. Examples of what "native" means here: for Node servers, for normal Next.js/Vercel apps (sdk-node breaks Next's webpack and misses the framework bootstrap), @opentelemetry/sdk-trace-web + browser/mobile-compatible exporters for Vite/SPA/Expo, opentelemetry-instrumentation-* + for Python, and for Go.
No broad wrapper APIs. Avoid reusable helpers like , , , , or . Acquire native tracers/meters/loggers at module scope and use the SDK's own APIs directly. In TypeScript/JavaScript, use the published helper for bounded business spans and add to ; this is required when the package can be installed because it avoids expanding a whole function into plus / / . Do not use helpers around provider SDK calls that OpenInference/provider instrumentation can observe directly. If an edge runtime genuinely cannot load an upstream OTel SDK, keep the shim tiny, provider-neutral, and OTel-shaped: , , , , .
Wire all three signals — traces, logs, metrics.
Logs go through OTLP, not just stdout — set up the OTel log bridge for the language so app logs (with their existing log levels and structured fields) carry the active / automatically. The user's existing logger keeps working; you're just adding an OTLP handler/processor underneath.
The log bridge is not optional and is not "covered" by the SDK init alone — most language SDKs and framework wrappers wire traces (and sometimes metrics) by default but require an explicit + OTLP log exporter + log-record processor + a bridge to the existing logger. Examples: Python stdlib needs + + attached to the root logger (and for trace correlation on existing records); Node needs + an instrumentation for the project's logger ( / / ); requires the option — without it, no logs leave the process. The companion style skills spell out the exact pieces per stack — read them.
Common log-bridge mistakes to actively check for:
- Handler attached to a named logger when the app uses the root logger (or vice versa) — nothing flows.
- Default level filter (e.g. WARNING) swallowing the INFO/DEBUG lines the user actually wants in Superlog.
- not flushed on shutdown → short-lived CLIs, serverless, and edge functions drop the last batch. Wire / into the runtime's exit hook.
- An existing vendor transport (Pino → Logtail, Winston → Datadog, etc.) left in place is fine and expected — but make sure you're bridging from the logger, not adding a second transport that double-emits the same line through a different formatter.
- Logs emitted outside any span will arrive without / . That is correct and expected; do not "fix" it by starting throwaway spans around log calls.
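The first two pitfalls in the list above (wrong logger, level filter) can be reproduced with nothing but the stdlib `logging` module; the mechanics are identical whichever OTel handler you attach. A sketch, with a capturing handler standing in for the real OTLP handler:

```python
import logging

captured = []

class CapturingHandler(logging.Handler):
    # Stand-in for the OTel LoggingHandler: records whatever reaches it.
    def emit(self, record):
        captured.append(record.getMessage())

root = logging.getLogger()
root.addHandler(CapturingHandler())
root.setLevel(logging.WARNING)  # default-style level filter: pitfall two

app_log = logging.getLogger("app.orders")  # named logger, propagates to root
app_log.info("order created")    # swallowed: INFO is below WARNING at the root
app_log.warning("retry queued")  # propagates to the root handler and is kept

root.setLevel(logging.INFO)      # the fix: lower the level you actually want
app_log.info("order created")    # now reaches the handler
```

The inverse failure (handler attached to a named logger while the app logs through the root logger, or through a sibling named logger) drops lines the same way: the record never propagates to the logger that holds the handler.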
Bootstrap rules:
- The bootstrap file must run before any framework imports. Use the language/framework's documented hook ( flag, , top-of- import, etc.).
- Inline the endpoint (https://intake.superlog.sh) and the project's ingest key directly in the bootstrap source. Don't read from process.env.OTEL_EXPORTER_OTLP_* or write any files — the key is write-only, and inline configuration removes a whole class of "OTel didn't start because env vars weren't set" deploy failures. (See the framework-specific style skills for the exact shape per stack.)
- Use HTTP OTLP exporters, not gRPC. gRPC pulls in native bindings that break bundlers and complicate containers.
- Use the project's existing package manager (detect via lockfile).
- Prefer idempotent edits. If a config file already exists, edit it; don't overwrite it.
- Set resource attributes on the OTel resource for every service: , , deployment.environment.name, and — the canonical https URL of the repo (e.g. https://github.com/acme/api). The repo URL is the important one and is fine to hardcode alongside in the SDK init; if the build platform exposes the slug (Vercel /, Railway /), prefer reading from env. Also set (commit SHA) on a best-effort basis from whatever env the runtime already injects (, , , , , , …). Do not shell out to from the running process. Skipping the SHA is fine, skipping the URL is not. Use the OTel semantic-convention keys exactly — do not invent / .
Framework rules:
- Next.js/Vercel: use with as the bootstrap. Do not substitute a raw / bootstrap unless the repo already uses that architecture and you are extending it. Use tracers/meters inside route handlers only where auto-instrumentation is blind. does not export logs by default — pass (v1) / (v2) with an from @opentelemetry/exporter-logs-otlp-http, or no logs will leave the process. Match the option name to the installed major version.
- Expo/React Native: preserve existing Expo Go / unsupported-runtime guards. In supported builds, call before Sentry and before app registration/user code. Inline the endpoint + ingest key in the observability module — no env vars. The bootstrap reads them straight from constants.
- Supabase Edge Functions: native Deno OpenTelemetry does not work in hosted Supabase Edge today. Use the tiny OTel-shaped shim pattern above; keep exporter endpoint/headers in one setup area and avoid Superlog-specific function/file names.
- Python/FastAPI: use native instrumentation such as FastAPIInstrumentor.instrument_app(app) rather than replacing request handling with manual middleware.
- Python/LiveKit: lifecycle spans that cross shutdown callbacks may use + trace.use_span(..., end_on_exit=False) and end in the shutdown callback. Bounded work should still use decorators or context managers.
Coexist with existing observability vendors. Don't remove Sentry, Datadog, New Relic, Honeycomb, Logtail, Pino transports, etc. OTel sits alongside them. The user explicitly wants both signals flowing during migration; ripping out the incumbent is not your call.
Step 3 — Add custom spans, metrics, and logs around business operations
Auto-instrumentation gets you HTTP in/out, DB queries, framework lifecycle. That's the floor, not the ceiling. Read the project to find the operations a human operator would actually want to see when something looks wrong.
Traces
Wrap every critical business operation with an active span. Auto-instrumented spans are fine where they exist — but if an operation isn't already getting a span, add one.
- Attributes: entity IDs (order.id, user.id, workspace.id, tenant.id), counts, key boolean branch outcomes, model name / provider for LLM calls.
- Record exceptions: span.recordException(err) + span.setStatus({ code: ERROR }) on failure paths.
- For Python functions with clear boundaries, prefer @tracer.start_as_current_span("operation.name") — the same call works as a decorator and as a context manager, and the decorator form is usually what you want for a whole function:

```python
@tracer.start_as_current_span("do_work")
def do_work():
    print("doing some work...")
```

Use a context manager when a decorator does not fit (partial scope, dynamic span name, etc.). Do not use detached + manual for bounded work.
- Skip trivial getters, pure transforms, internal helpers — anything with no real latency or failure mode.
- Never put PII in attributes (emails, passwords, tokens, full request bodies).
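A small guard applied before attaching attributes to a span keeps PII out mechanically rather than by review. This is a hedged sketch: the blocked key substrings are illustrative assumptions to extend per project, not a complete PII policy.

```python
# Key substrings that mark an attribute as likely PII or secret material.
# Illustrative list: extend it to match the project's own field names.
_BLOCKED_SUBSTRINGS = ("email", "password", "token", "secret", "body")

def scrub_attributes(attrs):
    """Drop attribute keys that look like PII before putting them on a span.

    Entity IDs (order.id, user.id, ...) pass through; anything whose key
    mentions an email, credential, or full request body is removed.
    """
    return {
        key: value
        for key, value in attrs.items()
        if not any(marker in key.lower() for marker in _BLOCKED_SUBSTRINGS)
    }
```

For example, `scrub_attributes({"order.id": "o_1", "user.email": "a@b.c"})` keeps only the order ID.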
Logs
Make sure logs are structured and carry operation context. Concretely: every log line emitted inside a span should arrive at Superlog with / populated and any structured fields (orderId, userId, etc.) preserved as attributes. Trace/span context may be added natively by the log bridge or integration, or may require additional work. Use logs for narrative ("starting batch reconcile", "retrying after 3xx") and exceptional events. Emit an error log only when the operation cannot recover and manual intervention is required.
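With the stdlib logger, structured fields travel via `extra` and land on the `LogRecord`, which is how a bridge can preserve them as attributes. A stdlib-only illustration of the field-preservation part (the trace/span correlation itself comes from the OTel handler, which is not shown here; the capturing handler is a stand-in):

```python
import logging

seen = []

class FieldCapturingHandler(logging.Handler):
    # Stand-in for an OTLP-exporting handler: collects message + one field.
    def emit(self, record):
        seen.append((record.getMessage(), getattr(record, "order_id", None)))

log = logging.getLogger("billing")
log.addHandler(FieldCapturingHandler())
log.setLevel(logging.INFO)

# Fields passed via `extra` become LogRecord attributes and survive to
# whatever handler (OTel bridge included) receives the record.
log.info("starting batch reconcile", extra={"order_id": "ord_42"})
```

The existing logger call sites don't change; only the handler underneath does.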
Metrics
Cover business + performance + cost. Three categories to look for:
- Business logic counters. Every meaningful state transition: created, started, completed, failed, retried. Per-tenant, per-channel, per-status — low-cardinality dimensions only (never user/order IDs).
- Performance histograms. Latency of operations the user cares about, queue depth, batch sizes, payload sizes. Reuse existing timing instrumentation if the project already has any ( blocks, custom s, "[TIMING]" log lines) — emit a histogram from those measurements rather than measuring twice.
- Costs — especially LLM costs. If the project calls OpenAI / Anthropic / Google / any LLM provider, prefer provider instrumentation such as OpenInference where available so native SDK calls stay readable. Do not add pricing constants or LLM cost math in product handlers; Superlog computes estimated cost centrally in the UI/query layer from captured provider/model/token attributes. Avoid duplicating token counters already captured by provider instrumentation.
Get the meter once at module level, create instruments at module level, increment in the hot path. Don't create a fresh meter per call.
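The low-cardinality rule can be enforced mechanically with an allow-list checked before incrementing a counter. A sketch under stated assumptions: the dimension names and allowed values below are hypothetical examples, to be replaced with the project's real state machine.

```python
# Allowed values per metric dimension. Low-cardinality only, and enforced:
# unbounded values (user IDs, order IDs) can never become a dimension.
# Illustrative dimensions; swap in the project's real states and channels.
_ALLOWED_DIMENSIONS = {
    "status": {"created", "started", "completed", "failed", "retried"},
    "channel": {"web", "mobile", "api"},
}

def safe_metric_attributes(attrs):
    """Keep only allow-listed dimensions with allow-listed values."""
    return {
        key: value
        for key, value in attrs.items()
        if key in _ALLOWED_DIMENSIONS and value in _ALLOWED_DIMENSIONS[key]
    }
```

Applied in the hot path, this silently drops a stray `order_id` dimension instead of exploding the metric's cardinality.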
Step 4 — Verify the app still works
Per service:
- Run the project's own dev or build command (whatever its / / already wires up). Confirm it starts cleanly with no errors that trace back to your OTel install. Also run a telemetry bootstrap smoke test that imports or starts the app, so provider setup, exporter construction, log bridging, and framework instrumentation all initialize. For a Python server this can be an import/startup command such as uv run python -c 'from app.main import app; print(app.title)'; for Node/Next use the repo's build/start path. For a server, hit at least one route with curl so traffic flows through the instrumentation; choose a route that exercises an instrumented operation when practical, not only a static health route. For a CLI, invoke a real command. Don't ship if the app's own startup is now broken — that's a regression.
- Confirm telemetry leaves the process — for all three signals. With the inline (or real key) in the bootstrap, OTLP POSTs from the dev server should return 2xx for each of , , and — that proves the full bootstrap is reaching the network, not just the trace pipeline. The signal is the running app's own POSTs to all three paths succeeding by the time the dev server shuts down. To force the logs path specifically, hit a route (or invoke a CLI command) that you know calls the project's logger inside an instrumented operation, then watch the dev server's outbound traffic / debug exporter output for a POST. If only shows up, the log bridge isn't wired (most common causes: never set, handler attached to the wrong logger, level filter too strict, missing , or shutdown not flushing the batch processor).
A bootstrap that loads but never POSTs — or POSTs traces but no logs/metrics — is not a partial success. Fix it before moving on.
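The three-signal check can be mechanized over whatever record of outbound OTLP POSTs is available (debug exporter output, a logging proxy). A sketch, assuming you have collected the requests as (path, status) pairs; the paths are the standard OTLP/HTTP signal paths:

```python
# OTLP/HTTP signal paths the dev run must POST to with a 2xx response.
_SIGNAL_PATHS = ("/v1/traces", "/v1/metrics", "/v1/logs")

def missing_signals(posts):
    """Given observed (path, status_code) pairs, return the signal paths
    that never got a 2xx. An empty list means the full bootstrap (traces,
    metrics, and the log bridge) reached the network."""
    ok = {path for path, status in posts if 200 <= status < 300}
    return [path for path in _SIGNAL_PATHS if path not in ok]
```

If the result is `["/v1/logs"]`, the log bridge is the piece that isn't wired; work through the common log-bridge mistakes listed in Step 2.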
Step 5 — Hand-off (final message to the user)
If you started a device flow in Step 0, collect the key first. Print one line saying you're waiting for sign-up to finish (so the user knows the terminal isn't frozen), then poll POST https://api.superlog.sh/oauth/token with at the returned earlier. Cap the wait at (default 600s).
- On 200: walk every source file where you inlined and replace it with the real . Never print the key back to chat (transcripts get logged); the web page already confirms hand-off to the user.
- On 410 / poll timeout: leave inline, tell the user "sign-up didn't finish in time — sign up at https://superlog.sh/ when you're ready and swap the literal in the bootstrap files I wrote." Continue with the rest of the closing message.
If the key was already supplied in the prompt, no polling needed — it's been inline from the start.
What changed
3–7 short factual bullets covering: packages installed, files created/modified, business spans/metrics added. Per service if changes differed, grouped if uniform. Mention any existing observability vendor (Sentry, Datadog, Logtail, Pino transports, etc.) you intentionally left in place so the coexistence is explicit.
Deploy
Tell the user to deploy as they normally would — push to their hosting platform, run their existing CI, or run locally. There are no env vars to wire and nothing platform-specific to configure: the endpoint and key are inline in the bootstrap, so events start flowing the moment the instrumented code runs.
If the user asks "where do I put the key in production?" — the answer is "you already did, it's in the source you just deployed."
Step 6 — Drive GitHub, Slack, and MCP install
Only run this step if the device flow completed successfully (you have a real and the from the response). Skip it on poll timeout / sentinel — the user can install integrations from the dashboard later.
Walk the user through three short steps in this order. Pause for confirmation between each so they can keep up.
GitHub
"Opening the GitHub install page — pick the repos you want Superlog to read and approve. Press enter when you're back."
Open https://api.superlog.sh/github/install?user_code=<USER_CODE> in their default browser ( / / ). The browser walks them through GitHub's app install; the page bounces back to with a "GitHub connected" confirmation.
When they hit enter (or say done in chat), move on. If they say "skip" or close the tab, move on without complaint.
Slack
"Opening Slack OAuth — pick the workspace and approve. Press enter when you're back."
Open https://api.superlog.sh/slack/install?user_code=<USER_CODE> the same way. Slack returns to .
Same skip semantics.
Superlog MCP
Suggest installing the Superlog MCP server so the agent (Claude Code, Codex, Cursor, etc.) can query telemetry directly next time they're debugging — search logs, pull traces, check error rates from inside the chat without context-switching to the dashboard.
For Claude Code (most common — that's where this skill is running), offer to run it for them:
claude mcp add --transport http superlog https://api.superlog.sh/mcp
This edits the user's Claude Code config. Confirm before running (the user may have a custom MCP scope or want to install elsewhere). If they decline, print the command so they can run it themselves later.
For other agents the user might also use, mention but do not run:
- Codex: codex mcp add superlog --url https://api.superlog.sh/mcp && codex mcp login superlog
- Cursor / others: copy the snippet from https://superlog.sh/ → Connect.
When all three are done (or skipped), close out with a single line directing the user to deploy their app — they're ready to ship.
Hard rules
- Never modify files outside the project root.
- Never commit, push, or open PRs.
- Inline the ingest key in source. It's a project-scoped, write-only token (think Sentry DSN); env-var indirection just adds deploy-time failure modes for no gain.
- Never remove an existing observability vendor unless the user asks for it.
- Use the project's existing package manager and existing logger.
- Prefer native OTel packages for the language; don't reinvent telemetry plumbing the SDK already provides.
- If the dev/build command errors out because of your instrumentation, that's a failure — fix it or report it, don't paper over it.