One slug per service
b/runbook-<service> is the same shortcut for every engineer, every browser, every device. Page received → type slug → on the runbook in 40ms.
Use case · Runbooks
Most runbooks rot in a Confluence space nobody touches between incidents. A BookSlash runbook board lives at b/runbook-<service>: live dashboards, code blocks, decision logs, post-mortems — current because the team uses it, not because someone audits it.
The shape of the work
A runbook is only useful if someone opens it under stress. Make it impossible to lose, fast to scan, and easy to update.
b/runbook-checkout, b/runbook-payments. The on-call rotation references slugs, not Confluence URLs. The slug stays even when the service moves.
Embed the live dashboard, the alert thresholds, the recent SEV history. The runbook is the dashboard, not a link to it.
After every incident, the on-call engineer updates the runbook from the same board they used to mitigate. The audit log records who changed what.
Runbook board template
Designed for fast scanning under stress. Live dashboards first, decision criteria second, the human stuff (who to escalate to) at the bottom.
One paragraph: what the service does, who owns it, where it sits in the architecture mind map.
Embedded Datadog/Grafana board. Latency, error rate, saturation. Loads in 40ms.
Table node: which alert fires at which value, who pages, on which schedule. Single source of truth.
Flow diagram: 5xx spike? Check this. Database slow? Check that. Three branches, ten nodes, calm under stress.
Last five incidents with one-line summaries and links to post-mortems. Pattern recognition built in.
kubectl, pg query, deploy rollback. Code blocks with copy buttons. No retyping at 3 a.m.
Who to page, in what order, with their slug-resolved phone number. b/escalate-checkout if the on-call cannot reach them.
Pinned at the bottom. Filled in after every SEV. The board grows; institutional knowledge stays.
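The triage flow in the template (symptom, check, action) maps naturally onto a small decision tree. As a rough illustration only, here is how such a tree could be modelled in Python. This is not BookSlash's board format; the `TriageNode` class, the symptoms, and the actions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TriageNode:
    """One node in a hypothetical triage flow: a check plus follow-up branches."""
    check: str
    action: str = ""
    branches: dict[str, "TriageNode"] = field(default_factory=dict)


def walk(node: TriageNode, answers: list[str]) -> list[str]:
    """Follow the answers down the tree and return the checks and action visited."""
    path = [node.check]
    for answer in answers:
        if answer not in node.branches:
            break  # unknown answer: stop at the current node
        node = node.branches[answer]
        path.append(node.check)
    if node.action:
        path.append(f"ACTION: {node.action}")
    return path


# Illustrative tree; the symptoms and actions are made up for this sketch.
TRIAGE = TriageNode(
    check="What is the symptom?",
    branches={
        "5xx spike": TriageNode(
            check="Did a deploy go out in the last hour?",
            branches={
                "yes": TriageNode(check="Roll back", action="roll back the last deploy"),
                "no": TriageNode(check="Check upstream dependencies",
                                 action="escalate to the dependency owner"),
            },
        ),
        "database slow": TriageNode(
            check="Check for long-running queries",
            action="cancel the offending query",
        ),
    },
)

print(walk(TRIAGE, ["5xx spike", "yes"]))
```

Keeping every branch to one check and one action is what makes the flow scannable: the on-call engineer answers a question, follows the line, and lands on a single step.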
Why this works
The runbook IS the dashboard. Latency, errors, saturation in real time. Numbers from a screenshot are already wrong.
Mitigation queries, kubectl invocations, deploy rollbacks — copy-paste ready. No retyping a Confluence code block at 3 a.m.
Branches make better runbooks than paragraphs. Look at the symptom, follow the line, take the action.
Who changed the alert threshold last Tuesday? The audit log answers in two clicks. 90 days on Pro, 365 on Enterprise.
The mitigation team writes the post-mortem on the board they used during the incident. No transcribing into a doc the next day.
We ditched two wikis and a "links" channel. The on-call rotation went from "where's the dashboard?" to "type b/oncall."
Renata Coleman
Eng Lead · Halberd Mobility
Frequently asked
No. Paging tools such as PagerDuty handle paging and rotation; BookSlash holds the runbook content the on-call engineer reads after they get paged. Most teams put the runbook slug directly in their PagerDuty escalation policy notes.
BookSlash embeds load the source dashboard (Datadog, Grafana, Honeycomb, etc.) in an iframe with the workspace’s authenticated session. The runbook viewer sees current data, not a snapshot.
Yes. Per-board permissions let you restrict edit rights to specific roles or members. Most teams give all engineers read access, senior engineers and managers edit access, and use the audit log to track changes.
Audit logs are tamper-evident and exportable (90 days on Pro, 365 days on Enterprise with NDJSON). Most SOC 2 audits accept BookSlash audit log exports as evidence of change control. Contact BookSlash for the current SOC 2 status.
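Because NDJSON is one JSON object per line, an exported audit log is easy to filter with a short script. The sketch below answers "who changed this board on a given day?" and assumes hypothetical field names (`timestamp`, `actor`, `board`, `action`); the real export schema may differ, so treat it as an illustration only.

```python
import json
from datetime import datetime


def changes_on(ndjson_text: str, board: str, day: str) -> list[tuple[str, str]]:
    """Return (actor, action) pairs for one board on one ISO date (YYYY-MM-DD)."""
    results = []
    for line in ndjson_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the export
        event = json.loads(line)
        # Field names here are assumptions, not a documented schema.
        when = datetime.fromisoformat(event["timestamp"])
        if event["board"] == board and when.date().isoformat() == day:
            results.append((event["actor"], event["action"]))
    return results


# Made-up sample records in the assumed shape.
SAMPLE = """\
{"timestamp": "2024-05-14T09:12:00+00:00", "actor": "alice", "board": "runbook-checkout", "action": "edited alert threshold"}
{"timestamp": "2024-05-14T10:30:00+00:00", "actor": "bob", "board": "runbook-payments", "action": "added code block"}
{"timestamp": "2024-05-15T08:00:00+00:00", "actor": "carol", "board": "runbook-checkout", "action": "updated post-mortem"}
"""

print(changes_on(SAMPLE, "runbook-checkout", "2024-05-14"))
```

Reading the export line by line means the same approach works on a full 365-day log without loading it into memory at once, provided the file is streamed rather than passed as a single string.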
Start with one team. Roll out when it sticks.
2,400+ teams reach every important destination in their stack with a single keystroke. Save the first slug in 30 seconds.
Free for personal use · No credit card · 14-day team trial