Reliability runbook
Upgrade, backup and disaster recovery for BigBlueButton
A snapshot is not a recovery plan. BigBlueButton spans replaceable software, persistent recordings, application databases, secrets, DNS and external services, and each has a different restore method.
Executive brief
What matters
- 01 Use persistent override files and automation so a clean node can be rebuilt without reconstructing history.
- 02 Define recovery objectives separately for live classes, frontend data and historical recordings.
- 03 A backup is accepted only after a representative restore has been completed and timed.
01
Classify what must survive
Document host and application configuration, API and identity secrets, Greenlight or Scalelite databases, recording files and metadata, certificates, DNS, monitoring and automation. Identify which items can be reproduced and which are unique. Encrypt backups, and keep the credentials that protect them separate from production administrator accounts.
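The classification above can be kept as a small, reviewable inventory. The following is a minimal sketch, not part of any BigBlueButton tooling: the `Asset` type, the `unprotected` helper and every inventory entry are hypothetical and only illustrate the reproducible-versus-unique split.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    reproducible: bool  # can be rebuilt from packages or automation
    backed_up: bool     # an encrypted copy exists away from the host


# Hypothetical inventory; replace with the items in your own deployment.
INVENTORY = [
    Asset("BigBlueButton packages", reproducible=True, backed_up=False),
    Asset("API and identity secrets", reproducible=False, backed_up=True),
    Asset("Greenlight database", reproducible=False, backed_up=True),
    Asset("recording files and metadata", reproducible=False, backed_up=False),
]


def unprotected(inventory):
    """Return the names of unique assets that have no backup yet."""
    return [a.name for a in inventory if not a.reproducible and not a.backed_up]


print(unprotected(INVENTORY))
```

Reproducible items are deliberately excluded from the result: they are expected to come back from automation rather than from a backup.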
02
Upgrade through a controlled path
Read the release notes and upgrade documentation for both the source and target versions. Major BigBlueButton upgrades may require a clean server and a recording migration. Take application-aware backups, stop placing new meetings on the node and let running ones finish, preserve configuration through supported override mechanisms, test from an external network, and define in advance the condition that triggers a rollback.
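One way to make the controlled path enforceable is to treat its steps as preconditions that must all be true before the upgrade starts. This is a sketch under assumptions: the precondition names are invented labels for the steps in the paragraph above, and nothing here calls BigBlueButton itself.

```python
# Hypothetical labels for the steps described above, in checklist order.
PRECONDITIONS = (
    "documentation_read",
    "backups_taken",
    "node_drained",
    "overrides_preserved",
    "rollback_condition_defined",
)


def unmet(status):
    """Return preconditions that are missing or false, in checklist order."""
    return [name for name in PRECONDITIONS if not status.get(name, False)]


def may_upgrade(status):
    """Allow the upgrade only when no precondition is outstanding."""
    return not unmet(status)


status = {"documentation_read": True, "backups_taken": True}
print(unmet(status))
```

A missing key counts as unmet, so a step that nobody recorded blocks the upgrade instead of passing silently.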
03
Design recovery by objective
Live-class recovery may mean routing new meetings to healthy pool nodes; it does not revive a meeting lost with its backend. Frontend recovery requires restoring the database and configuration. Recording recovery may tolerate a longer objective but involves much larger data volumes. State a recovery time objective (RTO) and a recovery point objective (RPO) for each rather than one vague “24-hour backup” promise.
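Stating objectives per service becomes checkable once they are written down as data. The sketch below assumes illustrative values and hypothetical names (`Objective`, `OBJECTIVES`, `rpo_breached`); the numbers are placeholders, not recommendations.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Objective:
    rto: timedelta            # longest tolerated time to restore service
    rpo: Optional[timedelta]  # longest tolerated window of lost data


# Illustrative values only; derive real ones from business requirements.
OBJECTIVES = {
    # A meeting lost with its backend is not restored, so no RPO applies.
    "live_classes": Objective(rto=timedelta(minutes=15), rpo=None),
    "frontend": Objective(rto=timedelta(hours=4), rpo=timedelta(hours=1)),
    "recordings": Objective(rto=timedelta(hours=72), rpo=timedelta(hours=24)),
}


def rpo_breached(service, last_backup, now):
    """True when the newest backup is older than the service's RPO."""
    rpo = OBJECTIVES[service].rpo
    if rpo is None:
        return False  # nothing is backed up for this service
    return now - last_backup > rpo


now = datetime(2024, 1, 2, 12, 0)
print(rpo_breached("frontend", datetime(2024, 1, 2, 10, 0), now))
```

Here a frontend backup that is two hours old breaches its one-hour objective, while a recordings backup of the same age would still be within its limit.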
04
Exercise the runbook
Restore Greenlight to an isolated environment, rebuild a BBB node, retrieve selected recordings and rotate a compromised credential. Verify joins from an external network and historical playback. Record duration, missing dependencies and manual decisions, then update contacts and procedures after every exercise.
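Recording each exercise in a fixed shape makes the results comparable from one drill to the next. The following sketch assumes a hypothetical `Drill` record and additionally assumes that you compare the measured duration against the scenario's RTO; the acceptance rule is an illustration, not a standard.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Drill:
    scenario: str
    completed: bool      # the restore finished and was verified
    duration: timedelta  # measured wall-clock time of the restore
    target: timedelta    # RTO the scenario is measured against


def accepted(drill):
    """Accept a backup only if its restore completed within the target."""
    return drill.completed and drill.duration <= drill.target


# Hypothetical results from one exercise.
drills = [
    Drill("restore Greenlight", True, timedelta(hours=2), timedelta(hours=4)),
    Drill("rebuild BBB node", True, timedelta(hours=6), timedelta(hours=4)),
    Drill("retrieve recordings", False, timedelta(0), timedelta(hours=72)),
]

print([d.scenario for d in drills if not accepted(d)])
```

The scenarios that fail the check are the ones whose procedures or objectives need revising before the next exercise.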
Evidence base
Sources and further reading
We prefer project documentation and first-party product guidance. Community links are included where they reveal recurring operational questions rather than establish product guarantees.
- BigBlueButton installation and upgrade guidance (docs.bigbluebutton.org)
- BigBlueButton customisation and recording transfer (docs.bigbluebutton.org)
- BigBlueButton monitoring (docs.bigbluebutton.org)
- Scalelite architecture (github.com)
Practical answers
Questions teams ask
Are VM snapshots enough for BigBlueButton?
No. They may assist short-term rollback but do not replace application-aware database backups, recording protection, off-site copies and tested clean rebuilds.
Can a Scalelite pool prevent every outage?
It can direct new meetings away from unhealthy nodes, but a live meeting on a failed backend is interrupted.
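The limit described in this answer can be shown with a simplified model. This is not Scalelite's actual code: the `Node` type, the `load` field and the lowest-load rule are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    name: str
    healthy: bool
    load: float  # hypothetical load score; lower means more spare capacity


def pick_node(pool) -> Optional[Node]:
    """Choose the least-loaded healthy node for a NEW meeting, if any."""
    candidates = [n for n in pool if n.healthy]
    return min(candidates, key=lambda n: n.load) if candidates else None


pool = [
    Node("bbb1", healthy=False, load=0.1),  # failed backend
    Node("bbb2", healthy=True, load=0.4),
    Node("bbb3", healthy=True, load=0.7),
]

print(pick_node(pool).name)
```

The selection only affects where new meetings are created. A meeting that was already running on `bbb1` has no counterpart on the other nodes, which is why it is interrupted rather than moved.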
What should be restored first?
Follow business objectives and dependencies. A typical order is routing and identity, then frontend state, then clean media capacity, then the historical recording service, but your runbook defines the final sequence.