Production readiness

Have you ever launched a new service to production? Have you ever been maintaining a production service? If you answer “yes” to one of these questions, have you been guided during the process? What’s good or bad to do in production? And how do you transfer knowledge when new team members want to release production services or take the ownership of existing services?

Most companies end up having organically grown wild west approaches when it comes to production practices. Each team would figure out their tools and best practices themselves with trial-error. This reality often has a real tax not only on the success of the projects but also on engineers.

Trial-error culture creates an environment where finger pointing and blaming is more common. Once these behaviors are common, it becomes harder to learn from mistakes or not to repeat them again.

Successful organizations:

Production readiness involve a “review” process. Reviews can be a checklist or a questionnaire. Reviews can be done manually, automatically or both. Organizations can produce checklist templates rather than a static list of requirements that can be customized based on the needs. By doing so, it is possible to give engineers a way to inherit knowledge but also enough flexibility when it is required.

When to review a service for production readiness?

Production readiness reviews is not only useful right before pushing to production, they can be a protocol when handing off operational responsibilities to a different team or to a new hire. Use reviews when:

Production readiness checklists

A while ago, I published an example checklist for production readiness as an example of what they can cover. Even though the list came to existence when working with Google Cloud customers, it is useful and applicable outside of Google Cloud.

Design and Development

Configuration Management

Release Management

Observability

Security and Protection

Capacity planning