The Deployment Problem
The ‘It works on my machine’ problem.
Deployment Antipatterns
Doing it manually:
- Requires extensive/verbose deployment
- Relies on manual testing
- Requires calling the development team to figure out what went wrong
- Requires updating the deployment sheet since the production environment is different
- Requires manual tweaking
Waiting until the very end to deploy a release:
- The software is tested on a new platform for the first time
- e.g. firewall rules preventing something from working
- Developers and IT often separated
- The bigger the system, the more uncertainties
More Releases
Increase release frequency: you will either suffer more or find a way to make it easy.
Continuous Integration:
- Has automated builds with testing
- Enforces self-contained sources
- Requires deployment scripts to be written
- Is done many times after each commit
Benefits of Continuous Delivery:
- Empowers teams to test and deploy the build you want
- Reduces errors; avoid error-prone, un-versioned configurations
- Lowers stress: frequent and smaller changes and a rollback process
Deployment Strategies
Adequate preparation.
You need to:
- Model the target environment
- Prepare the deployment infrastructure
- Understand server configuration
- Define the deployment strategy
Automate as much as possible:
- Use third-party libraries/servers that use textual config files; avoid GUI
- Use build chains and dependency management systems (e.g. maven, gradle)
- Prepare virtual environments (e.g. docker)
Create a disaster recovery process (e.g. rolling back DB to previous schema)
Configuration Management
Keep everything under version control:
- Source files
- Configuration files including DNS zone files, firewall settings etc.
- BUT NOT PASSWORDS/SENSITIVE DATA; store these only locally (environment variables)
Think about dependencies: automated dependency retrieval is handy, but try to avoid having to ‘download the internet’.
Deployment Pipeline
- Commit stage: on the development branch, commit, test and review, run CI pipeline
- Acceptance stage: acceptance tests
- User Acceptance Tests (pre-release stage): user tests
- Capacity stage: capacity tests
- Production: smoke tests
After each stage, review the results/metadata from the pipeline.
Store the last few binaries in the artefact repository so that you can easily rollback.
Practices
Build only once:
- Even a slight difference in build environment may cause issues
- Use the build from acceptance stage
- Production servers cannot be updated as frequently so the libraries in there may be different; by making the build constant, it narrows down where the issue may be
- Deploy the same way everywhere
- Smoke-test deployments’ script the automated start-up of the system with a few simple requests
- Deploy a copy of production with the same networks, firewalls, OS and application stack
- Propagate changes into the whole pipeline as they appear; if a bug appears, go through the entire process again
Creating Deployment Strategies
- Collaborate with all parties in charge of the environments
- Create a deployment pipeline plan and configuration management strategy
- Create a list of environment variables (secrets) and a process of adding them
- A list of monitoring requirements and solutions
- Discuss when third-parties are part of the testing
- Create a disaster recovery plan
- Agree on a service level agreement (SLA) and on support
- Create an archiving strategy of outdated data
Zero Downtime Releases
Negate as much downtime as possible (or at least, during peak times).
Hotplug facilities exist:
- Decouple the system, modularizing the code so that pieces can be migrated
- Offers roll-back to the previous version if necessary
If not possible:
- Reroute web-based systems
- If data migrations are required, a data freeze may be required
Blue-Green Deployment
In distributed systems e.g. web apps, re-routing is easy and it is possible to have multiple versions available at the same time.
Have two identical environments, blue and green for each server type (e.g. web server, application server, database server), using a router to determine which slice is being used at any given time.
Canary Deployment
Real-world systems are not easily cloneable:
- Real-work constraints not fully testable
- Scalability/response-time
Hence, use canary deployment development:
- Deploy a subset of servers with the new version
- Define a subset of users that will receive the new version (e.g. by IP address)
- Gradually increase the size of the set