Why does our deployment always break?
Every software developer knows this: there are problems with the deployment again. Sometimes the build fails even though it had been working fine the whole time; other times everything is just slow. And in the end, everyone sits in front of the application, frantically clicking around to make sure everything still works. I think a lot goes wrong here, which is why I’m trying to shed some light on it in this article by looking at a few problems I see again and again.
Deadline, Deployment Today
In my career, I’ve seen many different approaches to this phenomenon. For many Scrum-like teams, this day is sometime near the end of the sprint. In more traditional models, it’s often just before the handover to the customer or to another team. Sometimes, though, it’s simply every Tuesday or Wednesday.
Notice anything here? — A day is set, nothing more!
Deployment is often simply ritualized: it happens on the deadline, no matter what. The changes included are often large and risky, yet at the same time not fully tested.
What’s interesting is that I often find that the teams are aware of this. Very often, they also try to solve the problem, and I’ve observed a few patterns repeatedly:
Our tickets are too large — we need to create smaller tickets
Our tickets aren’t specific enough — the requirements need to be more precise
Deploying on a Friday — that’s too risky, we won’t do it
Deploying one day before the end of the sprint — that always fails, we’ll do it two days before the end of the sprint
And do the measures help?
Taken on their own, these measures are often useful. However, they rarely improve deployment and merely reduce the pressure to succeed.
The real problem here is the commitment to a specific date! It creates unnecessary pressure, for example when a fix absolutely has to go out, and it creates the impression that you can only deploy on these days. Conversely, some developers practically rely on the extra caution exercised on deployment day. If such code then fails on deployment day, it is often so old that nobody remembers it anymore.
Thus, these deployments on certain days generate unnecessary work and tie up many developers. That’s a lot of valuable time for a relatively small return.
No deadline, but deployment still happens
The solution to this problem is simply to deploy more frequently. This, however, raises a different set of questions, though you can see that they are already more concrete than before:
How do we ensure that only what is actually ready gets deployed?
What happens to incomplete developments?
A rollback still costs us a lot of time and energy.
Do we now test everything every day after deployment?
All of these questions deserve detailed answers, which is why I will address them in separate articles. For now, I’d just like to throw out a few keywords that at least point toward those answers:
Test pyramid
Test-driven development
Feature flags / feature toggles
Forward-only deployments
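To make at least the feature flag idea a bit more tangible, here is a minimal sketch of how unfinished work can be merged and deployed without being visible to users. The flag names and the configuration source are hypothetical; in practice, flags usually come from a configuration service rather than from code.

```typescript
// Minimal feature flag sketch (hypothetical flag names and configuration source).
// Unfinished code can be merged and deployed, but stays switched off for users.

type FeatureFlag = "newCheckout" | "betaSearch";

// In a real setup, flags usually come from a config service or environment,
// not from a hard-coded map.
const flags: Record<FeatureFlag, boolean> = {
  newCheckout: process.env.FEATURE_NEW_CHECKOUT === "true",
  betaSearch: false, // still in development: deployed, but invisible
};

export function isEnabled(flag: FeatureFlag): boolean {
  return flags[flag] ?? false;
}

// Usage somewhere in the application:
if (isEnabled("newCheckout")) {
  console.log("rendering the new checkout");    // new, still incomplete path
} else {
  console.log("rendering the legacy checkout"); // current behavior for users
}
```

The point is that deploying and releasing become two separate decisions: the code ships whenever a build is green, and the feature is switched on later.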
Everyone come over here, we have to test the application
I’ve seen this behavior frequently as well. After lunch, everyone meets and discusses what needs to be done. One person has to execute the deployment, another checks that the database is still alive. Another person types the URL into the browser and tries to find out whether the application is still running. Because of all the caches, various requests look fine at first and only reveal themselves as errors later. Yet another person checks the numbers: are the users staying or leaving? Everyone works together on an Excel list that supposedly contains the important test scenarios.
It is above all teams with a very poor structure and unclear rules that take this approach. What is simply missing here are solid tests! Unfortunately, the latest software trends are taking us even further in this direction: newer JavaScript/TypeScript frameworks are often barely tested and barely documented, and our ubiquitous artificial intelligence also tends to produce duct-taped code whose tests are flawed or missing altogether.
Run the tests, we want to deploy
To prevent teams from wasting unnecessary time monitoring deployments, three measures are important:
The most important measure is good tests that run before deployment
Good monitoring, especially of critical components
Effective, quick, and targeted smoke tests after deployment
These are, again, only the most important keywords; I will discuss them in detail elsewhere:
Test pyramid
Test-driven development
Feature flags / feature toggles
Teams that have never written many tests often fall into the misconception that they have to cover everything with E2E tests and smoke tests. As a result, a test suite in Selenium / Playwright / Ghostinspector runs for ages, trying to capture every detail. This type of testing is the classic testing cone (the inverted test pyramid) and a very, very bad practice.
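By contrast, here is what a quick, targeted smoke test after deployment might look like, sketched with Playwright. The base URL, routes, and expectations are made up; the point is a handful of fast checks on critical paths, not a full E2E suite.

```typescript
// Minimal post-deployment smoke test sketch with Playwright.
// The URL and expectations are hypothetical; the idea is a few fast,
// targeted checks on critical paths, not a full E2E suite for every detail.
import { test, expect } from "@playwright/test";

const BASE_URL = process.env.SMOKE_BASE_URL ?? "https://example.com";

test("start page responds and renders", async ({ page }) => {
  const response = await page.goto(BASE_URL);
  expect(response?.ok()).toBe(true);              // no 5xx hiding behind a cached shell
  await expect(page.locator("h1")).toBeVisible(); // the page actually rendered
});

test("login page is reachable", async ({ page }) => {
  await page.goto(`${BASE_URL}/login`);
  await expect(page.locator("form")).toBeVisible();
});
```

If these few checks pass and the monitoring stays quiet, nobody needs to sit in front of the application clicking around.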
The deployment is built from the deployment branch
I haven’t come across this habit very often, but it does occur from time to time.
Here, teams work on one branch within their revision control system, e.g., develop, but deployment to different systems is done from other branches, e.g., main for production or release for staging.
However, a particularly insidious problem lurks here: strictly speaking, code that has never been tested often ends up running in production!
What’s particularly annoying is that several prominent strategies, such as GitLab Flow, advocate this approach, or at least don’t explain its pitfalls well.
What exactly is the problem?
The problem is that there can be two build processes for the same (but not identical) code, resulting in different artifacts.
I’ve often observed that developers submit code to one branch (development). Then, periodically, this branch is merged into another, such as main.
The code is then built a second time, and it is this second build that gets deployed.
Sometimes this second process fails, e.g., because the runtime has changed.
And then it becomes clear that deployment depends on successfully building everything twice!
The deployment is created from an artifact
This is the most complex solution, as entire processes need to be adapted. For most teams, a process based on many branches simply isn’t suitable. As a rule, it’s better to choose a simpler approach, with trunk-based development being the most important keyword here. You can also base your approach on Microsoft’s strategy.
Furthermore, the CI/CD process must be adapted so that every build creates a valid artifact that isn’t subsequently modified.
It’s important to note that, in principle, every successful build can represent a valid deployment!
However, I’ll explain this conversion in detail elsewhere.
An artifact is an archive containing everything necessary for deployment. A typical artifact might be a Docker image, a JAR file for Java projects, or a zip file containing the entire website, including images and other assets.
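As a small sketch of what "build once, deploy the same artifact everywhere" can look like: a hypothetical promotion script that takes a Docker image CI has already built and tested for a specific commit and re-tags it for an environment, without ever rebuilding. The registry name and tagging scheme are assumptions.

```typescript
// promote.ts - a minimal sketch (hypothetical registry and tagging scheme).
// The image was built exactly once by CI for commit <sha>; "deploying" to an
// environment only re-tags and pushes that existing artifact, it never rebuilds.
import { execSync } from "node:child_process";

const [sha, environment] = process.argv.slice(2); // e.g. "a1b2c3d" "production"
if (!sha || !environment) {
  console.error("usage: node promote.js <commit-sha> <environment>");
  process.exit(1);
}

const registry = "registry.example.com/my-app"; // hypothetical
const source = `${registry}:${sha}`;            // built and tested once by CI
const target = `${registry}:${environment}`;    // what the environment runs

// Pull the artifact that CI already built for this commit ...
execSync(`docker pull ${source}`, { stdio: "inherit" });
// ... and promote it without rebuilding anything.
execSync(`docker tag ${source} ${target}`, { stdio: "inherit" });
execSync(`docker push ${target}`, { stdio: "inherit" });

console.log(`Promoted ${source} to ${environment}`);
```

Whether the target environment then pulls the new tag automatically or a separate step triggers the rollout is a detail of the respective platform.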
Conclusion
I certainly haven’t described everything here that can cause a deployment to fail.
Nevertheless, I hope I’ve provided some food for thought without being too pretentious.
I plan to write a series of articles on the vast topic of deployments to delve deeper into individual aspects.