Avoid rewriting a legacy system from scratch, by strangling it

This article is a repost promoting content originally published elsewhere.

Sometimes, code is risky to change and expensive to refactor.

In such a situation, a seemingly good idea would be to rewrite it.

From scratch.

Here’s how it goes:

  1. You agree with management on a strategy: stop shipping new features for some time, while you rewrite the existing app.
  2. You estimate the rewrite will take 6 months to cover what the existing app does.
  3. A few months in, a nasty bug is discovered and ABSOLUTELY needs to be fixed in the old code. So you patch the old code and the new one too.
  4. A few months later, a new feature has been sold to the client. It HAS TO BE implemented in the old code—the new version is not ready yet! You need to go back to the old code but also add a TODO to implement this in the new version.
  5. After 5 months, you realize the project will be late. The old app was doing way more things than expected. You start hustling more.
  6. After 7 months, you start testing the new version. QA raises a lot of issues that need to be fixed.
  7. After 9 months, the business can’t stand “not developing features” anymore. Leadership is not happy with the situation, and you are tired. You start making changes to the old, painful code while trying to keep up with the rewrite.
  8. Eventually, you end up with two systems in production. The long-term goal is to get rid of the old one, but the new one is not ready yet, so every feature needs to be implemented twice.

Sounds fictional? Or familiar?

Don’t be ashamed: it’s a very common mistake.

I’ve rewritten legacy systems from scratch before. Sometimes it’s all worked out, and sometimes it hasn’t, but either way: it’s always been a lot more work than I could possibly have estimated. I’ve learned to try to avoid doing so: at least, to avoid replacing a single monolithic (living) system in a monolithic way. Nicolas gives an even better description of the true horror of legacy reimplementation, and promotes progressive strangulation as a candidate solution.
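For the unfamiliar: “strangulation” means putting a thin routing layer in front of the living system and migrating one route at a time, rather than swapping everything out at once. As a minimal sketch of the idea, assuming the old and new apps are both HTTP services (the ports and path prefixes here are made up for illustration), the routing layer can be as small as this Go program:

```go
// strangler.go: a minimal strangler-pattern proxy (a sketch, not production code).
// Assumes a legacy app on :8080 and its replacement on :9090 - both invented here.
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

func main() {
	legacyURL, err := url.Parse("http://localhost:8080")
	if err != nil {
		log.Fatal(err)
	}
	newURL, err := url.Parse("http://localhost:9090")
	if err != nil {
		log.Fatal(err)
	}

	legacy := httputil.NewSingleHostReverseProxy(legacyURL)
	modern := httputil.NewSingleHostReverseProxy(newURL)

	// Path prefixes already re-implemented in the new system. This list grows,
	// route by route, until the legacy app serves nothing and can be retired.
	migrated := []string{"/invoices", "/reports"}

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		for _, prefix := range migrated {
			if strings.HasPrefix(r.URL.Path, prefix) {
				modern.ServeHTTP(w, r) // this route has been strangled
				return
			}
		}
		legacy.ServeHTTP(w, r) // everything else still goes to the old code
	})

	log.Fatal(http.ListenAndServe(":8000", nil))
}
```

Each time a feature is finished in the new system, its path prefix moves onto the migrated list; the legacy app keeps serving everything else in the meantime, so there’s never a big-bang cutover and the old system is only switched off once it serves nothing.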

2 comments

  1. Spencer says:

    I don’t see this. I feel like the motivation for a rewrite is that you want to make some fundamental change, like switching frameworks or languages. In that case it’s hard to proxy the old code, and doing so will probably incur a performance hit (e.g. from calling the old tool as a separate process and parsing its output).

    1. Dan Q says:

      Interesting take. But Moore’s Law is still on the side of (some kind of) proxying. So long as your legacy system ran for at least a few years (it did, right?) then you’d expect to be in a reasonable position to emulate it, all being well, by the time you’re replacing it. Or move it behind a reverse proxy (reverse proxies are so fast these days that people introduce third-party doglegs like Cloudflare in order to gain their other benefits), whichever’s appropriate. For some legacy applications, such a move can actually be an improvement, e.g. there are sometimes performance benefits in emulated filesystems or in caching reverse proxies and there are often security benefits in adding a more modern, patchable system in front of your old, hard-to-maintain one.

      I’m not defending the author of this piece completely, mind. I’ve also done reimplementations that were absolutely the right choice. But I agree with them that rewrites are overused.
