How to deal with spaghetti code

The other day, one of my students asked me how I deal with tangled, messy code when inheriting projects other people wrote or started.

It was timely, because on a recent client project, I inherited a single 2,200+ line JavaScript file containing code for half a dozen different pages. A lot of it was copied and pasted, and none of it was documented.

Let’s dig in!

The first rule of refactoring: don’t break stuff!

One of my other students quickly pointed out that the most important thing when tackling a project like this is to not break stuff.

Working, imperfect code is better than perfect code that breaks a bunch of stuff in the process. And if you try to do a full from-scratch rewrite, you WILL break a bunch of stuff.

Tech debt, unfortunately, needs to get paid down slowly over time. Big rewrites often just introduce their own tech debt.

My process

Here’s my general process for doing something like this (in this order)…

Modularize the code base.
Add documentation to everything.
Write tests, once you figure out what the code is supposed to do.
Refactor/rewrite things.

The idea behind this structure is to break things up into smaller and more readable or manageable parts, then identify what everything does and how it should work.

Once you understand the system, then you can start to change it.

I’m admittedly terrible about writing tests, but they provide the added safety of catching things that break as you refactor.

Some examples

With that big client project I mentioned earlier, the first thing I did was break the 2,200+ lines of JavaScript into about 8 smaller files.

It was actually relatively easy for this project because the original author had grouped collections of functions (for different pages) into objects…

const homepage = {
	// ...
};

const scheduling = {
	// ...
};

On some projects, it’s less clear what goes where.

I originally just pulled them all into standalone files that I loaded globally. The next step is to convert them into ES modules so that you can import them only where needed, but that’s the kind of refactor that can break working code, so it waits.

Next, I added documentation to everything using JSDoc.

This is important because it forces you to read each function line-by-line, parse out what’s happening, and document it. In my experience, it’s the best way to really understand what’s happening in a code base, and in particular, which parts are unclear or need a closer look.
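Here’s a rough sketch of what that documentation pass can look like. The formatTime() function is a made-up example, not something from the client project. The JSDoc block describes the inputs and output, and the inline comments capture what each line turned out to be doing.

```javascript
/**
 * Format an appointment time for display.
 * NOTE: hypothetical function, used here for illustration only.
 * @param  {Number} hour    The hour, in 24-hour time (0-23)
 * @param  {Number} minutes The minutes (0-59)
 * @return {String}         The time in 12-hour format
 */
function formatTime (hour, minutes) {

	// Noon and later is PM
	const period = hour >= 12 ? 'PM' : 'AM';

	// Convert to 12-hour time: 0 becomes 12, 13 becomes 1
	const displayHour = hour % 12 || 12;

	// Pad single-digit minutes with a leading zero
	const displayMinutes = String(minutes).padStart(2, '0');

	return `${displayHour}:${displayMinutes} ${period}`;

}
```

Writing the comments is where the understanding happens. If you can’t explain a line, that’s a part to flag and come back to.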

Once that’s done, write tests for the things you understand, if you can. I personally favor integration tests for stuff like this, but unit tests can be useful, too.
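As a minimal sketch of the idea, here’s a test that pins down what a function currently does before you touch it. Both getTotal() and the tiny test() helper are hypothetical stand-ins. On a real project you’d likely reach for a proper test runner, and an integration test would exercise a whole page rather than one function.

```javascript
// Hypothetical legacy function: the tests record what it does today
function getTotal (items) {
	let total = 0;
	for (const item of items) {
		total += item.price * item.qty;
	}
	return total;
}

// Tiny stand-in for a test runner: logs the outcome and returns true or false
function test (label, actual, expected) {
	const passed = actual === expected;
	console.log(`${passed ? 'PASS' : 'FAIL'}: ${label}`);
	return passed;
}

// Each entry is true if the current behavior matches what we documented
const results = [
	test('multiplies price by quantity and sums', getTotal([
		{price: 5, qty: 2},
		{price: 3, qty: 1}
	]), 13),
	test('returns 0 for an empty list', getTotal([]), 0)
];
```

If one of these starts failing during a refactor, you know you changed behavior and not just structure.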

Then, finally, you can refactor things.

I usually start with the easy, low-hanging fruit. For example, there was a function in the code base to get query string parameters from the URL. It used an older regex-based approach.

I was able to swap that out for the URLSearchParams() object, without having to change anything about how the function works or is run in the code base.
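To make that concrete, here’s roughly what that kind of swap can look like. I’m guessing at the general shape of a regex-based helper here. The function names and the (name, url) signature are assumptions, not the client’s actual code.

```javascript
// Older regex-based approach (an assumed example of the pattern)
function getParamOld (name, url) {
	const match = new RegExp('[?&]' + name + '=([^&#]*)').exec(url);
	if (!match) return null;
	// Treat + as a space, then decode percent-encoded characters
	return decodeURIComponent(match[1].replace(/\+/g, ' '));
}

// Same inputs, same output (the value, or null if missing), using URLSearchParams
// Note: new URL() expects an absolute URL
function getParam (name, url) {
	const params = new URLSearchParams(new URL(url).search);
	return params.get(name);
}
```

Because the inputs and return value stay the same, none of the code that calls the function has to change.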

I also located any repeated code, pulled it out into a shared utilities.js file, and eliminated some redundancies there.
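A simplified sketch of that kind of cleanup is below. The formatPrice() helper and the two methods are invented for illustration, and everything is shown in one block even though the helper would live in utilities.js.

```javascript
// What might live in utilities.js (hypothetical helper)
// Previously, imagine a copy of this logic pasted into each page object
function formatPrice (cents) {
	return '$' + (cents / 100).toFixed(2);
}

// Both page objects now call the shared helper instead of their own copy
const homepage = {
	renderPrice (cents) {
		return formatPrice(cents);
	}
};

const scheduling = {
	renderFee (cents) {
		return formatPrice(cents);
	}
};
```

The methods on each object keep their names, so the rest of the code base doesn’t need to know anything moved.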

Every project will be different

Some projects may already be modularized but poorly documented. Some may be very well documented, but with everything in one giant file. Some may have tests already.

In my experience, a lot of spaghetti code is cooked up because developers are being pushed too fast and the output expectations are too high. The result is code that works well in the short term but becomes a maintenance liability in the long run.