What Is Software Regression — And Why It Hits Hardest When You Can’t Stop to Fix It



// TECHNICALLY SPEAKING

$ git log --oneline --since=yesterday
Updated 2026-06-16

Yesterday felt like going backwards. I fixed eight things and broke three of them again by the end of the day. The word for that in software is regression — and it’s one of the most demoralizing bugs to catch when you’re also the one answering phones and driving to serve papers.

I run a real California field-services company. It’s fully automated — intake, invoicing, dispatch, proof generation, client comms, all of it — on software I wrote myself. No computer science degree. I built it because I had to, and because I could see exactly what a working system needed to do. Over 1,000 cases through it now. It works.

But yesterday was one of those days where the system works and you’re the problem. You fix something that’s been broken for a week, push it live, go serve someone’s legal papers, come back an hour later, and a thing that was working before your push isn’t working anymore. You opened a hole somewhere. And you don’t know where. And there’s another serve in 45 minutes.

This is what software people call regression. Not because you failed. Because the thing you changed had a relationship with something else you didn’t think about. Let me explain it from first principles — what it is, why it happens, and the one discipline that actually stops it from compounding. You don’t need a degree. You need the right mental model.

1. What regression actually is

// THE PRINCIPLE

A software regression is when working behavior stops working after you change something. The thing you changed might be totally correct. The break happens somewhere else — in a piece of code that depended on the old behavior and didn’t know the behavior changed.

The key phrase is “was working.” A regression is not a new bug. It’s an old feature, breaking. That distinction matters because it means your change introduced the problem, and the person who introduced it is you, right now, with code you thought you understood.

// WHY IT WORKS THIS WAY

Software is not a set of independent tools. It’s a web of relationships. Every function you write has expectations about the world it lives in — what format data arrives in, what keys exist in a dict, what a status value means. When you change one thing in that web, every function that touched the thing you changed gets a new neighbor it wasn’t introduced to.

Most of the time the new neighbor is compatible. Sometimes it isn’t. And the gap between “I think this is compatible” and “I know this is compatible” is exactly where regressions live.

// A REAL EXAMPLE

Yesterday I fixed a bug where my draft detection logic was using a string comparison that was too strict. A case outcome of "SERVED — left with resident" wasn’t matching my classifier that looked for "served". Fixed it. Made the comparison case-insensitive and prefix-aware. Pushed it. Came back from a serve two hours later to find that my dedup system — which also called the same string comparison — was now treating cases as “re-serveable” that it was supposed to leave alone. Same fix, different consumer, opposite effect. The classifier got smarter and the dedup guard got confused, because neither one was consulting a shared definition of “done.” Two independent interpretations of the same string. One fix, one regression.
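A minimal sketch of what a shared definition of “done” could look like. The names `Status`, `parse_status`, `is_done`, `should_send_completion_email`, and `is_reserveable` are hypothetical, not the real system's code, and the rule that only a served outcome counts as done is an assumption. The point is only that both consumers ask the same function.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    SERVED = "served"
    ATTEMPTED = "attempted"
    UNKNOWN = "unknown"


def parse_status(raw: Optional[str]) -> Status:
    """The one place that turns a free-text outcome into a status."""
    text = (raw or "").strip().lower()
    if text.startswith("served"):
        return Status.SERVED
    if text.startswith("attempted"):
        return Status.ATTEMPTED
    return Status.UNKNOWN


def is_done(raw: Optional[str]) -> bool:
    """Shared definition of "done" (assumption: only SERVED counts)."""
    return parse_status(raw) is Status.SERVED


# Consumer 1: the classifier side decides whether to announce completion.
def should_send_completion_email(raw: Optional[str]) -> bool:
    return is_done(raw)


# Consumer 2: the dedup guard decides whether a case may be served again.
def is_reserveable(raw: Optional[str]) -> bool:
    return not is_done(raw)
```

With this shape, changing what counts as done is a single edit to `is_done`, and the two consumers cannot drift into separate interpretations of the same string.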

2. Why it’s brutal when you’re building and operating at the same time

// THE PRINCIPLE

In a normal software company there’s a wall between the people who build the thing and the people who run it. QA tests the build before it ships. A staging environment absorbs the regressions before real users do. And if something breaks in production, there’s a whole on-call rotation to absorb the heat while engineering patches it.

Solo operators don’t have any of that. When I push a bug at noon, I find out about it at 2 PM when a client emails asking why their update didn’t fire. I’m usually in the car. I pull over, diagnose it on my phone, write the hotfix in my head, and either wait until I’m back at the desk or do something that’s almost definitely going to introduce the next regression.

// WHY IT COMPOUNDS

The danger isn’t a single regression. It’s the emergency patch you write for the regression, under time pressure, without proper context, that introduces the next one. This is the regression cascade — and it’s a real phenomenon that veteran engineers are trained to avoid by slowing down when the urge to sprint is highest.

I failed at this yesterday. I fixed a broken dedup window, which uncovered a classifier issue, and my rushed fix for the classifier broke the email batching logic downstream. By 6 PM I had solved the original problem from the morning and created two new problems that hadn’t existed at 8 AM. That’s what regression days feel like. Every step forward reveals the next hole.

# yesterday, in roughly chronological order
08:00 fix dedup window bug → pushed
10:30 dedup fix exposes classifier mismatch
11:15 fix classifier → pushed, between serves
14:00 client asks why their status email didn’t fire
14:12 classifier fix broke email batching
14:45 fix batching → pushed from parking lot
17:00 batching fix introduced timing race
18:30 one more push. holding my breath.

3. The root cause: hidden dependencies

// THE PRINCIPLE

Regressions have one root cause: a piece of code depends on something you changed, and that dependency wasn’t visible when you made the change. The dependency is hidden — either because it’s implicit (inferred from context rather than named), because it’s not tested, or because the system grew complex enough that the full dependency graph isn’t in anyone’s head anymore.

// WHY IT’S HIDDEN

Every function you write knows about its inputs and outputs. What it doesn’t know is who’s calling it, and what they’re assuming about what it returns. The callers are the hidden dependents. When you change what a function returns — even to make it more correct — every caller absorbs that change silently, with no warning, until the program runs and something breaks.
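One way to make those hidden callers visible before changing a function is to list every call site. The sketch below uses Python's standard `ast` module. The helper names `find_calls_in_source` and `find_callers` are hypothetical, and it only catches direct calls by name; aliases, dynamic dispatch, and callers outside Python files will not show up.

```python
import ast
from pathlib import Path
from typing import Iterator, List, Tuple


def find_calls_in_source(source: str, func_name: str) -> List[int]:
    """Return the line numbers where func_name(...) is called in source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        target = node.func
        if isinstance(target, ast.Name):          # classify_outcome(...)
            name = target.id
        elif isinstance(target, ast.Attribute):   # module.classify_outcome(...)
            name = target.attr
        else:
            continue
        if name == func_name:
            hits.append(node.lineno)
    return sorted(hits)


def find_callers(root: str, func_name: str) -> Iterator[Tuple[str, int]]:
    """Yield (path, line) for every call to func_name under root."""
    for path in sorted(Path(root).rglob("*.py")):
        source = path.read_text(encoding="utf-8")
        for lineno in find_calls_in_source(source, func_name):
            yield str(path), lineno
```

Looping over `find_callers(".", "classify_outcome")` and printing each path and line gives a checklist of dependents to read before the change ships. A file that does not parse raises `SyntaxError`, which this sketch does not handle.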

In a large team this gets handled by code review, integration tests, and keeping modules small enough that a change’s blast radius is predictable. Solo, you have to substitute rigor for headcount. You can’t have a second pair of eyes on every diff. But you can have a test that acts like one.

# the hidden dependency pattern
 
def classify_outcome(raw: str) -> str:
    return "served" if "served" in raw else "unknown"
 
# caller A: knows about this
if classify_outcome(job.outcome) == "served":
    send_completion_email(job)
 
# caller B: also depends on "served" meaning "definitely done"
# but nobody told caller B the definition could change
if classify_outcome(job.outcome) != "served":
    schedule_retry(job)  # 💀 fires on every "partially_served" too now

The fix to Caller A was right. But Caller B existed in the same system, reading the same function, and had a completely different interpretation of what the edge cases mean. I didn’t touch Caller B. Caller B broke anyway.
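A sketch of how Caller B's assumption could be written down so the next change cannot break it silently. It assumes the prefix-based classifier, pulls the retry decision into a hypothetical `should_retry` function so a test can reach it, and uses made-up sample outcomes. Which outcomes belong in each list is a business decision this sketch does not settle.

```python
def classify_outcome(raw: str) -> str:
    """Stand-in for the shared classifier, using the prefix rule."""
    text = (raw or "").strip().lower()
    if text.startswith("served"):
        return "served"
    if text.startswith("attempted"):
        return "attempted"
    return "unknown"


def should_retry(raw: str) -> bool:
    """Caller B's decision, pulled into a function so a test can reach it."""
    return classify_outcome(raw) != "served"


# Assumed sample outcomes; real lists would come from real case records.
MUST_NOT_RETRY = ["served", "SERVED", "Served, left with resident"]
MUST_RETRY = ["attempted, no answer", "Attempted, gate locked"]


def test_done_outcomes_never_schedule_a_retry():
    for raw in MUST_NOT_RETRY:
        assert not should_retry(raw), raw


def test_open_outcomes_still_schedule_a_retry():
    for raw in MUST_RETRY:
        assert should_retry(raw), raw
```

The classifier can now change for Caller A's sake, and if that change flips a retry decision, these two tests fail before a client notices.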

4. The discipline that actually helps: reproduce first, lock it after

// THE PRINCIPLE

The single most valuable thing you can do when you find a bug — before you write a single line of fix — is write a test that reproduces it. Not to understand it better, though that happens. To prove it’s actually broken, so you know what broken looks like, so you can confirm the fix without guessing.

Then after the fix passes, you leave the test in. That test is now your regression guard. It runs every time you push. The specific breakage you just fixed can never silently come back.
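The text does not say what triggers the tests on a push. One common option is a Git pre-push hook, sketched below. The test command is an assumption (pytest run through the current interpreter), and `run_checks` and `main` are hypothetical names.

```python
#!/usr/bin/env python3
"""Sketch of a .git/hooks/pre-push script: refuse the push if tests fail."""
import subprocess
import sys

# Assumed test command; replace with whatever runs your suite.
TEST_COMMAND = [sys.executable, "-m", "pytest", "-q"]


def run_checks(command=None) -> int:
    """Run the test command and return its exit code (0 means success)."""
    completed = subprocess.run(command or TEST_COMMAND)
    return completed.returncode


def main() -> int:
    """Run the checks and report a failure on stderr."""
    code = run_checks()
    if code != 0:
        print("Tests failed; push blocked.", file=sys.stderr)
    return code
```

To act as a hook, the file would end with `sys.exit(main())`, be saved as `.git/hooks/pre-push`, and be marked executable. Git cancels the push when the hook exits with a non-zero status.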

// WHY THIS WORKS

The instinct when you’re under pressure is to jump straight to the fix. You think you know what’s wrong. You’re probably right. But “probably right” is not a system — it’s a guess with good odds. The reproduction test converts the guess into a verifiable claim. You’re not fixing the thing you think is broken; you’re fixing the specific condition that the test failed on. That’s a much narrower, more precise target.

And leaving the test behind is what stops the regression. You’re creating a permanent record of a failure mode. Every future change to the system runs against that record. If anything breaks that exact behavior again, you know immediately — not two hours later when a client notices. The fix cost you an hour. The test returns that hour, compounded, every time it fires. Over 1,000 cases I’ve found that the regressions that never came back are the ones that got a test. The ones that came back twice are the ones I fixed in a hurry without writing one.

// APPLY IT

# 1. reproduce the bug first — a test that FAILS right now
def test_served_with_note_classifies_correctly():
    # this is the exact input that was breaking yesterday
    result = classify_outcome("SERVED — left with resident at 2:14 PM")
    assert result == "served"  # was returning "unknown", which proves the bug exists
 
# 2. write the fix
def classify_outcome(raw: str) -> str:
    text = (raw or "").strip().lower()
    if text.startswith("served"):
        return "served"
    if text.startswith("attempted"):
        return "attempted"
    return "unknown"  # safe default
 
# 3. run the test again — it should pass now
# 4. leave the test in — this specific break can never silently return

The test runs in under a second. It lives in my test suite. If I change classify_outcome again six months from now for a completely unrelated reason, the test screams if it breaks this behavior. That’s the compound return on one hour of attention.

The thing I did wrong yesterday is what most solo builders do under pressure: fixed the output first, then skimmed the change, then pushed because there was somewhere to be. The fix was correct. The test wasn’t there to catch the downstream caller it broke. Now both are there.

5. The real cost — and why persistence is a technical skill

Yesterday was exhausting. Not because the bugs were hard — they were all findable. It was exhausting because every fix happened between serves, between phone calls, between doing the actual job the business runs on. There’s no pause button on operations. The pipeline doesn’t stop moving because the engineer is also in the field.

The thing I’ve learned after 1,000+ cases through this system is that regression days are part of the job for a solo builder-operator. Not a sign the system is broken. A sign the system is alive enough to have surface area. Static software doesn’t regress. Evolving software does — and the evolution is what makes it better.

The temptation when a day goes like this is to stop pushing changes. Leave it stable. Don’t touch anything that isn’t on fire. I understand why people do that. But it’s the wrong instinct. The answer isn’t to stop shipping — it’s to ship with better guards. Reproduction tests. Regression tests. A short list of behaviors that must stay true every time you change anything.
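That short list can live in code. A minimal sketch, assuming the prefix-based classifier: `MUST_STAY_TRUE` and `broken_behaviors` are hypothetical names, and the three entries are illustrative rather than the real list.

```python
from typing import Callable, List, Tuple


def classify_outcome(raw: str) -> str:
    """Stand-in classifier, assumed to follow the prefix rule."""
    text = (raw or "").strip().lower()
    if text.startswith("served"):
        return "served"
    if text.startswith("attempted"):
        return "attempted"
    return "unknown"


# Each entry: (plain-language behavior, zero-argument check returning bool).
MUST_STAY_TRUE: List[Tuple[str, Callable[[], bool]]] = [
    ("a served outcome with a trailing note is still served",
     lambda: classify_outcome("SERVED, left with resident") == "served"),
    ("casing does not change the result",
     lambda: classify_outcome("served") == classify_outcome("SERVED")),
    ("an empty outcome falls back to unknown",
     lambda: classify_outcome("") == "unknown"),
]


def broken_behaviors() -> List[str]:
    """Return the description of every behavior that no longer holds."""
    return [name for name, check in MUST_STAY_TRUE if not check()]
```

An empty result from `broken_behaviors()` means every listed behavior still holds. Each fixed bug adds one more line to the list.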

You build that list one bug at a time. Yesterday I added six tests I didn’t have that morning. The day felt like regression. The code is more resilient than it was 24 hours ago.

That’s the discipline. Not never breaking things. Knowing, every time you fix something, that the break can’t come back without you finding out immediately. You write that guarantee into the system one test at a time, and eventually you’ve got a machine that tells on itself before your clients do.

$ whoami

Jesse Moraga — registered California process server, PS-124, who built the system that runs his company from scratch. Fresno, CA. Over 1,000 cases handled personally. github.com/JesseMoraga · jessemoraga.com/system
