

The Strangler Fig Migration That Saved a 10-Year-Old Monolith

February 11, 2026 · ScaledByDesign
migration · monolith · microservices · legacy

The Rewrite That Almost Killed the Company

A Series B e-commerce company came to us with a familiar story: their 10-year-old PHP monolith was "unmaintainable." The previous CTO had started a ground-up rewrite in Node.js. Eighteen months and $2M later, the rewrite was 40% complete, had zero production traffic, and the original monolith was still getting worse.

They were about to double down on the rewrite. We talked them out of it.

If you've ever inherited a legacy codebase and thought "we should just rewrite this from scratch," you're not alone. It's one of the most tempting — and most dangerous — decisions in software engineering. Let us show you what we did instead.

Why Big-Bang Rewrites Fail

Here's the uncomfortable truth: most ground-up rewrites of production systems fail or dramatically exceed their timeline. And the reasons aren't what you'd expect — they're structural, not technical:

  • The moving target problem: The old system keeps getting new features while you rewrite. Feature parity is a finish line that keeps moving.
  • Hidden complexity: That "ugly" legacy code handles thousands of edge cases discovered over a decade. Clean-room rewrites rediscover each one painfully.
  • Team fatigue: Rewrite projects feel exciting in month 1 and soul-crushing by month 12. No new features, no user impact, just endless catch-up.
  • The 80/20 trap: You can rewrite 80% of functionality in 20% of the time. The last 20% — the weird edge cases, integrations, and business rules — takes forever.

The Strangler Fig Pattern

So what's the alternative? It's called the strangler fig pattern, named after the tropical fig that grows around a host tree and eventually replaces it. (Nature's been doing incremental migrations for millions of years — we might as well learn from it.)

The idea is simple: you put a routing layer in front of the legacy system, then redirect traffic to new implementations one endpoint at a time. The old system keeps handling everything you haven't migrated yet.

The key insight — and this is what makes it psychologically different from a rewrite — is that you never turn off the old system in one step. There's no big-bang cutover. No "go/no-go" meeting at 2am on a Saturday.

Step 1: Put a Router in Front

Before writing a single line of new application code, we put an nginx reverse proxy in front of the PHP monolith:

upstream legacy {
  server php-monolith:80;
}
 
upstream new_api {
  server node-api:3000;
}
 
server {
  # Default: everything to legacy
  location / {
    proxy_pass http://legacy;
  }
 
  # Migrated endpoints go to new service
  # (empty for now — we add these incrementally)
}

At this point, nothing changed about how the system worked. The proxy was completely transparent. But it gave us something powerful: the ability to redirect any endpoint to a new service with a single config change. That's the foundation everything else builds on.
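To make that concrete, here's what a switch looks like once an endpoint has been migrated: one `location` block added to the same config. This is a sketch (the `/api/products` path reflects the catalog endpoint we migrated first):

```nginx
server {
  # Migrated: product catalog now served by the new Node service.
  # Among prefix locations, nginx picks the longest match, so this
  # block wins over the "/" default below.
  location /api/products {
    proxy_pass http://new_api;
  }

  # Default: everything else still goes to legacy
  location / {
    proxy_pass http://legacy;
  }
}
```

Rolling back is the same move in reverse: delete the block, reload nginx, and traffic returns to the monolith.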

Step 2: Identify the Migration Order

Here's where teams often go wrong — they start migrating the easiest endpoints, not the most valuable ones. We scored every endpoint on three dimensions:

Factor              Question                                      Weight
Business value      How critical is this endpoint to revenue?    40%
Change frequency    How often does this code change?             35%
Complexity          How tangled is this with other code?         25%

Priority matrix:

  • High value + High change + Low complexity = Migrate FIRST
  • High value + Low change + Any complexity = Migrate LATER
  • Low value + Any change + High complexity = Migrate LAST (or never)

The product catalog API was our first target: high traffic, changed weekly, and relatively self-contained.
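The scoring above is simple enough to mechanize. Here's a hypothetical sketch, not our actual tooling: the 1-to-5 ratings and endpoint names are invented, and complexity is inverted so that simpler endpoints score higher:

```javascript
// Hypothetical weighted scoring, mirroring the table above:
// business value 40%, change frequency 35%, complexity 25%.
// Each factor is rated 1-5 by the team; complexity is inverted
// because LOW complexity should push an endpoint toward "migrate first".
function migrationScore({ businessValue, changeFrequency, complexity }) {
  return (
    0.40 * businessValue +
    0.35 * changeFrequency +
    0.25 * (6 - complexity)
  );
}

// Illustrative endpoints; ratings are made up for the example
const endpoints = [
  { name: "/api/products", businessValue: 5, changeFrequency: 5, complexity: 2 },
  { name: "/api/admin", businessValue: 2, changeFrequency: 1, complexity: 5 },
];

// Highest score migrates first
endpoints.sort((a, b) => migrationScore(b) - migrationScore(a));
// endpoints[0].name is now "/api/products"
```

The point isn't the arithmetic; it's that writing the weights down forces the team to argue about priorities once, up front, instead of re-litigating them for every endpoint.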

Step 3: Build, Shadow, Switch

This is where it gets interesting. For each endpoint, we didn't just build and ship. We moved through three distinct phases — and the shadow phase is what saved us from countless production issues:

// Phase A: Build the new implementation
// New service reads from same database as legacy
app.get("/api/products/:id", async (req, res) => {
  const product = await productService.getById(req.params.id);
  res.json(product);
});
 
// Phase B: Shadow traffic — both handle requests, compare results
app.get("/api/products/:id", async (req, res) => {
  const [newResult, legacyResult] = await Promise.all([
    productService.getById(req.params.id),
    legacyProxy.get(`/api/products/${req.params.id}`),
  ]);
 
  if (!deepEqual(newResult, legacyResult)) {
    logger.warn("Response mismatch", {
      endpoint: `/api/products/${req.params.id}`,
      diff: generateDiff(legacyResult, newResult),
    });
  }
 
  // Still serve legacy response during shadow phase
  res.json(legacyResult);
});
 
// Phase C: Switch traffic to new service via nginx config
// One config change. Instant. Reversible. Zero downtime.

The shadow phase was the secret weapon. We ran it for 2 weeks per endpoint, comparing every response between old and new. And here's the fun part — most of the mismatches we caught? They were bugs in the legacy system that nobody knew about. The migration actually improved data quality as a side effect.
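One practical wrinkle: naive deep equality flags every timestamp and request ID as a mismatch and drowns you in noise. A minimal sketch of the kind of comparison helper we mean, with hypothetical volatile field names (`updatedAt`, `requestId` are illustrative):

```javascript
// Fields that legitimately differ between the two systems on every
// request (names here are hypothetical examples).
const VOLATILE_FIELDS = new Set(["updatedAt", "requestId"]);

// Recursively strip volatile fields and sort keys so that key
// ordering doesn't cause false mismatches.
function normalize(value) {
  if (Array.isArray(value)) return value.map(normalize);
  if (value && typeof value === "object") {
    const out = {};
    for (const key of Object.keys(value).sort()) {
      if (!VOLATILE_FIELDS.has(key)) out[key] = normalize(value[key]);
    }
    return out;
  }
  return value;
}

// Compare the two responses after normalization
function responsesMatch(legacyResult, newResult) {
  return JSON.stringify(normalize(legacyResult)) === JSON.stringify(normalize(newResult));
}
```

Every mismatch that survives normalization is worth a human look — it's either a bug in the new code or, as we found, a bug in the old.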

Step 4: Strangle Incrementally

With the pattern established, we settled into a rhythm. Over 8 months, we migrated endpoints in priority order:

Month 1: Product catalog API (read-only, high traffic)
Month 2: Search API (complex but self-contained)
Month 3: User authentication (needed modern JWT)
Month 4: Cart and checkout (highest business value)
Month 5: Order management (complex, many edge cases)
Month 6: Inventory sync (integration-heavy)
Month 7: Reporting and analytics (new data models)
Month 8: Admin tools (lowest priority, highest complexity)

Each migration followed build → shadow → switch. Each was independently deployable and reversible.

The Database Problem

Now, everything we've talked about so far is the easy part. Seriously. The hardest part of any migration isn't the code — it's the data. During the transition period, both systems need to read and write the same data without stepping on each other.

We used CDC (Change Data Capture) to keep databases in sync during the transition:

const cdcStream = createCDCStream({
  source: "legacy_postgres",
  tables: ["products", "categories", "inventory"],
  target: "new_postgres",
  transform: (record) => schemaMapper.transform(record),
  onConflict: "source_wins", // Legacy is truth until cutover
});

The hybrid approach added complexity, but it let us migrate databases one domain at a time — just like the code.
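The transform step in that stream is where schema drift gets absorbed. A hypothetical sketch of what a `schemaMapper.transform` might look like — all the column names here (`product_name`, `price_cents`) are invented for illustration, not from the real system:

```javascript
// Hypothetical mapper from the legacy row shape to the new schema.
const schemaMapper = {
  transform(record) {
    return {
      id: record.id,
      // Legacy used "product_name"; new schema uses "name"
      name: record.product_name,
      // Legacy stored integer cents; new schema stores a decimal string
      price: (record.price_cents / 100).toFixed(2),
    };
  },
};
```

Keeping the mapping in one pure function pays off twice: it's trivially unit-testable, and when the cutover comes, it documents exactly how the two schemas relate.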

The Results

So, did it work? After 8 months of incremental migration:

Metric                 Before           After
Deploy frequency       Every 2 weeks    Multiple times/day
Deploy time            45 minutes       3 minutes
Incidents/month        3-4              ~1
Developer onboarding   6 weeks          2 weeks
Rollbacks needed       N/A              3 (resolved in <5 min each)

Total cost: ~$400K over 8 months. The failed rewrite had already burned $2M with nothing to show for it.

Zero downtime during the entire migration. Not a single customer-facing outage caused by the migration itself.

When the Strangler Fig Doesn't Work

We're not going to pretend this is a silver bullet. It's not always the right choice, and we'd be doing you a disservice if we didn't say so:

Use the strangler fig when:

  • The legacy system is in production with real traffic
  • The business can't afford downtime or feature freezes
  • The system is large enough that a rewrite would take 6+ months
  • You can put a routing layer in front of the legacy system

Consider a rewrite when:

  • The system is small (under 10K lines of code)
  • No production traffic depends on it
  • The technology is truly dead (no security patches available)
  • You're changing the fundamental architecture, not just the language

What This Means for Your Legacy System

Here's the uncomfortable truth that nobody talks about: every system becomes legacy eventually. The PHP monolith we strangled was someone's clean, modern architecture 10 years ago. The Node.js services we built will be someone else's legacy in 2036. Probably sooner.

The goal isn't to build a system that never needs replacing — that's a fantasy. It's to build systems that can be replaced incrementally when the time comes. Clear boundaries. Well-defined APIs. Services that are independent enough to swap out one at a time.

The strangler fig doesn't just replace old code — it teaches you how to build systems that are replaceable by design. And honestly? That's a more valuable lesson than any technology choice you'll ever make.

