ReDoS: Regular Expression Denial of Service Attacks

A single HTTP request containing a carefully crafted string can make your Node.js server spend minutes — or hours — evaluating a regular expression. No buffer overflow. No memory corruption. Just a mathematical property of backtracking regex engines combined with a poorly written pattern. This is ReDoS: Regular Expression Denial of Service. It is easier to introduce than SQL injection, harder to spot in code review, and surprisingly common in production applications.

How Regex Backtracking Works

Most programming languages use backtracking NFA (non-deterministic finite automaton) engines for regex matching. When a match fails, the engine backs up and tries alternative paths. For most patterns on most inputs, this is fast. But certain pattern-input combinations trigger exponential backtracking — the number of paths the engine must explore grows exponentially with input length.

The classic vulnerable pattern structure involves:

A group that can match the same string in multiple ways (ambiguous quantifiers)
Something at the end that will fail to match

Pattern: (a+)+$
Input:   aaaaaaaaaaaaaaab

The engine tries to match (a+)+ against the as in multiple ways: one group of 16 as, two groups of 8, four groups of 4, and so on. With each combination failing at the trailing b, the number of attempts is 2^16 = 65,536. Add one more a and it doubles. This is catastrophic backtracking.

Real-World Vulnerable Patterns

Vulnerable patterns follow recognizable structures. The key indicator is nested quantifiers on overlapping character classes:

// Vulnerable: nested + with overlapping alternatives
/^(a+)+$/
/(a|aa)+$/
/([a-zA-Z]+)*$/

// Vulnerable: overlapping alternatives
/(a|a?)+$/
/^(([a-z])+.)+[A-Z]([a-z])+$/

// Real-world validator.js vulnerability (CVE-2021-3749)
// Email regex that was catastrophically slow on long invalid inputs
/^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}...)/

The danger in real code is that these patterns appear in input validation — exactly where untrusted user data arrives.

Timing-Based Detection

You can often detect ReDoS vulnerabilities by testing with crafted inputs and measuring execution time:

const { performance } = require('perf_hooks');

function testReDoS(regex, safeInput, evilInput) {
  const t1 = performance.now();
  regex.test(safeInput);
  const safeTime = performance.now() - t1;

  const t2 = performance.now();
  regex.test(evilInput);
  const evilTime = performance.now() - t2;

  console.log(`Safe input: ${safeTime.toFixed(2)}ms`);
  console.log(`Evil input: ${evilTime.toFixed(2)}ms`);
  console.log(`Ratio: ${(evilTime / safeTime).toFixed(0)}x slower`);
}

// Test a pattern
testReDoS(
  /^([a-zA-Z0-9])(([a-zA-Z0-9-]*)([a-zA-Z0-9]))?(\.[a-zA-Z]{2,6})+$/,
  'example.com',
  'aaaaaaaaaaaaaaaaaaaaaaaaaaaa!' // Evil input — no match + lots of 'a's
);

If the evil input is orders of magnitude slower than the safe input, you have a vulnerable pattern.

Static Analysis with safe_regex

safe-regex is an npm package that analyzes regex patterns for catastrophic backtracking potential without executing them:

npm install safe-regex

const safe = require('safe-regex');

console.log(safe(/^(a+)+$/));           // false — vulnerable
console.log(safe(/^[a-zA-Z0-9]+$/));    // true — safe
console.log(safe(/([a-z]+)+\d/));       // false — vulnerable

Integrate this into your test suite or linting pipeline to catch new vulnerable patterns:

// Jest test
describe('Regex safety', () => {
  const regexesToCheck = [
    /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/,
    /^(\+?)[0-9]{10,15}$/,
    /^[a-f0-9]{32}$/,
  ];

  regexesToCheck.forEach((regex) => {
    it(`${regex} should be safe`, () => {
      expect(safe(regex)).toBe(true);
    });
  });
});

ESLint Plugin for ReDoS Detection

eslint-plugin-security flags potentially vulnerable regex patterns as part of your normal lint workflow:

npm install --save-dev eslint-plugin-security

// .eslintrc.json
{
  "plugins": ["security"],
  "rules": {
    "security/detect-unsafe-regex": "error"
  }
}

This catches the most obvious vulnerable patterns during development, before they reach production.

The validator.js Risk

validator.js is one of the most popular input validation libraries, with over 10 million weekly downloads. It has historically contained ReDoS-vulnerable email, URL, and other validators. Several CVEs have been filed:

CVE-2021-3749 — ReDoS in isEmail() via crafted email-like strings
Multiple URL validation patterns with nested quantifiers

Always use the latest version of validator.js and test your validation functions with long, invalid-but-pattern-matching inputs:

const validator = require('validator');
const { performance } = require('perf_hooks');

// Generate an evil input for email validation
const evil = 'a'.repeat(100) + '@' + 'b'.repeat(100);

const t = performance.now();
validator.isEmail(evil);
console.log(`isEmail with evil input: ${(performance.now() - t).toFixed(0)}ms`);
// Should be <10ms. If it's seconds, you have a problem.

Mitigation 1: Input Length Limits

The simplest and most reliable mitigation is to limit input length before it reaches any regex:

app.post('/register', (req, res) => {
  const { email } = req.body;

  // Limit before regex validation
  if (!email || email.length > 254) {
    return res.status(400).json({ error: 'Invalid email' });
  }

  if (!validator.isEmail(email)) {
    return res.status(400).json({ error: 'Invalid email format' });
  }
});

ReDoS attacks require long inputs to be effective. A 254-character limit on email (the RFC 5321 maximum) makes catastrophic backtracking impractical — even vulnerable patterns cannot produce exponential behavior on short inputs.

Mitigation 2: Atomic Grouping and Possessive Quantifiers

Some regex engines (Java, PCRE) support atomic grouping (?>...) and possessive quantifiers (a++) which prevent backtracking into a matched group. Node.js's built-in regex engine does not support these, but the @nicolo-ribaudo/proposal-regexp-v-flag and the RE2 library do:

const RE2 = require('re2');

// RE2 uses a linear-time algorithm — no catastrophic backtracking possible
const safeRegex = new RE2(/^([a-zA-Z0-9])+$/);
safeRegex.test('a'.repeat(10000) + '!'); // Fast, regardless of input length

RE2 (Google's regex library) uses a finite automaton algorithm that guarantees O(n) matching time. The trade-off is that it does not support backreferences or lookaheads/lookbehinds. For most validation patterns, this is acceptable.

WAF Regex Hardening

WAFs using regex-based rules are themselves vulnerable to ReDoS. If an attacker can inject carefully crafted strings that match your WAF's inspection patterns slowly, they can degrade your WAF's processing capacity. When writing WAF rules:

Avoid nested quantifiers on the same character class
Test all WAF regex rules with long inputs before deployment
Use dedicated WAF rule testing tools (OWASP ModSecurity test harness)
Set per-request inspection time limits in your WAF configuration

ReDoS is underestimated because it looks benign — it is "just" slow, not obviously wrong. But a 30-second regex match on an event-loop-blocked Node.js server means no other requests are served during that time. For an API handling thousands of requests per second, one ReDoS payload causes cascading failures across the entire service.