Quick answer
A useful cheatsheet should not stop at "\d = digit". This one pairs each token with a tested example, calls out the differences between JavaScript, Python, and PCRE, and tells you when a feature is missing in your runtime. Open Regex Tester Pro alongside it to try every pattern live.
Most regex cheatsheets are reference tables that assume you already know what to do with them. This one is built around examples you can paste into a tester and see working in 10 seconds. Each section links the tokens to a concrete pattern.
// Anchors and character classes in 4 lines
/^\s*$/ // blank line
/[a-zA-Z][a-zA-Z0-9_]*/ // identifier-ish
/\b(?:GET|POST|PUT|DELETE)\b/ // HTTP verb
/(?<=#)[a-z0-9-]+/ // anchor tag, no leading #
Anchors and word boundaries
Anchors do not match characters; they match positions. ^ anchors to the start of the input (or the start of a line with the m flag). $ anchors to the end of the input (or the end of a line with the m flag). \b is the boundary between a word character and a non-word character, and is the single most underused token in everyday regex.
- /^Error/ matches lines that start with Error.
- /\.$/ matches lines that end with a period.
- /\bcat\b/ matches "cat" but not "category" or "catalog".
JavaScript and PCRE both treat \b as ASCII-only by default (in JavaScript even with the u flag; in PCRE unless UCP mode is enabled). Python's re uses Unicode word characters for str patterns, because re.UNICODE is the default in Python 3. If you mix language flavors, this is one of the most common silent bugs.
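A minimal JavaScript sketch of the anchor and boundary behavior described above. The sample strings and variable names are invented for illustration.

```javascript
// Sample input invented for this sketch.
const log = "Error: disk full\nWarning: low memory\nError: timeout.";

// Without the m flag, ^ only matches at the very start of the input.
const firstOnly = log.match(/^Error/g);
// With the m flag, ^ also matches right after each line break.
const everyLine = log.match(/^Error/gm);

// \b keeps "cat" from matching inside longer words.
const wholeWord = /\bcat\b/;
const hits = ["cat", "category", "catalog", "the cat sat"].filter(s => wholeWord.test(s));

// In JavaScript "é" is not a word character, so a boundary
// appears inside "café" right after the "f".
const asciiBoundary = /\bcaf\b/.test("café");

console.log(firstOnly.length, everyLine.length, hits, asciiBoundary);
```

The last check is the kind of result that tends to differ once the same pattern is moved to a Unicode-aware flavor.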
Character classes you actually use
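The shorthand classes are spelled the same across flavors, though which characters count as a digit or a word character depends on the same Unicode settings that affect \b:
- \d digit, \D non-digit
- \w word character, \W non-word
- \s whitespace, \S non-whitespace
- . any character (in JS and Python, except newline; with the s flag, includes newline)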
Custom classes use brackets, with negation via ^ as the first character: [a-fA-F0-9] matches a hex digit, [^"]+ matches one or more non-quote characters. Most metacharacters are literal inside a class; the ones that still need a backslash are ], \, ^ (when it comes first), and - (when it sits between two characters).
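A short JavaScript sketch of the classes mentioned above. The helper name isHex and the sample strings are assumptions made for the example.

```javascript
// Hypothetical helper: the whole string must consist of hex digits.
const isHex = s => /^[a-fA-F0-9]+$/.test(s);

// [^"]+ grabs everything up to the next double quote.
const quoted = 'name="regex cheatsheet" lang="en"'.match(/"[^"]+"/g);

// Inside a class, ] and \ need a backslash; a hyphen placed last is literal.
const bracketish = /[\]\\-]/;
const literalHits = ["]", "\\", "-", "a"].map(ch => bracketish.test(ch));

// Without the s flag, . stops at a newline; with it, . matches the newline too.
const noDotAll = /a.b/.test("a\nb");
const dotAll = /a.b/s.test("a\nb");

console.log(isHex("c0ffee"), isHex("coffee"), quoted, literalHits, noDotAll, dotAll);
```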
Quantifiers, greedy and lazy
Quantifiers attach to the previous atom: ? (zero or one), * (zero or more), + (one or more), {n,m} (between n and m). They are greedy by default. Adding ? makes them lazy.
- <.+> on <a>hi</a> matches the entire string. Greedy.
- <.+?> on the same input matches just <a>. Lazy.
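The greedy and lazy cases can be checked directly in JavaScript. This is a small sketch; the negated-class variant and the {n,m} sample are additions for comparison, not part of the original examples.

```javascript
const html = "<a>hi</a>";

// Greedy: .+ runs to the end, then backs off to the last ">".
const greedy = html.match(/<.+>/)[0];

// Lazy: .+? stops at the first ">" it can.
const lazy = html.match(/<.+?>/)[0];

// A negated class is a common alternative to a lazy quantifier.
const tags = html.match(/<[^>]+>/g);

// {n,m} bounds the repetition: here, whole numbers of 2 to 4 digits.
const bounded = "7 42 1999 20240".match(/\b\d{2,4}\b/g);

console.log(greedy, lazy, tags, bounded);
```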
Catastrophic backtracking happens when nested quantifiers let the engine split the same text in exponentially many ways. The classic offender is (a+)+$ against a long string of a characters ending in b. Always test long inputs against any pattern with nested quantifiers before shipping it.
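One common remedy is to flatten the nested quantifier. The sketch below checks that (a+)+$ and the flattened a+$ agree on a handful of inputs. The sample list is invented, and the inputs are kept deliberately short because the nested form slows down sharply as the failing input grows.

```javascript
// The nested form from the text and a flattened rewrite.
const nested = /(a+)+$/;
const flat = /a+$/;

// Short, made-up inputs; the last one is the failing shape that
// triggers heavy backtracking at larger sizes.
const samples = ["a", "aaaa", "aaaab", "", "b", "a".repeat(12) + "b"];

// Both patterns should accept and reject exactly the same inputs.
const agree = samples.every(s => nested.test(s) === flat.test(s));

console.log(agree);
```

The flattened pattern gives each position only one way to match, so there is nothing for the engine to retry exponentially.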
Groups, alternation, references
Parentheses do two things: they group, and they capture. (cat|dog)s? matches "cat", "cats", "dog", "dogs". The captured text is available as match[1] in JS or m.group(1) in Python, and in replacement strings as $1 in JS or \1 (or \g<1>) in Python.
Non-capturing groups, (?:...), group without saving the text. Use them for alternation or quantification when you do not need the value back. Named groups are written (?<name>...) in JS (since ES2018) and PCRE, and (?P<name>...) in Python's re, which has had them since forever; PCRE accepts the Python spelling too.
Backreferences inside the pattern, \1 or \k<name> (in Python's re, (?P=name)), match the same text the group captured. (\w)\1 matches doubled letters, as in letter, book, tomorrow.
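A JavaScript sketch of the group features above, using made-up sample strings. Python would need the (?P<name>...) spelling for the named parts.

```javascript
// Capturing group: m[1] holds whichever alternative matched.
const m = "two dogs".match(/(cat|dog)s?/);

// Non-capturing group: same grouping, but no numbered capture is stored.
const nc = "two dogs".match(/(?:cat|dog)s?/);

// Named groups (ES2018+) are exposed on match.groups.
const date = "2024-03".match(/(?<year>\d{4})-(?<month>\d{2})/);

// Numbered backreference: a word character followed by itself.
const doubled = ["letter", "book", "tomorrow", "regex"].filter(w => /(\w)\1/.test(w));

// Named backreference: the closing quote must equal the opening one.
const sameQuote = /(?<q>['"]).*?\k<q>/.test(`say "hi"`);
const mixedQuote = /^(?<q>['"]).*?\k<q>$/.test(`"hi'`);

console.log(m[0], m[1], nc.length, date.groups.year, doubled, sameQuote, mixedQuote);
```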
Lookaround: the assertion family
Lookarounds are zero-width assertions: they look at characters but do not consume them. The four forms:
- (?=...) positive lookahead: "what follows must match"
- (?!...) negative lookahead: "what follows must not match"
- (?<=...) positive lookbehind: "what precedes must match"
- (?<!...) negative lookbehind: "what precedes must not match"
Lookbehind is supported in modern JS engines (V8, SpiderMonkey, JavaScriptCore), in Python's re (fixed-width patterns only), and in the third-party regex module (variable width). PCRE supports it natively, though it has traditionally required each alternative to have a fixed length. Safari before 16.4 did not support it, which is why "lookbehind broke my site" was a bug class for years.
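The four forms, sketched in JavaScript with invented sample strings. The lookbehind cases assume an engine that supports them, such as a current Node.js release.

```javascript
// Positive lookahead: digits only when followed by "px".
const px = "12px 3em 40px".match(/\d+(?=px)/g);

// Negative lookahead: "foo" only when not followed by "bar".
const notBar = "foobar foobaz".replace(/foo(?!bar)/g, "FOO");

// Positive lookbehind: digits only when preceded by "$".
const prices = "$30 and 40 and $5".match(/(?<=\$)\d+/g);

// Negative lookbehind: whole numbers not preceded by "$".
const plain = "$30 and 40 and $5".match(/(?<!\$)\b\d+/g);

// The pattern from the quick answer: the fragment without its leading "#".
const fragment = "/docs#quick-answer".match(/(?<=#)[a-z0-9-]+/)[0];

console.log(px, notBar, prices, plain, fragment);
```

In every case the asserted text stays out of the match itself, which is what makes lookarounds useful for extraction and targeted replacement.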
Replacement patterns
Replacements are where regex pays its rent. The basics:
- Plain text: str.replace(/foo/, 'bar')
- With backreference: str.replace(/(\d{4})-(\d{2})/, '$2/$1')
- With named group: str.replace(/(?<y>\d{4})/, '$<y>')
- With function: str.replace(/\d+/g, n => +n * 2)
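The function form is the strongest tool in the kit. In JavaScript the callback receives the full match followed by each capture group and returns the replacement, so you can compute it procedurally. Python offers the same idea through re.sub with a function, which receives a match object, and PCRE-based hosts usually expose it as a callback API, for example preg_replace_callback in PHP.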
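The replacement forms above, run against a made-up sample string in JavaScript. This is a sketch; the variable names are illustrative only.

```javascript
// Sample text invented for this sketch.
const iso = "Released 2024-03, patched 2024-11";

// Numbered references: swap year and month.
const swapped = iso.replace(/(\d{4})-(\d{2})/g, "$2/$1");

// Named references use $<name> in the replacement string.
const named = iso.replace(/(?<y>\d{4})-(?<m>\d{2})/g, "$<m>/$<y>");

// Function form: the callback gets the full match first...
const doubledNums = "3 apples, 10 pears".replace(/\d+/g, n => String(Number(n) * 2));

// ...followed by each capture group, so the result can be computed.
const shortYear = iso.replace(
  /(\d{4})-(\d{2})/g,
  (match, year, month) => `${month}/${year.slice(2)}`
);

console.log(swapped, named, doubledNums, shortYear);
```

Returning a string explicitly from the callback avoids relying on the implicit conversion that happens when a number is returned.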