Regex Basics: A Beginner's Guide to Regular Expressions
By Justin Le · 8 min read · Updated June 27, 2026
Regular expressions ("regex") are a compact language for matching patterns in text. They look intimidating, but they're built from a small set of pieces. Learn those, and you can read and write most patterns you'll meet. This guide walks through the essentials with examples you can try as you go.
What is a regex?
A regex is a pattern that describes a set of strings. You use it to search for matches, validate input, or replace text. The core syntax is shared across JavaScript, Python, grep and most editors, though dialects differ in the details (GNU grep, for example, needs -E or -P for some of the syntax shown here).
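As a quick illustration of those three uses, here is a minimal sketch using Python's built-in re module. Python is just one convenient choice, and the sample text and variable names are made up for the example. It uses \d and a couple of quantifiers, which the following sections explain.

```python
import re

text = "Order 66 shipped on 2026-06-27"

# Search: find the first run of digits anywhere in the text.
first_number = re.search(r"\d+", text).group()

# Validate: check that an entire string has a date-like shape.
is_date = re.fullmatch(r"\d{4}-\d{2}-\d{2}", "2026-06-27") is not None

# Replace: mask every digit with "#".
masked = re.sub(r"\d", "#", text)

print(first_number)  # 66
print(is_date)       # True
print(masked)        # Order ## shipped on ####-##-##
```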
Literal characters and the dot
Most characters match themselves: the pattern cat matches the text "cat".
The special character . (dot) matches any single character (except, by
default, a newline), so c.t matches "cat", "cot" and "c9t". To match a
literal dot, escape it with a backslash and write \. instead.
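To see the dot in action, here is a small sketch in Python (the word list and the version string are invented for the example). re.fullmatch requires the whole string to fit the pattern, which makes it easy to see what . does and does not cover.

```python
import re

words = ["cat", "cot", "c9t", "ct", "coat"]

# "." stands for exactly one character, so "ct" and "coat" do not fit c.t
dot_matches = [w for w in words if re.fullmatch(r"c.t", w)]
print(dot_matches)  # ['cat', 'cot', 'c9t']

# An unescaped dot accepts any character in that position...
loose = re.fullmatch(r"v1.0", "v1x0") is not None    # True
# ...while an escaped dot accepts only a literal "."
strict = re.fullmatch(r"v1\.0", "v1x0") is not None  # False
exact = re.fullmatch(r"v1\.0", "v1.0") is not None   # True
```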
Character classes
Square brackets define a set of allowed characters. [aeiou] matches any one
vowel; [a-z] matches any lowercase letter; [0-9] any digit. A caret as the
first character inside the brackets inverts the set: [^0-9] matches any
character that is not a digit.
There are handy shorthands too:
- \d matches a digit, same as [0-9].
- \w matches a "word" character: letters, digits or underscore.
- \s matches any whitespace (space, tab, newline).
- Their uppercase versions (\D, \W, \S) mean the opposite.
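The classes and shorthands above can be tried out with a short Python sketch (the sample string is made up). One dialect note: in Python 3, \d and \w on text strings also match non-ASCII digits and letters unless you pass re.ASCII, so they are slightly broader than [0-9] and [A-Za-z0-9_] there.

```python
import re

sample = "Unit 7B, bay_12"

vowels = re.findall(r"[aeiou]", sample)       # lowercase vowels only
digits = re.findall(r"\d", sample)            # each digit on its own
non_word = re.findall(r"\W", sample)          # anything that is not \w
digits_only = re.sub(r"[^0-9]", "", sample)   # delete every non-digit

print(vowels)       # ['i', 'a']
print(digits)       # ['7', '1', '2']
print(non_word)     # [' ', ',', ' ']
print(digits_only)  # 712
```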
Quantifiers: how many
Quantifiers say how many times the previous item may repeat:
- * means zero or more.
- + means one or more.
- ? means zero or one (optional).
- {3} means exactly three; {2,4} means between two and four.
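Each quantifier in the list can be checked with a few lines of Python. This is a sketch with invented sample strings; re.fullmatch is used where the point is whether a whole string fits.

```python
import re

# * allows zero repeats, + needs at least one.
star_ok = re.fullmatch(r"ab*", "a") is not None   # True: zero "b"s is fine
plus_ok = re.fullmatch(r"ab+", "a") is not None   # False: needs one "b"

# ? makes the previous item optional.
spellings = re.findall(r"colou?r", "color, colour, colouur")
print(spellings)  # ['color', 'colour']

# + grabs whole runs of digits.
numbers = re.findall(r"\d+", "a1 b22 c333")
print(numbers)  # ['1', '22', '333']

# {2,4} accepts two, three or four repeats.
candidates = ["7", "42", "123", "12345"]
two_to_four = [c for c in candidates if re.fullmatch(r"\d{2,4}", c)]
print(two_to_four)  # ['42', '123']
```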
So \d+ matches one or more digits, and colou?r matches both
"color" and "colour".
Anchors: where it matches
Anchors match a position rather than a character. ^ matches the start of
the string (or line) and $ matches the end. Use them when you mean to match
a whole value: ^\d{5}$ matches a string that is exactly five digits —
perfect for a US ZIP code — and nothing else.
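Here is a minimal Python sketch of the difference anchors make (the helper name is_zip and the sample values are hypothetical). It also shows one Python-specific wrinkle: $ matches just before a trailing newline too, so re.fullmatch is the stricter choice when that matters.

```python
import re

ZIP = r"^\d{5}$"

def is_zip(value):
    """Return True if value fits the anchored five-digit pattern."""
    return re.search(ZIP, value) is not None

# Without anchors, the pattern happily matches inside a longer string.
unanchored_hit = re.search(r"\d{5}", "ID 1234567").group()
print(unanchored_hit)        # 12345

print(is_zip("90210"))       # True
print(is_zip("ID 1234567"))  # False
print(is_zip("902101"))      # False

# Python quirk: "$" also matches right before a final newline.
print(is_zip("90210\n"))     # True
strict_rejects_newline = re.fullmatch(r"\d{5}", "90210\n") is None
print(strict_rejects_newline)  # True
```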
Groups and alternation
Parentheses create a group, which you can quantify or capture.
(ab)+ matches "ab", "abab", and so on. Groups also let you pull out parts of
a match — (\w+)@(\w+) captures the parts before and after an
@. The pipe | means "or": cat|dog matches either
word.
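A short Python sketch of grouping, capturing and alternation, using made-up input strings. In Python, group(1) and group(2) return what the first and second pair of parentheses captured.

```python
import re

# A quantified group repeats the whole "ab" unit.
repeated = re.fullmatch(r"(ab)+", "ababab") is not None
print(repeated)  # True

# Capture groups pull out parts of the match.
m = re.search(r"(\w+)@(\w+)", "contact: alice@example.com")
user, host = m.group(1), m.group(2)
print(user, host)  # alice example

# Alternation matches either side of the pipe.
pets = re.findall(r"cat|dog", "hotdog and catnip")
print(pets)  # ['dog', 'cat']
```

Note that the second group stops at "example": \w does not match the dot, so ".com" is left out of the capture.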
Flags
Flags change how the whole pattern behaves:
- g (global): find all matches, not just the first.
- i: case-insensitive.
- m (multiline): ^ and $ match line boundaries.
- s (dotall): . also matches newlines.
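The letter flags above are written the JavaScript way. As a sketch of the same ideas in Python: there is no g flag, so you pick re.search (first match) or re.findall (all matches), and the other three are spelled re.IGNORECASE, re.MULTILINE and re.DOTALL. The log text is invented for the example.

```python
import re

log = "ERROR disk full\nwarning low memory\nerror timeout"

# "g" equivalent: search() stops at the first hit, findall() returns all.
first_only = re.search(r"error", log, re.IGNORECASE).group()
all_hits = re.findall(r"error", log, re.IGNORECASE)
print(first_only)  # ERROR
print(all_hits)    # ['ERROR', 'error']

# "m": without it ^ matches only at the very start of the string.
starts_default = re.findall(r"^\w+", log)
starts_multiline = re.findall(r"^\w+", log, re.MULTILINE)
print(starts_default)    # ['ERROR']
print(starts_multiline)  # ['ERROR', 'warning', 'error']

# "s": by default the dot will not cross a newline.
dot_default = re.search(r"full.warning", log) is not None
dot_all = re.search(r"full.warning", log, re.DOTALL) is not None
print(dot_default, dot_all)  # False True
```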
A worked example: matching an email-ish string
Putting it together, ^[\w.]+@[\w.]+\.\w{2,}$ reads as: start, one or
more word/dot characters, an @, more word/dot characters, a literal dot, and
at least two word characters, then end. It's not a full email validator (those are
famously complex, and this one rejects valid addresses that contain + or -), but it's
a reasonable first-pass check, and it shows how the pieces combine.
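To get a feel for what this pattern accepts and rejects, here is a small Python sketch. The constant name EMAIL_ISH and the sample addresses are made up for illustration; the pattern itself is the one from the text.

```python
import re

EMAIL_ISH = re.compile(r"^[\w.]+@[\w.]+\.\w{2,}$")

samples = [
    "jane.doe@example.com",   # ordinary address
    "jane@example",           # no dot after the @
    "jane doe@example.com",   # space is not in [\w.]
    "a@b.c",                  # final part shorter than two characters
    "jane+news@example.com",  # valid in practice, but "+" is not in [\w.]
    "..@...com",              # nonsense, yet every piece of the pattern fits
]

results = {s: EMAIL_ISH.search(s) is not None for s in samples}

for address, ok in results.items():
    print(f"{address!r}: {ok}")
```

Only the first and last samples match. The last two cases show both kinds of error a simple pattern can make: rejecting a real address and accepting a meaningless one.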
Tips and pitfalls
- Anchor with ^ and $ when validating a whole string.
- Prefer specific classes ([a-z]) over a broad . for clarity.
- Watch for "catastrophic backtracking" with nested quantifiers like (a+)+ on long input.
- Escape special characters (. * + ? ( ) [ ] \) when you mean them literally.
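Two of these tips can be sketched in Python. For escaping, the standard library's re.escape adds the backslashes for you. For nested quantifiers, the sketch only checks on short strings that (a+)+ and a+ accept the same input, so the simpler form is a safe replacement; it deliberately avoids running the slow case. The sample strings are invented.

```python
import re

literal = "1+1=2"

# Unescaped, "+" is a quantifier: the pattern means "one or more 1s, then 1=2",
# so it does not even match its own text.
raw_matches_itself = re.fullmatch(literal, literal) is not None      # False
raw_matches_other = re.fullmatch(literal, "11=2") is not None        # True

# re.escape() backslash-escapes the special characters.
escaped = re.escape(literal)
escaped_matches_itself = re.fullmatch(escaped, literal) is not None  # True
escaped_matches_other = re.fullmatch(escaped, "11=2") is not None    # False

# (a+)+ and a+ accept the same strings; the nested form just gives the
# engine many more ways to split a run of "a"s when a match fails.
same_result = all(
    bool(re.fullmatch(r"(a+)+", s)) == bool(re.fullmatch(r"a+", s))
    for s in ["", "a", "aaaa", "aab"]
)
print(same_result)  # True
```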
Practice as you learn
The fastest way to learn regex is to experiment. Build patterns up one piece at a time in our regex tester — it shows every match and capture group instantly. To extract structured data afterward, the JSON formatter and JSON ↔ CSV converter are useful companions.
Frequently asked questions
What does \d mean in regex?
\d matches a single digit, equivalent to [0-9]. Add a quantifier like \d+ to match one or more digits in a row.
What's the difference between * and + in regex?
* matches zero or more of the preceding item, so it can match nothing. + requires at least one. Use ? for zero or one (optional).
How do I match an exact whole string?
Anchor the pattern with ^ at the start and $ at the end. For example ^\d{5}$ matches a string that is exactly five digits and nothing more.