ToolSec

Regex Basics: A Beginner's Guide to Regular Expressions

8 min read · Updated June 27, 2026

Regular expressions — "regex" — are a compact language for matching patterns in text. They look intimidating, but they're built from a small set of pieces. Learn those, and you can read and write most patterns you'll meet. This guide walks through the essentials with examples you can try as you go.

What is a regex?

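A regex is a pattern that describes a set of strings. You use it to search for matches, validate input, or replace text. The same core syntax works across JavaScript, Python, grep and most editors, but dialects differ in the details: plain grep, for example, does not understand the \d shorthand unless you switch to a Perl-compatible mode (grep -P in GNU grep).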
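
The three uses named above (search, validate, replace) can be sketched with Python's standard re module. The sample text and variable names here are made up for illustration; the \d+ pattern it uses (one or more digits) is explained in the sections that follow.

```python
import re

text = "Order 66 shipped, order 7 pending"  # illustrative sample text

# Search: find the first run of digits anywhere in the text.
first = re.search(r"\d+", text)
first_number = first.group() if first else None

# Validate: fullmatch succeeds only if the whole string fits the pattern.
is_number = re.fullmatch(r"\d+", "12345") is not None

# Replace: swap every run of digits for a placeholder.
masked = re.sub(r"\d+", "#", text)

print(first_number)
print(is_number)
print(masked)
```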

Literal characters and the dot

Most characters match themselves: the pattern cat matches the text "cat". The special character . (dot) matches any single character except, by default, a newline, so c.t matches "cat", "cot" and "c9t". To match a literal dot, escape it with a backslash: \. matches only ".".
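
A small Python sketch of the dot and the escaped dot, using made-up sample strings. It assumes Python's default behavior, where the dot does not match a newline.

```python
import re

dot = re.compile(r"c.t")

# The dot stands for exactly one character, so "ct" (nothing in between)
# and "c\nt" (a newline in between) are both rejected.
candidates = ["cat", "cot", "c9t", "ct", "c\nt"]
dot_hits = [w for w in candidates if dot.fullmatch(w)]

# An escaped dot matches only a literal ".", so "1x5" is not found.
literal_dot_hits = re.findall(r"\d\.\d", "v1.5 and 1x5")

print(dot_hits)
print(literal_dot_hits)
```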

Character classes

Square brackets define a set of allowed characters. [aeiou] matches any one vowel; [a-z] matches any lowercase letter; [0-9] any digit. A caret as the first character inside the brackets inverts the set: [^0-9] matches any single character that is not a digit.

There are handy shorthands too:

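  • \d — a digit; same as [0-9] in JavaScript, while some engines (Python 3 on text strings, for example) also accept other Unicode digits.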
  • \w — a "word" character: letters, digits or underscore.
  • \s — any whitespace (space, tab, newline).
  • Their uppercase versions (\D, \W, \S) mean the opposite.
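
To see the classes and shorthands side by side, here is a minimal Python sketch. The sample string is invented for illustration and sticks to ASCII, so \d and [0-9] behave the same way on it.

```python
import re

sample = "Build 42, tag_v3!"  # illustrative sample text

vowels = re.findall(r"[aeiou]", sample)       # one lowercase vowel at a time
digits = re.findall(r"\d", sample)            # one digit at a time
digits_only = re.sub(r"[^0-9]", "", sample)   # drop everything that is not a digit
words = re.findall(r"\w+", sample)            # runs of letters, digits, underscore
chunks = re.findall(r"\S+", sample)           # runs of non-whitespace

print(vowels)
print(digits)
print(digits_only)
print(words)
print(chunks)
```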

Quantifiers: how many

Quantifiers say how many times the previous item may repeat:

  • * — zero or more.
  • + — one or more.
  • ? — zero or one (optional).
  • {3} — exactly three; {2,4} — between two and four (inclusive); {2,} — two or more.

So \d+ matches one or more digits, and colou?r matches both "color" and "colour".
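
The quantifiers above can be compared directly in Python. This is a sketch with made-up test words; re.fullmatch is used so that each word has to fit the pattern from start to end.

```python
import re

def matching(pattern, words):
    """Return the words that fit the pattern from start to end."""
    return [w for w in words if re.fullmatch(pattern, w)]

spellings = matching(r"colou?r", ["color", "colour", "colouur"])

g_words = ["ggle", "gogle", "google", "gooogle"]
star_hits = matching(r"go*gle", g_words)   # zero or more "o"
plus_hits = matching(r"go+gle", g_words)   # at least one "o"

numbers = ["1", "12", "123", "1234", "12345"]
exactly_three = matching(r"\d{3}", numbers)
two_to_four = matching(r"\d{2,4}", numbers)

digit_runs = re.findall(r"\d+", "a1b22c333")

print(spellings, star_hits, plus_hits)
print(exactly_three, two_to_four, digit_runs)
```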

Anchors: where it matches

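Anchors match a position rather than a character. ^ matches the start of the string and $ matches the end (with the multiline flag, the start and end of each line). Use them when you mean to match a whole value: ^\d{5}$ matches a string that is exactly five digits, such as a basic five-digit US ZIP code, and nothing else. One caveat: some engines, Python included, also let $ match just before a final newline.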
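
The difference anchors make shows up when the input is longer than the pattern. The following is a Python sketch with invented inputs; it also shows one Python-specific detail, namely that $ will match just before a trailing newline, which re.fullmatch avoids.

```python
import re

# Without anchors, five digits anywhere inside the text are enough.
unanchored = re.search(r"\d{5}", "order 1234567") is not None

# With anchors, the whole string must be exactly five digits.
too_long = re.search(r"^\d{5}$", "1234567") is not None
exact = re.search(r"^\d{5}$", "90210") is not None

# Python detail: $ also matches just before a final newline.
with_newline = re.search(r"^\d{5}$", "90210\n") is not None

# fullmatch has to consume the entire string, newline included.
strict = re.fullmatch(r"\d{5}", "90210\n") is not None

print(unanchored, too_long, exact, with_newline, strict)
```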

Groups and alternation

Parentheses create a group, which you can quantify or capture. (ab)+ matches "ab", "abab", and so on. Groups also let you pull out parts of a match: (\w+)@(\w+) captures the run of word characters on each side of an @. The pipe | means "or": cat|dog matches either word.
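
Here is a short Python sketch of grouping, capturing and alternation. The sample strings are illustrative. Note that the second captured group stops at the first dot, because \w does not include ".".

```python
import re

# A quantified group repeats the whole "ab" unit.
repeated = re.fullmatch(r"(ab)+", "ababab") is not None

# Capture groups pull out the pieces on each side of the @.
m = re.search(r"(\w+)@(\w+)", "contact: alice@example.com")
user, host = (m.group(1), m.group(2)) if m else (None, None)

# Alternation matches either word, even inside a longer word.
animals = re.findall(r"cat|dog", "cat, dog, catalog")

print(repeated)
print(user, host)
print(animals)
```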

Flags

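Flags change how the whole pattern behaves. The letters below are the JavaScript spellings; other engines expose the same options under different names (Python uses re.IGNORECASE, re.MULTILINE and re.DOTALL, and finds all matches with re.findall rather than a g flag):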

  • g — global: find all matches, not just the first.
  • i — case-insensitive.
  • m — multiline: ^ and $ match line boundaries.
  • s — dotall: . also matches newlines.
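
The same four behaviors can be sketched in Python, which spells the flags differently: there is no g flag (re.findall returns every match), and i, m and s correspond to re.IGNORECASE, re.MULTILINE and re.DOTALL. The log text is made up for illustration.

```python
import re

log = "Error: disk\nerror: net\nok"  # illustrative three-line text

# "g" equivalent: findall returns every match, not just the first.
# re.IGNORECASE plays the role of "i".
all_errors = re.findall(r"error", log, flags=re.IGNORECASE)

# Without MULTILINE, ^ matches only at the very start of the string.
first_word_only = re.findall(r"^\w+", log)

# With MULTILINE ("m"), ^ matches at the start of every line.
first_words = re.findall(r"^\w+", log, flags=re.MULTILINE)

# By default the dot stops at a newline; DOTALL ("s") lets it cross one.
dot_default = re.search(r"disk.error", log) is not None
dot_all = re.search(r"disk.error", log, flags=re.DOTALL) is not None

print(all_errors, first_word_only, first_words)
print(dot_default, dot_all)
```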

A worked example: matching an email-ish string

Putting it together, ^[\w.]+@[\w.]+\.\w{2,}$ reads as: start, one or more word or dot characters, an @, more word or dot characters, a literal dot, at least two word characters (letters, digits or underscore), then end. It's not a full email validator (those are famously complex): it rejects valid addresses that contain + or -, and it accepts some strings that are not real addresses. Still, it works as a rough first-pass check, and it shows how the pieces combine.
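
Trying the pattern on a few inputs makes its limits concrete. This is a Python sketch: the helper name looks_like_email and the sample addresses are hypothetical, chosen to show both what the pattern accepts and what it wrongly rejects or accepts.

```python
import re

# The pattern from the text, unchanged.
EMAILISH = re.compile(r"^[\w.]+@[\w.]+\.\w{2,}$")

def looks_like_email(value):
    """Rough first-pass check only; not a real email validator."""
    return EMAILISH.fullmatch(value) is not None

samples = [
    "alice@example.com",       # accepted
    "a.b@mail.example.org",    # accepted
    "alice@example",           # rejected: no dot after the @
    "alice@example.c",         # rejected: last part is shorter than two
    "first-last@example.com",  # rejected, although hyphens are valid
    "user+tag@example.com",    # rejected, although plus signs are valid
    "..@...aa",                # accepted, although it is not a real address
]
results = {s: looks_like_email(s) for s in samples}

for sample, ok in results.items():
    print(sample, ok)
```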

Tips and pitfalls

  • Anchor with ^ and $ when validating a whole string.
  • Prefer specific classes ([a-z]) over a broad . for clarity.
  • Watch for "catastrophic backtracking" with nested quantifiers like (a+)+ on long input.
  • Escape special characters (. * + ? ^ $ | ( ) [ ] { } \) when you mean them literally.
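
Two of these tips can be sketched in Python. First, re.escape escapes special characters for you when the text to find comes from elsewhere. Second, a nested quantifier such as (a+)+ can often be flattened to a+, which accepts the same strings without the nested repetition. The inputs are illustrative and deliberately short, so the nested pattern stays fast here.

```python
import re

# Escaping: treat text that contains special characters as a literal.
needle = "1+1=2 (approx.)"
haystack = "We know 1+1=2 (approx.) holds"

raw_found = re.search(needle, haystack) is not None             # + ( ) . act as operators
escaped_found = re.search(re.escape(needle), haystack) is not None

# Nested vs. flat quantifier: same answers on these short inputs,
# but only the nested form can blow up on long non-matching input.
nested = re.compile(r"(a+)+")
flat = re.compile(r"a+")
inputs = ["a", "aaaa", "aab", ""]
same_answers = all(
    (nested.fullmatch(s) is None) == (flat.fullmatch(s) is None)
    for s in inputs
)

print(raw_found, escaped_found, same_answers)
```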

Practice as you learn

The fastest way to learn regex is to experiment. Build patterns up one piece at a time in our regex tester — it shows every match and capture group instantly. To extract structured data afterward, the JSON formatter and JSON ↔ CSV converter are useful companions.

Frequently asked questions

What does \d mean in regex?

\d matches a single digit, equivalent to [0-9] in JavaScript (some engines, such as Python 3 on text strings, also accept other Unicode digits). Add a quantifier like \d+ to match one or more digits in a row.

What's the difference between * and + in regex?

* matches zero or more of the preceding item, so it can match nothing. + requires at least one. Use ? for zero or one (optional).

How do I match an exact whole string?

Anchor the pattern with ^ at the start and $ at the end. For example ^\d{5}$ matches a string that is exactly five digits and nothing more.
