Regex Tutorial for Beginners: Patterns, Flags, and Examples
Regular expressions (regex) give you a concise, powerful way to search, match, and manipulate text using patterns. Every major programming language supports them, and the core syntax is nearly identical across JavaScript, Python, Java, Go, and PHP. This tutorial walks you through regex fundamentals so you can start writing your own patterns with confidence.
What Is Regex?
A regular expression is a sequence of characters that defines a search pattern. You use it to validate input, extract substrings, or perform find-and-replace operations across text. If you have ever needed to check whether an email address is valid, pull phone numbers out of a document, or clean up formatting in a large file, regex is the right tool.
At first glance, a pattern like ^[\w.-]+@[\w.-]+\.\w{2,}$ looks intimidating. By the end of this guide, you will be able to read and write patterns like this without hesitation.
Basic Patterns
The simplest regex is a literal string. The pattern hello matches the exact text "hello" inside any larger string. Most regex engines return the first occurrence by default.
Special characters, called metacharacters, give regex its power. The most important ones are:
- . matches any single character except a newline
- | acts as an OR operator: cat|dog matches "cat" or "dog"
- \ escapes a metacharacter so it is treated literally: \. matches an actual period
If you want to match the literal text 1+1=2, you need to escape the plus sign: 1\+1=2.
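The metacharacters above can be tried out with a few one-line checks. This is a minimal JavaScript sketch; the variable names and sample strings are illustrative only.

```javascript
// A literal pattern matches the exact text anywhere in the string.
const literalMatch = "say hello world".match(/hello/);
// literalMatch[0] is "hello", found at index 4

// The dot matches any single character except a newline.
const dotMatches = /c.t/.test("cut"); // true

// Alternation: either "cat" or "dog".
const petFound = /cat|dog/.test("hot dog"); // true

// Escaped plus sign: \+ matches a literal "+".
const escapedPlus = /1\+1=2/.test("1+1=2"); // true

// Unescaped, + means "one or more of the previous character",
// so this pattern expects input like "11=2" and fails here.
const unescapedPlus = /1+1=2/.test("1+1=2"); // false
```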
Character Classes
A character class matches one character from a defined set. You write them inside square brackets:
- [abc] matches "a", "b", or "c"
- [a-z] matches any lowercase letter from a to z
- [0-9] matches any digit
- [A-Za-z0-9] matches any alphanumeric character
- [^0-9] matches any character that is NOT a digit (the ^ inside brackets negates the class)
Regex also provides shorthand character classes for the most common sets:
| Shorthand | Meaning |
|---|---|
| \d | Any digit (same as [0-9]) |
| \D | Any non-digit |
| \w | Any word character (same as [A-Za-z0-9_]) |
| \W | Any non-word character |
| \s | Any whitespace (space, tab, newline) |
| \S | Any non-whitespace character |
For a complete reference of every character class, quantifier, and flag in one place, check out the Regex Cheat Sheet.
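Here is a short JavaScript sketch of bracket classes and shorthand classes in use. The sample strings are made up for illustration, and the trailing g on some patterns is the global flag, which returns every match instead of only the first.

```javascript
// Bracket class: every vowel in the word.
const vowels = "regex".match(/[aeiou]/g); // ["e", "e"]

// Negated class: strip everything that is not a digit.
const digitsOnly = "a1b2".replace(/[^0-9]/g, ""); // "12"

// \d finds each digit individually.
const eachDigit = "Order 66, aisle 7".match(/\d/g); // ["6", "6", "7"]

// \w includes the underscore but not the hyphen.
const wordRuns = "snake_case and kebab-case".match(/\w+/g);
// ["snake_case", "and", "kebab", "case"]

// \s covers spaces and tabs, so this splits on any run of whitespace.
const fields = "a  b\tc".split(/\s+/); // ["a", "b", "c"]
```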
Quantifiers
Quantifiers control how many times a character or group is allowed to repeat:
| Quantifier | Meaning |
|---|---|
| * | Zero or more times |
| + | One or more times |
| ? | Zero or one time (optional) |
| {3} | Exactly 3 times |
| {2,5} | Between 2 and 5 times |
| {2,} | 2 or more times |
For example, \d{3}-\d{4} matches a pattern like "555-1234", which is exactly three digits, a hyphen, then exactly four digits.
By default, quantifiers are greedy, meaning they match as much text as possible. Adding a ? after the quantifier makes it lazy (match as little as possible): .*? instead of .*.
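The quantifiers and the greedy/lazy distinction can be sketched in JavaScript as follows. The sample inputs, including the small HTML-like string, are illustrative assumptions rather than part of any real dataset.

```javascript
// * allows zero occurrences, + requires at least one.
const starAllowsZero = /ab*c/.test("ac"); // true
const plusNeedsOne = /ab+c/.test("ac");   // false

// ? makes the preceding character optional.
const bothSpellings = ["color", "colour"].every((w) => /colou?r/.test(w)); // true

// Exact counts: three digits, a hyphen, four digits.
const shortPhone = /\d{3}-\d{4}/.test("555-1234"); // true

// {2,} keeps only runs of two or more digits.
const longRuns = "1 22 333".match(/\d{2,}/g); // ["22", "333"]

// Greedy .* runs to the last ">", lazy .*? stops at the first one.
const greedyTag = "<b>bold</b>".match(/<.*>/)[0]; // "<b>bold</b>"
const lazyTag = "<b>bold</b>".match(/<.*?>/)[0];  // "<b>"
```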
Anchors
Anchors do not match a character. They match a position in the string:
- ^ matches the start of the string (or the start of a line in multiline mode)
- $ matches the end of the string (or end of a line in multiline mode)
- \b matches a word boundary (the position between a word character and a non-word character)
The pattern ^\d+$ matches a string that contains only digits from start to finish. Without the anchors, \d+ would match the digits inside "abc123def". The anchors ensure the entire string must be digits.
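To see the effect of anchors, compare the same digit pattern with and without them. This JavaScript sketch uses illustrative variable names and inputs.

```javascript
const onlyDigits = /^\d+$/;

// Anchored: the whole string must be digits.
const allDigits = onlyDigits.test("12345");       // true
const mixedInput = onlyDigits.test("abc123def");  // false

// Unanchored: a partial match is enough.
const partialHit = /\d+/.test("abc123def");         // true
const partialText = "abc123def".match(/\d+/)[0];    // "123"

// \b matches "cat" only as a whole word.
const wholeWord = /\bcat\b/.test("a cat sat");      // true
const insideWord = /\bcat\b/.test("concatenate");   // false
```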
Flags
Flags modify how the regex engine interprets your pattern. In JavaScript, they are placed after the closing delimiter (e.g., /pattern/gi). See the MDN RegExp documentation for the full list of supported flags.
- g (global) finds all matches, not just the first one
- i (case-insensitive) treats uppercase and lowercase letters as equivalent: /hello/i matches "Hello", "HELLO", and "hello"
- m (multiline) makes ^ and $ match the start and end of each line, not just the start and end of the entire string
You can combine flags: /^error/gim finds every line that starts with "error" regardless of case.
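The sketch below runs the same search with different flag combinations so you can compare the results. The three-line log text is a made-up example.

```javascript
const logText = "Error: disk full\nwarning: no error yet\nERROR: timeout";

// i alone: case-insensitive, but only the first match is returned.
const firstHit = logText.match(/error/i)[0]; // "Error"

// g + i: every occurrence, in any case.
const allHits = logText.match(/error/gi); // ["Error", "error", "ERROR"]

// g + i + m: ^ now matches at the start of each line.
const lineStarts = logText.match(/^error/gim); // ["Error", "ERROR"]

// Without m, ^ only matches at the very start of the string.
const stringStart = logText.match(/^error/gi); // ["Error"]
```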
Capture Groups
Parentheses () create capture groups. They serve two purposes: grouping parts of a pattern together and extracting matched text.
The pattern (\d{3})-(\d{4}) matches "555-1234" and captures "555" in group 1 and "1234" in group 2. In most languages, you can reference these groups by index to extract the values.
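In JavaScript, one way to read groups by index is through the array returned by String.prototype.match. The surrounding sentence and the variable names in this sketch are illustrative.

```javascript
const phoneMatch = "Call 555-1234 today".match(/(\d{3})-(\d{4})/);

const fullMatch = phoneMatch[0];   // "555-1234" (index 0 is the whole match)
const firstGroup = phoneMatch[1];  // "555"
const secondGroup = phoneMatch[2]; // "1234"

// In a replacement string, $1 and $2 refer to the captured groups.
const swapped = "555-1234".replace(/(\d{3})-(\d{4})/, "$2-$1"); // "1234-555"
```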
Non-Capturing Groups
Non-capturing groups use the syntax (?:...). They group without capturing, which is useful when you need grouping for alternation or quantifiers but do not need to extract the match:
(?:cat|dog)s
This matches "cats" or "dogs" without storing "cat" or "dog" as a captured group.
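A quick way to see the difference is to compare the match arrays that JavaScript returns for the capturing and non-capturing versions of the same pattern. The variable names here are illustrative.

```javascript
// Capturing group: the result holds the full match plus group 1.
const withCapture = "cats".match(/(cat|dog)s/);
// withCapture[0] is "cats", withCapture[1] is "cat", length is 2

// Non-capturing group: same match, but nothing is stored.
const withoutCapture = "cats".match(/(?:cat|dog)s/);
// withoutCapture[0] is "cats", length is 1

const dogsMatch = /(?:cat|dog)s/.test("dogs"); // true
```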
Named Groups
Named groups use (?<name>...) to give a capture group a descriptive label instead of a numeric index. This is the spelling used by JavaScript and Java; Python writes it as (?P<name>...).
(?<area>\d{3})-(?<number>\d{4})
This makes your code more readable when extracting values, since you can reference matches by name rather than position.
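In JavaScript, named groups are exposed on the groups property of the match result. This is a minimal sketch; it needs an engine with ES2018 named-group support, and the variable names are illustrative.

```javascript
const namedMatch = "555-1234".match(/(?<area>\d{3})-(?<number>\d{4})/);

// Read captured values by name instead of by position.
const areaPart = namedMatch.groups.area;     // "555"
const numberPart = namedMatch.groups.number; // "1234"

// Numeric indexes still work alongside the names.
const sameAsArea = namedMatch[1]; // "555"
```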
Practical Examples
Here are regex patterns for common real-world validation tasks. You can paste any of these directly into the NebulaTool Regex Tester to see them in action.
Email Address (Basic)
^[\w.-]+@[\w.-]+\.\w{2,}$
This matches strings like user@example.com or first.last@company.co.uk. It checks for one or more word characters, dots, or hyphens before the @, then a domain name, then a dot and a top-level domain of at least two characters.
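The following JavaScript sketch runs the basic email pattern against a few sample addresses. The addresses are illustrative, and the last two show where a "basic" pattern draws the line: it is a format check, not proof that an address exists.

```javascript
const emailPattern = /^[\w.-]+@[\w.-]+\.\w{2,}$/;

const simpleAddress = emailPattern.test("user@example.com");         // true
const dottedAddress = emailPattern.test("first.last@company.co.uk"); // true

// No @ sign at all.
const missingAt = emailPattern.test("no-at-sign.com"); // false

// No dot followed by a top-level domain after the @.
const noTld = emailPattern.test("user@localhost"); // false

// The + character is not in [\w.-], so plus-tagged addresses are rejected.
const plusTagged = emailPattern.test("user+tag@example.com"); // false
```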
Phone Number (US)
^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$
Matches formats like (555) 123-4567, 555-123-4567, and 555.123.4567. The \(? and \)? make parentheses optional, and [-.\s]? allows a hyphen, period, or space as the separator.
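Here is a small JavaScript sketch that checks the phone pattern against the formats listed above plus a few edge cases. The sample numbers are made up.

```javascript
const usPhone = /^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$/;

const withParens = usPhone.test("(555) 123-4567"); // true
const withHyphens = usPhone.test("555-123-4567");  // true
const withDots = usPhone.test("555.123.4567");     // true

// Every separator is optional, so ten bare digits also pass.
const bareDigits = usPhone.test("5551234567"); // true

// Seven digits are not enough.
const tooShort = usPhone.test("555-1234"); // false

// Each parenthesis is optional on its own, so an unbalanced one slips through.
const unbalanced = usPhone.test("(555-123-4567"); // true
```

If unbalanced parentheses matter for your input, one option is to run a second, stricter check after this one.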
URL
^https?:\/\/[\w.-]+\.[a-z]{2,}(\/\S*)?$
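Matches URLs starting with http:// or https:// followed by a domain and an optional path. Because the top-level domain is written as [a-z]{2,}, an uppercase domain only matches when the case-insensitive flag is set.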
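The URL pattern can be exercised the same way. In this JavaScript sketch the URLs are illustrative examples.

```javascript
const urlPattern = /^https?:\/\/[\w.-]+\.[a-z]{2,}(\/\S*)?$/;

const plainUrl = urlPattern.test("http://example.com");                   // true
const withPath = urlPattern.test("https://example.com/path/to/page?x=1"); // true

// Only http and https are accepted.
const ftpUrl = urlPattern.test("ftp://example.com"); // false

// \S* cannot cross whitespace, so a space in the path fails.
const spacedPath = urlPattern.test("https://example.com/has space"); // false

// [a-z] is lowercase only; adding the i flag accepts uppercase too.
const upperCase = urlPattern.test("https://EXAMPLE.COM"); // false
const upperCaseWithFlag = /^https?:\/\/[\w.-]+\.[a-z]{2,}(\/\S*)?$/i.test("https://EXAMPLE.COM"); // true
```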
Hex Color Code
^#([0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})$
Matches both shorthand (#FFF) and full (#FF00AA) hex color values. If you are working with color data alongside structured formats, the JSON Formatter can help you organize and validate your configuration files.
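A quick JavaScript sketch of the hex color pattern follows. The color values are sample inputs, and the capture group returns the digits without the leading #.

```javascript
const hexColor = /^#([0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})$/;

const shortForm = hexColor.test("#FFF");    // true
const longForm = hexColor.test("#FF00AA");  // true
const lowerCase = hexColor.test("#ff00aa"); // true

// Four digits is neither the 3-digit nor the 6-digit form.
const fourDigits = hexColor.test("#FFFF"); // false

// The leading # is required, and G is not a hex digit.
const missingHash = hexColor.test("FF00AA"); // false
const badDigit = hexColor.test("#GGG");      // false

// Group 1 holds the hex digits.
const hexDigits = "#FF00AA".match(hexColor)[1]; // "FF00AA"
```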
Tips for Beginners
Start simple. Build your regex one piece at a time. Get the first part working before adding complexity. Test after every change.
Use a visual tester. Tools that highlight matches in real time make learning dramatically easier. You can see exactly what your pattern does as you type it.
Be specific. A pattern like .* matches everything and is almost never what you actually want. Use character classes and quantifiers that describe your expected input precisely.
Watch out for escaping. Characters like ., *, +, ?, (, ), [, {, ^, $, |, and the backslash itself have special meaning. If you want to match them literally, escape them with a backslash.
Read your pattern aloud. Translating ^\d{3}-\d{4}$ as "start, three digits, a hyphen, four digits, end" makes complex patterns much easier to reason about.
Anchor when validating. If you are checking that an entire string matches a format (like an email or phone number), always use ^ and $ to prevent partial matches.
Frequently Asked Questions
What does regex stand for?
Regex is short for "regular expression." It refers to a formal notation for describing text patterns. The concept originates from formal language theory in computer science, but today it is a practical, everyday tool used by developers, data analysts, and system administrators.
Is regex the same in every programming language?
The core syntax (character classes, quantifiers, anchors, and basic groups) is consistent across most languages. However, advanced features like lookbehind, named groups, and Unicode support vary. JavaScript, Python, and Java each have slight differences in their regex engines. Always check the documentation for your specific language. The MDN Regular Expressions guide is an excellent reference for JavaScript.
When should I avoid using regex?
Regex is not the right tool for parsing deeply nested or recursive structures like HTML or JSON. For those, use a proper parser. Regex also becomes hard to maintain when patterns grow beyond a few dozen characters. If your pattern is unreadable, consider breaking the logic into multiple smaller checks or using a parsing library instead.
How do I debug a regex that is not working?
Start by simplifying. Remove parts of the pattern until something matches, then add pieces back one at a time. Use a live tester like the NebulaTool Regex Tester to see exactly which parts of your input string are matching. Pay close attention to anchors, escaping, and whether you need the global flag.
What is the difference between greedy and lazy matching?
Greedy quantifiers (*, +, ?, {n,}, {n,m}) match as much text as possible. Lazy quantifiers (*?, +?, ??, {n,}?, {n,m}?) match as little as possible. For example, take the input string "foo" and "bar", quotation marks included. The greedy pattern ".*" matches the entire string from the first quote to the last, while the lazy pattern ".*?" finds "foo" and "bar" as two separate matches.
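The quoted-string example can be reproduced with a few lines of JavaScript. The g flag is used so that every match is returned; the variable names are illustrative.

```javascript
// The input includes the quotation marks themselves.
const quotedInput = '"foo" and "bar"';

// Greedy: .* runs to the last quote, giving one long match.
const greedyQuotes = quotedInput.match(/".*"/g);
// ['"foo" and "bar"']

// Lazy: .*? stops at the nearest closing quote, giving two matches.
const lazyQuotes = quotedInput.match(/".*?"/g);
// ['"foo"', '"bar"']
```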
Start Testing Your Patterns
The fastest way to learn regex is to practice. The NebulaTool Regex Tester highlights matches in real time, explains each part of your pattern, and runs entirely in your browser with no data sent to a server. Paste in one of the examples from this guide, tweak it, break it, and rebuild it. That is how regex stops being intimidating and starts being useful.