search, match, findall

Mental Model

search scans the entire string for the first match anywhere. match only checks at the very beginning of the string. findall collects every non-overlapping match into a list. Choosing the right function is the first decision in any regex task: "Do I need the first occurrence, a start-of-string check, or all occurrences?"

Overview

Python's re module provides several functions for finding patterns in text. The three most commonly used are search(), match(), and findall(), each with distinct behavior.

| Function | Searches Where | Returns | Use Case |
|---|---|---|---|
| `re.search()` | Anywhere in string | First Match or None | Find first occurrence |
| `re.match()` | Beginning of string only | Match or None | Validate string start |
| `re.fullmatch()` | Entire string | Match or None | Validate entire string |
| `re.findall()` | Entire string | List of strings/tuples | Extract all occurrences |
| `re.finditer()` | Entire string | Iterator of Match objects | Process matches one by one |

re.search()

re.search() scans the entire string and returns the first match:

```python
import re

text = "Error 404: Page not found at 14:30"

# Finds the first sequence of digits
match = re.search(r'\d+', text)
print(match.group())  # '404'
print(match.span())   # (6, 9)
```

If no match exists, it returns None:

```python
import re

match = re.search(r'\d+', 'no numbers here')
print(match)  # None
```

Common Pattern: Guard with if

```python
import re

text = "Temperature: 72.5°F"
match = re.search(r'([\d.]+)°([FC])', text)

if match:
    value = float(match.group(1))
    unit = match.group(2)
    print(f"{value} degrees {unit}")  # 72.5 degrees F
```

Walrus Operator (Python 3.8+)

The walrus operator := combines the search and check in one expression:

```python
import re

text = "Price: $42.99"
# \$ matches a literal dollar sign; \. matches a literal decimal point
if m := re.search(r'\$(\d+\.\d{2})', text):
    print(f"Found price: {m.group(1)}")  # Found price: 42.99
```

re.match()

re.match() checks for a match only at the beginning of the string:

```python
import re

# Matches: pattern is at the start
re.match(r'\d+', '123abc')
# <re.Match object; span=(0, 3), match='123'>

# No match: digits are not at the start
re.match(r'\d+', 'abc123')
# None
```

match() vs search() with ^

re.match() behaves like re.search() with a ^ anchor, as long as the re.MULTILINE flag is not set:

```python
import re

text = "hello world"

# These are equivalent
re.match(r'hello', text)    # Match
re.search(r'^hello', text)  # Match

# These differ
re.match(r'world', text)    # None (not at start)
re.search(r'world', text)   # Match (found in string)
```
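One case where the two forms diverge is re.MULTILINE: with that flag, ^ also matches right after each newline, while re.match() stays pinned to the start of the string. A minimal sketch, using a made-up two-line string:

```python
import re

text = "first line\nsecond line"  # illustrative input, not from the examples above

# Without flags, both forms only accept a match at the very start of the string.
print(re.match(r'second', text))    # None
print(re.search(r'^second', text))  # None

# With re.MULTILINE, ^ also matches immediately after a newline...
m = re.search(r'^second', text, re.MULTILINE)
print(m.span())                     # (11, 17)

# ...but re.match() still only tries position 0.
print(re.match(r'second', text, re.MULTILINE))  # None
```

If the intent is "start of the whole string" regardless of flags, re.match() (or the \A anchor) states that more directly than ^.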

When to Use match() vs search()

Use re.match() when you specifically need to validate the beginning of a string. Use re.search() for general-purpose pattern finding anywhere in the string. In practice, re.search() is more commonly used.

re.fullmatch()

re.fullmatch() requires the pattern to match the entire string (similar to anchoring with ^...$, but stricter, because $ also matches just before a trailing newline):

```python
import re

# Validate that the entire string is a date
re.fullmatch(r'\d{4}-\d{2}-\d{2}', '2024-01-15')
# <re.Match object; span=(0, 10), match='2024-01-15'>

re.fullmatch(r'\d{4}-\d{2}-\d{2}', '2024-01-15 extra')
# None (extra text after the date)
```
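The difference from a hand-written ^...$ pattern shows up when the input ends with a newline, as a line read from a file often does. A short sketch, with an illustrative input value:

```python
import re

value = "12345\n"  # illustrative: a line with its newline still attached

# $ matches at the end of the string and also just before a trailing newline.
anchored = re.search(r'^\d+$', value)
print(anchored is not None)  # True

# fullmatch() must consume every character, including the newline.
strict = re.fullmatch(r'\d+', value)
print(strict is not None)    # False

# Stripping the input first makes both approaches agree.
print(re.fullmatch(r'\d+', value.strip()) is not None)  # True
```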

fullmatch() is ideal for input validation:

```python
import re

def is_valid_email_simple(email):
    """Basic email format check (not production-grade)."""
    # \. requires a literal dot between the domain name and its suffix
    return bool(re.fullmatch(r'[\w.+-]+@[\w-]+\.[\w.]+', email))

print(is_valid_email_simple("user@example.com"))      # True
print(is_valid_email_simple("not an email"))          # False
print(is_valid_email_simple("user@example.com foo"))  # False
```

re.findall()

re.findall() returns a list of all non-overlapping matches:

```python
import re

text = "Prices: $10, $25, $100, and $3.50"

# No groups: returns list of full matches
re.findall(r'\$[\d.]+', text)
# ['$10', '$25', '$100', '$3.50']

# One group: returns list of group contents
re.findall(r'\$([\d.]+)', text)
# ['10', '25', '100', '3.50']

# Multiple groups: returns list of tuples
re.findall(r'\$(\d+)\.?(\d*)', text)
# [('10', ''), ('25', ''), ('100', ''), ('3', '50')]
```
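Because any capturing group changes what findall() returns, a group that exists only for alternation or repetition can be written as a non-capturing group, (?:...), to keep full matches in the result. A small sketch with a made-up list of file names:

```python
import re

text = "Files: report.pdf, photo.jpeg, notes.txt, scan.jpg"  # illustrative input

# A capturing group makes findall() return only the group contents.
extensions = re.findall(r'\w+\.(jpe?g|pdf)', text)
print(extensions)  # ['pdf', 'jpeg', 'jpg']

# A non-capturing group keeps the alternation but returns the full matches.
filenames = re.findall(r'\w+\.(?:jpe?g|pdf)', text)
print(filenames)   # ['report.pdf', 'photo.jpeg', 'scan.jpg']
```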

findall() with No Match

If no matches are found, findall() returns an empty list (not None):

```python
import re

result = re.findall(r'\d+', 'no numbers')
print(result)        # []
print(len(result))   # 0
print(bool(result))  # False
```
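One practical consequence is that code built on findall() can loop over the result without a guard, whereas code built on search() has to check for None first. The two helper functions below are hypothetical names used only to show the contrast:

```python
import re

def all_numbers(text):
    """Return every run of digits in text as an int (hypothetical helper)."""
    # findall() gives [] when nothing matches, so the comprehension needs no guard.
    return [int(n) for n in re.findall(r'\d+', text)]

def first_number(text):
    """Return the first run of digits as an int, or None (hypothetical helper)."""
    # search() gives None when nothing matches, so a guard is required.
    m = re.search(r'\d+', text)
    return int(m.group()) if m else None

print(all_numbers("no numbers"))           # []
print(first_number("no numbers"))          # None
print(all_numbers("3 apples, 12 pears"))   # [3, 12]
print(first_number("3 apples, 12 pears"))  # 3
```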

re.finditer()

re.finditer() returns an iterator of Match objects, giving you access to all match metadata (position, groups):

```python
import re

text = "Alice: 85, Bob: 92, Carol: 78"

for match in re.finditer(r'(\w+): (\d+)', text):
    name = match.group(1)
    score = int(match.group(2))
    pos = match.span()
    print(f"{name} scored {score} (at position {pos})")

# Alice scored 85 (at position (0, 9))
# Bob scored 92 (at position (11, 18))
# Carol scored 78 (at position (20, 29))
```

finditer() vs findall()

Use finditer() when you need:

  • The position of each match (.start(), .end(), .span())
  • Named groups (.groupdict())
  • Memory efficiency with large texts (lazy iteration)
  • Both the full match and group contents

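```python
import re

text = "2024-01-15 and 2024-12-31"

# The group names (year, month, day) are illustrative.

# findall: only group contents
re.findall(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})', text)
# [('2024', '01', '15'), ('2024', '12', '31')]

# finditer: full Match objects
for m in re.finditer(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})', text):
    print(m.group(0), m.groupdict())
# 2024-01-15 {'year': '2024', 'month': '01', 'day': '15'}
# 2024-12-31 {'year': '2024', 'month': '12', 'day': '31'}
```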
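The lazy-iteration point from the list above can be sketched with itertools.islice, which stops pulling matches once it has enough. The log string and the group name num are made up for illustration:

```python
import re
from itertools import islice

log = "id=1 id=22 id=333 id=4444 id=55555"  # illustrative input
pattern = re.compile(r'id=(?P<num>\d+)')

# finditer() produces Match objects on demand, so islice() can stop after two
# without building a list of every match first.
first_two = [(m.group('num'), m.start()) for m in islice(pattern.finditer(log), 2)]
print(first_two)  # [('1', 0), ('22', 5)]
```

With findall(), the full list of matches would be built before any slicing could happen, which matters mainly for very large inputs.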

Comparison Table

```python
import re

text = "cat bat hat"
pattern = r'[cbh]at'

# search: first match only
re.search(pattern, text).group()  # 'cat'

# findall: all matches as list
re.findall(pattern, text)  # ['cat', 'bat', 'hat']

# finditer: all matches as Match objects
[m.group() for m in re.finditer(pattern, text)]  # ['cat', 'bat', 'hat']

# match: beginning of string only
re.match(pattern, text).group()  # 'cat'

# fullmatch: entire string
re.fullmatch(pattern, text)   # None (text has spaces)
re.fullmatch(pattern, 'cat')  # <re.Match object; span=(0, 3), match='cat'>
```

Summary

| Function | Scope | Returns | Best For |
|---|---|---|---|
| `search()` | First match anywhere | Match / None | Finding first occurrence |
| `match()` | Start of string | Match / None | Validating beginning |
| `fullmatch()` | Entire string | Match / None | Input validation |
| `findall()` | All matches | list | Extracting all occurrences |
| `finditer()` | All matches | Iterator of Match | Position-aware extraction |

Exercises

Exercise 1. Write a function validate_email that uses re.fullmatch to check if a string is a valid email address (simplified pattern). Return True or False. Test with "user@example.com", "@invalid.com", "user@.com", and "name@domain.org".

Solution to Exercise 1

```python
import re

def validate_email(email):
    # \. requires a literal dot before the top-level domain
    pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
    return bool(re.fullmatch(pattern, email))

# Test
print(validate_email("user@example.com"))  # True
print(validate_email("@invalid.com"))      # False
print(validate_email("user@.com"))         # False
print(validate_email("name@domain.org"))   # True
```


Exercise 2. Write a function find_all_hashtags that uses re.findall to extract all hashtags from a social media post. A hashtag starts with # followed by one or more word characters. For example, "Love #Python and #coding! #100DaysOfCode" should return ["#Python", "#coding", "#100DaysOfCode"].

Solution to Exercise 2

```python
import re

def find_all_hashtags(text):
    return re.findall(r'#\w+', text)

# Test
post = "Love #Python and #coding! #100DaysOfCode"
print(find_all_hashtags(post))
# ['#Python', '#coding', '#100DaysOfCode']
```


Exercise 3. Write a function search_vs_match_demo that takes a pattern and a string, and returns a dictionary with keys "search", "match", and "fullmatch", each containing True or False depending on whether the respective function found a match. Demonstrate with the pattern r"\d+" and the string "abc123".

Solution to Exercise 3

```python
import re

def search_vs_match_demo(pattern, text):
    return {
        "search": bool(re.search(pattern, text)),
        "match": bool(re.match(pattern, text)),
        "fullmatch": bool(re.fullmatch(pattern, text)),
    }

# Test
result = search_vs_match_demo(r"\d+", "abc123")
print(result)
# {'search': True, 'match': False, 'fullmatch': False}

result2 = search_vs_match_demo(r"\d+", "123")
print(result2)
# {'search': True, 'match': True, 'fullmatch': True}
```