RegEx – Regular Expressions

Useful When Using Google Analytics and validating stuff like forms



Dot .  

The dot, or period, matches any single character.
For example, MM. matches MMA and MMB.

Backslash \
Used a lot, to escape characters with special meanings.
Without a backslash, U.S.A. treats each full stop as "any character", e.g.
U.S.A. would match UISAAB, because each dot matches any character.
What if you just want to match U.S.A. though, without any additional characters?
Putting a backslash before the full stop takes away its special function and makes it search for an actual full stop.
For example, U\.S\.A\. would match only U.S.A.
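
To see the difference between a plain dot and an escaped dot, here is a minimal sketch using Python's built-in re module. Python is my choice for illustration only, and the variable names are made up; Google Analytics uses its own regex engine, but the dot and backslash behave the same way for this case.

```python
import re

# Unescaped dots: each "." stands for any single character.
loose = re.compile(r"U.S.A.")
# Escaped dots: each "\." stands for a literal full stop.
strict = re.compile(r"U\.S\.A\.")

samples = ["U.S.A.", "UISAAB"]

# fullmatch() only succeeds if the whole string fits the pattern.
loose_hits = [bool(loose.fullmatch(s)) for s in samples]
strict_hits = [bool(strict.fullmatch(s)) for s in samples]

print(loose_hits)   # the loose pattern accepts both strings
print(strict_hits)  # the strict pattern accepts only "U.S.A."
```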

Square brackets []
The square brackets basically say to the computer “match any one of these”
For example, [MBA] will match a single M, B or A

You can add a hyphen to create a range.
For example [a-z] will match anything from a to z in the alphabet (lowercase).

Adding a caret (little roof thing) ^ as the first character inside the brackets means DO NOT match.
e.g. [^0-9] matches any single character that is NOT a digit
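
A quick sketch of the three bracket forms above, again in Python's re module. The sample text and variable names are just ones I picked for the demo.

```python
import re

text = "MMA belt 2024"

# [MBA]: any one of the characters M, B or A (case-sensitive).
any_of = re.findall(r"[MBA]", text)

# [a-z]: any one lowercase letter from a to z.
lowercase = re.findall(r"[a-z]", text)

# [^0-9]: any one character that is NOT a digit.
non_digits = re.findall(r"[^0-9]", "a1b2")

print(any_of)
print(lowercase)
print(non_digits)
```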

The Question Mark ?
“The question mark matches zero or one of the previous items”
The best example for this is colour

colou?r matches color and colour 

The plus sign +
Similar to the question mark, but this time it must “match one or more of the previous item”
29+ matches 29, 299, 29999 (the + only applies to the 9)
a+ matches a, aa, aaa etc. In aaab it matches the aaa part, but not the b.

Asterisk *
Similar to the question mark, but it matches zero or more of the previous item.
b*c will match c, bc and bbbc
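
The question mark, plus sign and asterisk are easiest to compare side by side. This is a rough sketch in Python; the helper function full() is something I made up for the demo, and I have used b*c for the asterisk so there is a c to anchor the match on.

```python
import re

def full(pattern, candidates):
    """Return the candidates that the pattern matches from start to end."""
    return [c for c in candidates if re.fullmatch(pattern, c)]

# ? = zero or one of the previous item
optional_u = full(r"colou?r", ["color", "colour", "colouur"])

# + = one or more of the previous item
one_or_more = full(r"29+", ["2", "29", "29999"])

# * = zero or more of the previous item
zero_or_more = full(r"b*c", ["c", "bc", "bbbc"])

# a+ only covers the run of a's at the start of "aaab", not the b.
leading_as = re.match(r"a+", "aaab").group()

print(optional_u, one_or_more, zero_or_more, leading_as)
```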

The dot star – the WildCard .*
The wildcard basically matches any run of characters, including none at all
So if you looked for matches for Drew.*Griffiths, the computer will search for DrewANYTHINGGriffiths
So it would match DrewHeroGriffiths, DrewHandsomeGriffiths, and even plain DrewGriffiths.
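
Here is the Drew.*Griffiths pattern tried out in Python, as a sketch. The list of names is invented for the demo. One thing to be aware of: in Python the dot does not match a line break by default, so .* stays on one line.

```python
import re

pattern = re.compile(r"Drew.*Griffiths")

names = [
    "DrewHeroGriffiths",
    "DrewHandsomeGriffiths",
    "DrewGriffiths",   # .* is allowed to match nothing at all
    "Drew Hero",       # no "Griffiths", so no match
]

matched = [n for n in names if pattern.fullmatch(n)]
print(matched)
```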

The Pipe or vertical Line |
Think of it as “either or”
Pink|Wink will match both Pink and Wink
(P|W)ink will do the same
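
A small Python sketch to check that both ways of writing the "either or" give the same result. The word list is made up, with Sink thrown in as something that should not match.

```python
import re

words = ["Pink", "Wink", "Sink"]

# Whole alternatives separated by the pipe.
plain = [w for w in words if re.fullmatch(r"Pink|Wink", w)]

# The pipe inside brackets only covers the first letter.
grouped = [w for w in words if re.fullmatch(r"(P|W)ink", w)]

print(plain)
print(grouped)
```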

The caret ^
This indicates that the text has to begin with whatever you put after it
For example in Google Analytics you could use a Matching RegExp filter – ^/products
This would match all the URIs that start with /products, i.e. everything in the products folder.
Remember ^ indicates the start.
Like a carrot can get a donkey started.
http://www.dougmather.co.uk/


The Dollar Sign $
The dollar sign matches the end of the text, e.g. the end of a URL.
Martial Arts$ would match Mixed Martial Arts but not Martial Arts UK
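
The caret and the dollar sign can be tried out together. This is a sketch in Python rather than in Google Analytics itself, and the URIs and page titles are invented examples.

```python
import re

uris = ["/products/gloves", "/products", "/blog/products"]

# ^ pins the match to the start of the text.
in_products = [u for u in uris if re.search(r"^/products", u)]

titles = ["Mixed Martial Arts", "Martial Arts UK"]

# $ pins the match to the end of the text.
ends_with = [t for t in titles if re.search(r"Martial Arts$", t)]

print(in_products)
print(ends_with)
```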

 

Examples:
I robbed all of these, and changed them a little bit, from the Google RegEx page listed in the reference.

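1. Match the word “Dave”  – without the quotation marks, obvs
(\W|^)Dave(\W|$)

2. Match the phrase “round house”
(\W|^)round\shouse(\W|$)

\W matches any character that is not a letter, digit or underscore, so it stops the regex from matching when the word or phrase is part of a longer word. \s matches a whitespace character, such as the space in “round house”.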

3. Match any of the following

Drew, sweep, judo, round house, protein

(\W|^)(Drew|sweep|judo|round house|protein)(\W|$)
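
To check the whole-word patterns, here is a sketch in Python. The test sentences are my own, and I am assuming the patterns are meant to be written with \W (backslash, capital W).

```python
import re

word = re.compile(r"(\W|^)Dave(\W|$)")
tests = ["Dave", "Hi Dave!", "Davenport", "xDave"]

# True only when Dave stands alone as a word.
whole = [bool(word.search(t)) for t in tests]

any_word = re.compile(r"(\W|^)(Drew|sweep|judo|round house|protein)(\W|$)")
phrases = ["a round house kick", "judoka", "more protein"]

# group(2) is the word or phrase that was found; None means no match.
found = [m.group(2) if (m := any_word.search(p)) else None for p in phrases]

print(whole)
print(found)
```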


4. Match any URL that contains the text “seoandMMA”

For example
seoandMMA.wordpress.com
seoandMMA.com
http://seoandMMA.com

The regex:
seoandMMA[\w.+%\-]{0,11}\.com

[\w.+%\-] tells the computer to match any letter, digit or underscore (that is the \w), full stop, plus sign, percentage sign, or hyphen.  These are chosen because they cover the characters you would expect in this part of a URL

{0,11} tells the computer that 0 to 11 of those characters can occur after the word “seoandMMA”

Curly braces usually indicate how many times a character is repeated.
You can also express a range, by adding the comma, as we did before with {0,11}
{minimum, maximum}

The backslash before the hyphen inside the square brackets, and the final backslash before the full stop/period in .com, “escape” the hyphen and the full stop.  This takes away any special RegEx functionality, and instead makes them a normal full stop and a normal hyphen
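
Here is the URL pattern tried out in Python, as a sketch. I am assuming the pattern is written with square brackets and backslashes as seoandMMA[\w.+%\-]{0,11}\.com. The .org URL and the long subdomain are made-up examples, added to show what does not match.

```python
import re

url_pattern = re.compile(r"seoandMMA[\w.+%\-]{0,11}\.com")

urls = [
    "seoandMMA.wordpress.com",          # ".wordpress" is 10 characters, within 0 to 11
    "seoandMMA.com",                    # zero extra characters is allowed
    "http://seoandMMA.com",             # search() finds the text anywhere in the URL
    "seoandMMA.org",                    # does not end in .com
    "seoandMMA.averylongsubdomain.com", # 19 extra characters, over the limit of 11
]

hits = [bool(url_pattern.search(u)) for u in urls]
print(hits)
```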

Some great resources include:
http://rubular.com/
and http://www.ultrapico.com/expresso.htm

References:
https://support.google.com/a/answer/1371417?hl=en#Match-Whole-Word-Only
http://www.youtube.com/watch?v=BnEjyhoyKlw



RegEx Converter

@DannyRichman created this great RegEx converter

You need to get an API Key for it to work

The first step is to register a free account at OpenAI.com

https://docs.google.com/spreadsheets/d/1wlpy72KftY32uxZ2vrpK_m-IKGvzSa8W0AGmM4Fi2TY/edit#gid=0



Search Console Regex

I’ll add more regex as I use them in Search Console.

For a query that contains 2 words:

\bresistance\b.*\bguide\b

This returns stats for queries/keywords containing the whole word “resistance” followed, somewhere later in the query, by the whole word “guide”
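
A sketch of how that pattern behaves, using Python's re module as a stand-in for the Search Console filter (Search Console runs its own regex engine, so treat this as an approximation). The queries are invented examples.

```python
import re

pattern = re.compile(r"\bresistance\b.*\bguide\b")

queries = [
    "resistance band guide",      # both whole words, in the right order
    "guide to resistance bands",  # "guide" comes first, so no match
    "resistance guides",          # "guides" is not the whole word "guide"
]

hits = [bool(pattern.search(q)) for q in queries]
print(hits)
```

If the order of the two words should not matter, one option is to join both orders with the pipe: \bresistance\b.*\bguide\b|\bguide\b.*\bresistance\b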
