RegEx – Regular Expressions

Useful When Using Google Analytics and validating stuff like forms

[Picture: Borat, “I like RegEx!”]


Dot .  

The dot (or period) matches any single character.
For example, MM. matches MMA and MMB.
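
If you want to try the dot yourself, here is a minimal sketch using Python's built-in re module (Python is my choice for these snippets, and fullmatch is used so the whole text has to fit the pattern):

```python
import re

# "MM." means: M, M, then any single character (except a newline by default).
dot_pattern = re.compile(r"MM.")

for text in ["MMA", "MMB", "MM"]:
    # fullmatch only succeeds if the entire string fits the pattern.
    print(text, bool(dot_pattern.fullmatch(text)))
```

MMA and MMB print True; MM prints False because the dot still needs one character to match.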

Backslash \
Used a lot, to escape characters with special meanings.
Written as a pattern, U.S.A. treats each full stop (dot) as “any character”, e.g.
U.S.A. would match UISAAB because each dot matches any character.
What if you just want to match U.S.A. though, without any additional characters?
Putting a backslash in front of a full stop takes away its special function and makes it search for an actual full stop.
For example, U\.S\.A\. would match only U.S.A.
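
Here is a small sketch of the difference, again assuming Python's re module, with the backslashes written out as U\.S\.A\. in the escaped version:

```python
import re

# Unescaped: every dot means "any single character".
unescaped = re.compile(r"U.S.A.")
# Escaped: every \. means "a literal full stop".
escaped = re.compile(r"U\.S\.A\.")

print(bool(unescaped.fullmatch("UISAAB")))  # True
print(bool(escaped.fullmatch("UISAAB")))    # False
print(bool(escaped.fullmatch("U.S.A.")))    # True
```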

Square brackets []
The square brackets basically say to the computer “match any one of these”.
For example, [MBA] will match a single M, B or A.

You can add a hyphen to create a range.
For example [a-z] will match anything from a to z in the alphabet (lowercase).

Adding a caret (little roof thing) ^ as the first character inside the brackets means DO NOT match.
e.g. [^0-9] matches any single character that is NOT a digit
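
A quick way to see all three bracket tricks is to list every single-character match in some text. This is only a sketch, assuming Python's re module; the test strings are made up:

```python
import re

# findall returns every non-overlapping match, one character at a time here.
any_of = re.findall(r"[MBA]", "MAMBA 5")     # any one of M, B or A
lowercase = re.findall(r"[a-z]", "Gi 2")     # any one lowercase letter
not_digit = re.findall(r"[^0-9]", "a1b2")    # any one character that is not a digit

print(any_of)     # ['M', 'A', 'M', 'B', 'A']
print(lowercase)  # ['i']
print(not_digit)  # ['a', 'b']
```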

The Question Mark ?
“The question mark matches zero or one of the previous items”
The best example for this is colour

colou?r matches color and colour 
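
To check the colour example, here is a minimal sketch assuming Python's re module:

```python
import re

# "u?" means the u may appear once or not at all.
colour_pattern = re.compile(r"colou?r")

for word in ["color", "colour", "colouur"]:
    print(word, bool(colour_pattern.fullmatch(word)))
```

color and colour print True; colouur prints False because ? allows at most one u.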

The plus sign +
Similar to the question mark, but this time it must “match one or more of the previous item”.
29+ matches 29 and 29999 (the + applies to the 9 only)
a+ matches a, aa, aaa, and the aaa in aaab, etc.
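
A rough sketch of the plus sign in Python's re module (my choice of tool). Note that the + only applies to the item directly before it:

```python
import re

# "29+" means: a 2, followed by one or more 9s.
plus_pattern = re.compile(r"29+")

print(bool(plus_pattern.fullmatch("29")))     # True
print(bool(plus_pattern.fullmatch("29999")))  # True
print(bool(plus_pattern.fullmatch("2")))      # False, at least one 9 is needed

# search finds the first match anywhere in the text; here it is the run of a's.
run_of_a = re.search(r"a+", "aaab").group()
print(run_of_a)  # aaa
```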

Asterisk *
Similar to the question mark, but it matches zero or more of the previous item.
b*c will match c, bc and bbbc
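
Here is a small sketch of the star, assuming Python's re module and using the pattern b*c (zero or more b, followed by a c):

```python
import re

# "b*" means zero or more b's, so the c on its own is enough.
star_pattern = re.compile(r"b*c")

for text in ["c", "bc", "bbbc", "bbb"]:
    print(text, bool(star_pattern.fullmatch(text)))
```

The first three print True; bbb prints False because the c is missing.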

The dot star – the WildCard .*
The wildcard basically matches everything: any run of characters, including none at all.
So if you looked for matches for Drew.*Griffiths, the computer will search for DrewANYTHINGGriffiths
So it would match DrewHeroGriffiths, DrewHandsomeGriffiths, etc., and plain DrewGriffiths too.
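
The wildcard can be tried out like this (a sketch, assuming Python's re module):

```python
import re

# ".*" means any character, repeated zero or more times.
wildcard_pattern = re.compile(r"Drew.*Griffiths")

for name in ["DrewHeroGriffiths", "DrewHandsomeGriffiths", "DrewGriffiths", "Drew"]:
    print(name, bool(wildcard_pattern.fullmatch(name)))
```

Only the last one prints False, because Griffiths never turns up.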

The Pipe or vertical Line |
Think of it as “either or”
Pink|Wink will match both Pink and Wink
(P|W)ink will do the same
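
Both ways of writing the “either or” can be compared side by side. A minimal sketch, assuming Python's re module:

```python
import re

# Two ways of saying "Pink or Wink".
long_form = re.compile(r"Pink|Wink")
short_form = re.compile(r"(P|W)ink")

for word in ["Pink", "Wink", "Link"]:
    print(word, bool(long_form.fullmatch(word)), bool(short_form.fullmatch(word)))
```

Pink and Wink match both patterns; Link matches neither.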

The caret ^
Outside square brackets, this indicates that your selection has to begin with whatever you put after it.
For example, in Google Analytics you could use a Matching RegExp filter – ^/products
This would match all the URIs that start with /products, such as everything in the products folder.
Remember ^ indicates the start.
Like a carrot can get a donkey started:

[Image: caret regexp. Source: http://www.dougmather.co.uk/]
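
Outside Google Analytics you can test the same idea with Python's re module. This is only a sketch, and the two URIs are made up for illustration:

```python
import re

# "^" anchors the match to the start of the text.
starts_with_products = re.compile(r"^/products")

# Hypothetical URIs, not taken from a real site.
in_folder = starts_with_products.search("/products/judo-gi")
elsewhere = starts_with_products.search("/blog/products")

print(bool(in_folder))  # True
print(bool(elsewhere))  # False, /products is not at the start
```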


The Dollar Sign $
Matches the end of the text you are testing, such as a URL.
Martial Arts$ would match Mixed Martial Arts but not Martial Arts UK
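
The dollar sign example, sketched with Python's re module (my assumption for the tooling):

```python
import re

# "$" anchors the match to the end of the text.
ends_with = re.compile(r"Martial Arts$")

print(bool(ends_with.search("Mixed Martial Arts")))  # True
print(bool(ends_with.search("Martial Arts UK")))     # False
```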

 

Examples:
I robbed all of these, and changed them a little bit, from the Google RegEx page listed in the reference.

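1. Match the word “Dave”  – without the quotation marks, obvs
(\W|^)Dave(\W|$)

2. Match the phrase “round house”
(\W|^)round house(\W|$)

\W matches any single character that is not a letter, digit or underscore. Together with ^ and $, it stops the phrase from matching when it is only part of a longer word, before or after.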
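
Written out with the backslashes in place, the first two examples can be checked like this. It is a sketch assuming Python's re module, and the test sentences are made up:

```python
import re

# (\W|^) = a non-word character or the start; (\W|$) = a non-word character or the end.
dave = re.compile(r"(\W|^)Dave(\W|$)")
round_house = re.compile(r"(\W|^)round house(\W|$)")

print(bool(dave.search("Dave")))                      # True
print(bool(dave.search("Hi Dave!")))                  # True
print(bool(dave.search("Davenport")))                 # False, Dave is part of a longer word
print(bool(round_house.search("a round house kick"))) # True
print(bool(round_house.search("surround houses")))    # False
```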

3. Match any of the following

Drew, sweep, judo, round house, protein

(\W|^)(Drew|sweep|judo|round house|protein)(\W|$)
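
To see which word from the list was found, you can read the middle group of the match. A sketch with the backslashes written out as \W, assuming Python's re module; find_word is just a helper name I made up:

```python
import re

any_word = re.compile(r"(\W|^)(Drew|sweep|judo|round house|protein)(\W|$)")

def find_word(text):
    """Return the listed word found in text, or None (hypothetical helper)."""
    match = any_word.search(text)
    # Group 2 is the (Drew|sweep|...) part of the pattern.
    return match.group(2) if match else None

print(find_word("I love judo"))    # judo
print(find_word("protein shake"))  # protein
print(find_word("sweeping"))       # None, sweep is only part of the word
```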


4. Match any URL that contains the text “seoandMMA”

For example
seoandMMA.wordpress.com
seoandMMA.com
http://seoandMMA.com

seoandMMA[\w.+%\-]{0,11}\.com

[\w.+%\-] tells the computer to match any letter, digit or underscore (that is the \w), full stop, plus sign, percentage sign, or hyphen.  These are chosen because they are the characters you would expect to see in this part of a URL.

{0,11} tells the computer that 0 to 11 of those characters can occur after the word “seoandMMA”

Curly braces indicate how many times the previous item is repeated, e.g. a{3} matches aaa.
You can also express a range, by adding the comma, as we did before with {0,11}
{minimum,maximum}

The backslashes before the hyphen (\-) and before the final full stop/period (\.) “escape” the hyphen and full stop.  They take away any special RegEx functionality, and instead make them a normal full stop and a normal hyphen.
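
Putting example 4 together, here is a sketch assuming Python's re module and assuming the pattern is written with square brackets and backslashes as seoandMMA[\w.+%\-]{0,11}\.com. The last two test URLs are made up to show what does not match:

```python
import re

# seoandMMA, then 0 to 11 allowed characters, then a literal ".com".
url_pattern = re.compile(r"seoandMMA[\w.+%\-]{0,11}\.com")

urls = [
    "seoandMMA.wordpress.com",
    "seoandMMA.com",
    "http://seoandMMA.com",
    "seoandMMA.org",                     # hypothetical: no .com
    "seoandMMA.averylongsubdomain.com",  # hypothetical: more than 11 characters
]

for url in urls:
    print(url, bool(url_pattern.search(url)))
```

The first three print True. The last two print False: one never reaches .com, and the other has too many characters between seoandMMA and .com.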

Some great resources include:
http://rubular.com/
and http://www.ultrapico.com/expresso.htm

References:
https://support.google.com/a/answer/1371417?hl=en#Match-Whole-Word-Only
http://www.youtube.com/watch?v=BnEjyhoyKlw
