Use Regular Expressions with JavaScript

Everything is a character. Most patterns use plain ASCII, which includes letters, digits, punctuation, and symbols like %#$@!. You can also use Unicode characters to match international text.

The abcs and 123s

\d is a metacharacter that means any digit from 0 to 9
\D means any non-digit character
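
As a minimal sketch of the two classes in JavaScript (the sample strings and variable names are hypothetical, chosen only for illustration):

```javascript
// Hypothetical sample string, used only for illustration.
const digitSample = "abc123";

// \d matches a single digit; \D matches a single non-digit.
const firstDigit = digitSample.match(/\d/)[0];
const firstNonDigit = digitSample.match(/\D/)[0];

// test() reports whether the pattern occurs anywhere in the string.
const hasDigit = /\d/.test("no digits here");

console.log(firstDigit);    // "1"
console.log(firstNonDigit); // "a"
console.log(hasDigit);      // false
```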

The Dot

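. matches any single character (letter, digit, whitespace, symbol), except line breaks by default.

\. matches a literal period (full stop).

If you want to skip abc1 but match abc., you can use ...\. The three dots match any three characters, and the escaped \. matches only the period at the end of abc.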
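
The difference between the wildcard dot and an escaped period can be sketched like this. The ^ and $ anchors are an addition here so that the whole string has to fit the pattern, and the variable names are hypothetical:

```javascript
// Three "any character" dots followed by an escaped, literal period.
const threeThenPeriod = /^...\.$/;
// Four unescaped dots: any four characters at all.
const anyFour = /^....$/;

const dotResults = {
  escapedOnAbcDot: threeThenPeriod.test("abc."), // true
  escapedOnAbc1: threeThenPeriod.test("abc1"),   // false: "1" is not a period
  unescapedOnAbc1: anyFour.test("abc1"),         // true: "." accepts "1"
};

console.log(dotResults);
```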

[abc] – matching specific characters

[abc] will only match a single a, b, or c

[abc]de will match ade, bde, or cde, but it will not match bcd

[^abc] – excluding specific characters

[^abc]de will not match if the character right before de is a, b, or c

[^b]og will match hog and dog but not bog
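
A small sketch of both bracket forms, filtering hypothetical word lists. The ^ and $ anchors are added here so each whole word must fit the pattern:

```javascript
// Hypothetical word lists for illustration.
const deWords = ["ade", "bde", "cde", "bcd", "xde"];
const ogWords = ["hog", "dog", "bog"];

// [abc] accepts exactly one of a, b, or c in that position.
const inSet = deWords.filter((word) => /^[abc]de$/.test(word));

// [^b] accepts any single character except b.
const notB = ogWords.filter((word) => /^[^b]og$/.test(word));

console.log(inSet); // ["ade", "bde", "cde"]
console.log(notB);  // ["hog", "dog"]
```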

Character ranges with bracket notation

[a-z] means any lowercase letter from a to z
[^d-f] means any character except d, e, or f
[A-Za-z] means any uppercase or lowercase letter

\w is a metacharacter equivalent to the range [A-Za-z0-9_]
To constrain each letter of a three-letter sequence separately, chain several brackets. Each bracket tests one position.

[A-C][n-p][a-c]
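
To see how each bracket checks one position, here is a short sketch with made-up three-letter inputs (the anchors and names are assumptions for the example):

```javascript
// Position 1 must be A-C, position 2 must be n-p, position 3 must be a-c.
const threeLetter = /^[A-C][n-p][a-c]$/;

// Hypothetical inputs for illustration.
const rangeInputs = ["Ana", "Bob", "Cpc", "Dna", "ana"];
const rangeResults = rangeInputs.map((word) => threeLetter.test(word));

console.log(rangeResults); // [true, true, true, false, false]
```

"Dna" fails on the first position (D is outside A-C), and "ana" fails because ranges are case-sensitive.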

Character repetition

x{3} will match if x is repeated exactly 3 times.
x{1,3} will match if x is repeated between 1 and 3 times.
[abc]{7} will match any run of 7 characters that are each a, b, or c.
.{2,6} will match between 2 and 6 of any character.

r{3,5} will not match carriage, which has only two r's in a row.
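
A quick sketch of the curly-brace counts. Note that without anchors a pattern like x{3} would also be found inside a longer run such as xxxx, so some checks below add ^ and $ (an assumption made for the example):

```javascript
const exactlyThree = /^x{3}$/;

const repetitionChecks = [
  exactlyThree.test("xxx"),     // true: exactly three x's
  exactlyThree.test("xx"),      // false: only two
  /r{3,5}/.test("carriage"),    // false: only "rr" in a row
  /r{2}/.test("carriage"),      // true
  /^[abc]{7}$/.test("abcabca"), // true: seven characters, each a, b, or c
  /^.{2,6}$/.test("abcdefg"),   // false: seven characters is too many
];

console.log(repetitionChecks);
```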

* and +

\d* means it needs to match 0 or more digits
\d+ means it needs to match 1 or more digits (that's better for validating a form field, since an empty value will not pass)
a+ means 1 or more a's (that can work for repetition)
[abc]+ means 1 or more of any of the a, b, or c characters
.* means zero or more of any character

You can combine + and * in one pattern

a+b*c+ will match aabbbbc
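
The following sketch shows the practical difference between * and + on an empty value, which is why + tends to suit form validation. The anchors and variable names are assumptions for the example:

```javascript
const mixedQuantifiers = /^a+b*c+$/;
const zeroOrMoreDigits = /^\d*$/;
const oneOrMoreDigits = /^\d+$/;

const quantifierChecks = {
  fullRun: mixedQuantifiers.test("aabbbbc"),  // true
  noBs: mixedQuantifiers.test("ac"),          // true: b* allows zero b's
  noAs: mixedQuantifiers.test("bc"),          // false: a+ needs at least one a
  emptyWithStar: zeroOrMoreDigits.test(""),   // true: zero digits is allowed
  emptyWithPlus: oneOrMoreDigits.test(""),    // false: at least one digit required
  digitsWithPlus: oneOrMoreDigits.test("2024"), // true
};

console.log(quantifierChecks);
```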

Optional characters ‘?’

If you put ? after a character it becomes optional.
ab?c will match abc or ac.

If you want to match a plain ?, you need to escape it with \?

\d+ files? found\?
will match: 24 files found? (and also: 1 file found?)
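
Here is the same pattern run against a few hypothetical messages, as a rough check of the optional s and the escaped question mark:

```javascript
const filesFound = /\d+ files? found\?/;

const optionalChecks = [
  filesFound.test("24 files found?"), // true
  filesFound.test("1 file found?"),   // true: the "s" is optional
  filesFound.test("No files found."), // false: no digits and no literal "?"
];

console.log(optionalChecks);
```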

White spaces

( ) the space
(\t) the tab
(\n) the new line
(\s) matches any whitespace character, including the ones above

[1-3]\.\s+abc: here \s+ requires at least 1 whitespace character before abc.
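
A sketch of \s+ on numbered list items. It assumes the period after the number is meant literally, so the example escapes it as \. and adds anchors; the inputs are made up:

```javascript
// A digit from 1 to 3, a literal period, one or more whitespace characters, then abc.
const listItem = /^[1-3]\.\s+abc$/;

const whitespaceChecks = [
  listItem.test("1.   abc"), // true: several spaces
  listItem.test("2.\tabc"),  // true: a tab also counts as whitespace
  listItem.test("3.abc"),    // false: \s+ needs at least one whitespace character
];

console.log(whitespaceChecks);
```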

Lines that start and end with

^: matches at the beginning of the line
$: matches at the end of the line

This ^ (the starting "hat") is not the same as the ^ inside brackets, which excludes characters: [^abc]

^hello cat$ will only match a line that is exactly:
hello cat
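
As a rough illustration of how strict the anchors are, the sketch below compares ^hello cat$ with a looser variant, ^hello.*cat$, that allows any text between the two words. The looser pattern and the sample sentences are assumptions added for the example:

```javascript
const exactLine = /^hello cat$/;
const looseLine = /^hello.*cat$/;

const anchorChecks = {
  exactOnExact: exactLine.test("hello cat"),       // true
  exactOnLonger: exactLine.test("hello the cat"),  // false: extra word in between
  exactOnPrefix: exactLine.test("say hello cat"),  // false: does not start with hello
  looseOnLonger: looseLine.test("hello the cat"),  // true: .* covers " the "
  unanchored: /hello cat/.test("say hello cat now"), // true: no anchors, found anywhere
};

console.log(anchorChecks);
```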

Match and capture groups of characters

You can also extract information for further processing with regular expressions.

( and ) are special characters that capture the characters they enclose as a group. Groups are used to pull out emails, phone numbers, etc.

Whatever is matched inside the ( ) is captured.

^(file.+)\.pdf$ // matches a name that starts with "file", has at least 1 more character, and ends with .pdf. The () keeps only the file name, not the extension.
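
In JavaScript the captured text is available on the result of exec(). A minimal sketch, assuming the period before pdf is escaped so it only matches a literal dot, and using hypothetical file names:

```javascript
const pdfName = /^(file.+)\.pdf$/;

// exec() returns an array: index 0 is the whole match, index 1 is the first group.
const pdfHit = pdfName.exec("file_record_transcript.pdf");
const capturedName = pdfHit[1];

// exec() returns null when the string does not fit the pattern.
const pdfMiss = pdfName.exec("testfile_fake.pdf.tmp");

console.log(pdfHit[0]);    // "file_record_transcript.pdf"
console.log(capturedName); // "file_record_transcript"
console.log(pdfMiss);      // null
```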

Nested Groups

This is used to capture 2 pieces of information at the same time. For example, the full date (month and year) but also just the year.

\d: any digit
\w: any letter, digit, or underscore

To exclude digits, use [^\d] (or \D)

(\w+ (\d+)): this will capture month + year as group 1 and just the year as group 2
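
Groups are numbered by the position of their opening parenthesis, so the outer group comes first. A short sketch with a hypothetical date string:

```javascript
const monthYear = /(\w+ (\d+))/;
const nestedHit = monthYear.exec("Jan 1987");

const outerGroup = nestedHit[1]; // opened first: month and year
const innerGroup = nestedHit[2]; // opened second: year only

console.log(outerGroup); // "Jan 1987"
console.log(innerGroup); // "1987"
```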

Capture separate groups

1280x720
(\d+)x(\d+) will match the whole string but capture only 1280 and 720
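
One way to use the two captures in JavaScript is array destructuring on the match result. This is only a sketch; the variable names are hypothetical:

```javascript
const resolution = "1280x720";

// Skip index 0 (the whole match) and keep the two captured groups.
const [, widthText, heightText] = resolution.match(/(\d+)x(\d+)/);

// Captures are strings, so convert them before doing arithmetic.
const pixelCount = Number(widthText) * Number(heightText);

console.log(widthText, heightText); // "1280" "720"
console.log(pixelCount);            // 921600
```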

Adding conditional logic

You can use | inside ( and ) to add OR logic

I love (cats|dogs) // this will match I love cats OR I love dogs
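
A brief sketch of the alternation in JavaScript. The anchors and the extra "birds" sentence are assumptions added to show a non-match:

```javascript
const petPattern = /^I love (cats|dogs)$/;

const lovesCats = petPattern.test("I love cats");   // true
const lovesBirds = petPattern.test("I love birds"); // false: not one of the alternatives

// The parentheses also capture which alternative matched.
const whichPet = petPattern.exec("I love dogs")[1];

console.log(lovesCats, lovesBirds, whichPet); // true false "dogs"
```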

Other special characters

Common special characters are \d \w \s
Using the capital letter instead gives the opposite:

\D: any non-digit
\W: any character that is not a letter, digit, or underscore (like punctuation)
\S: any non-whitespace character

\b: boundary between a word character and a non-word character.
\w+\b would match an entire word.
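
The effect of \b is easiest to see next to the same pattern without it. A sketch with made-up sentences:

```javascript
const boundarySample = "a cat, concatenated";

// Without boundaries, "cat" is also found inside "concatenated".
const looseCat = boundarySample.match(/cat/g);
// With \b on both sides, only the standalone word matches.
const wholeCat = boundarySample.match(/\bcat\b/g);

// \w+\b collects whole words; \W collects everything that is not a word character.
const wordList = "hello, world!".match(/\w+\b/g);
const nonWordList = "hello, world!".match(/\W/g);

console.log(looseCat);    // ["cat", "cat"]
console.log(wholeCat);    // ["cat"]
console.log(wordList);    // ["hello", "world"]
console.log(nonWordList); // [",", " ", "!"]
```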

Reference your captured groups
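\0 full matched text
\1 1st group
\2 2nd group

These are the forms many text editors use; in a JavaScript replacement string they are written $&, $1, and $2.

For example, to swap the two numbers around a hyphen:
(\d+)-(\d+) replace with \2-\1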
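
In JavaScript, String.prototype.replace() refers to groups with $1, $2 and to the whole match with $&, while \1 is used inside the pattern itself as a backreference. A small sketch with hypothetical inputs:

```javascript
const pairPattern = /(\d+)-(\d+)/;

// $2-$1 swaps the two captured numbers.
const swappedPair = "10-20".replace(pairPattern, "$2-$1");
// $& inserts the whole matched text.
const wrappedPair = "10-20".replace(pairPattern, "[$&]");
// With the g flag, every pair in the string is swapped.
const swappedAll = "1-2, 3-4".replace(/(\d+)-(\d+)/g, "$2-$1");

// Inside a pattern, \1 means "the same text group 1 just matched".
const hasDoubleLetter = /(\w)\1/.test("hello");

console.log(swappedPair);     // "20-10"
console.log(wrappedPair);     // "[10-20]"
console.log(swappedAll);      // "2-1, 4-3"
console.log(hasDoubleLetter); // true ("ll")
```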

Matching specific filenames

(\w+)\.(jpg|png|gif)$

Matches all image files ending in .jpg, .png, or .gif
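
The sketch below filters a hypothetical file list and also shows why the period before the extension should be escaped: an unescaped dot would accept any character there. The ^ anchor is an addition for the example:

```javascript
const imageFile = /^(\w+)\.(jpg|png|gif)$/;
const unescapedDot = /^(\w+).(jpg|png|gif)$/;

// Hypothetical file names for illustration.
const fileNames = ["photo.jpg", "icon.png", "notes.txt", "anim.gif", "fakejpg"];

const imageNames = fileNames.filter((name) => imageFile.test(name));
const looseNames = fileNames.filter((name) => unescapedDot.test(name));

// Group 1 is the base name, group 2 is the extension.
const [, baseName, extension] = imageFile.exec("photo.jpg");

console.log(imageNames); // ["photo.jpg", "icon.png", "anim.gif"]
console.log(looseNames); // ["photo.jpg", "icon.png", "anim.gif", "fakejpg"]
console.log(baseName, extension); // "photo" "jpg"
```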

Trimming whitespaces from start and end of line

Use \s* to skip the whitespace at both ends, and ( ) with \S to capture the text in between, for example ^\s*(.*\S)\s*$
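
One possible way to do this in JavaScript is sketched below. The pattern ^\s*(.*\S)\s*$ is an assumption about what the capture could look like, not the only option; the built-in trim() does the same job and is shown for comparison:

```javascript
const paddedLine = "   The quick brown fox   ";

// Capture from the first non-space character up to the last one (\S).
const capturedText = paddedLine.match(/^\s*(.*\S)\s*$/)[1];

// Alternative: delete leading or trailing whitespace with replace().
const replacedText = paddedLine.replace(/^\s+|\s+$/g, "");

// The built-in string method, for comparison.
const builtInText = paddedLine.trim();

console.log(capturedText); // "The quick brown fox"
console.log(replacedText === builtInText); // true
```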

Extracting information from a log file

Parsing and extracting data from a URL

Using Regular Expressions in JavaScript
In JavaScript, RegExp is an object, so you can write it like:

var re = new RegExp("\\w+\\d+"); // backslashes are doubled inside a string

But the easier way to write it is with the literal shorthand:

let regVar = /\w+\d+/; // literal shorthand

This means: match at least 1 word character (letter, digit, or underscore) followed by at least 1 digit.
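
The two forms build the same pattern, which can be checked through the source property. The constructor is mainly useful when part of the pattern is only known at run time; the unit variable below is a hypothetical example of that:

```javascript
const fromLiteral = /\w+\d+/;
// Inside a string, each backslash must itself be escaped.
const fromConstructor = new RegExp("\\w+\\d+");

const samePattern = fromLiteral.source === fromConstructor.source;

// Building a pattern from a run-time value (hypothetical unit name).
const unit = "px";
const sizePattern = new RegExp("^\\d+" + unit + "$");

console.log(samePattern);               // true
console.log(sizePattern.test("12px"));  // true
console.log(sizePattern.test("12em"));  // false
```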

Mixing Regex with Javascript

RegExp.test()
This gives a boolean result: true or false

/\w+\d+/.test(string) // true or false

RegExp.exec()
This does the matching and returns the result

/\w+\d+/.exec(string) // returns a result array, or null if nothing matches. result[0] is the matched text; result.index is where the match starts in the string.

let str = 'July 13';
let regX = /([a-zA-Z]+) (\d+)/;
if (regX.test(str)) {
  console.log('It matches');
  console.log(regX.exec(str)[0]);
}

This will log:
It matches
July 13

RegExp Flags

g: global, finds every match instead of stopping at the first (for example by running RegExp.exec() repeatedly)
i: makes the regex case-insensitive
m: multiline, needed if the string contains new lines (\n) and ^ and $ should match at the start and end of each line
u: interprets the regular expression as Unicode code points

let regX = /regex/gi; // find all the matches, using case-insensitive matching.
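
A short sketch of the g, i, and m flags in action, using made-up strings. With g, each call to exec() resumes where the previous match ended, so a loop can collect every match along with its index:

```javascript
// g + i: every match, ignoring case.
const allCats = "Cat cat CAT".match(/cat/gi);

// m: ^ and $ match at each line, not only at the ends of the whole string.
const twoLines = "one\ntwo";
const withoutM = /^\w+$/.test(twoLines);
const eachLine = twoLines.match(/^\w+$/gm);

// g with exec(): loop until exec() returns null.
const digitRun = /\d+/g;
const foundRuns = [];
let runHit;
while ((runHit = digitRun.exec("a1 b22 c333")) !== null) {
  foundRuns.push([runHit[0], runHit.index]);
}

console.log(allCats);   // ["Cat", "cat", "CAT"]
console.log(withoutM);  // false
console.log(eachLine);  // ["one", "two"]
console.log(foundRuns); // [["1", 1], ["22", 4], ["333", 8]]
```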

String Functions

You can match, search and replace substrings with given regular expressions.

inputStr.match(/regex/flags)

inputStr.search(/regex/flags)

inputStr.replace(/regex/flags, string)

let str = 'deuUb465A';
let regX = /(\d+)/g;
let string2 = 'elephant';

var match = str.match(regX);
console.log(match);
// with the g flag, gives an array of every match: ['465']

var search = str.search(regX);
console.log(search);
// gives the index of the first match: 5

var replace = str.replace(regX, string2);
console.log(replace);
// replaces the matched digits with the word elephant: 'deuUbelephantA'

Written by John
