
Today we'll talk about regular expressions. What is a regular expression? Regular expressions are patterns used to match sequences of characters in strings.

Syntax for creating a regular expression:

var regexp = /template/; // without flags 
var regexp = /template/gmi; // with gmi flags  (we'll look at them further).

A template or pattern is the basis of a regular expression. This is a string that can be expanded with special characters that make the search much more flexible.

In the simplest case, if there are no flags and special characters, the pattern search is the same as the normal substring search.
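As a rough illustration of that equivalence, the sketch below compares a pattern without flags or special characters against the ordinary indexOf substring search. The sample text and variable names are made up for this example, and console.log is used instead of alert so that it also runs outside a browser.

```javascript
// Sample text invented for this illustration.
const text = "some new string";

// A pattern without flags or special characters...
const byRegexp = text.search(/new/);
// ...finds the same position as an ordinary substring search.
const bySubstring = text.indexOf("new");

console.log(byRegexp, bySubstring); // 5 5
```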

Flags

Regular expressions can have flags that affect the search. Flags are optional and none is set by default, but using them makes searching more flexible and powerful.
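The three flags from the syntax example are g (find all matches instead of stopping at the first one), i (ignore case) and m (multiline mode, in which ^ and $ match at the start and end of every line). Here is a minimal sketch of their effect; the sample string and variable names are assumptions made for the illustration.

```javascript
// Two lines of invented sample text.
const flagSample = "Some text\nsome more";

const noFlags = flagSample.match(/some/);        // case-sensitive, first match only
const ignoreCase = flagSample.match(/some/i);    // i: "Some" matches as well
const allMatches = flagSample.match(/some/gi);   // g + i: every match
const lineStarts = flagSample.match(/^some/gim); // m: ^ matches at each line start

console.log(noFlags[0], noFlags.index); // some 10
console.log(ignoreCase[0]);             // Some
console.log(allMatches.join(", "));     // Some, some
console.log(lineStarts.length);         // 2
```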

Next, we'll look at the methods that work with regular expressions. Most of them are built directly into ordinary strings (String), and one belongs to the RegExp object itself. The following built-in methods are available.

str.search(reg): returns the position of the first match, or -1 if nothing is found. It reports only the first match, even if the g flag is set.
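A short sketch of search in action, using an invented sample string and illustrative variable names:

```javascript
const sentence = "some string";

const found = sentence.search(/string/);      // position of the first match
const foundAnyCase = sentence.search(/STRING/i); // the i flag ignores case
const missing = sentence.search(/other/);     // no match
const withGlobal = "a-b-a".search(/a/g);      // g does not change the result

console.log(found);        // 5
console.log(foundAnyCase); // 5
console.log(missing);      // -1
console.log(withGlobal);   // 0
```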

str.match(reg): works differently depending on whether the g flag is present. First, let's look at the case without it.

In this case, str.match(reg) finds only the first match.

The result of the call is an array containing this match, with two additional properties: index, the position at which the match was found, and input, the string that was searched.

Example:

var str = "some string"; 
var reg = str.match(/some/); 
alert(reg[0]); // some (match in the string); 
alert(reg.index); // 0 (position); 
alert(reg.input); // "some string" (the string that was searched)

Next, consider str.match(reg) with the g flag. If the g flag is present, match returns a plain array of all matches.

The array has no additional properties in this case, and parentheses in the pattern do not produce additional elements.
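To make the difference concrete, the sketch below runs the same pattern with one pair of parentheses, first without g and then with it. The sample string and variable names are invented for the illustration.

```javascript
const groupSample = "some new string with something";

// Without g: element 0 is the full match, element 1 is the content of the parentheses.
const single = groupSample.match(/s(ome)/);
console.log(single[0], single[1], single.index); // some ome 0

// With g: only the full matches, no parenthesis contents, no index property.
const every = groupSample.match(/s(ome)/g);
console.log(every.join(", ")); // some, some
console.log(every.index);      // undefined

// If nothing matches, the result is null rather than an empty array.
const none = groupSample.match(/xyz/g);
console.log(none); // null
```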

Example:

var str = "some new string with something interesting"; 
var reg = str.match(/some/g); 
alert(reg); // some,some (the array ["some", "some"])

str.split(reg | substr, limit): splits a string into an array using a separator, which can be a regular expression reg or a substring substr. The optional limit caps the number of elements in the result.

Example:

alert('12-34-56'.split(/-/)); // 12,34,56 (the array ["12", "34", "56"])
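The following sketch shows both kinds of separator and the limit argument. Note that the elements of the result are strings, not numbers. The variable names are illustrative.

```javascript
const byRegexpSep = "12-34-56".split(/-/);  // regular expression separator
const bySubstrSep = "12-34-56".split("-");  // substring separator, same result
const limited = "12-34-56".split(/-/, 2);   // keep at most two elements

console.log(byRegexpSep.join(" | ")); // 12 | 34 | 56
console.log(bySubstrSep.join(" | ")); // 12 | 34 | 56
console.log(limited.join(" | "));     // 12 | 34
console.log(typeof byRegexpSep[0]);   // string
```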

str.replace(reg, str | func): a multipurpose tool for working with strings that handles search and replace of any complexity.

For example, with the g flag every match is replaced:

console.log('everyone likes, read'.replace(/, /g, ' to ')); // 'everyone likes to read'

When we need a more flexible, customizable replacement, we pass a function as the second argument. It is called for each match, and its result is inserted as the replacement.

For example:

var i = 0;
// replace each occurrence of "oy" (in any case) with the result of the function call
alert("Oy-Oy-oy".replace(/oy/gi, function () {
  return ++i;
})); // 1-2-3

This function receives the following arguments:

  • str - the match found,
  • p1, p2, ..., pn - the contents of the parentheses (if any),
  • offset - the position at which the match was found,
  • s - the original string.
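The arguments listed above can be seen in a small sketch that swaps a first name and a surname. The sample string, the seen object and the other variable names are assumptions made for this example.

```javascript
const fullName = "Name: John Smith";
let seen = null;

const reordered = fullName.replace(/(John) (Smith)/, function (str, p1, p2, offset, s) {
  // Remember what the function received so it can be inspected afterwards.
  seen = { str: str, p1: p1, p2: p2, offset: offset, s: s };
  return p2 + ", " + p1;
});

console.log(reordered);   // Name: Smith, John
console.log(seen.str);    // John Smith
console.log(seen.offset); // 6
console.log(seen.s);      // Name: John Smith
```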

regexp.test(str): the test method checks whether there is at least one match in the string str. It returns true or false.

In effect, it works the same way as the check str.search(reg) != -1, for example:

var str = "some string"; 
var regexp = /some/; 
alert(regexp.test(str)); // returns true
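A side-by-side sketch of the two checks, with illustrative variable names, suggests they agree both when a match exists and when it does not:

```javascript
const haystack = "some string";

const hitViaTest = /some/.test(haystack);
const hitViaSearch = haystack.search(/some/) != -1;
const missViaTest = /other/.test(haystack);
const missViaSearch = haystack.search(/other/) != -1;

console.log(hitViaTest, hitViaSearch);   // true true
console.log(missViaTest, missViaSearch); // false false
```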

Character Classes

A character class is a special notation that matches any character from a particular set.

The most commonly used are:

  • \d - a digit, a character from 0 to 9.
  • \s - a whitespace character, including tabs, line breaks, etc.
  • \w - a "word" character, or more precisely, a letter of the Latin alphabet, a digit or the underscore '_'. Non-Latin letters are not matched by \w, so a Russian letter, for example, does not fit.

Example:

var str = "The future of HTML6"; 
var reg = /html\d/i; 
alert(str.match(reg)); // HTML6

Another example:

alert("I love HTML5!".match(/\s\w\w\w\w\d/)); // ' HTML5' (the match includes the leading space)

Inverse classes

For each class there is an "inverse" class, written with the same letter in upper case.

"Inverse" means that it matches all other characters, for example:

  • \D - a non-digit, that is, any character other than \d, such as a letter.
  • \S - a non-whitespace character, that is, any character other than \s, such as a letter.
  • \W - any character other than \w, that is, not a Latin letter, not a digit and not an underscore. In particular, Russian letters belong to this class.

Example:

var str = "some, thing"; 
alert(str.replace(/\W/g, "")); // something 
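A common use of a class together with its inverse is cleaning up input. The sketch below, which uses an invented phone number and illustrative variable names, strips everything that is not a digit and then counts the digits.

```javascript
// Invented sample input.
const phone = "+7(903)-123-45-67";

// \D matches every non-digit, so replacing it with "" leaves only the digits.
const digitsOnly = phone.replace(/\D/g, "");
// \d with the g flag collects each digit as a separate match.
const digitCount = phone.match(/\d/g).length;

console.log(digitsOnly); // 79031234567
console.log(digitCount); // 11
```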

Sets and ranges

If in a regular expression several characters or character classes are enclosed in square brackets [...], then this means "looking for any character from the ones specified in [...]".

For example, [a-z] is an arbitrary character from a to z, [0-5] is a digit from 0 to 5.

In the example below, we look for "x" followed by two characters, each of which is a digit or a letter from A to F:

// find "xAF"
alert("Exception 0xAF".match(/x[0-9A-F][0-9A-F]/g));

Excluding ranges: besides the usual ranges, there are also excluding ones, written [^...].

Square brackets that begin with a caret, [^...], match any character other than the ones specified.

alert("alice15@gmail.com".match(/[^\d\sA-Z]/gi)); // "@", "."
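Two details are easy to miss here: ranges are case-sensitive unless the i flag is set, and an excluding range matches everything that is not listed. A minimal sketch, with sample strings and variable names invented for the illustration:

```javascript
const hexSample = "Codes: 0xAF and 0x1c";

// Only upper-case A-F is listed, so "x1c" is not found.
const upperOnly = hexSample.match(/x[0-9A-F][0-9A-F]/g);
// Listing a-f as well (or adding the i flag) finds both.
const anyCase = hexSample.match(/x[0-9A-Fa-f][0-9A-Fa-f]/g);

// An excluding range: remove every character that is not a lower-case Latin letter.
const lettersOnly = "alice15@gmail.com".replace(/[^a-z]/g, "");

console.log(upperOnly.join(", ")); // xAF
console.log(anyCase.join(", "));   // xAF, x1c
console.log(lettersOnly);          // alicegmailcom
```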

Today we looked at what regular expressions are and how they are used. I hope I managed to convince you that regular expressions are a simple and interesting tool that can greatly simplify the life of any developer.
