
Insecure Use of Regular Expressions

Why is this important?

Regular expressions (regex) are used in almost every application. Less well known is that a regex can expose an application to a Denial of Service (DoS) attack known as ReDoS. Depending on how the regex is written, the regex engine may need a very large amount of time to evaluate certain input strings.

For example, given the regex ^(a+)+$, the input "aaaaaaaaaaaaaaaaX" forces the regex engine to explore 65536 different paths before it can report that there is no match.
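The effect can be observed with a small timing experiment. The sketch below assumes a backtracking engine such as the one in Python's standard `re` module; the helper names `time_match` and `measure` are made up for this illustration, and the input lengths are deliberately small so the script finishes quickly. Absolute timings depend on the machine, so none are claimed here.

```python
import re
import time

# The vulnerable pattern from the example above.
VULNERABLE = re.compile(r"^(a+)+$")


def time_match(pattern, text):
    """Return (matched, elapsed_seconds) for a single match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


def measure(lengths=(10, 14, 18)):
    """Time the pattern on n 'a' characters followed by a trailing 'X'.

    The trailing 'X' guarantees that the match fails, which is what
    forces a backtracking engine to try the alternative paths.
    """
    results = []
    for n in lengths:
        matched, elapsed = time_match(VULNERABLE, "a" * n + "X")
        results.append((n, matched, elapsed))
    return results


if __name__ == "__main__":
    for n, matched, elapsed in measure():
        print(f"n={n:2d} matched={matched} elapsed={elapsed:.4f}s")
```

On a backtracking engine the elapsed time is expected to grow rapidly as n increases, so keep the lengths small when experimenting.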

A single request can therefore cause a large amount of computation on the server side. The problem with this regex, and others like it, is that the same input character can be accepted in two different ways: by the + (or *) inside the parentheses, or by the + (or *) outside the parentheses. As written, either + could consume any given 'a'.

To fix this, rewrite the regex to eliminate the ambiguity. In this case it can simply become ^a+$, which is presumably what the author meant anyway (one or more a's). Assuming that is what the original regex meant, the new regex can be evaluated quickly and is not subject to ReDoS.
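As a rough check that such a rewrite preserves behavior, both patterns can be compared on a handful of short inputs. This is a minimal sketch using Python's `re` module; the sample list and the helper names `same_verdict` and `rewritten_rejects_long_input` are assumptions for illustration. The original pattern is only ever run on short strings here, to avoid triggering the slowdown described above.

```python
import re

ORIGINAL = re.compile(r"^(a+)+$")   # ambiguous: nested quantifiers
REWRITTEN = re.compile(r"^a+$")     # unambiguous rewrite

# Short inputs only, so the original pattern stays cheap to evaluate.
SAMPLES = ["", "a", "aaaa", "aaX", "Xaa", "b", "aab"]


def same_verdict(text):
    """True if both patterns agree on whether `text` matches."""
    return (ORIGINAL.match(text) is None) == (REWRITTEN.match(text) is None)


def rewritten_rejects_long_input(n=50_000):
    """Run only the rewritten pattern on a long non-matching input."""
    return REWRITTEN.match("a" * n + "X") is None


if __name__ == "__main__":
    for sample in SAMPLES:
        print(repr(sample), "agree" if same_verdict(sample) else "DIFFER")
    print("long input rejected:", rewritten_rejects_long_input())
```

Agreement on a few samples is not a proof of equivalence, but it is a cheap guard against accidentally changing what the regex accepts.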

Read below to find out how to fix this issue in your code.

Fixing Insecure Use of Regular Expressions

To double-check whether a regular expression is really vulnerable, you can use recheck, a helpful web service for this purpose.

Option A: Create safe Regular Expressions

  1. Go through the issues that GuardRails identified in the PR.
  2. Identify any * or + operators that are close to each other (for example, one inside a group and one applied to that group) and do not serve clearly different purposes.
  3. Rewrite the regex to eliminate the ambiguities. Libraries like Safe-Regex can help with testing that a regex is safe.
  4. Test it and ensure the regex is still working as expected.
  5. Ship it 🚢 and relax 🌴
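Step 2 can be partly automated. The following is a hand-rolled heuristic, not the implementation used by Safe-Regex or recheck: the hypothetical function `has_nested_quantifier` flags a group that contains a * or + and is itself followed by a * or +. It ignores `{n,}` repetitions and ambiguous alternations such as (a|aa)+, and it can flag patterns that are in fact harmless, so treat a result as a hint for manual review only.

```python
def has_nested_quantifier(pattern: str) -> bool:
    """Heuristic: True if a group repeated with * or + contains a * or +."""
    # One flag per open group: has a * or + been seen inside it?
    # Index 0 represents the top level of the pattern.
    contains_repeat = [False]
    in_class = False  # currently inside a [...] character class?
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the escaped character
            continue
        if in_class:
            in_class = ch != "]"
            i += 1
            continue
        if ch == "[":
            in_class = True
        elif ch == "(":
            contains_repeat.append(False)
        elif ch == ")" and len(contains_repeat) > 1:
            inner = contains_repeat.pop()
            group_is_repeated = pattern[i + 1:i + 2] in ("*", "+")
            if inner and group_is_repeated:
                return True
            # A repetition inside the group also counts for the parent.
            contains_repeat[-1] = contains_repeat[-1] or inner
        elif ch in ("*", "+"):
            contains_repeat[-1] = True
        i += 1
    return False


if __name__ == "__main__":
    for candidate in [r"^(a+)+$", r"^a+$", r"([a-z]+)*", r"\(a+\)+"]:
        print(candidate, has_nested_quantifier(candidate))
```

A flagged pattern is a candidate for the rewrite in step 3; an unflagged pattern is not thereby proven safe.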

More information