Insecure Use of Regular Expressions
Why is this important?
Regular Expressions (Regex) are used in almost every application. Less well known is that a Regex can be exploited for a Denial of Service (DoS) attack, known as ReDoS. Backtracking regex engines can take a very long time to evaluate certain strings, depending on how the regex is defined.
For example, given the regex ^(a+)+$, the input "aaaaaaaaaaaaaaaaX" makes the regex engine analyze 65536 different paths.
Therefore, a single request can cause a large amount of computation on the server side. The problem with this regex, and others like it, is that the same input character can be accepted in two different ways: by the + (or a *) inside the parentheses, or by the + (or a *) outside the parentheses. As written, either + could consume the character 'a', so when the match fails the engine backtracks through every possible way of splitting the a's between the two. To fix this, rewrite the regex to eliminate the ambiguity. For example, it could simply be rewritten as ^a+$, which is presumably what the author meant anyway (one or more a's). Assuming that is what the original regex meant, the new regex can be evaluated quickly and is not subject to ReDoS.
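The difference can be observed with a small timing sketch. It assumes Python's built-in re module, which is a backtracking engine; the helper name time_match and the chosen input lengths are illustrative only, and the exact timings depend on the machine.

```python
import re
import time

# The ambiguous pattern from the text and its unambiguous rewrite.
AMBIGUOUS = re.compile(r"^(a+)+$")
SAFE = re.compile(r"^a+$")


def time_match(pattern, text):
    """Return (matched, seconds) for one match attempt (hypothetical helper)."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


# Each extra 'a' roughly doubles the work for the ambiguous pattern,
# while the rewritten pattern stays fast.
for n in (12, 14, 16, 18):
    text = "a" * n + "X"  # the trailing X forces the match to fail
    _, slow = time_match(AMBIGUOUS, text)
    _, fast = time_match(SAFE, text)
    print(f"n={n}: ambiguous {slow:.4f}s, safe {fast:.6f}s")
```

Both patterns reject every one of these inputs; only the time taken to reach that answer differs.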
Read below to find out how to fix this issue in your code.
Fixing Insecure Use of Regular Expressions
Option A: Create safe Regular Expressions
- Go through the issues that GuardRails identified in the PR.
- Identify any * or + operators that are close to each other without serving different purposes.
- Rewrite the Regex to eliminate the ambiguities. Libraries like Safe-Regex can help with testing that a Regex is safe.
- Test it and ensure the regex is still working as expected.
- Ship it 🚢 and relax 🌴
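For the testing step above, one possible approach is to compare the original and rewritten regex on every short string over a small alphabet. This is a minimal sketch rather than a complete test strategy: the function name first_disagreement, the alphabet, and the length limit are assumptions chosen for illustration, and inputs are kept short so that even the vulnerable regex is evaluated quickly.

```python
import re
from itertools import product


def first_disagreement(old, new, alphabet="aX", max_len=8):
    """Return the first short input on which the two patterns disagree.

    Returns None if they agree on every string up to max_len characters.
    Hypothetical helper: alphabet and max_len should be adapted to the
    characters the regex under test actually cares about.
    """
    old_re, new_re = re.compile(old), re.compile(new)
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            text = "".join(chars)
            if bool(old_re.match(text)) != bool(new_re.match(text)):
                return text
    return None


# The rewrite from the example accepts exactly the same short inputs.
print(first_disagreement(r"^(a+)+$", r"^a+$"))
```

A result of None only shows agreement on the inputs that were tried, so it complements the application's existing tests rather than replacing them.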