What are XSS Attacks?
XSS (cross-site scripting) attacks target the end user rather than your actual site. Vulnerable web applications that don't check or sanitize incoming data allow arbitrary code (such as JavaScript) to run on a client's computer. The end result can be anything from stealing cookie data or redirecting to a different site, to embedding a browser exploit on a page. Anything that can be done with JavaScript (a lot!) can be done if your application is vulnerable.
An Example
As always, it's easier to understand concepts when given an example -- so let's make one.
Bob, a website owner, has created a custom gallery script. He added a feature that lets his viewers comment on his photos by submitting a form. To enhance their messages, he lets them use certain codes to format their text (e.g. BBCode, like the codes we use here on WT). For the sake of simplicity, let's say he only lets users use an [img] code.
In his code, he converts the img code to HTML with this bit of regex:
PHP Code:
$message = preg_replace('#\[img\](.*?)\[/img\]#', '<img src="$1" />', $message);
So if a user enters:
My cat: [img]www.mysite.com/cat.jpg[/img]
Bob's script will output:
My cat: <img src="www.mysite.com/cat.jpg" />
Can you spot the problem? He does not check what the user inputs between the img codes, and he is blindly trusting the user to enter correctly formatted data.
One day, an evil site owner named Jack comes along. Jack is jealous of all of Bob's traffic and decides he wants to steal some of it. He recognizes the error Bob made in his script, and exploits it. In 20 minutes, Jack has replied to many of Bob's recent gallery entries with the following comment:
Code:
Hi bob, very nice pic! [img]http://www.google.com/images/logo.gif" onload="window.location='http://jacks-site.com/'[/img]
And, of course, Bob's comment script obediently turns it into HTML (everything from the image URL through the onload handler is Jack's input):
Code:
Hi bob, very nice pic! <img src="http://www.google.com/images/logo.gif" onload="window.location='http://jacks-site.com/'" />
And every time a user views one of Bob's most recent gallery photos, they are rudely redirected to Jack's site.
What Happened?
Since Bob's script didn't check Jack's input, it allowed Jack to insert his own HTML. By inserting a quote after the URL of his image (in this case, the Google logo), he closed the quote for the src attribute. Then he just added an onload attribute with code that redirects the user to his website once the image has loaded.
How do I Prevent XSS Attacks?
To prevent XSS attacks, you just have to check and sanitize all user-supplied data that you plan on using.
For starters, disallow all HTML. Use htmlspecialchars() to convert HTML characters into HTML entities. So characters like < and > that mark the beginning/end of a tag are turned into &lt; and &gt;. It is not enough to simply use strip_tags() to only allow some tags, as the function does not strip out harmful attributes like onclick or onload. Even an innocent-looking <strong> tag can contain some nasty code.
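To make the difference concrete, here is a minimal sketch comparing the two functions on the same input. The comment text is made up for illustration, and the ENT_QUOTES flag and UTF-8 charset are assumptions about how your pages are served.

```php
<?php
// Hypothetical comment submitted by a user.
$comment = '<strong onclick="alert(1)">Nice pic!</strong>';

// strip_tags() keeps the allowed tag together with all of its attributes,
// so the onclick handler survives untouched.
$stripped = strip_tags($comment, '<strong>');

// htmlspecialchars() converts <, >, & and quotes into entities, so the
// browser displays the markup as text instead of interpreting it.
$escaped = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

echo $stripped, "\n";
echo $escaped, "\n";
```

The first line printed still contains a live onclick attribute; the second contains only entities and is safe to drop into the page body.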
If you need to allow users to enter formatted text, then you have to create some sort of code like BBCode. But make sure you check and sanitize the output or else you'll suffer from vulnerabilities like Bob's. For example, if you have a [url] tag that inserts a link, make sure users don't enter something like
javascript:alert("Hello");
Make sure they enter valid URLs.
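As a rough sketch of how Bob could have handled his [img] code, the snippet below escapes the whole message first and only then converts the tag, checking the URL along the way. The function names is_safe_url() and render_comment() are hypothetical, and limiting URLs to http and https is an assumption, so adjust the list to your needs. The same approach should work for a [url] tag.

```php
<?php
// Hypothetical helper: accepts only absolute http(s) URLs.
function is_safe_url(string $url): bool
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    // Rejects javascript:, data: and any other unexpected scheme.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return in_array($scheme, ['http', 'https'], true);
}

// Hypothetical replacement for Bob's one-line preg_replace().
function render_comment(string $message): string
{
    // Escape everything first so no user-supplied HTML survives.
    $safe = htmlspecialchars($message, ENT_QUOTES, 'UTF-8');

    return preg_replace_callback(
        '#\[img\](.*?)\[/img\]#',
        function (array $m): string {
            // Undo the escaping only to validate the raw URL.
            $url = htmlspecialchars_decode($m[1], ENT_QUOTES);
            if (!is_safe_url($url)) {
                return $m[0]; // leave the tag as plain, escaped text
            }
            // Re-escape on output so a stray quote cannot close the attribute.
            return '<img src="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '" />';
        },
        $safe
    );
}

echo render_comment('My cat: [img]http://www.mysite.com/cat.jpg[/img]'), "\n";
```

Note that the validator insists on a full URL with a scheme, so an input like www.mysite.com/cat.jpg would be left as plain text. The URL check alone is not enough, because quotes can appear in otherwise valid URLs; re-escaping the URL when writing the attribute is what actually stops Jack's trick.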
The rule of thumb: if it will ever be output, check and sanitize it.