In response to my previous posts, I received a very nice request to explain the construction of the QuickTime Content-Type overflow rules. Every time I write a Snort rule I learn something about the protocol or about Snort, so it's worth it to me to dig a bit deeper. What follows is a discussion of the process I use to construct a complex rule.
To begin with, I use three main sources when writing a rule.
First is any public exploit code or vulnerability details. In this case, that is the PoC code posted on milw0rm's site.
Second, I use the RFC for the protocol in question. In this case it's RFC 2326.
Third, I open "Writing Snort Rules: How to Write Snort Rules and Keep Your Sanity" from the latest Snort manual.
From there it's like putting together a recipe. This is the important bit from the PoC code:
header = (
    'RTSP/1.0 200 OK\r\n'
    'CSeq: 1\r\n'
    'Date: 0x00 :P\r\n'
    'Content-Base: rtsp://0.0.0.0/1.mp3/\r\n'
    'Content-Type: %s\r\n'  # <-- overflow
    'Content-Length: %d\r\n'
    '\r\n')
If you read further into the PoC code, you can see that the overflow data (900+ bytes) gets stuffed in at the %s.
One of the important ideas in rule writing is to write for the vulnerability, and not the exploit. In other words, attackers can change the way the code is written, so you want to focus on the part that actually causes the software to misbehave. Looking at the code snippet:
tmp = "A" * 987
I could have simply looked for 987 "A"s. But the next exploit might use 987 "B"s or something else entirely.
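To make that concrete, here is a minimal Python sketch (Python only because the PoC is Python). The function name and the two sample payloads are made up for illustration and are not part of the PoC or of Snort. It shows a signature keyed to the PoC's filler bytes missing a trivially modified exploit:

```python
# Hypothetical helper for illustration; this is not how Snort is implemented.
def matches_filler_signature(payload: bytes) -> bool:
    """Exploit-specific check: looks for the PoC's exact 987 'A' filler."""
    return b"A" * 987 in payload


# Same overflow, different filler byte.
original = b"Content-Type: " + b"A" * 987 + b"\r\n"
variant = b"Content-Type: " + b"B" * 987 + b"\r\n"

print(matches_filler_signature(original))  # True
print(matches_filler_signature(variant))   # False
```

Both payloads would overflow the same buffer, but only the first is caught.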
In this case I decided to start by identifying that we are looking at an RTSP stream (rather than an HTTP stream, or some other type of traffic). That's what this does:
content: "rtsp/"; nocase; depth: 5;
That says look for the string "rtsp/" in the data portion of the packet, ignore UPPER/lower case (remember, attackers can change the exploit in the future), and stop looking after 5 bytes. We stop at 5 bytes for performance reasons; otherwise Snort would have to dig all the way through every packet to make sure it doesn't match.
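As a rough mental model, the three options can be approximated in a few lines of Python. This is only a sketch of the semantics described above, not Snort's actual matching engine, and the function name is hypothetical:

```python
# Hypothetical approximation of: content:"<pattern>"; nocase; depth:<depth>;
def content_nocase_depth(payload: bytes, pattern: bytes, depth: int) -> bool:
    """Case-insensitive search for `pattern` in the first `depth` payload bytes."""
    window = payload[:depth]          # depth: only look at the first N bytes
    return pattern.lower() in window.lower()  # nocase: compare in lower case


print(content_nocase_depth(b"RTSP/1.0 200 OK\r\n", b"rtsp/", 5))            # True
print(content_nocase_depth(b"HTTP/1.1 200 OK\r\nX: rtsp/\r\n", b"rtsp/", 5))  # False
```

Because the pattern is 5 bytes long and the depth is 5, the check only succeeds when the payload starts with "rtsp/".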
Then I want to identify that this is an attack rather than an ordinary RTSP stream. That's what this does:
content: "Content-Type: "; nocase; content:!"|0A|"; within: 50;
That says to look for the string "Content-Type: " (again, not case sensitive). Once that is found, look for a line feed (hex code 0A) in the next 50 bytes. If it doesn't find one, then this matches (that's what the ! operator does; it reads as "no line feed within 50 bytes"). Although response headers are terminated with \r\n (that's carriage return, line feed, or 0D 0A), the RFC states that clients should support a line feed only, so we will too in case an attacker tries the same thing.
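The same logic can be sketched in Python. The function name is hypothetical, and the sketch makes two assumptions that I have not verified against Snort's engine: every occurrence of "Content-Type: " is tried, and a payload that ends before 50 bytes without a line feed is treated as a match.

```python
# Hypothetical approximation of:
#   content:"Content-Type: "; nocase; content:!"|0A|"; within:50;
def content_type_overflow(payload: bytes, within: int = 50) -> bool:
    """True if some 'Content-Type: ' has no line feed in the next `within` bytes."""
    marker = b"content-type: "
    lowered = payload.lower()  # nocase; same length as payload for bytes
    pos = lowered.find(marker)
    while pos != -1:
        cursor = pos + len(marker)  # relative matching starts after the marker
        if b"\n" not in payload[cursor:cursor + within]:
            return True
        pos = lowered.find(marker, pos + 1)
    return False


normal = b"RTSP/1.0 200 OK\r\nContent-Type: application/sdp\r\n\r\n"
bare_lf = b"RTSP/1.0 200 OK\nContent-Type: application/sdp\n\n"
attack = b"RTSP/1.0 200 OK\r\nContent-Type: " + b"B" * 987 + b"\r\n\r\n"

print(content_type_overflow(normal))   # False
print(content_type_overflow(bare_lf))  # False
print(content_type_overflow(attack))   # True
```

Note that the filler byte no longer matters, and a header terminated with a bare line feed is still recognized as a normal header.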
Then I add on the other rule parts as necessary. I'm not going to go over each one here, as they are well documented in the Snort documentation. Also, look at existing rules for examples of how they are constructed. To highlight a main point from the Snort documentation - you'll want to put "inexpensive" operations before expensive ones, to reduce processing requirements. For example, it is faster for Snort to check the ports, hosts, and flow status (established,to_client) than for it to look for content through packet data. With thousands of rules processing millions of packets, performance can become a big issue.
Originally I wrote this as a single rule. Then I realized that it would be possible to evade detection using HTTP response splitting, although I suppose that would be RTSP response splitting in this case. By response splitting I mean that an attacker sends the data in multiple packets with the PSH flag set on each packet. Because Snort rules operate on packets rather than flows, if this were built into a single rule and the attacker split the "RTSP/" and "Content-Type: " strings into separate packets, the rule would never match.
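A small Python sketch of the evasion, assuming each packet is inspected on its own (it ignores any stream reassembly Snort may perform). The function and variable names are hypothetical:

```python
# Hypothetical single-rule check: both conditions must hold in the SAME packet.
def single_rule_matches(packet: bytes) -> bool:
    lowered = packet.lower()
    if not lowered.startswith(b"rtsp/"):      # content:"rtsp/"; nocase; depth:5;
        return False
    marker = b"content-type: "
    pos = lowered.find(marker)                # content:"Content-Type: "; nocase;
    if pos == -1:
        return False
    cursor = pos + len(marker)
    return b"\n" not in packet[cursor:cursor + 50]  # content:!"|0A|"; within:50;


response = (b"RTSP/1.0 200 OK\r\nCSeq: 1\r\nContent-Type: "
            + b"A" * 987 + b"\r\n\r\n")

# The attacker splits the response just before the Content-Type header.
split_at = response.index(b"Content-Type")
split_packets = [response[:split_at], response[split_at:]]

print(single_rule_matches(response))                        # True
print(any(single_rule_matches(p) for p in split_packets))   # False
```

The first packet has "RTSP/" but no Content-Type header, and the second has the overlong header but does not start with "RTSP/", so neither one satisfies the whole rule.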
Instead I use two rules, and use the flowbits keyword to tie them together. flowbits can label flows so that later rules can match the same or subsequent packets. Check out this discussion for the best description I've seen on using flowbits.
In the first rule I first check to see if is_proto_rtsp is set on the flow (if it is, there is no reason to run the content check). If it isn't and the content matches, we set is_proto_rtsp. We also use flowbits:noalert so that the rule only tags the flow and doesn't actually alert (we wouldn't want to alert on every RTSP flow!). On the second rule, we check to see if is_proto_rtsp is set. If it isn't, we stop there. In other words, we only check for the content-type overflow if the flow matched the first rule.
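The two-rule logic can be modeled in Python as per-flow state. This is a sketch under stated assumptions, not the actual rules: the FlowTracker class, the flow identifiers, and the helper name are hypothetical, and it assumes the tagging rule is evaluated before the alerting rule on each packet.

```python
from typing import Dict, Hashable, Set

FLOWBIT = "is_proto_rtsp"


def overflow_in_packet(packet: bytes, within: int = 50) -> bool:
    """'Content-Type: ' followed by no line feed in the next `within` bytes."""
    marker = b"content-type: "
    pos = packet.lower().find(marker)
    if pos == -1:
        return False
    cursor = pos + len(marker)
    return b"\n" not in packet[cursor:cursor + within]


class FlowTracker:
    """Hypothetical stand-in for Snort's per-flow flowbits storage."""

    def __init__(self) -> None:
        self.bits: Dict[Hashable, Set[str]] = {}

    def inspect(self, flow_id: Hashable, packet: bytes) -> bool:
        """Return True if the second (alerting) rule fires on this packet."""
        bits = self.bits.setdefault(flow_id, set())
        # Rule 1: only if the bit is not set yet; tag the flow, never alert.
        if FLOWBIT not in bits and packet[:5].lower() == b"rtsp/":
            bits.add(FLOWBIT)
        # Rule 2: skip the content check unless the flow was tagged.
        if FLOWBIT not in bits:
            return False
        return overflow_in_packet(packet)


tracker = FlowTracker()
first = b"RTSP/1.0 200 OK\r\nCSeq: 1\r\n"
second = b"Content-Type: " + b"A" * 987 + b"\r\n\r\n"

print(tracker.inspect("flow-1", first))    # False (flow tagged, no alert)
print(tracker.inspect("flow-1", second))   # True  (tagged flow, overlong header)
print(tracker.inspect("flow-2", second))   # False (flow never tagged as RTSP)
```

Because the tag lives on the flow rather than the packet, splitting "RTSP/" and the Content-Type header across packets no longer hides the attack.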
I hope this helps. Please drop me some feedback if you have any questions or suggestions.