How to tell if proxy pattern blocking is working
by Rich Sutton, November 19th, 2007
I occasionally get emails from new customers, usually just after they’ve installed the filter for the first time, that read like this:
“How come you guys don’t have proxy.example.com in your database?? I can’t believe you don’t block this proxy! It’s the first one I found!”
They are almost always correct when they say that we didn’t have some given proxy site in our Library. We have many, many thousands of proxy URLs in the Library, but certainly not all of them.
However, they are almost always wrong when they say that we don’t block it.
As I’ve covered before, a purely list-based approach to blocking proxies simply doesn’t work well enough to be considered a complete solution. The reason is the evolution of open-source, web-based proxies that are easily installed on home computers by non-technical users. For more info see this post.
The bottom line is, a student or employee can download one of these proxy packages for free, install it on his home computer, leave the computer on, then access it from a computer on the filtered school or work network, using just a browser. If no other web site in the world links to that web-based proxy on that user’s home computer, which is a good bet, then even Google won’t find it, much less a web-filtering vendor like us or Websense.
That’s why we detect proxies with packet signatures. We also call them “patterns”; that’s the term I’ll be using in this post.
Now, mea culpa: the reason many of our current and prospective customers can’t tell when our proxy pattern blocking is working is that we don’t make it intuitive. We are fixing that in the next version of the filter, due out before the end of the year.
But in the meantime, here’s an illustration of what you should see when a proxy is blocked by pattern. As an example, I’m going to use a proxy site that was reported to us today by one of our customers as “one we don’t block”. (Of course, we added it to the Library today — once we know a proxy URL, it goes in the Library — so you’ll have to use a different one to test.)
The key piece of information here is that proxy patterns don’t operate on the proxy site itself when it is initially viewed. They only kick in when the proxy site is used.
So this is what you see when you view that proxy site:
Since this site is not in the Library (yet), the filter won’t block viewing its main page. If the site were in the Library, you’d see the block page. The HTTP request sent when viewing the main page has nothing special in it that can be identified by pattern.
However, once you try to use the proxy to get to a banned site, you get blocked. To test this, use the proxy just like a student would, by typing “myspace.com” into the edit box labeled “URL” or “Go to this URL”.
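The difference between viewing a proxy and using it can be sketched in a few lines of Python. This is only an illustration of the idea, not how the filter is implemented: the `LIBRARY` contents, the `SIGNATURES` byte pattern, the `index.php?q=` request format, and the `check_request` function are all made up for the example. The sketch assumes a web proxy that carries the target URL base64-encoded in a query parameter.

```python
import base64

# Hypothetical stand-ins: a tiny URL list and one byte signature.
LIBRARY = {"known-proxy.example.com"}
# base64("http...") always starts with "aHR0c", so an encoded target URL
# in a "q" parameter shows up as this byte sequence (illustrative only).
SIGNATURES = [b"?q=aHR0c"]


def check_request(host: str, raw_request: bytes) -> str:
    """Return which check, if any, would stop this request."""
    if host in LIBRARY:
        return "library"   # known URL: blocked as soon as it is viewed
    if any(sig in raw_request for sig in SIGNATURES):
        return "pattern"   # unknown URL, but the request looks like proxy use
    return "pass"          # nothing recognizable in this request


# Viewing the main page of an unlisted proxy: nothing to match on.
main_page = b"GET / HTTP/1.1\r\nHost: proxy.example.com\r\n\r\n"

# Using the proxy: the banned target travels inside the request.
target = base64.b64encode(b"http://myspace.com").decode("ascii")
proxied = (
    f"GET /index.php?q={target} HTTP/1.1\r\n"
    "Host: proxy.example.com\r\n\r\n"
).encode("ascii")

print(check_request("proxy.example.com", main_page))  # pass
print(check_request("proxy.example.com", proxied))    # pattern
```

The point of the sketch is the ordering: an unlisted proxy sails through on its front page and only trips the second check once someone actually submits a URL through it.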
When the filter recognizes a proxy by pattern, it sends a TCP reset packet back to the browser, instead of a block page. This terminates the underlying connection between the browser and the proxy web site, which makes the browser think that some network error occurred. So you’ll see this:
Internet Explorer shows it a little differently:
The filter does this because it’s currently not smart enough to tell the difference between a packet signature match on HTTP versus non-HTTP traffic. For obvious reasons, sending a block page back to something other than a browser does not have the desired effect. The block page is HTML, so it can only be displayed by a browser.
We’ve fixed this in the next version of the R3000 (2.0.10), which is due out before the end of the year. In that version, the filter recognizes when a pattern hit has occurred on an HTTP session (irrespective of the port) and sends back the block page instead of a straight TCP reset. So, you’ll see this:
Note that the “Blocked URL” shows “pattern://<IP address>”.
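If you would rather script the test than eyeball a browser error, the two outcomes described above can be told apart roughly as follows. This is a minimal sketch under assumptions: `fetch` is a hypothetical callable you supply that requests a banned site through the proxy and returns the response body as text, and the check for the literal string "pattern://" assumes the "Blocked URL" value appears verbatim in the block page's HTML.

```python
from typing import Callable


def classify_proxy_test(fetch: Callable[[], str]) -> str:
    """Classify one attempt to reach a banned site through a proxy.

    `fetch` is a hypothetical callable that performs the request and
    returns the response body as text, or raises if the connection fails.
    """
    try:
        body = fetch()
    except ConnectionResetError:
        # Current behavior: the connection is torn down by a TCP reset,
        # so no page arrives at all.
        return "blocked by pattern (TCP reset)"
    except OSError as exc:
        # Timeouts, DNS failures and the like say nothing about the filter.
        return f"inconclusive ({type(exc).__name__})"
    if "pattern://" in body:
        # Assumed marker, based on the "Blocked URL" shown on the block page.
        return "blocked by pattern (block page)"
    return "no pattern block seen"
```

A reset can have other causes besides the filter, so treat a single "TCP reset" result as a strong hint rather than proof, and repeat the test against a site that is not banned for comparison.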
A couple of caveats are important here:
HTTPS proxies will still see the error page. This is because we are a pass-by filter and don’t do SSL termination, so we can’t inject a block page into the SSL stream. We still just send a TCP reset.
Client-based proxies (or any type of proxy that isn’t a browser) can’t receive and display a block page to the end user, so those will also still be terminated with TCP resets. The proxy program will simply complain that it can’t connect.
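Putting the new behavior and the two caveats together, the choice of response after a pattern hit can be summarized as a small decision function. This is a sketch of the rules as described in this post, not the filter's actual code; the function name and its three flags are hypothetical, with `block_page_support` standing in for "running a version that can send block pages on pattern hits".

```python
def response_to_pattern_hit(
    is_http: bool, is_ssl: bool, block_page_support: bool
) -> str:
    """Pick the response to a pattern hit, per the rules in this post."""
    if block_page_support and is_http and not is_ssl:
        # Plain HTTP session in a browser: a block page can be displayed.
        return "block page"
    # HTTPS streams can't be injected into, non-HTTP clients can't render
    # HTML, and older versions never send a block page on a pattern hit.
    return "tcp reset"


print(response_to_pattern_hit(is_http=True, is_ssl=False, block_page_support=True))   # block page
print(response_to_pattern_hit(is_http=True, is_ssl=True, block_page_support=True))    # tcp reset
print(response_to_pattern_hit(is_http=False, is_ssl=False, block_page_support=True))  # tcp reset
```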
Tags: anonymizers, circumventors, patterns, proxy


December 18th, 2007 at 3:41 pm
But not everything is 100%. As an 8e6 client that has had several different filtering products, I can say your product is one of the best I’ve seen for blocking proxies. But there are some site-based proxies that do still work (for example, delprox.com, which I sent in today for URL filtering). Could this be a new version of the proxy code that does not match the patterns?
BTW- Thanks for doing the blog. I’m finding very useful items here, Rich.
December 19th, 2007 at 5:57 pm
Thanks for the comment - we just had someone test out delprox.com and we’re blocking it by pattern. It looks like a standard PHProxy installation, so that’s what I would expect. I’ll ask one of our Tech Support folks to contact you to double check your config.
And by the way - I totally agree with your “not everything is 100%” comment. Nothing drives me crazier than when marketing departments (including ours) trot out the 100% effectiveness claim. As anybody who’s been in the trenches can tell you, security requires constant vigilance. For a security software vendor, this means the ability to quickly turn around product enhancements and improvements when the bad guys get out in front.
December 27th, 2007 at 10:51 am
[...] We’ve fixed one of the most confusing aspects of the product when it comes to figuring out whether or not our proxy pattern blocking is working. In previous versions of the filter, we would simply send a TCP reset back to the browser, which makes the user (or admin testing our product) think a network error occurred. Now, we’ll actually show a block page. The details, including screen shots, can be found in this previous post. [...]
January 9th, 2008 at 2:53 pm
[...] Anonymous proxies fit this model as well. Proxy packages are designed to be installed on users’ home computers and accessed from work (an overview of this problem is here and how we handle it is here). [...]