Select Right HTTP Proxy Design

Title: How to select the right HTTP proxy design
Author(s): Christian
Date: 9 Sept. 2011
Version(s): all
Zentyal profiles: Gateway

Before starting this discussion, let's clarify some points. Because a proxy is one more component between client and server, it will not improve performance unless caching is enabled and a significant number of users benefit from the cache. Latency will not be shorter except for pages served from the cache, and more and more pages carry “PRAGMA NO-CACHE” tags [0], so fewer objects are stored in the cache :-(
On the other hand, a proxy brings a lot of added value in terms of security and control.

Let's assume the proxy is deployed on a Zentyal server with one connection inside (Intranet) and one connection outside (Internet), as described in the “Perfect Zentyal Gateway setup” document [http://trac.zentyal.org/wiki/Documentation/Community/HowTo/GatewaySetup].

One design will not be discussed here: a deployment where the proxy is enabled but Zentyal is not the default gateway on your LAN. With such a design, using the proxy in explicit mode is strongly advised unless you want to introduce potentially complex routes. The reason is simple: a transparent proxy will not work unless you redirect the traffic at default gateway level so that it reaches Zentyal.

Transparent proxy mode:

This mode intercepts [1], thanks to the firewall [2], all requests sent to the Internet and redirects them to the proxy listening port (3128 by default in Zentyal).

Pros:

No need to define anything on client machines at browser level.
Users may not even know there is a proxy in the middle.

Cons:

HTTPS traffic can NOT be handled by a transparent proxy. Firewall rules must be added to permit direct access from the browser to HTTPS servers.
As a result, filtering defined at proxy level doesn't apply to HTTPS and must be managed, by IP address, at firewall level.
A transparent proxy MUST be deployed on the subnet's default gateway, otherwise clients will never reach it.
As the proxy is transparent, no authentication is possible, and therefore no profiling based on user name or group membership can apply. This also means no access control.
The client needs to perform a DNS request even if the page will be found in the HTTP proxy cache.
It doesn't work (easily) if Zentyal has only one NIC (Network Interface Controller, i.e. network card).

Explicit (non transparent) proxy mode:

In this mode, the browser “knows” there is a proxy to be used. Different mechanisms can be used to automate client setup.

Pros:

The proxy can be deployed anywhere on the Intranet, with no need to match the default gateway IP (thus it works with a "single NIC Zentyal").
Authentication, and therefore access control and profiling, can be enabled.
HTTPS is handled by the proxy, so there is no need for extra firewall rules. Content filtering doesn't work because of the encrypted session between client and server (TLS), but domain filtering works.
DNS requests come from the proxy rather than from the client (better use of the shared DNS cache).
WPAD can bring additional control over how the proxy is used.

Cons:

Browser configuration: if the browser is not configured to use the proxy, it doesn't work.
Users are aware that a proxy is used (and therefore that control and logging can be enabled).
HTTPS to "non standard" HTTPS ports (i.e. any port other than 443) requires customized configuration (safe ports in squid.conf).
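
To illustrate why domain filtering still works for HTTPS in explicit mode while content filtering does not, here is a minimal sketch of what a proxy can see: the browser sends a CONNECT request naming only the target host and port, and everything after that is encrypted. The policy values (ALLOWED_TLS_PORTS, BLOCKED_DOMAINS) and the function names are hypothetical and chosen for illustration; this is not how Squid is implemented, and it does not read squid.conf.

```javascript
// Hypothetical policy values, for illustration only.
const ALLOWED_TLS_PORTS = new Set([443]);
const BLOCKED_DOMAINS = ["blocked.example"];

// Parse a CONNECT request line such as "CONNECT host:443 HTTP/1.1".
// Returns null for anything else (IPv6 literals are not handled in this sketch).
function parseConnect(requestLine) {
  const match = /^CONNECT\s+([^\s:]+):(\d+)\s+HTTP\/\d\.\d$/.exec(requestLine.trim());
  if (match === null) {
    return null;
  }
  return { host: match[1].toLowerCase(), port: Number(match[2]) };
}

// Decide using only what is visible before encryption starts: host and port.
function decideConnect(requestLine) {
  const target = parseConnect(requestLine);
  if (target === null) {
    return "DENY (not a CONNECT request)";
  }
  if (!ALLOWED_TLS_PORTS.has(target.port)) {
    return "DENY (port not allowed)";
  }
  const blocked = BLOCKED_DOMAINS.some(
    (domain) => target.host === domain || target.host.endsWith("." + domain)
  );
  return blocked ? "DENY (domain blocked)" : "ALLOW";
}

console.log(decideConnect("CONNECT www.example.com:443 HTTP/1.1"));
console.log(decideConnect("CONNECT www.example.com:8443 HTTP/1.1"));
console.log(decideConnect("CONNECT ads.blocked.example:443 HTTP/1.1"));
```

The second call is refused only because port 8443 is missing from the allowed set, which mirrors the last point above: non-standard HTTPS ports need to be added to the proxy configuration explicitly.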

This is shown with this drawing:

Making a decision

If your main criterion is not having to manually change Zentyal configuration, then transparent proxy is the best choice.
If you need to provide different profiles for users (filtering, access control), then explicit proxy is the only choice.

For the sake of discussion, let's assume this last point is your choice but you don't want to modify each and every client browser either...

In large environments, maintaining client-side configuration can be painful and time-consuming. Many services aim at easing this:

DNS exists to avoid managing a local /etc/hosts file on each machine.
DHCP exists to avoid configuring an IP address on each device. (The IP address is only a shortcut here: DHCP can manage many more network-related settings.)

Similarly, some mechanisms exist to help with proxy configuration in browsers.

If we look at Firefox (IE provides very similar settings ;-) ), we have 5 different options[3]:

1. No proxy

2. Manual proxy configuration

3. Use system proxy settings

4. Automatic proxy configuration URL

5. Auto-detect proxy settings for this network.

Let's have a closer look at each one:

1. No proxy
Not very interesting here as the goal is to use a proxy :-)

2. Manual proxy configuration
This is the potentially painful approach. It has to be done on each and every machine. What you put there is an IP address and a port number. If either changes, you have to update it everywhere :-( No admin wants to do that!

3. Use system proxy settings
Default setting (if I'm not wrong) for Firefox. Useful if the proxy is already defined at system level.

4. Automatic proxy configuration URL
Default setting (if I'm not wrong) for IE. This one is definitely better than manual configuration because the URL is less likely to change, even if the proxy IP address or port changes. This URL gives access to a special file, proxy.pac, whose rules describe how the browser should behave.

5. Auto-detect proxy settings for this network (known as WPAD: Web Proxy Auto Discovery. Have a look at http://www.wrec.org/Drafts/draft-cooper-webi-wpad-00.txt )

This is an extension of the previous mechanism, but the URL is not even stored at browser level. It is provided by (in this order):

DHCP: option 252
DNS: multiple mechanisms can be used here.
SLP (Service Location Protocol)
Well known aliases (the browser will search for a DNS entry describing “wpad.yourdomain”[4]. This requires the machine to be known as machine.domain, otherwise the domain is unknown).
Service: URLs (DNS TXT and SRV records)

DHCP and DNS “Well known aliases” are the only two mechanisms that the draft RFC makes mandatory for web clients.

WPAD is very flexible and powerful but has one constraint (shared with the proxy configuration URL): the proxy.pac file (or wpad.dat) has to be written and stored on a web server.

That's it for the concept part, let's try to implement it now...

Part two: implementation

We will look here at “auto detect proxy” mechanism, i.e. WPAD.

The first step is to set up a web server for wpad.yourdomain.com[4]. This can be done with the Zentyal web server module → Virtual host → wpad.yourdomain.com[4]. This server is mandatory because it serves your wpad.dat file.

Then decide which method fits you best:

Either DHCP + WPAD (1+3 hereafter)
or DNS + WPAD (2+3)
although I would strongly suggest implementing 1+2+3 in order to ensure wide coverage if you have different clients like Windows, Linux, smartphones and tablet PCs.

1 - DHCP is the mechanism clients try first... but Zentyal doesn't make it easy to configure new DHCP options. You can still do it manually in /usr/share/ebox/stubs or, better, using hooks: http://trac.zentyal.org/wiki/Documentation/Community/HowTo/CustomizeConfigFiles

2 – DNS implementation with the “well known aliases” method is easier, especially when clients use DHCP, because DHCP provides a consistent domain name and the browser will then search for wpad.(whatever).yourdomain[4]. Let's make it clearer:

your domain is "mydomain.com"
the client, thanks to DHCP, is known as "client.private.mydomain.com"
the WPAD mechanism will search in DNS for: wpad.private.mydomain.com and wpad.mydomain.com

If you have set up such a name in your DNS, pointing to the web server described above, you're done :-)
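
The search described above can be sketched as a small function. This is a simplified illustration, not the algorithm of any particular browser: the function name wpadCandidates and the rule "stop when only two labels remain" are assumptions made for this example, and real clients may stop earlier or later (for instance with country-code domains).

```javascript
// Build the list of "well known alias" host names a client might try,
// starting from its own fully qualified name.
// Assumption: we stop once only two labels (e.g. "mydomain.com") remain.
function wpadCandidates(clientFqdn) {
  const labels = clientFqdn.toLowerCase().replace(/\.$/, "").split(".");
  const candidates = [];
  // i = 1 drops the host label; each further step drops one more leading label.
  for (let i = 1; labels.length - i >= 2; i++) {
    candidates.push("wpad." + labels.slice(i).join("."));
  }
  return candidates;
}

// Turn each candidate host into the URL where wpad.dat would be fetched.
function wpadUrls(clientFqdn) {
  return wpadCandidates(clientFqdn).map((name) => "http://" + name + "/wpad.dat");
}

console.log(wpadCandidates("client.private.mydomain.com"));
// [ 'wpad.private.mydomain.com', 'wpad.mydomain.com' ]
```

A machine known only by a short name (no domain part) yields an empty list here, which matches the earlier remark that the domain must be known for this method to work.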

Starting with Zentyal 2.2, you can also improve DNS based discovery by maintaining SRV and/or TXT records using Zentyal GUI.

wpad            IN      A       192.168.0.10  ; your wpad address here... if CNAME is not used
                IN      TXT     "service: wpad:http://wpad.yourdomain/proxy.pac"
_wpad._tcp      IN      SRV     0 0 80 wpad.yourdomain.

Please notice the "dot" at the end of SRV record...

Notice that wpad doesn't exist by default in /etc/services. You will have to edit this file and add:

wpad            3128/tcp        wpad            # http proxy

3 – Last step: create a wpad.dat file and store it at the root of your wpad.yourdomain[4] web server, that's it.

Generic proxy.pac or wpad.dat example:

function FindProxyForURL(url, host)
{
   // Hosts on the local subnet 192.168.0.0/24 are reached without the proxy.
   if (isInNet(host, "192.168.0.0", "255.255.255.0")) {
      return "DIRECT";
   }
   // HTTP, HTTPS and FTP requests go through the proxy on Zentyal.
   if (shExpMatch(url, "http:*") ||
       shExpMatch(url, "https:*") ||
       shExpMatch(url, "ftp:*")) {
      return "PROXY zentyal.yourdomain.com:3128";
   }
   // Any other protocol bypasses the proxy.
   return "DIRECT";
}

The above example says:

for anything on subnet 192.168.0.0/24, no proxy
for anything else using the HTTP, HTTPS or FTP protocol, go to zentyal.yourdomain.com on port 3128 (this is the proxy on Zentyal)

Should you need to test your PAC file, go there: http://code.google.com/p/pactester/
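
For a quick check without any extra tool, the PAC logic can also be exercised outside a browser. The sketch below is only an approximation: isInNet and shExpMatch are normally provided by the browser, so simplified stand-ins are defined here (this isInNet accepts dotted IPv4 addresses only and does no DNS resolution, unlike the real helper). The host names and addresses used in the calls are made up for the example.

```javascript
// Convert a dotted IPv4 string to an unsigned 32-bit integer, or null if invalid.
function ipToInt(ip) {
  const parts = ip.split(".").map(Number);
  if (parts.length !== 4 || parts.some((p) => !Number.isInteger(p) || p < 0 || p > 255)) {
    return null;
  }
  return ((parts[0] << 24) | (parts[1] << 16) | (parts[2] << 8) | parts[3]) >>> 0;
}

// Simplified stand-in: true when host (an IPv4 literal) is inside pattern/mask.
function isInNet(host, pattern, mask) {
  const h = ipToInt(host);
  const p = ipToInt(pattern);
  const m = ipToInt(mask);
  if (h === null || p === null || m === null) {
    return false;
  }
  return ((h & m) >>> 0) === ((p & m) >>> 0);
}

// Simplified stand-in: shell-style match where * is any run and ? is one character.
function shExpMatch(str, shexp) {
  const escaped = shexp.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const pattern = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp("^" + pattern + "$").test(str);
}

// Same rules as the wpad.dat example in this document.
function FindProxyForURL(url, host) {
  if (isInNet(host, "192.168.0.0", "255.255.255.0")) {
    return "DIRECT";
  }
  if (shExpMatch(url, "http:*") || shExpMatch(url, "https:*") || shExpMatch(url, "ftp:*")) {
    return "PROXY zentyal.yourdomain.com:3128";
  }
  return "DIRECT";
}

console.log(FindProxyForURL("http://192.168.0.25/", "192.168.0.25"));
console.log(FindProxyForURL("https://www.example.com/", "www.example.com"));
console.log(FindProxyForURL("gopher://gopher.example.com/", "gopher.example.com"));
```

Saved to a file and run with Node.js, this should print DIRECT for the local address, the PROXY line for the HTTPS URL, and DIRECT for the unlisted protocol.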

Having this wpad.yourdomain.com running on Zentyal is easy: just configure a virtual host in the web server section.
Notice that, depending on client configuration, some clients will search for http://wpad.yourdomain.com while others will search for http://wpad without the domain part. To cover as many clients as possible, it is a good idea to maintain both.

Some hints: not exposing your wpad.dat file on the Internet is a good idea ;-) This requires either running the wpad.yourdomain.com server internally or not binding this server (or virtual host, if run on Zentyal) to the external interface. In environments where security is highly critical, not using WPAD is safer, because its “auto-discovery” approach opens the door to attacks, especially if Dynamic DNS is enabled. The cost is more manual administration overhead.

Notes

[0] the PRAGMA NO-CACHE directive tells proxy and browser not to cache dynamic content
[1] assuming the Zentyal Intranet address is defined as the default gateway for machines on the Intranet
[2] the firewall is used to intercept requests. If the firewall is stopped, no redirection occurs.
[3] http://support.mozilla.com/en-US/kb/Options%20window%20-%20Advanced%20panel
[4] replace 'yourdomain' with your real domain name.
