Reverse Proxy HowTo
Introduction
The Apache HTTP Server module mod_jk and its ISAPI and NSAPI redirector variants for Microsoft IIS and the iPlanet Web Server connect the web server to a backend (typically Tomcat) using the AJP protocol. The web server receives an HTTP(S) request and the module forwards the request to the backend. This function is usually called a gateway or a proxy; in the context of HTTP it is called a reverse proxy.
Typical Problems
A reverse proxy is not totally transparent to the application on the backend. For instance the host name and port the original client (e.g. browser) needs to talk to belong to the web server and not to the backend, so the reverse proxy talks to a different host name and port. When the application on the backend returns content including self-referential URLs using its own backend address and port, the client will usually not be able to use these URLs.
Another example is the client IP address, which for the web server is the source IP of the incoming connection, whereas for the backend the connection always comes from the web server. This can be a problem when the backend application uses the client IP, e.g. for security reasons.
AJP as a Solution
Most of these problems are automatically handled by the AJP protocol and the AJP connectors of the backend. The AJP protocol transports this communication metadata and the backend connector presents this metadata whenever the application asks for it using Servlet API methods.
The following list contains the communication metadata handled by AJP and the ServletRequest/HttpServletRequest API calls which can be used to retrieve them:
- local name: getLocalName(). This is also equal to getServerName(), unless a Host header is contained in the request. In this case the server name is taken from that header.
- local IP address: getLocalAddr(). The local IP address was initially not supported. It is available when using version 1.2.41 for Apache or IIS together with Tomcat version at least 6.0.42, 7.0.55 or 8.0.11. For older versions or when using the NSAPI redirector, getLocalAddr() will incorrectly return the same result as getLocalName(). As a workaround you can forward the local IP address by setting JkEnvVar SERVER_ADDR and then either using request.getAttribute("SERVER_ADDR") instead of getLocalAddr() or wrapping the request using a filter and overriding getLocalAddr() with request.getAttribute("SERVER_ADDR").
- local port: getLocalPort(). This is also equal to getServerPort(), unless a Host header is contained in the request. In this case the server port is taken from that header if it contains an explicit port, or is equal to the default port of the scheme used.
- client address: getRemoteAddr()
- client port: getRemotePort(). The remote port was initially not supported. It is available when using version 1.2.32 for Apache or IIS together with Tomcat version at least 5.5.28, 6.0.20 or 7.0.0. For older versions or when using the NSAPI redirector, getRemotePort() will incorrectly return 0 or -1. As a workaround you can forward the remote port by setting JkEnvVar REMOTE_PORT and then either using request.getAttribute("REMOTE_PORT") instead of getRemotePort() or wrapping the request using a filter and overriding getRemotePort() with request.getAttribute("REMOTE_PORT").
- client host: getRemoteHost()
- authentication type: getAuthType()
- remote user: getRemoteUser(), if tomcatAuthentication="false"
- protocol: getProtocol()
- HTTP method: getMethod()
- URI: getRequestURI()
- HTTPS used: isSecure(), getScheme()
- query string: getQueryString()
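The two workarounds mentioned in the list for older versions could be configured roughly as follows. This is a minimal sketch that uses only the JkEnvVar directive and the variable names given above; it assumes the variables are present in the web server's environment when the request is handled.

```apache
# Forward the web server's local IP address and the client port as
# request attributes, for setups where getLocalAddr() or getRemotePort()
# do not return the correct values.
JkEnvVar SERVER_ADDR
JkEnvVar REMOTE_PORT
```

On the backend the values can then be read with request.getAttribute("SERVER_ADDR") and request.getAttribute("REMOTE_PORT").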
The following SSL-related data is only available if the Apache HTTP Server is configured with SSLOptions +StdEnvVars. For the certificate information you also need to set SSLOptions +ExportCertData.
- SSL cipher: getAttribute("javax.servlet.request.cipher_suite")
- SSL key size: getAttribute("javax.servlet.request.key_size"). Can be disabled using JkOptions -ForwardKeySize.
- SSL client certificate: getAttribute("javax.servlet.request.X509Certificate"). If you want the whole certificate chain, then you also need to set JkOptions ForwardSSLCertChain. In this case you will likely also need to increase the maximum AJP packet size using the worker attribute max_packet_size.
- SSL session ID: getAttribute("javax.servlet.request.ssl_session"). This attribute is specific to Tomcat; it has not yet been standardized.
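As a sketch, the Apache HTTP Server side of a setup that forwards the SSL data including the whole client certificate chain might look like this. It only combines the directives named above; where you place them (server, virtual host or directory context) depends on your configuration.

```apache
# Export the standard SSL variables and the client certificate data
# so that mod_jk can pick them up.
SSLOptions +StdEnvVars +ExportCertData

# Forward the complete client certificate chain, not only the
# client certificate itself.
JkOptions +ForwardSSLCertChain
```

The max_packet_size attribute is not part of this fragment, because it is a worker attribute and therefore belongs in the worker configuration.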
Fine Tuning
In some situations this is not enough though. Assume there is another less clever reverse proxy in front of your web server, for instance an HTTP load balancer or similar device which also serves as an SSL accelerator.
Then you are sure that all your clients use HTTPS, but your web server doesn't know about that. All it can see is requests coming from the accelerator using plain HTTP.
Another example would be a simple reverse proxy in front of your web server, so that the client IP address that your web server sees is always the IP address of this reverse proxy, and not of the original client. Often such reverse proxies generate an additional HTTP header, like X-Forwarded-For, which contains the original client IP address (or a list of IP addresses, if there are more cascading reverse proxies in front). It would be nice if we could use the content of such a header as the client IP address to pass to the backend.
So we might need to manipulate some of the data that AJP sends to the backend. When using mod_jk inside the Apache HTTP Server you can use several Apache environment variables to let mod_jk know, which data it should forward. These environment variables can be set by the configuration directives SetEnv or SetEnvIf, but also in a very flexible way using mod_rewrite (since Apache 2.x it can not only test against environment variables, but also set them).
The following list contains all environment variables mod_jk checks, before sending data to the backend:
- JK_LOCAL_NAME: the local name
- JK_LOCAL_PORT: the local port
- JK_REMOTE_HOST: the client host
- JK_REMOTE_ADDR: the client address
- JK_AUTH_TYPE: the authentication type
- JK_REMOTE_USER: the remote user
- HTTPS: On (case-insensitive) to indicate, that HTTPS is used
- SSL_CIPHER: the SSL cipher
- SSL_CIPHER_USEKEYSIZE: the SSL key size
- SSL_CLIENT_CERT: the SSL client certificate
- SSL_CLIENT_CERT_CHAIN_: prefix of variable names containing the client certificate chain
- SSL_SESSION_ID: the SSL session ID
Remember: in general you don't need to set them. The module retrieves the data automatically from the web server. Only in case you want to change this data, you can overwrite it by using these variables.
Some of these variables might also be used by other web server modules. All variables whose name does not begin with "JK" are set directly by the Apache HTTP Server. If you want to change the data, but do not want to negatively influence the behaviour of other modules, you can change the names of all variables mod_jk uses to private ones. For the details see the Apache reference page.
All variables that are not SSL-related were introduced in version 1.2.27.
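For the two examples from the beginning of this section, overriding the forwarded data with SetEnvIf could look like the following sketch. The header name X-Forwarded-Proto is an assumption about what the device in front of the web server sends; it is not defined by mod_jk. The sketch also assumes that this device removes any such headers sent by the client, because otherwise clients could forge the values.

```apache
# Assumption: the SSL accelerator marks HTTPS requests with the
# header "X-Forwarded-Proto: https".
# Tell mod_jk that HTTPS was used by the original client.
SetEnvIf X-Forwarded-Proto "^https$" HTTPS=on

# Forward the first address listed in X-Forwarded-For as the
# client address instead of the address of the proxy in front.
SetEnvIf X-Forwarded-For "^\s*([^,\s]+)" JK_REMOTE_ADDR=$1
```

Note that HTTPS is one of the variables that other modules might use as well, so setting it can influence their behaviour.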
In addition there are two special shortcuts to influence the client IP address that is forwarded.
Using JkOptions ForwardLocalAddress you can forward the local IP address of the web server as the client IP address. This can be useful, e.g. when using the Tomcat remote address valve for allowing connections only from registered Apache HTTP Servers.
Using JkOptions ForwardPhysicalAddress you always forward the physical peer IP address as the client address. By default mod_jk uses the logical address as provided by the web server. For example the module mod_remoteip sets the logical IP address to the client IP forwarded by proxies in the X-Forwarded-For header.
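In the configuration, the two shortcuts might be used like this. The sketch enables the first one and shows the second one commented out, on the assumption that you would normally choose only one of them.

```apache
# Forward the web server's own IP address as the client address,
# e.g. for use with the Tomcat remote address valve.
JkOptions +ForwardLocalAddress

# Alternative: always forward the physical peer IP address, even if
# a module such as mod_remoteip has changed the logical client address.
#JkOptions +ForwardPhysicalAddress
```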
Tomcat AJP Connector Settings
As an alternative to using the environment variables described in the previous section (which exist only when using Apache), you can also configure Tomcat to overwrite some of the communication data forwarded by mod_jk. The AJP connector in Tomcat's server.xml lets you set the following properties:
- proxyName: server name as returned by getServerName()
- proxyPort: server port as returned by getServerPort()
- scheme: protocol scheme as returned by getScheme()
- secure: set to "true", if you wish isSecure() to return "true".
URL Handling
URL Rewriting
Sometimes one wants to change path components of the URLs under which an application is available. Especially if a web application is deployed as some context, say /myapp, marketing often prefers short URLs and wants the application to be directly available under http://www.mycompany.com/. Although you can deploy the application as the so-called ROOT context, which will be directly available at "/", admins often prefer not to use the ROOT context, e.g. because only one application can be the root context (per host).
The procedure to change the URLs in the reverse proxy is tedious, because often an application produces self-referential URLs, which then include the path components that you tried to hide from the outside world. Nevertheless, if you absolutely need to do it, here are the steps.
Case A: You need to make the application available at a simple URL, but it is OK if users continue with the more complex URLs, as long as they don't have to type them in. That's the easy case, and if this suffices for you, you're lucky. Use a simple RedirectMatch for the Apache HTTP Server:
RedirectMatch ^/$ http://www.mycompany.com/myapp/
Your application will then be available under http://www.mycompany.com/, and each visitor will be immediately redirected to the real URL http://www.mycompany.com/myapp/
Case B: You need to hide path components for all requests going to the application. Here's the recipe for the case where you want to hide the first path component /myapp. More complex manipulations are left as an exercise to the reader.
First the solution for the case of the Apache HTTP Server:
1. Use mod_rewrite to add /myapp to all requests before forwarding to the backend:
# Don't forget the PT flag! (pass through)
# Use a local URL path as the target, so that the rewrite stays
# internal and is never turned into an external redirect.
RewriteRule ^/(.*) /myapp/$1 [PT]
2. Use mod_headers to rewrite any HTTP redirects your application might return. Such redirects typically contain the path components you want to hide, because by the HTTP standard, redirects always need to include the full URL, and your application is not aware of the fact that your clients talk to it via some shortened URL. An HTTP redirect is done with a special response header named Location. We rewrite the Location headers of our responses:
# Keep protocol, server and port if present,
# but remove our webapp name from the start of the path,
# because the client must not see the hidden path component
Header edit Location ^([^/]*//[^/]*)?/myapp/(.*)$ $1/$2
3. Use mod_headers again to rewrite the paths contained in any cookies your application might set. Such cookie paths again might contain the path components you want to hide. A cookie is set with the HTTP response header named Set-Cookie. We rewrite the Set-Cookie headers of our responses:
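# Fix the cookie path: remove our webapp name from the start
# of the path, so that it matches the URLs the client uses
Header edit Set-Cookie "^(.*; Path=)/myapp/?(.*)$" "$1/$2"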
4. Some applications might contain hard coded absolute links. In this case check whether you find a configuration item for your web framework to configure the base URL. If not, your only chance is to parse all response content bodies and do search and replace. This is fragile and very resource intensive. If you really need to do this, you can use mod_proxy_html, mod_substitute or mod_sed for this task.
If you are using Microsoft IIS as a web server, the ISAPI redirector provides a way of doing the first step with a builtin feature. You define a mapping file for simple prefix changes like this:
# Add a context prefix to all requests ...
/=/myapp/
# ... or change some prefix ...
/oldapp/=/myapp/
and then put the name of the file in the rewrite_rule_file entry of the registry or your isapi_redirect.properties file. In your uriworkermap.properties file, you still need to map the URLs as they are before rewriting!
More complex rewrites can be done using the same file, but with regular expressions. A leading tilde sign '~' indicates that you are using a regular expression:
# Use a regular expression rewrite
~/oldapps([0-9]*)/=/newapps$1/
There is no support for Steps 2 (rewriting redirect responses) or 3 (rewriting cookie paths).
URL Encoding
Some types of problems are triggered by the use of encoded URLs (see percent encoding). For the same location there exist a lot of different URLs which are equivalent. The reverse proxy needs to inspect the URL in order to apply its own authentication rules and to decide to which backend it should send the request (or whether it should handle it itself). Therefore the request URL is first normalized: percent encoded characters are decoded, /./ is replaced by /, /XXX/../ is replaced by / and similar manipulations of the URL are done. After that, the web server might apply rewrite rules to further change the URL in less obvious ways. In the end, there is no way to put the resulting URL back into an encoding which is "similar" to the one which was used for the original URL.
For historical reasons, there have been several alternatives for how mod_jk and the ISAPI plugin encoded the resulting URL before sending it to the backend. They could be chosen via JkOptions (mod_jk) or uri_select (ISAPI). None of those historical encodings are recommended, because they have either negative functionality implications or pose a security risk. The default encoding since version 1.2.24 is ForwardURIProxy (mod_jk) or proxy (ISAPI) and it is strongly recommended to keep the default and remove all old explicit settings.
Request Attributes
You can also add more attributes to any request you are forwarding when using the Apache HTTP Server. For this use the JkEnvVar directive (for details see the Apache reference page). Such request attributes can be retrieved on the Tomcat side via request.getAttribute(attributeName). Note that the names of attributes set via mod_jk will not be listed in request.getAttributeNames()!
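A minimal sketch of forwarding a custom attribute might look like this. The variable name APP_STAGE and its values are hypothetical and only serve as an illustration.

```apache
# Assumption: APP_STAGE is a made-up variable name for this example.
# Set the variable in the web server environment ...
SetEnv APP_STAGE production

# ... and forward it as a request attribute. The second argument is
# the default value that is sent when the variable is not set.
JkEnvVar APP_STAGE unknown
```

The backend application could then read the value with request.getAttribute("APP_STAGE").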