Tomcat clustering, Varnish and Blue/Green deployment – Part 1

At Claranet, we aim to provide the best possible uptime to our customers while encouraging them to adopt an agile workflow where they push new versions of their apps often and seamlessly.

Because most of our clients are using PHP (or Ruby), we usually help them set up Capistrano: it’s efficient and there are a lot of plugins for different frameworks (capifony for instance). Varnish is also quite often in our list of requirements, because of its sheer capacity to absorb traffic and deliver content.

Prerequisites:
- To get the most out of this post, you should know what Varnish is (a caching reverse proxy) and its basics (VCL, hash, etc.), and have a basic understanding of Tomcat.
- The software versions used in this post are: Debian 7, Tomcat >= 8.0.15, Apache 2.2, Varnish 3.0.x and vmod curl. It should work without much adaptation on other distros.

When UGC came to us with their Java app, they had a set of strong requirements in response to which we had to innovate. With Alteis, their developers, we set up a way to easily deploy apps in their Tomcat cluster while using Varnish in front of the whole infrastructure. On an infrastructure like UGC’s, Varnish is mandatory if we want to keep the number of servers reasonable. It also allows us to (greatly) lessen the load when there is a lot of traffic (typically, on rainy days for cinemas!).

I will try to demonstrate how you can set up Varnish in front of a Tomcat cluster using Blue/Green deployment (known as Parallel Deployment in Tomcat parlance).

The goal here is to provide the following features:

- Being able to deploy an app in a Tomcat cluster with a single, simple operation,
- Doing so without any downtime,
- Having a good caching policy in Varnish (i.e., being able to cache more than just static files).

Part 1 - Tomcat

Analysis

Because we don’t want to reinvent the wheel, it is best to use tools and methods that already exist. With this in mind, Tomcat already provides most of the functionality we need: clustering with session replication, cluster-wide deployment and parallel deployment.

Let’s talk a little about parallel deployment. It allows you to deploy multiple versions of an app under the same URL. You can use it to upgrade an app: just deploy the new version under the same URL; new users will get the new version, while users with sessions on the old version stay on it. When all users are on the new version, you can delete the old one.

! The problem

The problem you now have is that you want to use Varnish, and by default Varnish identifies the objects it caches by their URL. As apps 001 and 002 have exactly the same URLs, their cached objects will be mixed up and the result will be disastrous: Varnish will serve a mix of app 001 and app 002 to users.

To solve this problem, we will isolate the cache of both versions of the app and identify which version is used by a given user.
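
To make the isolation idea concrete, here is a minimal Python sketch (it is not VCL, and not the configuration used on the real platform). It assumes a cache key built from the host and the URL, which is roughly what Varnish hashes by default, and shows that adding a version label to the key keeps the two apps apart. The function name cache_key and the version labels are hypothetical; how Varnish actually learns which version a user is on is left for part 2.

```python
import hashlib


def cache_key(host, url, app_version=None):
    """Build a cache key from host and URL, optionally scoped to an app version."""
    digest = hashlib.sha256()
    parts = [host, url]
    if app_version is not None:
        parts.append(app_version)
    for part in parts:
        digest.update(part.encode("utf-8"))
        digest.update(b"\0")  # separator so ("ab", "c") and ("a", "bc") differ
    return digest.hexdigest()


# Without the version, apps 001 and 002 share one cache entry per URL.
shared = cache_key("www.ugc.fr", "/foo/index.html")

# With the version (hypothetical labels), each app gets its own entry.
key_001 = cache_key("www.ugc.fr", "/foo/index.html", "001")
key_002 = cache_key("www.ugc.fr", "/foo/index.html", "002")
```

The point of the sketch is only that the same URL yields two distinct keys once the version is part of the key, so an object cached for app 001 can never be served to a user of app 002.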

Implementation

Set-up of a Tomcat cluster and cluster-wide deployment

As long as you can use multicast in your environment (which is NOT the case in some popular public cloud offers), this is pretty straightforward. In your Tomcat’s server.xml, you can define the cluster either in the « Engine » node (which makes the whole Tomcat instance a member of the cluster), or in a « Host » node (which makes only this virtual host a member of a cluster). I advise you to choose the latter because you can then configure the FarmDeployer, which allows you to deploy a webapp on your whole cluster.

<Host name="www.ugc.fr" appBase="/my/app/base">
  [ ... ]
  <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
           channelSendOptions="6">
    <Manager className="org.apache.catalina.ha.session.BackupManager"
             expireSessionsOnShutdown="false"
             notifyListenersOnReplication="true"
             mapSendOptions="6"/>
    <Channel className="org.apache.catalina.tribes.group.GroupChannel">
      <Membership className="org.apache.catalina.tribes.membership.McastService"
                  address="228.0.0.5"
                  port="12345"
                  frequency="500"
                  dropTime="3000"
                  bind="10.0.0.1"/>
      <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                address="10.0.0.1"
                port="5000"
                selectorTimeout="100"
                maxThreads="6"/>
      <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
        <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
      </Sender>
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
    </Channel>
    <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
           filter=".*.gif|.*.js|.*.jpeg|.*.jpg|.*.png|.*.css|.*.txt"/>
    <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer"
              tempDir="/cataline/home/cluster/www.ugc.fr/temp/"
              deployDir="/my/app/base"
              watchDir="/cataline/home/cluster/www.ugc.fr/listen/"
              watchEnabled="true"/>
    <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
  </Cluster>
</Host>

Let’s break this down:

The « Host » node is the declaration of the virtual host. It goes in the « Engine » node in the server.xml file.
The « org.apache.catalina.valves.AccessLogValve » (left out of the listing above) is only there to configure the access logs; it is not important for this example.
Then, we have the important stuff: the « Cluster » node.
- The « Manager » node is the class dealing with the exchange of session information between cluster members.
- The « Channel » node deals with how cluster members discover and communicate with each other.
  - The « Membership » node is how they discover their peers (multicast here, but it is possible to list the unicast addresses of the other peers instead). If you have several Host nodes, each with its own cluster, you have to give each one a different multicast address and/or port.
  - The « Receiver » node is how they talk to each other: once this peer has been discovered, other Tomcats can talk to it at the address 10.0.0.1:5000.
  - The other nodes are some fine-tuning on this.
- The « org.apache.catalina.ha.tcp.ReplicationValve » lets you list the requests (here, static files matched by the « filter » pattern) for which no session information is transmitted between the cluster members.
- The « Deployer » node allows you to deploy an app on the whole cluster. You just have to drop your WAR file in « /cataline/home/cluster/www.ugc.fr/listen/ » and it will be deployed on the whole cluster. The « tempDir » and the « deployDir » must be writable by Tomcat. The « watchEnabled » setting must be « true » on a SINGLE member. You should always deploy your apps from the same server.
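
As an illustration of the "drop the WAR in the watched directory" step, here is a small Python sketch. It is not part of our tooling: the helper name drop_war is hypothetical, and it relies on the assumption that the watcher only picks up files whose name ends in .war. Under that assumption, copying to a temporary name first and renaming afterwards avoids exposing a half-written WAR to the deployer.

```python
import os
import shutil


def drop_war(war_path, watch_dir):
    """Copy a WAR into the watched directory, then rename it into place.

    Assumption: files that do not end in ".war" are ignored by the watcher,
    so the temporary ".part" file is never deployed.
    """
    final_path = os.path.join(watch_dir, os.path.basename(war_path))
    tmp_path = final_path + ".part"
    shutil.copyfile(war_path, tmp_path)
    # Renaming within the same directory replaces the target in one step.
    os.replace(tmp_path, final_path)
    return final_path
```

On the member where « watchEnabled » is « true », a call such as drop_war("foo##002.war", "/cataline/home/cluster/www.ugc.fr/listen/") would be the whole deployment operation.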

+ With this config, you now have a functioning Tomcat cluster where you can deploy apps by simply dropping them in a directory. Sessions are shared between the servers, which enables you to set up a non-sticky load balancer in front of them. It also means that if one of the Tomcats goes down, no user loses their session.

! You want all your Tomcat servers to be up and running while you are deploying an app. If a Tomcat server joins the cluster after the deployment, it won’t receive the messages ordering it to deploy the app, leaving it empty.
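
One way to guard against this before deploying is a quick pre-flight check. The Python sketch below simply tries to open a TCP connection to each member's « Receiver » address and reports the ones that do not answer. The function name unreachable_members and the member list you pass in are hypothetical, and a port that accepts connections only shows that something is listening there, not that the node has actually joined the cluster, so treat it as a rough sanity check.

```python
import socket


def unreachable_members(members, timeout=2.0):
    """Return the (host, port) pairs that refuse or time out on a TCP connect."""
    down = []
    for host, port in members:
        try:
            # Open and immediately close a connection to the receiver port.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            down.append((host, port))
    return down
```

For example, with a second member assumed to listen on 10.0.0.2:5000, you would call unreachable_members([("10.0.0.1", 5000), ("10.0.0.2", 5000)]) and only drop the WAR if the result is an empty list.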

Tomcat’s parallel deployment

Each app must come as a WAR with the following naming convention: name##version.war. For instance, you can deploy both foo##001.war and foo##002.war under http://localhost:8080/foo. Traffic is then routed to one version or the other according to sessions. Let’s take an example:

- You have foo##001.war deployed under http://localhost:8080/foo. Users of this app each have a session on it (usually a JSESSIONID cookie).
- You now deploy foo##002.war under http://localhost:8080/foo. What happens now is:
  1. Users with valid sessions for app 001 stay on app 001.
  2. Users with invalid or non-existent sessions are routed to app 002.
- Traffic now slowly switches from app 001 to 002 as new users arrive and old ones leave (the shorter the session TTL, the faster the traffic switches from one version to the other).
- When there is no user left on app 001 (you can easily see this in the Tomcat manager), you can undeploy app 001, leaving only app 002.

You have updated your app without any downtime and completely seamlessly.

This method can be applied on a cluster without any problem.
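
The routing rules described above can be modelled in a few lines of Python. This is only a sketch of the behaviour, not Tomcat's code: the function names and the dictionary of sessions per version are hypothetical, the file name parsing ignores special cases such as ROOT or nested paths, and it assumes that the "latest" version is the one whose version string sorts last.

```python
def parse_war_name(filename):
    """Split 'foo##002.war' into the context path '/foo' and the version '002'."""
    base = filename[:-4] if filename.endswith(".war") else filename
    name, _, version = base.partition("##")
    return "/" + name, version


def select_version(sessions_by_version, session_id=None):
    """Pick the app version that should serve a request.

    sessions_by_version maps a version string to the set of session ids
    currently valid on that version (a hypothetical stand-in for the
    session manager of each deployed app).
    """
    if session_id is not None:
        for version, sessions in sessions_by_version.items():
            if session_id in sessions:
                return version  # a valid session keeps the user on its version
    # No session, or a session unknown to every version: use the latest one.
    return max(sessions_by_version)


deployed = {"001": {"abc123"}, "002": set()}
existing_user = select_version(deployed, "abc123")  # stays on "001"
new_user = select_version(deployed)                 # goes to "002"
```

Once the set of sessions for "001" is empty, every request ends up on "002", which is the moment the old version can be undeployed.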

Part 1 - Conclusion

At this stage, we have a functioning Tomcat cluster on which we can deploy several versions of the same application in parallel. The deployment is done by simply copying the WAR file into a directory on a single member of the cluster. In part 2, we will see how to integrate Varnish in this setup.

Part 2

=> Tomcat Clustering, Varnish and Blue/Green deployment - part 2
