<!DOCTYPE html SYSTEM "about:legacy-compat">
<html lang="en"><head><META http-equiv="Content-Type" content="text/html; charset=UTF-8"><link href="./images/docs-stylesheet.css" rel="stylesheet" type="text/css"><title>Apache Tomcat 9 (9.0.112) - Clustering/Session Replication How-To</title></head><body><div id="wrapper"><header><div id="header"><div><div><div class="logo noPrint"><a href="https://tomcat.apache.org/"><img alt="Tomcat Home" src="./images/tomcat.png"></a></div><div style="height: 1px;"></div><div class="asfLogo noPrint"><a href="https://www.apache.org/" target="_blank"><img src="./images/asf-logo.svg" alt="The Apache Software Foundation" style="width: 266px; height: 83px;"></a></div><h1>Apache Tomcat 9</h1><div class="versionInfo">
Version 9.0.112,
<time datetime="2025-11-06">Nov 6 2025</time></div><div style="height: 1px;"></div><div style="clear: left;"></div></div></div></div></header><div id="middle"><div><div id="mainLeft" class="noprint"><div><nav><div><h2>Links</h2><ul><li><a href="index.html">Docs Home</a></li><li><a href="https://cwiki.apache.org/confluence/display/TOMCAT/FAQ">FAQ</a></li></ul></div><div><h2>User Guide</h2><ul><li><a href="introduction.html">1) Introduction</a></li><li><a href="setup.html">2) Setup</a></li><li><a href="appdev/index.html">3) First webapp</a></li><li><a href="deployer-howto.html">4) Deployer</a></li><li><a href="manager-howto.html">5) Manager</a></li><li><a href="host-manager-howto.html">6) Host Manager</a></li><li><a href="realm-howto.html">7) Realms and AAA</a></li><li><a href="security-manager-howto.html">8) Security Manager</a></li><li><a href="jndi-resources-howto.html">9) JNDI Resources</a></li><li><a href="jndi-datasource-examples-howto.html">10) JDBC DataSources</a></li><li><a href="class-loader-howto.html">11) Classloading</a></li><li><a href="jasper-howto.html">12) JSPs</a></li><li><a href="ssl-howto.html">13) SSL/TLS</a></li><li><a href="ssi-howto.html">14) SSI</a></li><li><a href="cgi-howto.html">15) CGI</a></li><li><a href="proxy-howto.html">16) Proxy Support</a></li><li><a href="mbeans-descriptors-howto.html">17) MBeans Descriptors</a></li><li><a href="default-servlet.html">18) Default Servlet</a></li><li><a href="cluster-howto.html">19) Clustering</a></li><li><a href="balancer-howto.html">20) Load Balancer</a></li><li><a href="connectors.html">21) Connectors</a></li><li><a href="monitoring.html">22) Monitoring and Management</a></li><li><a href="logging.html">23) Logging</a></li><li><a href="apr.html">24) APR/Native</a></li><li><a href="virtual-hosting-howto.html">25) Virtual Hosting</a></li><li><a href="aio.html">26) Advanced IO</a></li><li><a href="maven-jars.html">27) Mavenized</a></li><li><a href="security-howto.html">28) Security Considerations</a></li><li><a href="windows-service-howto.html">29) Windows Service</a></li><li><a href="windows-auth-howto.html">30) Windows Authentication</a></li><li><a href="jdbc-pool.html">31) Tomcat's JDBC Pool</a></li><li><a href="web-socket-howto.html">32) WebSocket</a></li><li><a href="rewrite.html">33) Rewrite</a></li><li><a href="cdi.html">34) CDI 2 and JAX-RS</a></li><li><a href="graal.html">35) AOT/GraalVM Support</a></li></ul></div><div><h2>Reference</h2><ul><li><a href="RELEASE-NOTES.txt">Release Notes</a></li><li><a href="config/index.html">Configuration</a></li><li><a href="api/index.html">Tomcat Javadocs</a></li><li><a href="servletapi/index.html">Servlet 4.0 Javadocs</a></li><li><a href="jspapi/index.html">JSP 2.3 Javadocs</a></li><li><a href="elapi/index.html">EL 3.0 Javadocs</a></li><li><a href="websocketapi/index.html">WebSocket 1.1 Javadocs</a></li><li><a href="jaspicapi/index.html">JASPIC 1.1 Javadocs</a></li><li><a href="annotationapi/index.html">Common Annotations 1.3 Javadocs</a></li><li><a href="https://tomcat.apache.org/connectors-doc/">JK 1.2 Documentation</a></li></ul></div><div><h2>Apache Tomcat Development</h2><ul><li><a href="building.html">Building</a></li><li><a href="changelog.html">Changelog</a></li><li><a href="https://cwiki.apache.org/confluence/display/TOMCAT/Tomcat+Versions">Status</a></li><li><a href="developers.html">Developers</a></li><li><a href="architecture/index.html">Architecture</a></li><li><a href="tribes/introduction.html">Tribes</a></li></ul></div></nav></div></div><div id="mainRight"><div id="content"><h2>Clustering/Session Replication How-To</h2><h3 id="Important_Note">Important Note</h3><div class="text">

<p><b>You can also check the <a href="config/cluster.html">configuration reference documentation</a>.</b>
</p>
</div><h3 id="Table_of_Contents">Table of Contents</h3><div class="text">
<ul><li><a href="#For_the_impatient">For the impatient</a></li><li><a href="#Security">Security</a></li><li><a href="#Cluster_Basics">Cluster Basics</a></li><li><a href="#Overview">Overview</a></li><li><a href="#Cluster_Information">Cluster Information</a></li><li><a href="#Bind_session_after_crash_to_failover_node">Bind session after crash to failover node</a></li><li><a href="#Configuration_Example">Configuration Example</a></li><li><a href="#Cluster_Architecture">Cluster Architecture</a></li><li><a href="#How_it_Works">How it Works</a></li><li><a href="#Monitoring_your_Cluster_with_JMX">Monitoring your Cluster with JMX</a></li><li><a href="#FAQ">FAQ</a></li></ul>
</div><h3 id="For_the_impatient">For the impatient</h3><div class="text">
||
<p>
Simply add
</p>
<div class="codeBox"><pre><code><Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"/></code></pre></div>
<p>
to your <code><Engine></code> or your <code><Host></code> element to enable clustering.
</p>
<p>
Using the above configuration will enable all-to-all session replication
using the <code>DeltaManager</code> to replicate session deltas. By all-to-all, we mean that <i>every</i>
session gets replicated to <i>all the other nodes</i> in the cluster.
||
This works great for smaller clusters, but we don't recommend it for larger clusters (more than 4 nodes or so).
Also, when using the DeltaManager, Tomcat will replicate sessions to <i>all</i> nodes,
<i>even nodes that don't have the application deployed</i>.<br>
To get around these problems, you'll want to use the <code>BackupManager</code>. The <code>BackupManager</code>
only replicates the session data to <i>one</i> backup node, and only to nodes that have the application deployed.
Once you have a simple cluster running with the <code>DeltaManager</code>, you will probably want to
migrate to the <code>BackupManager</code> as you increase the number of nodes in your cluster.
</p>
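<p>
As a minimal sketch of that migration, swapping the nested <code><Manager></code> element inside <code><Cluster></code> is the key change; the attribute values below mirror the full example later in this document and are not required:
</p>
<div class="codeBox"><pre><code><Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster">
  <!-- BackupManager replicates each session to a single backup node only -->
  <Manager className="org.apache.catalina.ha.session.BackupManager"
           expireSessionsOnShutdown="false"
           notifyListenersOnReplication="true"/>
</Cluster></code></pre></div>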
||
<p>
Here are some of the important default values:
</p>
<ol>
<li>Multicast address is 228.0.0.4</li>
<li>Multicast port is 45564 (the port and the address together determine cluster membership)</li>
<li>The IP address that is broadcast is <code>java.net.InetAddress.getLocalHost().getHostAddress()</code> (make sure you don't broadcast 127.0.0.1; this is a common error)</li>
<li>The TCP port listening for replication messages is the first available server socket in the range <code>4000-4100</code></li>
<li>The listener configured is <code>ClusterSessionListener</code></li>
<li>Two interceptors are configured: <code>TcpFailureDetector</code> and <code>MessageDispatchInterceptor</code></li>
</ol>
||
<p>
The following is the default cluster configuration:
</p>
<div class="codeBox"><pre><code>        <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
                 channelSendOptions="8">

          <Manager className="org.apache.catalina.ha.session.DeltaManager"
                   expireSessionsOnShutdown="false"
                   notifyListenersOnReplication="true"/>

          <Channel className="org.apache.catalina.tribes.group.GroupChannel">
            <Membership className="org.apache.catalina.tribes.membership.McastService"
                        address="228.0.0.4"
                        port="45564"
                        frequency="500"
                        dropTime="3000"/>
            <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                      address="auto"
                      port="4000"
                      autoBind="100"
                      selectorTimeout="5000"
                      maxThreads="6"/>

            <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
              <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
            </Sender>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>
          </Channel>

          <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
                 filter=""/>
          <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>

          <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer"
                    tempDir="/tmp/war-temp/"
                    deployDir="/tmp/war-deploy/"
                    watchDir="/tmp/war-listen/"
                    watchEnabled="false"/>

          <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
        </Cluster></code></pre></div>
||
<p>We will cover this section in more detail later in this document.</p>
</div><h3 id="Security">Security</h3><div class="text">
||

<p>The cluster implementation is written on the basis that a secure, trusted
network is used for all of the cluster related network traffic. It is not safe
to run a cluster on an insecure, untrusted network.</p>

<p>There are many options for providing a secure, trusted network for use by a
Tomcat cluster. These include:</p>
<ul>
<li>private LAN</li>
<li>a Virtual Private Network (VPN)</li>
<li>IPSEC</li>
</ul>

<p>The <a href="cluster-interceptor.html#org.apache.catalina.tribes.group.interceptors.EncryptInterceptor_Attributes">EncryptInterceptor</a>
provides confidentiality and integrity protection, but it does not protect
against all risks associated with running a Tomcat cluster on an untrusted
network, particularly DoS attacks.</p>
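<p>
If you do use it, a minimal sketch looks like this; the interceptor goes into the <code><Channel></code> stack, and the key value below is a placeholder you must replace with your own hex-encoded secret:
</p>
<div class="codeBox"><pre><code><!-- encryptionKey is a placeholder; supply a 128/192/256-bit key, hex encoded -->
<Interceptor className="org.apache.catalina.tribes.group.interceptors.EncryptInterceptor"
             encryptionKey="0123456789abcdef0123456789abcdef"/></code></pre></div>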
||

</div><h3 id="Cluster_Basics">Cluster Basics</h3><div class="text">

<p>To run session replication in your Tomcat 9 container, the following steps
should be completed:</p>
<ul>
<li>All your session attributes must implement <code>java.io.Serializable</code></li>
<li>Uncomment the <code>Cluster</code> element in server.xml</li>
<li>If you have defined custom cluster valves, make sure you have the <code>ReplicationValve</code> defined as well under the Cluster element in server.xml</li>
<li>If your Tomcat instances are running on the same machine, make sure the <code>Receiver.port</code>
attribute is unique for each instance. In most cases, Tomcat is smart enough to resolve this on its own by autodetecting available ports in the range 4000-4100</li>
<li>Make sure your <code>web.xml</code> has the
<code><distributable/></code> element (see the sketch after this list)</li>
<li>If you are using mod_jk, make sure that the jvmRoute attribute is set on your Engine <code><Engine name="Catalina" jvmRoute="node01"></code>
and that the jvmRoute attribute value matches your worker name in workers.properties</li>
<li>Make sure that all nodes have the same time, synchronized with an NTP service!</li>
<li>Make sure that your load balancer is configured for sticky session mode.</li>
</ul>
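<p>
As a sketch of those two markers, a distributable web.xml and a jvmRoute on the Engine look like the following; the name <code>node01</code> and the display name are illustrative:
</p>
<div class="codeBox"><pre><code><!-- web.xml: mark the webapp as distributable -->
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee" version="4.0">
  <display-name>my-clustered-app</display-name>
  <distributable/>
</web-app>

<!-- server.xml: jvmRoute must match the mod_jk worker name -->
<Engine name="Catalina" defaultHost="localhost" jvmRoute="node01"></code></pre></div>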
||
<p>Load balancing can be achieved through many techniques, as seen in the
<a href="balancer-howto.html">Load Balancing</a> chapter.</p>
<p>Note: Remember that your session state is tracked by a cookie, so your URL must look the same from the
outside; otherwise, a new session will be created.</p>
<p>The Cluster module uses the Tomcat JULI logging framework, so you can configure logging
through the regular logging.properties file. To track messages, you can enable logging on the key: <code>org.apache.catalina.tribes.MESSAGES</code></p>
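<p>
A hedged sketch of the corresponding logging.properties lines; the handler name follows the stock Tomcat logging.properties and may need to be adjusted to match your own configuration:
</p>
<div class="codeBox"><pre><code># Log cluster message traffic through the standard catalina file handler
org.apache.catalina.tribes.MESSAGES.level = FINE
org.apache.catalina.tribes.MESSAGES.handlers = 1catalina.org.apache.juli.AsyncFileHandler</code></pre></div>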
||
| 125 | </div><h3 id="Overview">Overview</h3><div class="text"> |
||
| 126 | |||
| 127 | <p>To enable session replication in Tomcat, three different paths can be followed to achieve the exact same thing:</p> |
||
| 128 | <ol> |
||
| 129 | <li>Using session persistence, and saving the session to a shared file system (PersistenceManager + FileStore)</li> |
||
| 130 | <li>Using session persistence, and saving the session to a shared database (PersistenceManager + JDBCStore)</li> |
||
| 131 | <li>Using in-memory-replication, using the SimpleTcpCluster that ships with Tomcat (lib/catalina-tribes.jar + lib/catalina-ha.jar)</li> |
||
| 132 | </ol> |
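<p>
A hedged sketch of the first path, as a per-context manager in context.xml; the directory value is a placeholder and would need to point at storage shared by all nodes:
</p>
<div class="codeBox"><pre><code><Context>
  <Manager className="org.apache.catalina.session.PersistentManager">
    <!-- FileStore writes swapped-out sessions as files under the given directory -->
    <Store className="org.apache.catalina.session.FileStore"
           directory="/mnt/shared/sessions"/>
  </Manager>
</Context></code></pre></div>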
||

<p>Tomcat can perform an all-to-all replication of session state using the <code>DeltaManager</code> or
perform backup replication to only one node using the <code>BackupManager</code>.
The all-to-all replication is an algorithm that is only efficient when the clusters are small. For larger clusters, you
should use the BackupManager to use a primary-secondary session replication strategy where the session will only be
stored at one backup node.<br>

Currently you can use the domain worker attribute (mod_jk > 1.2.8) to build cluster partitions
with the potential of having a more scalable cluster solution with the DeltaManager
(you'll need to configure the domain interceptor for this).
In order to keep the network traffic down in an all-to-all environment, you can split your cluster
into smaller groups. This can be easily achieved by using different multicast addresses for the different groups.
A very simple setup would look like this:
</p>
||

<div class="codeBox"><pre><code>        DNS Round Robin
               |
         Load Balancer
          /           \
      Cluster1      Cluster2
      /     \        /     \
  Tomcat1 Tomcat2  Tomcat3 Tomcat4</code></pre></div>
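<p>
A hedged sketch of that split: give each group its own multicast address/port pair in the <code><Membership></code> element (the second address below is illustrative; only the address/port differ between the groups):
</p>
<div class="codeBox"><pre><code><!-- Nodes in Cluster1 -->
<Membership className="org.apache.catalina.tribes.membership.McastService"
            address="228.0.0.4" port="45564" frequency="500" dropTime="3000"/>

<!-- Nodes in Cluster2: a different address keeps the two memberships separate -->
<Membership className="org.apache.catalina.tribes.membership.McastService"
            address="228.0.0.5" port="45564" frequency="500" dropTime="3000"/></code></pre></div>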
||

<p>What is important to mention here is that session replication is only the beginning of clustering.
Another popular concept used to implement clusters is farming, i.e., you deploy your apps to only one
server, and the cluster will distribute the deployments across the entire cluster.
These are capabilities provided by the FarmWarDeployer (see the cluster example in <code>server.xml</code>).</p>
<p>In the next section we will go deeper into how session replication works and how to configure it.</p>

</div><h3 id="Cluster_Information">Cluster Information</h3><div class="text">
||
<p>Membership is established using multicast heartbeats.
Hence, if you wish to subdivide your clusters, you can do this by
changing the multicast IP address or port in the <code><Membership></code> element.
</p>
<p>
The heartbeat contains the IP address of the Tomcat node and the TCP port that
Tomcat listens to for replication traffic. All data communication happens over TCP.
</p>
<p>
The <code>ReplicationValve</code> is used to find out when the request has been completed and initiate the
replication, if any. Data is only replicated if the session has changed (by calling setAttribute or removeAttribute
on the session).
</p>
||
<p>
One of the most important performance considerations is synchronous versus asynchronous replication.
In synchronous replication mode, the request doesn't return until the replicated session has been
sent over the wire and reinstantiated on all the other cluster nodes.
Synchronous vs. asynchronous is configured using the <code>channelSendOptions</code>
flag, which is an integer value. The default value for the <code>SimpleTcpCluster/DeltaManager</code> combo is
8, which is asynchronous.
See the <a href="config/cluster.html#SimpleTcpCluster_Attributes">configuration reference</a>
for more discussion on the various <code>channelSendOptions</code> values.
</p>
<p>
For convenience, <code>channelSendOptions</code> can be set by name(s) rather than an integer;
the names are translated to their integer value upon startup. The valid option names are:
"asynchronous" (alias "async"), "byte_message" (alias "byte"), "multicast", "secure",
"synchronized_ack" (alias "sync"), "udp", "use_ack". Use a comma to separate multiple names,
e.g. pass "async, multicast" for the options
<code>SEND_OPTIONS_ASYNCHRONOUS | SEND_OPTIONS_MULTICAST</code>.
</p>
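<p>
For instance, a sketch of setting the options by name on the Cluster element, equivalent to the integer value the names translate to at startup:
</p>
<div class="codeBox"><pre><code><Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
         channelSendOptions="async, multicast"/></code></pre></div>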
||
<p>
You can read more on the <a href="tribes/introduction.html">send flag (overview)</a> or the
<a href="https://tomcat.apache.org/tomcat-9.0-doc/api/org/apache/catalina/tribes/Channel.html">send flag (javadoc)</a>.
During async replication, the request is returned before the data has been replicated. Async
replication yields shorter request times, while synchronous replication guarantees the session
is replicated before the request returns.
</p>
||
| 201 | </div><h3 id="Bind_session_after_crash_to_failover_node">Bind session after crash to failover node</h3><div class="text"> |
||
| 202 | <p> |
||
| 203 | If you are using mod_jk and not using sticky sessions or for some reasons sticky session don't |
||
| 204 | work, or you are simply failing over, the session id will need to be modified as it previously contained |
||
| 205 | the worker id of the previous tomcat (as defined by jvmRoute in the Engine element). |
||
| 206 | To solve this, we will use the JvmRouteBinderValve. |
||
| 207 | </p> |
||
| 208 | <p> |
||
| 209 | The JvmRouteBinderValve rewrites the session id to ensure that the next request will remain sticky |
||
| 210 | (and not fall back to go to random nodes since the worker is no longer available) after a fail over. |
||
| 211 | The valve rewrites the JSESSIONID value in the cookie with the same name. |
||
| 212 | Not having this valve in place, will make it harder to ensure stickiness in case of a failure for the mod_jk module. |
||
| 213 | </p> |
||
| 214 | <p> |
||
| 215 | Remember, if you are adding your own valves in server.xml then the defaults are no longer valid, |
||
| 216 | make sure that you add in all the appropriate valves as defined by the default. |
||
| 217 | </p> |
||
| 218 | <p> |
||
| 219 | <b>Hint:</b><br> |
||
| 220 | With attribute <i>sessionIdAttribute</i> you can change the request attribute name that included the old session id. |
||
| 221 | Default attribute name is <i>org.apache.catalina.ha.session.JvmRouteOriginalSessionID</i>. |
||
| 222 | </p> |
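<p>
A sketch of overriding it on the valve; the attribute value here is a hypothetical name of your own choosing:
</p>
<div class="codeBox"><pre><code><Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"
       sessionIdAttribute="takeoverSessionid"/></code></pre></div>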
||
<p>
<b>Trick:</b><br>
You can enable this mod_jk turnover mode via JMX before you drop a node, on all backup nodes!
Set enabled to true on all JvmRouteBinderValve backups, disable the worker in mod_jk,
and then drop the node and restart it! Then enable the mod_jk worker and disable the JvmRouteBinderValves again.
With this use case, only the sessions that are actually requested are migrated.
</p>
||


</div><h3 id="Configuration_Example">Configuration Example</h3><div class="text">
<div class="codeBox"><pre><code>        <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
                 channelSendOptions="6">

          <Manager className="org.apache.catalina.ha.session.BackupManager"
                   expireSessionsOnShutdown="false"
                   notifyListenersOnReplication="true"
                   mapSendOptions="6"/>
          <!--
          <Manager className="org.apache.catalina.ha.session.DeltaManager"
                   expireSessionsOnShutdown="false"
                   notifyListenersOnReplication="true"/>
          -->
          <Channel className="org.apache.catalina.tribes.group.GroupChannel">
            <Membership className="org.apache.catalina.tribes.membership.McastService"
                        address="228.0.0.4"
                        port="45564"
                        frequency="500"
                        dropTime="3000"/>
            <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                      address="auto"
                      port="5000"
                      selectorTimeout="100"
                      maxThreads="6"/>

            <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
              <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
            </Sender>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
          </Channel>

          <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
                 filter=".*\.gif|.*\.js|.*\.jpeg|.*\.jpg|.*\.png|.*\.htm|.*\.html|.*\.css|.*\.txt"/>

          <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer"
                    tempDir="/tmp/war-temp/"
                    deployDir="/tmp/war-deploy/"
                    watchDir="/tmp/war-listen/"
                    watchEnabled="false"/>

          <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
        </Cluster></code></pre></div>
<p>
Break it down!!
</p>
| 279 | <div class="codeBox"><pre><code> <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster" |
||
| 280 | channelSendOptions="6"></code></pre></div> |
||
| 281 | <p> |
||
| 282 | The main element, inside this element all cluster details can be configured. |
||
| 283 | The <code>channelSendOptions</code> is the flag that is attached to each message sent by the |
||
| 284 | SimpleTcpCluster class or any objects that are invoking the SimpleTcpCluster.send method. |
||
| 285 | The description of the send flags is available at <a href="https://tomcat.apache.org/tomcat-9.0-doc/api/org/apache/catalina/tribes/Channel.html"> |
||
| 286 | our javadoc site</a> |
||
| 287 | The <code>DeltaManager</code> sends information using the SimpleTcpCluster.send method, while the backup manager |
||
| 288 | sends it itself directly through the channel. |
||
| 289 | <br>For more info, Please visit the <a href="config/cluster.html">reference documentation</a> |
||
| 290 | </p> |
||
| 291 | <div class="codeBox"><pre><code> <Manager className="org.apache.catalina.ha.session.BackupManager" |
||
| 292 | expireSessionsOnShutdown="false" |
||
| 293 | notifyListenersOnReplication="true" |
||
| 294 | mapSendOptions="6"/> |
||
| 295 | <!-- |
||
| 296 | <Manager className="org.apache.catalina.ha.session.DeltaManager" |
||
| 297 | expireSessionsOnShutdown="false" |
||
| 298 | notifyListenersOnReplication="true"/> |
||
| 299 | --></code></pre></div> |
||
| 300 | <p> |
||
| 301 | This is a template for the manager configuration that will be used if no manager is defined in the <Context> |
||
| 302 | element. In Tomcat 5.x each webapp marked distributable had to use the same manager, this is no longer the case |
||
| 303 | since Tomcat you can define a manager class for each webapp, so that you can mix managers in your cluster. |
||
| 304 | Obviously the managers on one node's application has to correspond with the same manager on the same application on the other node. |
||
| 305 | If no manager has been specified for the webapp, and the webapp is marked <distributable/> Tomcat will take this manager configuration |
||
| 306 | and create a manager instance cloning this configuration. |
||
| 307 | <br>For more info, Please visit the <a href="config/cluster-manager.html">reference documentation</a> |
||
| 308 | </p> |
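<p>
A hedged sketch of overriding the template for a single webapp in its context.xml; the attribute values are illustrative:
</p>
<div class="codeBox"><pre><code><Context>
  <!-- This webapp uses the DeltaManager even if the Cluster template defines another manager -->
  <Manager className="org.apache.catalina.ha.session.DeltaManager"
           expireSessionsOnShutdown="false"
           notifyListenersOnReplication="true"/>
</Context></code></pre></div>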
||
| 309 | <div class="codeBox"><pre><code> <Channel className="org.apache.catalina.tribes.group.GroupChannel"></code></pre></div> |
||
| 310 | <p> |
||
| 311 | The channel element is <a href="tribes/introduction.html">Tribes</a>, the group communication framework |
||
| 312 | used inside Tomcat. This element encapsulates everything that has to do with communication and membership logic. |
||
| 313 | <br>For more info, Please visit the <a href="config/cluster-channel.html">reference documentation</a> |
||
| 314 | </p> |
||
| 315 | <div class="codeBox"><pre><code> <Membership className="org.apache.catalina.tribes.membership.McastService" |
||
| 316 | address="228.0.0.4" |
||
| 317 | port="45564" |
||
| 318 | frequency="500" |
||
| 319 | dropTime="3000"/></code></pre></div> |
||
| 320 | <p> |
||
| 321 | Membership is done using multicasting. Please note that Tribes also supports static memberships using the |
||
| 322 | <code>StaticMembershipService</code> if you want to extend your membership to points beyond multicasting. |
||
| 323 | The address attribute is the multicast address used and the port is the multicast port. These two together |
||
| 324 | create the cluster separation. If you want a QA cluster and a production cluster, the easiest config is to |
||
| 325 | have the QA cluster be on a separate multicast address/port combination than the production cluster.<br> |
||
| 326 | The membership component broadcasts TCP address/port of itself to the other nodes so that communication between |
||
| 327 | nodes can be done over TCP. Please note that the address being broadcasted is the one of the |
||
| 328 | <code>Receiver.address</code> attribute. |
||
| 329 | <br>For more info, Please visit the <a href="config/cluster-membership.html">reference documentation</a> |
||
| 330 | </p> |
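<p>
Where multicast is unavailable, one commonly used alternative is static membership via the <code>StaticMembershipInterceptor</code>; a hedged sketch with placeholder host, port and uniqueId values, one <code><Member></code> entry per remote node:
</p>
<div class="codeBox"><pre><code><Interceptor className="org.apache.catalina.tribes.group.interceptors.StaticMembershipInterceptor">
  <!-- host/port/uniqueId below are placeholders for one remote node -->
  <Member className="org.apache.catalina.tribes.membership.StaticMember"
          host="192.168.1.2"
          port="4000"
          uniqueId="{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}"/>
</Interceptor></code></pre></div>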
||
| 331 | <div class="codeBox"><pre><code> <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver" |
||
| 332 | address="auto" |
||
| 333 | port="5000" |
||
| 334 | selectorTimeout="100" |
||
| 335 | maxThreads="6"/></code></pre></div> |
||
| 336 | <p> |
||
| 337 | In tribes the logic of sending and receiving data has been broken into two functional components. The Receiver, as the name suggests |
||
| 338 | is responsible for receiving messages. Since the Tribes stack is thread less, (a popular improvement now adopted by other frameworks as well), |
||
| 339 | there is a thread pool in this component that has a maxThreads and minThreads setting.<br> |
||
| 340 | The address attribute is the host address that will be broadcasted by the membership component to the other nodes. |
||
| 341 | <br>For more info, Please visit the <a href="config/cluster-receiver.html">reference documentation</a> |
||
| 342 | </p> |
||
| 343 | <div class="codeBox"><pre><code> <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter"> |
||
| 344 | <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/> |
||
| 345 | </Sender></code></pre></div> |
||
| 346 | <p> |
||
| 347 | The sender component, as the name indicates is responsible for sending messages to other nodes. |
||
| 348 | The sender has a shell component, the <code>ReplicationTransmitter</code> but the real stuff done is done in the |
||
| 349 | sub component, <code>Transport</code>. |
||
| 350 | Tribes support having a pool of senders, so that messages can be sent in parallel and if using the NIO sender, |
||
| 351 | you can send messages concurrently as well.<br> |
||
| 352 | Concurrently means one message to multiple senders at the same time and Parallel means multiple messages to multiple senders |
||
| 353 | at the same time. |
||
| 354 | <br>For more info, Please visit the <a href="config/cluster-sender.html">reference documentation</a> |
||
| 355 | </p> |
||
| 356 | <div class="codeBox"><pre><code> <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/> |
||
| 357 | <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/> |
||
| 358 | <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/> |
||
| 359 | </Channel></code></pre></div> |
||
| 360 | <p> |
||
| 361 | Tribes uses a stack to send messages through. Each element in the stack is called an interceptor, and works much like the valves do |
||
| 362 | in the Tomcat servlet container. |
||
| 363 | Using interceptors, logic can be broken into more manageable pieces of code. The interceptors configured above are:<br> |
||
| 364 | TcpFailureDetector - verifies crashed members through TCP, if multicast packets get dropped, this interceptor protects against false positives, |
||
| 365 | ie the node marked as crashed even though it still is alive and running.<br> |
||
| 366 | MessageDispatchInterceptor - dispatches messages to a thread (thread pool) to send message asynchronously.<br> |
||
| 367 | ThroughputInterceptor - prints out simple stats on message traffic.<br> |
||
| 368 | Please note that the order of interceptors is important. The way they are defined in server.xml is the way they are represented in the |
||
| 369 | channel stack. Think of it as a linked list, with the head being the first most interceptor and the tail the last. |
||
| 370 | <br>For more info, Please visit the <a href="config/cluster-interceptor.html">reference documentation</a> |
||
| 371 | </p> |
||
| 372 | <div class="codeBox"><pre><code> <Valve className="org.apache.catalina.ha.tcp.ReplicationValve" |
||
| 373 | filter=".*\.gif|.*\.js|.*\.jpeg|.*\.jpg|.*\.png|.*\.htm|.*\.html|.*\.css|.*\.txt"/></code></pre></div> |
||
| 374 | <p> |
||
| 375 | The cluster uses valves to track requests to web applications, we've mentioned the ReplicationValve and the JvmRouteBinderValve above. |
||
| 376 | The <Cluster> element itself is not part of the pipeline in Tomcat, instead the cluster adds the valve to its parent container. |
||
| 377 | If the <Cluster> elements is configured in the <Engine> element, the valves get added to the engine and so on. |
||
| 378 | <br>For more info, Please visit the <a href="config/cluster-valve.html">reference documentation</a> |
||
| 379 | </p> |
||
| 380 | <div class="codeBox"><pre><code> <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer" |
||
| 381 | tempDir="/tmp/war-temp/" |
||
| 382 | deployDir="/tmp/war-deploy/" |
||
| 383 | watchDir="/tmp/war-listen/" |
||
| 384 | watchEnabled="false"/></code></pre></div> |
||
| 385 | <p> |
||
| 386 | The default tomcat cluster supports farmed deployment, ie, the cluster can deploy and undeploy applications on the other nodes. |
||
| 387 | The state of this component is currently in flux but will be addressed soon. There was a change in the deployment algorithm |
||
| 388 | between Tomcat 5.0 and 5.5 and at that point, the logic of this component changed to where the deploy dir has to match the |
||
| 389 | webapps directory. |
||
| 390 | <br>For more info, Please visit the <a href="config/cluster-deployer.html">reference documentation</a> |
||
| 391 | </p> |
||
| 392 | <div class="codeBox"><pre><code> <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/> |
||
| 393 | </Cluster></code></pre></div> |
||
| 394 | <p> |
||
| 395 | Since the SimpleTcpCluster itself is a sender and receiver of the Channel object, components can register themselves as listeners to |
||
| 396 | the SimpleTcpCluster. The listener above <code>ClusterSessionListener</code> listens for DeltaManager replication messages |
||
| 397 | and applies the deltas to the manager that in turn applies it to the session. |
||
| 398 | <br>For more info, Please visit the <a href="config/cluster-listener.html">reference documentation</a> |
||
| 399 | </p> |
||
| 400 | |||
| 401 | </div><h3 id="Cluster_Architecture">Cluster Architecture</h3><div class="text"> |
||

<p><b>Component Levels:</b></p>
<div class="codeBox"><pre><code>         Server
           |
         Service
           |
         Engine
           |  \
           |  --- Cluster --*
           |
         Host
           |
         ------
        /      \
     Cluster    Context(1-N)
        |             \
        |             -- Manager
        |                   \
        |                   -- DeltaManager
        |                   -- BackupManager
        |
     ---------------------------
        |                       \
      Channel                    \
    -----------------------------  \
        |                           \
     Interceptor_1 ..                \
        |                             \
     Interceptor_N                     \
    -----------------------------       \
     |          |         |              \
   Receiver    Sender   Membership        \
                                           -- Valve
                                           |      \
                                           |       -- ReplicationValve
                                           |       -- JvmRouteBinderValve
                                           |
                                           -- LifecycleListener
                                           |
                                           -- ClusterListener
                                           |      \
                                           |       -- ClusterSessionListener
                                           |
                                           -- Deployer
                                                  \
                                                   -- FarmWarDeployer

</code></pre></div>
||


</div><h3 id="How_it_Works">How it Works</h3><div class="text">
<p>To make it easy to understand how clustering works, we are going to take you through a series of scenarios.
In this scenario we only plan to use two Tomcat instances, <code>TomcatA</code> and <code>TomcatB</code>.
We will cover the following sequence of events:</p>

<ol>
<li><code>TomcatA</code> starts up</li>
<li><code>TomcatB</code> starts up (wait until the TomcatA start is complete)</li>
<li><code>TomcatA</code> receives a request, a session <code>S1</code> is created.</li>
<li><code>TomcatA</code> crashes</li>
<li><code>TomcatB</code> receives a request for session <code>S1</code></li>
<li><code>TomcatA</code> starts up</li>
<li><code>TomcatA</code> receives a request, invalidate is called on the session (<code>S1</code>)</li>
<li><code>TomcatB</code> receives a request for a new session (<code>S2</code>)</li>
<li><code>TomcatA</code>: the session <code>S2</code> expires due to inactivity.</li>
</ol>
||

<p>Ok, now that we have a good sequence, we will take you through exactly what happens in the session replication code.</p>

<ol>
<li><b><code>TomcatA</code> starts up</b>
<p>
Tomcat starts up using the standard startup sequence. When the Host object is created, a cluster object is associated with it.
When the contexts are parsed, if the distributable element is in place in the web.xml file,
Tomcat asks the Cluster class (in this case <code>SimpleTcpCluster</code>) to create a manager
for the replicated context. So with clustering enabled and distributable set in web.xml,
Tomcat will create a <code>DeltaManager</code> for that context instead of a <code>StandardManager</code>.
The cluster class will start up a membership service (multicast) and a replication service (TCP unicast).
More on the architecture further down in this document.
</p>
</li>
<li><b><code>TomcatB</code> starts up</b>
<p>
When TomcatB starts up, it follows the same sequence as TomcatA did, with one exception.
The cluster is started and will establish a membership (TomcatA, TomcatB).
TomcatB will now request the session state from a server that already exists in the cluster,
in this case TomcatA. TomcatA responds to the request, and before TomcatB starts listening
for HTTP requests, the state has been transferred from TomcatA to TomcatB.
In case TomcatA doesn't respond, TomcatB will time out after 60 seconds, issue a log
entry, and continue starting. The session state gets transferred for each web
application that has distributable in its web.xml. (Note: To use session replication
efficiently, all your Tomcat instances should be configured the same.)
</p>
</li>
<li><b><code>TomcatA</code> receives a request, a session <code>S1</code> is created.</b>
<p>
The request coming in to TomcatA is handled exactly the same way as without session
replication, until the request is completed, at which time the
<code>ReplicationValve</code> will intercept the request before the response is
returned to the user. At this point it finds that the session has been modified,
and it uses TCP to replicate the session to TomcatB. Once the serialized data has
been handed off to the operating system's TCP logic, the request returns to the user,
back through the valve pipeline. For each request the entire session is replicated;
this allows code that modifies attributes in the session without calling setAttribute
or removeAttribute to be replicated. A useDirtyFlag configuration parameter can
be used to optimize the number of times a session is replicated.
</p>

</li>
||

<li><b><code>TomcatA</code> crashes</b>
<p>
When TomcatA crashes, TomcatB receives a notification that TomcatA has dropped out
of the cluster. TomcatB removes TomcatA from its membership list, and TomcatA will
no longer be notified of any changes that occur in TomcatB. The load balancer
will redirect the requests from TomcatA to TomcatB, and all the sessions are current.
</p>
</li>
<li><b><code>TomcatB</code> receives a request for session <code>S1</code></b>
<p>Nothing exciting; TomcatB will process the request as any other request.
</p>
</li>
<li><b><code>TomcatA</code> starts up</b>
<p>Upon startup, before TomcatA starts taking new requests and making itself
available, it will follow the startup sequence described above in steps 1) and 2).
It will join the cluster and contact TomcatB for the current state of all the sessions.
Once it receives the session state, it finishes loading and opens its HTTP/mod_jk ports.
So no requests will make it to TomcatA until it has received the session state from TomcatB.
</p>
</li>
<li><b><code>TomcatA</code> receives a request, invalidate is called on the session (<code>S1</code>)</b>
<p>The invalidate call is intercepted, and the session is queued with invalidated sessions.
When the request is complete, instead of sending out the session that has changed, it sends out
an "expire" message to TomcatB, and TomcatB will invalidate the session as well.
</p>

</li>
<li><b><code>TomcatB</code> receives a request for a new session (<code>S2</code>)</b>
<p>Same scenario as in step 3)
</p>


</li>
<li><code>TomcatA</code>: the session <code>S2</code> expires due to inactivity.
<p>The invalidate call is intercepted the same way as when a session is invalidated by the user,
and the session is queued with invalidated sessions.
At this point, the invalidated session will not be replicated across until
another request comes through the system and checks the invalid queue.
</p>
</li>
</ol>
||

<p>Phuuuhh! :)</p>

<p><b>Membership</b>
Cluster membership is established using very simple multicast pings.
Each Tomcat instance will periodically send out a multicast ping;
in the ping message the instance will broadcast its IP and the TCP port it listens on
for replication.
If an instance has not received such a ping within a given timeframe, the
member is considered dead. Very simple, and very effective!
Of course, you need to enable multicasting on your system.
</p>

<p><b>TCP Replication</b>
Once a multicast ping has been received, the member is added to the cluster.
Upon the next replication request, the sending instance will use the host and
port info and establish a TCP socket. Using this socket, it sends over the serialized data.
The reason I chose TCP sockets is because TCP has built-in flow control and guaranteed delivery.
So I know, when I send some data, it will make it there :)
</p>

<p><b>Distributed locking and pages using frames</b>
Tomcat does not keep session instances in sync across the cluster.
The implementation of such logic would be too much overhead and cause all
kinds of problems. If your client accesses the same session
simultaneously using multiple requests, then the last request
will override the other sessions in the cluster.
</p>
||

</div><h3 id="Monitoring_your_Cluster_with_JMX">Monitoring your Cluster with JMX</h3><div class="text">
<p>Monitoring is very important when you run a cluster. Some of the cluster objects are exposed as JMX MBeans.</p>
<p>Add the following parameters to your startup script:</p>
<div class="codeBox"><pre><code>set CATALINA_OPTS=\
-Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.port=%my.jmx.port% \
-Dcom.sun.management.jmxremote.ssl=false \
-Dcom.sun.management.jmxremote.authenticate=false</code></pre></div>

<p>
List of Cluster MBeans:
</p>
<table class="defaultTable">
||

<tr>
<th>Name</th>
<th>Description</th>
<th>MBean ObjectName - Engine</th>
<th>MBean ObjectName - Host</th>
</tr>

<tr>
<td>Cluster</td>
<td>The complete cluster element</td>
<td><code>type=Cluster</code></td>
<td><code>type=Cluster,host=${HOST}</code></td>
</tr>

<tr>
<td>DeltaManager</td>
<td>This manager controls the sessions and handles session replication</td>
<td><code>type=Manager,context=${APP.CONTEXT.PATH}, host=${HOST}</code></td>
<td><code>type=Manager,context=${APP.CONTEXT.PATH}, host=${HOST}</code></td>
</tr>

<tr>
<td>FarmWarDeployer</td>
<td>Manages the process of deploying an application to all nodes in the cluster</td>
<td>Not supported</td>
<td><code>type=Cluster, host=${HOST}, component=deployer</code></td>
</tr>

<tr>
<td>Member</td>
<td>Represents a node in the cluster</td>
<td><code>type=Cluster, component=member, name=${NODE_NAME}</code></td>
<td><code>type=Cluster, host=${HOST}, component=member, name=${NODE_NAME}</code></td>
</tr>

<tr>
<td>ReplicationValve</td>
<td>This valve controls the replication to the backup nodes</td>
<td><code>type=Valve,name=ReplicationValve</code></td>
<td><code>type=Valve,name=ReplicationValve,host=${HOST}</code></td>
</tr>

<tr>
<td>JvmRouteBinderValve</td>
<td>This is a cluster fallback valve to change the session ID to the current Tomcat jvmRoute.</td>
<td><code>type=Valve,name=JvmRouteBinderValve,
context=${APP.CONTEXT.PATH}</code></td>
<td><code>type=Valve,name=JvmRouteBinderValve,host=${HOST},
context=${APP.CONTEXT.PATH}</code></td>
</tr>

</table>
||
| 647 | </div><h3 id="FAQ">FAQ</h3><div class="text"> |
||
| 648 | <p>Please see <a href="https://cwiki.apache.org/confluence/display/TOMCAT/Clustering">the clustering section of the FAQ</a>.</p> |
||
| 649 | </div></div></div></div></div><footer><div id="footer"> |
||
| 650 | Copyright © 1999-2025, The Apache Software Foundation |
||
| 651 | <br> |
||
| 652 | Apache Tomcat, Tomcat, Apache, the Apache Tomcat logo and the Apache logo |
||
| 653 | are either registered trademarks or trademarks of the Apache Software |
||
| 654 | Foundation. |
||
| 655 | </div></footer></div></body></html> |