<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html xmlns:fn="http://www.w3.org/2005/02/xpath-functions"> <head> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> <link rel="stylesheet" href="../otp_doc.css" type="text/css"> <title>Erlang -- Distributed Applications</title> </head> <body bgcolor="white" text="#000000" link="#0000ff" vlink="#ff00ff" alink="#ff0000"><div id="container"> <script id="js" type="text/javascript" language="JavaScript" src="../js/flipmenu/flipmenu.js"></script><script id="js2" type="text/javascript" src="../js/erlresolvelinks.js"></script><script language="JavaScript" type="text/javascript"> <!-- function getWinHeight() { var myHeight = 0; if( typeof( window.innerHeight ) == 'number' ) { //Non-IE myHeight = window.innerHeight; } else if( document.documentElement && ( document.documentElement.clientWidth || document.documentElement.clientHeight ) ) { //IE 6+ in 'standards compliant mode' myHeight = document.documentElement.clientHeight; } else if( document.body && ( document.body.clientWidth || document.body.clientHeight ) ) { //IE 4 compatible myHeight = document.body.clientHeight; } return myHeight; } function setscrollpos() { var objf=document.getElementById('loadscrollpos'); document.getElementById("leftnav").scrollTop = objf.offsetTop - getWinHeight()/2; } function addEvent(obj, evType, fn){ if (obj.addEventListener){ obj.addEventListener(evType, fn, true); return true; } else if (obj.attachEvent){ var r = obj.attachEvent("on"+evType, fn); return r; } else { return false; } } addEvent(window, 'load', setscrollpos); //--></script><div id="leftnav"><div class="innertube"> <img alt="Erlang logo" src="../erlang-logo.png"><br><small><a href="users_guide.html">User's Guide</a><br><a href="../pdf/otp-system-documentation-5.8.5.pdf">PDF</a><br><a href="../index.html">Top</a></small><p><strong>OTP Design Principles</strong><br><strong>User's Guide</strong><br><small>Version 5.8.5</small></p> <br><a 
href="javascript:openAllFlips()">Expand All</a><br><a href="javascript:closeAllFlips()">Contract All</a><p><small><strong>Chapters</strong></small></p> <ul class="flipMenu" imagepath="../js/flipmenu"> <li id="no" title="Overview" expanded="false">Overview<ul> <li><a href="des_princ.html"> Top of chapter </a></li> <li title="Supervision Trees"><a href="des_princ.html#id149638">Supervision Trees</a></li> <li title="Behaviours"><a href="des_princ.html#id149564">Behaviours</a></li> <li title="Applications"><a href="des_princ.html#id147833">Applications</a></li> <li title="Releases"><a href="des_princ.html#id147689">Releases</a></li> <li title="Release Handling"><a href="des_princ.html#id147663">Release Handling</a></li> </ul> </li> <li id="no" title="Gen_Server Behaviour" expanded="false">Gen_Server Behaviour<ul> <li><a href="gen_server_concepts.html"> Top of chapter </a></li> <li title="Client-Server Principles"><a href="gen_server_concepts.html#id147574">Client-Server Principles</a></li> <li title="Example"><a href="gen_server_concepts.html#id148317">Example</a></li> <li title="Starting a Gen_Server"><a href="gen_server_concepts.html#id148281">Starting a Gen_Server</a></li> <li title="Synchronous Requests - Call"><a href="gen_server_concepts.html#id148102">Synchronous Requests - Call</a></li> <li title="Asynchronous Requests - Cast"><a href="gen_server_concepts.html#id148083">Asynchronous Requests - Cast</a></li> <li title="Stopping"><a href="gen_server_concepts.html#id144138">Stopping</a></li> <li title="Handling Other Messages"><a href="gen_server_concepts.html#id142970">Handling Other Messages</a></li> </ul> </li> <li id="no" title="Gen_Fsm Behaviour" expanded="false">Gen_Fsm Behaviour<ul> <li><a href="fsm.html"> Top of chapter </a></li> <li title="Finite State Machines"><a href="fsm.html#id145620">Finite State Machines</a></li> <li title="Example"><a href="fsm.html#id143653">Example</a></li> <li title="Starting a Gen_Fsm"><a href="fsm.html#id143621">Starting a 
Gen_Fsm</a></li> <li title="Notifying About Events"><a href="fsm.html#id147964">Notifying About Events</a></li> <li title="Timeouts"><a href="fsm.html#id140288">Timeouts</a></li> <li title="All State Events"><a href="fsm.html#id140323">All State Events</a></li> <li title="Stopping"><a href="fsm.html#id139968">Stopping</a></li> <li title="Handling Other Messages"><a href="fsm.html#id133942">Handling Other Messages</a></li> </ul> </li> <li id="no" title="Gen_Event Behaviour" expanded="false">Gen_Event Behaviour<ul> <li><a href="events.html"> Top of chapter </a></li> <li title="Event Handling Principles"><a href="events.html#id149802">Event Handling Principles</a></li> <li title="Example"><a href="events.html#id149850">Example</a></li> <li title="Starting an Event Manager"><a href="events.html#id149175">Starting an Event Manager</a></li> <li title="Adding an Event Handler"><a href="events.html#id149248">Adding an Event Handler</a></li> <li title="Notifying About Events"><a href="events.html#id149322">Notifying About Events</a></li> <li title="Deleting an Event Handler"><a href="events.html#id149387">Deleting an Event Handler</a></li> <li title="Stopping"><a href="events.html#id150266">Stopping</a></li> <li title="Handling Other Messages"><a href="events.html#id150314">Handling Other Messages</a></li> </ul> </li> <li id="no" title="Supervisor Behaviour" expanded="false">Supervisor Behaviour<ul> <li><a href="sup_princ.html"> Top of chapter </a></li> <li title="Supervision Principles"><a href="sup_princ.html#id150402">Supervision Principles</a></li> <li title="Example"><a href="sup_princ.html#id150424">Example</a></li> <li title="Restart Strategy"><a href="sup_princ.html#id150487">Restart Strategy</a></li> <li title="Maximum Restart Frequency"><a href="sup_princ.html#id150560">Maximum Restart Frequency</a></li> <li title="Child Specification"><a href="sup_princ.html#id150613">Child Specification</a></li> <li title="Starting a Supervisor"><a 
href="sup_princ.html#id150882">Starting a Supervisor</a></li> <li title="Adding a Child Process"><a href="sup_princ.html#id150987">Adding a Child Process</a></li> <li title="Stopping a Child Process"><a href="sup_princ.html#id151028">Stopping a Child Process</a></li> <li title="Simple-One-For-One Supervisors"><a href="sup_princ.html#id151074">Simple-One-For-One Supervisors</a></li> <li title="Stopping"><a href="sup_princ.html#id143159">Stopping</a></li> </ul> </li> <li id="no" title="Sys and Proc_Lib" expanded="false">Sys and Proc_Lib<ul> <li><a href="spec_proc.html"> Top of chapter </a></li> <li title="Simple Debugging"><a href="spec_proc.html#id151152">Simple Debugging</a></li> <li title="Special Processes"><a href="spec_proc.html#id151223">Special Processes</a></li> <li title="User-Defined Behaviours"><a href="spec_proc.html#id151769">User-Defined Behaviours</a></li> </ul> </li> <li id="no" title="Applications" expanded="false">Applications<ul> <li><a href="applications.html"> Top of chapter </a></li> <li title="Application Concept"><a href="applications.html#id151915">Application Concept</a></li> <li title="Application Callback Module"><a href="applications.html#id151982">Application Callback Module</a></li> <li title="Application Resource File"><a href="applications.html#id152096">Application Resource File</a></li> <li title="Directory Structure"><a href="applications.html#id152345">Directory Structure</a></li> <li title="Application Controller"><a href="applications.html#id152471">Application Controller</a></li> <li title="Loading and Unloading Applications"><a href="applications.html#id152504">Loading and Unloading Applications</a></li> <li title="Starting and Stopping Applications"><a href="applications.html#id152566">Starting and Stopping Applications</a></li> <li title="Configuring an Application"><a href="applications.html#id152654">Configuring an Application</a></li> <li title="Application Start Types"><a href="applications.html#id152853">Application 
Start Types</a></li> </ul> </li> <li id="no" title="Included Applications" expanded="false">Included Applications<ul> <li><a href="included_applications.html"> Top of chapter </a></li> <li title="Definition"><a href="included_applications.html#id152996">Definition</a></li> <li title="Specifying Included Applications"><a href="included_applications.html#id153061">Specifying Included Applications</a></li> <li title="Synchronizing Processes During Startup"><a href="included_applications.html#id153086">Synchronizing Processes During Startup</a></li> </ul> </li> <li id="loadscrollpos" title="Distributed Applications" expanded="true">Distributed Applications<ul> <li><a href="distributed_applications.html"> Top of chapter </a></li> <li title="Definition"><a href="distributed_applications.html#id153302">Definition</a></li> <li title="Specifying Distributed Applications"><a href="distributed_applications.html#id153338">Specifying Distributed Applications</a></li> <li title="Starting and Stopping Distributed Applications"><a href="distributed_applications.html#id153544">Starting and Stopping Distributed Applications</a></li> <li title="Failover"><a href="distributed_applications.html#id153646">Failover</a></li> <li title="Takeover"><a href="distributed_applications.html#id153789">Takeover</a></li> </ul> </li> <li id="no" title="Releases" expanded="false">Releases<ul> <li><a href="release_structure.html"> Top of chapter </a></li> <li title="Release Concept"><a href="release_structure.html#id153986">Release Concept</a></li> <li title="Release Resource File"><a href="release_structure.html#id154039">Release Resource File</a></li> <li title="Generating Boot Scripts"><a href="release_structure.html#id154179">Generating Boot Scripts</a></li> <li title="Creating a Release Package"><a href="release_structure.html#id154272">Creating a Release Package</a></li> <li title="Directory Structure"><a href="release_structure.html#id154391">Directory Structure</a></li> </ul> </li> <li id="no" 
title="Release Handling" expanded="false">Release Handling<ul> <li><a href="release_handling.html"> Top of chapter </a></li> <li title="Release Handling Principles"><a href="release_handling.html#id154647">Release Handling Principles</a></li> <li title="Requirements"><a href="release_handling.html#id154891">Requirements</a></li> <li title="Distributed Systems"><a href="release_handling.html#id154984">Distributed Systems</a></li> <li title="Release Handling Instructions"><a href="release_handling.html#id155010">Release Handling Instructions</a></li> <li title="Application Upgrade File"><a href="release_handling.html#id155441">Application Upgrade File</a></li> <li title="Release Upgrade File"><a href="release_handling.html#id155623">Release Upgrade File</a></li> <li title="Installing a Release"><a href="release_handling.html#id155779">Installing a Release</a></li> <li title="Updating Application Specifications"><a href="release_handling.html#id156253">Updating Application Specifications</a></li> </ul> </li> <li id="no" title="Appup Cookbook" expanded="false">Appup Cookbook<ul> <li><a href="appup_cookbook.html"> Top of chapter </a></li> <li title="Changing a Functional Module"><a href="appup_cookbook.html#id156440">Changing a Functional Module</a></li> <li title="Changing a Residence Module"><a href="appup_cookbook.html#id156462">Changing a Residence Module</a></li> <li title="Changing a Callback Module"><a href="appup_cookbook.html#id156501">Changing a Callback Module</a></li> <li title="Changing Internal State"><a href="appup_cookbook.html#id156552">Changing Internal State</a></li> <li title="Module Dependencies"><a href="appup_cookbook.html#id156687">Module Dependencies</a></li> <li title="Changing Code For a Special Process"><a href="appup_cookbook.html#id156857">Changing Code For a Special Process</a></li> <li title="Changing a Supervisor"><a href="appup_cookbook.html#id157024">Changing a Supervisor</a></li> <li title="Adding or Deleting a Module"><a 
href="appup_cookbook.html#id157270">Adding or Deleting a Module</a></li> <li title="Starting or Terminating a Process"><a href="appup_cookbook.html#id157295">Starting or Terminating a Process</a></li> <li title="Adding or Removing an Application"><a href="appup_cookbook.html#id157314">Adding or Removing an Application</a></li> <li title="Restarting an Application"><a href="appup_cookbook.html#id157345">Restarting an Application</a></li> <li title="Changing an Application Specification"><a href="appup_cookbook.html#id157387">Changing an Application Specification</a></li> <li title="Changing Application Configuration"><a href="appup_cookbook.html#id157411">Changing Application Configuration</a></li> <li title="Changing Included Applications"><a href="appup_cookbook.html#id157444">Changing Included Applications</a></li> <li title="Changing Non-Erlang Code"><a href="appup_cookbook.html#id157684">Changing Non-Erlang Code</a></li> <li title="Emulator Restart"><a href="appup_cookbook.html#id157768">Emulator Restart</a></li> </ul> </li> </ul> </div></div> <div id="content"> <div class="innertube"> <h1>9 Distributed Applications</h1> <h3><a name="id153302">9.1 Definition</a></h3> <p>In a distributed system with several Erlang nodes, it may be necessary to control applications in a distributed manner. If the node where a certain application is running goes down, the application should be restarted at another node.</p> <p>Such an application is called a <strong>distributed application</strong>. Note that it is the control of the application that is distributed. Any application can of course be distributed in the sense that it, for example, uses services on other nodes.</p> <p>Because a distributed application may move between nodes, some addressing mechanism is required to ensure that other applications can reach it, regardless of which node it currently executes on. 
This issue is not addressed here, but the Kernel module <span class="code">global</span> or STDLIB module <span class="code">pg</span> can be used for this purpose.</p> <h3><a name="id153338">9.2 Specifying Distributed Applications</a></h3> <p>Distributed applications are controlled by both the application controller and a distributed application controller process, <span class="code">dist_ac</span>. Both these processes are part of the <span class="code">kernel</span> application. Therefore, distributed applications are specified by configuring the <span class="code">kernel</span> application, using the following configuration parameter (see also <span class="code">kernel(6)</span>):</p> <dl> <dt><strong><span class="code">distributed = [{Application, [Timeout,] NodeDesc}]</span></strong></dt> <dd> <p>Specifies where the application <span class="code">Application = atom()</span> may execute. <span class="code">NodeDesc = [Node | {Node,...,Node}]</span> is a list of node names in priority order. The order between nodes in a tuple is undefined.</p> <p><span class="code">Timeout = integer()</span> specifies how many milliseconds to wait before restarting the application at another node. Defaults to 0.</p> </dd> </dl> <p>For distribution of application control to work properly, the nodes where a distributed application may run must contact each other and negotiate where to start the application. 
This is done using the following <span class="code">kernel</span> configuration parameters:</p> <dl> <dt><strong><span class="code">sync_nodes_mandatory = [Node]</span></strong></dt> <dd>Specifies which other nodes must be started (within the timeout specified by <span class="code">sync_nodes_timeout</span>).</dd> <dt><strong><span class="code">sync_nodes_optional = [Node]</span></strong></dt> <dd>Specifies which other nodes can be started (within the timeout specified by <span class="code">sync_nodes_timeout</span>).</dd> <dt><strong><span class="code">sync_nodes_timeout = integer() | infinity</span></strong></dt> <dd>Specifies how many milliseconds to wait for the other nodes to start.</dd> </dl> <p>When started, the node waits for all nodes specified by <span class="code">sync_nodes_mandatory</span> and <span class="code">sync_nodes_optional</span> to come up. When all nodes have come up, or when all mandatory nodes have come up and the time specified by <span class="code">sync_nodes_timeout</span> has elapsed, all applications are started. If not all mandatory nodes have come up, the node terminates.</p> <p>Example: An application <span class="code">myapp</span> should run at the node <span class="code">cp1@cave</span>. If this node goes down, <span class="code">myapp</span> should be restarted at <span class="code">cp2@cave</span> or <span class="code">cp3@cave</span>. 
A system configuration file <span class="code">cp1.config</span> for <span class="code">cp1@cave</span> could look as follows:</p> <div class="example"><pre>
[{kernel,
  [{distributed, [{myapp, 5000, [cp1@cave, {cp2@cave, cp3@cave}]}]},
   {sync_nodes_mandatory, [cp2@cave, cp3@cave]},
   {sync_nodes_timeout, 5000}
  ]
 }
].</pre></div> <p>The system configuration files for <span class="code">cp2@cave</span> and <span class="code">cp3@cave</span> are identical, except for the list of mandatory nodes, which is to be <span class="code">[cp1@cave, cp3@cave]</span> for <span class="code">cp2@cave</span> and <span class="code">[cp1@cave, cp2@cave]</span> for <span class="code">cp3@cave</span>.</p> <div class="note"> <div class="label">Note</div> <div class="content"><p>All involved nodes must have the same value for <span class="code">distributed</span> and <span class="code">sync_nodes_timeout</span>, or the behaviour of the system is undefined.</p></div> </div> <h3><a name="id153544">9.3 Starting and Stopping Distributed Applications</a></h3> <p>When all involved (mandatory) nodes have been started, the distributed application can be started by calling <span class="code">application:start(Application)</span> at <strong>all of these nodes.</strong></p> <p>It is also possible to use a boot script (see <span class="bold_code"><a href="release_structure.html">Releases</a></span>) that automatically starts the application.</p> <p>The application is started at the first operational node listed in the <span class="code">distributed</span> configuration parameter. The application is started as usual. 
That is, an application master is created and calls the application callback function:</p> <div class="example"><pre> Module:start(normal, StartArgs)</pre></div> <p>Example: Continuing the example from the previous section, the three nodes are started, specifying the system configuration file:</p> <div class="example"><pre>
&gt; <span class="bold_code">erl -sname cp1 -config cp1</span>
&gt; <span class="bold_code">erl -sname cp2 -config cp2</span>
&gt; <span class="bold_code">erl -sname cp3 -config cp3</span></pre></div> <p>When all nodes are up and running, <span class="code">myapp</span> can be started. This is achieved by calling <span class="code">application:start(myapp)</span> at all three nodes. It is then started at <span class="code">cp1</span>, as shown in the figure below.</p> <a name="dist1"></a> <img alt="IMAGE MISSING" src="../design_principles/dist1.gif"><br> <em>Figure 9.1: Application myapp - Situation 1</em> <p>Similarly, the application must be stopped by calling <span class="code">application:stop(Application)</span> at all involved nodes.</p> <h3><a name="id153646">9.4 Failover</a></h3> <p>If the node where the application is running goes down, the application is restarted (after the specified timeout) at the first operational node listed in the <span class="code">distributed</span> configuration parameter. 
This is called a <strong>failover</strong>.</p> <p>The application is started the normal way at the new node, that is, by the application master calling:</p> <div class="example"><pre> Module:start(normal, StartArgs)</pre></div> <p>Exception: If the application has the <span class="code">start_phases</span> key defined (see <span class="bold_code"><a href="included_applications.html">Included Applications</a></span>), then the application is instead started by calling:</p> <div class="example"><pre> Module:start({failover, Node}, StartArgs)</pre></div> <p>where <span class="code">Node</span> is the terminated node.</p> <p>Example: If <span class="code">cp1</span> goes down, the system checks which of the other nodes, <span class="code">cp2</span> or <span class="code">cp3</span>, has the fewest running applications, but waits 5 seconds for <span class="code">cp1</span> to restart. If <span class="code">cp1</span> does not restart and <span class="code">cp2</span> runs fewer applications than <span class="code">cp3</span>, then <span class="code">myapp</span> is restarted on <span class="code">cp2</span>.</p> <a name="dist2"></a> <img alt="IMAGE MISSING" src="../design_principles/dist2.gif"><br> <em>Figure 9.2: Application myapp - Situation 2</em> <p>Suppose now that <span class="code">cp2</span> goes down as well and does not restart within 5 seconds. <span class="code">myapp</span> is now restarted on <span class="code">cp3</span>.</p> <a name="dist3"></a> <img alt="IMAGE MISSING" src="../design_principles/dist3.gif"><br> <em>Figure 9.3: Application myapp - Situation 3</em> <h3><a name="id153789">9.5 Takeover</a></h3> <p>If a node is started that has higher priority, according to <span class="code">distributed</span>, than the node where a distributed application is currently running, the application is restarted at the new node and stopped at the old node. 
This is called a <strong>takeover</strong>.</p> <p>The application is started by the application master calling:</p> <div class="example"><pre> Module:start({takeover, Node}, StartArgs)</pre></div> <p>where <span class="code">Node</span> is the old node.</p> <p>Example: If <span class="code">myapp</span> is running at <span class="code">cp3</span>, and <span class="code">cp2</span> now restarts, <span class="code">myapp</span> is not restarted at <span class="code">cp2</span>, because the order between nodes <span class="code">cp2</span> and <span class="code">cp3</span> is undefined.</p> <a name="dist4"></a> <img alt="IMAGE MISSING" src="../design_principles/dist4.gif"><br> <em>Figure 9.4: Application myapp - Situation 4</em> <p>However, if <span class="code">cp1</span> restarts as well, the function <span class="code">application:takeover/2</span> moves <span class="code">myapp</span> to <span class="code">cp1</span>, because <span class="code">cp1</span> has a higher priority than <span class="code">cp3</span> for this application. In this case, <span class="code">Module:start({takeover, cp3@cave}, StartArgs)</span> is executed at <span class="code">cp1</span> to start the application.</p> <a name="dist5"></a> <img alt="IMAGE MISSING" src="../design_principles/dist5.gif"><br> <em>Figure 9.5: Application myapp - Situation 5</em> </div> <div class="footer"> <hr> <p>Copyright © 1997-2011 Ericsson AB. All Rights Reserved.</p> </div> </div> </div></body> </html>