+----------+ | diet 2.8 | +----------+ * Features: - New forwarders: they do not need a configuration file anymore, and they can be started at any time. - CORBA calls are more robust: Two handlers for communication failures and transient exceptions have been added. They retry the CORBA call for as long as the maximum number of authorized retries (currently 3) has not been reached. - a new tool has been introduced: dietObjects. It mimics the behavior of nameclt, but restricted to DIET CORBA objects. It also shows which object is managed by which forwarder (or through which forwarder an object is accessible). * Bug Fix: - Forwarders now unregister themselves from omniNames when they are shut down properly. - Forwarders can be launched after any DIET element - To avoid problems when a file managed by DAGDA is modified outside DAGDA, we know compute the files sizes before any transfer. - Let CMake manage rpath. Thus, you need to explicitly set your LD_LIBRARY_PATH or DYLD_LIBRARY_PATH variable. * Source code organization: - we now use a modified version of cpplint (http://google-styleguide.googlecode.com/svn/trunk/cpplint/cpplint.py) for checking our code against our coding standards. - DIET now requires Boost to compile. - removed dependency towards libuuid, we now use Boost UUID instead - endianness is now managed with Boost - replaced usage of fnmatch by boost:regex - intermediary libraries are not installed anymore - we removed all deprecated code and documentation: + JXTA: was used to build a multi-MA hierarchy using a P2P approach + DTM: the former data manager + JUXMEM: was a P2P data manager used in conjunction with DTM + FD: fault detector + FAST: performance evaluation plugin + TAU: a code profiler - old CVS headers have been removed and replaced by Doxygen headers - move FWorkflow.dtd from /etc to /usr/share/diet since it's application data (accordingly to FHS) +------------+ | diet 2.7.1 | +------------+ * Features: - CORBA calls are now more robust. Whenever a COMM_FAILURE or TRANSIENT exception occurs, a handler is called and decides whether or not the CORBA call should be retried (up to a given maximum number of retries). * Bug Fix: - Forwarders now use macros defined in debug.hh instead of using cout or cerr to print messages - fixed a bug when parsing forwarders' net-config files which use the ' = ' separator +----------+ | diet 2.7 | +----------+ * Features: - A new traceLevel has appeared: TRACE_ERR_AND_WARN=1, when this traceLevel is selected only error and warning messages are printed. The default value is still TRACE_MAIN_STEPS, but it now equals 2, thus check your configuration files (example configuration files have been updated accordingly). The NO_TRACE level do not print anything, not even warnings and errors. NO_TRACE = 0 DIET do not print anything, not even warning and error messages. TRACE_ERR_AND_WARN = 1 DIET only prints warnings and errors on the standard error output. TRACE_MAIN_STEPS = 2 [default] DIET prints information on the main steps of a call. TRACE_ALL_STEPS = 5 DIET prints information on all internal steps too. TRACE_MAX_VALUE = 10 DIET prints all the communication structures too. >10 (traceLevel - 10) is given to the ORB to print CORBA messages too. - A new method has been added to DIET_server.h: diet_get_SeD_services. Given the name of a SeD it returns the list of services of this SeD. - forwarders now find an unused port during initialization and write it in the /tmp/DIET-forwarder-ior-*.tmp file which is used by its peer. Thus, you do not necessarily need to specify the --remote-port option anymore. - forwarders have a new option, --tunnel-wait, to specify the time a forwarder has to wait before considering that the tunnel is open (default is 10s). - a new API for administrating the DIET hierarchy has been created. Currently three methods are available: + diet_remove_from_hierarchy: kill an element (MA, LA, SeD) + diet_change_parent: modify the parent of an LA or a SeD + diet_disconnect_from_hierarchy: remove the link between an LA or a SeD and its parent, i.e., the sub-hierarchy becomes unavailable. See include/DIET_admin.h for more details. - The SLURM batch scheduling system is now supported by the SeD Batch. * Compilation: - Documentation can now be compiled directly from the main CMakeLists.txt. For this, set the DIET_BUILD_DOC to ON. Documentation can though still be compiled separately from the doc directory. * Source code organization: - the old parser has been definitively removed - all elements of DIET apart from the dietForwarder now use macros in debug.hh to print out messages instead of using cout and cerr. - a new include file has been added: DIET_admin.h * Bug fix: - A few memory leaks - DIET elements do not hang anymore when a directory is given as a configuration file - Forwarder: really add all local addresses when using the "localhost" keyword in a reject or accept rule. (Tested only on Linux and Mac OSX currently. Still need to test on AIX, Cygwin and Windows.) - During initialization, if the POA already exists, then we now try to recover it instead of crashing. - SSH keys were not handled properly - async calls hanged when calling a profile with inout parameters - DAGDA: fixed bugs with paramstrings - New defintion of FNM_CASEFOLD for AIX compatibility (AIX 5.1 or later) - DIET elements now correctly unsubscribe from omniNames +------------+ | diet 2.6.1 | +------------+ * Features: - dietAgent now has command line help (./dietAgent --help) - SIGINT and SIGTERM can be used to properly kill the agents and the SeDs * Source code organization: - New methods for retrieving options have been added. It can take into account options passed in the configuration file, on the command line, and in environment variables - Old parser is still used by agent/localAgentJNI.cc, agent/masterAgentJNI.cc and utils/nodes/FASTMgr.cc. It should be removed soon. - We now use cppcheck to check the quality of the code. This helped remove a few memleaks, and improve the code: - Prefer prefix ++/-- operators for non-primitive types. - reduce variables scopes - realloc mistake, variable nulled but not freed upon failure - free resources opened with fopen - Use TRACE_TEXT instead of cout to control the verbosity. - Removed a few compilation warnings * Bug fix: - Problems with inout in workflows (bugs 154, 155 and 156) - Fixed initialization of containers in wf - Correct bugs with inout in Dagda - Dynamic hierarchies work again * TODO: - add command line options in maDagAgent and dietForwarder - definitively remove the old parser +----------+ | diet 2.6 | +----------+ * External DIET tools are back DIET now works once more with LogService DIET now works once more with GoDIET * Compilation improvements: - DIET compilation needs the log service at compilation to use the log service at runtime. - Add libraries versions when compiling the 3 main libraries - Fix the cmake CheckFunctionExist problem - Correct the cori compil because use of cmake flags in source code (->add_definitions) - Remove DTM from ccmake compilation because deprecated. Nevertheless DTM data manager is still available with: > cmake -DDIET_USE_DTM=1 -DDIET_USE_DAGDA=0 -DDIET_USE_DYNAMICS=0 .. > make * Source code organization: - SeD batch source files have been moved from src/utils to src/utils/batch - Move parsers error codes to DIET_grpc.h, people using DIET_* error codes might have to recompile their applications. +----------+ | diet 2.5 | +----------+ * Compilation & installation improvements - DIET compilation now produces only three main dynamic libraries: DIET_client, DIET_SeD and DIET_Dagda - The DIET internal parts are now compiled as static libraries and are linked to the three dynamic ones. The installation copies only these three libraries. * DIET forwarder - A new DIET component: The DIET forwarder that forwards the DIET CORBA messages through SSH tunnel. - DIET forwarders allow to build DIET hierarchy on different networks that can only be reached through SSH. * DIET cloud - DIET cloud now works with Amazon EC2 +----------+ | diet 2.4 | +----------+ * Windows version - Added the support of Windows platform inside Cygwin. DIET is now available on Windows through a port in Cygwin. It as been successfully tested on Cygwin 1.5.25 and Cygwin 1.7.1. - It should be noticed that the workflow module is not available on this platform. * Compilation improvement - Compilation process in general has been improved to remove cyclic dependancies between libraries during the link. * Dynamic hierarchy - Added the possibility to dynamically change the hierarchy shape via CORBA calls. * Data Management - DAGDA can now return transfers progressions. Accessible from the API when activating a compilation option. - Data containers are introduced to manage variable-length data sets. * Workflow Management - The workflow engine now supports in addition to the existing MAdag language a new workflow language that provides a higher level of abstraction with loops and conditional structures as well as operators on the data flows and multiple instanciation of services for parallel processing (language defined by the Gwendia project: http://gwendia.polytech.unice.fr +----------+ | diet 2.3 | +----------+ * Parallel Subsmissions - The server API has been extended to provide transparent submission for parallel tasks on parallel resources, eventually managed by Batch systems. * CoRI Collectors of ressource information for batch systems - The CoRI system has been extented for batch systems: still only a few information, but one can now get some information on the batch system state, such as the number of idle nodes. These can be used to dynamically tune a batch script. * Loadable Schedulers - Added support for scheduler module at agent level. The user can define his own scheduler class to be loaded by the agent and used to select the SeD(s) returned to the client. * Data Arrangement for Grid and Distributed Application - The users can now choose a new data manager which allows explicit and implicit data distribution/replication. DAGDA extends the standard DIET API with data management functions which can be called from a client, a server or an agent through a plugin scheduler. * Workflow the workflow management system of DIET has been redesigned. 4 differents scheduling heuristics have been implemented: - basic provides HEFT heuristic for the dag - multi-heft provides multi-HEFT heuristic when several Dag are submit to the MA_DAG - multi-aging_heft provides multi-HEFT with modification of node rank according to their age in the system. - foft provides multi-HEFT with modification of node selection according to the slowdown of each dag. See UserManual for more details. * All bug was resolved. If you found a new one please refer to http://graal.ens-lyon.fr/bugzilla * DietLogComponents now implement the test() method, in order to allow LogCentral to test if an old component is reachable when a new component tries to sign in with the same name. +----------+ | diet 2.2 | +----------+ * Workflow - Workflow support added to DIET. A workflow engine and a special agent the maDagAgent were added to manage and schedule workflow submissions. * Fault Tolerance - Added support for Chandra&Aguilera&Toueg failure detector. This failure detector may be used to observe failures of components and restart them. Failure detector triggers callback upon definitive failure suspicion of an observed process. Client provide a restart procedure for failed SeDs. * JuxMem - Added support for asynchronous calls and refactorized implementation - Updated for JuxMem 0.3 * CMake - DIET used a cmake based building process (autotool version was abandoned) * All bug was resolved. If you found a new one please refer to http://graal.ens-lyon.fr/bugzilla +----------+ | diet 2.1 | +----------+ * CoRI Collectors of ressource information - This new functionality allows to acces in an easy way to information about every SeD. In conjunction with the plug-in scheduler, you can now specify your own scheduler and your own method to collect the ressource information. * JuxMem - JuxMem support inside DIET has been updated for JuxMem 0.2. Removed C++ wrapper inside src/utils (included in JuxMem). * JXTA Multi-MA - Changes inside the DIET code caused the JXTA Multi-MA to be broken. It's now repaired. Moreover, the JXTA SeD will be automatically updated when the DIET server API is modified. * DIET with MacOSX - Bug corrections to compile on MacOSX. Remark : DIET requires Xcode 2.1 and the CVS version of omniORB 4.1.x Compilation has been tested on MacOSX 10.4.x * CMake - DIET now has a cmake based building process. and gcc 4.x support, few bug fix... * Known bugs but not resolved - INOUT parameters are not changed in asynchronous mode (bug #9 in bugzilla) +----------+ | diet 2.0 | +----------+ * Plug-in - add support for user-defined performance metrics at SeD-level, and allow users to specify priorities for metrics to control scheduling process at agents. * LogService - DIET can now send enhanced log information including tracking read and write operations in JuxMem and tracking user-defined plug-in scheduler details. * JuxMem - JuxMem is a distributed memory system based on JXTA peer-to-peer technologies. DIET now how the ability to store and retrieve data in JuxMem. Experimental. * Multi-MA - A second type of multi-MA has been integrated that is not based on JXTA. Experimental. * Bug corrections - file length limits are too small (bug #10 in bugzilla) - dmat_manips clients crash when useAsyncAPI = 0 in config file. option not needed, so removed from config files. (bug #11 in bugzilla) - when building in a separate directory make install fails due to dependency on src/libs/*.jar. Dependency removed with not compiling with JXTA options. (bug #12) - union type embedded in a struct in idl/common_types.idl can not be compiled by java compiler. structure re-arranged to correct the problem (bug #13) - when user re-uses the same profile for multiple service declarations without calling free in between, MA crashes. bug was introduced after release 1.1, and has been corrected for release 2.0. (bug #14) * Known bugs but not resolved - INOUT parameters are not changed in asynchronous mode (bug #9 in bugzilla) +----------+ | diet 1.1 | +----------+ * logService - add LogManager to DIET component (MAs, LAs and SeDs) to enable DIET to suscribe to LogService. ( use configure with : --enable--logservice option) * Data persistance - add data persistance with DTM in DIET, now SeDs can store DATA. * DIET multi-MA with JXTA - a prototype of multi-MA based on JXTA. ( use configure with : --enable-JXTA-mode option) - it is only a prototype! * change in config file - name of variables "endPointPort" and "endHostname" change with "dietPort" and "dietHostname". * bugs correction - bugs in diet_wait_or() function (bug #3 in bugzilla) - detection of omniORB in darwin (bug #4 in bugzilla) - empty file for DATA_PERSISTANCE even if we don't use it (bug #6 in bugzilla) - option --repeat in dmat_manip/client.c example (bug #7 in bugzilla) - parsing file on 64 bit architectures (bug #8 in bugzilla) * known bugs but not resolved - INOUT parameters are not changed in asynchronous mode ( bug #9 in bugzilla) - file length are too short (bug #10 in bugzilla) - dmat_manips clients crash when useAsyncAPI = 0 ( bug #11 in bugzilla) +----------+ | diet 1.0 | +----------+ * GNU autotools: automake and autoconf. - Apparition of this ChangeLog. - Ascendant compatibility with the old "manual" configure + all options for abling/disabling modules (doc, examples, examples/BLAS and examples/ScaLAPACK), for giving omniORB and FAST paths, installation, etc. - For maintainers, three other modules are available: stats, Cichlid (as in 0.6.4), and multi-MA. * Separation of LA and MA. Two different programs are now built: DIET_LA and DIET_MA. Of course the configuration file is modified (the first line is obsolete). * Contract checking. The client asks the server for a re-estimation of the request just before it invokes the solve. But the data transfer does not begin at the same time yet. * Communication structures. - The IDL types are reorganized to take into account the latest discussions about the algorithms to be implemented in DIET. There is no need of a decision type left (the client needs complete information of a response for contract checking). - The attributes of a server include NWS forecasts if FAST was not able to estimate the computation time, and a sequence of communication times (one for each argument, even OUT ones, when it is possible to guess their sizes). - The data loc are removed. Let us the data manager take care of that. * Scheduling - Big purge in the agent ! It does not compute the persistent data transfer times (which is done at the SeD level by the data manager), but only the non-persstent ones, assuming the transfer time is approximately the one through all nodes of hierarchy downto the SeD. - The scheduling part of the agent is deported into various scheduling classes. The final goal is to easily switch from a scheduler to another, and make it depend on the request itself. The various scheduling classes are structured as follows: there are Schedulers, which can sort and aggregate servers that have similar features (such as the ones that got a FAST estimation, the ones that have only NWS forecasts, etc.), and there is a GlobalScheduler, which applies the schedulers in a given order for sorting or aggregating. - Some schedulers are defined (with their interface) in Schedulers.hh, and GlobalSchedulers are defined in GlobalSchedulers.hh. Their sort method is dead code so far (only aggregate is used, since all agents give their parent a sorted list of servers). * CORBA part: - The ORBMgr is the only interface to the ORB. - New IDL files are added: Agent, MasterAgent and LocalAgent.idl for the LA/MA separation, but also registration_and_request, response.idl, since communication structures have been reorganized. - A check-contract mechanism is implemented to avoid crossed-request issues. * Utils part: - Merge all traceLevels into one static variable in the module debug. - Add Parsers, a class that performs the parsing of configuration files. - Manage the string ID for data.