This package consists of two related programs. The first, msort, is the actual sort program. It has a command-line interface and is written in C. The code is quite standard, so it should compile and run on any POSIX-compliant system. It does need a few libraries that are not routinely installed, chief among them Ville Laurikari's TRE regular expression library, available at http://laurikari.net/tre/. See DEPENDENCIES below.

The second program, msg, is a graphical front end to msort. It is of no real use without msort, but it does not strictly depend on it: you can run it on a system lacking msort. When it starts up it will report that it cannot find msort, and it will of course not sort anything, but if it amuses you, you can still play with it.

msg is written in Tcl and uses the Tk toolkit. It is meant to be run under wish, the Tcl/Tk windowing shell. So long as you have Tcl/Tk/wish available, there is nothing much to be done to install msg. Since Tcl is interpreted, no compilation is necessary. If you do not have Tcl/Tk, don't worry: it is easy to obtain and install. For most platforms, the easiest approach is to obtain the ActiveTcl distribution from:

    http://www.activestate.com/Products/ActiveTcl

Further information is available at:

    http://billposer.org/Software/msort.html

FURTHER DETAILS ON MSORT

Msort has been developed and tested primarily under GNU/Linux. I also have access to a machine running FreeBSD and am able to test it there. According to reports from others, it compiles and runs under Solaris and Mac OS X.

The man page gives only basic information. The real reference manual is Doc/msort.pdf.

DEPENDENCIES

Msort makes use of several libraries that are not routinely installed.

The first is Ville Laurikari's regular expression library, which may be obtained from http://laurikari.net/tre/. This library is reported to work on pretty much all varieties of Unix, including Mac OS X, as well as MS Windows XP.

Second, msort requires support for Unicode normalization. It can be compiled to use either libicu (International Components for Unicode), which may be obtained from http://www.icu-project.org/, or libutf8proc, which may be obtained from http://www.flexiguided.de/publications.utf8proc.en.html. ICU is fairly widely used, so you may already have it on your system. To use it, give the option --disable-utf8proc to configure. msort defaults to using utf8proc because utf8proc is smaller and easier to install.

Third, msort optionally uses libuninum to handle numbers in systems other than the usual Indo-Arabic system. Libuninum is my own library and may be obtained from http://billposer.org/Software/libuninum.html. Packages for a variety of systems are available. If you do not need support for exotic number systems, you may build msort without libuninum. To do this, give the option --disable-uninum to configure.

Libuninum in turn uses the GNU MP library (libgmp) for arbitrary precision arithmetic. It is available from http://www.swox.com/gmp/. libgmp is required only if libuninum is linked.

To summarize, if you want to build msort with the minimum of trouble, you will need libtre and either libutf8proc or libicu. If libicu is not already installed, you will probably find it easier to go with libutf8proc, which is the default. If you do not need to handle exotic number systems, you can forgo libuninum and libgmp. To build this minimal configuration, call configure as follows:

    ./configure --disable-uninum

On some systems, the autoconfiguration system will not detect the need to link to libintl. If this happens to you, set LIBS="-lintl" when calling configure, e.g.:

    ./configure LIBS="-lintl"

INSTALLATION

If you have the GNU autoconf system available, follow the generic installation instructions in INSTALL.
In short, these are:

    ./configure
    make
    make test
    (su)
    make install-strip

The last command arranges for the symbol table to be removed from the executable file when it is installed, which results in a substantial reduction in size. If you want to be able to use a debugger on msort you will want to preserve the symbol table, in which case you should give the command:

    make install

instead.

"make test" is optional. It executes a set of regression tests. The tests run very quickly, so don't hesitate to try it. The results will be written to the file RegressionTests/TestResults.

There are a few additional tests that are not executed by "make test". These are tests that depend on the correct functioning of the locale system, including the ability to switch into certain particular locales. They are kept separate because they can fail even if msort itself is working perfectly. To execute these tests, give the command:

    make localetest

The results will be written to the file RegressionTests/LocaleTestResults.

There are several non-standard options to configure:

--disable-allocaok
    By default, in certain situations msort uses the alloca routine to allocate storage on the stack, which is faster than allocating it on the heap. However, alloca is buggy on some systems. If you give configure the option --disable-allocaok, msort will not use alloca. If you know that alloca is unreliable on your system, or if msort seems to behave strangely, configuring msort with this flag is wise.

--disable-uninum
    Build without reliance on libuninum. This eliminates the ability to handle exotic number systems.

--disable-utf8proc
    Use libicu rather than the default of utf8proc for Unicode normalization.

--disable-comparison-count
    Eliminates the comparison count. In theory this will speed things up slightly, but the speed-up is unlikely to be noticeable.

--enable-debugbuild
    This replaces the default compiler options "-g -O2" with "-ggdb -g3", causing the resulting executable to contain the maximum amount of useful information for gdb and disabling optimization. This eliminates the need for manual editing of the Makefile. It also defines the macro DEBUGBUILD in the C files, allowing conditional compilation of code for debugging.

For generic details on installation using the autoconf system, see the file INSTALL. The standard option you are most likely to be interested in is --prefix=foo, which changes the directories in which msort is installed. For example, by default the executables will be installed in /usr/local/bin. If you prefer to install the executables in your personal bin, in my case /home/poser/bin, you can configure msort using the command:

    ./configure --prefix=/home/poser

This will result in the executables being put in /home/poser/bin, the manual page in /home/poser/man/man1, etc.

If you do not have autoconf/automake, or if a problem arises, look in the Doc directory for the file OriginalMakefile and make a copy of it in this directory named Makefile. Before compiling, see if there is anything in the Makefile that you want to change. You may wish to change the default installation directories BINDIR, where the executable goes, and MANDIR, where the manual page goes. The compiler is set to gcc. If you don't have gcc, or want to use another compiler, change the value of CC. Then a simple "make" should suffice to compile msort. To install, su if necessary, then "make install".

Msort uses the TRE regular expression library to match tags and to perform substitutions on keys. This library is available for a wide range of systems but is distributed in source form, so it must be compiled and installed. Clear instructions for compiling and installing it are provided with the package. However, those not experienced with installing libraries may encounter difficulties.
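
As an aside on the configure options described earlier in this file, the following shell sketch shows how the choices combine into a single configure command line. The function msort_configure_cmd and its three arguments are hypothetical and are not part of the msort distribution; only the options --prefix, --disable-uninum and --disable-utf8proc come from the text above. The sketch prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper, not part of the msort distribution: print a configure
# command line for the choices described in this file.
#   $1  installation prefix, or "" to keep the default
#   $2  "yes" to keep libuninum support, "no" to pass --disable-uninum
#   $3  "utf8proc" (the default) or "icu" (passes --disable-utf8proc)
msort_configure_cmd() {
    prefix=$1
    uninum=$2
    normalizer=$3
    cmd="./configure"
    if [ -n "$prefix" ]; then
        cmd="$cmd --prefix=$prefix"
    fi
    if [ "$uninum" = "no" ]; then
        cmd="$cmd --disable-uninum"
    fi
    if [ "$normalizer" = "icu" ]; then
        cmd="$cmd --disable-utf8proc"
    fi
    printf '%s\n' "$cmd"
}

# Minimal configuration, installed under a personal prefix.
msort_configure_cmd /home/poser no utf8proc
```

As written, the last line prints "./configure --prefix=/home/poser --disable-uninum", which is the minimal configuration combined with a personal installation prefix.
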
One problem that you may encounter is that, even after you install the library, the linker (part of the compilation process) says that it cannot find it. This is probably the result of the library having been installed in a directory that the linker does not know about. To remedy this, you need to run the ldconfig program. On Linux systems this should be located in /sbin, a directory that contains programs normally used only by the system administrator. You will need to be root to run ldconfig.

Ldconfig indexes the standard directories /usr/lib and /lib, any directories listed in the file /etc/ld.so.conf, and directories listed on the command line. If you install the TRE library in a directory other than /lib or /usr/lib, such as the default /usr/local/lib, you will need to tell ldconfig to search that directory. You can do this either by adding the name of the directory to /etc/ld.so.conf or by supplying the directory name on the command line, e.g.:

    /sbin/ldconfig /usr/local/lib

Another approach is to give the compiler options that it will pass on to the linker to tell it where to look. There are two such options: -L and -rpath. Roughly speaking, -L tells the linker where to find the library when msort is built, and -rpath records where to find a shared library when msort is run, though usage varies somewhat between systems. It appears always to work if you just use both. This approach is especially useful if you do not have root privileges on the system.

In the msort Makefile, the relevant portion looks like this:

    msort: ${OBJS}
            ${CC} -o msort ${OBJS} -ltre

This says that "msort" depends on the files listed in the variable OBJS, namely msort.o, misc.o, etc., and that "msort" is created from these files by running the command that is the value of the variable CC. The value of CC will generally be "gcc". The flag -ltre indicates that the TRE library should be linked in.
To tell the linker that the files for the TRE library are located in /usr/local/lib/, change the second line above to:

    ${CC} -o msort ${OBJS} -L /usr/local/lib -rpath /usr/local/lib -ltre

Of course, if you don't have root privileges you probably can't install TRE in /usr/local/lib. If you install it in one of your own directories, give that directory as the argument to -L and -rpath instead, e.g.:

    ${CC} -o msort ${OBJS} -L /home/wjposer/Src/lib -rpath /home/wjposer/Src/lib -ltre

If gcc does not accept a bare -rpath option, pass it through to the linker in the form -Wl,-rpath,/usr/local/lib instead.

Some sample sort order definition files are provided in the SortOrders subdirectory. In addition to serving as examples, some of them may be useful if, for example, you need to sort country names in United Nations order, sort by the Chinese Heavenly Stems, or handle traditional Armenian dates.
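
Returning to the linker options discussed above, here is a small shell sketch that builds the options for a given library directory and checks whether a directory is already listed in an ld.so.conf-style file. Both function names are hypothetical and are not part of the msort distribution. The sketch assumes the compiler is gcc and therefore writes the run-time path in the -Wl,-rpath,DIR form. The check is deliberately naive: it looks only for a line consisting of exactly the directory name and does not follow any include directives in the file.

```shell
#!/bin/sh
# Hypothetical helper, not part of the msort distribution: print the link
# options for a TRE library installed in directory $1.
# Assumption: the compiler is gcc, which hands options prefixed with -Wl,
# through to the linker.
tre_link_flags() {
    libdir=$1
    printf '%s\n' "-L$libdir -Wl,-rpath,$libdir -ltre"
}

# Hypothetical helper: succeed if directory $1 appears on a line of its own
# in the ld.so.conf-style file $2. Include directives are not followed.
dir_is_listed() {
    grep -Fxq -- "$1" "$2"
}

# Options for a library installed in the default location.
tre_link_flags /usr/local/lib
```

As written, the last line prints "-L/usr/local/lib -Wl,-rpath,/usr/local/lib -ltre", which can replace the bare -ltre at the end of the link command in the Makefile. If dir_is_listed fails for your library directory and /etc/ld.so.conf, that is a hint that ldconfig has not yet been told about the directory.
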