README: Readme for samefile
***************************

For instructions on how to compile and install samefile read the file
INSTALL.

What is it?
***********

The samefile program helps you find identical files on your file
systems. Identical files don't have to have the same name to be found
by samefile. The input is a list of filenames, e.g. from ls(1) or
find(1). The output is a list of file name pairs along with some other
useful information.

Samefile comes in handy when you

- are notoriously low on disk space. It will probably find megabytes of
  redundant files that you can then hardlink, symlink or remove.
- want to know what files in two subdirectory trees have (not) changed.

The program is reasonably fast, even when checking every file against
every other. On the author's system with 500MB in 25000 files in 5 file
systems on a 1GB disk the command

   find / -print | samefile -v >/dev/null

uses 20 seconds user and 145 seconds system time (it's a 486DX2-66
running FreeBSD 2.1.0) and reports 30MB in identical files. Real time
was about five times user plus system on an otherwise idle system. This
indicates that the limiting factor is I/O latency, not CPU power. The
observation was confirmed by ps(1), which often showed the process in D
(short term disk wait) state.

Execution times drop significantly if only files larger than some value
are compared. The same command for files that are at least 10K in size,
e.g.

   find / -print | samefile -v -g 10240 >/dev/null

takes only 14 user and 58 system seconds, 240 seconds real.

Will it compile on my system?
*****************************

The source is believed to be POSIX 1003.1-1988 conforming and should
compile on any nearly POSIX-compliant system with a Standard C compiler.
The configuration process will figure out if your system supports
features beyond POSIX, in particular symbolic links and memory-mapped
I/O, and use them to improve efficiency.

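To illustrate how an optional feature such as memory mapped I/O might be
used with a portable fallback, here is a minimal sketch. It is not taken
from the samefile source. The function name same_contents and the macro
HAVE_MMAP are assumptions made for this example; the macro stands for
whatever a configuration step would define when mmap(2) is available.
The function compares two files byte by byte, after first ruling out
files of different size.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

#ifdef HAVE_MMAP                /* hypothetical macro set by a configure step */
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#endif

/* Return 1 if both files have identical contents, 0 if they differ,
 * -1 on error. Illustrative only; not the samefile implementation. */
int same_contents(const char *a, const char *b)
{
    struct stat sa, sb;
    int result = -1;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    if (sa.st_size != sb.st_size)
        return 0;               /* different sizes can never be identical */
    if (sa.st_size == 0)
        return 1;               /* two empty files are identical */

#ifdef HAVE_MMAP
    {
        /* Beyond-POSIX.1 path: map both files and compare in one call. */
        int fa = open(a, O_RDONLY);
        int fb = open(b, O_RDONLY);
        if (fa >= 0 && fb >= 0) {
            size_t n = (size_t)sa.st_size;
            void *pa = mmap(NULL, n, PROT_READ, MAP_PRIVATE, fa, 0);
            void *pb = mmap(NULL, n, PROT_READ, MAP_PRIVATE, fb, 0);
            if (pa != MAP_FAILED && pb != MAP_FAILED)
                result = memcmp(pa, pb, n) == 0;
            if (pa != MAP_FAILED) munmap(pa, n);
            if (pb != MAP_FAILED) munmap(pb, n);
        }
        if (fa >= 0) close(fa);
        if (fb >= 0) close(fb);
    }
#else
    {
        /* Portable path: read both files in fixed-size chunks. */
        FILE *fa = fopen(a, "rb");
        FILE *fb = fopen(b, "rb");
        if (fa != NULL && fb != NULL) {
            char ba[4096], bb[4096];
            size_t na, nb;
            result = 1;
            do {
                na = fread(ba, 1, sizeof ba, fa);
                nb = fread(bb, 1, sizeof bb, fb);
                if (na != nb || memcmp(ba, bb, na) != 0) {
                    result = 0;
                    break;
                }
            } while (na > 0);
            if (ferror(fa) || ferror(fb))
                result = -1;    /* a read error overrides any verdict */
        }
        if (fa != NULL) fclose(fa);
        if (fb != NULL) fclose(fb);
    }
#endif
    return result;
}
```

Compiled without -DHAVE_MMAP the sketch needs only Standard C plus
stat(2); with it, the comparison is left to a single memcmp over the
two mappings.
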
Samefile has been successfully compiled on these configurations:

OS          Release     Compilers
====================================
FreeBSD     2.1.0       gcc 2.6.3
Linux       1.2.13      gcc 2.7.0
Linux       2.2.14      ?
Linux       2.4.2       ?
SunOS       4.1.3       gcc 2.7.2
Solaris     2.5         gcc 2.7.2, cc
AIX         3.2.5       gcc 2.5.8, cc
UNICOS      9.0.2.0     c89
IRIX        6.5.8       ?

Trivia
******

Did you know that this program was a winning entry in the 1998
"International Obfuscated C Code Contest" (IOCCC)? A slightly modified
version of samefile.c can be found in schweikh3.c. See
<URL:http://www.ioccc.org/main.html> for the contest, and
<URL:http://www.ioccc.org/years.html#1998_schweikh3> for the winning
entry and related information (look for schweikh3). Your compiler must
understand digraphs and trigraphs to compile this; with gcc use

   gcc -std=iso9899:199409 -DM0=sizeof -DM1=long -DM2=void -DM3=realloc \
       -DM4=calloc -DM5=free -o ioccc ioccc.c

Legal mumbo jumbo
*****************

`install-sh' is from the X Consortium and is not copyrighted.
The rest is Copyright (c) 1996 Jens Schweikhardt. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
DAMAGE.