-- $MawkId: CHANGES,v 1.127 2010/06/25 09:43:31 tom Exp $ Changes by Thomas E Dickey <dickey@invisible-island.net> 20100625 + correct translation of octal and hex escapes for system regular expression library. + modify configure script to support --program-suffix, etc. + add Debian package scripts, for "mawk-cur". + add RPM spec-file. + move release- and patch-level values from version.c to patchlev.h to simplify packaging scripts. 20100618 + correct translation of "^{" pattern for system regular expression library (report by Elias Pipping). + fix sentence fragment in README (report by Elias Pipping). 20100507 + cleanup gcc warnings for 64-bit platform, e.g., use size_t rather than unsigned, etc. + fix warnings from clang --analyze + update/improve configure script + modify CF_GCC_VERSION to ignore stderr, e.g., from c89 alias + modify CF_GCC_WARNINGS, moving -W and -Wall into the list to check, since c89 alias for gcc complains about these options. + add --disable-leaks and related options, for testing. + add lint rule to makefile. + add configure-check for ctags to work with pkgsrc. + amend change of array.w, fixes a regression in "delete" (report by Heiner Marxen). 20100419 + modify split() to handle embedded nulls in the string to split, e.g., BEGIN{s="a\0b"; print length(s); n = split(s,f,""); print n} (report by Morris Lee). + modify array.w to update table pointers in the special case where an array is known to have string-indices, but is later indexed via integers. The problem occurs when the array grows large enough to rehash it, e.g., BEGIN{a["n"];for(i=1;i<1000;++i)printf "%d\n", a[i]; } (report by Morris Lee). + increase size of reference-count for strings to unsigned. It was an unsigned short, which prevented using arrays larger than 64k, e.g., BEGIN{for(i = 1; i <= 65550; ++i){if(i >= 65534 && i<= 65537) print i; s[i] = "a"}; delete s;} (report by Morris Lee). + add special case for Solaris 10 (and up) to configure check CF_XOPEN_SOURCE + refactored configure check CF_REGEX 20100224 + add a configure check for large files (report by Sean Kumar). + modify check in collect_RE() to show the actual limit value, e.g., MIN_SPRINTF-2 used for built-in regular expressions. + increase MIN_SPRINTF, used as limit on regular-expression size, to match the MAX_SPLIT value, i.e., slightly more than doubling the size (report by Markus Gnam). + further modify makefile to build outside the source-tree. + modify makefile and mawktest to use relative path again, since the existing script did not work with openSUSE's build (patch by Guido Berhoerster). + fix makefile's .c.i rule, which lacked CPP definition. + update mawktest.bat script to more/less correspond with mawktest, for Win32 console except where echo command does not handle the required quoting syntax. + add vs6.mak, for Visual Studio 6. + modify mawktest script to report results from all tests, rather than halting on the first failure. + add limit-check after processing match(test, "[^0-9A-Za-z]") to ensure the internal trailing null of the test-string is not mistaken for part of the string, i.e., RSTART, RLENGTH are computed correctly (report by Markus Gnam). + modify parsing of -W option to use comma-separated values, e.g., "-Wi,e" for "-Winteractive" and "-Werror". + add timestamp to scancode.c, to help manage revisions. + improve configure macro CF_XOPEN_SOURCE, making it remove possibly conflicting definitions before adding new ones. + update config.guess and config.sub > patches by Jan Psota: + improve buffering for -Winteractive option. + allow multiple single-character flags after -W, e.g., "-Wie" for "-Winteractive" and "-Werror" to permit these to be passed on a "#"-line of a shell script, e.g., #!/usr/bin/mawk -Wie > patches by Jonathan Nieder: + add new M_SAVE_POS and M_2JC operation codes (states) to the built-in regular expression engine. Use these to reimplement m* (closure), to provide a way to avoid infinite looping on matches against empty strings. This change requires reimplementing the workaround for gawk's noloop1 testcase from 20090726. + improve buffer-overflow check for string_buff. + fix collect_RE to treat "[^]]" as a character class (meaning "not a closing bracket") but "[^^]]" not as one. This also requires initializing the local "start of character class" variable to NULL rather than the beginning of the string, to avoid an invalid array access when collecting expressions such as "^text". + within a character class and not followed by a :, ., or ~, a "[" is just like any other character. This way, you can tell mawk to scan for a literal [ character with "mawk /[[]/", and you can scan for a [ or ] with "mawk /[][]/". Also clean up the relevant loop in do_class() to make it a bit more readable. + outside a character class, a "]" is just like any other character. + prevent do_class() from scanning past the end, e.g., if the terminating zero byte was escaped. + fix regular-expression parsing when a right parenthesis ")" is found without a preceding left parenthesis. + fix resetting of position stack when backtracking. + modify regular-expression engine to avoid exponential running time for some regular expression matches in which the first match mawk finds extends to the end of the string. This is a new fix for the gawk noloop2 test, added here for regression testing. 20091220 + bump version to 1.3.4 + update INSTALL and README files. + improve configure checks for math library. + change test for NaN to use sqrt() rather than log() to work around cygwin's partly broken math functions. + add/use isnanf() to work around other breakage in cygwin math functions. + add configure check for _XOPEN_SOURCE, etc., needed to define proper function pointer for sigaction, e.g., on Tru64. + add check for sigaction function pointer, whose POSIX form is absent from the cygwin header. + extend MAWKBINMODE, adding a third bit which when set will suppress the change for RS or ORS to use CR/LF rather than LF. This is used for MinGW to make the "check" rule in a build work, for instance. + add configure check for functions used for pipe/system calls, e.g., for MinGW where these are absent. + add runtime check for floating-point underflow exceptions + fix an old 1.3.3 bug in re_split(), which did not check properly for the end of buffer; this broke on Tru64. + drop obsolete config-user, v7 and atarist subdirectories + improve configure checks for sigaction, making the definitions used in fpe_check.c consistent with matherr.c + build fixes for AIX, Tru64. + add configure check for 'environ'. + remove redundant setlocale() calls; only LC_CTYPE and LC_NUMERIC are used. 20091213 + add makedeps.sh script to aid in updating object dependencies in Makefile.in + use "mkdir -p" rather than mkdirs.sh (suggested by Aleksey Cheusov). + reformatted this file, to simplify extraction of contributor names. + update config.guess and config.sub > patches by Jonathan Nieder: + modify CF_DISABLE_ECHO autoconf macro to ensure that command lines in Makefile.in begin with a tab. + the makefile does not use $(MAKE); remove the SET_MAKE substitution. + add some files to the "make clean" rule, in case make gets interrupted in the middle of a rule. + add a maintainer-clean rule to the makefile, to remove files which could be regenerated. + fix an unescaped "-" in man/mawk.1 + remove an unneeded cast in bi_funct.c + fix an unused parameter warning in matherr.c + drop unused line_no parameter from compile_error() and its callers. + convert makescan.c to ANSI C, do further cleanup of that file. + split-out scancode.h from scan.h 20090920 + improve hash function by using FNV-1 algorithm (patch/analysis by Jim Mellander). This greatly improves speed for accessing arrays with a large number of distinct keys; however the unsorted order in "for" loops will differ. + add "internal regex" or "external regex" string to version message to allow scripting based on support for embedded nulls. + drop obsolete CF_MAWK_PROG_GCC and CF_MAWK_PROG_YACC macros from configure script (report by Mike Frysinger). + fixes to allow build outside source-tree (report by Mike Frysinger). + correct logic in scan.c to handle expression "[[]" (report by Aleksey Cheusov). + add MAWK_LONG_OPTIONS feature to allow mawk to ignore long options which are not implemented. + two changes for embedded nulls, allows FS to be either a null or contain a character class with null, e.g., '\000' or '[ \000]': + modify built-in regular expression functions to accept embedded nulls. + modify input reader FINgets() to accept embedded nulls in data read from files. Data read from standard input is line-buffered, and is still null-terminated. + update config.guess and config.sub 20090820 + minor portability/standards fixes for examples/hical + add WHINY_USERS sorted-array feature, for compatibility with gawk (patch by Aharon Robbins). + correct lower-limit for d_to_U() function, which broke conversion of zero in "%x" format, added in fix for Debian #303825 (report by Masami Hiramatsu). + modify "%s" and "%c" formatting in printf/sprintf commands to ensure that "%02s" does not do zero-padding, for standards conformance (discussion with Aharon Robbins, Mike Brennan, prompted by Debian #339799). 20090728 + add fallback definitions for GCC_NORETURN and GCC_UNUSED to build with non-gcc compilers (report by Jan Wells). 20090727 + add check/fix to prevent gsub from recurring to modify on a substring of the current line when the regular expression is anchored to the beginning of the line; fixes gawk's anchgsub testcase. + add check for implicit concatenation mistaken for exponent; fixes gawk's hex testcase. + add character-classes to built-in regular expressions. + add 8-bit locale support 20090726 + improve configure checks for MAX__UINT. + add a check for infinite loop in REmatch(), to work with gawk's noloop1 testcase. + modify logic to allow setting RS to an explicit null character, e.g., RS="\0" (Ubuntu #400409). + modify workaround for (incorrect) scripts which use a zero-parameter for substr to ensure the overall length of the result stays the same. For example, from makewhatis: filename_no_gz = substr(filename, 0, RSTART - 1); + move regular-expression files into main directory to simplify building using configure --srcdir and VPATH. + modify internal functions for gsub() to handle a case from makewhatis script from man 1.6f which substitutes embedded zero bytes to a different value (report by Gabor Z Papp). + change default for configure --with-builtin-regex to prefer the builtin regular expressions. They are incomplete, but POSIX regular expressions cannot match nulls embedded in strings. + require standard C compiler; converted to ANSI C. + rename #define'd SV_SIGINFO to MAWK_SV_SIGINFO to avoid predefined name on Mac OS X (report on comp.lang.awk). + revise configure script to use conventional scheme for generating config.h 20090721 + port to cygwin, modifying test-script to work with executable-suffix and fp-exception code to allow const in declaration. 20090712 + bump patch-date. + fix Debian #127293 "mawk does not understand unescaped '/' in character classes", noting that gawk 3.1.6 also has problems with this area. + drop support for varargs; use stdarg only. + add tags rule to makefile. + rename configure option "--with-local-regexp" to "--with-builtin-regex" + add build-fix from NetBSD port: enable vax FP support when defined(__vax__) as well as BSD43_VAX. from ragge. + add configure --disable-echo option + identify updated files with "MawkId" keywords, retaining old log comments for reference. Ongoing changes are recorded here. + change maintainer email address to my own. 20090711 + use conventional install/uninstall rules in makefile. + add install-sh script, mkdirs.sh, config.guess and config.sub + add $DESTDIR support to makefile + add uninstall make-target + add configure --enable-warnings option (fixed some warnings...) + change configure script to autoconf 2.52 to permit cross-compiles. + drop config.user file; replace with conventional autoconf --prefix, etc. + fill-in CA_REC.type in two places in parse.y which weren't initialized. + fix Debian #38353 (absent expression on last part of "for" statement caused core-dump). + fix Debian #303825 (printf %x clamps numbers to signed range rather than unsigned range). + apply fix from Debian #355966 (null-pointer check in is_string_split()) + apply fix from Debian #391051 (limit-check in scan.c) 20090709 + add support for external regexp libraries (patch by Aleksey Cheusov) (originally comp.lang.awk June 15 2005, also cited in Debian #355442 March 6, 2006). 20090705 + fixes to build on BeOS (if LIBS is set to -lroot), and QNX (redefinition of setmode). 20080909 + fix gcc 4.3.1 -Wall warnings + transform mawk.ac.m4 to aclocal.m4, renaming macros to use "CF_" or "CF_MAWK_" prefixes depending on their reusability. Modified macros to use quoting consistent with recent autoconf versions. + regenerate configure with autoconf version 2.13.20030927 + regenerate parse.c with byacc - 1.9 20080827 + ifdef'd prototypes in nstd.h which belong to stdlib.h, string.h to use standard C by default. + source-in Debian patches 01-08 (see below). Changes for 20080909 fix Debian #496980 as well. ------------------------------------------------------------------------------- Debian patches: (Peter Eisentraut <petere@debian.org> Sat, 05 Apr 2008 17:11:11 +0200) 08_fix-for-gcc3.3 Debian #195371 (James Troup <james@nocrew.org> Fri, 30 May 2003 15:24:50 +0100) 07_mawktest-check-devfull Debian #51875 06_parse.y-semicolons Debian #170973 05_-Wall-fixes 04_mawk.1-fix-pi Debian #103699 03_read-and-close-redefinition Debian #104124 02_fix-examples Debian #36011 01_error-on-full-fs Debian #28249 Debian #4293 ------------------------------------------------------------------------------- 1.3.1 -> 1.3.2 Sep 1996 1) Numeric but not integer indices caused core dump in new array scheme. Fixed bug and fired test division. 2) Added ferror() checks on writes. 3) Added some static storage specs to array.c to keep non-ansi compilers happy. 1.3 -> 1.3.1 Sep 1996 Release to new ftp site ftp://ftp.whidbey.net. 1) Workaround for overflow exception in strtod, sunos5.5 solaris. 2) []...] and [^]...] put ] in a class (or not in a class) without having to use back-slash escape. 1.2.2 -> 1.3 Jul 1996 Extensive redesign of array data structures to support large arrays and fast access to arrays created with split. Many of the ideas in the new design were inspired by reading "The Design and Implementation of Dynamic Hashing Sets and Tables in Icon" by William Griswold and Gregg Townsend, SPE 23,351-367. 1.2.1 -> 1.2.2 Jan 1996 1) Improved autoconfig, in particular, fpe tests. This is far from perfect and never will be until C standardizes an interface to ieee754. 2) Removed automatic error message on open failure for getline. 3) Flush all output before system(). Previous behavior was to only flush std{out,err}. 4) Explicitly fclose() all output on exit to work around AIX4.1 bug. 5) Fixed random number generator to work with longs larger than 32bits. 6) Added a type Int which is int on real machines and long on dos machines. Believe that all implicit assumptions that int=32bits are now gone. ------------------------------------------------------------------------------- Changes from version 1.1.4 to 1.2: 1) Limit on code size set by #define in sizes.h is gone. 2) A number of obscure bugs have been fixed such as, you can now make a recursive function call inside a for( i in A) loop. Function calls with array parameters in loop expressions sometimes generated erroneous internal code. See RCS log comments in code for details. Reported bugs are fixed. 3) new -W options -We file : reads commands from file and next argument, regardless of form, is ARGV[1]. Useful for passing -v , -f etc to an awk program started with #!/.../mawk #!/usr/local/bin/mawk -We myprogram -v works, while #!/usr/local/bin/mawk -f myprogram -v gives error message mawk: option -v lacks argument This is really a posix bozo. Posix says you end arguments with "--" , but this doesn't work with the "#!" convention. -W interactive : forces stdout to be unbuffered and stdin to be line buffered. Records from stdin are lines regardless of the value of RS. Useful for interaction with a mawk on a pipe. -W dump, -Wd : disassembles internal code to stdout (used to be stderr) and exits 0. 4) FS = "" causes each record to be broken into characters and placed into $1,$2 ... same with split(x,A,"") and split(x,A,//) 5) print > "/dev/stdout" writes to stdout, exactly the same as print This is useful for passing stdout to function my_special_output_routine(s, file) { # do something fancy with s print s > file } 6) New built-in function fflush() -- copied from the lastest att awk. fflush() : flushes stdout and returns 0 fflush(file) flushes file and returns 0; if file was not an open output file then returns -1. 7) delete A ; -- removes all elements of the array A intended to replace: for( i in A) delete A[i] 8) mawk errors such as compilation failure, file open failure, etc. now exit 2 which reserves exit 1 for the user. 9) No program now silently exits 0, prior behavior was to exit 2 with an error message