.help spp83 Jan83 "IRAF Subset Preprocessor Language"
.sh
1. Introduction

The IRAF subset preprocessor language (SPP) implements a subset of the
proposed IRAF preprocessor language, described in the paper "The Role of
the Preprocessor". The subset does not include pointers, structures, and
dynamic and virtual arrays. Subset preprocessor programs should be written
with eventual conversion to the full preprocessor language in mind. In
general, this conversion will not be difficult, provided one generates
simple and straightforward code.

One of the best ways to learn to program in a new language is to read the
source listings of existing programs. Many examples of procedures,
programs, and packages written in the SPP language can be found in the
IRAF system source directories. We shall only attempt to summarize the
basic features of the language here.
.sh
2. Getting Started

The best way to get started is to build and run a simple program, before
attempting to learn all the details of the language. Here is our version
of the "hello world" program from the C book:

.ks
.nf
        # Simple program to print "hello, world" on the standard
        # output.

        task    hello                   # CL callable task

        procedure hello()               # common procedure

        begin
                call printf ("hello, world\n")
        end
.fi
.ke

On the UNIX system, this program would be placed in a file with the
extension ".x" and compiled with the command XC (X Compiler) as follows.

        xc hello.x

The XC compiler will translate the program into Fortran, call the Fortran
compiler to generate the object file ("hello.o"), and call the loader to
link the object file with modules from the IRAF system libraries to
produce the executable program "hello". XC may be used to compile C and
Fortran programs as well as X programs, and in general behaves very much
like CC or F77 (note that the "-o" flag is not required; by default the
name of the output module is the base name of the first file name on the
command line).
The "-F" flag may be used to inspect the Fortran generated by the
preprocessor; this is sometimes necessary to interpret error messages from
the F77 compiler.
.sh
3. Fundamentals of the Language

The SPP language is based on the Ratfor language. The lexical form,
operators, and control flow constructs are identical to those provided by
Ratfor. The major differences are the data types, the form of a procedure,
the addition of inline strings and character constants, the use of square
brackets for arrays, and of course the TASK statement. The i/o facilities
provided are quite different.
.sh
3.1 Lexical Form

A subset preprocessor program consists of a sequence of lines of text. The
length of a line is arbitrary, but the SPP is guaranteed to be able to
handle only lines of up to 160 characters in length. The end of each line
is marked by the NEWLINE character. Both upper and lower case characters
are permitted. Case is significant.

WHITESPACE is defined as one or more tabs or spaces. Newline normally
marks the end of a statement, and is not considered to be whitespace.
Whitespace always delimits tokens, i.e., keywords and operators will not
be recognized as such if they contain embedded whitespace.
.ih 2 4
3.1.1 Comments
A comment begins with the character # and ends at the end of the line.
.ih
3.1.2 Continuation
Statements may span several lines. A line which ends with an operator
(excluding '/') or punctuation character (comma or semicolon) is
automatically understood to be continued on the following line.
.ih
3.1.3 Integer Constants
A decimal integer constant is a sequence of one or more of the digits 0-9.
An octal constant is a sequence of one or more of the digits 0-7, followed
by the letter 'b' or 'B'. A hexadecimal integer constant is one of the
digits 0-9, followed by zero or more of the digits 0-9, the letters a-f,
or the letters A-F, followed by the letter 'x' or 'X'.
The following notation more concisely summarizes these definitions:

.nf
        decimal constant        [0-9]+
        octal constant          [0-7]+('b'|'B')
        hexadecimal constant    [0-9][0-9a-fA-F]*('x'|'X')
        identifier              [a-zA-Z][a-zA-Z_0-9]*
.fi

In the notation used above, "+" means 1 or more, "*" means zero or more,
"-" implies a range, and "|" means "or". Brackets ("[]") define a class of
characters. Thus, "[0-9]+" reads "one or more of the characters 0 through
9".
.ih
3.1.4 Floating Point Constants
A floating point constant (type REAL or DOUBLE) consists of a decimal
integer, followed by a decimal point, followed by a decimal fraction,
followed by one of (e|E|d|D), followed by a decimal integer, which may be
negative. Either the decimal integer or the decimal fraction part must be
present. The number must contain either the decimal point or the exponent
(or both). Embedded whitespace is not permitted. The following are all
legal floating point numbers:

.nf
        .01     100.    100.01  1E5     1E-5    1.00D5  1.0D0
.fi

A floating constant may also be given in sexagesimal format, i.e., in
hours and minutes, or in hours, minutes, and seconds. The number of colon
separated fields must be two or three, and the number of decimal digits in
the second field and in the integer part of the third field is limited to
exactly two. The decimal point is optional.

.nf
        00:01           = 0.017
        00:00:01        = 0.00028
        01:00:00        = 1.0
        01:00:00.00     = 1.0
.fi
.ih
3.1.5 Character Constants
A character constant consists of from 1 to 4 characters delimited at front
and rear by the single quote ("'", as opposed to the double quotes used to
delimit string constants). A character constant is numerically equivalent
to the corresponding decimal integer, and may be used wherever an integer
constant would be used.

.nf
        'a'             integer equivalent of the letter 'a'
        '\n'            integer equiv. of the newline character
        '\007'          the octal integer 07B
        '\\'            the integer equiv. of the character '\'
.fi

The backslash character ("\") is used to form "escape sequences".
The following escape sequences are defined:

.nf
        \b      backspace
        \f      formfeed
        \n      newline
        \r      carriage return
        \t      tab
.fi
.ih
3.1.6 String Constants
A string constant is a sequence of characters enclosed in double quotes.
The double quote itself may be included in the string by escaping it with
backslash. All of the escape sequences given above are recognized. The
backslash character itself must be escaped to be included in the string.
A string constant may not span several lines of text.
.ih
3.1.7 Identifiers
An identifier is an upper or lower case letter, followed by zero or more
upper or lower case letters, digits, or the underscore character.
Identifiers may be as long as desired, but only the first five characters
and the last character are significant.

The following identifiers are reserved (though some are not actually used
at present):

.ls -8
.nf
auto            do              include         short
begin           double          int             sizeof
bool            else            long            static
break           end             map             struct
call            entry           next            switch
case            extern          plot            task
char            false           printf          true
clgetpar        for             procedure       union
clputpar        getpix          putpix          unmap
common          goto            real            until
complex         if              repeat          virtual
data            iferr           return          vstruct
define          imstruct        scan            while
.fi
.le
.sh
3.2 Data Types

The subset preprocessor language supports a fairly wide range of data
types. The actual mapping of an XPP data type into a Fortran data type
depends on what the target compiler has to offer.

.ks
.nf
                        XPP Data Types

        bool            boolean (Fortran LOGICAL)
        char            character (8 bit signed)
        short           short integer
        int             integer (Fortran INTEGER)
        long            long integer
        real            single precision floating (Fortran REAL)
        double          double precision floating (DOUBLE PRECISION)
        complex         single precision complex (Fortran COMPLEX)
.fi
.ke

The only permissible values for a boolean variable are "true" and "false".
The CHAR data type belongs to the family of integer data types, i.e., a
CHAR variable or array behaves like an integer variable or array. The
value of a CHAR variable may range from -127 to 127.
CHAR and SHORT are signed integer data types (i.e., they may take on
negative values).

In addition to the primitive data types listed above, the SPP language
provides the abstract type POINTER. The SPP language makes no distinction
between pointers to different types of objects, unlike more strongly typed
languages such as C (and the full preprocessor). The SPP implementation of
the POINTER data type is a stopgap measure.
.sh
3.3 Declarations

The SPP language implements named procedures with formal parameters and
local variables. Global common and dynamic memory allocation may be used
to share data amongst procedures. A procedure may return a value, but may
not return an array or string.

Declarations are included for procedures, variables, arrays, strings,
typed procedures, external procedures, and global common areas. Storage
for local and global variables and arrays may be assumed to be statically
allocated.
.sh
3.3.1 Variable, Array, and Function Declarations

Although the language does not require that parameters be declared before
local variables and functions, it is a good practice to follow. The syntax
of a type declaration is the same for parameters, variables, and
procedures.

        type_spec object [, object [,... ]]

Here, "type_spec" may be any of the primitive data types, a derived type
such as POINTER, or EXTERN. A list of one or more data objects follows. An
object may be a variable, array, or procedure. The declaration for each
type of object (variable, array, or procedure) has a unique syntax, as
follows:

.ks
.nf
        variable        identifier
        array           identifier "[" dimension_list "]"
        procedure       identifier "()"
.fi
.ke

Procedures may be passed to other procedures as formal parameters. If a
procedure is to be passed to a called procedure as a formal parameter, it
must be declared in the calling procedure as an object of type EXTERN.
.sh
3.3.2 Array Declarations

Arrays are one-indexed.
The storage order is fixed in such a way that when the elements of the
array are accessed in storage order, the leftmost subscript varies
fastest. Arrays of up to three dimensions are permitted.

The size of each dimension of an array may be specified by any compile
time constant expression, or by an integer parameter or parameters, if the
array is a formal parameter to the procedure. If the array is declared as
a formal parameter, and the size of the highest dimension is unknown, the
size of that dimension should be given as ARB (for arbitrary).

.nf
        real    data[ARB]               # length of array is unknown
        short   raster[NPIX*2,128]      # 2-dim array
.fi

The declared dimensionality of an array passed as a formal parameter to a
procedure may be less than or equal to the actual dimensionality of the
array.
.sh
3.3.3 String Declarations

A string is an EOS delimited array of type CHAR (EOS stands for End Of
String). Strings may contain only character data (values 0 through 127
decimal), and must be EOS delimited. A character string may be declared in
either of two ways, depending on whether initialization is desired:

.nf
        char    input_file[SZ_FNAME]
        string  legal_codes "efgdox"
.fi

The preprocessor automatically adds 1 to the declared array size, to allow
space for the EOS marker. The space used by the EOS marker is not
considered part of the string. Thus, the array "char x[10]" will contain
space for ten characters, plus the EOS marker.
.sh
3.3.4 Global Common Declarations

Global common provides a means for sharing data between separately
compiled procedures. The COMMON statement is a declaration, and must be
used only in the declarations section of a procedure. Each procedure
referencing the same common must declare the common in the same way.

        common /common_name/ object [, object [, ... ]]

To avoid the possibility of two procedures declaring the same common area
differently, the COMMON declaration should be placed in an INCLUDE file
(include files are discussed in a later section).
.sh
3.3.5 Procedure Declarations

The form of a PROCEDURE declaration is shown below. The "data_type" field
must be included if the procedure returns a value. The BEGIN keyword
separates the declarations section from the executable body of the
procedure, and is required. The END keyword must follow the last
executable statement.

.ks
.nf
        [data_type] PROCEDURE proc_name ([p1 [, p2 [,... ]]])

        (declarations for parameters)
        (declarations for local variables and functions)
        (initialization)

        BEGIN
                (executable statements)
        END
.fi
.ke

All parameters, variables, and typed procedures must be declared. The XPP
language does not permit implicit typing of parameters, variables, or
procedures (unlike Fortran).

If a procedure has formal parameters, they should agree in both number and
type in the procedure declaration and when the procedure is called. In
particular, beware of SHORT or CHAR parameters in argument lists. An INT
may be passed as a parameter to a procedure expecting a SHORT integer on
some machines, but this usage is NOT PORTABLE, and is not detected by the
compiler. The compiler does not verify that a procedure is declared and
used consistently.

If a procedure returns a value, the calling program must declare the
procedure in a type declaration, and reference the procedure in an
expression. If a procedure does not return a value, the calling program
may reference the procedure only in a CALL statement.
.sh
Example 1: The Sinc Function

This example demonstrates how to declare a typed procedure, which in this
case returns a single real value. Note the inclusion of the pair of
parentheses ("()") in the declaration of the function SIN, to make it
clear that a function is being declared, rather than a local variable.
Note also the use of the RETURN statement to return the value of the
function SINC.

.ks
.nf
        real procedure sinc (x)

        real    x, sin()

        begin
                if (x == 0.0)
                    return (1.0)
                else
                    return (sin(x) / x)
        end
.fi
.ke
.sh
3.3.6 Multiple Entry Points

Procedures with multiple entry points are permitted in the subset
preprocessor language because they provide an attractive alternative to
global common when several procedures must have access to the same data.
The multiple entry point mechanism is a primitive form of block
structuring. The advantages of multiple entry points over global common
are:

.ls
.ls (1)
Access to the database is restricted to calls to the defined entry points.
A secure database can thus be assured.
.le
.ls (2)
Initialization of data in a procedure with multiple entry points is
permissible at compile time, whereas global common cannot (reliably) be
initialized at compile time.
.le
.le

Nonetheless, the multiple entry point construct is only useful for small
problems. If the problem grows too large, an enormous procedure with many
entry points results, which is unacceptable.

The form of a procedure with multiple entry points is shown below. Either
all entry points should be untyped, as in the example, or all entry points
should return values of the same type. Control should only flow forward.
Each entry point should be terminated by a RETURN statement, or by a GOTO
to a common section of code which all entry points share. The shared
section of code should be terminated by a single RETURN which all entry
points share.

.ks
.nf
Example 2: Multiple Entry Points

        procedure push (datum)

        int     datum                   # value to be pushed or popped
        int     stack[SZ_STACK]         # the stack
        int     sp                      # the stack pointer
        data    sp/0/

        begin
                (push datum on the stack, check for overflow)
                return

        entry pop (datum)
                (pop stack into "datum", check for underflow)
                return
        end
.fi
.ke
.sh
3.4 Initialization

Local variables, arrays, and character strings may be initialized at
compile time with the DATA statement.
Data in a global common may NOT be initialized at compile time. If
initialization of data in a global common is required, it must be done at
run time by an initialization procedure.

The syntax of the DATA statement is defined in the Fortran 77 standard.
Some simple examples follow.

.ks
.nf
        real    x, y[2]
        char    ch[2]
        data    x/0/, y/1.0,2.0/, ch/'a','b',EOS/
.fi
.ke
.sh
3.5 Control Flow Constructs

The subset preprocessor provides a full set of control flow constructs,
such as are found in most modern languages. Some of these have already
appeared in the examples. An SPP control flow construct executes a
"statement" either conditionally or repetitively. The "statement" to be
executed may be a simple one line statement, a COMPOUND STATEMENT enclosed
in curly brackets or braces ("{}"), or the NULL STATEMENT (";" on a line
by itself).

.nf
        conditional constructs:         IF, IF ELSE, SWITCH CASE
        repetitive constructs:          DO, FOR, REPEAT UNTIL, WHILE
        branching:                      BREAK, NEXT, GOTO, RETURN
.fi

Two special statements are provided to interrupt the flow of control
through one of the repetitive constructs. BREAK causes an immediate exit
from the loop, by jumping to the statement following the loop. NEXT shifts
control to the next iteration of the loop. If BREAK and NEXT are embedded
in a conditional construct, which is in turn embedded in a repetitive
construct, it is the outer repetitive construct which will define where
control is shifted to.
.sh
3.5.1 Conditional Execution

The IF and IF ELSE constructs are shown below. The "expr" part may be any
boolean expression. The "statement" part may be a simple statement,
compound statement enclosed in braces, or the null statement. The control
flow constructs may be nested indefinitely.

.ks
.nf
        if (expr)                               IF construct
            statement
.fi
.ke

.ks
.nf
        if (expr)                               IF ELSE construct
            statement
        else
            statement
.fi
.ke

The ELSE IF construct is useful for selecting one statement to be executed
from a group of possible choices.
This construct is a more general form of the SWITCH CASE construct.

.ks
.nf
        if (expr)                               ELSE IF construct
            statement
        else if (expr)
            statement
        else if (expr)
            statement
.fi
.ke

The SWITCH CASE construct evaluates an integer expression once, then
branches to the matching case. Each case must be a unique integer
constant. The maximum number of cases is limited only by table space
within the compiler. A case may consist of a single integer constant, or a
list of integer constants, delimited by the character ":". The special
case DEFAULT, if included, is selected if the switch value does not match
any of the other cases. If the switch value does not match any case, and
there is no default case, control passes to the statement following the
body of the SWITCH statement.

Each case of the SWITCH statement may consist of an arbitrary number of
statements, which do not have to be enclosed in braces. The body of the
switch statement, however, must be enclosed in braces as shown.

.ks
.nf
        switch (int_expr) {                     SWITCH CASE construct
        case int_const_list:
            statements
        case int_const_list:
            statements
        default:
            statements
        }
.fi
.ke

.ks
.nf
example:
        switch (operator) {
        case '+':
            c = a + b
        case '-':
            c = a - b
        default:
            call error (1, "unknown operator")
        }
.fi
.ke

The SWITCH construct will execute most efficiently if the cases form a
monotonically increasing sequence without large gaps between the cases
(i.e., case 1, case 2, case 3, etc.). The cases should, of course, be
defined parameters or character constants, rather than explicit numbers.
.sh
3.5.2 Error Handling

The SPP language provides support for error actions, error handling and
error recovery. Knowledge of the SPP error handling procedures is
necessary to correctly deal with error actions initiated by the system
library routines.

A recoverable error condition is asserted by a call to the ERROR
statement. An irrecoverable error condition is asserted with the FATAL
statement.
Error recovery is implemented using the IFERR and IFNOERR statements. If
an error handler is not "posted" by a call to IFERR or IFNOERR, a system
defined error handler will be called, returning system resources, closing
files, deleting temporary files, and aborting the program.

.ks
.nf
        errchk  proc1, proc2, ...               # errchk declaration

        iferr (procedure call or assignment statement)
            <error_action_statement>

        iferr {
            <any statements, including IFERR>
        } then
            <error_action_statement>
.fi
.ke

Language support includes the IFERR and IFNOERR statements and the ERRCHK
declaration. The IFERR and IFNOERR statements are grammatically equivalent
to the IF statement. The meaning of the IFERR statement is "if an error
occurs during the processing of the enclosed code,...". IFNOERR is
equivalent, except that the sense of the test is reversed. Note that the
condition to be tested in an IFERR statement may be a single or compound
procedure call or assignment statement, while the IF statement tests a
boolean expression.

If a procedure calls a subprocedure which may directly or indirectly take
an error action, then the subprocedure must be named in an ERRCHK
declaration in the calling procedure. If an error occurs during the
processing of a subprocedure and an error handler is posted somewhere back
up the chain of procedure calls, then control must revert immediately back
up the chain of procedures to the procedure which posted the error
handler. This will work only if all intermediate procedures include ERRCHK
declarations for the next lower procedure in the chain. Graphically,
assume that procedure A calls B, that B in turn calls C, and so on as
shown below:

.ks
.nf
        A               (A posts error handler with IFERR)
            B           (B must ERRCHK procedure C)
                C       (C must ERRCHK procedure D)
                    D   (D calls ERROR)
.fi
.ke

As indicated by the diagram, procedure D calls ERROR, "taking an error
action".
If no handler is posted, the error action will consist of the system error
recovery actions, terminating with the abort of the current program. But
if an error handler is posted, as is done by procedure A in the example,
then control should revert immediately to procedure A. The error handler
in A might try again with slightly different parameters, perform special
cleanup actions and abort, print a more meaningful error message and take
another error action, print a warning message, or whatever. If the ERRCHK
declaration is omitted in procedure B or C, control will not revert
immediately to procedure A, and processing will erroneously continue in
the intermediate procedure, as if an error had not occurred.

Several library procedures are provided in the system library for use in
error handlers. The ERRACT procedure may be called in an error handler to
issue the error message posted by the original ERROR call as a warning
message, or to cause a particular error action to be taken. The error
actions are defined in the include file "<error.h>". ERRCODE returns
either OK or the integer code of the posted error.

.ks
.nf
Library procedures related to error handling:

              error (errcode, error_message)    (language)
              fatal (errcode, error_message)    (library)
             erract (severity)                  (library)
        val = errcode ()                        (library)
.fi
.ke

.ks
.nf
.rj <error.h>
ERRACT severity codes

        EA_WARN         # issue a warning message
        EA_ERROR        # assert recoverable error
        EA_FATAL        # assert fatal error
.fi
.ke

An arithmetic exception (X_ARITH) will be trapped by an IFERR statement,
provided the posted handler(s) return without causing error restart. X_INT
and X_ACV (interrupt and access violation) may be caught only by posting
an exception handler with XWHEN.
.sh
3.5.3 Repetitive Execution

An assortment of repetitive constructs is provided for convenience. The
simplest constructs are WHILE, which tests at the top of the loop, and
REPEAT UNTIL, which tests at the bottom.
The DO construct is convenient for simple sequential operations on arrays.
The most general repetitive construct is the FOR statement.

.ks
.nf
        while (expr)                            WHILE construct
            statement
.fi
.ke

.ks
.nf
        repeat {                                REPEAT UNTIL
            statements
        } until (expr)
.fi
.ke

.ks
.nf
        repeat {                                infinite REPEAT loop
            statements                          (exit with BREAK, RETURN, etc)
        }
.fi
.ke

The FOR construct consists of an initialization part, a test part, and a
loop control part. The initialization part consists of a statement which
is executed once before entering the loop. The test part is a boolean
expression, which is tested before each iteration of the loop. The loop
control statement is executed after the last statement in the body of the
FOR, before branching to the test at the beginning of the loop. When used
in a FOR statement, NEXT causes a branch to the loop control statement.

The FOR construct is very general, because of the lack of restrictions on
the type of initialization and loop control statements chosen. Any or all
of the three parts of the FOR may be omitted, but the semicolon delimiters
must be present.

.ks
.nf
        for (init; test; control)               FOR construct
            statement

example:
        for (ip=strlen(str); str[ip] != 'z' && ip > 0; ip=ip-1)
            ;
.fi
.ke

The example demonstrates the flexibility of the FOR construct. The FOR
statement shown searches the string "str" backwards until the character
'z' is encountered, or until the beginning of the string is reached. Note
the use of the null statement for the body of the FOR, since everything
has already been done in the FOR itself. The STRLEN procedure is shown in
a later example.

The DO construct is a special case of the FOR construct. DO is ideal for
simple array operations, and since it is implemented with the Fortran DO
statement, its use should result in particularly efficient code. Only
INTEGER loop control expressions are permitted in the DO statement, but
these may be general expressions. The loop may run forwards or backwards,
with any step size.
The value of the loop control parameter is UNDEFINED upon exit from the
loop. The body of the DO will be executed zero times, if the initial value
of the loop control parameter satisfies the termination condition.

.ks
.nf
        do lcp = initial_value, final_value [, step_size]
            statement

example:
        do i = 1, NPIX                          DO construct
            a[i] = abs (a[i])
.fi
.ke
.sh
3.6 Expressions

Every expression is characterized by a data type and a value. The data
type is fixed at compile time, but the value may be either fixed at
compile time, or calculated at run time. An expression may be a constant,
a string constant, an array reference, a call to a typed procedure, or any
combination of the above elements, in combination with one or more unary
or binary operators.

.ks
.nf
        ( Special Operators )
            (arglist)           procedure call
            [arglist]           array reference

        ( Unary Operators )
            -                   negation
            !                   boolean not

        ( Binary Operators )
            **                  exponentiation
            /  *                arithmetic
            +  -
            == != <= >= < >     boolean comparison
            && ||               boolean and, or
.fi
.ke

Parentheses may be used to force the compiler to evaluate the parts of an
expression in a certain order. In the absence of parentheses, the
"precedence" of an operator determines the order of evaluation of an
expression. The highest precedence operators are evaluated first. The
precedence of the SPP operators is defined by the order in which the
operators appear in the table above (procedure call has the highest
precedence).

The "arglist" in a procedure or array reference consists of a list of
general expressions separated by commas. If an expression contains calls
to two or more procedures, the order in which the procedures are evaluated
is undefined.
.sh
3.6.1 Mixed Mode Expressions

The binary operators combine two expressions into a single expression. If
the two input expressions are of different data types, the expression is
said to be a "mixed mode" expression.
The data type of a mixed mode expression is defined by the order in which
the types of the two input expressions appear in the table of data types
in section 3.2. The data type which appears furthest down in this table
will be the data type of the combined expression. For example, an integer
plus a real produces a real. Mixed mode expressions involving booleans are
illegal.
.sh
3.6.2 Type Coercion

The term "type coercion" refers to the conversion of an object from one
data type to another. Such conversions may involve loss of information,
and hence are not always reversible. Type coercion occurs automatically in
mixed mode expressions, and in assignment statements. Type coercion is not
permitted between booleans and the other data types.

The data type of an expression may be coerced by a call to an intrinsic
function. The names of these intrinsic functions are the same as the names
of the data types. Thus, "int(x)", where X is of type REAL, coerces X to
type INT, while "double(x)" produces a double precision result.
.sh
3.7 The Assignment Statement

The assignment statement assigns the value of the general expression on
the right side to the variable or array element given on the left side.
Automatic type coercion will occur during the assignment if necessary (and
legal). Multiple assignments may not be made on the same line.
.sh
3.8 Some Examples

We have now finished discussing the fundamentals of the subset
preprocessor language. The following examples demonstrate two complete
procedures written in the SPP language. Additional examples are given in
appendix B, and in the IRAF source directories.
.sh
Example 3: Length of a String

This example demonstrates the declaration and use of a function to compute
the length of a character string passed as a formal parameter. STRLEN
simply inspects each character in the string, until the end of string
marker (EOS) is reached.
.ks
.nf
        int procedure strlen (string)

        char    string[ARB]
        int     ip

        begin
                ip = 1
                while (string[ip] != EOS)
                    ip = ip + 1
                return (ip - 1)
        end
.fi
.ke

The code fragment shown below shows how the function STRLEN might be used
in another procedure. STRLEN is called to get the index of the last
character in the string, then the string is truncated by overwriting the
last character with EOS. EOS is a predefined constant, which should be
considered part of the language.

.ks
.nf
        char    string[SZ_LINE]
        int     string_length, strlen()

        begin
                string_length = strlen (string)
                if (string_length >= 1)
                    string[string_length] = EOS
.fi
.ke
.sh
Example 4: Min and Max of a REAL Array

This example shows how to declare a procedure which returns its output via
formal parameters, rather than as the function value. Note the use of
square brackets to declare and reference arrays. If the limiting values of
the data cannot be computed, the special value INDEF is returned,
signifying that the limiting values are indefinite. INDEF is another
predefined constant.

.tp 6
.nf
        procedure limits (data, npix, minval, maxval)

        real    data[npix]              # input data array
        int     npix                    # length of array
        real    minval, maxval          # output values
        int     i

        begin
                if (npix >= 1) {
                    minval = data[1]
                    maxval = data[1]
                    for (i=2; i <= npix; i=i+1) {
                        if (data[i] < minval)
                            minval = data[i]
                        if (data[i] > maxval)
                            maxval = data[i]
                    }
                } else {
                    minval = INDEF
                    maxval = INDEF
                }
        end
.fi

The generalization of this procedure to handle indefinites in the input
data array is left up to the reader.
.sh
3.9 Program Structure

An SPP source file may contain any number of PROCEDURE declarations, zero
or one TASK statements, any number of DEFINE or INCLUDE statements, and
any number of HELP text segments. By convention, global definitions and
include file references should appear at the beginning of the file,
followed by the task statement, if any, and the procedure declarations.
.nf include <ctype.h> # character type definitions include "widgets.h" # package definitions file # This file contains the source for the tasks making up the # Widgets analysis package (describe the contents of the file). define MAX_WIDGETS 50 # local definitions define NPIX 512 define LONGITUDE 7:32:23.42 task alpha, beta, epsilon=eps # ALPHA -- (describe the alpha task) procedure alpha() ... .fi .sh 3.9.1 Include Files Include files are referenced at the beginning of a file to include global definitions that must be shared amongst separately compiled files, and within procedures to reference common block definitions. The INCLUDE statement is effectively replaced by the contents of the named file. Includes may be nested at least 5 deep. The name of the file to be included must be delimited by either angle brackets (<file>) or quotation marks ("file"). The first form is used to reference the IRAF system include files. The second, more general, form may be used to include any file. .sh 3.9.2 Macro Definitions Macro definitions are invaluable for "information hiding", and can do much to enhance the modifiability of a program. The effective use of macros also tends to improve the readability of a program. By convention, the names of macros are always upper case, to make it clear that a macro is being used, and to avoid redefinitions of ordinary variables and procedures. There are two kinds of macros -- those with arguments, and those without. Macros without arguments are the most common, and are used primarily to turn explicit constants into symbolic parameters. Examples are shown above. Macros may also be used to reference the field of a structure, or to define inline code fragments (similar to Fortran statement functions). In the SPP, the arguments of a macro are referenced as "$1", "$2", in the following manner: .ks .nf define I_TYPE $1[1] define I_NPIX $1[2] define I_COEFF $1[10] if (I_TYPE(coeff) == LINEAR) ... 
.fi
.ke

In this example, the array "coeff" is actually a simple structure,
containing the fields "i_type", "i_npix", ..., and "i_coeff".  It greatly
enhances the readability of the program to refer to the fields of this
structure by name, rather than offset ("coeff[2]"), and furthermore makes it
trivial to modify the structure.

Macros with arguments may also be used to define inline functions.  For
example, here are a couple of definitions of character classes from the
system include file "ctype.h":

.ks
.nf
    define  IS_UPPER        ($1>='A'&&$1<='Z')
    define  IS_LOWER        ($1>='a'&&$1<='z')
    define  IS_DIGIT        ($1>='0'&&$1<='9')

usage:
    if (IS_DIGIT(string[i])) {
        ...
.fi
.ke

Note that these definitions work for ASCII, but not for EBCDIC (IBM).  By
using macros, we have concentrated this machine dependent knowledge of the
character set into a single file.

NOTE: In the current implementation of the SPP, macro definitions may not
include string constants.  All other types of constants, constant
expressions, array and procedure references, are allowed.

The domain of definition of a macro extends from the line following the
macro, to the end of the file (except for include files).  Macros are
recursive.  Redefinitions of macros are silently permitted.

.sh
3.9.3 The TASK Statement, and Tasks

The TASK statement is used to make an IRAF task.  A file need not contain a
task statement, and may not contain more than a single task statement.
Files without task statements are separately compiled to produce object
modules, which may subsequently be linked together to make a task, or which
may be installed in a library.

A single physical task (ptask) may contain one or more LOGICAL TASKS
(ltasks).  These tasks need not be related.  Several ltasks may be grouped
together into a single ptask merely to save disk storage, or to minimize the
overhead of task execution.  Ltasks should communicate with one another only
via disk files, even if they reside in the same ptask.

    task    ltask1, ltask2, ltask3=proc3

The task statement defines a set of ltasks, and associates each with a
compiled procedure.  If only the name of the ltask is given in the task
statement, the associated procedure is assumed to have the same name.  A
file may contain any number of ordinary procedures which are not associated
(directly) with an ltask.  The source for the procedure associated with a
given ltask need not reside in the same file as the task statement.

An ltask associated procedure MUST not have any arguments.  An ltask
procedure gets its parameters from the CL via the CL interface.  Most
commonly used are the CLGETx procedures.  The CLPUTx procedures may be used
to change the values of parameters.

.ks
.nf
    task    alpha, beta, epsilon=eps

    procedure alpha()

    int     npix, clgeti()
    real    lcut, clgetr()
    char    file[SZ_FNAME]

    begin
        npix = clgeti ("npix")
        lcut = clgetr ("lower_cutoff")
        call clgstr ("input_file", file, SZ_FNAME)
        ...
.fi
.ke

An IRAF task may be run by the CL or called from the command interpreter
provided by the host operating system, without change.  Parameter requests
and i/o to the standard input and output will function properly in both
cases.  When running without the CL, of course, the interface is much more
primitive.

To run an IRAF task directly, without the CL (especially useful for
debugging purposes), begin by simply running the task.  The task will sense
that it is being run without the CL, and issue a prompt:

.ks
.nf
    > ?
    alpha beta epsilon
    > alpha
    npix: (response)
    lower_cutoff: (response)
    input_file: (response)
    (ltask "alpha" continues)
    > bye
.fi
.ke

Every IRAF task has two special commands built in.  The command "?" will
list the names of the ltasks recognized by the interpreter.  The command
"bye" is used to exit the interpreter.

.sh
3.9.4 Help Text

Documentation may be embedded in an SPP source file either by commenting out
the lines of text, or by enclosing the lines of text within ".help" and
".endhelp" directives.
If there are only a few lines of text, it is probably most convenient to
comment them out.  Large blocks of text should be enclosed by the help
directives, making the text easier to edit, and accessible to the online
documentation and text processing tools.

.ks
.nf
    # (everything from the '#' to end of line is a comment)

    .help [keyword [qualifier [package_description_string]]]
        (help text)
    .endhelp
.fi
.ke

The preprocessor ignores comments, and everything between ".help" and
".endhelp" directives.  The directives must occur at the beginning of a line
to be recognized.  In both cases, the preprocessor ignores the remainder of
the line.  The arguments to ".help" are used by the HELP, MANPAGE, and
LISTING utilities, but are ignored by XPP.

Help text may be typed in as it is to appear on the terminal or printer, or
it may contain text processing directives.  A filter (LISTING) is available
to strip help text out when making listings, or to replace help text
containing directives with nicely formatted text.  See the LROFF
documentation for a description of the IRAF text processing directives.

Manual pages for ltasks may be stored either directly in the source file as
help text segments, or in separate files.  If separate source and help files
are used, both files should reside in the same directory and should have the
same root name, and the help text file should have the extension ".hlp".

.sh
4. Anachronisms

Certain constructs in the subset preprocessor language are not likely to
survive in their present form in the full preprocessor.  These include:

.ks
.nf
    the STRING declaration
    the DATA statement
    the COMMON statement
    the POINTER data type
.fi
.ke

The STRING declaration will disappear at the same time as the DATA
statement.  Both will be replaced by initializations of the form

.nf
    real    x = 0.0, y[] = {1.,2.,4.}
    char    opcodes[SZ_OPCODES] = {'f','g','e','d'}
.fi

COMMON declarations, in their present form, are cumbersome and dangerous to
use.
The global data capability provided by COMMON will be present in the full
preprocessor in a more structured form.

The POINTER data type will be replaced by a strongly typed (and therefore
much more reliable) implementation of pointers, patterned after C.

.sh
5. Notes on Topics Not Discussed

The present version of the SPP reference manual omits a discussion of the
basic i/o facilities, some of which require language support.  Dynamic
memory management and pointers will be covered in a later revision of the
manual.  Data structuring is possible in the SPP, using macros, and is
discussed in the design documentation for VSIO.

Programs written in the subset preprocessor language should adhere to the
(currently informal) coding standard being developed for IRAF.  The coding
standard has not yet been documented.  Try to style procedures after those
shown in the examples, and in the IRAF system source directories.

.bp
.sh
APPENDIX A: Predefined Constants

The subset preprocessor language includes a number of predefined symbolic
constants.  Included are various machine dependent constants describing the
hardware and data types.  Other symbolic constants are used for basic file
i/o.  Unless the description indicates otherwise (for example EPSILON, INDEF,
and MAX_REAL), the predefined constants are of type integer.

.nf
language and machine definitions:

    ARB             arbitrary (array dimension)
    BOF,BOFL        beginning of file
    EOF,EOFL        end of file
    EOS             end of string
    EPSILON         smallest real x s.t. 1+x > 1
    EPSILOND        double precision epsilon
    ERR             error status return
    INDEF           indefinite of type REAL
    INDEF[SILRDX]   indefinites for all types
    MAX_DIGITS      number of digits of precision (double)
    MAX_EXPONENT    largest positive exponent
    MAX_INT         largest positive integer
    MAX_LONG        largest positive long integer
    MAX_REAL        largest real or double
    MAX_SHORT       largest short integer
    MIN_REAL        smallest representable real number
    NBYTES_CHAR     number of machine bytes per character
    NO              opposite of YES
    NULL            invalid pointer
    OK              status return, opposite of ERR
    SZ_BOOL         nchars per bool
    SZ_CHAR         nchars per char
    SZ_COMPLEX      nchars per complex
    SZ_DOUBLE       nchars per double
    SZ_FNAME        size of a file name string, chars
    SZ_INT          nchars per int
    SZ_LINE         size of a file line buffer, chars
    SZ_LONG         nchars per long
    SZ_REAL         nchars per real
    SZ_SHORT        nchars per short
    TY_BOOL         code for type bool
    TY_CHAR         code for type char
    TY_COMPLEX      code for type complex
    TY_DOUBLE       code for type double
    TY_INT          code for type int
    TY_LONG         code for type long
    TY_REAL         code for type real
    TY_SHORT        code for type short
    YES             opposite of NO

file i/o definitions:

    APPEND          file access mode
    BINARY_FILE     file type
    NEW_FILE        file access mode
    READ_ONLY       file access mode
    READ_WRITE      file access mode
    STDERR          standard error output
    STDGRAPH        standard graphics output
    STDIMAGE        standard image display output
    STDIN           standard input
    STDOUT          standard output
    STDPLOTTER      standard plotter output
    TEXT_FILE       file type
    WRITE_ONLY      file access mode
.fi

.sh
APPENDIX B: Detailed Examples

.sh
Example 5: Matrix Inversion

An SPP translation of Bevington's routine to invert a matrix by gaussian
elimination with partial pivoting is shown below.  The help text is shown
with text formatter commands inserted.  The restriction of this procedure to
matrices of a fixed size is unfortunate, but we have kept it that way to
conform to Bevington's original code.
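
Before turning to the listing, it may help to see how a routine with this
calling sequence would be used.  The fragment below is only a sketch under
stated assumptions: the procedure name INVERT2 and the sample matrix values
are hypothetical, and the PRINTF/PARGR calls are taken from the IRAF
formatted output facilities, which this manual does not otherwise describe.
It shows a 2 by 2 matrix stored in the corner of a full size array, as the
fixed size restriction requires.

.ks
.nf
    define  MAX_ORDER       10      # assumed to match the value in MATINV

    # INVERT2 -- Hypothetical caller which inverts a 2 by 2 matrix.

    procedure invert2()

    double  a[MAX_ORDER,MAX_ORDER]  # full size array, as MATINV requires
    real    det

    begin
        # Load the sample matrix into the upper left corner of A.
        a[1,1] = 4.0
        a[1,2] = 1.0
        a[2,1] = 1.0
        a[2,2] = 3.0

        # Only the first 2 rows and columns are referenced by MATINV.
        call matinv (a, 2, det)

        # MATINV signals a singular matrix by returning a zero determinant.
        if (det == 0.0)
            call printf ("matrix is singular\n")
        else {
            call printf ("determinant = %g\n")
                call pargr (det)
        }
    end
.fi
.ke

Because the array dimensions are fixed in the called procedure, the caller
must declare the array with the same MAX_ORDER even when the actual matrix
is much smaller.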
.nf
    .help matinv 2 "math library"
    .nf
    ____________________________________________________________________
    NAME
        matinv -- invert a symmetric matrix and calculate its determinant.

    SOURCE
        Bevington, pages 302-303.

    USAGE
        call matinv (array, order, determinant)

    PARAMETERS
        array (double)      Input matrix of fixed size 10 by 10 (smaller
                            matrices may be placed in this matrix).
                            Replaced by the inverse upon output.
        order (int)         The number of rows and columns in the actual
                            matrix.
        determinant (real)  Determinant of input matrix.

    DESCRIPTION
        The input matrix, which must be dimensioned [10,10] in the calling
        program, is inverted, and its determinant is calculated.  The
        inverse overwrites the input matrix.  The algorithm used is
        gaussian elimination with partial pivoting.
    .endhelp _______________________________________________________________

    define  MAX_ORDER       10              # maximum size of matrix

    procedure matinv (array, order, determinant)

    double  array[MAX_ORDER,MAX_ORDER]
    int     order
    real    determinant
    int     ik[MAX_ORDER], jk[MAX_ORDER]
    int     i, j, k, l
    double  maxval, temp

    begin
        determinant = 1.

        do k = 1, order {
            # Find largest element array[i,j] in rest of matrix.
            maxval = 0.
            repeat {
                do i = k, order
                    do j = k, order
                        if (abs(maxval) <= abs(array[i,j])) {
                            maxval = array[i,j]
                            ik[k] = i
                            jk[k] = j
                        }
                if (maxval == 0) {          # abnormal return
                    determinant = 0.0
                    return
                }

                # Interchange rows and columns to put maxval in
                # array[k,k].
                i = ik[k]
                if (i >= k) {
                    if (i != k)
                        do j = 1, order {
                            temp = array[k,j]
                            array[k,j] = array[i,j]
                            array[i,j] = -temp
                        }
                    j = jk[k]
                    if (j >= k)
                        break
                }
            }

            if (j != k)
                do i = 1, order {
                    temp = array[i,k]
                    array[i,k] = array[i,j]
                    array[i,j] = -temp
                }

            # Accumulate elements of inverse matrix.
            do i = 1, order
                if (i != k)
                    array[i,k] = -array[i,k] / maxval
            do i = 1, order
                do j = 1, order
                    if (i != k && j != k)
                        array[i,j] = array[i,j] + array[i,k] * array[k,j]
            do j = 1, order
                if (j != k)
                    array[k,j] = array[k,j] / maxval
            array[k,k] = 1.0 / maxval
            determinant = determinant * maxval
        }

        # Restore ordering of matrix.
        do l = 1, order {
            k = order - l + 1
            j = ik[k]
            if (j > k)
                do i = 1, order {
                    temp = array[i,k]
                    array[i,k] = -array[i,j]
                    array[i,j] = temp
                }
            i = jk[k]
            if (i > k)
                do j = 1, order {
                    temp = array[k,j]
                    array[k,j] = -array[i,j]
                    array[i,j] = temp
                }
        }
    end
.fi

.sh
Example 6: Pattern Matching

The next example was selected for inclusion here because it demonstrates
most of the control flow constructs, as well as the use of defined
parameters.  The STRMATCH procedure searches a string for the specified
pattern.  The pattern may contain several meta-characters, or characters
which are not matched but rather which tell STRMATCH what constitutes a
match.  For example:

.nf
    if (strmatch (line_buffer, "^{naxis}#=") > 0)
        ...
.fi

In this case, STRMATCH would search at the beginning of the line for the
string "naxis" (in either case), followed by optional whitespace and an
equals sign.  It returns the index of the first character following the
matched substring, or zero if there is no match.  The meta-characters are
defined in the INCLUDE file "pattern.h", as follows:

.nf
    # Pattern Matching Metacharacters (STRMATCH, PATMATCH)

    define  CH_BOL          '^'     # beginning of line symbol
    define  CH_NOT          '^'     # not, in character classes
    define  CH_EOL          '$'     # end of line symbol
    define  CH_ANY          '?'     # match any single character
    define  CH_CLOSURE      '*'     # zero or more occurrences
    define  CH_CCL          '['     # begin character class
    define  CH_CCLEND       ']'     # end character class
    define  CH_RANGE        '-'     # as in [a-z]
    define  CH_ESCAPE       '\\'    # escape character
    define  CH_WHITESPACE   '#'     # match optional whitespace
    define  CH_IGNORECASE   '{'     # begin ignoring case
    define  CH_MATCHCASE    '}'     # begin checking case
.fi

The source for the STRMATCH procedure, in file "strmatch.x", follows.
Though this is not a good example of modular code (the control flow is too
complex), it does serve to illustrate the use of many of the control flow
constructs.

.nf
    include <ctype.h>
    include <pattern.h>

    .help strmatch, gstrmatch
    .nf
    __________________________________________________________________
    STRMATCH -- Find the first occurrence of the string A in the string B.
    If not found, return zero, else return the index of the first character
    following the matched substring.

    GSTRMATCH -- More general version of strmatch.  The indices of the
    first and last characters matched are returned as arguments.  The
    function value is the same as for STRMATCH.

    STRMATCH recognizes the meta-characters BOL, EOL, ANY, WHITESPACE,
    IGNORECASE, and MATCHCASE (BOL and EOL are special only as the first
    and last chars in the pattern).  The null pattern matches any string.
    Metacharacters can be escaped.
    .endhelp _____________________________________________________________

    # STRMATCH -- Search a string for a pattern.

    int procedure strmatch (str, pat)

    char    pat[ARB], str[ARB]
    int     first_char, last_char
    int     gstrmatch()

    begin
        return (gstrmatch (str, pat, first_char, last_char))
    end

    # GSTRMATCH -- Generalized strmatch which returns the indices of the
    # match substring.

    int procedure gstrmatch (str, pat, first_char, last_char)

    char    pat[ARB], str[ARB]
    int     first_char, last_char
    bool    ignore_case, bolflag
    char    ch, pch                 # string, pattern characters
    int     i, ip, initial_pp, pp

    begin
        ignore_case = false
        bolflag = false
        ip = 1
        initial_pp = 1

        if (pat[1] == CH_BOL) {     # match at beginning of line?
            bolflag = true
            initial_pp = 2
        }

        # Try to match pattern starting at each character offset in
        # string.
        for (first_char=ip; str[ip] != EOS; ip=ip+1) {
            i = ip
            # Compare pattern to string str[ip].
            for (pp=initial_pp; pat[pp] != EOS; pp=pp+1) {
                switch (pat[pp]) {
                case CH_WHITESPACE:
                    while (IS_WHITE (str[i]))
                        i = i + 1
                case CH_ANY:
                    if (str[i] != '\n')
                        i = i + 1
                case CH_IGNORECASE:
                    ignore_case = true
                case CH_MATCHCASE:
                    ignore_case = false

                default:
                    pch = pat[pp]
                    if (pch == CH_ESCAPE && pat[pp+1] != EOS) {
                        pp = pp + 1
                        pch = pat[pp]
                    } else if (pch == CH_EOL || pch == '\n')
                        if (pat[pp+1] == EOS && str[i] == '\n') {
                            first_char = ip
                            last_char = i
                            return (last_char + 1)
                        }

                    ch = str[i]
                    i = i + 1

                    # Compare ordinary characters.  The comparison is
                    # trivial unless case insensitivity is required.
                    if (ignore_case) {
                        if (IS_UPPER (ch)) {
                            if (IS_UPPER (pch)) {
                                if (pch != ch)
                                    break
                            } else if (pch != TO_LOWER (ch))
                                break
                        } else if (IS_LOWER (ch)) {
                            if (IS_LOWER (pch)) {
                                if (pch != ch)
                                    break
                            } else if (pch != TO_UPPER (ch))
                                break
                        } else {
                            if (pch != ch)
                                break
                        }
                    } else {
                        if (pch != ch)
                            break
                    }
                }
            }

            # If the above loop was exited before the end of the pattern
            # was reached, the pattern did not match.
            if (pat[pp] == EOS) {
                first_char = ip
                last_char = i-1
                return (i)
            } else if (bolflag || str[i] == EOS)
                break
        }

        return (0)                  # no match
    end
.fi

.sh
Example 7: Error Handling

The following simple procedure reads a list of file names from the CL, and
attempts to delete each file.  The DELETE library procedure will take an
error action if it cannot delete a file; this is not what is desired, so we
post an error handler and reissue the error message from DELETE as a warning
message.

.nf
    include <error.h>

    # DELETE_FILES -- Delete a list of files.

    procedure delete_files()

    char    filename[SZ_FNAME]      # name of file to be deleted
    int     list, clpopns(), clgfil()

    begin
        # Fetch template and open it as a list of files.
        list = clpopns ("template")

        # Read successive file names from the list, and delete each
        # file.
        while (clgfil (list, filename, SZ_FNAME) != EOF)
            iferr (call delete (filename))
                call erract (EA_WARN)

        call clpcls (list)
    end
.fi

The Fortran output for the DELETE_FILES procedure is shown below.
Note the implementation of the "template" string, the mapping of long
identifiers into 6 character Fortran identifiers, and the implementation of
the while statement using GOTO.

.ks
.nf
          subroutine delets()
          integer*2 filene(33 +1)
          integer list, clpops, clgfil
          integer*2 st0001(9)
          logical xerpop
          data st0001 /116,101,109,112,108, 97,116,101, 0/
          save
          list = clpops (st0001)
    110   if (.not.(clgfil (list, filene, 33 ) .ne. (-2))) goto 111
          call xerpsh
          call delete (filene)
          if (.not.xerpop()) goto 120
          call erract (3 )
    120   continue
          goto 110
    111   continue
          call clpcls (list)
    100   return
          end
    c     delets  delete_files
    c     filene  filename
    c     clpops  clpopns
.fi
.ke
.endhelp