
<html>
<head>
<title>A User's Guide to the Z-Shell </title>
</head>
<body>
<hr>
<ul>
    <li> <a href="zshguide04.html">Next chapter</a>
    <li> <a href="zshguide02.html">Previous chapter</a>
    <li> <a href="zshguide.html">Table of contents</a>
</ul>
<hr>

<a name="syntax"></a><a name="l29"></a>
<h1>Chapter 3: Dealing with basic shell syntax</h1>
<p>This chapter is a more thorough examination of much of what appeared in the
last chapter; to be more specific, I assume you're sitting in front of your
terminal about to use the features you just set up in your initialisation
files and want to know enough to get them going.  Actually, you will
probably spend most of the time editing command lines and in particular
completing commands --- both of these activities are covered in later
chapters.  For now I'm going to talk about commands and the syntax that
goes along with using them.  This will let you write shell functions and
scripts to do more of your work for you.
<p>In the following there are often several consecutive paragraphs
about quite minor features.  If you find you read this all through the
first time, maybe you need to get out more.  Most people will probably find
it better to skim through to find what the subject matter is, then come
back if they later find they want to know more about a particular aspect of
the shell's commands and syntax.
<p>One aspect of the syntax is also left to a later chapter: there's just so
much to it, and it can be so useful if you know enough to get it right,
that it can't all be squashed in here.  The subject is expansion, covering
a multitude of things such as parameter expansion, globbing and history
expansions.  You've already met the basics of these in the last chapter;
but if you want to know how to pick a particular file with a globbing
expression with pinpoint accuracy, or how to make a single parameter
expansion reduce a long expression to the words you need, you should read
that chapter; it's more or less self-contained, so you don't necessarily
need to know everything in this one.
<p>We start with the most basic issue in any command line interpreter,
running commands.  As you know, you just type words separated by spaces,
where the first word is a command and the remainder are arguments to it.
It's important to distinguish between the types of command.
<p><a name="l30"></a>
<h2>3.1: External commands</h2>
<p>External commands are the easiest, because they have the least interaction
with the shell --- many of the commands provided by the shell itself,
which are described in the next section, are built into the shell especially
to avoid this difficulty.
<p>The only major issue is therefore how to find them.  This is done through
the parameters <code>$path</code> and <code>$PATH</code>, which, as I described in the last
chapter, are tied together because although the first one is more useful
inside the shell --- being an array, its various parts can be manipulated
separately --- the second is the one that is used by other commands called
by the shell; in the jargon, <code>$PATH</code> is `exported to the environment',
which means exactly that other commands called by the shell can see its
value.
<p>So suppose your <code>$path</code> contains
<pre>

  /home/pws/bin /usr/local/bin /bin /usr/bin

</pre>

and you try to run `<code>ls</code>'.  The shell first looks in <code>/home/pws/bin</code>
for a command called <code>ls</code>, then in <code>/usr/local/bin</code>, then in <code>/bin</code>,
where it finds it, so it executes <code>/bin/ls</code>.  Actually, the operating
system itself knows about paths if you execute a command the right way, so
the shell doesn't strictly need to.
<p>There is a subtlety here.  The shell tries to remember where the commands
are, so it can find them again the next time.  It keeps them in a so-called
`hash table', and you find the word `hash' all over the place in the
documentation: all it means is a fast way of finding some value, given a
particular key.  In this case, given the name of a command, the shell can
find the path to it quickly.  You can see this table, in the form
`key<code>=</code>value', by typing `<code>hash</code>'.
<p>In fact the shell only does this when the option <code>HASH_CMDS</code> is set, as
it is by default.  As you might expect, it stops searching when it finds
the directory with the command it's looking for.  There is an extra
optimisation in the option <code>HASH_ALL</code>, also set by default: when the
shell scans a directory to find a command, it will add all the other
commands in that directory to the hash table.  This is sensible because on
most UNIX-like operating systems reading a whole lot of files in the same
directory is quite fast.
<p>The way commands are stored has other consequences.  In particular, zsh
won't look for a new command if it already knows where to find one.  If I
put a new <code>ls</code> command in <code>/usr/local/bin</code> in the above example, zsh
would continue to use <code>/bin/ls</code> (assuming it had already been found).  To
fix this, there is the command <code>rehash</code>, which actually empties the
command hash table, so that finding commands starts again from scratch.
Users of csh may remember having to type <code>rehash</code> quite a lot with new
commands: it's not so bad in zsh, because if no command was already hashed,
or the existing one disappeared, zsh will automatically scan the path
again; furthermore, zsh performs a <code>rehash</code> of its own accord if
<code>$path</code> is altered.  So adding a new duplicate command somewhere towards
the head of <code>$path</code> is the main reason for needing <code>rehash</code>.
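<p>Here's a sketch of the automatic rescan (using a made-up scratch
directory, <code>/tmp/newbin</code>):

```shell
# Create a new command in a fresh directory, then put that directory
# at the front of the path; zsh rescans of its own accord when the
# path is altered, so no explicit rehash is needed.
mkdir -p /tmp/newbin
printf '#!/bin/sh\necho hello from newbin\n' > /tmp/newbin/hello
chmod +x /tmp/newbin/hello
PATH=/tmp/newbin:$PATH    # assigning to $PATH alters $path too
hello                     # prints: hello from newbin
```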
<p>One thing that can happen if zsh hasn't filled its command hash table and
so doesn't know about all external commands is that the <code>AUTO_CD</code> option,
mentioned in the previous chapter and again below, can think you are trying
to change to a particular directory with the same name as the command.
This is one of the drawbacks of <code>AUTO_CD</code>.
<p>To be a little bit more technical, it's actually not so obvious that
command hashing is needed at all; many modern operating systems can find
commands quickly without it.  The clincher in the case of zsh is that the
same hash table is necessary for command completion, a very commonly used
feature.  If you type `<code>compr&lt;TAB&gt;</code>', the shell completes this to
`<code>compress</code>'.  It can only do this if it has a list of commands to
complete, and this is the hash table.  (In this case it didn't need to know
where to find the command, just its name, but it's only a little extra work
to store that too.)  If you were following the previous paragraphs, you'll
realise zsh doesn't necessarily know <em>all</em> the possible commands at the
time you hit <code>TAB</code>, because it only looks when it needs to.  For this
purpose, there is another option, <code>HASH_LIST_ALL</code>, again set by default,
which will make sure the command hash table is full when you try to
complete a command.  It only needs to do this once (unless you alter
<code>$path</code>), but it does mean the first command completion is slow.  If
<code>HASH_LIST_ALL</code> is not set, command completion is not available:  the
shell could be rewritten to search the path laboriously every single time
you try to complete a command name, but it just doesn't seem worth it.
<p>The fact that <code>$PATH</code> is passed on from the shell to commands called from
it (strictly only if the variable is marked for export, as it usually is
--- this is described in more detail with the <code>typeset</code> family of builtin
commands below) also has consequences.  Some commands call subcommands of
their own using <code>$PATH</code>.  If you have that set to something unusual, so
that some of the standard commands can't be found, it could happen that a
command which <em>is</em> found nonetheless doesn't run properly because it's
searching for something it can't find in the path passed down to it.  That
can lead to some strange and confusing error messages.
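<p>A minimal sketch of how that failure looks (the script and the paths are
invented for illustration):

```shell
# A script that relies on finding a standard utility via $PATH:
printf '#!/bin/sh\ndate\n' > /tmp/show-date
chmod +x /tmp/show-date
/tmp/show-date      # works: 'date' is found on the usual path
# Hand the same script an unusual $PATH with the standard
# directories missing, and the subcommand can't be found:
PATH=/tmp /tmp/show-date || echo "the child couldn't find date"
```

The script itself <em>is</em> found (we gave its full name), but it fails
anyway because of the path passed down to it.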
<p>One important thing to remember about external commands is that the shell
continues to exist while they are running; it just hangs around doing
nothing, waiting for the job to finish (though you can tell it not to, as
we'll see).  The command is given a completely new environment in which to
run; changes in that don't affect the shell, which simply starts up where
it left off after the command has run.  So if you need to do something
which changes the state of the shell, an external command isn't good
enough.  This brings us to builtin commands.
<p><a name="l31"></a>
<h2>3.2: Builtin commands</h2>
<p>Builtin commands, or builtins for short, are commands which are part of the
shell itself.  Since builtins are necessary for controlling the shell's own
behaviour, introducing them actually serves as an introduction to quite
a lot of what is going on in the shell.  So a fair fraction of what
would otherwise appear later in the chapter has accumulated here, one way
or another.  This does make things a little tricksy in places; count how
many times I use the word `<code>subtle</code>' and keep it for your grandchildren
to see.
<p>I just described one reason for builtins, but there's a simpler one: speed.
Going through the process of setting up an entirely new environment for the
command at the beginning, swapping between this command and anything else
which is being run on the computer, then destroying it again at the end is
considerable overkill if all you want to do is, say, print out a message on
the screen.  So there are builtins for this sort of thing.
<p><a name="l32"></a>
<h3>3.2.1: Builtins for printing</h3>
<p>The commands `<code>echo</code>' and `<code>print</code>' are shell builtins; they just show
what you typed, after the shell has removed all the quoting.  The
difference between the two is really historical: `<code>echo</code>' came first, and
only handled a few simple options; ksh provided `<code>print</code>', which had more
complex options and so became a different command.  The difference remains
between the two commands in zsh; if you want wacky effects, you should look
to <code>print</code>.  Note that there is usually also an external command called
<code>echo</code>, which may not be identical to zsh's; there is no standard
external command called <code>print</code>, but if someone has installed one on your
system, the chances are it sends something to the printer, not the screen.
<p>One special effect is that `<code>print -z</code>' puts the arguments onto the editing

buffer stack, a list maintained by the shell of things you are about to
edit.  Try:
<pre>

  print -z print -z print This is a line

</pre>

(it may look as if something needs quoting, but it doesn't)
and hit return three times.  The first time caused everything after the
first `<code>print -z</code>' to appear for you to edit, and so on.
<p>For something more useful, you can write functions that give you a line to
edit:
<pre>

  fn() { print -z print The time now is $(date); }

</pre>

Now when you type `<code>fn</code>', the line with the date appears on the command
line for you to edit.  The option `<code>-s</code>' is a bit similar; the line
appears in the history list, so you will see it if you use up-arrow, but
it doesn't reappear automatically.
<p>A few other useful options, some of which you've already seen, are
<dl>
  <p></p><dt><strong><code>-r</code></strong><dd> don't interpret special character sequences like `<code>\n</code>'
  <p></p><dt><strong><code>-P</code></strong><dd> use `<code>%</code>' as in prompts
  <p></p><dt><strong><code>-n</code></strong><dd> don't put a newline at the end in case there's
    more output to follow
  <p></p><dt><strong><code>-c</code></strong><dd> print the output in columns --- this means
    that `<code>print -c *</code>' has the effect of a sort of poor person's
    `<code>ls</code>', only faster
  <p></p><dt><strong><code>-l</code></strong><dd> use one line per argument instead of one column,
    which is sometimes useful for sticking lists into files, and for
    working out what part of an array parameter is in each element.
</dl>
<p>If you don't use the <code>-r</code> option, there are a whole lot of special
character sequences.  Many of these may be familiar to you from C.
<dl>
  <p></p><dt><strong><code>\n</code></strong><dd> newline
  <p></p><dt><strong><code>\t</code></strong><dd> tab
  <p></p><dt><strong><code>\e</code> or <code>\E</code></strong><dd> escape character
  <p></p><dt><strong><code>\a</code></strong><dd> ring the bell (alarm), usually a euphemism for a hideous
    beep
  <p></p><dt><strong><code>\b</code></strong><dd> move back one character.
  <p></p><dt><strong><code>\c</code></strong><dd> don't print a newline --- like the <code>-n</code> option, but
    embedded in the string.  This alternative comes from Berkeley UNIX.
  <p></p><dt><strong><code>\f</code></strong><dd> form feed, the phrase for `advance to next page' from the
    days when terminals were called teletypes, maybe more familiar to you
    as <code>^L</code>
  <p></p><dt><strong><code>\r</code></strong><dd> carriage return --- when printed, the annoying <code>^M</code>'s
    you get in DOS files, but actually rather useful with `<code>print</code>',
    since it will erase everything to the start of the line.  The
    combination of the <code>-n</code> option and a <code>\r</code> at the start of the
    print string can give the illusion of a continuously changing status
    line.
  <p></p><dt><strong><code>\v</code></strong><dd> vertical tab, which I for one have never used (I just tried
    it now and it behaved like a newline, only without assuming a carriage
    return, but that's up to your terminal).
</dl>
In fact, you can get any of the 255 characters possible, although your
terminal may not like some or all of the ones above 127, by specifying a
number after the backslash.  Normally this consists of three octal
digits, but you can use two hexadecimal digits after <code>\x</code> instead
--- so `<code>\n</code>', `<code>\012</code>' and `<code>\x0a</code>' are all newlines.  `<code>\</code>'
itself escapes any other character, which then appears as itself even if
it normally wouldn't.
<p>Two notes: first, don't get confused because `<code>n</code>' is the fourteenth
letter of the alphabet; printing `<code>\016</code>' (fourteen in octal) won't do
you any good.  The remedy, after you discover your text is unreadable (for
VT100-like terminals including xterm), is to print `<code>\017</code>'.
<p>Secondly, those backslashes can land you in real quoting difficulties.
Normally a backslash on the command line escapes the next character ---
this is a <em>different</em> form of escaping to <code>print</code>'s --- so
<pre>

  print \n

</pre>

doesn't produce a newline, it just prints out an `<code>n</code>'.  So you need to
quote that.  This means
<pre>

  print \\ 

</pre>

passes a single backslash to <code>print</code>, and
<pre>

  print \\n

</pre>

or
<pre>

  print '\n'

</pre>

prints a newline (followed by the extra one that's usually there).  To
print a real backslash, you would thus need
<pre>

  print \\\\ 

</pre>

Actually, you can get away with just the two backslashes if there's nothing
else after them --- <code>print</code> just shrugs its shoulders and outputs what it's been given --- but
that's not a good habit to get into.  There are other ways of doing this:
since single quotes quote anything, including backslashes (they are the
only way of making backslashes behave like normal characters), and since
the `<code>-r</code>' option makes print treat characters normally,
<pre>

  print -r '\'

</pre>

has the same effect.  But you need to remember the two levels of quoting
for backslashes.  Quotes aren't special to <code>print</code>, so
<pre>

print \'

</pre>

is good enough for printing a quote.
<p><p><strong><code>echotc</code></strong><br><br>
    
<p>There's an oddity called `<code>echotc</code>', which takes as its argument
`termcap' capabilities.  Termcap is a now rather old-fashioned way of
giving the commands necessary for performing various standard operations on
terminals: moving the cursor, clearing to the end of the line, turning on
standout mode, and so on.  It has now been replaced almost everywhere by
`terminfo', a completely different way of specifying capabilities, and by
`curses', a more advanced system for manipulating objects on a character
terminal.  This means that the arguments you need to give to <code>echotc</code> can
be rather hard to come by; try the <code>termcap</code> manual page, otherwise
you'll have to search the web.  The reason the <code>zsh</code> manual doesn't give
a list is that the shell only uses a few well-known sequences, and there
are very many others which will work with <code>echotc</code>, because the sequences
are interpreted by the terminal, not the shell.
<p>This chunk gives you a flavour:
<pre>

  echotc md
  echo -n bold
  echotc mr
  echo -n reverse
  echotc me
  echo

</pre>

This should show `<code>bold</code>' in bold characters, and `<code>reverse</code>' in bold
reverse video.  The `<code>md</code>' capability turns on bold mode; `<code>mr</code>' turns
on reverse video; `<code>me</code>' turns off both modes.  A more typical zsh way of
doing this is:
<pre>

  print -P '%Bbold%Sreverse%b%s'

</pre>

which should show the same thing, but using prompt escapes --- prompts are
the most common use of special fonts.  The `<code>%S</code>' is because zsh calls
reverse `standout' mode, because it does.  (On a colour xterm, you may find
`bold' is interpreted as `blue'.)
<p>There's a lot more you can do with <code>echotc</code> if you really try.  The shell
has just acquired a way of printing terminfo sequences, predictably called
<code>echoti</code>, although it's only available on systems where zsh needs
terminfo to compile --- this happens when the termcap code is actually a
part of terminfo.  The good news about this is that terminfo tends to be
better documented, so you have a good chance of finding out the
capabilities you want from the <code>terminfo</code> manual page.
<p><a name="l33"></a>
<h3>3.2.2: Other builtins just for speed</h3>
<p>There are only a few other builtins which are there just to make things go
faster.  Strictly, tests could go into this category, but as I explained in
the last chapter it's useful to have tests in the form
<pre>

  if [[ $var1 = $var2 ]]; then
    print doing something
  fi

</pre>

be treated as a special syntax by the shell, in case <code>$var1</code> or <code>$var2</code>
expands to nothing which would otherwise confuse it.  This example
consists of two features described below:  the test itself, between the
double square brackets, which is true if the two substituted values are the
same string, and the `<code>if</code>' construct which runs the commands in the
middle (here just the <code>print</code>) if that test was true.
<p>The builtins `<code>true</code>' and `<code>false</code>' do nothing at all, except return a
command status zero or one, respectively.  They're just used as
placeholders:  to run a loop forever --- <code>while</code> will also be explained
in more detail later --- you use
<pre>

while true; do
  print doing something over and over
done

</pre>

since the test always succeeds.
<p>A synonym for `<code>true</code>' is `<code>:</code>'; it's often used in this form to give
arguments which have side effects but which shouldn't be used --- something
like
 <pre>

  : ${param:=value}

</pre>

which is a common idiom in all Bourne shell derivatives.  In the parameter
expansion, <code>$param</code> is given the value <code>value</code> if it was empty before,
and left alone otherwise.  Since that was the only reason for the
parameter expansion, you use <code>:</code> to ignore the argument.  Actually, the
shell blithely builds the command line --- the colon, followed by whatever
the value of <code>$param</code> is, whether or not the assignment happened --- then
executes the command; it just so happens that `<code>:</code>' takes no notice of
the arguments it was given.  If you're switching from ksh, you may expect
certain synonyms like this to be aliases, rather than builtins themselves,
but in zsh they are actually builtins; there are no aliases predefined by
the shell.  (You can still get rid of them using `<code>disable</code>', as
described below.)
<p><a name="l34"></a>
<h3>3.2.3: Builtins which change the shell's state</h3>
<p>A more common use for builtins is that they change something inside the
shell, or report information about what's going on in the shell.  There is
one vital thing to remember about external commands.  It applies, too, to
other cases we'll meet where the shell `forks', literally splitting itself
into two parts, where the forked-off part behaves just like an external
command.  In both of these cases, the command is in a different <em>process</em>,
UNIX's basic unit of things that run.  (In fact, even Windows knows about
processes nowadays, although they interact a little bit differently with
one another.)
<p>The vital thing is that no change in a separate process started by the
shell affects the shell itself.  The most common case of this is the
current directory --- every process has its own current directory.  You can
see this by starting a new zsh:
<pre>

  % pwd               # show the current directory
  ~
  % zsh               # start a new shell, which 
                      # is a separate process
  % cd tmp
  % pwd               # now I'm in a different
                      # directory...
  ~/tmp
  % exit              # leave the new shell...
  % pwd               # now I'm back where I was...
  ~

</pre>

Hence the <code>cd</code> command must be a shell builtin, or this would happen
every time you ran it.
<p>Here's a more useful example.  Putting parentheses around a command asks
the shell to start a different process for it.  That's useful when you
specifically <em>don't</em> want the effects propagating back:
<pre>

  (cd some-other-dir; run-some-command)

</pre>

runs the command, but doesn't change the directory the `real' shell is in,
only its forked-off `subshell'.  Hence,
<pre>

  % pwd
  ~
  % (cd /; pwd)
  /
  % pwd
  ~

</pre>

<p>There's a more subtle case:
<pre>

  cd some-other-dir | print Hello

</pre>

Remember, the `<code>|</code>' (`pipe') connects the output of the first command to
the input of the next --- though actually no information is passed that way
in this example.  In zsh, all but the last portion of the `pipeline'
thus created is run in different processes.  Hence the <code>cd</code> doesn't
affect the main shell.  I'll refer to it as the `parent' shell, which is
the standard UNIX language for processes; when you start another command or
fork off a subshell, you are creating `children' (without meaning to be
morbid, the children usually die first in this case).  Thus, as you would
guess,
<pre>

  print Hello | cd some-other-dir

</pre>

<em>does</em> have the effect of changing the directory.  Note that other
shells do this differently; it is always guaranteed to work this way in
zsh, because many people rely on it for setting parameters, but many
shells have the <em>left</em> hand of the pipeline being the bit that runs
in the parent shell.  If both sides of the pipe symbol are external
commands of some sort, both will of course run in subprocesses.
<p>There are other ways you change the state of the shell, for example by
declaring parameters of a particular type, or by telling it how to
interpret certain commands, or, of course, by changing options.  Here are
the most useful, grouped in a vaguely logical fashion.
<p><a name="l35"></a>
<h3>3.2.4: cd and friends</h3>
<p>You will not by now be surprised to learn that the `<code>cd</code>' command changes
directory.  There is a synonym, `<code>chdir</code>', which as far as I know no-one
ever uses.  (It's the same name as the system call, so if you had been
programming in C or Perl and forgot that you were now using the shell,
you might use `<code>chdir</code>'.  But that seems a bit far-fetched.)
<p>There are various extra features built into <code>cd</code> and <code>chdir</code>.  First,
if you miss out the directory to which you want to change, you will be
taken to your home directory, although it's not as if `<code>cd ~</code>' is all
that hard to type.
<p>Next, the command `<code>cd -</code>' is special:  it takes you to the last
directory you were in.  If you do a sequence of <code>cd</code> commands, only the
immediately preceding directory is remembered; they are not stacked up.
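<p>For example:

```shell
cd /tmp
cd /
cd -        # back to the previous directory: /tmp
cd -        # and back again: / --- the two just swap over
```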
<p>Thirdly, there is a shortcut for changing between similarly named
directories.  If you type `<code>cd &lt;old&gt; &lt;new&gt;</code>', then the shell will look
for the first occurrence of the string `<code>&lt;old&gt;</code>' in the current directory,
and try to replace it with `<code>&lt;new&gt;</code>'.  For example,
<pre>

  % pwd
  ~/src/zsh-3.0.8/Src
  % cd 0.8 1.9
  ~/src/zsh-3.1.9/Src

</pre>

The <code>cd</code> command actually reported the new directory, as it usually does
if it's not entirely obvious where it's taken you.
<p>Note that only the <em>first</em> match of <code>&lt;old&gt;</code> is taken.  It's an easy
mistake to think you can change from <code>/home/export1/pws/mydir1/something</code>
to <code>/home/export1/pws/mydir2/something</code> with `<code>cd 1 2</code>', but that first
`<code>1</code>' messes it up.  Arguably the shell could be smarter here.  Of
course, `<code>cd r1 r2</code>' will work in this case.
<p><code>cd</code>'s friend `<code>pwd</code>' (print working directory) tells you what the
current working directory is; this information is also available in the
shell parameter <code>$PWD</code>, which is special and automatically updated when
the directory changes.  Later, when you know all about expansion, you will
find that you can do tricks with this to refer to other directories.
For example, <code>${PWD/old/new}</code> uses the parameter substitution mechanism
to refer to a different directory with <code>old</code> replaced by <code>new</code> --- and
this time <code>old</code> can be a pattern, i.e. something with wildcard matches in
it.  So if you are in the <code>zsh-3.0.8/Src</code> directory as above and want to
copy a file from the <code>zsh-3.1.9/Src</code> directory, you have a shorthand:
<pre>

  cp ${PWD/0.8/1.9}/myfile.c .

</pre>

<p><p><strong>Symbolic links</strong><br><br>
    
<p>Zsh tries to track directories across symbolic links.  If you're not
familiar with these, you can think of them as a filename which behaves like
a pointer to another file (a little like Windows' shortcuts, though UNIX
has had them for much longer and they work better).  You create them like
this (<code>ln</code> is not a builtin command, but its use to make symbolic links
is very standard these days):
<pre>

  ln -s existing-file-name name-of-link

</pre>

for example
<pre>

  ln -s /usr/bin/ln ln

</pre>

creates a file called <code>ln</code> in the current directory which does nothing
but point to the file <code>/usr/bin/ln</code>.  Symbolic links are very good
at behaving as much like the original file as you usually want; for
example, you can run the <code>ln</code> link you've just created as if it were
<code>/usr/bin/ln</code>.  They show up differently in a long file listing with
`<code>ls -l</code>', the last column showing the file they point to.
<p>You can make them point to any sort of file at all, including directories,
and that is why they are mentioned here.  Suppose you create a symbolic
link in your home directory pointing to the root directory and change into it:
<pre>

  ln -s / ~/mylink
  cd ~/mylink

</pre>

If you don't know it's a link, you expect to be able to change to the
parent directory by doing `<code>cd ..</code>'.  However, the operating system ---
which just has one set of directories starting from <code>/</code> and going down,
and ignores symbolic links after it has followed them, they really are
just pointers --- thinks you are in the root directory <code>/</code>.  This can be
confusing.  Hence zsh tries to keep track of where <em>you</em> probably think
you are, rather than where the system does.  If you type `<code>pwd</code>', you
will see `<code>/home/you/mylink</code>' (wherever your home directory is), not
`<code>/</code>'; if you type `<code>cd ..</code>', you will find yourself back in your home
directory.
<p>You can turn all this second-guessing off by setting the option
<code>CHASE_LINKS</code>; then `<code>cd ~/mylink; pwd</code>' will show you to be in <code>/</code>,
where changing to the parent directory has no effect; the parent of the
root directory is the root directory, except on certain slightly
psychedelic networked file systems.  This does have advantages: for
example, `<code>cd ~/mylink; ls ..</code>' always lists the root directory, not your
home directory, regardless of the option setting, because <code>ls</code> doesn't
know about the links you followed, only zsh does, and it treats the <code>..</code>
as referring to the root directory.  Having <code>CHASE_LINKS</code> set allows
`<code>pwd</code>' to warn you about where the system thinks you are.
<p>An aside for non-UNIX-experts (over 99.9% of the population of the world
at the last count): I said `symbolic links' instead of just `links' because
there are others called `hard links'.  This is what `<code>ln</code>' creates if you
don't use the <code>-s</code> option.  A hard link is not so much a pointer to a
file as an alternative name for a file.  If you do
<pre>

  ln myfile othername
  ls -l

</pre>

where <code>myfile</code> already exists you can't tell which of <code>myfile</code> and
<code>othername</code> is the original --- and in fact the system doesn't care.  You
can remove either, and the other will be perfectly happy as the name for
the file.  This is pretty much how renaming files works, except that
creating the hard link is done for you in that case.  Hard links have
limitations --- you can't link to directories, or to a file on another disk
partition (and if you don't know what a disk partition is, you'll see what
a limitation that can be).  Furthermore, you usually want to know which is
the original and which is the link --- so for most users, creating symbolic
links is more useful.  The only drawback is that following the pointers is
a tiny bit slower; if you think you can notice the difference, you
definitely ought to slow down a bit.
<p>The target of a symbolic link, unlike a hard link, doesn't actually have to
exist and no checking is performed until you try to use the link.  The best
thing to do is to run `<code>ls -lL</code>' when you create the link; the <code>-L</code>
part tells <code>ls</code> to follow links, and if it worked you should see that
your link is shown as having exactly the same characteristics as the file
it points to.  If it is still shown as a link, there was no such file.
<p>While I'm at it, I should point out one slight oddity with symbolic links:
the name of the file linked to (the first name), if it is not an absolute
path (beginning with <code>/</code> after any <code>~</code> expansion), is treated relative
to the directory where the link is created --- not the current directory
when you run <code>ln</code>.  Here:
<pre>

  ln -s ../mydir ~/links/otherdir

</pre>

the link <code>otherdir</code> will refer to <code>mydir</code> in <em>its own</em> parent
directory, i.e. <code>~/links</code> --- not, as you might think, the parent of the
directory where you were when you ran the command.  What makes it worse is
that the second word, if it is not an absolute path, <em>is</em> interpreted
relative to the directory where you ran the command.
<p><p><strong>$cdpath and AUTO_CD</strong><br><br>
    
<p>We're nowhere near the end of the magic you can do with directories yet
(and, in fact, I haven't even got to the zsh-specific parts).  The next
trick is <code>$cdpath</code> and <code>$CDPATH</code>.  They look a lot like <code>$path</code> and
<code>$PATH</code>, which you met in the last chapter, where I also mentioned
<code>$cdpath</code> and <code>$CDPATH</code> briefly:  <code>$cdpath</code> is an array of
directories, while <code>$CDPATH</code> is a colon-separated list behaving otherwise
like a scalar variable.  They give a list of directories whose
subdirectories you may want to change into.  If you use a normal cd command
(i.e. in the form `<code>cd </code>dirname', and dirname does not begin with
a <code>/</code> or <code>~</code>), the shell will look through the directories in
<code>$cdpath</code> to find one which contains the subdirectory dirname.
If <code>$cdpath</code> isn't set, as you'd guess, it just uses the current
directory.
<p>Note that <code>$cdpath</code> is always searched in order, and you can put a <code>.</code>
in it to represent the current directory.  If you do, the current directory
will always be searched <em>at that point</em>, not necessarily first, which may
not be what you expect.  For example, let's set up some directories:
<pre>

  mkdir ~/crick ~/crick/dna
  mkdir ~/watson ~/watson/dna
  cdpath=(~/crick .)
  cd ~/watson
  cd dna

</pre>

So I've moved to the directory <code>~/watson</code>, which contains the
subdirectory <code>dna</code>, and done `<code>cd dna</code>'.  But because of <code>$cdpath</code>,
the shell will look first in <code>~/crick</code>, and find the <code>dna</code> there, and
take you to that copy of the self-reproducing directory, not the one in
<code>~/watson</code>.  Most people have <code>.</code> at the start of their <code>cdpath</code> for
that reason.  However, at least <code>cd</code> warns you --- if you tried it, you
will see that it prints the name of the directory it's picked in cases like
this.
<p>In fact, if you don't have <code>.</code> in your <code>$cdpath</code> at all, the shell will
always look there first; there's no way of making <code>cd</code> never change to
a subdirectory of the current one, short of turning <code>cd</code> into a
function.  Some shells don't do this; they use the directories in
<code>$cdpath</code>, and only those.
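<p>Here is the example above as a self-contained sketch you can paste into a
terminal; it runs the commands in a clean zsh with <code>zsh -f</code>, so your
own <code>$cdpath</code> is not disturbed:

```shell
# The cdpath directories are searched in order, so crick wins even
# though the current directory also contains a dna subdirectory.
tmp=$(mktemp -d)
mkdir -p "$tmp/crick/dna" "$tmp/watson/dna"
where=$(zsh -f -c "
  cdpath=($tmp/crick .)
  cd $tmp/watson
  cd dna >/dev/null     # cd prints the directory it picked; discard that
  print \$PWD
")
echo "$where"
rm -rf "$tmp"
```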
<p>There's yet another shorthand, this time specific to zsh: the option
<code>AUTO_CD</code> which I mentioned in the last chapter.  That way a command
without any arguments which is really a directory will take you to that
directory.  Normally that's perfect --- you would just get a `command not
found' message otherwise, and you might as well make use of the option.
Just occasionally, however, the name of a directory clashes with the name
of a command, builtin or external, or a shell function, and then there can
be some confusion:  zsh will always pick the command as long as it knows
about it, but there are cases where it doesn't, as I described above.
<p>What I didn't say in the last chapter is that <code>AUTO_CD</code> respects
<code>$cdpath</code>; in fact, it really is implemented so that `dirname' on
its own behaves as much like `<code>cd</code> dirname' as is possible without
tying the shell's insides into knots.
<p><p><strong>The directory stack</strong><br><br>
    
<p>One very useful facility that zsh inherited from the C-shell family
(traditional Korn shell doesn't have it) is the directory stack.  This is a
list of directories you have recently been in.  If you use the command
`<code>pushd</code>' instead of `<code>cd</code>', e.g. `<code>pushd</code> dirname', then the
directory you are in is saved in this list, and you are taken to
dirname, using <code>$CDPATH</code> just as <code>cd</code> does.  Then when you type
`<code>popd</code>', you are taken back to where you were.  The list can be as long
as you like; you can <code>pushd</code> any number of directories, and each <code>popd</code>
will take you back through the list (this is how a `stack', or more
precisely a `last-in-first-out' stack usually operates in computer jargon,
hence the name `directory stack').
<p>You can see the list --- which always starts with the current
directory --- with the <code>dirs</code> command.  So, for example:
<pre>

  cd ~
  pushd ~/src
  pushd ~/zsh
  dirs

</pre>

displays
<pre>

  ~/zsh ~/src ~

</pre>

and the next <code>popd</code> will take you back to <code>~/src</code>.  If you do it, you
will see that <code>pushd</code> reports the list given by <code>dirs</code> automatically as
it goes along; you can turn this off with the option <code>PUSHD_SILENT</code>, when
you will have to rely on typing <code>dirs</code> explicitly.
<p>In fact, a lot of the use of this comes not from using simple <code>pushd</code> and
<code>popd</code> combinations, but from two other features.  First, `<code>pushd</code>' on
its own swaps the top two directories on the stack.  Second, <code>pushd</code> with
a numeric argument preceded by a `<code>+</code>' or `<code>-</code>' can take you to one of
the other directories in the list.  The command `<code>dirs -v</code>' tells you the
numbers you need; <code>0</code> is the current directory.  So if you get,
<pre>

  0       ~/zsh
  1       ~/src
  2       ~

</pre>

then `<code>pushd +2</code>' takes you to <code>~</code>.  (A little suspension of disbelief
that I didn't just use <code>AUTO_CD</code> and type `<code>..</code>' is required here.)
If you use a <code>-</code>, it counts from the other end of the list; <code>-0</code> (with
apologies to the numerate) is the last item, i.e. the same as <code>~</code> in this
case.  Some people are used to having the `<code>-</code>' and `<code>+</code>'
arguments behave the other way around; the option <code>PUSHD_MINUS</code> exists
for this.
<p>Apart from <code>PUSHD_SILENT</code> and <code>PUSHD_MINUS</code>, there are a few other
relevant options.  Setting <code>PUSHD_IGNORE_DUPS</code> means that if you
<code>pushd</code> to a directory which is already somewhere in the list, the
duplicate entry will be silently removed.  This is useful for most human
operations --- however, if you are using <code>pushd</code> in a function or script
to remember previous directories for a future matching <code>popd</code>, this can
be dangerous and you probably want to turn it off locally inside the
function.
<p><code>AUTO_PUSHD</code> means that any directory-changing command, including an
auto-cd, is treated as a <code>pushd</code> command with the target directory as
argument.  Using this can make the directory stack get very long, and there
is a parameter <code>$DIRSTACKSIZE</code> which you can set to specify a maximum
length.  The oldest entry (the highest number in the `<code>dirs -v</code>' listing)
is automatically removed when this length is exceeded.  There is no limit
unless this is explicitly set.
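<p>A short sketch of <code>AUTO_PUSHD</code> with a capped stack, again run in a
clean zsh; the directory names are made up:

```shell
# With AUTO_PUSHD every cd behaves like pushd; DIRSTACKSIZE stops
# the stack growing without limit.
tmp=$(mktemp -d)
mkdir -p "$tmp/a" "$tmp/b" "$tmp/c"
stack=$(zsh -f -c "
  setopt autopushd pushdsilent
  DIRSTACKSIZE=4
  cd $tmp/a; cd $tmp/b; cd $tmp/c
  dirs                  # current directory first, oldest entries last
")
echo "$stack"
rm -rf "$tmp"
```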
<p>The final <code>pushd</code> option is <code>PUSHD_TO_HOME</code>.  This makes <code>pushd</code> on
its own behave like <code>cd</code> on its own in that it takes you to your home
directory, instead of swapping the top two directories.  Normally a series
of `<code>pushd</code>' commands works pretty much like a series of `<code>cd -</code>'
commands, always taking you to the directory you were in before, with the
obvious difference that `<code>cd -</code>' doesn't consult the directory stack, it
just remembers the previous directory automatically, and hence it can
confuse <code>pushd</code> if you just use `<code>cd -</code>' instead.
<p>There's one remaining subtlety with <code>pushd</code>, and that is what happens to
the rest of the list when you bring a particular directory to the front
with something like `<code>pushd +2</code>'.  Normally the list is simply cycled, so
the directories which were +3 and +4 are now right behind the new head of
the list, while the two directories which were ahead of it get moved to the
end.  If the list before was:
<pre>

  dir1  dir2  dir3  dir4

</pre>

then after <code>pushd +2</code> you get
<pre>

  dir3  dir4  dir1  dir2

</pre>

That behaviour changed during the lifetime of zsh, and some of us preferred
the old behaviour, where that one directory was yanked to the front and the
rest just closed the gap:
<pre>

  # Old behaviour
  dir3  dir1  dir2  dir4

</pre>

so that after a while you get a `greatest hits' group at the front of the
list.  If you like this behaviour too (I feel as if I'd need to have written
papers on group theory to like the new behaviour) there is a function
<code>pushd</code> supplied with the source code, although it's short enough to
repeat here --- this is in the form for autoloading in the zsh fashion:
<pre>

  # pushd function to emulate the old zsh behaviour.
  # With this, pushd +/-n lifts the selected element
  # to the top of the stack instead of cycling
  # the stack.

  emulate -R zsh
  setopt localoptions

  if [[ ARGC -eq 1 &amp;&amp; "$1" == [+-]&lt;-&gt; ]] then
          setopt pushdignoredups
          builtin pushd ~$1
  else
          builtin pushd "$@"
  fi

</pre>

The `<code>&amp;&amp;</code>' is a logical `and', requiring both tests to be true.  The
tests are that there is exactly one argument to the function, and that it
has the form of a `<code>+</code>' or a `<code>-</code>' followed by any number (`<code>&lt;-&gt;</code>' is
a special zsh pattern to match any number, an extension of forms like
`<code>&lt;1-100&gt;</code>' which matches any number in the range 1 to 100 inclusive).
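<p>If the <code>&lt;-&gt;</code> pattern is new to you, this sketch (in a clean zsh)
shows how the test in the function above behaves on a few sample arguments:

```shell
# [+-]<-> matches a `+' or `-' followed by any run of digits.
out=$(zsh -f -c '
  for arg in +3 -12 +x 42; do
    if [[ $arg == [+-]<-> ]]; then
      print "$arg matches"
    else
      print "$arg does not match"
    fi
  done
')
echo "$out"
```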
<p><p><strong>Referring to other directories</strong><br><br>
    
<p>Zsh has two ways of allowing you to refer to particular directories.  They
have in common that they begin with a <code>~</code> (in very old versions of zsh,
the second form actually used an `<code>=</code>', but the current way is much more
logical).
<p>You will certainly be aware, because I've made a lot of use of it, that a
`<code>~</code>' on its own or followed by a <code>/</code> refers to your own home
directory.  An extension of this --- again from the C-shell, although the
Korn shell has it too in this case --- is that <code>~name</code> can refer to the
home directory of any user on the system.  So if your user name is <code>pws</code>,
then <code>~</code> and <code>~pws</code> are the same directory.
<p>Zsh has an extension to this; you can actually name your own directories.
This was described in chapter 2, &agrave; propos of prompts, since that is the
major use:
<pre>

  host% PS1='%~? '
  ~? cd zsh/Src
  ~/zsh/Src? zsrc=$PWD
  ~/zsh/Src? echo ~zsrc
  /home/pws/zsh/Src
  ~zsrc?

</pre>

Consult chapter 2 for the ways of forcing a parameter to be recognised
as a named directory.
<p>There's a slightly more sophisticated way of doing this directly:
<pre>

  hash -d zsrc=~/zsh/Src

</pre>

makes <code>~zsrc</code> appear in prompts as before, and in this case there is no
parameter <code>$zsrc</code>.  This is the purist's way (although very few zsh users
are purists).  You can guess what `<code>unhash -d zsrc</code>' does; this works
with directories named via parameters, too, but leaves the parameter itself
alone.
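<p>For instance, in a clean zsh (the name and path here are arbitrary):

```shell
# hash -d names a directory without creating a parameter.
out=$(zsh -f -c '
  hash -d zsrc=/usr/local/src   # the path need not even exist yet
  print ~zsrc                   # tilde expansion uses the name
  print ${+zsrc}                # 1 if a parameter $zsrc exists, 0 if not
')
echo "$out"
```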
<p>It's possible to have a named directory with the same name as a user.  In
that case `<code>~name</code>' refers to the directory you named explicitly, and
there is no easy way of getting <code>name</code>'s home directory without removing
the name you defined.
<p>If you're using named directories with one of the <code>cd</code>-like commands or
<code>AUTO_CD</code>, you can set the option <code>CDABLEVARS</code> which allows you to
omit the leading <code>~</code>; `<code>cd zsrc</code>' with this option would take you to
<code>~zsrc</code>.  The name is a historical artifact and now a misnomer; it really
is named directories, not parameters (i.e. variables), which are used.
<p>The second way of referring to directories with <code>~</code>'s is to use numbers
instead of names:  the numbers refer to directories in the directory
stack.  So if <code>dirs -v</code> gives you
<pre>

  0       ~zsf
  1       ~src

</pre>

then <code>~+1</code> and <code>~-0</code> (not very mathematical, but quite logical if you
think about it) refer to <code>~src</code>.  In this case, unlike pushd
arguments, you can omit the <code>+</code> and use <code>~1</code>.  The option
<code>PUSHD_MINUS</code> is respected.  You'll see this was used in the <code>pushd</code>
function above: the trick was that <code>~+3</code>, for example, refers to the same
element as <code>pushd +3</code>, hence <code>pushd ~+3</code> pushed that directory onto the
front of the list.  However, we set <code>PUSHD_IGNORE_DUPS</code>, so that the
value in the old position was removed as well, giving us the effect we
wanted of simply yanking the directory to the front with no trick cycling.
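<p>A minimal demonstration of the numbered form, using a clean zsh and
throwaway directories:

```shell
# After one pushd the stack is: 0 = current dir, 1 = previous dir,
# so ~1 expands to the previous directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/one" "$tmp/two"
out=$(zsh -f -c "
  cd $tmp/one
  pushd $tmp/two >/dev/null
  print ~1
")
echo "$out"
rm -rf "$tmp"
```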
<p><a name="l36"></a>
<h3>3.2.5: Command control and information commands</h3>
<p>Various builtins exist which control how you access commands, and which
show you information about the commands which can be run.
<p>The first two are strictly speaking `precommand modifiers' rather than
commands:  that means that they go before a command line and modify its
behaviour, rather than being commands in their own right.  If you put
`<code>command</code>' in front of a command line, the command word (the next one
along) will be taken as the name of an external command, however it would
normally be interpreted; likewise, if you put `<code>builtin</code>' in front, the
shell will try to run the command as a builtin command.  Normally, shell
functions take precedence over builtins which take precedence over external
commands.  So, for example, if your printer control system has the command
`<code>enable</code>' (as many System V versions do), which clashes with a builtin I
am about to talk about, you can run `<code>command enable lp</code>' to enable a
printer; otherwise, the builtin enable would have been run.  Likewise, if
you have defined <code>cd</code> to be a function, but this time want to call the
normal builtin <code>cd</code>, you can say `<code>builtin cd mydir</code>'.
<p>A common use for <code>command</code> is inside a shell function of the same name.
Sometimes you want to enhance an ordinary command by sticking some extra
stuff around it, then calling that command, so you write a shell function
of the same name.  To call the command itself inside the shell function,
you use `<code>command</code>'.  The following works, although it's obviously not
all that useful as it stands:
<pre>

  ls() {
    command ls "$@"
  }

</pre>

so when you run `<code>ls</code>', it calls the function, which calls the real
<code>ls</code> command, passing on the arguments you gave it.
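<p>A slightly more realistic version of the same trick: the wrapper adds an
option before handing over to the real command (the <code>-F</code> flag, which
marks directories with a trailing slash, is just an example):

```shell
# Wrap ls in a function that always passes -F to the real command.
tmp=$(mktemp -d)
mkdir -p "$tmp/sub"
out=$(WRAPDIR=$tmp zsh -f -c '
  ls() {
    command ls -F "$@"    # call the external ls, not the function
  }
  ls $WRAPDIR
')
echo "$out"
rm -rf "$tmp"
```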
<p>You can gain longer lasting control over the commands which the shell will
run with the `<code>disable</code>' and `<code>enable</code>' commands.  The first normally
takes builtin arguments; each such builtin will not be recognised by the
shell until you give an `<code>enable</code>' command for it.  So if you want to be
able to run the external <code>enable</code> command and don't particularly care
about the builtin version, `<code>disable enable</code>' (sorry if that's confusing)
will do the trick.  Ha, you're thinking, you can't run `<code>enable enable</code>'.
That's correct: some time in the dim and distant past, `<code>builtin enable
enable</code>' would have worked, but currently it doesn't; this may change, if I
remember to change it.  You can list all disabled builtins with just
`<code>disable</code>' on its own --- most of the builtins that do this sort of
manipulation work like that.
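<p>To see it in action, in a clean zsh:

```shell
# After `disable enable', the builtin is hidden; `disable' alone
# lists whatever is currently disabled.
out=$(zsh -f -c '
  disable enable
  disable
')
echo "$out"
```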
<p>You can manipulate other sets of commands with <code>disable</code> and <code>enable</code>
by giving different options: aliases with the option <code>-a</code>, functions with
<code>-f</code>, and reserved words with <code>-r</code>.  The first two you probably know
about, and I'll come to them anyway, but `reserved words' need describing.
They are essentially builtin commands which have some special syntactic
meaning to the shell, including some symbols such as `<code>{</code>' and `<code>[[</code>'.
They take precedence over everything else except aliases --- in fact, since
they're syntactically special, the shell needs to know very early on that
it has found a reserved word, it's no use just waiting until it tries to
execute a command.  For example, if the shell finds `<code>[[</code>' it needs to
know that everything until `<code>]]</code>' must be treated as a test rather than
as ordinary command arguments.  Consequently, you wouldn't often want to
disable a reserved word, since the shell wouldn't work properly.  The most
obvious reason why you might would be for compatibility with some other
shell which didn't have one.  You can get a complete list with:
<pre>

  whence -wm '*' | grep reserved

</pre>

which I'll explain below, since I'm coming to `<code>whence</code>'.
<p>Furthermore, I tend to find that if I want to get rid of aliases or
functions I use the commands `<code>unalias</code>' and `<code>unfunction</code>' to get rid
of them permanently, since I always have the original definitions stored
somewhere, so these two options may not be that useful either.  Disabling
builtins is definitely the most useful of the four possibilities for
<code>disable</code>.
<p>External commands have to be manipulated differently.  The types given
above are handled internally by the shell, so all it needs to do is
remember what code to call.  With external commands, the issue instead
is how to find them.  I mentioned <code>rehash</code> above, but didn't tell you
that the <code>hash</code> command, which  you've already
seen with the <code>-d</code> option, can be used to tell the shell how to find an
external command:
<pre>

  hash foo=/path/to/foo

</pre>

makes <code>foo</code> execute the command using the path shown (which doesn't even
have to end in `<code>foo</code>').  This is rather like an alias --- most people
would probably do this with an alias, in fact --- and a little faster,
although you're unlikely to notice the difference.  You can remove this with
<code>unhash</code>.  One gotcha here is that if the path is rehashed, either by
calling <code>rehash</code> or when you alter <code>$path</code>, the entire hash table is
emptied, including anything you put in in this way; so it's not
particularly useful.
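<p>For example, in a clean zsh (the name <code>mycat</code> and the path
<code>/bin/cat</code> are just for illustration):

```shell
# hash can give any name an explicit path; whence -w reports it
# as `hashed'.
out=$(zsh -f -c '
  hash mycat=/bin/cat
  whence -w mycat
  print hello | mycat
')
echo "$out"
```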
<p>In the midst of all this, it's useful to be able to find out what the shell
thinks a particular command name does.  The command `<code>whence</code>' tells you
this; it also exists, with slightly different options, under the names
<code>where</code>, <code>which</code> and <code>type</code>, largely to provide compatibility with
other shells.  I'll just stick to <code>whence</code>.
<p>Its standard output isn't actually sparklingly interesting.  If it's a
command somehow known to the shell internally, it gets echoed back, with
the alias expanded if it was an alias; if it's an external command it's
printed with the full path, showing where it came from; and if it's not
known the command returns status 1 and prints nothing.
<p>You can make it more useful with the <code>-v</code> or <code>-c</code> options, which are
more verbose; the first prints out an information message, while the second
prints out the definitions of any functions it was asked about (this is
also the effect of using `<code>which</code>' instead of `<code>whence</code>').  A very
useful option is <code>-m</code>, which takes any arguments as patterns using the
usual zsh pattern format, in other words the same one used for matching
files.  Thus
<pre>

  whence -vm "*"

</pre>

prints out every command the shell knows about, together with what it
thinks of it.
<p>Note the quotes around the `<code>*</code>' --- you have to remember these anywhere
where the pattern is not to be used to generate filenames on the command
line, but instead needs to be passed to the command to be interpreted.  If
this seems a rather subtle distinction, think about what would happen if
you ran
<pre>

  # Oops.  Better not try this at home.
  # (Even better, don't do it at work either.)
  whence -vm *

</pre>

in a directory with the files `<code>foo</code>' and (guess what) `<code>bar</code>' in
it.  The shell hasn't decided what command it's going to run when it first
looks at the command line; it just sees the `<code>*</code>' and expands the line to
<pre>

  whence -vm foo bar

</pre>

which isn't what you meant.
<p>There are a couple of other tricks worth mentioning:  <code>-p</code> makes
the shell search your path for them, even if the name is matched as
something else (say, a shell function).  So if you have <code>ls</code>
defined as a function,
<pre>

  which -p ls

</pre>

will still tell you what `<code>command ls</code>' would find.  Also, the option
<code>-a</code> searches for all commands; in the same example, this would show you
both the <code>ls</code> command and the <code>ls</code> function, whereas <code>whence</code> would
normally only show the function because that's the one that would be run.
The <code>-a</code> option also shows if it finds more than one external command in
your path.
<p>Finally, the option <code>-w</code> is useful because it identifies the type of a
command with a single word:  <code>alias</code>, <code>builtin</code>, <code>command</code>,
<code>function</code>, <code>hashed</code>, <code>reserved</code> or <code>none</code>.  Most of those are
obvious, with <code>command</code> being an ordinary external command; <code>hashed</code> is
an external command which has been explicitly given a path with the
<code>hash</code> builtin, and <code>none</code> means it wasn't recognised as a command at
all.  Now you know how we extracted the reserved words above.
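<p>A quick sample of the <code>-w</code> output, in a clean zsh; the last name is
deliberately one that shouldn't exist:

```shell
# whence -w labels each name with a single-word type.
out=$(zsh -f -c '
  whence -w if cd no-such-command-here
  true                  # whence returns non-zero for the missing one
')
echo "$out"
```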
<p>A close relative of <code>whence</code> is <code>functions</code>, which applies, of course,
to shell functions; it usually lists the definitions of all functions given
as arguments, but its relatives (of which <code>autoload</code> is one) perform
various other tricks, to be described in the section on shell functions
below.  Be careful with <code>function</code>, without the `s', which is completely
different and not like <code>command</code> or <code>builtin</code> --- it is actually a
keyword used to <em>define</em> a function.
<p><a name="l37"></a>
<h3>3.2.6: Parameter control</h3>
<p>There are various builtins for controlling the shell's parameters.
You already know how to set and use parameters, but it's a good deal more
complicated than that when you look at the details.
<p><p><strong>Local parameters</strong><br><br>
    
<p>The principal command for manipulating the behaviour of parameters is
`<code>typeset</code>'.  Its easiest usage is to declare a parameter; you just give
it a list of parameter names, which are created as scalar parameters.  You
can create parameters just by assigning to them, but the major point of
`<code>typeset</code>' is that if a parameter is created that way inside a function,
the parameter is restored to its original value, or removed if it didn't
previously exist, at the end of the function --- in other words, it has
`local scope' like the variables which you declare in most ordinary
programming languages.  In fact, to use the jargon it has `dynamical'
rather than `syntactic' scope, which means that the same parameter is
visible in any function called within the current one; this is different
from, say, C or FORTRAN where any function or subroutine called wouldn't
see any variable declared in the parent function.
<p>The following makes this more concrete.
<pre>

  var='Original value'
  subfn() {
    print $var
  }
  fn() {
    print $var
    typeset var='Value in function'
    print $var
    subfn
  }
  fn
  print $var

</pre>

This chunk of code prints out
<pre>

  Original value
  Value in function
  Value in function
  Original value

</pre>

The first three chunks of the code just define the parameter <code>$var</code>, and
two functions, <code>subfn</code> and <code>fn</code>.  Then we call <code>fn</code>.  The first thing
this does is print out <code>$var</code>, which gives `<code>Original value</code>' since we
haven't changed the original definition.  However, the <code>typeset</code> next
does that; as you see, we can assign to the parameter during the typeset.
Thus when we print <code>$var</code> out again, we get `<code>Value in function</code>'.
Then <code>subfn</code> is called, which prints out the same value as in <code>fn</code>,
because we haven't changed it --- this is where C or FORTRAN would differ,
and wouldn't recognise the variable because it hadn't been declared in that
function.  Finally, <code>fn</code> exits and the original value is restored, and is
printed out by the final `<code>print</code>'.
<p>Note the value changes twice: first at the <code>typeset</code>, then again at the
end of <code>fn</code>.  The value of <code>$var</code> at any point will be one of those
two values.
<p>Although you can do assignments in a <code>typeset</code> statement, you can't
assign to arrays (I already said this in the last chapter):
<pre>

  typeset var=(Doesn\'t work\!)

</pre>

because the syntax with the parentheses is special; it only works when the
line consists of nothing but assignments.  However, the shell doesn't
complain if you try to assign an array to a scalar, or vice versa; it just
silently converts the type:
<pre>

  typeset var='scalar value'
  var=(array value)

</pre>

I put the assignment in the <code>typeset</code> statement to rub in the point that
it creates scalars, but actually the usual way of setting up an array in
a function is
<pre>

  typeset var
  var=()

</pre>

which creates an empty scalar, then converts that to an empty array.
Recent versions of the shell have `<code>typeset -a var</code>' to do that in one go
--- but you <em>still</em> can't assign to it in the same statement.
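<p>Putting that together in a clean zsh, including the local scope:

```shell
# typeset -a declares an empty local array; the assignment still
# has to be a separate statement.
out=$(zsh -f -c '
  fn() {
    typeset -a var
    var=(array value)
    print ${#var}       # number of elements inside the function
  }
  fn
  print ${#var}         # the local array has gone: length 0
')
echo "$out"
```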
<p>There are other catches associated with the fact that <code>typeset</code> and its
relatives are just ordinary commands with ordinary sets of arguments.
Consider this:
<pre>

  % typeset var=`echo two words`
  % print $var
  two

</pre>

What has happened to the `<code>words</code>'?  The answer is that backquote
substitution, to be discussed below, splits words when not quoted.  So the
<code>typeset</code> statement is equivalent to
<pre>

  % typeset var=two words

</pre>

There are two ways to get round this; first, use an ordinary assignment:
<pre>

  % typeset var
  % var=`echo two words`

</pre>

where the shell can tell it's a scalar assignment and hence knows not to
split words; or, second, quote the backquotes,
<pre>

  % typeset var="`echo two words`"

</pre>

<p>There are three important types we haven't talked about; all of these can
only be created with <code>typeset</code> or one of the similar builtins I'll list
in a moment.  They are integer types, floating point types, and associative
array types.
<p><p><strong>Numeric parameters</strong><br><br>
    
<p>Integers are created with `<code>typeset -i</code>', or `<code>integer</code>' which is
another way of saying the same thing.  They are used for arithmetic, which
the shell can do as follows:
<pre>

  integer i
  (( i = 3 * 2 + 1 ))

</pre>

The double parentheses surround a complete arithmetic expression:  it
behaves as if it's quoted.  The expression inside can be pretty much
anything you might be used to from arithmetic in other programming
languages.  One important point to note is that parameters don't need to
have the <code>$</code> in front, even when their value is being taken:
<pre>

  integer i j=12
  (( i = 3 * ( j + 4 ) ** 2 ))

</pre>

Here, <code>j</code> will be replaced by 12 and <code>$i</code> gets the value 768 (sixteen
squared times three).  One thing you might not recognise is the <code>**</code>,
which is the `to the power of' operator which occurs in FORTRAN and Perl.
Note that it's fine to have parentheses inside the double parentheses ---
indeed, you can even do
<pre>

  (( i = (3 * ( j + 4 )) ** 2 ))

</pre>

and the shell won't get confused because it knows that any parentheses
inside must be in balanced pairs (until you deliberately confuse it with
your buggy code).
<p>You would normally use `<code>print $i</code>' to see what value had been given to
<code>$i</code>, of course, and as you would expect it gets printed out as a decimal
number.  However, <code>typeset</code> allows you to specify another base for
printing out.  If you do
<pre>

  typeset -i 16 i
  print $i

</pre>

after the last calculation, you should see <code>16#900</code>, which means 900 in
base 16 (hexadecimal).  That's the only effect the option `<code>-i 16</code>' has
on <code>$i</code> --- you can assign to it and use it in arithmetical expressions
just as normal, but when you print it out it appears in this form.  You can
use this base notation for inputting numbers, too:
<pre>

  (( i = 16#ff * 2#10 ))

</pre>

which means 255 (<code>ff</code> in hexadecimal) times 2 (<code>10</code> in binary).
The shell understands C notation too, so `<code>16#ff</code>' could have been
expressed `<code>0xff</code>'.
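<p>To see the bases at work, in a clean zsh:

```shell
# Base-prefixed and C-style constants on input; typeset -i with a
# base changes how the value is displayed.
out=$(zsh -f -c '
  print $(( 16#ff ))    # hexadecimal input
  print $(( 0xff ))     # C notation for the same number
  typeset -i 2 i
  (( i = 5 ))
  print $i              # displayed in binary
')
echo "$out"
```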
<p>Floating point variables are very similar.  You can declare them with
`<code>typeset -F</code>' or `<code>typeset -E</code>'.  The only difference between the two
is, again, on output; <code>-F</code> uses a fixed point notation, while <code>-E</code> uses
scientific (mnemonic: exponential) notation.  The builtin `<code>float</code>' is
equivalent to `<code>typeset -E</code>' (because Korn shell does it, that's why).
Floating point expressions also work the way you are probably used to:
<pre>

  typeset -E e
  typeset -F f
  (( e = 32/3, f = 32.0/3.0 ))
  print $e $f

</pre>

prints
<pre>

  1.000000000e+01 10.6666666667

</pre>

Various points:  the `<code>,</code>' can separate different expressions, just like in
C, so the <code>e</code> and <code>f</code> assignments are performed separately.  The <code>e</code>
assignment was actually an integer division, because neither 32 nor 3 is a
floating point number (a floating point constant must contain a dot).  That means an integer
division was done, producing 10, which was then converted to a floating
point number only at the end.  Again, this is just how grown-up languages
work, so it's no use cursing.  The <code>f</code> assignment was a full floating
point performance.  Floating point parameters weren't available before
version <code>3.1.7</code>.
<p>Although this is really a matter for a later chapter, there is a library of
floating point functions you can load (actually it's just a way of linking
in the system mathematical library).  The usual incantation is `<code>zmodload
zsh/mathfunc</code>'; you may not have `dynamic loading' of libraries on your
system, which may mean that doesn't work.  If it does, you can do things
like
<pre>

  (( pi = 4.0 * atan(1.0) ))

</pre>

Broadly, all the functions which appear in most system mathematical
libraries (see the manual page for <code>math</code>) are available in zsh.
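<p>A self-contained check, in a clean zsh; if the module isn't available on
your system the sketch just says so:

```shell
# The maths library gives you atan() and friends inside (( ... )).
out=$(zsh -f -c '
  zmodload zsh/mathfunc 2>/dev/null || { print no-mathfunc; exit 0; }
  printf "%.4f\n" $(( 4.0 * atan(1.0) ))
')
echo "$out"
```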
<p>Like all other parameters created with <code>typeset</code> or one of its cousins,
integer and floating point parameters are local to functions.  You may
wonder how to create a global parameter (i.e. one which is valid outside as
well as inside the function) which has an integer or floating point value.
There's a recent addition to the shell (in version 3.1.6) which allows
this: use the flag <code>-g</code> to typeset along with any others.  For example,
<pre>

  fn() {
    typeset -Fg f
    (( f = 42.75 ))
  }
  fn
  print $f

</pre>

If you try it, you will see the value of <code>$f</code> has survived beyond the
function.  The <code>g</code> stands for global, obviously, although it's not quite
that simple:
<pre>

  fn() {
    typeset -Fg f
  }
  outerfn() {
    typeset f='scalar value'
    fn
    print $f
  }
  outerfn

</pre>

The function <code>outerfn</code> creates a local scalar value for <code>f</code>; that's
what <code>fn</code> sees.  So <code>fn</code> was not really operating on a `global' value; it
just didn't create a new one for its own scope.  If you run the example, you
will see an error message: <code>fn</code> tried to preserve the value of <code>$f</code>
while changing its type, and the value wasn't a proper floating point
expression.  The error message,
<pre>

  fn: bad math expression: operator expected at `value'

</pre>

comes about because assigning to numeric parameters always does an
arithmetic evaluation.  Operating on `<code>scalar value</code>' it found
`<code>scalar</code>' and assumed this was a parameter, then looked for an operator
like `<code>+</code>' to come next; instead it found `<code>value</code>'.  If you want to
experiment, change the string to `<code>scalar + value</code>' and set
`<code>value=42</code>', or whatever, then try again. This is a little confusing
(which is a roundabout way of saying it confused me), but consistent with
how zsh usually treats parameters.
<p>Actually, to a certain extent you don't need to use the integer and
floating point parameters.  Any time zsh needs a numeric expression
it will force a scalar to the right value, and any time it produces a
numeric expression and assigns it to a scalar, it will convert the result
to a string.  So
<pre>

  typeset num=3            # This is the *string* `3'.
  (( num = num + 1 ))      # But this works anyway
                           # ($num is still a string).

</pre>

This can be useful if you have a parameter which is sometimes a number,
sometimes a string, since zsh does all the conversion work for you.
However, it can also be confusing if you always want a number, because zsh
can't guess that for you; plus it's a little more efficient not to have to
convert back and forth; plus you lose accuracy when you do, because if the
number is stored as a string rather than in the internal numeric
representation, what you say is what you get (although zsh tends to give
you quite a lot of decimal places when converting implicitly to strings).
Anyway, I'd recommend that if you know a parameter has to be an integer or
floating point value you should declare it as such.
<p>There is a builtin called <code>let</code> to handle mathematical expressions, but
since
<pre>

  let "num = num + 1"

</pre>

is equivalent to
<pre>

  (( num = num + 1 ))

</pre>

and the second form is easier and more memorable, you probably won't need
to use it.  If you do, remember that (unlike BASIC) each mathematical
expression should appear as one argument in quotes.
<p><p><strong>Associative arrays</strong><br><br>
    
<p>The one remaining major type of parameter is the associative array; if
you use Perl, you may call it a `hash', but we tend not to since that's
really a description of how it's implemented rather than what it does.
(All right, what it does is hash things.  Now shut up.)
<p>These have to be declared by a typeset statement --- there's no getting
round it.  There are some quite eclectic builtins that
produce a filled-in associative array for you, but the only way to tell zsh
you want your very own associative array is
<pre>

  typeset -A assoc

</pre>

to create <code>$assoc</code>.  As to what it does, that's best shown by example:
<pre>

  typeset -A assoc
  assoc=(one eins two zwei three drei)
  print ${assoc[two]}

</pre>

which prints `<code>zwei</code>'.  So it works a bit like an ordinary array, but the
numeric <em>subscript</em> of an ordinary array which would have appeared inside
the square bracket is replaced by the string <em>key</em>, in this case <code>two</code>.
The array assignment was a bit deceptive; the `values' were actually
pairs, with `<code>one</code>' being the key for the value `<code>eins</code>', and so on.  The
shell will complain if there are an odd number of elements in such a list.
This may also be familiar from Perl.  You can assign values one at a time:
<pre>

  assoc[four]=vier

</pre>

and also unset one key/value pair:
<pre>

  unset 'assoc[one]'

</pre>

where the quotes stop the square brackets from being interpreted as a pattern
on the command line.
<p>Expansion has been held over, but you might like to know about the ways of
getting back what you put in.  If you do
<pre>

  print $assoc

</pre>

you just see the values --- that's exactly the same as with an ordinary
array, where the subscripts 1, 2, 3, etc. aren't shown.  Note they are in
random order --- that's the other main difference from ordinary arrays;
associative arrays have no notion of an order unless you explicitly sort
them.
<p>But here the keys may be just as interesting.  So there is:
<pre>

  print ${(k)assoc}
  print ${(kv)assoc}

</pre>

giving (if you've followed through all the commands above):
<pre>

  four two three
  four vier two zwei three drei

</pre>

which print out the keys instead of the values, and the key and value pairs
much as you entered them.  You can see that, although the order of the
pairs isn't obvious, it's the same each time.  From this example you can
work out how to copy an associative array into another one:
<pre>

  typeset -A newass
  newass=(${(kv)assoc})

</pre>

where the `<code>(kv)</code>' is important --- as is the <code>typeset</code> just before the
assignment, otherwise <code>$newass</code> would be a badass ordinary array.  You
can also prove that <code>${(v)assoc}</code> does what you would probably expect.
There are lots of other tricks, but they are mostly associated with clever
types of parameter expansion, to be described in a later chapter.
<p><p><strong>Other typeset and type tricks</strong><br><br>
    
<p>There are variants of <code>typeset</code>, some mentioned sporadically above.
There is nothing you can do with any of them that you can't do with
<code>typeset</code> --- that wasn't always the case; we've tried to improve the
orthogonality of the options.  They differ in the options which are set by
default, and the additional options which are allowed.  Here's a list:
<code>declare</code>, <code>export</code>, <code>float</code>, <code>integer</code>, <code>local</code>, <code>readonly</code>.
I won't confuse you by describing all in detail; see the manual.
<p>If there is an odd one out, it's <code>export</code>, which not only marks a
parameter for export but has the <code>-g</code> flag turned on by default, so that
that parameter is not local to the function; in other words, it's
equivalent to <code>typeset -gx</code>.  However, one holdover from the days when
the options weren't quite so logical is that <code>typeset -x</code> behaves like
<code>export</code>, in other words the <code>-g</code> flag is turned on by default.  You
can fix this by unsetting the option <code>GLOBAL_EXPORT</code> --- the option only
exists for compatibility; logically it should always be unset.  This is
partly because in the old days you couldn't export local parameters, so
<code>typeset -x</code> either had to turn on <code>-g</code> or turn off <code>-x</code>; that was
fixed for the 3.1.9 release, and (for example) `<code>local -x</code>' creates a
local parameter which is exported to the environment; both the parameter
itself, and the value in the environment, will be restored when the
function exits.  The builtin <code>local</code> is essentially a form of <code>typeset</code>
which renounces the <code>-g</code> flag and all its works.
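<p>Here's a sketch of `<code>local -x</code>' in action (the parameter name
is invented for illustration; recent versions of other shells accept this
form too):

```shell
fn() {
  local -x GREETING='hello from fn'   # local, but exported while fn runs
  sh -c 'echo "$GREETING"'            # so a child process can see it
}
fn                         # prints: hello from fn
echo "${GREETING-unset}"   # prints: unset -- both the parameter and the
                           # environment entry vanish when fn exits
```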
<p>Another old restriction which has gone is that you couldn't make special
parameters, in particular <code>$PATH</code>, local to a function; you just modified
the original parameter.  Now if you say `<code>typeset PATH</code>', things happen
the way you probably expect, with <code>$PATH</code> having its usual effect, and
being restored to its old value when the function exits.  Since <code>$PATH</code>
is still special, though, you should make sure you assign something to it
in the function before calling external commands, else it will be empty and
no commands will be found.  It's possible that you specifically don't want
some parameter you make local to have the special property; 3.1.7 and after
allow the typeset flag <code>-h</code> to hide the specialness for that parameter,
so in `<code>typeset -h PATH</code>', <code>PATH</code> would be an ordinary variable for the
duration of the enclosing function.  Internally, the same value as was
previously set would continue to be used for finding commands, but it
wouldn't be exported.
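<p>A sketch of the local <code>$PATH</code> behaviour (the directories
assigned are just a plausible default):

```shell
restrict() {
  # Assign immediately: a freshly localised PATH starts out empty, and
  # with an empty PATH no external commands would be found.
  typeset PATH=/usr/bin:/bin
  echo "inside: $PATH"
}
restrict
echo "outside: $PATH"    # the original value is restored on return
```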
<p>The second main use of <code>typeset</code> is to set attributes for the parameters.
In this case it can operate on an existing parameter, as well as creating a
new one.  For example,
<pre>

  typeset -r msg='This is an important message.'

</pre>

sets the readonly flag (-r) for the parameter <code>msg</code>.  If the parameter
didn't exist, it would be created with the usual scoping rules; but if it
did exist at the current level of scoping, it would be made readonly with
the value assigned to it, meaning you can't set that particular copy of
the parameter.  For obvious reasons, it's normal to assign a value to a
readonly parameter when you first declare it.  Here's a reality check on
how this affects scoping:
<pre>

   msg='This is an ordinary parameter'
   fn() {
     typeset msg='This is a local ordinary parameter'
     print $msg
     typeset -r msg='This is a local readonly parameter'
     print $msg
     msg='Watch me cause an error.'
   }
   fn
   print $msg
   msg='This version of the parameter'\
' can still be overwritten'
   print $msg

</pre>

outputs
<pre>

  This is a local ordinary parameter
  This is a local readonly parameter
  fn:5: read-only variable: msg
  This is an ordinary parameter
  This version of the parameter can still be overwritten

</pre>

Unfortunately there was a bug with this code until recently --- thirty
seconds ago, actually:  the second <code>typeset</code> in <code>fn</code> incorrectly added
the readonly flag to the existing <code>msg</code> <em>before</em> attempting to set the
new value, which was wrong and inconsistent with what happens if you create
a new local parameter.  Maybe it's reassuring that the shell can get
confused about local parameters, too.  (I don't find it reassuring in the
slightest, since <code>typeset</code> is one of the parts of the code where I tend
to fix the bugs, but maybe you do.)
<p>Anyway, when the bug is fixed, you should get the output shown, because the
first typeset created a local variable which the second typeset made
readonly, so that the final assignment caused an error.  Then the <code>$msg</code>
in the function went out of scope, and the ordinary parameter, with no
readonly restriction, was visible again.
<p>I mentioned another special typeset option in the previous chapter:
<pre>

  typeset -T TEXINPUTS texinputs

</pre>

to tie together the scalar <code>$TEXINPUTS</code> and the array <code>$texinputs</code> in
the same way that <code>$PATH</code> and <code>$path</code> work.  This is a one-off; it's
the only time <code>typeset</code> takes exactly two parameter names on the command
line.  All other uses of typeset take a list of parameters to which any
flags given are applied.  See the manual for the remaining flags, although
most of the more interesting ones have been discussed.
<p>The other thing you need to know about flags is that you use them with a
`<code>+</code>' sign to turn off the corresponding attribute.  So
<pre>

  typeset +r msg

</pre>

allows you to set <code>$msg</code> again.  From version <code>4.1</code>, you won't be able
to turn off the readonly attribute for a special parameter; that's because
there's too much scope for confusion, including attempting to set constant
strings in the code.  For example, `<code>$ZSH_VERSION</code>' always prints a fixed
string; attempting to change that is futile.
<p>The final use of typeset is to list parameters.  If you type `<code>typeset</code>'
on its own, you get a complete list of parameters and their values.  From
3.1.7, you can turn on the flag <code>-H</code> for a parameter, which means to hide
its value while you're doing this.  This can be useful for some of the more
enormous parameters, particularly special parameters which I'll talk about
in the chapter on modules, which tend to swamp the display <code>typeset</code>
produces.
<p>You can also list parameters of a particular type, by listing the flags you
want to know about.  For example,
<pre>

  typeset -r

</pre>

lists all readonly parameters.  You might expect `<code>typeset +r</code>' to list
parameters which <em>don't</em> have that attribute, but actually it lists the
same parameters but without showing their value.  `<code>typeset +</code>' lists all
parameters in this way.
<p>Another good way of finding out about parameters is to use the special
expansion `<code>${(t)</code>param<code>}</code>', for example
<pre>

  print ${(t)PATH}

</pre>

prints `<code>scalar-export-special</code>':  <code>$PATH</code> is a scalar parameter, with
the <code>-x</code> flag set, and has a special meaning to the shell.  Actually,
`<code>special</code>' means something a bit more than that:  it means the internal
code to get and set the parameter behaves in a way which has side effects,
either to the parameter itself or elsewhere in the shell.  There are
other parameters, like <code>$HISTFILE</code>, which are used by the shell, but
which are get and set in a normal way --- they are only special in that the
value is looked at by the shell; and, after all, any old shell function can
do that, too.  Contrast this with <code>$PATH</code> which has all that
paraphernalia to do with hashing commands to take care of when it's set,
as I discussed above, and I hope you'll see the difference.
<p><p><strong>Reading into parameters</strong><br><br>
    
<p>The `<code>read</code>' builtin, as its name suggests, is the opposite to
`<code>print</code>' (there's no `<code>write</code>' command in the shell, though there is
often an external command of that name to send a message to another user),
but reading, unlike printing, requires something in the shell to change to
take the value, so unlike <code>print</code>, <code>read</code> is forced to be a builtin.
Inevitably, the values are read into a parameter.  Normally they are taken
from standard input, very often the terminal (even if you're running a
script, unless you redirected the input).  So the simplest case is just
<pre>

  read param

</pre>

and if you type a line, and hit return, it will be put into <code>$param</code>,
without the final newline.
<p>The <code>read</code> builtin actually does a bit of processing on the input.  It
will usually strip any initial or final whitespace (spaces or tabs) from
the line read in, though any in the middle are kept.  You can read a set of
values separated by whitespace just by listing the parameters to assign
them to; the last parameter gets all the remainder of the line without it
being split.  Very often it's easiest just to read into an array:
<pre>

  % read -A array
        this is a line typed in now, \
      by me,    in this   space
  % print ${array[1]} ${array[12]}
  this space

</pre>

(I'm assuming you're using the native zsh array format, rather than the one
set with <code>KSH_ARRAYS</code>, and shall continue to assume this.)
<p>It's useful to be able to print a prompt when you want to read something.
You can do this with `<code>print -n</code>', but there's a shorthand:
<pre>

  % read line'?Please enter a line: '
  Please enter a line: some words
  % print $line
  some words

</pre>

Note the quotes surround the `<code>?</code>' to prevent it being taken as part of a
pattern on the command line.  You can quote the whole expression from the
beginning of `<code>line</code>', if you like; I just write it like that because I
know parameter names don't need quoting, because they can't have funny
characters in.  It's almost logical.
<p>Another useful trick with <code>read</code> is to read a single character; the
`<code>-k</code>' option does this, and in fact you can stick a number immediately
after the `<code>k</code>' which specifies a number to read.  Even easier, the
`<code>-q</code>' option reads a single character and returns status 0 if it was
<code>y</code> or <code>Y</code>, and status 1 otherwise; thus you can read the answer to
yes/no questions without using a parameter at all.  Note, however, that if
you don't supply a parameter, the reply gets assigned in any case to
<code>$REPLY</code> if it's a scalar --- as it is with <code>-q</code> --- or <code>$reply</code> if
it's an array --- i.e. if you specify <code>-A</code>, but no parameter name.  These
are more examples of the non-special parameters which the shell uses --- it
sets <code>$REPLY</code> or <code>$reply</code>, but only in the same way you would set
them; there are no side-effects.
<p>Like <code>print</code>, <code>read</code> has a <code>-r</code> flag for raw mode.  However, this
just has one effect for <code>read</code>:  without it, a <code>\</code> at the end of the
line specifies that the next line is a continuation of the current one (you
can do this when you're typing at the terminal).  With it, <code>\</code> is not
treated specially.
<p>Finally, a more sophisticated note about word-splitting.  I said that, when
you are reading to many parameters or an array, the word is split on
whitespace.  In fact the shell splits words on any of the characters found
in the (genuinely special, because it affects the shell's guts) parameter
<code>$IFS</code>, which stands for `input field separator'.  By default --- and in
the vast majority of uses --- it contains space, tab, newline and a null
character (character zero:  if you know that these are usually used to mark
the end of strings, you might be surprised the shell handles these as
ordinary characters, but it does, although printing them out usually
doesn't show anything).  However, you can set it to any string: enter
<pre>

  fn() {
    local IFS=:
    read -A array
    print -l $array
  }
  fn

</pre>

and type
<pre>

one word:two words:three words:four

</pre>

The shell will show you what's in the array it's read, one `word' per
line:
<pre>

one word
two words
three words
four

</pre>

You'll see the bananas, er, words (joke for the over-thirties) have been
treated as separated by a colon, not by whitespace.  Making <code>$IFS</code> local
didn't work in old versions of zsh, as with other specials; you had to save
it and restore it.
<p>The <code>read</code> command in zsh doesn't let you do line editing, which some
shells do.  For that, you should use the <code>vared</code> command, which runs the
line editor to edit a parameter, with the <code>-c</code> option, which allows
<code>vared</code> to create a new parameter.  It also takes the option <code>-p</code> to
specify a prompt, so one of the examples above can be rewritten
<pre>

  vared -c -p 'Please enter a line: ' line

</pre>

which works rather like read but with full editing support.  If you give
the option <code>-h</code> (history), you can even retrieve values from previous
command lines.  It doesn't have all the formatting options of read,
however, although when reading an array (use the option <code>-a</code> with <code>-c</code>
if creating a new array) it will perform splitting.
<p><p><strong>Other builtins to control parameters</strong><br><br>
    
<p>The remaining builtins which handle parameters can be dealt with more
swiftly.
<p>The builtin <code>set</code> simply sets the special parameter which is passed as an
argument to functions or scripts, and which you access as <code>$*</code> or <code>$@</code>,
or <code>$&lt;number&gt;</code> (Bourne-like format), or via <code>$argv</code> (csh-like format),
known however you set them as the `positional parameters':
<pre>

  % set a whole load of words
  % print $1
  a
  % print $*
  a whole load of words
  % print $argv[2,-2]
  whole load of

</pre>

It's exactly as if you were in a function and had called the function with
the arguments `<code>a whole load of words</code>'.  Actually, set can also be used
to set shell options, either as flags, e.g. `<code>set -x</code>', or as words after
`<code>-o</code>', e.g. `<code>set -o xtrace</code>' does the same as the previous example.
It's generally easier to use <code>setopt</code>, and the upshot is that you need to
be careful when setting arguments this way in case they begin with a
`<code>-</code>'.  Putting `<code>-</code><code>-</code>' before the real arguments fixes this.
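<p>Here's the `<code>-</code><code>-</code>' trick in action (wrapped in a
made-up function so the positional parameters stay local; this is portable
to other Bourne-style shells):

```shell
setargs() {
  set -- -n -x not options     # `--' says: no more option flags follow
  printf '%s\n' "$1"           # prints: -n  (stored, not interpreted)
}
setargs
```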
<p>One other use of <code>set</code> is to set any array, via
<pre>

  set -A any_array words to assign to any_array

</pre>

which is equivalent to (and the standard Korn shell version of)
<pre>

  any_array=(words to assign to any_array)

</pre>

One case where the <code>set</code> version is more useful is if the name of an
array itself comes from a parameter:
<pre>

  arrname=myarray
  set -A $arrname words to assign

</pre>

has no easy equivalent in the other form; the left hand side of an
ordinary assignment won't expand a parameter:
<pre>

  # Doesn't work; syntax error
  $arrname=(words to assign)

</pre>

This worked in old versions of zsh, but that was on the non-standard side.
The <code>eval</code> command, described below, gives another way around this.
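<p>As a taster, here's a sketch of that <code>eval</code> workaround (the
names are made up):

```shell
arrname=myarray
# The assignment is built up as a string, then evaluated, so $arrname
# is expanded before the assignment itself is parsed.
eval "$arrname=(words to assign)"
echo ${#myarray[@]}    # 3 -- the array now has three elements
```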
<p>Next comes `<code>shift</code>', which simply moves an array up one element,
deleting the original first one.  Without an array name, it operates on the
positional parameters.  You can also give it a number to shift other than
one, before the array name.
<pre>

  shift array

</pre>

is equivalent to
<pre>

  array=(${array[2,-1]})

</pre>

(almost --- I'll leave the subtleties here for the chapter on expansion)
which picks the second to last elements of the array and assigns them back
to the original array.  Note, yet again, that <code>shift</code> operates using the
<em>name</em>, not the <em>value</em> of the array, so no `<code>$</code>' should appear in
front, otherwise you get something similar to the trick I showed for `<code>set
-A</code>'.
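<p>Here's <code>shift</code> operating on the positional parameters inside
a function (the function name is invented):

```shell
fn() {
  shift 2          # discard the first two arguments
  echo "$1"        # what was $3 is now $1
}
fn a whole load of words    # prints: load
```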
<p>Finally, <code>unset</code> unsets a parameter, and I already showed you could unset
a key/value pair of an associative array.  There is one subtlety to be
mentioned here.  Normally, <code>unset</code> just makes the parameter named
disappear off the face of the earth.  However, if you call <code>unset</code> in a
function, its ghost lives on in the sense that any parameter you create with
the same name will be scoped as the original parameter was.  Hence:
<pre>

  var='global value'
  fn() {
    typeset var='local value'
    unset var
    var='what about this?'
  }
  fn
  print $var

</pre>

The final statement prints `<code>global value</code>':  even though the local
copy of <code>$var</code> was unset, the shell remembers that it was local, so
the second <code>$var</code> in the function is also local and its value
disappears at the end of the function.
<p><a name="l38"></a>
<h3>3.2.7: History control commands</h3>
<p>The easiest way to access the shell's command history is by editing it
directly.  The second easiest way is to use the `<code>!</code>'-history mechanism.
Other ways of manipulating it are based around the <code>fc</code> builtin, which
probably once stood for something.  I talked quite a bit about it in the
last chapter, and don't really have anything to add.  Just note that
the two other commands based around it are <code>history</code> and <code>r</code>.
<p><a name="l39"></a>
<h3>3.2.8: Job control and process control</h3>
<p>One of the major contributions of the C-shell was job control.  You need
to know about foreground and background tasks, and again I introduced
these in the last chapter along with the options that control them.
Here is an introduction to the relevant builtins.
<p>You start a background job in two ways.  First, directly, by putting an
`<code>&amp;</code>' after it:
<pre>

  sleep 10 &amp;

</pre>

and secondly by starting it in the normal way (i.e. in the foreground),
then typing <code>^Z</code>, and using the <code>bg</code> command to put it in the
background.  Between typing <code>^Z</code> and <code>bg</code>, the job is still there, but
is not running; it is `suspended' or `stopped' (systems use different
descriptions for the same thing), waiting for you to decide what to do
with it.  In either case, the job then continues without the shell waiting
for it.  It will still try and read from or write to the terminal if
that's how you started it; you need to use the shell's redirection
facilities right at the start if you want to change that, there's
nothing you can do after the job has already started.
<p>A job will stop if it needs to read from the terminal.  You see a
message like:
<pre>

  [1]  + 1348 suspended (tty input)  jobname and arguments

</pre>

which means the job is suspended very much like you had just typed
<code>^Z</code>.  You need to bring the job into the foreground, as described
below, so that you can type something to it.
<p>By the way, the key to type to suspend a command may not be <code>^Z</code>; it
usually is, but that can be changed.  Run `<code>stty -a</code>' and look for what
is listed after `<code>susp =</code>' --- probably, but not necessarily, <code>^Z</code>.  So
if you want to use another character --- it must be a single character;
this is handled deep in the terminal interface, not in the shell --- you
can run
<pre>

  stty susp '^]'

</pre>

or whatever.  You will note from the <code>stty</code> output that various other job
control characters can be changed similarly.  The <code>stty</code> command is
external and its format for both output and input can vary quite a bit
from system to system.
<p>Instead of putting the command into the background, you can bring it back
to the foreground again with <code>fg</code>.  This is useful for temporarily
stopping what you are doing so you can do something else.  These days you
would probably do it in another window; in the old days when people
logged in from simple terminals this was even more useful.  A typical
example of this is
<pre>

  more file                        # look at file
  ^Z                               # suspend
  [1] + 8592 suspended  more file  # message printed
  ...                              # do something else
  fg %1                            # resume the `more'

</pre>

The `<code>%</code>' is the usual way of referring to jobs.  The number after it is
what appeared in square brackets with the suspended message; I don't know
why the shell doesn't use the `<code>%</code>' notation there, too.  You also see
that with the `continued' message when you put something into the
background, and again at the end with the `done' message which tells you
a background job is finished.  The `<code>%</code>' can take other forms; the most
common is to follow it by the name of a command, such as `<code>%more</code>' in
this case.  The forms <code>%+</code> and <code>%-</code> refer to the most recent and second
most recent jobs --- the `<code>+</code>' in the `suspended' message is telling you
that the <code>more</code> job could be referred to like that.
<p>Most of the job control commands will actually assume you are talking
about `<code>%+</code>' if you don't give an argument, so assuming I hadn't
started any other commands in the background, I could just have put
`<code>fg</code>' at the end of the sequence of commands above.  This actually
cuts both ways: <code>fg</code> is the default operation on jobs referred to with
the `<code>%</code>' notation, so just typing `<code>%1</code>' with no command name would
have worked, too.
<p>You can jog your memory about what's going on with the `<code>jobs</code>'
command.  It looks like a series of messages of the form beginning with the
number in square brackets; usually the jobs will either be `running' or
`suspended'.  This will tell you the numbers you need.
<p>One other useful thing you can do with a job is to tell the shell to forget
about it.  This is only really useful if it is already running in the
background; then you can run `<code>disown</code>' with the job identifier.  It's
useful for jobs you want to continue after you've logged out, as well as
jobs that have their own windows which you can therefore control directly.
With disowned jobs, the shell doesn't warn you that they are still there
when you log out.  You can actually disown a background job when you start
it by putting `<code>&amp;|</code>' or `<code>&amp;!</code>' at the end of the line instead of simply
`<code>&amp;</code>'.  Note that if the job was suspended when you disowned it, it
will stay disowned; this is pretty pointless, so you probably should run
`<code>bg</code>' on it first.
<p>The next most likely thing you want to do with a job is kill it, or maybe
suspend it when it's already in the background and you can't just type
<code>^Z</code>.  This is where the <code>kill</code> builtin comes in.  There's more
to this than there is to the builtins mentioned above.  First, you can use
<code>kill</code> with other processes that weren't started from the current shell.
In that case, you would use a number to identify it, with no <code>%</code> ---
that's why the <code>%</code>'s were there in the other cases.  Of course, you need
to find out the number; the usual way is with the <code>ps</code> command, which is
not a builtin but which appears on all UNIX-like systems.  As a stupid
example, here I start a disowned process which does very little, look for
it, then kill it:
<pre>

  % sleep 60 &amp;|
  % ps -f
  UID        PID  PPID  C STIME TTY          TIME CMD
  pws        623   614  0 22:12 pts/0    00:00:00 zsh
  pws       8613   623  0 23:12 pts/0    00:00:00 sleep 60
  pws       8615   623  0 23:12 pts/0    00:00:00 ps -f
  % kill 8613
  % ps -f
  UID        PID  PPID  C STIME TTY          TIME CMD
  pws        623   614  0 22:12 pts/0    00:00:00 zsh
  pws       8616   623  0 23:12 pts/0    00:00:00 ps -f

</pre>

The process has disappeared the second time I look.  Notice that in the
usual lugubrious UNIX way the shell didn't bother to tell you the process
had been killed; however, it will report an error if it failed to
send it the signal.  Sending the signal is all the shell cares about;
the shell won't warn you if the process decided it didn't want to die
when told to, so it's still a good idea to check.
<p>Sometimes you want to wait for a process to exit; the <code>wait</code> builtin can
do this, and like <code>kill</code> can take a process number as well as a job
number.  However, that's a bit deceptive --- you can't actually wait for a
process which wasn't started directly from the shell.  Indeed, the
mechanism for waiting is all bound up with the way UNIX handles processes;
unless its parent waits for it, a process becomes a `zombie' and hangs
around until the system's foster parent, the `init' process (always process
number 1) waits for it instead.  It's all a little bit baroque, but for the
shell user, wait just means you can hang on until something you started has
finished.  Indeed, that's how foreground processes work: the shell in
effect uses the internal version of <code>wait</code> to hang around until the job
exits.  (Well, actually that's a lie; the system wakes it up from whatever
it's doing to tell it a child has finished, so all it has to do is doze off
to wait.)
<p>Furthermore, you can wait for a process even if job control isn't running.
Job control, basically anything involving those <code>%</code>'s, is only useful
when you are sitting at a terminal fiddling with commands; it doesn't
operate when you run scripts, say.  Then the shell has much less freedom in
how to control its jobs, but it can still wait for a background process,
and it can still use <code>kill</code> on a process if it knows its number.  For
this purpose, the shell stores the ID of the last process started in the
background in the parameter <code>$!</code>; there's probably a good reason for
the `<code>!</code>', but I don't know what it is.  This happens regardless of job
control.
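<p>To see this outside job control, here's a sketch that works in a
script; the shell collects the status with <code>wait</code>, and a process
that dies from a signal reports 128 plus the signal number:

```shell
sleep 60 &      # start a background job
pid=$!          # $! records its process ID
kill $pid       # send it the default signal, SIGTERM
wait $pid       # collect its exit status
status=$?
echo "background job exited with status $status"
```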
<p><p><strong>Signals</strong><br><br>
    
<p>The <code>kill</code> command can do a good deal more than just kill a process.
That is the default action, which is why the command has that name.  But
what it's really doing is sending a `signal' to a process.  Signals are
the simplest way of communicating to another process; in fact, they are
about the only simple way if you haven't made special arrangements for
the process to read messages from you.  Signal names are written like
<code>SIGINT</code>, <code>SIGTSTP</code>, <code>SIGKILL</code>; to send a particular signal to a
process, you remove the <code>SIG</code>, stick a hyphen in front, and use that
as the first argument to <code>kill</code>, e.g.:
<pre>

  kill -KILL 8613

</pre>

Some of the things you already know about are actually doing just that.
When you type <code>^C</code> to stop a process, you are actually sending it a
<code>SIGINT</code> for `interrupt', as if you had done
<pre>

  kill -INT 8613

</pre>

The usual signal sent by <code>kill</code> is not, as you might have guessed,
<code>SIGKILL</code>, but actually <code>SIGTERM</code> for `terminate'; <code>SIGKILL</code> is
stronger as the process can't block that signal, as it can with many (we'll
see how the shell can do that in a moment).  It's familiar to UNIX hackers
as `<code>kill -9</code>', because all the signals also have numbers.  You can see
the list of signals in zsh by doing:
<pre>

  % print $signals
  EXIT HUP INT QUIT ILL TRAP ABRT BUS FPE KILL USR1
  SEGV USR2 PIPE ALRM TERM STKFLT CLD CONT STOP TSTP
  TTIN TTOU URG XCPU XFSZ VTALRM PROF WINCH POLL PWR
  UNUSED ZERR DEBUG

</pre>

Your list will probably be different from mine; this is for Linux, and the
list is very system-specific, even though the first nine are generally the
same, and many of the others are virtually always present.  Actually,
<code>SIGEXIT</code> is an invention of the shell's, allowing it to do
something when a function exits (see the section on `traps' below); you
can't actually use `<code>kill -EXIT</code>'.  Thus <code>SIGHUP</code> is the first real
signal, and indeed that's number one, so you have to shift the contents
of <code>$signals</code> along one to get the right numbers.  <code>SIGTERM</code> and
<code>SIGINT</code> usually have the same effect, terminating the process, unless
the process has decided to handle the signal some other way.
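<p>Rather than counting along <code>$signals</code>, you can ask <code>kill</code>
itself to translate between names and numbers; this sketch assumes a shell
whose <code>kill</code> builtin accepts a name or a number after
<code>-l</code>, as zsh's and bash's do:

```shell
num=$(kill -l TERM)    # look up the number for a signal name
name=$(kill -l 2)      # or the name for a signal number
echo "SIGTERM is $num, signal 2 is SIG$name"
```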
<p>The last two signals are bogus, too:  <code>SIGZERR</code> is to allow the shell
to do something on an error (non-zero exit status), while with
<code>SIGDEBUG</code> you can do it on every command.  Again, the `something' to
be executed is a `trap', as I'll discuss in a short while.
<p>Typing <code>^Z</code> to suspend a process actually sends the process a <code>SIGTSTP</code>
(terminal stop, since it usually comes from the terminal), while
<code>SIGSTOP</code> is similar but usually doesn't come from a terminal.  Even
restarting a process as with <code>bg</code> sends it a signal, in this case
<code>SIGCONT</code>.  It seems a bit odd to signal a process to restart; why can't
the operating system just restart it when you ask?  The real answer is
probably that signals provide an easy way for you to talk to the operating
system without grovelling around in the dirt too much.
<p>Before I talk about how you make the shell handle signals it receives,
there is one extra oddment: the <code>suspend</code> builtin effectively sends the
shell a signal to suspend it, as if you'd typed <code>^Z</code>, though as you've
probably found by now that doesn't suspend the shell itself.  It's only
useful to do this if the shell is running under some other programme, else
there's no way of restoring it and suspending is effectively the same as
exiting the shell.  For this reason, the shell won't let you call
<code>suspend</code> in a login shell, because it assumes it is running at the top
level (though in the previous chapter you learnt there's actually nothing
that special about login shells; you can start one just with `<code>zsh -l</code>').
If you're logged in remotely via <code>rsh</code> or <code>ssh</code>, it's usually more
convenient to use the keystrokes `<code>~^Z</code>' which those define, rather than
zsh's mechanism; they have to be at the beginning of a line, so hit return
first if necessary.  This returns you to your local terminal; you can
resume the remote login with `<code>fg</code>' just like any other programme.
<p><p><strong>Traps</strong><br><br>
    
<p>The way of making the shell handle signals is called `traps'.  There are
actually two mechanisms for this.  I'll present the more standard one and
then talk about the advantages and drawbacks of the other one at the end.
<p>The standard version (shared with other shells) is via the `<code>trap</code>'
builtin.  The first argument is a chunk of shell code to execute, which
obviously needs to be quoted when you pass it as an argument, and the
remaining arguments are a list of signals to handle, minus the <code>SIG</code>
prefix.  So:
<pre>

  trap "echo I\\'m trapped." INT

</pre>

tells the shell what to do on <code>SIGINT</code>, i.e. <code>^C</code>.  Note the extra
layer of quoting:  the double quotes surround the code, so that when they
are stripped <code>trap</code> sees the chunk
<pre>

  echo I\'m trapped

</pre>

Usually the shell would abort what it was doing and return to the main
prompt when you hit <code>^C</code>.  Now, however, it will simply print the message
and carry on.  You can try this, for example, with
<pre>

  read line

</pre>

If you hit <code>^C</code> while it's waiting for input, you'll see the message go
up, but the shell will still wait for you to type a line.
<p>A warning about this:  <code>^C</code> is only trapped within the shell itself.  If
you start up an external programme, it will have its own mechanism for
handling signals, and if it usually aborts on <code>^C</code> it still will.  But
there's a sting in the tail: do
<pre>

  cat

</pre>

which waits for input to output again (you need to use <code>^D</code> to exit
normally).  If you type <code>^C</code> here, the command will be aborted, as I said
--- but you still get the message `<code>I'm trapped</code>'.  That's because the
shell is able to tell that the command got that particular signal, and
calls the trap when the <code>cat</code> exits.  Not all shells do this;
furthermore, some commands which handle signals themselves won't give the
shell enough information to know that a signal arrived, and in that case
the trap won't be called.  Such commands are usually the more sophisticated
things like editors or screen managers or whatever; you just have to find
out by trial and error.
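<p>You don't have to hammer <code>^C</code> to watch a trap fire; the shell
can send the signal to itself, since <code>$$</code> holds its own process ID.
A sketch:

```shell
trap 'echo caught SIGINT' INT   # handle SIGINT instead of aborting
kill -INT $$                    # $$ is the shell's own PID
survived=yes                    # execution carries on after the trap returns
echo "still going: $survived"
```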
<p>You can also make the shell ignore the signal completely.  To do this, the
first argument should be an empty string:
<pre>

  trap '' INT

</pre>

Now <code>^C</code> will have no effect, and <em>this</em> time the effect <em>is</em> passed
on directly to commands called from the shell --- try the <code>cat</code> example
and you won't be able to interrupt it; type <code>^D</code> or use the lesser known
but more powerful <code>^\</code> (control with backslash), which sends
<code>SIGQUIT</code>.  If it hasn't been disabled, this will also produce a file
<code>core</code>, which contains debugging information about what the programme was
doing when it exited --- never call your own files <code>core</code>.  You can trap
<code>SIGQUIT</code> too, if you want.  (The shell itself usually ignores
<code>SIGQUIT</code>; it's only useful for external commands.)
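<p>The fact that ignoring a signal, unlike trapping one, is passed on to
the commands the shell runs can be seen without touching the terminal at
all, because a child process inherits the `ignore' disposition:

```shell
trap '' INT                                  # ignore SIGINT in this shell...
out=$(sh -c 'kill -INT $$; echo survived')   # ...and hence in the child
echo "$out"
trap - INT                                   # restore the default behaviour
```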
<p>Now the other sort of trap.  I could have written for the first example:
<pre>

  TRAPINT() {
    print I\'m trapped.
  }

</pre>

As you can see, this is just a function:  functions beginning <code>TRAP</code> are
special.  However, it's a real function too; you can call it by hand with
the command `TRAPINT', and it will run perfectly happily with no funny side
effects.
<p>There is a difference between the way the two types work.  In the
`<code>trap</code>' sort of trap, the code is evaluated just as if it appeared
as instructions to the shell at the point where the trap happened.  So if
you were in a function, you would see the environment of that function with
its local variables; if you set a local variable with <code>typeset</code>, it would
be visible in the function just as if it were created there.
<p>However, in the function type of trap, the code is provided with its own
function environment.  Now if you use <code>typeset</code> the parameter created is
local only to the trap.  In most cases, that's all the difference there is;
it's up to you to decide which is more convenient.  As you can see, the
function type of trap doesn't require the extra layer of quoting, so looks
a little smarter.  Conveniently, the `<code>trap</code>' command on its own lists
all traps in the form of the shell code you'd need to recreate them, and you
can see which sort is which.
<p>There are two cases where the difference sticks out.  One is that the
function type has some extra wiring to allow you both to trap a signal,
and pretend to anyone watching that the shell didn't handle it.  An example
will show this:
<pre>

  TRAPINT() {
    print "Signal caught, stopping anyway."
    return $(( 128 + $1 ))
  }

</pre>

That second line may look as rococo as the Amalienburg, but its meaning is
this: <code>$1</code>, the first argument to the function, is set to the number of
the signal.  In this case it will be 2 because that's the standard number
for <code>SIGINT</code>.  That means the arithmetic substitution <code>$((...))</code>
returns 130, the command `<code>return 130</code>' is executed, and the function
returns with status 130.  Returning with non-zero status is special in
function traps:  it tells the shell you want to abort the surrounding
command even though the trap was handled, and that you want the status
associated with that to be 130.  It so happens that this is how UNIX handles
returns from normal traps.  Without setting a trap, do
<pre>

  % cat
  ^C
  % print $?

</pre>

and you'll see that this, too, has given the status 130, 128 plus the value
of <code>SIGINT</code>.  So if you <em>do</em> have the trap set, you'll see the message,
but the command will abort --- even if it was running inside the shell.
<p>Try
<pre>

  % read line
  ^C

</pre>

to see that happening.  If you look at the status in <code>$?</code> you'll find
it's actually 1, not 130; that's because the <code>read</code> command, when it
aborted, overrode the return value from the trap.  But it does that with an
untrapped <code>^C</code>, too, so that's not really an exception to what I've just
said.
<p>If you've been paying attention, you'll realise that traps set with the
<code>trap</code> builtin can't do it in quite this way, because the function they
return from would be whatever function you were in.  You can see that:
<pre>

  trap 'echo Returning...; return;' INT
  fn() {
    print In fn...
    read param
    print Leaving fn..
  }

</pre>

If you run <code>fn</code> and hit <code>^C</code>, the signal is trapped and the message
printed, but because of the <code>return</code>, the shell quits <code>fn</code> immediately
and you don't see the final message.  If you missed out the `<code>return;</code>'
(try it), the shell would carry on with the rest of <code>fn</code> after you typed
something to <code>read</code>.  Of course you can use this mechanism to leave
functions after trapping a signal; it just so happens that in this case the
mechanism with <code>TRAPINT</code> is a little closer to what untrapped signals do
and hence a little neater.
<p>One final flourish of late Baroque splendour:  the trap for <code>SIGEXIT</code>, the
one called when a function (or the shell itself, in fact) exits is a bit
special because in the case of exiting a function it will be called in the
environment of the calling function.  So if you need to do something like
set a local variable for an enclosing function you can have
<pre>

  trap 'typeset param_in_enclosing_func=value' EXIT

</pre>

do it for you; you couldn't do that with <code>TRAPEXIT</code> because the code
would have its own function, so that even though it would be called after
the first function exited, it wouldn't run directly in the enclosing one
but in a separate <code>TRAPEXIT</code> function.  You can even set an EXIT trap
for the enclosing function by defining a nested `<code>trap .. EXIT</code>'
inside that trap itself.
<p>I lied, because there is one more special thing about <code>TRAPEXIT</code>: it's
always reset after you exit a function and the trap itself has been called.
Most traps just hang around until you explicitly unset them.  There is an
option, <code>LOCAL_TRAPS</code>, which makes traps set inside functions as well
insulated as possible from those outside, or inside deeper functions.  In
other words, the old trap is saved and then restored when you exit the
function; the scoping works pretty much like that for <code>typeset</code>, and in
the same way traps for the enclosing scope, apart from any for <code>EXIT</code>,
remain in effect inside a function unless you explicitly override them;
and, again in the same way, if you unset it inside the function it will
still be restored on exit.
<p><code>LOCAL_TRAPS</code> is the fixed behaviour of some other shells.  In zsh,
without the option set:
<pre>

  trap 'echo Hi.' INT
  fn() {
     trap 'echo Bye.' INT
  }

</pre>

Calling <code>fn</code> simply replaces the trap defined outside the function with
the one defined inside while:
<pre>

  trap 'echo Hi.' INT
  fn() {
     setopt localtraps
     trap 'echo Bye.' INT
  }

</pre>

puts the original `Hi' trap back after the function exits.
<p>I haven't told you how to unset a trap for good:  the answer is
<pre>

 trap - INT

</pre>

As you would guess, you can use <code>unfunction</code> with function-type traps;
that will correctly remove the trap as well as deleting the function.
However, `<code>trap -</code>' works with both, so that's the recommended way.
<p><p><strong>Limits on processes</strong><br><br>
    
<p>One other way that jobs started by the shell can be controlled is by using
limits.  These are actually limits set by the operating system, but the
shell gives you a way of controlling them: the <code>limit</code> and <code>unlimit</code>
commands.  Type `<code>limit</code>' on its own to see a summary.  I get:
<pre>

  cputime         unlimited
  filesize        unlimited
  datasize        unlimited
  stacksize       8MB
  coredumpsize    0kB
  memoryuse       unlimited
  maxproc         2048
  descriptors     1024
  memorylocked    unlimited
  addressspace    unlimited

</pre>

where the item on the left of each line is what is being limited, and on
the right is the value.  The manual page to look at, at least on Linux, is
for the function <code>getrlimit</code>; that (together with its partner
<code>setrlimit</code>) is what the shell is calling
when you run <code>limit</code> or <code>unlimit</code>.
<p>In this case, the items are:
<dl>
  <p></p><dt><strong><code>cputime</code></strong><dd> the total CPU time used by a process
  <p></p><dt><strong><code>filesize</code></strong><dd> maximum size of a file
  <p></p><dt><strong><code>datasize</code></strong><dd> the maximum size of data in use by a programme
  <p></p><dt><strong><code>stacksize</code></strong><dd> the maximum size of the stack, which is the area
    of memory used to store information during function calls
  <p></p><dt><strong><code>coredumpsize</code></strong><dd> the maximum size of a <code>core</code> file, which is
    an image of memory left by a programme that crashes, allowing you
    to debug it with <code>gdb</code>, <code>dbx</code>, <code>ddd</code> or some other debugger
  <p></p><dt><strong><code>memoryuse</code></strong><dd> the maximum main memory, i.e. programme memory which
    is in active use and hasn't been `swapped out' to disk
  <p></p><dt><strong><code>maxproc</code></strong><dd> the maximum number of simultaneous processes
  <p></p><dt><strong><code>descriptors</code></strong><dd> the maximum number of simultaneously open
    files (`descriptors' are the internal mechanism for referring to an
    open file on UNIX-like systems)
  <p></p><dt><strong><code>memorylocked</code></strong><dd> the maximum amount of memory locked in (I don't
    know what that is, either)
  <p></p><dt><strong><code>addressspace</code></strong><dd> the total amount of virtual memory,
    i.e. any memory whether it is main memory, or refers to somewhere on
    a disk, or indeed anything else.
</dl>
You may well see other names; the shell decides when it is compiled what
limits are supported by the system.
<p>Of those, the one I use most commonly is <code>coredumpsize</code>:  sometimes when
I'm debugging a programme I want a crash to produce a `<code>core</code>'
file so I can run <code>gdb</code> or <code>dbx</code> on it (`<code>unlimit coredumpsize</code>'),
while other times core files are just untidy (`<code>limit coredumpsize 0</code>').
Probably you would only alter any of the others if you knew there was a
problem, for example a number-crunching programme used so much memory that
the rest of the system was badly affected and you wanted to limit
<code>datasize</code> to 64 megabytes or whatever.  You could write this as:
<pre>

  limit datasize 64m

</pre>
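<p><code>limit</code> is zsh's own interface; zsh also provides the
<code>ulimit</code> builtin from the Bourne shell family, which manipulates
the same limits using option letters instead of names, which is handy for
scripts that might run under another shell.  A sketch of the
<code>coredumpsize</code> case:

```shell
ulimit -S -c 0            # soft limit: no core dumps (limit coredumpsize 0)
coresize=$(ulimit -S -c)
echo "core dump limit: $coresize"
```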

<p>There is a distinction made between `hard' and `soft' limits.  Both have
the same effect on programmes, but you can remove or reduce `soft' limits,
while only the superuser (the system administrator's login, root) can do
that to `hard' limits.  Usually, therefore, <code>limit</code> and <code>unlimit</code>
manipulate soft limits; to show or set hard limits, give the option
<code>-h</code>.  If I do `<code>limit -h</code>', I get the same list of limits as above,
but with <code>stacksize</code> and <code>coredumpsize</code> unlimited --- that means I can
reduce or remove the limits on those if I want, they're just set for my own
convenience.
<p>Why is <code>stacksize</code> set in this way?  As I said, it refers to the memory
in which the functions in programmes store variables and any other local
information.  If one function calls another, it uses more memory.  You can
get into a situation where functions call themselves recursively and there
is no way out until the machine runs out of memory; limiting <code>stacksize</code>
prevents this.  You can actually see this with zsh itself (probably better
not to try this if you'd rather the shell you're running didn't crash):
<pre>

  % fn() { fn; }
  % fn

</pre>

defines a function which keeps calling itself.  To do this, all the
functions <em>inside</em> zsh are calling themselves as well, using more and
more stack memory.  Actually, zsh uses other forms of memory inside each
function and my version of zsh crashes due to exhaustion of that memory
instead.  However, it depends on the system how this works out.
<p><p><strong>Times</strong><br><br>
    
<p>One way of returning information on process resources is with the
`<code>times</code>' command.  It simply shows the total CPU time used by the
shell and by the programmes called from it --- in that order, and without
description, so you need to remember.  On each line, the first number is
the time spent in user space and the second is the time spent in system
space.  If you're not concerned about the details of programmes the
difference is pretty irrelevant, but if you are, then the difference is
very roughly that between the time spent in the code you actually see
before you compile a programme, and the time spent in `hidden' code
where the system is doing something for you.  It's not such an obvious
distinction, because many library routines, such as mathematical
functions, are run in user mode as no privileged access to internal bits
of the system is required.  Typically, system time is concerned with the
details of input and output --- though even there it's not so simple,
because the C output routines <code>printf</code>, <code>puts</code>, <code>fread</code> and others
have user mode code which then calls the system routines <code>read</code>,
<code>write</code> and so on.
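<p>For instance (the numbers you see will, of course, vary):

```shell
out=$(times)   # first line: the shell's own user and system CPU time;
               # second line: the totals for its child processes
echo "$out"
```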
<p>You can measure the time taken by a particular external command by
putting `<code>time</code>', in the singular this time, in front of it; this is
essentially another precommand modifier, and is a shell reserved word
rather than a builtin.  This gives fairly obvious information.  You can
alter the format of the information using the <code>$TIMEFMT</code> parameter, which has its
own percent escapes, different from the ones used in prompts.  It exists
partly because the shell allowed you to access all sorts of other
information about the process which ran, such as `page faults' ---
occasions when the system had to fetch a part of the programme or data
from disk because it wasn't in the main memory.  However, that
disappeared because it was too much work to convert the feature to
configure itself automatically for different operating systems.  It may
be time to resurrect it.
<p>You can also force the time to be shown automatically by setting the
parameter <code>$REPORTTIME</code>; if a command runs for more than this many
seconds, the <code>$TIMEFMT</code> output will be shown automatically.
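<p>For example, you might put something like this in <code>.zshrc</code>;
the escapes shown --- <code>%J</code> for the job name, <code>%U</code> and
<code>%S</code> for user and system time --- are among those listed in the
<code>zshparam</code> manual:

```shell
REPORTTIME=10                       # report anything using over 10 CPU seconds
TIMEFMT='%J  %U user  %S system'    # format used by `time' and $REPORTTIME
```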
<p><a name="l40"></a>
<h3>3.2.9: Terminals, users, etc.</h3>
<p><p><strong>Watching for other users</strong><br><br>
    
<p>Although this is more associated with parameters than builtins, the
`<code>log</code>' command will tell you whether any of a group of people you want
to watch out for have logged in or out.  To use this, you set the <code>$watch</code>
array parameter to a list of user names, or `<code>all</code>' for everyone, or
`<code>notme</code>' for everyone except yourself.  Even if you don't use <code>log</code>,
any changes will be reported just before the shell prints a prompt.  It
will be printed using the <code>$WATCHFMT</code> parameter:  once again, this takes
its own set of percent escapes, listed in the <code>zshparam</code> manual.
<p><p><strong><code>ttyctl</code></strong><br><br>
    
<p>There is a command <code>ttyctl</code> which is designed to keep badly behaved
external commands from messing up the terminal settings.  Most programmes
are careful to restore any settings they change, but there are exceptions.
After `<code>ttyctl -f</code>', the terminal is frozen; zsh will restore the
settings, no matter what an external programme does with it.  This includes
deliberate attempts to change the terminal settings with the `<code>stty</code>'
command, so the default is unfrozen, `<code>ttyctl -u</code>'.
<p><a name="l41"></a>
<h3>3.2.10: Syntactic oddments</h3>
<p>This section collects together a few builtins which, rather than
controlling the behaviour of some feature of the shell, have some other
special effect.
<p><p><strong>Controlling programme flow</strong><br><br>
    
<p>The commands here are <code>exit</code>, <code>return</code>, <code>break</code>, <code>continue</code>
and <code>source</code> or <code>.</code>: they determine what the shell does next.  You've met
<code>exit</code> --- leave the shell altogether --- and <code>return</code> --- leave the
current function.  Be very careful not to confuse them.  Calling <code>exit</code>
in a shell function is usually bad:
<pre>

  % fn() { exit; }
  % fn

</pre>

This makes your entire shell session go away, not just the function.  If you
write C programmes, you should be very familiar with both, although there
is one difference in this case:  <code>return</code> at the top level in an
interactive shell actually does nothing, rather than leaving the shell as
you might expect.  However, in a script, <code>return</code> outside a function
<em>does</em> cause the entire script to stop.  The reason for this is that
zsh allows you to write autoloaded functions in the same form as
scripts, so that they can be used as either; this wouldn't work if
<code>return</code> did nothing when the file was run as a script.  Other shells
don't do this:  <code>return</code> does nothing at the top level of a script, as
well as interactively.  However, other shells don't have the feature
that function definition files can be run as scripts, either.
<p>The next two commands, <code>break</code> and <code>continue</code>, are to do with
constructs like `<code>if</code>'-blocks and loops, and it will be much easier if I
introduce them when I talk about those below.  They will also already be
familiar to C programmers.  (If you are a FORTRAN programmer, however,
<code>continue</code> is <em>not</em> the statement you are familiar with; it is instead
equivalent to <code>CYCLE</code> in FORTRAN90.)
<p>The final pair of commands are <code>.</code> and <code>source</code>.  They are similar to
one another and cause another file to be read as a stream of commands in
the current shell --- not as a script, for which a new shell would be
started which would finish at the end of the script.  The two are intended
for running a series of commands which have some effect on the current shell,
exactly like the startup files.  Indeed, it's a very common use to have a
call to one or other in a startup file; I have in my <code>~/.zshrc</code>
<pre>

  [[ -f ~/.aliasrc ]] &amp;&amp; . ~/.aliasrc

</pre>

which tests if the file <code>~/.aliasrc</code> exists, and if so runs the commands
in it; they are treated exactly as if they had appeared directly at that
point in <code>.zshrc</code>.
<p>Note that your <code>$path</code> is used to find the file to read from; this is a
little surprising if you think of this as like a script being run, since
zsh doesn't search for a script, it uses the name exactly as you gave it.
In particular, if you don't have `<code>.</code>' in your <code>$path</code> and you use the
form `<code>.</code>' rather than `<code>source</code>' you will need to say explicitly when
you want to source a file in the current directory:
<pre>

  . ./file

</pre>

otherwise it won't be found.
<p>It's a little bit like running a function, with the file as the function
body.  Indeed, the shell will set the positional parameters <code>$*</code> in
just the same way.  However, there's a crucial difference: there is no
local parameter scope.  Any variables in a sourced file, as in one of
the startup files, are in the same scope as the point from which it was
started.  You can, therefore, source a file from inside a function and
have the parameters in the sourced file local, but normally the only way
of having parameters only for use in a sourced file is to unset them
when you are finished.
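<p>A quick demonstration of the lack of scope (the file name here is just
for illustration):

```shell
echo 'greeting="hello from the sourced file"' > demo_src.sh
. ./demo_src.sh        # the explicit ./ is needed if . isn't in $path
echo "$greeting"       # visible here: no local scope was created
rm demo_src.sh
```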
<p>The fact that both <code>.</code> and <code>source</code> exist is historical:  the former
comes from the Bourne shell, and the latter from the C shell, which seems
deliberately to have done everything differently.  The only difference between
the two is that <code>source</code> always searches the current directory (and
searches it first), while `<code>.</code>' relies entirely on <code>$path</code>.
<p><p><strong>Re-evaluating an expression</strong><br><br>
    
<p>Sometimes it's very useful to take a string and run it as if it were a set
of shell commands.  This is what <code>eval</code> does.  More precisely, it sticks
the arguments together with spaces and calls them.  In the case of
something like
<pre>

  eval print Hello.

</pre>

this isn't very useful; that's no different from a simple
<pre>

  print Hello.

</pre>

The difference comes when what's on the command line has something to be
expanded, like a parameter:
<pre>

  param='print Hello.'
  eval $param

</pre>

Here, the <code>$param</code> is expanded just as it would be for a normal command.
Then <code>eval</code> gets the string `<code>print Hello.</code>' and executes it as shell
command line.  Everything --- really everything --- that the shell would
normally do to execute a command line is done again; in effect, it's run as
a little function, except that no local context for parameters is created.
If this sounds familiar, that's because it's exactly the way traps defined
in the form
<pre>

  trap 'print Hello.' EXIT

</pre>

are called.  This is one simple way out of the hole you can sometimes get
yourself into when you have a parameter which contains the name of
another parameter, instead of some data, and you want to get your hands on
the data:
<pre>

  # somewhere above...
  origdata='I am data.'
  # but all you know about is
  paramname=origdata
  # so to extract the data you can do...
  eval data=\$$paramname

</pre>

Now <code>$data</code> contains the value you want.  Make sure you understand the
series of expansions going on:  this sort of thing can get very confusing.
First the command line is expanded just as normal.  This turns the argument
to <code>eval</code> into `<code>data=$origdata</code>'.  The `<code>$</code>' that's still there was
quoted by a backslash; the backslash was stripped and the `<code>$</code>' left; the
<code>$paramname</code> was evaluated completely separately --- quoted characters
like the <code>\$</code> don't have any effect on expansions --- to give
<code>origdata</code>.  Eval calls the new line `<code>data=$origdata</code>' as a command
in its own right, with the now obvious effect.  If you're even slightly
confused, the best thing to do is simply to quote everything you don't want
to be immediately expanded:
<pre>

  eval 'data=$'$paramname

</pre>

or even
<pre>

  eval 'data=${'$paramname'}'

</pre>

may perhaps make your intentions more obvious.
<p>It's possible when you're starting out to confuse `<code>eval</code>' with the
<code>`...`</code> and <code>$(...)</code> commands, which also take the command in the
middle `<code>...</code>' and evaluate it as a command line.  However, these two
(they're identical except for the syntax) then insert the output of that
command back into the command line, while <code>eval</code> does no such thing;
it has no effect at all on where input and output go.  Conversely,
the two forms of command substitution don't do an extra level of
expansion.  Compare:
<pre>

  % foo='print bar'
  % eval $foo
  bar

</pre>

with
<pre>

  % foo='print bar'
  % echo $($foo)
  zsh: command not found: print bar


</pre>

The <code>$(...)</code> substitution took <code>$foo</code> as the command
line.  As you are now painfully aware, zsh doesn't split scalar
parameters, so this was turned into the single word `<code>print bar</code>',
which isn't a command.  The blank line is `<code>echo</code>' printing the empty
result of the failed substitution.
<p><a name="l42"></a>
<h3>3.2.11: More precommand modifiers: <code>exec</code>, <code>noglob</code></h3>
<p>Sometimes you want to run a command <em>instead</em> of the shell.  This
sometimes happens when you write a shell script to process the arguments to
an external command, or set parameters for it, then call that command.  For
example:
<pre>

  export MOZILLA_HOME=/usr/local/netscape
  netscape "$@"

</pre>

Run as a script, this sets an environment variable, then starts
<code>netscape</code>.  However, as always the shell waits for the command to
finish.  That's rather wasteful here, since there's nothing more for the
shell to do; you'd rather it simply magically turned into the <code>netscape</code>
command.  You can actually do this:
<pre>

  export MOZILLA_HOME=/usr/local/netscape
  exec netscape "$@"

</pre>

`<code>exec</code>' tells the shell that it doesn't need to wait; it can just
make the command to run replace the shell.  So this only uses a single
process.
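<p>You can convince yourself that the replacement really happens by running
<code>exec</code> in a disposable child shell; anything after the
<code>exec</code> never runs:

```shell
out=$(sh -c 'echo before; exec echo replaced; echo never reached')
echo "$out"
```

The output shows `before' and `replaced', but the final <code>echo</code> is
gone along with the shell that would have run it.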
<p>Normally, you should be careful not to use <code>exec</code> interactively, since
normally you don't want the shell to go away.  One legitimate use is to
replace the current zsh with a brand new one if (say) you've set a whole
load of options you don't like and want to restore the ones you usually
have on startup:
<pre>

  exec zsh

</pre>

Or you may have the bad taste to start a completely different shell
altogether.  Conversely, a good piece of news about <code>exec</code> is that it is
common to all shells, so you can use it from another shell to start zsh in
the way I've just shown.
<p>Like `<code>command</code>' and `<code>builtin</code>', `<code>exec</code>' is a `precommand modifier'
in that it alters the way a command line is interpreted.  Here's one more:
<pre>

  noglob print *

</pre>

If you've remembered what `glob' means, this is all fairly obvious.  It
instructs the shell not to turn the `<code>*</code>' into a list of all the files in
the directory, but instead to let well alone.  You can do this by quoting
the `<code>*</code>', of course; often <code>noglob</code> is used as part of an alias to set
up commands where you never need filename generation and don't want to have
to bother quoting everything.  However, note that <code>noglob</code> has no effect
on any other type of expansion:  parameter expansion and backquote
(<code>`....`</code>) expansion, for example, happen as normal; the only thing that
doesn't is turning patterns into a list of matching files.  So it doesn't
take away the necessity of knowing the rules of shell expansion.  If you
need that, the best thing to do is to use <code>read</code> or <code>vared</code> (see
below) to read a line into a parameter, which you pass to your function:
<pre>

  read -r param
  print $param

</pre>

The <code>-r</code> makes sure <code>$param</code> is the unadulterated input.
<p><a name="l43"></a>
<h3>3.2.12: Testing things</h3>
<p>I told you in the last chapter that the right way to write tests in zsh was
using the `<code>[[ ... ]]</code>' form, and why.  So you can ignore the two
builtins `<code>test</code>' and `<code>[</code>', even though they're the ones that resemble
the Bourne shell.  You can safely write
<pre>

  if [[ $foo = '' ]]; then
    print The parameter foo is empty.  O, misery me.
  fi

</pre>

or
<pre>

  if [[ -z $foo ]]; then
    print Alack and alas, foo still has nothing in it.
  fi

</pre>

instead of monstrosities like
<pre>

  if test x$foo != x; then
    echo The emptiness of foo.  Yet are we not all empty\?
  fi

</pre>

because even if <code>$foo</code> does expand to an empty string, which is what is
implied if the tests are true, `<code>[[ ... ]]</code>' remembers there was
something there and gets the syntax right.  Rather than a builtin, this is
actually a reserved word --- in fact it has to be, to be syntactically
special --- but you probably aren't too bothered about the difference.
<p>There are two sorts of tests, both shown above:  those with three
arguments, and those with two.  The three-argument forms all have some
comparison in the middle; in addition to `<code>=</code>' (or `<code>==</code>', which
means the same here, and which according to the manual page we should be
using, though none of us does), there are `<code>!=</code>' (not equal), `<code>&lt;</code>',
`<code>&gt;</code>', `<code>&lt;=</code>' and `<code>&gt;=</code>'.  All these do <em>string</em> comparisons,
i.e. they compare the sort order of the strings.
<p>Since there are better ways of sorting things in zsh, the `<code>=</code>' and
`<code>!=</code>' forms are by far the most common.  Actually, they do something a
bit more than string comparison: the expression on the right can be a
pattern.  The patterns understood are just the same as for matching
filenames, except that `<code>/</code>' isn't special, so it can be matched by a
`<code>*</code>'.  Note that, because `<code>=</code>' and `<code>!=</code>' are treated specially by
the shell, you shouldn't quote the patterns:  you might think that unless
you do, they'll be turned into file names, but they won't.  So
<pre>

  if [[ biryani = b* ]]; then
    print Word begins with a b.
  fi

</pre>

works.  If you'd written <code>'b*'</code>, including the quotes, it wouldn't have
been treated as a pattern; it would have tested for a string which was
exactly the two letters `<code>b*</code>' and nothing else.  Pattern matching like
this can be very powerful.  If you've done any Bourne shell programming,
you may remember the only way to use patterns there was via the `<code>case</code>'
construction:  that's still in zsh (see below), and uses the same sort of
patterns, but the test form shown above is often more useful.
<p>Then there are other three-argument tests which do numeric comparison.
Rather oddly, these use letters rather than mathematical symbols:
`<code>-eq</code>', `<code>-lt</code>' and `<code>-le</code>' compare if two numbers are equal, less
than, or less than or equal, to one another.  You can guess what `<code>-gt</code>'
and `<code>-ge</code>' do.  Note this is the other way round to Perl, which much
more logically uses `<code>==</code>' to test for equality of numbers (not `<code>=</code>',
since that's always an assignment operator in Perl) and `<code>eq</code>' (minus the
minus) to test for equality of strings.  Unfortunately we're now stuck with
it this way round.  If you are only comparing numbers, it's better to use the
`<code>(( ... ))</code>' expression, because that has a proper understanding of
arithmetic.  However,
<pre>

  if [[ $number -gt 3 ]]; then
    print Wow, that\'s big
  fi

</pre>

and
<pre>

  if (( $number &gt; 3 )); then
    print Wow, that\'s STILL big
  fi

</pre>

are essentially equivalent.  In the second case, the status is zero (true)
if the number in the expression was non-zero (sorry if I'm confusing you
again) and vice versa.  This means that
<pre>

  if (( 3 )); then
    print It seems that 3 is non-zero, Watson.
  fi

</pre>

is a perfectly valid test.  As in C, the test operators in arithmetic
return 1 for true and 0 for false, i.e. `<code>$number &gt; 3</code>' is 1 if <code>$number</code>
is greater than 3 and 0 otherwise; the inversion to shell logic, zero for
true, only occurs at the final step when the expression has been completely
evaluated and the `<code>(( ... ))</code>' command returns.  At least with `<code>[[
... ]]</code>' you don't need to worry about the extra negation; you can simply
think in logical terms (although that's hard enough for a lot of people).
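<p>The two conventions are easier to keep apart with a concrete sketch;
again, this behaves the same way in any shell with <code>(( ... ))</code>:

```shell
# Inside (( ... )), comparisons are C-style: 1 for true, 0 for false.
# The *status* of the whole (( ... )) command then follows the shell
# convention: zero (true) when the expression evaluated to non-zero.
number=5
(( flag = (number > 3) ))    # flag gets the C-style value, 1
big=no
if (( number > 3 )); then
  big=yes                    # reached because the status was zero (true)
fi
```
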
<p>Finally, there are a few other odd comparisons in the three-argument form:
<pre>

  if [[ file1 -nt file2 ]]; then
    print file1 is newer than file2
  fi

</pre>

does the test implied by the example; there is also `<code>-ot</code>' to test for
an older file, and there is also the little-used `<code>-ef</code>' which tests for
an `equivalent file', meaning that they refer to the same file --- in other
words, are linked; this can be a hard or a symbolic link, and in the second
case it doesn't matter which of the two is the symbolic link.  (If you were
paying attention above, you'll know it can't possibly matter in the first
case.)
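<p>A sketch with throwaway files shows both comparisons; the
<code>sleep</code> is just to make sure the two modification times differ:

```shell
# `new' is created after `old', so -nt succeeds; `link' is a hard link
# to `old', so -ef (same file) succeeds too.
dir=$(mktemp -d)
touch "$dir/old"
sleep 1                      # ensure the timestamps differ
touch "$dir/new"
ln "$dir/old" "$dir/link"
newer=no; same=no
[[ $dir/new -nt $dir/old ]] && newer=yes
[[ $dir/old -ef $dir/link ]] && same=yes
rm -r "$dir"
```
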
<p>In addition to these tests, which are pretty recognisable from most
programming languages --- although you'll just have to remember that the
`<code>=</code>' family compares strings and not numbers --- there are another set
which are largely peculiar to UNIXy scripting languages.  These are all in
the form of a hyphen followed by a letter as the test, which always takes a
single argument.  I showed one:  `<code>-z $var</code>' tests whether <code>$var</code> has
zero length.  Its opposite is `<code>-n $var</code>' which tests for non-zero length.
Perhaps this is as good a time as any to point out that the arguments to
these commands can be any single word expression, not just variables or
filenames.  You are quite at liberty to test
<pre>

  if [[ -z "$var is sqrt(`print bibble`)" ]]; then
    print Flying pig detected.
  fi

</pre>

if you like.  In fact, the tests are so eager to make sure that they only
have a one word argument that they will treat things like arrays, which
usually return a whole set of words, as if they were in double quotes,
joining the bits with spaces:
<pre>

  array=(two words)
  if [[ $array = 'two words' ]]; then
    print "The array \$array is OK.  O, joy."
  fi

</pre>

<p>Apart from `<code>-z</code>' and `<code>-n</code>', most of the two-argument tests are to do
with files: `<code>-e</code>' tests that the file named next exists, whatever type
of file it is (it might be a directory or something weirder); `<code>-f</code>'
tests if it exists and is a regular file (so it isn't a directory or
anything weird this time); `<code>-x</code>' tests whether you can execute it.
There are all sorts of others which are listed in the manual page for
various properties of files.  Then there are a couple of others: `<code>-o
&lt;option&gt;</code>' you've met and tests whether the option is set, and `<code>-t
&lt;fd&gt;</code>' tests whether the file descriptor is attached to a terminal.  A file
descriptor is a number which for the shell must be between 0 and 9
inclusive (others may exist, but you can't access them directly); 0 is the
standard input, 1 the standard output, and 2 the channel on which errors
are usually printed.  Hence `<code>[[ -t 0 ]]</code>' tests whether the input is
coming from a terminal.
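<p>A quick sketch of the single-argument file tests, using a temporary file
(I've left out <code>-t</code>, since whether you have a terminal depends on how
you run this):

```shell
# A newly created regular file exists, is regular, and is not
# executable until chmod makes it so.
f=$(mktemp)
exists=no; regular=no; execbefore=no; execafter=no
[[ -e $f ]] && exists=yes
[[ -f $f ]] && regular=yes
[[ -x $f ]] && execbefore=yes
chmod u+x "$f"
[[ -x $f ]] && execafter=yes
rm -f "$f"
```
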
<p>There are only four other things making up tests.  `<code>&amp;&amp;</code>' and `<code>||</code>'
mean logical `and' and `or', `<code>!</code>' negates the effect of a test, and
parentheses `<code>( ... )</code>' can be used to surround a set of tests which are
to be treated as one.  These are all essentially the same as in C.  So
<pre>

  if [[ 3 -gt 2 &amp;&amp; ( me &gt; you || ! -z bah ) ]]; then
    print will I, won\'t I...
  fi

</pre>

will, because 3 is numerically greater than 2; the expression in
parentheses is evaluated and though `me' actually comes before `you' in the
alphabet, so the first test fails, `<code>-z bah</code>' is false because you gave
it a non-empty string, and hence `<code>! -z bah</code>' is true.  So both sides of
the `<code>&amp;&amp;</code>' are true and the test succeeds.
<p><a name="l44"></a>
<h3>3.2.13: Handling options to functions and scripts</h3>
<p>It's often convenient to have your own functions and scripts process
single-letter options in the way a lot of builtin commands (as well as a
great many other UNIX-style commands) do.  The shell provides a builtin for
this called `<code>getopts</code>'.  This should always be called in some kind of
loop, usually a `<code>while</code>' loop.  It's easiest to explain by example.
<pre>

  testopts() {
    # $opt will hold the current option
    local opt
    while getopts ab: opt; do
      # loop continues till options finished
      # see which pattern $opt matches...
      case $opt in
        (a)
           print Option a set
           ;;
        (b)
           print Option b set to $OPTARG
           ;;
	# matches a question mark
	# (and nothing else, see text)
        (\?)
           print Bad option, aborting.
           return 1
           ;;
      esac
    done
    (( OPTIND &gt; 1 )) &amp;&amp; shift $(( OPTIND - 1 ))
    print Remaining arguments are: $*
  }

</pre>

There's quite a lot here if you're new to shell programming.  You might
want to read the stuff on structures like <code>while</code> and <code>case</code> below
and then come back and look at this.  First let's see what it does.
<pre>

  % testopts -b foo -a -- args
  Option b set to foo
  Option a set
  Remaining arguments are: args

</pre>

<p>Here's what's happening.  `<code>getopts ab: opt</code>' is the argument to the
`<code>while</code>'.  That means that the <code>getopts</code> gets run; if it succeeds
(returns status zero), then the loop after the `<code>do</code>' is run.  When
that's finished, the <code>getopts</code> command is run again, and so on until it
fails (returns a non-zero status).  It will do that when there are no more
options left on the command line.  So the loop processes the options one by
one.  Each time through, the number of the next argument to look at is left
in the parameter <code>$OPTIND</code>, so this gradually increases; that's how
<code>getopts</code> knows how far it has got.
<p>The first argument to the <code>getopts</code> is `<code>ab:</code>'.  That means `<code>a</code>' is
an option which doesn't take an argument, while `<code>b</code>' is an option
which takes a single argument, signified by the colon after it.  You can
have any number of single-letter (or even digit) options, which are
case-sensitive; for example `<code>ab:c:ABC:</code>' are six different options,
three with arguments.  If the option found has an argument, that is stored
in the parameter <code>$OPTARG</code>; <code>getopts</code> then increments <code>$OPTIND</code> by
however much is necessary, which may be 2 or just 1 since `<code>-b foo</code>' and
`<code>-bfoo</code>' are both valid ways of giving the argument.
<p>If an option is found, we use the `<code>case</code>' mechanism to find out what
it was.  The idea of this is simple, even if the syntax has the look of
an 18th-century French chateau: the argument `<code>$opt</code>' is tested
against all of the patterns in the `<code>pattern</code>)' lines until one
matches, and the commands are executed until the next `<code>;;</code>'.  It's
the shell's equivalent of C's `<code>switch</code>'.  In this example, we just
print out what the <code>getopts</code> brought in.  Note the last line, which is
called if <code>$opt</code> is a question mark --- it's quoted because `<code>?</code>' on
its own can stand for any single character.  This is how <code>getopts</code>
signals an unknown option.  If you try it, you'll see that <code>getopts</code>
prints its own error message, so ours was unnecessary: you can turn the
former off by putting a colon right at the start of the list of options,
making it `<code>:ab:</code>' here.
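<p>In that `silent' mode, the offending option letter turns up in
<code>$OPTARG</code>, so you can produce your own message.  Here's a sketch
(the function name <code>silent</code> is made up for illustration):

```shell
# With a leading colon in the option string, getopts prints no error of
# its own; the bad option letter is left in $OPTARG for our message.
silent() {
  local opt msg=ok
  OPTIND=1
  while getopts :a opt; do
    case $opt in
      (a)  ;;                           # a recognised option: nothing to do
      (\?) msg="bad option -$OPTARG" ;; # an unrecognised one
    esac
  done
  echo "$msg"
}
```

<p>So `<code>silent -x</code>' prints `<code>bad option -x</code>', while
`<code>silent -a</code>' prints `<code>ok</code>'.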
<p>Actually, having this last pattern as an <em>un</em>quoted `<code>?</code>' isn't such a
bad idea.  Suppose you add a letter to the list that <code>getopts</code> should
handle and forget to add a corresponding item in the <code>case</code> list for it.
If the last item matches any character, you will get the behaviour for an
unhandled option, which is probably the best thing to do.  Otherwise
nothing in the <code>case</code> list will match, the shell will sail blithely on to
the next call to <code>getopts</code>, and when you try to use the function with the
new option you will be left wondering quite what happened to it.
<p>The last piece of the <code>getopts</code> jigsaw is the next line, which tests if
<code>$OPTIND</code> is larger than 1, i.e. an option was found and <code>$OPTIND</code> was
advanced --- it is automatically set to 1 at the start of every function or
script.  If it was, the `<code>shift</code>' builtin with a numeric argument, but no
array name, moves the positional parameters, i.e. the function's arguments,
to shift away the options that have been processed.  The <code>print</code> in the
next line shows you that only the remaining non-option arguments are left.
You don't need to do that --- you can just start using the remaining
arguments from <code>$argv[$OPTIND]</code> on --- but it's a pretty good way of
doing it.
<p>In the call, I showed a line with `<code>--</code>' in it: that's the standard way
of telling <code>getopts</code> that the options are finished; even if later words
start with a <code>-</code>, they are not treated as options.  However, <code>getopts</code>
stops anyway when it reaches a word not beginning with `<code>-</code>', so that
wasn't necessary here.  But it works anyway.
<p>You can do all of what <code>getopts</code> does without <em>that</em> much difficulty
with a few extra lines of shell programming, of course.  The best argument
for using <code>getopts</code> is probably that it allows you to group single-letter
options, e.g. `<code>-abc</code>' is equivalent to `<code>-a -b -c</code>' if none of them
was defined to have an argument.  In this case, <code>getopts</code> has to remember
the position <em>inside</em> the word on the command line for you to call it
next, since the `<code>a</code>' `<code>b</code>' and `<code>c</code>' still appear on different
calls.  Rather unsatisfactorily, this is hidden inside the shell (as it is
in other shells --- we haven't fixed <em>all</em> of everybody else's bad design
decisions); you can't get at it or reset it without altering <code>$OPTIND</code>.
But if you read the small print at the top of the guide, you'll find I
carefully avoided saying everything was satisfactory.
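<p>To see the grouping at work, here is a compact sketch (the function name
<code>parseopts</code> is made up); it records each option as it arrives:

```shell
# Grouped single-letter options: -ac is handled as -a then -c, and the
# argument to -b may be attached (-bval) or separate (-b val).
parseopts() {
  local opt out="" OPTARG    # local OPTARG so no stale value leaks in
  OPTIND=1
  while getopts ab:c opt; do
    out="${out:+$out }$opt:${OPTARG:--}"   # `-' when there's no argument
  done
  shift $((OPTIND - 1))
  echo "$out -- $*"
}
```

<p>So `<code>parseopts -ac -bval rest</code>' prints
`<code>a:- c:- b:val -- rest</code>': the single word `<code>-ac</code>' produced two
trips round the loop.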
<p>While we're at it, why do blocks starting with `<code>if</code>' and `<code>then</code>' end
with `<code>fi</code>', and blocks starting with `<code>case</code>' end with `<code>esac</code>',
while those starting with `<code>while</code>' and `<code>do</code>' end with `<code>done</code>', not
`<code>elihw</code>' (perfectly pronounceable in Welsh, after all) or `<code>od</code>'?
Don't ask me.
<p><a name="l45"></a>
<h3>3.2.14: Random file control things</h3>
<p>We're now down into the odds and ends.  If you know UNIX at all, you will
already be familiar with the <code>umask</code> command and its effect on file
creation, but as it is a builtin I will describe it here.  Create a file
and look at it:
<pre>

  % touch tmpfile
  % ls -l tmpfile
  -rw-r--r--    1 pws   pws    0 Jul 19 21:19 tmpfile

</pre>

(I've shortened the output line for the TeX version of this document.)
You'll see that the permissions allow reading by everyone, but writing only
by the owner.  How did the command (<code>touch</code>, which is not a builtin, creates an empty
file if there was none, or simply updates the modification time of an
existing file) know what permissions to set?
<pre>

  % umask
  022
  % umask 077
  % rm tmpfile; touch tmpfile
  % ls -l tmpfile
  -rw-------    1 pws   pws    0 Jul 19 21:22 tmpfile

</pre>

<code>umask</code> was how it knew.  It gives an octal number corresponding to the
permissions which should <em>not</em> be given to a newly created file (only
newly created files are affected; operations on existing files don't
involve <code>umask</code>).  Each digit is made up of a 4 for read, 2 for write, 1
for execute, in the same order that <code>ls</code> shows for permissions: user,
then group, then everyone else.  (On this Linux/GNU-based system, like many
others, users have their own groups, so both are called `<code>pws</code>'.)
So my original `022' specified that everything should be allowed except
writing for group and other, while `077' disallowed any operation by group
and other.  These are the two most common settings, although here `002' or
`007' would be useful because of the way groups are specific to users,
making it easier to grant permission to specific other users to write in
my directories.  (Except there aren't any other users.)
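<p>The effect is easy to check in a sketch like the following, which creates
a file under a restrictive umask and then puts things back the way they were:

```shell
# With a umask of 077, a newly created file gets no group or other
# permissions at all: read and write for the owner only.
saved=$(umask)
umask 077
dir=$(mktemp -d)
touch "$dir/private"
perms=$(ls -l "$dir/private" | cut -c1-10)
rm -r "$dir"
umask "$saved"               # restore the previous umask
```
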
<p>You can also use <code>chmod</code>-like permission changing expressions in
<code>umask</code>.  So
<pre>

  % umask go+rx

</pre>

would restore group and other permissions for reading and executing, hence
returning the mask to 022.  Note that because it is <em>adding</em> permissions,
just like <code>chmod</code> does, it is <em>removing</em> numbers from the umask.
<p>You might have wondered about execute permissions, since `<code>touch</code>'
didn't give any, even where it was allowed by <code>umask</code>.  That's because
only operations which create executable programs, such as running a
compiler and linker, set that bit; the normal way of opening a new file
--- internally, the UNIX <code>open</code> function, with the <code>O_CREAT</code> flag
set --- doesn't touch it.  For the same reason, if you create shell
scripts which you want to be able to execute by typing the name, you
have to make them executable yourself:
<pre>

  % chmod +x myscript

</pre>

and, indeed, you can think of <code>chmod</code> as <code>umask</code>'s companion for
files which already exist.  It doesn't need to be a builtin, because the
files you are operating on are external to <code>zsh</code>; <code>umask</code>, on the
other hand, operates when you create a file from within <code>zsh</code> or any
child process, so needs to be a builtin.  The fact that it's inherited
means you can set <code>umask</code> before you start an editor, and files
created by that editor will reflect the permissions.
<p>Note that the value set by <code>umask</code> is also inherited and used by
<code>chmod</code>.  In the example of <code>chmod</code> I gave, I didn't say <em>which</em>
type of execute permission to add; <code>chmod</code> looks at my <code>umask</code> and
decides based on that --- in other words, with 022, everybody would be
allowed to execute <code>myscript</code>, while with 077, only I would, because
of the 1's in the number: (0+0+0)+(4+2+1)+(4+2+1).  Of course, you can
be explicit with chmod and say `<code>chmod u+x myscript</code>' and so on.
<p>Something else that may or may not be obvious:  if you run a script by
passing it as an argument to the shell,
<pre>

  % zsh myscript

</pre>

what matters is <em>read</em> permission.  That's what the shell's doing to the
script to find the commands, after all.  Execute permission applies when
the system (or, in some cases, including zsh, the parent shell where you
typed `<code>myscript</code>') has to decide whether to find a shell to run the
script by itself.
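<p>You can convince yourself of this with a sketch: the script below has its
write and execute permissions deliberately removed, yet naming the
interpreter explicitly still works (I've used <code>sh</code>, since any
Bourne-style shell will do for so simple a script):

```shell
# Read permission is all the interpreter needs to run a script it is
# given by name.
script=$(mktemp)
echo 'echo hello from the script' > "$script"
chmod 400 "$script"          # readable, but definitely not executable
out=$(sh "$script")
rm -f "$script"
```
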
<p><a name="l46"></a>
<h3>3.2.15: Don't watch this space, watch some other</h3>
<p>Finally for builtins, some things which really belong elsewhere.  There are
three commands you use to control the shell's editor.  These will be
described in the next chapter, where I talk all about the editor.
<p>The <code>bindkey</code> command allows you to attach a set of keystrokes to a
command.  It understands an abbreviated notation for the keystrokes.
<pre>

  % bindkey '^Xc' copy-prev-word

</pre>

This binds the keystrokes consisting of <code>Ctrl</code> held down with <code>x</code>, then
<code>c</code>, to the command which copies the previous word on the line to the
current position.  The commands are listed in the <code>zshzle</code> manual page.
<code>bindkey</code> can also do things with keymaps, which are a complete set of
mappings between keys and commands like the one I showed.
<p>The <code>vared</code> command is an extremely useful builtin for editing a shell
variable.  Usually much the easiest way to change <code>$path</code> (or <code>$PS1</code>,
or whatever) is to run `<code>vared path</code>':  note the lack of a `<code>$</code>', since
otherwise you would be editing whatever <code>$path</code> was expanded to.  This
is because very often you want to leave most of what's there and just
change the odd character or word.  Otherwise, you would end up doing this
with ordinary parameter substitutions, which are a lot more complicated and
error prone.  Editing a parameter is exactly like editing a command line,
except without the prompt at the start.
<p>Finally, there is the <code>zle</code> command.  This is the most mysterious, as it
offers a fairly low-level interface to the line editor; you use it to
define your own editing commands.  So I'll leave this alone for now.
<p><a name="l47"></a>
<h3>3.2.16: And also</h3>
<p>There is one more standard builtin that I haven't covered: <code>zmodload</code>,
which allows you to manipulate add-on packages for zsh.  Many extras are
supplied with the shell which aren't normally loaded to keep down the use
of memory and to avoid having too many rarely used builtins, etc., getting
in the way.  In the last chapter I will talk about some of these.  To be
more honest, a lot of the stuff in between actually uses these addons,
generically referred to as modules --- the line editor, zle, is itself a
separate module, though heavily dependent on the main shell --- and you've
probably forgotten I mentioned above using `<code>zmodload zsh/mathfunc</code>' to
load mathematical functions.
<p><a name="l48"></a>
<h2>3.3: Functions</h2>
<p>Now it's time to look at functions in more detail.  The various issues
to be discussed are: loading functions, handling parameters, compiling
functions, and repairing bike tyres when the rubber solution won't stick
to the surface.  Unfortunately I've already used so much space that I'll
have to skip the last issue, however topical it might be for me at the
moment.
<p><a name="l49"></a>
<h3>3.3.1: Loading functions</h3>
<p>Well, you know what happens now.  You can define functions on the command
line:
<pre>

  fn() {
    print I am a function
  }

</pre>

which you call under the name `<code>fn</code>'.  As you type, the shell knows that
it's in the middle of a function definition, and prompts you until you get
to the closing brace.
<p>Alternatively, and much more normally, you put a file called <code>fn</code>
somewhere in a directory listed in the <code>$fpath</code> array.  At this point,
you need to be familiar with the <code>KSH_AUTOLOAD</code> option described in the
last chapter.  From now on, I'm just going to assume your autoloadable
function files contain just the body of the function, i.e. <code>KSH_AUTOLOAD</code>
is not set.  Then the file <code>fn</code> would contain:
<pre>

  print I am a function

</pre>

and nothing else.
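<p>In case it helps to see the whole arrangement in one place, here is a
sketch of the relevant lines in a startup file; the directory name
<code>~/functions</code> is just an assumption for illustration:

```shell
# In ~/.zshrc, say (a sketch).  The file ~/functions/fn contains only
# the body of the function, i.e. the line `print I am a function',
# since we are assuming KSH_AUTOLOAD is not set.
fpath=(~/functions $fpath)
autoload -U fn
```

<p>After that, typing `<code>fn</code>' makes the shell find the file, load it,
and run the function.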
<p>Recent versions of zsh, since <code>3.1.6</code>, set up <code>$fpath</code> for you.  It
contains two parts, although the second may have multiple directories.
The first is, typically, <code>/usr/local/share/zsh/site-functions</code>,
although the prefix may vary.  This is empty unless your system
administrator has put something in it, which is what it's there for.
<p>The remaining part may be either a single directory such as
<code>/usr/local/share/zsh/3.1.9/functions</code>, or a whole set of directories
starting with that path.  It simply depends whether the person
installing zsh wanted to keep all the functions in the same directory,
or have them sorted according to what they do.  These directories are
full of functions.  However, none of the functions is autoloaded
automatically, so unless you specifically put `<code>autoload ...</code>' in a
startup file, the shell won't actually take any notice of them.  As
you'll see, part of the path is the shell version.  This makes it very
easy to keep multiple versions of zsh with functions which use features
that may be different between the two versions.  By the way, if these
directories don't exist, you should check <code>$fpath</code> to see if they are
in some other location, and if you can't find any correspondence between
what's in <code>$fpath</code> and what's on the disk even when you start the
shell with <code>zsh -f</code> to suppress loading of startup files, complain to
the system administrator: he or she has either not installed them
properly, or has made <code>/etc/zshenv</code> stomp on <code>$fpath</code>, both of which
are thoroughly evil things to do.  (<code>/etc/zshrc</code>, <code>/etc/zprofile</code>
and <code>/etc/zlogin</code> shouldn't stomp on <code>$fpath</code> either, of course.  In
fact, they shouldn't do very much; that's up to the user.)
<p>One point about <code>autoload</code> is the `<code>-U</code>' option.  This turns off the
use of any aliases you have defined when the function is actually loaded
--- the flag is remembered with the name of the function for future
reference, rather than being interpreted immediately by the <code>autoload</code>
command.  Since aliases can pretty much redefine any command into any
other, and are usually interpreted while a function is being defined or
loaded, you can see that without this flag there is fair scope for
complete havoc.
<pre>

   alias ls='echo Your ls command has been requisitioned.'
   lsroot() {
     ls -l /
   }
   lsroot

</pre>

That's not what the function writer intended.  (Yes, I know it actually
<em>is</em>, because I wrote it to show the problem, but that's not what I
<em>meant</em>.) So <code>-U</code> is recommended for all standard functions, where you
have no easy way of telling quite what could be run inside.
<p>Recently, the functions for the new completion system (described in chapter
6) have been changing the fastest.  They either begin with <code>comp</code> or an
underscore, `<code>_</code>'.  If the <code>functions</code> directory is subdivided, most
of the subdirectories refer to this.  There are various other classes of
functions distributed with the shell:
<dl>
  <li > Functions beginning <code>zf</code> are associated with zftp, a builtin
  system for FTP transfers.  Traditional FTP clients, ones which don't
  use a graphical interface, tend to be based around a set of commands
  on a command line --- exactly what zsh is good at.  This also makes it
  very easy to write macros for FTP jobs --- they're just shell
  functions.  This is described in the final chapter along with other
  modules.  It's based around a single builtin, <code>zftp</code>, which is
  loaded from the module <code>zsh/zftp</code>.
<p><li > Functions beginning <code>prompt</code>, which may be in the <code>Prompts</code>
  subdirectory, are part of a `prompt themes' system which makes it
  easy for you to switch between preexisting prompts.  You load it
  with `<code>autoload -U promptinit; promptinit</code>'.   Then `<code>prompt -h</code>'
  will tell you what to do next.  If you have new completion loaded
  (with `<code>autoload -U compinit; compinit</code>', what else) the arguments
  to `<code>prompt</code>' can be listed with <code>^D</code> and completed with a TAB;
  they are various sorts of prompt which you may or may not like.
<p><li > Functions with long names and hyphens, like <code>predict-on</code> and
  <code>incremental-complete-word</code>.  These are editor functions; you use them
  with
<pre>

  zle -N predict-on
  bindkey &lt;keystroke&gt; predict-on

</pre>

  Here, the <code>predict-on</code> function automatically looks back in the history
  list for matching lines as you type.  You should also bind
  <code>predict-off</code>, which is loaded when <code>predict-on</code> is first called.
  <code>incremental-complete-word</code> is a fairly simple attempt at showing
  possible completions for the current word as you type; it could do with
  improving.
<p><li > Everything else; these may be in the <code>Misc</code> subdirectory.  These
  are a very mixed bag which you should read to see if you like any.  One
  of the most useful is <code>zed</code>, which allows you to edit a small file
  (it's really just a simple front-end to <code>vared</code>).  The <code>run-help</code>
  file shows you the sort of thing you might want to define for use with
  the <code>\eh</code> (<code>run-help</code>) keystroke.  <code>is-at-least</code> is a function
  for finding out if the version of the shell running is recent enough,
  assuming you know what version you need for a given feature.  Several of
  the other functions refer to the old completion system --- which you
  won't need, since you will be reading chapter 6 and using the new
  completion system, of course.
</dl>
<p>If you have your own functions --- and if you use zsh a lot, you almost
certainly will eventually --- it's a good idea to add your own personal
directory to the front of <code>$fpath</code>, so that everything there takes
precedence over the standard functions.  That allows you to override a
completion function very easily, just by copying it and editing it.  I tend
to do something like this in my <code>.zshenv</code>:
<pre>

  [[ $fpath = *pws* ]] || fpath=(~pws $fpath)

</pre>

to protect against the possibility that the directory I want to add is
already there, in case I source that startup file again, and there are
other similar ways.  (You may well find your own account isn't called
<code>pws</code>, however.)
<p>Chances are you will always want your own functions to be autoloaded.
There is an easy way of doing this:  put it just after the line I showed
above:
<pre>

  autoload ${fpath[1]}/*(:t)

</pre>

The <code>${fpath[1]}/*</code> expands to all the files in the directory at the head
of the <code>$fpath</code> array.  The <code>(:t)</code> is a `glob modifier': applied to a
filename generation pattern, it takes the tail (basename) of all the files
in the list.  These are exactly the names of the functions you want to
autoload.  It's up to you whether you want the <code>-U</code> argument here.
<p><a name="l50"></a>
<h3>3.3.2: Function parameters</h3>
<p>I covered local parameters in some detail when I talked about <code>typeset</code>,
so I won't talk about that here.  I didn't mention the other parameters
which are effectively local to a function, the ones that pass down the
arguments to the function, so here is more detail.  They work pretty much
identically in scripts.
<p>There are basically two forms.  There is the form inherited from Bourne
shell via Korn shell, with the typically uninformative names: <code>$#</code>,
<code>$*</code>, <code>$@</code> and the numerical parameters <code>$1</code> etc. --- as high a
number as the shell will parse is allowed, not just single digits.  Then
there is the form inherited from the C shell: <code>$ARGC</code> and <code>$argv</code>.
I'll mainly use the Bourne shell versions, which are far more commonly
used, and come back to some oddities of the C shell variety at the end.
<p><code>$#</code> tells you how many arguments were passed to the function, while
<code>$*</code> gives those arguments as an array.  This was the only array
available in the Bourne shell, otherwise there would probably have been a
more obvious way of doing it.  To get the size and the individual elements
of the array you don't use <code>${#*}</code> and <code>${*[1]}</code> etc. (well, you usually
don't --- zsh is typically permissive here), you use <code>$#</code>, <code>$1</code>, <code>$2</code>.
Despite the syntax, these are rather like ordinary array elements; if you
refer to one off the end, you will get an empty string, but no error,
unless you have the option <code>NO_UNSET</code> set.  It is this not-quite array
which gets shifted if you use the <code>shift</code> builtin without an argument:
the old <code>$1</code> disappears, the old <code>$2</code> becomes <code>$1</code>, and so on, while
<code>$#</code> is reduced by one.  If there were no arguments (<code>$#</code> was zero),
nothing happens.
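<p>Here's a minimal sketch of <code>shift</code> at work (the function name is
purely illustrative); it behaves the same way in a script:
```shell
# Sketch: shift discards $1, renumbers the rest, and reduces $# by one.
shiftdemo() {
  echo "$# args, first is $1"
  shift
  echo "$# args, first is $1"
}
shiftdemo one two three
# prints: 3 args, first is one
#         2 args, first is two
```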
<p>The form <code>$@</code> is very similar to <code>$*</code>, and you can use it in place of
that in most contexts.  There is one place where they differ.  Let's define
a function which prints the number of its arguments, then the arguments.
<pre>

  args() {
    print $# $*
  }

</pre>

Now some arguments.  We'll do this for the current shell --- it's a
slightly odd idea, that you can set the arguments for what's already
running, or that an interactive shell has arguments at all, but
nonetheless it's true:
<pre>

  set arguments to the shell
  print $*

</pre>

sets <code>$*</code> and hence prints the message `<code>arguments to the shell</code>'.  We
now pass <em>these</em> arguments on to the function in two different ways:
<pre>

  args $*
  args $@

</pre>

This outputs
<pre>

  4 arguments to the shell
  4 arguments to the shell

</pre>

--- no surprises so far.  Remember, too, that zsh doesn't split words on
spaces unless you ask it to.  So:
<pre>

  % set 'one word'
  % args $*
  1 one word
  % args $@
  1 one word

</pre>

<p>Now here's the difference:
<pre>

  % set two words
  % args "$*"
  1 two words
  % args "$@"
  2 two words

</pre>

In quotes, <code>"$*"</code> behaves as a normal array, joining the words with
spaces.  However, <code>"$@"</code> doesn't --- it still behaves as if it was
unquoted.  You can't see from the arguments themselves in this case, but
you can from the digit giving the number of arguments the function has.
<p>This probably seems pretty silly.  Why quote something to have it behave
like an unquoted array?  The original answer lies back in Bourne shell
syntax, and relates to the vexed question of word splitting.  Suppose we
turn on Bourne shell behaviour, and try the example of a word with spaces
again:
<pre>

  % setopt shwordsplit
  % set 'one word'
  % args $*
  2 one word
  % args $@
  2 one word
  % args "$*"
  1 one word
  % args "$@"
  1 one word

</pre>

Aha!  <em>This</em> time <code>"$@"</code> kept the single word with the space intact.
In other words, <code>"$@"</code> was a slightly primitive mechanism for suppressing
splitting of words, while allowing the splitting of arrays into elements.
In zsh, you would probably quite often use <code>$*</code>, not <code>"$@"</code>, safe in
the knowledge that nothing was split until you asked it to be; and if you
wanted it split, you would use the special form of substitution <code>${=*}</code>
which does that:
<pre>

  % unsetopt shwordsplit
  % args $*
  1 one word
  % args ${=*}
  2 one word

</pre>

(I can't tell you why the `<code>=</code>' was chosen for this purpose, except that
it consists of two split lines, or in an assignment it splits two things,
or something.)  This works with any parameter, whether scalar or array,
quoted or unquoted.
<p>However, that's actually not quite the whole story.  There are times when
the shell removes arguments, because there's nothing there:
<pre>

  % set hello '' there
  % args $*
  2 hello there

</pre>

The second element of the array was empty, as if you'd typed
<pre>

  2=

</pre>

--- yes, you can assign to the individual positional parameters directly,
instead of using <code>set</code>.  When the array was expanded on the command line,
the empty element was simply missed out altogether.  The same happens with
all empty variables, including scalars:
<pre>

  % empty=
  % args $empty
  0

</pre>

But there are times when you don't want that, any more than you want word
splitting --- you want <em>all</em> arguments passed just as you gave them.
This is another side effect of the <code>"$@"</code> form.
<pre>

  % args "$@"
  3 hello there

</pre>

Here, the empty element was passed in as well.  That's why you often find
<code>"$@"</code> being used in zsh when wordsplitting is already turned off.
<p>Another note: why does the following not work like the example with <code>$*</code>?
<pre>

  % args hello '' there
  3 hello there

</pre>

The empty string was kept here.  Why?  The reason is that the shell doesn't
elide an argument if there were quotes, even if the result was empty:
instead, it provides an empty string.  So this empty string was passed as
the second argument.  Look back at:
<pre>

  set hello '' there

</pre>

Although you probably didn't think about it at the time, the same thing was
happening here.  Only with the <code>'</code><code>'</code> did we get an empty string assigned
to <code>$2</code>; later, this was missed out when passing <code>$*</code> to the function.
The same difference occurs with scalars:
<pre>

  % args $empty
  0
  % args "$empty"
  1

</pre>

The <code>$empty</code> expanded to an empty string in each case.  In the first case
it was unquoted and was removed; this is like passing an empty part of
<code>$*</code> to a command.  In the second case, the quotes stopped that from
being removed completely; this is similar to setting part of <code>$*</code> to an
empty string using <code>''</code>.
<p>That's all thoroughly confusing the first time round.  Here's a table
to try and make it a bit clearer.
<pre>

                       |   Number of arguments
                       |     if $* contains...
                       |  (two words)
Expression   Word      |       'one word'
on line   splitting?   |             empty string
--------------------------------------------------
$*             n       |     2     1     0
$@             n       |     2     1     0
"$*"           n       |     1     1     1
"$@"           n       |     2     1     1
                       |                    
$*             y       |     2     2     0
$@             y       |     2     2     0
"$*"           y       |     1     1     1
"$@"           y       |     2     1     1
                       |                    
${=*}          n       |     2     2     0
${=@}          n       |     2     2     0
"${=*}"        n       |     2     2     1
"${=@}"        n       |     2     2     1

</pre>

On the left is shown the expression to be passed to the function, and in
the three right hand columns the number of arguments the function will get if
the positional parameters are set to an array of two words, a single
word with a space in the middle, or a single word which is an empty string
(the effect of `<code>set -- '</code><code>'</code>') respectively.  The second column shows
whether word splitting is in effect, i.e. whether the <code>SH_WORD_SPLIT</code>
option is set.  The first four lines show the normal zsh behaviour; the
second four show the normal sh/ksh behaviour, with word splitting turned
on --- only the case where a word has a space in it changes, and then
only when no quotes are supplied.  The final four show what happens when
you use the `<code>${=..}</code>' method to turn on word splitting, for
convenience: that's particularly simple, since it always forces words to
be split, even inside quotation marks.
<p>I would recommend that anyone not wedded to the Bourne shell behaviour use
the top set as standard: in particular, `<code>$*</code>' for normal array behaviour
with removal of empty items, `<code>"$@"</code>' for normal array behaviour with
empty items left as empty items, and `<code>"$*"</code>' for turning arrays into
single strings.  If you need word-splitting, you should use `<code>${=*}</code>' or
`<code>"${=@}"</code>' for splitting with/without removal of empty items (obviously
there's no counterpart to the quoted-array behaviour here).  Then keep
<code>SH_WORD_SPLIT</code> turned off.  If you are wedded to the Bourne shell
behaviour, you're on your own.
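<p>To make that advice concrete, here is a hedged sketch of the classic
wrapper-function pattern (the name and options are purely illustrative):
<code>"$@"</code> forwards every argument exactly as the caller gave it,
empty strings and embedded spaces included.
```shell
# Sketch: a front-end that passes all its arguments through intact.
# With plain $* (in zsh), any empty arguments would be silently dropped.
mygrep() {
  grep -n -- "$@"
}
```
So <code>mygrep 'beta gamma' file</code> hands the two-word pattern to
<code>grep</code> as a single argument, just as typed.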
<p>The final matter is the C shell syntax.  There are two extra variables
but, luckily, not much extra in the way of complexity.  <code>$ARGC</code> is
essentially identical to <code>$#</code>, and <code>$argv</code> corresponds to <code>$*</code>,
but is a real array this time, so instead of <code>$1</code> you have
<code>${argv[1]}</code> and so on.  They use the convention that scalars used by
the shell are all uppercase, while arrays are all lowercase.  This
feature is probably the only reason anyone would need these variants.
For example, <code>${argv[2,-1]}</code> means all arguments from the second to
the last, inclusive: negative indices count from the end, and a comma
indicates a slice of an array, so that <code>${argv[1,-1]}</code> is always the
same as the full array.  Otherwise, my advice would be to stick with the
Bourne shell variants, however cryptic they may look at first sight, for
the usual reason that zsh isn't really like the C shell and if you
pretend it is, you will come a cropper sooner or later.
<p>It looks like you're missing <code>"$@"</code>, but actually you can do that with
<code>"${argv[@]}"</code>.  This, like negative indices and slices, works with all
arrays.
<p>There's one slight oddity with <code>$ARGC</code> and <code>$argv</code>, which isn't
really a deliberate feature of the shell at all, but just in case you
run into it:  although the values in them are of course local to
functions, the variables <code>$ARGC</code> and <code>$argv</code> <em>themselves</em> are
actually treated like global variables.  That means if you apply a
<code>typeset -g</code> command to them, it will affect the behaviour of
<code>$ARGC</code> and <code>$argv</code> in all functions, even though they have
different values.  It's probably not a good idea to rely on this
behaviour.
<p>I've been a little tricky here, because I've been talking about two levels
of functions at once:  <code>$*</code> and friends as set in the current function,
or even at the top level, as well as how they are passed down to commands
such as my <code>args</code> function.  Of course, in the second case the same
behaviour applies to all commands, not just functions.  What I mean is, in
<pre>

  fn() {
    cat $*
    cat "$*"
  }

</pre>

the `<code>cat</code>' command will see the differences in behaviour between the two
calls just as <code>args</code> would.  That should be obvious.
<p>Let me finally mention again a feature I noted in passing:
<pre>

  1='first argument'

</pre>

sets the first command argument for the current shell or function,
independently of any others.  People sometimes complain that
<pre>

  1000000='millionth argument'

</pre>

suddenly makes the shell use a lot more memory.  That's not a bug at
all:  you've asked the shell to set the millionth element of an array,
but not any others, so the shell creates an array a million elements
long with the first 999,999 empty, except for any arguments which were
already set.  It's not surprising this takes up a lot of memory.
<p><a name="l51"></a>
<h3>3.3.3: Compiling functions</h3>
<p>Since version 3.1.7, it has been possible to compile functions to their
internal format.  It doesn't make the functions run any faster, it just
reduces their loading time; the shell just has to bring the function
into memory, then it `runs' it as it does any other function.  On many
modern computers, therefore, you don't gain a great deal from this.  I
have to admit I don't use it, but there are other definite advantages.
<p>Note that when I say `compiled' I don't mean the way a C compiler, say,
would take a file and turn it into the executable code which the
processor understands; here, it's simply the format that the shell
happens to use internally --- it's useless without a suitable version of
zsh to run it.  Also, it's no use thinking you can hide your code from
prying eyes this way, like you can to some extent with an ordinary
compiler (disassembling anything non-trivial from scratch being a
time-consuming job): first of all, ordinary command lines appear inside
the compiled files, except in slightly processed form, and secondly
running `<code>functions</code>' on a compiled function which has been loaded
will show you just as much as it would if the function had been loaded
normally.
<p>One other advantage is that you can create `digest' files, which are
sets of functions stored in a single file.  If you often use a large
fraction of those files, or they are small, or you like the function
itself to appear when you run `functions' rather than a message saying
it hasn't been loaded, then this works well.  In fact, you can compile
all the functions in a single directory in one go.  You might think this
uses a lot of memory, but often zsh will simply `memory map' the file,
which means rather than reserving extra main memory for it and reading
it in --- the obvious way of reading files --- it will tell the
operating system to make the file available as if it were memory, and
the system will bring it into memory piece by piece, `paging' the file
as it is needed.  This is a very efficient way of doing it.  Actually,
zsh supports both this method and the obvious method of simply reading
in the file (as long as your operating system does); this is described
later on.
<p>A little extra, in case you're interested: if you read in a file
normally, the system will usually reserve space on a disk for it, the
`swap', and do paging from there.  So in this case you still get the
saving of main memory --- this is standard in all modern operating
systems.  However, it's not <em>as</em> efficient: first of all, you had to
read the file in in the first place.  Secondly it eats up swap space,
which is usually a fixed amount of disk, although if you've got enough
main memory, the system probably won't bother allocating swap.  Thirdly
--- this is probably the clincher for standard zsh functions on a large
system --- if the file is directly mapped read-only, as it is in this
case, the system only needs one place in main memory, plus the single
original file on disk, to keep the function, which is very much more
efficient.  With the other method, you would get multiple copies in both
main memory and (where necessary) swap.  This is how the system treats
directly executable programmes like the shell itself --- the data is
specific to each process, but the programme itself can be shared because
it doesn't need to be altered when it's running.
<p>Here's a simple example.
<pre>

  % echo 'echo hello, world' &gt;hw
  % zcompile hw
  % ls
  hw    hw.zwc
  % rm hw
  % fpath=(. $fpath)
  % autoload hw
  % hw
  hello, world

</pre>

We created a simple `hello, world' function, and compiled it.  This
produces a file called `<code>hw.zwc</code>'.  The extension stands for `Z-shell
Word Code', because it's based on the format of words (integers longer
than a single byte) used internally by the shell.  Then we made sure the
current directory was in our <code>$fpath</code>, and autoloaded the function,
which ran as expected.  We deleted the original file for demonstration
purposes, but as long as the `<code>.zwc</code>' file is newer, that will be
used, so you don't need to remove the originals in normal use.  In fact,
you shouldn't, because you will lose any comments and formatting
information in it; you can regenerate the function itself with the
`<code>functions</code>' command (try it here), but the shell only remembers the
information actually needed to run the commands.  Note that the function
was in the zsh autoload format, not the ksh one, in this case (but see
below).
<p><p><strong>And there's more</strong><br><br>
    
<p>Now some bells and whistles.  Remember the <code>KSH_AUTOLOAD</code> thing?  When
you compile a function, you can specify which format --- native zsh or
ksh emulation --- will be used for loading it next time, by using the
option <code>-k</code> or <code>-z</code>, instead of the default, which is to examine the
option (as would happen if you were autoloading directly from the file).
Then you don't need to worry about that option.  So, for example, you
could compile all the standard zsh functions using `<code>zcompile -z</code>' and
save people the trouble of making sure they are autoloaded correctly.
<p>You can also specify that aliases shouldn't be expanded when the files
are compiled by using <code>-U</code>: this has roughly the same effect as saying
<code>autoload -U</code>, since when the shell comes to load a compiled file, it
will never expand aliases, because the internal format assumes that all
processing of that kind has already been done.  The difference in this
case is if you <em>don't</em> specify <code>-U</code>: then the aliases found when you
compile the file, not when you load the function from it, will be used.
<p>Now digest files.  Here's one convenient way of doing it.
<pre>

  % ls ~/tmp/fns
  hw1   hw2
  % fpath=(~/tmp/fns $fpath)
  % cd ~/tmp
  % zcompile fns fns/*
  % ls
  fns   fns.zwc

</pre>

We've made a directory to put functions in, <code>~/tmp/fns</code>, and stuck some
random files in it.  The <code>zcompile</code> command, this time, was given several
arguments: a filename to use for the compiled functions, and then a list of
functions to compile into it.  The new file, <code>fns.zwc</code>, sits in the same
directory where the directory <code>fns</code>, found in <code>$fpath</code>, is.  The shell
will actually search the digest file instead of the directory.  More
precisely, it will search both, and see which is the more recent, and use
that as the function.  So now
<pre>

  % autoload hw1
  % hw1
  echo hello, first world

</pre>

<p>You can test what's in the digest file with:
<pre>

  % zcompile -t fns
  zwc file (read) for zsh-3.1.9-dev-3
  fns/hw1
  fns/hw2

</pre>

Note that the names appear as you gave them on the command line,
i.e. with <code>fns/</code> in front.  Only the basenames are important for
autoloading functions.  The note `<code>(read)</code>' in the first line means
that zsh has marked the functions to be read into the shell, rather than
memory mapped as discussed above; this is easier for small functions,
particularly if you are liable to remove or alter a file which is
mapped, which will confuse the shell.  It usually decides which method
to use based on size; you can force memory mapping by giving the <code>-M</code>
option.  Memory mapping doesn't work on all systems (currently including
Cygwin).
<p>I showed this for compiling files, but you can actually tell the shell
to output compiled functions --- in other words, it will look along
<code>$fpath</code> and compile the functions you specify.  I find compiling
files easier, when I do it at all, since then I can use patterns to find
them as I did above.  But if you want to do it the other way, you should
note two other options: <code>-a</code> will compile files by looking along
<code>$fpath</code>, while <code>-c</code> will output any functions already loaded by the
shell (you can combine the two to use either).  The former is
recommended, because then you don't lose any information which was
present in the autoload file, but not in the function stored in memory
--- this is what would happen if the file defined some extra widgets
(in the non-technical sense) which weren't part of the function called
subsequently.
<p>If you're perfectly happy with the shell <em>only</em> searching a digest
file, and not comparing the datestamp with files in the directory, you
can put that directly into your <code>$fpath</code>, i.e. <code>~/tmp/fns.zwc</code> in
this case.  Then you can get rid of the original directory, or archive
it somewhere for reuse.
<p>You can compile scripts, too.  Since these are in the same format as a
zsh autoload file, you don't need to do anything different from
compiling a single function.  You then run (say) <code>script.zwc</code> by
typing `<code>zsh script</code>' --- note that you should omit the <code>.zwc</code>, as
zsh decides if there's a compiled version of a script by explicitly
appending the suffix.  What's more, you can run it using `<code>.</code>' or
`<code>source</code>' in just the same way (`<code>. script</code>') --- this means you
can compile your startup files if you find they take too long to run
through; the shell will spot a <code>~/.zshrc.zwc</code> as it would any other
sourceable file.  It doesn't make much sense to use the memory mapping
method in this case, since once you've sourced the files you never want
to run them again, so you might as well specify `<code>zcompile -R</code>' to use
the reading (non-memory-mapping) method explicitly.
<p>If you ever look inside a <code>.zwc</code> file, you will see that the
information is actually included twice.  That's because systems differ
about the order in which numbers are stored: some have the least
significant byte first (notably Intel and Mips) and some the most
significant (notably SPARC and Cambridge Consultants' XAP processor,
which is notable here mainly because I spend my working hours
programming for it --- you can't run zsh on it).  Since zsh uses
integers a great deal in the compiled code, it saves them in both
possible orders for ease of use.  Why not just save it for the machine
where you compiled it?  Then you wouldn't be able to share the files
across a heterogeneous network --- or even worse, if you made a
distribution of compiled files, they would work on some machines, and
not on others.  Think how Emacs users would complain if the <code>.elc</code>
files that arrived weren't the right ones.  (Worse, think how the vi
users would laugh.)
<p><p><strong>A little -Xtra help</strong><br><br>
    
<p>There are two final autoloading issues you might want to know about.  In
versions of zsh since 3.1.7, you will see that when you run
<code>functions</code> on a function which is marked for autoload but hasn't yet
been loaded, you get:
<pre>

afunctionmarkedforautoloadwhichhasntbeenloaded () {
        # undefined
        builtin autoload -XU
}

</pre>

The `<code># undefined</code>' is just printed to alert you that this was a
function marked as autoloadable by the <code>autoload</code> command: you can
tell, because it's the only time <code>functions</code> will emit a comment
(though there might be other `<code>#</code>' characters around).  What's
interesting is the <code>autoload</code> command with the <code>-X</code> option.  That
option means `Mark me for autoloading and run me straight away'.  You
can actually put it in a function yourself, and it will have the same
effect as running `<code>autoload</code>' on a not-yet-existent
function. Obviously, the <code>autoload</code> command will disappear as soon as
you do run it, to be replaced by the real contents.  If you put this
inside a file to be autoloaded, the shell will complain --- the
alternative is rather more unpalatable.
<p>Note also the <code>-U</code> option was set in that example:  that simply means
that I used <code>autoload</code> with the <code>-U</code> option when I originally told
the shell to autoload the function.
<p>There's another option, <code>+X</code>, the complete opposite of <code>-X</code>.  This
one can <em>only</em> be used with autoload outside the function you're
loading, just as <code>-X</code> was only meaningful inside.  It means `load the
file immediately, but don't run it', so it's a more active (or, as they
say nowadays, since they like unnecessarily long words, proactive) form
of <code>autoload</code>.  It's useful if you want to be able to run the
<code>functions</code> command to see the function, but don't want to run the
function itself.
<p><p><strong>Special functions</strong><br><br>
    
<p>I'm in danger of simply quoting the manual, but there are various
functions with a special meaning to the shell (apart from <code>TRAP...</code>
functions, which I've already covered).  That is, the functions
themselves are perfectly normal, but the shell will run them
automatically on certain occasions if they happen to exist, and silently
skip them if they don't.
<p>The two most frequently used are <code>chpwd</code> and <code>precmd</code>.  The former
is called whenever the directory changes, either via <code>cd</code>, or
<code>pushd</code>, or an <code>AUTO_CD</code> --- you could turn the first two into
functions, and avoid needing <code>chpwd</code> but not the last.  Here's how to
force an xterm, or a similar windowing terminal, to put the current
directory into the title bar.
<pre>

  chpwd() {
    [[ -t 1 ]] || return
    case $TERM in
      (sun-cmd) print -Pn "\e]l%~\e\\"
        ;;
      (*xterm*|rxvt|(dt|k|E)term) print -Pn "\e]2;%~\a"
        ;;
    esac
  }

</pre>

The first line tests that standard output is really a terminal --- you
don't want to print the string in the middle of a script which is
directing its output to a file.  Then we look to see if we have a
<code>sun-cmd</code> terminal, which has its own <em>sui generis</em> sequence for
putting a string into the title bar, or something which recognises xterm
escape sequences.  In either case, the special sequences (a bit like
termcap sequences as discussed for <code>echotc</code>) are interpreted by the
terminal, and instead of being printed out cause it to put the string in
the middle into the title bar.  The string here is `<code>%~</code>': I added the
<code>-P</code> option to <code>print</code> so it would expand prompt escapes.  I could
just have used <code>$PWD</code>, but this way has the useful effect of
shortening your home directory, or any other named directory, into
<code>~</code>-notation, which is a bit more readable.  Of course, you can put
other stuff there if you like, or, if you're really sophisticated, put
in a parameter <code>$HEADER</code> and define that elsewhere.
<p>If programmes other than the shell alter what appears in the xterm title
bar, you might consider changing that <code>chpwd</code> function to <code>precmd</code>.
The function <code>precmd</code> is called just before every prompt; in this case
it will restore the title line after every command has run.  Some people
make the mistake of using it to set up a prompt, but there are enough
ways of getting varying information into a fixed prompt string that you
shouldn't do that unless you have <em>very</em> odd things in your prompt.
It's a big nuisance having to redefine <code>precmd</code> to alter your prompt
--- especially if you don't know it's there, since then your prompt
apparently magically returns to the same format when you change it.
There are some good reasons for using <code>precmd</code>, too, but most of them
are fairly specialised.  For example, on one system I use it to check if
there is new input from a programme which is sending data to the shell
asynchronously, and if so printing it out onto the terminal.  This is
pretty much what happens with job control notification if you don't have
the <code>NOTIFY</code> option set.
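<p>For the title-bar case just described, a minimal sketch (reusing the
escape sequence from the <code>chpwd</code> example, and assuming an
xterm-like terminal) might be:
```shell
# Sketch: re-assert the xterm title before each prompt, in case some
# other programme overwrote it while the last command was running.
precmd() {
  [[ -t 1 && $TERM = *xterm* ]] && print -Pn "\e]2;%~\a"
}
```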
<p>The name <code>precmd</code> is a bit of a misnomer: <code>preprompt</code> would have
been better.  It usurps the name more logically applied to the function
actually called <code>preexec</code>, which is run after you finished editing a
command line, but just before the line is executed.  <code>preexec</code> has one
additional feature: the line about to be executed is passed down as an
argument.  You can't alter what's going to be executed by editing the
parameter, however: that has been suggested as an upgrade, but it would
make it rather easy to get the shell into a state where you can't
execute any commands because <code>preexec</code> always messes them up.  It's
better, where possible, to write function front-ends to specific
commands you want to handle specially.  For example, here's my <code>ls</code>
function:
<pre>

  local ls
  if [[ -n $LS_COLORS ]]; then
    ls=(ls --color=auto)
  else
    ls=(ls -F)
  fi
  command $ls $*

</pre>

This handles GNU and non-GNU versions of ls.  If <code>$LS_COLORS</code> is set, it
assumes we are using GNU ls, and hence colouring (or colorizing, in
geekspeak) is available.  Otherwise, it uses the standard option <code>-F</code> to
show directories and links with a special symbol.  Then it uses <code>command</code>
to run the real <code>ls</code> --- this is a key thing to remember any time you use
a function front-end to a command.  I could have done this another way:
test in my initialisation files which version of <code>ls</code> I was using, then
alias <code>ls</code> to one of the two forms.  But I didn't.
<p>Apart from the trap functions, there is one remaining special function.
It is <code>periodic</code>, which is executed before a prompt, like <code>precmd</code>,
but only every now and then, in fact every <code>$PERIOD</code> seconds; it's up
to you to set <code>$PERIOD</code> when you defined <code>periodic</code>.  If <code>$PERIOD</code>
isn't set, or is zero, nothing happens.  Don't get <code>$PERIOD</code> confused
with <code>$SECONDS</code>, which just counts up from 0 when the shell starts.
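<p>A minimal sketch showing how the two pieces fit together:
```shell
# Sketch: every ten minutes (checked just before a prompt is printed),
# rebuild the command hash table so newly installed programmes are found.
PERIOD=600
periodic() {
  rehash
}
```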
<p><a name="l52"></a>
<h2>3.4: Aliases</h2>
<p>Aliases are much simpler than functions.  In the C shell and its
derivatives, there are no functions, so aliases take their place and can
have arguments, which involve expressions rather like those which extract
elements of previous history lines with `<code>!</code>'.  Zsh's aliases, like
ksh's, don't take arguments; you have to use functions for that.  However,
there are things aliases can do which functions can't, so sometimes you end
up using both, for example
<pre>

  zfget() {
    # function to retrieve a file by FTP,
    # using globbing on the remote host
  }
  alias zfget='noglob zfget'

</pre>

The function here does the hard work; this is a function from the zftp
function suite, supplied with the shell, which retrieves a file or set of
files from another machine.  The function allows patterns, so you can
retrieve an entire directory with `<code>zfget *</code>'.  However, you need to
avoid the `<code>*</code>' being expanded into the set of files in the current
directory on the machine you're logged into; this is where the alias comes
in, supplying the `<code>noglob</code>' in front of the function.  There's no way of
doing this with the function alone; by the time the function is called, the
`<code>*</code>' would already have been expanded.  Of course you could quote it,
but that's what we're trying to avoid.  This is a common reason for using
the alias/function combination.
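<p>The effect is easy to check with a toy function; the name <code>showargs</code> below is invented for this sketch, only <code>noglob</code> itself comes from the shell:

```shell
# a hypothetical function that reports its arguments verbatim
showargs() { print -r -- "$@"; }
# the alias supplies noglob, so patterns reach the function unexpanded
alias showargs='noglob showargs'
showargs *.c    # prints the literal pattern *.c
```

Without the alias, the shell would have expanded <code>*.c</code> (or complained about a failed match) before <code>showargs</code> was ever called.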
<p>Remember to include the `<code>=</code>' in alias definitions; it is necessary in
zsh, unlike csh and friends.  If you do:
<pre>

  alias zfget noglob zfget

</pre>

they are treated as a list of aliases.  Since none has the `<code>=</code>' and a
definition, the shell thinks you want to list the definitions of the listed
words; I get the output
<pre>

  zfget='noglob zfget'
  zfget='noglob zfget'

</pre>

since <code>zfget</code> was aliased as before, but <code>noglob</code> wasn't aliased and
was skipped, although the failed alias lookup caused status 1 to be
returned.  Remember that the <code>alias</code> command takes as many arguments as
you like; any with `<code>=</code>' is a definition, any without is a request to
print the current definition.
<p>Aliases can in fact be allowed to expand to almost anything the shell
understands, not just sets of words.  That's because the text retrieved
from the alias is put back into the input, and reread more or less as if
you'd typed it.  That means you can get away with strange combinations like
<pre>

  alias tripe="echo foo | sed 's/foo/bar/' |"
  tripe cat

</pre>

which is interpreted exactly the same way as
<pre>

  echo foo | sed 's/foo/bar/' | cat

</pre>

where the word `<code>foo</code>' is sent to the stream editor, which alters it to
`<code>bar</code>' (`<code>s/old/new/</code>' is <code>sed</code>'s syntax for a substitution), and
passes it on to `<code>cat</code>', which simply dumps the output.  It's useless, of
course, but it does show what can lurk behind an apparently simple command
if it happens to be an alias.  It is usually not a good idea to do this,
due to the potential confusion.
<p>As the manual entry explains, you can prevent an alias from being expanded
by quoting it.  This isn't like quoting any other expansion, though;
there's no particular important character which has to be interpreted
literally to stop the expansion.  The point is that because aliases are
expanded early on in processing of the command line, looking up an alias is
done on a string without quotes removed.  So if you have an alias
`<code>drivel</code>', none of the strings `<code>\drivel</code>', `<code>'d'rivel</code>', or
`<code>drivel""</code>' will be expanded as the alias:  they all would have the same
effect as proper commands, after the quotes are removed, but as aliases
they appear different.  The manual entry also notes that you can actually
make aliases for any of these special forms, e.g. `<code>alias '\drivel'=...</code>'
(note the quotes, since you need the backslash to be passed down to the
alias command).  You would need a pretty good reason to do so.
<p>Although my `<code>tripe</code>' example was silly, you know from the existence of
`precommand modifiers' that it's sometimes useful to have a special command
which precedes a command line, like <code>noglob</code> or the non-shell command
<code>nice</code>.  Since they have commands following, you would probably expect
aliases to be expanded there, too.  But this doesn't work:
<pre>

  % alias foo='echo an alias for foo'
  % noglob foo
  zsh: command not found: foo

</pre>

because the <code>foo</code> wasn't in command position.  The way round this is to
use a special feature:  aliases whose definitions end in a space force the
next word along to be looked up as a possible alias, too:
<pre>

  % alias noglob='noglob '
  % noglob foo
  an alias for foo

</pre>

which is useful for any command which can take a command line after it.
This also shows another feature of aliases:  unlike functions, they
remember that you have already called an alias of a particular name, and
don't look it up again.  So the `<code>noglob</code>' which comes from expanding the
alias is not treated as an alias, but as the ordinary precommand modifier.
<p>You may be a little mystified about this difference.  A simple answer is
that it's useful that way.  It's sometimes useful for functions to call
themselves; for example if you are handling a directory hierarchy in one go
you might get a function to examine a directory, do something for every
ordinary file, and for every directory file call itself with the new
directory name tacked on.  Aliases are too simple for this to be a useful
feature.  Another answer is that it's particularly easy to mark aliases as
being `in use' while they are being expanded, because it happens while the
strings inside them are being examined, before any commands are called,
where things start to get complicated.
<p>Lastly, there are `global aliases'.  If aliases can get you into a lot of
trouble, global aliases can get you into a lot of a lot of trouble.  They
are defined with the option <code>-g</code> and are expanded not just in command
position, but anywhere on the command line.
<pre>

  alias -g L='| less'
  echo foo L

</pre>

This turns into `<code>echo foo | less</code>'.  It's a neat trick if you don't mind
your command lines having only a minimal amount to do with what is actually
executed.
<p>I already pointed out that alias lookups are done so early that aliases are
expanded when you define functions:
<pre>

  % alias hello='echo I have been expanded'
  % fn() {
  function&gt;  hello
  function&gt; }
  % which fn
  fn () {
          echo I have been expanded
  }

</pre>

You can't stop this when typing in functions directly, except by
quoting part of the name you type.  When autoloading, the <code>-U</code> option is
available, and recommended for use with any non-trivial function.
<p>A brief word about that `<code>function&gt;</code>' which appears to prompt you while
you are editing a function; I mentioned this in the previous chapter but
here I want to be clearer about what's going on.  While you are being
prompted like that, the shell is not actually executing the commands you
are typing in.  Only when it is satisfied that it has a complete set of
commands will it go away and execute them (in this case, defining the
function).  That means that it won't always spot errors until right at the
end.  Luckily, zsh has multi-line editing, so if you got it wrong you
should just be able to hit up-arrow and edit what you typed; hitting return
will execute the whole thing in one go.  If you have redefined <code>$PS2</code> (or
<code>$PROMPT2</code>), or you have an old version of the shell, you may not see the
full prompt, but you will usually see something ending in `<code>&gt;</code>' which
means the same.
<p><a name="l53"></a>
<h2>3.5: Command summary</h2>
<p>As a reminder, the shell looks up commands in this order:
<p><dl>
  <li > aliases, which will immediately be interpreted again as texts for
  commands, possibly even other aliases; they can be deleted with
  `<code>unalias</code>',
<p><li > reserved words, those special to the shell which often need to be
  interpreted differently from ordinary commands due to the syntax, although
  they can be disabled if you really need to,
<p><li > functions; these can also be disabled, although it's usually easier
  to `<code>unfunction</code>' them,
<p><li > builtin commands, which can be disabled, or called as a builtin by
  putting `<code>builtin</code>' in front,
<p><li > external commands, which can be called as such, even if the name
  clashes with one of the above types, by putting `<code>command</code>' in front.
</dl>
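<p>If you're ever unsure which of these types a particular name will resolve to, the builtin <code>whence</code> with the <code>-w</code> option reports the type; the names <code>gl</code> and <code>greet</code> below are invented for the sketch:

```shell
alias gl='ls -l'
greet() { print hello; }
whence -w gl      # gl: alias
whence -w if      # if: reserved
whence -w greet   # greet: function
whence -w ls      # ls: command   (assuming no alias or function of that name)
```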
<p><a name="l54"></a>
<h2>3.6: Expansions and quotes</h2>
<p>As I keep advertising, there will be a whole chapter dedicated to the
subject of shell expansions and what to do with them.  However, it's a
rather basic subject, which definitely comes under the heading of basic
shell syntax, so I shall here list all the forms of expansion.  As given in
the manual, there are five stages.
<p><a name="l55"></a>
<h3>3.6.1: History expansion</h3>
<p>This is the earliest, and is only done on an interactive command line, and
only if you have not set <code>NO_BANG_HIST</code>.  It was described in the section
`<em>The history mechanism; types of history</em>' in the previous chapter.
It is almost independent of the shell's processing of the command line; it
takes place as the command line is read in, not when the commands are
interpreted.  However, in zsh it is done late enough that the `<code>!</code>'s can
be quoted by putting them in single quotes:
<pre>

  echo 'Hello!!'

</pre>

doesn't insert the previous line at that point, but
<pre>

  echo "Hello!!"

</pre>

does.  You can always quote active `<code>!</code>'s with a backslash, so
<pre>

  echo "Hello\!\!"

</pre>

works, with or without the double quotes.  Amusingly, since single quotes
aren't special in double quotes, if you set the <code>HIST_VERIFY</code> option,
which puts the expanded history line back on the command line for possible
further editing, and try the first two of the three possibilities above in
order, then keep hitting return, you will find ever increasing command
lines:
<pre>

  % echo 'Hello!!'
  Hello!!
  % echo "Hello!!"
  % echo "Helloecho 'Hello!!'"
  % echo "Helloecho 'Helloecho 'Hello!!''"
  % echo "Helloecho 'Helloecho 'Helloecho 'Hello!!'''"

</pre>

and if you understand why, you have a good grasp of how quotes work.
<p>There's another way of quoting exclamation marks in a line:  put a `<code>!"</code>'
in it.  It can appear anywhere (as long as it's not in single quotes) and
will be removed from the line, but it has the effect of disabling any
subsequent exclamation marks till the end of the line.  This is the only
time quote marks which are significant to the shell (i.e. are not
themselves quoted) don't have to occur in a matching pair.
<p>Note that as exclamation marks aren't active in any text read
non-interactively --- and this includes autoloaded functions and sourced
files, such as startup files, read inside interactive shells --- it is an
error to quote any `<code>!</code>'s with backslashes inside double quotes in files.  This will simply
pass on the backslashes to the next level of parsing.  Other forms of
quoting are all right: `<code>\!</code>', because any character quoted with a
backslash is treated as itself, and <code>'!'</code> because single quotes can quote
anything anyway.
<p><a name="l56"></a>
<h3>3.6.2: Alias expansion</h3>
<p>As discussed above, alias expansion also goes on as the command line is
read, so is to a certain extent similar to history expansion.  However,
while a history expansion may produce an alias for expansion, `<code>!</code>'s in
the text resulting from alias expansions are normal characters, so it can
be thought of as a later phase (and indeed it's implemented that way).
<p><a name="l57"></a>
<h3>3.6.3: Process, parameter, command, arithmetic and brace expansion</h3>
<p>There are a whole group of expansions which are done together, just by
looking at the line constructed from the input after history and alias
expansion and reading it from left to right, picking up any active
expansions as the line is examined.  Whenever a complete piece of
expandable text is found, it is expanded; the text is not re-examined,
except in the case of brace expansion, so none of these types of expansion
is performed on any resulting text.  Whether the later forms of expansion
--- in other words, filename generation and filename expansion --- are
performed is another matter, depending largely on the <code>GLOB_SUBST</code> option as
discussed in the previous chapter.  Here's a brief summary of the different
types.
<p><p><strong>Process substitution</strong><br><br>
    
<p>There are three forms that result in
a command line argument which refers to a file from or to which
input or output is taken:  `<code>&lt;</code>(process)'
runs the process which is expected to generate output which can be used
as input by a command; `<code>&gt;</code>(process)' runs the
process which will take input to it; and
`<code>=</code>(process)' acts like the first one, but it is
guaranteed that the file is a plain file.
<p>This probably sounds like gobbledygook.  Here are some simple examples.
<pre>

  cat &lt; &lt;(echo This is output)

</pre>

(There are people in the world with nothing better to do than compile lists
of dummy uses of the `<code>cat</code>' command, as in that example, and pour scorn
on them, but I'll just have to brave it out.)  What happens is that the
command `<code>echo This is output</code>' is run, with the obvious result.  That
output is <em>not</em> put straight into the command line, as it would be with
command substitution, to be described shortly.  Instead, the command line
is given a filename which, when read, gets that output.  So it's more like:
<pre>

  echo This is output &gt;tmpfile
  cat &lt; tmpfile
  rm tmpfile

</pre>

(note that the temporary file is cleaned up automatically), except that
it's more compact.  In this example I could have missed out the remaining
`<code>&lt;</code>', since <code>cat</code> does the right thing with a filename, but I put it
there to emphasise the fact that if you want to redirect input from the
process substitution you need an <em>extra</em> `<code>&lt;</code>', over and above the one
in the substitution syntax.
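<p>Here's a slightly less contrived sketch of the input form: counting the lines of a process's output. Note once again the extra `<code>&lt;</code>' for the redirection, over and above the one in the substitution itself.

```shell
# wc reads from standard input, which is redirected from the
# file-like name produced by the process substitution
wc -l < <(print -l one two three)
```

Since the input arrived by redirection, <code>wc</code> prints just the count (3 here), with no filename attached.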
<p>Here's an example for the corresponding output substitution:
<pre>

  echo This is output &gt; \
  &gt;(sed 's/output/rubbish/' &gt;outfile)

</pre>

which is a perfectly foul example, but works essentially like:
<pre>

  echo This is output &gt;tmpfile
  sed 's/output/rubbish/' &lt;tmpfile &gt;outfile

</pre>

There's an obvious relationship to pipes here, and in fact this example
could be better written,
<pre>

  echo This is output | sed 's/output/rubbish/' &gt;outfile

</pre>

A good example of an occasion where the output process substitution can't
be replaced by a pipe is when it's on the error output, and standard output
is being piped:
<pre>

  ./myscript 2&gt; &gt;(grep -v idiot &gt;error.log) |
      process-output &gt;output.log

</pre>

a little abstract, but here the main point of the script `myscript' is to
produce some output which undergoes further processing on the right-hand
side of the pipe.  However, we want to process the error output here, by
filtering out occurrences of lines which use the word `idiot', before
dumping those errors into a file <code>error.log</code>.  So we get an effect
similar to having two pipelines at once, one for output and one for error.
Note again the <em>two</em> `<code>&gt;</code>' signs present next to one another to get
that effect.
<p>Finally, the `<code>=</code>(process)' form.  Why do we need this
as well as the one with `<code>&lt;</code>'?  To understand that, you need to know a
little of how zsh tries to implement the latter type efficiently.  Most
modern UNIX-like systems have `named pipes', which are essentially files
that behave like the `<code>|</code>' on the command line:  one process writes to
the file, another reads from it, and the effect is essentially that data
goes straight through.  If your system has them, you will usually find the
following demonstration works:
<pre>

  % mknod tmpfile p
  % echo This is output &gt;&gt;tmpfile &amp;
  [2] 1507
  % read line &lt;tmpfile
  %
  [2]  + 1507 done       echo This is output &gt;&gt; tmpfile
  % print -- $line
  This is output
  %

</pre>

The syntax to create a named pipe is that rather strange `<code>mknod</code>'
command, with `<code>p</code>' for pipe.  The echo command writes to the file:  the
`<code>&gt;</code><code>&gt;</code>' redirection actually tries to add to the end of it, which
stops it being overwritten with an ordinary non-pipe file.  We stick this
in the background, because it won't do anything yet:  you can't write to
the pipe when there's no-one to read it (a fundamental rule of pipes which
isn't <em>quite</em> as obvious as it may seem, since it <em>is</em> possible for
data to lurk in the pipe, buffered, before the process reading from it
extracts it), so we put that in the background to wait for action.  This
comes in the next line, where we read from the pipe:  that allows the
<code>echo</code> to complete and exit.  Then we print out the line we've read.
<p>The problem with pipes is that they are just temporary storage spaces for
data on the way through.  In particular, you can't go back to the beginning
(in C-speak, `you can't seek backwards on a pipe') and re-read what was
there.  Sometimes this doesn't matter, but some commands, such as editors,
need that facility.  As the `<code>&lt;</code>' process substitution is implemented with
named pipes (well, maybe), there is also the `<code>=</code>' form, which produces a
real, live temporary file, probably in the `<code>/tmp</code>' directory, containing
the output from the file, and then puts the name of that file on the
command line.  The manual notes, unusually helpfully, that this is useful
with the `<code>diff</code>' command for comparing the output of two processes:
<pre>

  diff =(./myscript1) =(./myscript2)

</pre>

where, presumably, the two scripts produce similar, but not identical,
output which you want to compare.
<p>I said `well, maybe' in that paragraph because there's another way zsh can
do `<code>&lt;</code>' process substitutions.  Many modern systems allow you to access
a file with a name like `<code>/dev/fd/0</code>' which corresponds to file
descriptor 0, in this case standard input: to anticipate the section on
redirection, a `file descriptor' is a number assigned to a particular input
or output stream.  This method allows you to access it as a file; and if
this facility is available, zsh will use it to pass the name of the file in
process substitution instead of using a named pipe, since in this case it
doesn't have to create a temporary file; the system does everything.  Now,
if you are really on the ball, you will realise that this doesn't get
around the problem of pipes --- where is data on this file descriptor going
to come from?  The answer is that it will either have to come from a real
temporary file --- which is pointless, because that's what we wanted to
avoid --- or from a pipe opened from some process --- which is equivalent
to the named pipe method, except with just a file descriptor instead of a
name.  So even if zsh does it this way, you still need the `<code>=</code>' form for
programmes which need to go backwards in what they're reading.
<p><p><strong>Parameter substitution</strong><br><br>
    
<p>You've seen enough of this already.  This comes from a `<code>$</code>' followed
either by something in braces, or by alphanumeric characters forming the
name of the parameter: `<code>$foo</code>' or `<code>${foo}</code>', where the second form
protects the expansion from any  other strings at the ends and also allows
a veritable host of extra things to appear inside the braces to modify the
substitution.  More detail will be held over till chapter 5; there's a
lot of it.
<p><p><strong>Command substitution</strong><br><br>
    
<p>This has two forms, <code>$</code>(process) and
<code>`</code>process<code>`</code>.  They function identically; the first form has two
advantages: substitutions can be nested, since the end character is
different from the start character, and (because it uses a `<code>$</code>') it
reminds you that, like parameter substitutions, command substitutions can
take place inside double-quoted strings.  In that case, like most other
things in quotes, the result will be a single word; otherwise, the result
is split into words on any field separators you have defined, usually
whitespace or the null character.  I'll use the <code>args</code> function again:
<pre>

  % args() { print $# $*; }
  % args $(echo two words)
  2 two words
  % args "$(echo one word)"
  1 one word

</pre>

The first form will split on newlines, not just spaces, so an equivalent is
<pre>

  % args $(echo two; echo words)
  2 two words

</pre>

Thus entire screeds of text will be flattened out into a single line of
single-word command arguments.  By contrast, with the double quotes no
processing is done whatsoever; the entire output is put verbatim into one
command argument, with newlines intact.  This means that the quite common
case of wanting a single complete line from a file per command argument has
to be handled by trickery; zsh has such trickery, but that's the stuff of
chapter five.
<p>Note the difference from process substitution:  no intermediate file name
is involved, the output itself goes straight onto the command line.  This
form of substitution is considerably more common, and, unlike the other, is
available in all UNIX shells, though not in all shells with the more modern
form `<code>$</code>(<code>...</code>)'.
<p>The rule that the command line is evaluated only once, left to right, is
adhered to here, but it's a little more complicated in this case since the
expression being substituted is scanned <em>as a complete command line</em>, so
can include anything a command usually can, with all the rules of quoting
and expansion being applied.  So if you get confused about what a command
substitution is actually up to, you should extract the commands from it and
think of them as a command line in their own right.  When you've worked out
what that's doing, decide what its output will be, and that's the result
of the substitution.  You can ignore any error output; that isn't captured,
so will go straight to the terminal.  If you want to ignore it, use the
standard trick (see below) `<code>2&gt;/dev/null</code>' <em>inside</em> the command
substitution --- not on the main command line, where it won't work because
substitutions are performed before redirection of the main command line,
and in any case that will have the obvious side effect of changing the
error output from the command line itself.
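<p>A quick sketch of that rule: the error stream is only thrown away if the redirection sits inside the substitution.

```shell
# the braces group the two commands; only standard output is captured,
# and the 2>/dev/null inside the substitution discards the error text
result=$( { print good; print oops >&2; } 2>/dev/null )
print -r -- $result    # good
```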
<p>The only real catch with command substitution is that, as it is run as
a separate process --- even if it only involves shell builtins --- no effects
other than the output will percolate back to the main shell:
<pre>

  % print $(bar=value; print bar is $bar)
  bar is value
  % print bar is $bar
  bar is

</pre>

There is maybe room for a form of substitution that runs inside the shell,
instead; however, with modern computers the overhead in starting the extra
process is pretty small --- and in any case we seem to have run out of
new forms of syntax.
<p>Once you know and are comfortable with command substitution, you will
probably start using it all the time, so there is one good habit to get
into straight away.  A particularly common use is simply to put the
contents of a file onto the command line.
<pre>

  # Don't do this, do the other.
  process_cmd `cat file_arguments`

</pre>

But there's a shortcut.
<pre>

  # Do do this, don't do the other
  process_cmd $(&lt;file_arguments)

</pre>

It's not only less writing, it's more efficient:  zsh spots the special
syntax, with the <code>&lt;</code> immediately inside the parentheses, reads the file
directly without bothering to start `<code>cat</code>', and inserts its contents:
no external process is involved.  You shouldn't confuse this with `null
redirections' as described below:  the syntax is awfully similar,
unfortunately, but the feature shown here is not dependent on that other
feature being enabled or set up in a particular way.  In fact, this feature
works in ksh, which doesn't have zsh's null redirections.
<p>You can quote the file-reading form too, of course: in that case, the
contents of the file `<code>file_arguments</code>' would be passed as just one
argument, with newlines and spaces intact.
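<p>To see the two behaviours side by side, here is a sketch using the <code>args</code> function from earlier and a made-up scratch file:

```shell
print several words here > /tmp/demo$$   # hypothetical scratch file
args() { print $# $*; }
args $(</tmp/demo$$)      # 3 several words here
args "$(</tmp/demo$$)"    # 1 several words here
rm -f /tmp/demo$$
```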
<p>Sometimes, the rule about splitting the result of a command substitution
can get you into trouble:
<pre>

  % typeset foo=`echo words words`
  % print $foo
  words

</pre>

You probably expected the command substitution <em>not</em> to be split
here, but it was, and the shell executed <code>typeset</code> with the arguments
`<code>foo=words</code>' and `<code>words</code>'.  That's because in zsh arguments to
<code>typeset</code> are treated pretty much normally, except for some jiggery
pokery with tildes described below.  Other shells do this differently, and
future versions of zsh (from 4.1.1) will provide a compatibility option,
<code>KSH_TYPESET</code>.  For now, you need to use
quotes:
<pre>

  % typeset foo="`echo words words`"
  % print $foo
  words words

</pre>

<p>A really rather technical afterword: using `<code>$(cat file_arguments)</code>', you
might have counted two extra processes to be started, one being the usual
one for a command substitution, and another the `<code>cat</code>' process, since
that's an external command itself.  That would indeed be the obvious way of
doing it, but in fact zsh has an optimisation in cases like this: if it
knows the shell is about to exit --- in this case, the forked process which
is just interpreting the command line for the substitution --- it will not
bother to start a new process for the last command, and here just replace
itself with the <code>cat</code>.  So actually there's only one extra process here.
Obviously, an interactive shell is never replaced in this way, since
clairvoyance is not yet a feature of the shell.
<p><p><strong>Arithmetic substitution</strong><br><br>
    
<p>Arithmetic substitution is easy to explain:  everything I told you about
the <code>(( ... ))</code> command under numerical parameters, above, applies to
arithmetic substitution.  You simply bang a `<code>$</code>' in front, and it becomes
an expansion.
<pre>

  % print $(( 32 + 2 * 5 ))
  42

</pre>

You can perform everything inside arithmetic substitution that you
can inside the builtin, including assignments; the only difference is that
the status is not set, instead the value is put directly onto the command
line in place of the original expression.  As in C, the value of an
assignment is the value being assigned, `<code>$(( param = 3 + 2))</code>'
substitutes the value 5 as well as assigning it to <code>$param</code>.
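<p>A minimal illustration of an assignment inside the substitution:

```shell
print $(( param = 3 + 2 ))   # 5, the value of the assignment
print $param                 # 5 again, since the assignment really happened
```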
<p>By the way, there's an extra level of substitution involved in all
arithmetic expansions, since scalar parameters are subject to arithmetic
expansion when they're read in.  This is simple if they only contain
numbers, but less obvious if they contain complete expressions:
<pre>

  % foo=3+5
  % print $(( foo + 2))
  10

</pre>

The foo was evaluated into 8 before it was substituted in.  Note this means
there were two evaluations:  this doesn't work:
<pre>

  % foo=3+
  % print $(( foo 2 ))
  zsh: bad math expression: operand expected at `'

</pre>

--- the complaint here is about the missing operand after the `<code>+</code>' in
the <code>$foo</code>.  However the following <em>does</em> work:
<pre>

  % foo=3+
  % print $(( $foo 2 ))
  5

</pre>

That's because the scalar <code>$foo</code> is turned into <code>3+</code> first.  This is
more logical than you might think:  with the rule about left to right
evaluation, the <code>$foo</code> is picked up inside the <code>$((...))</code> and expanded
as an ordinary parameter substitution while the argument of <code>$((...))</code>
is being scanned.  Then the complete argument `<code>3+ 2</code>' is expanded as an
arithmetical expression.  (Unfortunately, zsh isn't always this logical;
there could easily be cases where we haven't thought it through --- you
should feel free to bring these to our attention.)
<p>There's an older form with single square brackets instead of double
parentheses; there is now no reason to use it, as it's non-standard, but
you may sometimes still meet it.
<p><p><strong>Brace expansion</strong><br><br>
    
<p>Brace expansion is a feature acquired from the C shell and its relatives,
although some versions of ksh have it, as it's a compile time option
there.  It's a useful way of saving you from typing the same thing twice on
a single command line:
<pre>

  % print -l {foo,bar}' is used far too often in examples'
  foo is used far too often in examples
  bar is used far too often in examples

</pre>

`<code>print</code>' is given two arguments which it is told to print out one per
line.  The text in quotes is common to both, but one has `<code>foo</code>' in
front, while the other has `<code>bar</code>' in front.  The brace expression can
equally be in the middle of an argument:  for example, a common use of this
among programmers is for similarly named source files:
<pre>

  % print zle_{tricky,vi,word}.c
  zle_tricky.c zle_vi.c zle_word.c

</pre>

As you see, you're not limited to two; you can have any number.  You can
quote a comma if you need a real one:
<pre>

  % print -l \`{\,,.}\'' is a punctuation character'
  `,' is a punctuation character
  `.' is a punctuation character

</pre>

The quotes needed quoting with a backslash to get them into the output.
The second comma is the active one for the braces.
<p>You can nest braces.  Once again, this is done left to right.  In
<pre>

  print {now,th{en,ere{,abouts}}}

</pre>

the first argument of the outer brace is `<code>now</code>', and the second is
`<code>th{en,ere{,abouts}}</code>'.  This brace expands to `<code>then</code>' and then the
expansion of `<code>there{,abouts}</code>', which is `<code>there thereabouts</code>' ---
there's nothing to stop you having an empty argument.  Putting this all
together, we have
<pre>

  print now then there thereabouts

</pre>

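<p>One further trick, not shown above: two brace expressions side by side combine like a product, each choice from the first paired with each choice from the second:

```shell
print {a,b}{1,2,3}
# a1 a2 a3 b1 b2 b3
```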
<p>There's more to know about brace expansion, which will appear in
chapter 5 on clever expansions.
<p><a name="l58"></a>
<h3>3.6.4: Filename Expansion</h3>
<p>It's a shame the names `filename expansion' and `filename generation' sound
so similar, but most people just refer to `<code>~</code> and <code>=</code> expansion' and
`globbing' respectively, which is all that is meant by the two.  The first
is by far the simpler.  The rule is:  unquoted `<code>~</code>'s at the beginning of
words perform expansion of named directories, which may be your home
directory:
<pre>

  % print ~
  /home/pws

</pre>

some user's home directory:
<pre>

  % print ~root
  /root

</pre>

(that may turn up `<code>/</code>' on your system), a directory named directly by you:
<pre>

  % t=/tmp
  % print ~t
  /tmp

</pre>

a directory you've recently visited:
<pre>

  % pwd
  /home/pws/zsh/projects/zshguide
  % print ~+
  /home/pws/zsh/projects/zshguide
  % cd /tmp
  % print ~-
  /home/pws/zsh/projects/zshguide

</pre>

or a directory in your directory stack:
<pre>

  % pushd /tmp
  % pushd ~
  % pushd /var/tmp
  % print ~2
  /tmp

</pre>

These forms were discussed above.  There are various extra rules.  You can
add a `<code>/</code>' after any of them, and the expansions still take place, so
you can use them to specify just the first part of a longer expression (as
you almost certainly have done with a simple `<code>~</code>').  If you quote the
`<code>~</code>' in any of the ways quoting normally takes place, the expansion
doesn't happen.
<p>A <code>~</code> in the middle of the word means something completely different, if
you have the <code>EXTENDED_GLOB</code> option set; if you don't, it doesn't mean
anything.  There are a few exceptions here; assignments are a fairly
natural one:
<pre>

  % foo=~pws
  % print $foo
  /home/pws

</pre>

(note that the `<code>~pws</code>', being unquoted, was expanded straight away at
the assignment, not at the print statement).  But the following works too:
<pre>

  % PATH=$PATH:~pws/bin

</pre>

because colons are special in assignments.  Note that this happens even if
the variable isn't a colon-separated path; the shell doesn't know what use
you're going to make of all the different variables.
<p>The companion of `<code>~</code>' is `<code>=</code>', which again has to occur at the start
of a word or assignment to be special.  The remainder of the word (here the
<em>entire</em> remainder, because directory paths aren't useful) is taken as
the name of an external command, and the word is expanded to the complete
path to that command, using <code>$PATH</code> just as if the command were to be
executed:
<pre>

  % print =ls
  /bin/ls

</pre>

and, slightly confusingly,
<pre>

  % foo==ls
  % print $foo
  /bin/ls

</pre>

where the two `<code>=</code>'s have two different meanings.  This form is useful
in a number of cases.  For example, you might want to look at or edit a
script which you know is in your path; the form
<pre>

  % vi =scriptname

</pre>

is more convenient than the more traditional
<pre>

  % vi `whence -p ls`

</pre>

where I put the `<code>-p</code>' in to force <code>whence</code> to follow the path,
ignoring builtins, functions, etc.  This brings us to another use for
`<code>=</code>' expansion,
<pre>

  % =ls

</pre>

is a neat and extremely short way of referring to an external command when
<code>ls</code> is usually a function.  It has some of the same effect
as `<code>command ls</code>', but is easier to type.
<p>In versions up to and including <code>4.0</code>, this syntax will also expand
aliases, so you need to be a bit careful if you really want a path to an
external command:
<pre>

  % alias foo='ls -F'
  % print =foo
  ls -F

</pre>

(Path expansion is done in preference, so you are safe if you use
<code>ls</code>, unless your <code>$PATH</code> is strange.)  Putting `<code>=foo</code>' at the
start of the command line doesn't work, and the reason why bears
examination:  <code>=</code>-expansion occurs quite late on, after ordinary alias
expansion and word splitting, so that the result is the single word
`<code>ls -F</code>', where the space is part of the word, which probably doesn't
mean anything (and if it does, don't lend me your computer when I need
something done in a hurry).  It's probably already obvious that alias
expansion here is more trouble than it's worth.  A less-than-exhaustive
search failed to find anyone who liked this feature, and it has been
removed from the shell from 4.1, so that `<code>=</code>'-expansion now only
expands paths to external commands.
<p>If you don't like <code>=</code>-expansion, you can turn it off by setting the
option <code>NO_EQUALS</code>.  One catch, which might make you want to do that, is
that the commands <code>mmv</code>, <code>mcp</code> and <code>mln</code>, which are a commonly used
though non-standard piece of free software, use `<code>=</code>' followed by a
number to replace a pattern, for example
<pre>

  mmv '*.c' '=1.old.c'

</pre>

renames all files ending with <code>.c</code> to end with <code>.old.c</code>.  If you were
not alert, you might forget to quote the second word.  Otherwise, however,
`<code>=</code>' isn't very common at the start of a word, so you're probably fairly
safe.  For a way to do that with zsh patterns, see the discussion of
the function <code>zmv</code> below (the answer is `<code>zmv '(*).c' '$1.old.c'</code>').
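<p>By the way, if you have neither <code>mmv</code> nor <code>zmv</code> to hand, an
ordinary loop with the `<code>%</code>' suffix-stripping form of parameter
expansion does the same rename; this sketch plays safe by working in a
scratch directory:

```shell
cd "$(mktemp -d)"          # scratch directory so nothing real is renamed
touch alpha.c beta.c
for f in *.c; do
  mv "$f" "${f%.c}.old.c"  # strip the `.c' suffix, append `.old.c'
done
ls                         # alpha.old.c and beta.old.c
```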
<p>Note that zsh is smart enough to complete the names of commands after an
`<code>=</code>' of the expandable sort when you hit TAB.
<p><a name="l59"></a>
<h3>3.6.5: Filename Generation</h3>
<p>Filename generation is exactly the same as `globbing':  the expanding of
any unquoted wildcards to match files.  This is only done in one directory
at a time.  So for example
<pre>

  print *.c

</pre>

won't match files in a subdirectory ending in `<code>.c</code>'.  However, it <em>is</em>
done on all parts of a path, so
<pre>

  print */*.c

</pre>

will match all `<code>.c</code>' files in all immediate subdirectories of the
current directory.  Furthermore, zsh has an extension --- one of its most
commonly used special features --- to match files in any subdirectory at
any depth, including the current directory: use two `<code>*</code>'s as part of the
path:
<pre>

  print **/*.c

</pre>

will match `<code>prog.c</code>', `<code>version1/prog.c</code>',
`<code>version2/test/prog.c</code>', `<code>oldversion/working/saved/prog.c</code>', and
so on.  I will talk about filename generation and other uses of zsh's
extremely powerful patterns at much greater length in chapter 5.  My
main thrust here is to fit it into other forms of expansion; the main
thing to remember is that it comes last, after everything has already
been done.
<p>So although you would certainly expect this to work,
<pre>

  print ~/*

</pre>

generating all files in your home directory, you now know why it does:  it is first
expanded to `<code>/home/pws/*</code>' (or wherever), then the shell scans down the
path until it finds a pattern, and looks in the directory it has reached
(<code>/home/pws</code>) for matching files.  Furthermore,
<pre>

  foo=~/
  print $foo*

</pre>

works.  However, as I explained in the last chapter, you need to be careful
with
<pre>

  foo=*
  print ~/$foo

</pre>

This just prints `<code>/home/pws/*</code>'.  To get the `<code>*</code>' from the parameter
to be a wildcard, you need to tell the shell explicitly that's what you
want:
<pre>

  foo=*
  print ~/${~foo}

</pre>

As also noted, other shells do expand the <code>*</code> as a wildcard anyway.  The
zsh attitude here, as with word splitting, is that parameters should do
exactly what they're told rather than waltz off generating extra words or
expansions.
<p>Be even more careful with arrays:
<pre>

  foo=(*)

</pre>

will expand the <code>*</code> immediately, in the current directory --- the
elements of the array assignment are expanded exactly like a normal command
line glob.  This is often very useful, but note the difference from scalar
assignments, which do other forms of expansion, but not globbing.
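<p>You can see the immediate expansion by counting the matches; a little
sketch in a scratch directory (the pattern is expanded when the array is
assigned, not when it is printed, and this much behaves the same way in
bash, although the scalar case differs between the shells as noted above):

```shell
cd "$(mktemp -d)"
touch one.c two.c notes.txt
files=(*.c)          # the glob expands right here, at assignment time
echo ${#files[@]}    # number of elements matched: 2
```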
<p>I'll mention a few possible traps for the unwary, which might confuse you
until you are a zsh globbing guru.  Firstly, parentheses actually have two
uses.  Consider:
<pre>

  print (foo|bar)(.)

</pre>

The first set of parentheses means `match either <code>foo</code> or <code>bar</code>'.  If
you've used <code>egrep</code>, you will probably be familiar with this.  The
second, however, simply means `match only regular files'.  The `<code>(.)</code>' is
called a `globbing qualifier', because it limits the scope of any matches so
far found.  For example, if either or both of <code>foo</code> and <code>bar</code> were
found, but were directories, they would not now be matched.  There are many
other possibilities for globbing qualifiers.  For now, the easiest way to
tell if something at the end is <em>not</em> a globbing qualifier is if it
contains a `<code>|</code>'.
<p>The second point is about forms like this:
<pre>

  print file-&lt;1-10&gt;.dat

</pre>

The `<code>&lt;</code>' and `<code>&gt;</code>' smell of redirection, as described next, but
actually the form `<code>&lt;</code>', optional start number, `<code>-</code>', optional finish
number, `<code>&gt;</code>' means match any positive integer in the range between the
two numbers, inclusive; if either is omitted, there is no limit on that
end, hence the cryptic but common `<code>&lt;-&gt;</code>' to match any positive integer
--- in other words, any group of decimal digits (bases other than ten are
not handled by this notation).  Older versions of the shell allowed the
form `<code>&lt;&gt;</code>' as a shorthand to match any number, but the overlap with
redirection was too great, as you'll see, so this doesn't work any more.
<p>Another two cryptic symbols are the two that do negation.  These only work
with the option `<code>EXTENDED_GLOB</code>' set:  this is necessary to get the most
out of zsh's patterns, but it can be a trap for the unwary by turning
otherwise innocuous characters into patterns:
<pre>

  print ^foo

</pre>

This means any file in the current directory <em>except</em> the file <code>foo</code>.
One way of coming unstuck with `<code>^</code>' is something like
<pre>

  stty kill ^u

</pre>

where you would hope `<code>^u</code>' means the control character Ctrl-u, i.e. ASCII
character 21.  But it doesn't, if <code>EXTENDED_GLOB</code> is set:  it means `any
file in the current directory except one called `<code>u</code>' ', which is
definitely a different thing.  The other negation operator isn't usually so
fraught, but it can look confusing:
<pre>

  print *.c~f*

</pre>

is a pattern of two halves; the shell tries to match `<code>*.c</code>', but rejects
any matches which also match `<code>f*</code>'.  Luckily, a `<code>~</code>' right at the
end isn't special, so
<pre>

 rm *.c~

</pre>

removes all files ending in `<code>.c~</code>' --- it wouldn't be very nice if it
matched all files ending in `<code>.c</code>' and treated the final `<code>~</code>' as an
instruction not to reject any, so it doesn't.  The most likely case I can
think of where you might have problems is with Emacs' numeric backup files,
which can have a `<code>~</code>' in the middle which you should quote.  There is no
confusion with the directory use of `<code>~</code>', however:  that only occurs at
the beginning of a word, and this use only occurs in the middle.
<p>The final oddments that don't fit into normal shell globbing are forms with
`<code>#</code>'.  These also require that <code>EXTENDED_GLOB</code> be set.  In the
simplest use, a `<code>#</code>' after a pattern says `match this zero or more
times'.  So `<code>(foo|bar)#.c</code>' matches <code>foo.c</code>, <code>bar.c</code>, <code>foofoo.c</code>,
<code>barbar.c</code>, <code>foobarfoo.c</code>, ...  With an extra <code>#</code>, the pattern before
(or single character, if it has no special meaning) must match at least
once.  The other use of `<code>#</code>' is in a facility called `globbing flags',
which look like `<code>(#X)</code>' where `<code>X</code>' is some letter, possibly followed
by digits.  These turn on special features from that point in the pattern
and are one of the newest features of zsh patterns; they will receive much
more space in chapter 5.
<p><a name="l60"></a>
<h2>3.7: Redirection: greater-thans and less-thans</h2>
<p>Redirection means retrieving input from some other file than the usual one,
or sending output to some other file than the usual one.  The simplest
examples of these are `<code>&lt;</code>' and `<code>&gt;</code>', respectively.
<pre>

  % echo 'This is an announcement' &gt;tempfile
  % cat &lt;tempfile &gt;newfile
  % cat newfile
  This is an announcement

</pre>

Here, <code>echo</code> sends its output to the file <code>tempfile</code>; <code>cat</code> took its
input from that file and sent its output --- the same as its input --- to
the file <code>newfile</code>; the second <code>cat</code> takes its input from <code>newfile</code>
and, since its output wasn't redirected, it appeared on the terminal.
<p>The other basic form of redirection is a pipe, using `<code>|</code>'.  Some people
loosely refer to all redirections as pipes, but that's rather confusing.
The input and output of a pipe are <em>both</em> programmes, unlike the case
above where one end was a file.  You've seen lots of examples already:
<pre>

  echo foo | sed 's/foo/bar/'

</pre>

Here, <code>echo</code> sends its output to the programme <code>sed</code>, which substitutes
foo by bar, and sends its own output to standard output.  You can chain
together as many pipes as you like; once you've grasped the basic behaviour
of a single pipe, it should be obvious how that works:
<pre>

  echo foo is a word | 
    sed 's/foo/bar/' | 
    sed 's/a word/an unword/'

</pre>

runs another <code>sed</code> on the output of the first one.  (You can actually
type it like that, by the way; the shell knows a pipe symbol can't be at
the end of a command.)  In fact, a single <code>sed</code> will suffice:
<pre>

  echo foo is a word |
    sed -e 's/foo/bar/' -e 's/a word/an unword/'

</pre>

has the same effect in this case.
<p>Obviously, all three forms of redirection only work if the programme in
question expects input from standard input, and sends output to standard
output.  You can't do:
<pre>

  echo 'edit me' | vi

</pre>

to edit input, since <code>vi</code> doesn't use the input sent to it; it always
deals with files.  Most simple UNIX commands can be made to deal with
standard input and output, however.  This is a big difference from other
operating systems, where getting programmes to talk to each other in an
automated fashion can be a major headache.
<p><a name="l61"></a>
<h3>3.7.1: Clobber</h3>
<p>The word `clobber', as in the option <code>NO_CLOBBER</code> which I mentioned in
the previous chapter, may be unfamiliar to people who don't use English as
their first language.  Its basic meaning is `hit' or `defeat' or
`destroy', as in `Itchy and Scratchy clobbered each other with mallets'.
If you do:
<pre>

  % echo first go &gt;file
  % echo second go &gt;file

</pre>

then <code>file</code> will contain only the words `second go'.  The first thing you
put into the file, `first go', has been clobbered.  Hence the
<code>NO_CLOBBER</code> option: if this is set, the shell will complain when you try
to overwrite the file.  You can use `<code>&gt;|file</code>' or `<code>&gt;! file</code>' to
override this.  You usually can't use `<code>&gt;!file</code>' because history
expansion will try to expand `<code>!file</code>' before the shell parses the line;
hence the form with the vertical bar tends to be more useful.
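<p>You can watch the option in action safely in a scratch directory.  The
sketch below uses the POSIX spelling <code>set -C</code> so that it also runs in
other Bourne-like shells; in zsh itself <code>setopt noclobber</code> has the
same effect:

```shell
cd "$(mktemp -d)"
set -C                          # turn noclobber on
echo 'first go' >file
echo 'second go' 2>/dev/null >file || echo 'refused to clobber'
echo 'third go' >|file          # the vertical bar overrides noclobber
cat file                        # -> third go
```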
<p><a name="l62"></a>
<h3>3.7.2: File descriptors</h3>
<p>UNIX-like systems refer to different channels such as input, output and
error by `file descriptors', which are small integers.  Usually three are
special: 0, standard input; 1, standard output; and 2, standard error.
Bourne-like shells (but not csh-like shells) allow you to refer to a
particular file descriptor, instead of standard input or output, by putting
the integer immediately before the `<code>&lt;</code>' or `<code>&gt;</code>' (no space is
allowed).  What's more, if the `<code>&lt;</code>' or `<code>&gt;</code>' is followed immediately
by `<code>&amp;</code>', a file descriptor can follow the redirection (the one
before is optional as usual).  A common use is:
<pre>

  % echo This message will go to standard error &gt;&amp;2

</pre>

The command sends its message to standard output, file descriptor 1.  As
usual, `<code>&gt;</code>' redirects standard output.  This time, however, it is
redirected not to a file, but to file descriptor 2, which is standard
error.  Normally this is the same device as standard output, but it can be
redirected completely separately.  So:
<pre>

  % { echo A message
  cursh&gt; echo An error &gt;&amp;2 } &gt;file
  An error
  % cat file
  A message

</pre>

Apologies for the slightly unclear use of the continuation prompt
`<code>cursh&gt;</code>': this guide goes into a lot of different formats, and some
are a bit finicky about long lines in preformatted text.  As pointed out
above, the `<code>&gt;file</code>' here will redirect all output from the stuff in
braces, just as if it were a single command.  However, the `<code>&gt;&amp;2</code>'
inside redirects the output of the second <code>echo</code> to standard 
error.  Since this wasn't redirected, it goes straight to the terminal.
<p>Note the form in braces in the previous example --- I'm going to use that
in a few more examples.  It simply sends something to standard output, and
something else to standard error; that's its only use.  Apart from that,
you can treat the bit in braces as a black box --- anything which can
produce both sorts of output.
<p>Sometimes you want to redirect both at once.  The standard Bourne-like way
of doing this is:
<pre>

  % { echo A message
  cursh&gt; echo An error &gt;&amp;2 } &gt;file 2&gt;&amp;1

</pre>

The `<code>&gt;file</code>' redirects standard output from the <code>{</code><em>...</em><code>}</code> to the
file; the following <code>2&gt;&amp;1</code> redirects standard error to wherever standard
output happens to be at that point, which is the same file.  This allows
you to copy two file descriptors to the same place.  Note that the order is
important; if you swapped the two around, `<code>2&gt;&amp;1</code>' would copy standard
error to the initial destination of standard output, which is the terminal,
before it got around to redirecting standard output.
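<p>The importance of the order is easy to demonstrate with a little
function that writes to both channels (the name <code>msgs</code> is made up
for the example):

```shell
cd "$(mktemp -d)"
msgs() { echo message; echo error >&2; }
msgs >both 2>&1     # stdout redirected first, so stderr follows it into the file
msgs 2>&1 >justout  # stderr copied to the old stdout (the terminal) first
wc -l both justout  # both: 2 lines; justout: only the 1 stdout line
```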
<p>Zsh has a shorthand for this borrowed from csh-like shells:
<pre>

  % { echo A message
  cursh&gt; echo An error &gt;&amp;2 } &gt;&amp;file

</pre>

is exactly equivalent to the form in the previous paragraph, copying
standard output and standard error to the same file.  There is obviously a
clash of syntax with the descriptor-copying mechanism, but if you don't
have files whose names are numbers you won't run into it.  Note that
csh-like shells don't have the descriptor-copying mechanism: the simple
`<code>&gt;&amp;</code>' and the same thing with pipes are the only uses of `<code>&amp;</code>' for
redirections, and it's not possible there to refer to particular file
descriptors.
<p>To copy standard error to a pipe, there are also two forms:
<pre>

  % { echo A message
  cursh&gt; echo An error &gt;&amp;2 } 2&gt;&amp;1 | sed -e 's/A/I/g'
  I message
  In error
  % { echo A message
  cursh&gt; echo An error &gt;&amp;2 } |&amp; sed -e 's/A/I/'
  I message
  In error

</pre>

In the first case, note that the pipe is opened before the other
redirection, so that `<code>2&gt;&amp;1</code>' copies standard error to the pipe, not the
original standard output; you couldn't put that after the pipe in any case,
since it would refer to the `<code>sed</code>' command's output.  The second way is
like csh; unfortunately, `<code>|&amp;</code>' has a different meaning in ksh (start a
coprocess), so zsh is incompatible with ksh in this respect.
<p>You can also close a file descriptor you don't need: the form `<code>2&lt;&amp;-</code>'
will close standard error for the command where it appears.
<p>One thing not always appreciated about redirections is that they can occur
anywhere on the command line, not just at the end.
<pre>

  % &gt;file echo foo
  % cat file
  foo

</pre>

<p><a name="l63"></a>
<h3>3.7.3: Appending, here documents, here strings, read write</h3>
<p>There are various other forms which use multiple `<code>&gt;</code>'s and `<code>&lt;</code>'s.
First,
<pre>

  % echo foo &gt;file
  % echo bar &gt;&gt;file
  % cat file
  foo
  bar

</pre>

The `<code>&gt;</code><code>&gt;</code>' appends to the file instead of overwriting it.  Note, however,
that if you use this a lot you may find there are neater ways of doing the
same thing.  In this example,
<pre>

  % { echo foo
  cursh&gt; echo bar } &gt;file
  % cat file
  foo
  bar

</pre>

Here, `<code>cursh&gt;</code>' is a prompt from the shell that it is waiting for you to
close the `<code>{</code>' construct which executes a set of commands in the current
shell.  This construct can have a redirection applied to the entire
sequence of commands: `<code>&gt;file</code>' after the closing brace therefore
redirects the output from both <code>echo</code>s.
<p>In the case of input, doubling the sign has a totally different effect.
The word after the <code>&lt;</code><code>&lt;</code> is not a file, but a string which will be
used to mark in the end of input.  Input is read until a line with only
this string is found:
<pre>

  % sed -e 's/foo/bar/' &lt;&lt;HERE
  heredoc&gt; This line has foo in it.
  heredoc&gt; There is another foo in this one.
  heredoc&gt; HERE
  This line has bar in it.
  There is another bar in this one.

</pre>

The shell prompts you with `<code>heredoc&gt;</code>' to tell you it is reading a
`here document', which is how this feature is referred to.  When it finds
the final string, in this case `<code>HERE</code>', it passes everything you have
typed as input to the command as if it came from a file.  The command in
this case is the stream editor, which has been told to replace the first
`<code>foo</code>' on each line with a `<code>bar</code>'.  (Replacing things with a bar
is a familiar experience from the city centre of my home town, Newcastle
upon Tyne.)
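<p>One detail the example doesn't show: parameters are expanded inside a
here document, unless you quote the delimiter word; that is standard
Bourne-like behaviour.  A quick sketch:

```shell
animal=cat
# Unquoted delimiter: $animal is expanded inside the document.
cat <<EOF
expanded: $animal
EOF
# Quoted delimiter: the document is passed through literally.
cat <<'EOF'
literal: $animal
EOF
```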
<p>So far, the features are standard in Bourne-like shells, but zsh has an
extension to here documents, sometimes referred to as `here strings'.
<pre>

  % sed -e 's/string/nonsense/' \
  &gt; &lt;&lt;&lt;'This string is the entire document.'
  This nonsense is the entire document.

</pre>

Note that the `<code>&gt;</code>' on the second line is a continuation prompt, not part
of the command line; it was just too long for the TeX version of this
document if I didn't split it.  This is a shorthand form of `here'
document if you just want to pass a single string to standard input.
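<p>Here-strings combine nicely with simple filters; a quick sketch (this
form also works in recent versions of bash and in ksh93):

```shell
# Pass a single string to a filter's standard input with no temporary file.
tr a-z A-Z <<<'shout this'    # -> SHOUT THIS
```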
<p>The final form uses both symbols: `<code>&lt;&gt;file</code>' opens the file for
reading and writing --- but only on standard input.  In other words, a
programme can now both read from and write to standard input.  This
isn't used all that often, and when you do use it you should remember
that you need to open standard output explicitly to the same file:
<pre>

  % echo test &gt;/tmp/redirtest
  % sed 's/e/Z/g' &lt;&gt;/tmp/redirtest 1&gt;&amp;0
  % cat /tmp/redirtest
  tZtst

</pre>

As standard input (the 0) was opened for writing, you can perform the
unusual trick of copying standard output (the 1) into it.  This is
generally not a particularly safe way of doing in-place editing,
however, though it seems to work fine with sed.  Note that in older
versions of zsh, `<code>&lt;&gt;</code>' was equivalent to `<code>&lt;-&gt;</code>', which is a
pattern that matches any number; this was changed quite some time ago.
<p><a name="l64"></a>
<h3>3.7.4: Clever tricks: exec and other file descriptors</h3>
<p>All Bourne-like shells have two other features.  First, the `command'
<code>exec</code>, which I described above as being used to replace the shell with
the command you give after it, can be used with only redirections after
it.  These redirections then apply permanently to the shell itself, rather
than temporarily to a single command.  So
<pre>

  exec &gt;file

</pre>

makes <code>file</code> the destination for standard output from that point on.  This
is most useful in scripts, where it's quite common to want to change the
destination of all output.
<p>The second feature is that you can use file descriptors which haven't even
been opened yet, as long as they are single digits --- in other words, you
can use numbers 3 to 9 for your own purposes.  This can be combined with
the previous feature for some quite clever effects:
<pre>

  exec 3&gt;&amp;1               # 3 refers to stdout
  exec &gt;file              # stdout goes to `file', 3 untouched
      # random commands output to `file'
  exec 1&gt;&amp;3               # stdout is now back where it was
  exec 3&gt;&amp;-               # file descriptor 3 closed to tidy up

</pre>

Here, file descriptor 3 has been used simply as a placeholder to remember
where standard output was while we temporarily divert it.  This is an
alternative to the `<code>{</code><em>...</em><code>} &gt;file</code>' trick.  Note that you can put
more than one redirection on the <code>exec</code> line: `<code>exec 3&gt;&amp;1 &gt;file</code>' also
works, as long as you keep the order the same.
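<p>Here is the same juggling written out as a runnable sketch, using a
scratch file so you can check afterwards that the diversion really was
temporary:

```shell
cd "$(mktemp -d)"
exec 3>&1             # remember the current stdout on fd 3
exec >capture         # divert stdout to the file `capture'
echo 'this goes into the file'
exec 1>&3 3>&-        # put stdout back, then close fd 3 to tidy up
echo 'this is back on the terminal'
cat capture           # -> this goes into the file
```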
<p><a name="l65"></a>
<h3>3.7.5: Multios</h3>
<p>Multios allow you to do an implicit `<code>cat</code>' (concatenate files) on input
and `<code>tee</code>' (send the same data to different files) on output.  They
depend on the option <code>MULTIOS</code> being set, which it is by default.  I
described this in the last chapter in discussing whether or not you should
have the option set, so you can look at the examples there.
<p>Here's one fact I didn't mention.  You use output multios like this:
<pre>

  command-generating-output &gt;file1 &gt;file2

</pre>

where the command's output is copied to both files.  This is done by a
process forked off by the shell:  it simply sits waiting for input, then
copies it to all the files in its list.  There's a problem in all versions
of the shell to date (currently 4.0.1):  this process is asynchronous, so
you can't rely on it having finished when the shell starts executing the
next command.  In other words, if you look at <code>file1</code> or <code>file2</code>
immediately after the command has finished, they may not yet contain all
the output because the forked process hasn't finished writing to them.
<p>This is really a bug, but for the time being you will have to live with it
as it's quite complicated to fix in all cases.  Multios are most useful as
a shorthand in interactive use, like so much of zsh; in a script or
function it is safer to use <code>tee</code>,
<pre>

  command-generating-output | tee file1 file2

</pre>

which does the same thing, but as <code>tee</code> is handled as a synchronous
process <code>file1</code> and <code>file2</code> are guaranteed to be complete when the
pipeline exits.
<p><a name="l66"></a>
<h2>3.8: Shell syntax: loops, (sub)shells and so on</h2>
<p><a name="l67"></a>
<h3>3.8.1: Logical command connectors</h3>
<p>I have been rather cavalier in using a couple of elements of syntax without
explaining them:
<pre>

  true  &amp;&amp;  print Previous command returned true
  false  ||  print Previous command returned false

</pre>

The relationship between `<code>&amp;&amp;</code>' and `<code>||</code>' and tests is fairly obvious,
but in this case they connect complete commands, not test arguments.
The `<code>&amp;&amp;</code>' executes the following command if the one before succeeded,
and the `<code>||</code>' executes the following command if the one before failed.
In other words, the first is equivalent to
<pre>

  if true; then
    print Previous command returned true
  fi

</pre>

but is more compact.
<p>There is a perennial argument about whether to use these or not.  In the
comp.unix.shell newsgroup on Usenet, you see people arguing that the
`<code>&amp;&amp;</code>' syntax is unreadable, and only an idiot would use it, while
other people argue that the full `<code>if</code>' syntax is slower and clumsier,
and only an idiot would use that for a simple test; but Usenet is like
that, and both answers are a bit simplistic.  On the one hand, the
difference in speed between the two forms is minute, probably measurable
in microseconds rather than milliseconds on a modern computer; the
scheduling of the shell process running the script by the operating
system is likely to make more difference if these are embedded inside a
much longer script or function, as they will be.  And on the other hand,
the connection between `<code>&amp;&amp;</code>' and a logical `and' is so strong in the
minds of many programmers that to anyone with moderate shell experience
they are perfectly readable.  So it's up to you.  I find I use the
`<code>&amp;&amp;</code>' and `<code>||</code>' forms for a pair of simple commands, but use
`<code>if</code>' for anything more complicated.
<p>I would certainly advise you to avoid chains like:
<pre>

  true || print foo &amp;&amp; print bar || false

</pre>

If you try that, you will see `<code>bar</code>' but not `<code>foo</code>', which is not
what a C programmer would expect.  Using the usual rules of precedence, you
would parse it as: either <code>true</code> must be true; or both the <code>print</code>
statements must be true; or the false must be true.  However, the shell
parses it differently, using these rules:
<ul>
  <li> If you encounter an `<code>&amp;&amp;</code>',
  <ul>
    <li> if the command before it (really the complete pipeline)
    succeeded, execute the command immediately after, and execute what
    follows normally, while
    <li> if the command failed, skip the next command and any others until
    an `<code>||</code>' is encountered, or until the group of commands is ended by
    a newline, a semicolon, or the end of an enclosing group.  Then execute
    whatever follows in the normal way.
  </ul>
  <li> If you encounter an `<code>||</code>',
  <ul>
    <li> if the command before it succeeded, skip the next command and any
    others until an `<code>&amp;&amp;</code>' is encountered, or until the end of the group,
    and execute what follows normally, while
    <li> if the command failed, execute the command immediately after the
    `<code>||</code>'.
  </ul>
</ul>
As you can see, the rule is completely symmetric; a simple summary is that
the logical connectors don't remember their past state.  So in the example
shown, the `<code>true</code>' succeeds, we skip `<code>print foo</code>' but execute
`<code>print bar</code>' and then skip <code>false</code>.  The expression returns status
zero because the last thing it executed did so.  Oddly enough, this is
completely standard behaviour for shells.  This is a roundabout way of
saying `don't use combined chains of `<code>&amp;&amp;</code>'s and `<code>||</code>'s unless you
think G&ouml;del's theorem is for sissies'.
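<p>You can verify those rules directly.  The chain from the text, written
with <code>echo</code> so it also runs in other Bourne-like shells, prints only
<code>bar</code>, and its overall status is zero because that <code>echo</code> was
the last command executed:

```shell
true || echo foo && echo bar || false   # prints: bar
echo "status: $?"                       # -> status: 0
```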
<p>Strictly speaking, the and's and or's come in a hierarchy of things which
connect commands.  They are above pipelines, which explains my remark
above --- an expression like `<code>echo $ZSH_VERSION | sed 's/dev//'</code>' is
treated as a single command between any logical connectors --- and they are
below newlines and semicolons --- an expression like `<code>true &amp;&amp; print yes;
false || print no</code>' is parsed as two distinct sets of logically connected
command sequences.  In the manual, a list is a complete set of commands
executed in one go:
<pre>

  echo foo; echo bar

  echo small furry animals

</pre>

--- a shell function is basically a glorified list with arguments and a
name.   A sublist is a set of commands up to a newline or
semicolon, in other words a complete expression possibly involving the
logical connectors:
<pre>

  show -nomoreproc | 
    grep -q foo &amp;&amp; 
    print The word '`foo'\' occurs.

</pre>

A pipeline is a chain of one or more commands connected
by `<code>|</code>', for example both individual parts of the previous sublist,
<pre>

  show -nomoreproc | grep -q foo

</pre>

and
<pre>

  print The word '`foo'\' occurs.

</pre>

count as pipelines.  A simple command is a single unit of execution with a
command name; reusing the same example, each of the following three is a
simple command:
<pre>

  show -nomoreproc
  grep -q foo
  print The word '`foo'\' occurs.

</pre>

<p>This means that in something like
<pre>

  print foo

</pre>

where the command is terminated by a newline and then executed in one go,
the expression is all of the above --- list, sublist, pipeline and simple
command.  Mostly I won't need to make the formal distinction; it sometimes
helps when you need to break down a complicated set of commands.  It's a
good idea, and usually possible, to write in such a way that it's obvious
how the commands break down.  It's not too important to know the details,
as long as you've got a feel for how the shell finds the next command.
<p><a name="l68"></a>
<h3>3.8.2: Structures</h3>
<p>I've shown plenty of examples of one sort of shell structure already, the
<code>if</code> statement:
<pre>

  if [[ black = white ]]; then
    print Yellow is no colour.
  fi

</pre>

The main points are: the `<code>if</code>' itself is followed by some command whose
return status is tested; a `<code>then</code>' follows as a new command; any number
of commands may follow, as complex as you like; the whole sequence is ended
by a `<code>fi</code>' as a command on its own.  You can write the `<code>then</code>' on a
new line if you like, I just happen to find it neater to stick it where it
is.  If you follow the form here, remember the semicolon before it; the
<code>then</code> must start a separate command.  (You can put another command
immediately after the <code>then</code> without a newline or semicolon, though,
although people tend not to.)
<p>The double-bracketed test is by far the most common thing to put here in
zsh, as in ksh, but any command will do; only the status is important.
<pre>

  if true; then
    print This always gets executed
  fi
  if false; then
    print This never gets executed
  fi

</pre>

Here, <code>true</code> always returns true (status 0), while <code>false</code> always
returns false (status 1 in zsh, although some versions return status 255
--- anything nonzero will do).  So the statements following the <code>print</code>s
are correct.
<p>The <code>if</code> construct can be extended by `<code>elif</code>' and `<code>else</code>':
<pre>

  read var
  if [[ $var = yes ]]; then
    print Read yes
  elif [[ $var = no ]]; then
    print Read no
  else
    print Read something else
  fi

</pre>

The extension is pretty straightforward.  You can have as many `<code>elif</code>'s
with different tests as you like; the code following the first test to
succeed is executed.  If no test succeeded, and there is an `<code>else</code>'
(there doesn't need to be), the code following that is executed.  Note
that the form of the `<code>elif</code>' is identical to that of `<code>if</code>',
including the `<code>then</code>', while the else just appears on its own.
<p>The <code>while</code>-loop is quite similar to <code>if</code>.  There are two
differences: the syntax uses <code>while</code>, <code>do</code> and <code>done</code> instead of
<code>if</code>, <code>then</code> and <code>fi</code>, and after the loop body is executed (if it is),
the test is evaluated again.  The process stops as soon as the test is
false.  So
<pre>

  i=0
  while (( i++ &lt; 3 )); do
    print $i
  done

</pre>

prints 1, then 2, then 3.  As with <code>if</code>, the commands in the middle can
be any set of zsh commands, so
<pre>

  i=0
  while (( i++ &lt; 3 )); do
    if (( i &amp; 1 )); then
      print $i is odd
    else
      print $i is even
    fi
  done

</pre>

tells you that 1 and 3 are odd while 2 is even.  Remember that the
indentation is irrelevant; it is purely there to make the structures more
easy to understand.  You can write the code on a single line by replacing
all the newlines with semicolons.
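<p>For instance, here is a cut-down version of the loop above collapsed onto
a single line.  This is only an illustrative sketch; it uses the portable
`<code>[</code>' test rather than the zsh forms, and the variable names are
arbitrary:

```shell
# The while-loop written on one line: newlines replaced by semicolons.
i=0; out=""
while [ "$i" -lt 3 ]; do i=$((i+1)); out="$out$i "; done
echo "$out"
# prints: 1 2 3
```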
<p>There is also an <code>until</code> loop, which is identical to the <code>while</code> loop
except that the loop is executed until the test is true.  `<code>until
[[</code><em>...</em>' is equivalent to `<code>while ! [[</code><em>...</em>'.
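<p>To make the equivalence concrete, here is the counting loop from above
rewritten with <code>until</code>; again a sketch using the portable
`<code>[</code>' test, so the style is not canonical zsh:

```shell
# Count upwards with 'until': the body runs as long as the test fails.
i=0
until [ "$i" -ge 3 ]; do
  i=$((i + 1))
  echo "iteration $i"
done
```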
<p>Next comes the <code>for</code> loop.  The normal case can best be demonstrated by
another example:
<pre>

  for f in one two three; do
    print $f
  done

</pre>

which prints out `<code>one</code>' on the first iteration, then `<code>two</code>', then
`<code>three</code>'.  The <code>f</code> is set to each of the three words in turn, and the
body of the loop executed for each.  It is very useful that the words after
the `<code>in</code>' may be anything you would normally have on a shell command
line.  So `<code>for f in *; do</code>' will execute the body of the loop once for
each file in the current directory, with the file available as <code>$f</code>, and
you can use arrays or command substitutions or any other kind of
substitution to generate the words to loop over.
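<p>As a small illustration of looping over generated words, here the list
after the `<code>in</code>' comes from an unquoted command substitution, whose
output is split into words:

```shell
# The words after 'in' can come from a command substitution;
# unquoted, its output is split on whitespace.
count=0
for word in $(echo one two three); do
  count=$((count + 1))
  echo "word $count: $word"
done
```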
<p>The <code>for</code> loop is so useful that the shell allows a shorthand that you
can use on the command line: try
<pre>

  for f in *; print $f

</pre>

and you will see the files in the current directory printed out, one per
line.  This form, without the <code>do</code> and the <code>done</code>, involves less
typing, but is also less clear, so it is recommended that you only use it
interactively, not in scripts or functions.  You can turn the feature off
with <code>NO_SHORT_LOOPS</code>.
<p>The <code>case</code> statement is used to test a pattern against a series of
possibilities until one succeeds.  It is really a short way of doing
a series of <code>if</code> and <code>elif</code> tests on the same pattern:
<pre>

  read var
  case $var in
    (yes) print Read yes
          ;;
    (no) print Read no
         ;;
    (*) print Read something else
        ;;
   esac

</pre>

is identical to the <code>if</code>/<code>elif</code>/<code>else</code> example above.  The <code>$var</code>
is compared against each pattern in turn; if one matches, the code
following that is executed --- then the statement is exited; no further
matches are looked for.  Hence the `<code>*</code>' at the end, which can match
anything, acts like the `<code>else</code>' of an <code>if</code> statement.
<p>Note the quirks of the syntax: the pattern to test must appear in
parentheses.  For historical reasons, you can miss out the left
parenthesis before the pattern.  I haven't done that mainly because
unbalanced parentheses confuse the system I am using for writing this
guide.  Also, note the double semicolon: this is the only use of double
semicolons in the shell.  That explains the fact that if you type
`<code>;;</code>' on its own the shell will report a `parse error'; it couldn't
find a <code>case</code> to associate it with.
<p>You can also use alternative patterns by separating them with a vertical
bar.  Zsh allows alternatives with extended globbing anyway; but this is
actually a separate feature, which is present in other shells which don't
have zsh's extended globbing feature; it doesn't depend on the
<code>EXTENDED_GLOB</code> option:
<pre>

  read var
  case $var in
    (yes|true|1) print Reply was affirmative
                 ;;
    (no|false|0) print Reply was negative
                 ;;
    (*) print Reply was cobblers
        ;;
  esac

</pre>

The first `<code>print</code>' is used if the value of <code>$var</code> read in was
`<code>yes</code>', `<code>true</code>' or `<code>1</code>', and so on.  Each of the separate items
can be a pattern, with any of the special characters allowed by zsh,
this time depending on the setting of the option <code>EXTENDED_GLOB</code>.
<p>The <code>select</code> loop is not used all that often, in my experience.  It is
only useful with interactive input (though the code may certainly appear in
a script or function):
<pre>

  select var in earth air fire water; do
    print You selected $var
  done

</pre>

This prints a menu; you must type 1, 2, 3 or 4 to select the corresponding
item; then the body of the loop is executed with <code>$var</code> set to the value
in the list corresponding to the number.  To exit the loop hit the break
key (usually <code>^G</code>) or end of file (usually <code>^D</code>: the feature is
so infrequently used that there is currently a bug whereby the shell
tells you to use `<code>exit</code>' to exit, which is nonsense).  If the user
entered a bogus value, then the loop is executed with <code>$var</code> set to the
empty string, though the actual input can be retrieved from <code>$REPLY</code>.
Note that the prompt printed for the user input is <code>$PROMPT3</code>, the only
use of this parameter in the shell: all normal prompt substitutions are
available.
<p>There is one final type of loop which is special to zsh, unlike the others
above.  This is `<code>repeat</code>'.  It can be used two ways:
<pre>

  % repeat 3 print Hip Hip Hooray
  Hip Hip Hooray
  Hip Hip Hooray
  Hip Hip Hooray

</pre>

Here, the first word after <code>repeat</code> is a count, which could be a
variable as normal substitutions are performed.  The rest of the line (or
until the first semicolon) is a command to repeat; it is executed
identically each time.
<p>The second form is a fully fledged loop, just like <code>while</code>:
<pre>

  % repeat 3; do
  repeat&gt; print Hip Hip Hooray
  repeat&gt; done
  Hip Hip Hooray
  Hip Hip Hooray
  Hip Hip Hooray

</pre>

which has the identical effect to the previous one.  The `<code>repeat&gt;</code>' is
the shell's prompt to show you that it is parsing the contents of a
`<code>repeat</code>' loop.
<p><a name="l69"></a>
<h3>3.8.3: Subshells and current shell constructs</h3>
<p>More catching up with stuff you've already seen.  The expression in
parentheses here:
<pre>

  % (cd ~; ls)
  &lt;all the files in my home directory&gt;
  % pwd
  &lt;where I was before, not necessarily ~&gt;

</pre>

is run in a subshell, as if it were a script.  The main difference is that
the shell inherits almost everything from the main shell in which you are
typing, including options settings, functions and parameters.  The most
important thing it doesn't inherit is probably information about jobs: if
you run <code>jobs</code> in a subshell, you will get no output; you can't use
<code>fg</code> to resume a job in a subshell; you can't use `<code>kill %</code><em>n</em>' to
kill a job (though you can still use the process ID); and so on.  By now
you should have some feel for the effect of running in a separate process.
Running a command, or set of commands, in a different directory, as in this
example, is one quite common use for this construct.
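<p>A quick way to convince yourself of the isolation is to change a
parameter inside the parentheses; the name <code>var</code> here is arbitrary:

```shell
# Assignments made in a subshell vanish when it exits.
var=outer
( var=inner; echo "inside:  $var" )
echo "outside: $var"
# prints: inside:  inner
#         outside: outer
```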
<p>On the other hand, the expression in braces here:
<pre>

  % {cd ~; ls}
  &lt;all the files in my home directory&gt;
  % pwd
  /home/pws

</pre>

is run in the current shell.  This is what I was blathering on about in
the section on redirection.  Indeed, unless you need some special effect
like redirecting a whole set of commands, you won't use the
current-shell construct.  The example here would behave just the same
way if the braces were missing.
<p>As you might expect, the syntax of the subshell and current-shell forms is
very similar.  You can use redirection with both, just as with simple
commands, and they can appear in most places where a simple command can
appear:
<pre>

  [[ $test = true ]] &amp;&amp; {
    print Hello.
    print Well, this is exciting.
  }

</pre>

That would be much clearer using an `<code>if</code>', but it works.  For some
reason, you often find expressions of this form in system start-up files
located in the directory <code>/etc/rc.d</code> or, on older systems, in files
whose names begin with `<code>/etc/rc.</code>'.  You can even do:
<pre>

  if { foo=bar; [[ $foo = bar ]] }; then
    print yes
  fi

</pre>

but that's also pretty gross.
<p>One use for <code>{</code><em>...</em><code>}</code> is to make sure a whole set of commands is
executed at once.  For example, if you copy a set of commands from a script
in one window and want them to be run in one go in a shell in another
window, you can do:
<pre>

   % {
   cursh&gt;            # now paste your commands in here...
    ...
   cursh&gt; }

</pre>

and the commands will only be executed when you hit return after the final
`<code>}</code>'.  This is also a workaround for some systems where cut and paste
has slightly odd effects due to the way different states of the terminal
are handled.  The current-shell construct is a little bit like an anonymous
function, although it doesn't have any of the usual features of functions
--- you can't pass it arguments, and variables declared inside aren't local
to that section of code.
<p><a name="l70"></a>
<h3>3.8.4: Subshells and current shells</h3>
<p>In case you're confused about what happens in the current shell and what
happens in a subshell, here's a summary.
<p>The following are run in the current shell.
<ol>
  <li> All shell builtins and anything which looks like one, such
     as a precommand modifier and tests with `<code>[[</code>'.
  <li> All complex statements and loops such as <code>if</code> and <code>while</code>.
     Tests and code inside the block must both be considered separately.
  <li> All shell functions.
  <li> All files run by `<code>source</code>' or `<code>.</code>' as well as startup files.
  <li> The code inside a `<code>{</code><em>...</em><code>}</code>'.
  <li> The right hand side of a pipeline: this is guaranteed in zsh, but
     don't rely on it for other shells.
  <li> All forms of substitution except <code>`</code><em>...</em><code>`</code>,
     <code>$</code>(<em>...</em>), <code>=</code>(<em>...</em>),
     <code>&lt;</code>(<em>...</em>) and <code>&gt;</code>(<em>...</em>).
</ol>
<p>The following are run in a subshell.
<ol>
  <li> All external commands.
  <li> Anything on the left of a pipe, i.e. all sections of a
     pipeline but the last.
  <li> The code inside a `<code></code>(<em>...</em>)'.
  <li> Substitutions involving execution of code,
     i.e. <code>`</code><em>...</em><code>`</code>, <code>$</code>(<em>...</em>),
     <code>=</code>(<em>...</em>), <code>&lt;</code>(<em>...</em>) and
     <code>&gt;</code>(<em>...</em>).  (TCL fans note that this is different
     from the `<code>[</code><em>...</em><code>]</code>' command substitution in that language.)
  <li> Anything started in the background with `<code>&amp;</code>' at the end.
  <li> Anything which has ever been suspended.  This is a little subtle:
     suppose you execute a set of commands in the current shell and
     suspend it with <code>^Z</code>.  Since the shell needs to return you to
     the prompt, it forks a subshell to remember the commands it was
     executing when you interrupted it.  If you use <code>fg</code> or <code>bg</code> to
     restart, the commands will stay in the subshell.  This is a special
     feature of zsh; most shells won't let you interrupt anything in the
     current shell like that, though you can still abort it with <code>^C</code>.
</ol>
With an alias, you can't tell where it will be executed --- you need to
find out what it expands to first.  The expansion naturally takes place in
the current shell.
<p>Of course, if for some reason the current set of commands is already
running in a subshell, it doesn't get magically returned to the current
shell --- so a shell builtin on the left hand side of a pipeline is running
in a subshell.  However, it doesn't get an extra subshell, as an external
command would.  What I mean is:
<pre>

  { print Hello; cat file } |
    while read line; do print $line; done

</pre>

The shell forks, producing a subshell, to execute the left hand side of the
pipeline, and that subshell forks to execute the <code>cat</code> external command,
but nothing else in that set of commands will cause a new subshell to be
created.
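<p>You can see the subshell at work by making an assignment on the left of
a pipe and checking the parameter afterwards (the names are arbitrary):

```shell
# The brace construct on the left of the pipe runs in a subshell,
# so the assignment to x is lost once the pipeline finishes.
x=0
{ x=1; echo changed; } | cat > /dev/null
echo "x is $x"
# prints: x is 0
```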
<p>(For the curious only: actually, that's not quite true, and I already
pointed this out when I talked about command substitutions: the shell keeps
track of occasions when it is in a subshell and has no more commands to
execute. In this case it will not bother forking to create a new process
for the <code>cat</code>, it will simply replace the subshell which is not needed
any more.  This can only happen in simple cases where the shell has no
clearing up to do.)
<p><a name="l71"></a>
<h2>3.9: Emulation and portability</h2>
<p>I described the options you need to set for compatibility with ksh in the
previous chapter.  Here I'm more interested in the best way of running ksh
scripts and functions.
<p>First, you should remember that because of all zsh's options you can't
assume that a piece of zsh code will simply run a piece of sh or ksh code
without any extra changes.  Our old friend <code>SH_WORD_SPLIT</code> is the most
common problem, but there are plenty of others.  In addition to options,
there are other differences which simply need to be worked around.  I
will list some of them a bit later.  Generally speaking, Bourne shell is
simple enough that zsh emulates it pretty well --- although beware in case
you are using bash extensions, since to many Linux users bash is the
nearest approximation to the Bourne shell they ever come across.  Zsh makes
no attempt to emulate bash, even though some of bash's features have been
incorporated.
<p>To make zsh emulate ksh or sh as closely as it knows how, there are various
things you can do.
<ol>
  <li> Invoke zsh under the name sh or ksh, as appropriate.  You can do
     this by creating a symbolic link from zsh to sh or ksh.  Then when
     zsh starts up all the options will be set appropriately.  If you
     are starting that shell from another zsh, you can use the feature
     of zsh that tricks a programme into thinking it has a different name:
     `<code>ARGV0=sh zsh</code>' runs zsh under the name sh, just like the symbolic
     link method.
  <li> Use `<code>emulate ksh</code>' at the top of the script or function you
     want to run.  In the case of a function, it is better to run
     `<code>emulate -L ksh</code>' since this makes sure the normal options will
     be restored when the function exits; this is irrelevant for a script
     as the options cannot be propagated to the process which ran the
     script.  You can also use the option `<code>-R</code>' after <code>emulate</code>, which
     forces more options to be like ksh; these extra options are generally
     for user convenience and not relevant to basic syntax, but in some
     cases you may want the extra cover provided.
<p>If it's possible the script may already be running under ksh, you
     can instead use
<pre>

  [[ -z $ZSH_VERSION ]] &amp;&amp; emulate ksh

</pre>

     or for sh, using the simpler test command there,
<pre>

  [ x$ZSH_VERSION = x ] &amp;&amp; emulate sh

</pre>

</ol>
Both these methods have drawbacks, and if you plan to be a heavy zsh user
there's no substitute for simply getting used to zsh's own basic syntax.
If you think there is some useful element of emulation we missed, however,
you should certainly tell the zsh-workers mailing list about it.
<p>Emulation of ksh88 is much better than emulation of ksh93.  Support for
the latter is gradually being added, but only patchily.
<p>There is no easy way of converting code written for any csh-like shell; you
will just have to convert it by hand.  See the FAQ for some hints on
converting aliases to functions.
<p><a name="l72"></a>
<h3>3.9.1: Differences in detail</h3>
<p>Here are some differences from ksh88 which might prove significant for ksh
programmers.  This is lifted straight from the corresponding section of the
FAQ; it is not complete, and indeed some of the `differences' could be
interpreted as bugs.  Those marked `*' perform in a ksh-like manner if the
shell is invoked with the name `ksh', or if `emulate ksh' is in effect.
<p><dl>
<li > Syntax:
<dl>
  <li >* Shell word splitting.
  <li >* Arrays are (by default) more csh-like than ksh-like:
      subscripts start at 1, not 0; <code>array[0]</code> refers to <code>array[1]</code>;
      <code>$array</code> refers to the whole array, not <code>$array[0]</code>;
      braces are unnecessary: <code>$a[1] == ${a[1]}</code>, etc.
      The <code>KSH_ARRAYS</code> option is now available.
  <li >  Coprocesses are established by <code>coproc</code>; <code>|&amp;</code> behaves like
      csh.  Handling of coprocess file descriptors is also different.
  <li >  In <code>cmd1 &amp;&amp; cmd2 &amp;</code>, only <code>cmd2</code> instead of the whole
      expression is run in the background in zsh.  The manual implies
      this is a bug.  Use <code>{ cmd1 &amp;&amp; cmd2 } &amp;</code> as a workaround.
</dl>
<li > Command line substitutions, globbing etc.:
<dl>
  <li >* Failure to match a globbing pattern causes an error (use
      <code>NO_NOMATCH</code>).
  <li >* The results of parameter substitutions are treated as plain text:
      <code>foo="*"; print $foo</code> prints all files in ksh but <code>*</code> in zsh
      (use <code>GLOB_SUBST</code>).
  <li >* <code>$PSn</code> do not do parameter substitution by default (use <code>PROMPT_SUBST</code>).
  <li >* Standard globbing does not allow ksh-style `pattern-lists'.
    See chapter 6 for a list of equivalent zsh forms.
    The <code>^</code>, <code>~</code> and <code>#</code> (but not <code>|</code>) forms require <code>EXTENDED_GLOB</code>.
    From version 3.1.3, the ksh forms are fully supported when the
    option <code>KSH_GLOB</code> is in effect; for previous versions you
    must use the equivalent zsh forms described in chapter 6.
<p>[1] Note that <code>~</code> is the only globbing operator to have a lower
      precedence than <code>/</code>.  For example, <code>**/foo~*bar*</code> matches any
      file in a subdirectory called <code>foo</code>, except where <code>bar</code>
      occurred somewhere in the path (e.g. <code>users/barstaff/foo</code> will
      be excluded by the <code>~</code> operator).  As the <code>**</code> operator cannot
      be grouped (inside parentheses it is treated as <code>*</code>), this is
      the way to exclude some subdirectories from matching a <code>**</code>.
  <li >  Unquoted assignments do file expansion after colons (intended for
      PATHs).
  <li >  <code>integer</code> does not allow <code>-i</code>.
  <li >  <code>typeset</code> and <code>integer</code> have special behaviour for
      assignments in ksh, but not in zsh.  For example, this doesn't
      work in zsh:
<pre>

        integer k=$(wc -l ~/.zshrc)
    
</pre>

      because the return value from <code>wc</code> includes leading
      whitespace which causes wordsplitting.  Ksh handles the
      assignment specially as a single word.
</dl>
<li > Command execution:
<dl>
  <li >* There is no <code>$ENV</code> variable (use <code>/etc/zshrc</code>, <code>~/.zshrc</code>;
      note also <code>$ZDOTDIR</code>).
  <li >  <code>$PATH</code> is not searched for commands specified
      at invocation without <code>-c</code>.
</dl>
<li > Aliases and functions:
<dl>
  <li >  The order in which aliases and functions are defined is significant:
      function definitions with () expand aliases.
  <li >  Aliases and functions cannot be exported.
  <li >  There are no tracked aliases: command hashing replaces these.
  <li >  The use of aliases for key bindings is replaced by `bindkey'.
  <li >* Options are not local to functions (use LOCAL_OPTIONS; note this
      may always be unset locally to propagate options settings from a
      function to the calling level).
</dl>
  <li > Traps and signals:
<dl>
  <li >* Traps are not local to functions.  The option LOCAL_TRAPS is
        available from 3.1.6.
  <li >  TRAPERR has become TRAPZERR (this was forced by UNICOS which
      has SIGERR).
</dl>
<li > Editing:
<dl>
  <li >  The options <code>emacs</code>, <code>gmacs</code>, <code>viraw</code> are not supported.
      Use bindkey to change the editing behaviour: <code>set -o {emacs,vi}</code>
      becomes <code>bindkey -{e,v}</code>; for gmacs, go to emacs mode and use
      <code>bindkey \^t gosmacs-transpose-characters</code>.
  <li >  The <code>keyword</code> option does not exist and <code>-k</code> is instead
      <code>interactivecomments</code>.  (<code>keyword</code> will not be in the next ksh
      release either.)
  <li >  Management of histories in multiple shells is different:
      the history list is not saved and restored after each command.
      The option <code>SHARE_HISTORY</code> appeared in 3.1.6 and is set in ksh
      compatibility mode to remedy this.
  <li >  <code>\</code> does not escape editing chars (use <code>^V</code>).
  <li >  Not all ksh bindings are set (e.g. <code>&lt;ESC&gt;#</code>; try <code>&lt;ESC&gt;q</code>).
  <li >* <code>#</code> in an interactive shell is not treated as a comment by
      default.
</dl>
<li > Built-in commands:
<dl>
  <li >  Some built-ins (<code>r</code>, <code>autoload</code>, <code>history</code>, <code>integer</code> ...)
      were aliases in ksh.
  <li >  There is no built-in command newgrp: use e.g. <code>alias
      newgrp="exec newgrp"</code>.
  <li >  <code>jobs</code> has no <code>-n</code> flag.
  <li >  <code>read</code> has no <code>-s</code> flag.
</dl>
<li > Other idiosyncrasies:
<dl>
  <li >  <code>select</code> always redisplays the list of selections on each loop.
</dl>
</dl>
<p><a name="l73"></a>
<h3>3.9.2: Making your own scripts and functions portable</h3>
<p>There are also problems in making your own scripts and functions
available to other people, who may have different options set.
<p>In the case of functions, it is always best to put `<code>emulate -L zsh</code>'
at the top of the function, which will reset the options to the default
zsh values, and then set any other necessary options.  It doesn't take
the shell a great deal of time to process these commands, so try and
get into the habit of putting them in any function you think may be used by
other people.  (Completion functions are a special case as the
environment is already standardised --- see chapter 6 for this.)
<p>The same applies to scripts, since if you run the script without using
the option `<code>-f</code>' to zsh the user's non-interactive startup files will
be run, and in any case the file <code>/etc/zshenv</code> will be run.  We urge
system administrators not to set options unconditionally in that file
unless absolutely necessary; but they don't always listen.  Hence an
<code>emulate</code> can still save a lot of grief.
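<p>A skeleton for such a function might look like this; the name
<code>mybackup</code> and its body are purely illustrative, the point being
the first two lines:

```shell
# A function that insulates itself from the caller's option settings.
mybackup() {
  emulate -L zsh       # reset options locally to the zsh defaults
  setopt extendedglob  # then enable just what this function needs
  cp -- "$1" "$1.bak"
}
```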
<p><a name="l74"></a>
<h2>3.10: Running scripts</h2>
<p>Here are some final comments on running scripts: they apply regardless
of the problems of portability, but you should certainly also be aware
of what I was saying in the previous section.
<p>You may be aware that you can force the operating system to run a script
using a particular interpreter by putting `<code>#!</code>' and the path to the
interpreter at the top of the script.  For example, a zsh script could
start with
<pre>

   #!/usr/local/bin/zsh
   print The arguments are $*

</pre>

assuming that zsh lives in the directory <code>/usr/local/bin</code>.  Then you
can run the script under its name as if it were an ordinary command.
Suppose the script were called `<code>scriptfile</code>' and in the current
directory, and you want to run it with the arguments `<code>one two
forty-three</code>'.  First you must make sure the script is executable:
<pre>

  % chmod +x scriptfile

</pre>

and then you can run it with the arguments:
<pre>

  % ./scriptfile one two forty-three
  The arguments are one two forty-three

</pre>

The shell treats the first line as a comment, since it begins with a
`<code>#</code>', but note it still gets read by the shell; the system
simply looks inside the file to see what's there, it doesn't change
the file just because the first line tells it to execute the shell.
<p>I put the `<code>./</code>' in front to refer to the current directory because I
don't usually have that in my path --- this is for safety, to avoid
running things which happen to have names like commands simply because
they were in the current directory.  But many people aren't so paranoid,
and if `<code>.</code>'  is in your path, you can omit the `<code>./</code>'.  Hence,
obviously, it can be anywhere else in your path: it is searched for as
an ordinary executable.
<p>The shell actually provides this mechanism even on operating systems
(now few and far between in the UNIX world) that don't have the feature
built into them.  The way this works is that if the shell found the
file, and it was executable, but running it didn't work, then it will
look for the <code>#!</code>, extract the name following and run (in this
example) `<code>/usr/local/bin/zsh</code> &lt;path&gt;/scriptfile <code>one two
forty-three</code>', where &lt;path&gt; is the path where the file was found.
This is, in fact, pretty much what the system does if it handles it
itself.
<p>Some shells search for scripts using the path when they are given as
filenames at invocation, but zsh happens not to.  In other words,
`<code>zsh scriptfile</code>' only runs <code>scriptfile</code> in the current directory.
<p>There are two other features you may want to be aware of.  Both are down
to the operating system, if that is what is responsible for the `<code>#!</code>'
trick (true of all the most common UNIX-like systems at the moment).
First, you are usually allowed to supply one, but only one, argument or
option in the `<code>#!</code>' line, thus:
<pre>

  #!/usr/local/bin/zsh -f
  print other stuff here

</pre>

which stops startup files other than <code>/etc/zshenv</code> from being run, but
otherwise works the same as before.  If you need more options, you
should combine them in the same word.  However, it's usually clearer,
for anything apart from <code>-f</code>, <code>-i</code> (which forces the shell into
interactive mode) and a few other options which need to take effect
immediately, to put a `<code>setopt</code>' line at the start of the body of the
script.  In a few versions of zsh, there was an unexpected consequence
of the fact that the line would only be split once: if you accidentally
left some spaces at the end of the line (e.g. `<code>#!/usr/local/bin/zsh
-f </code>') they would be passed down to the shell, which would report an
error, which was hard to interpret.  The spaces will still usually be
passed down, but the shell is now smart enough to ignore spaces in an
option list.
<p>The second point is that the length of the `<code>#!</code>' line which will be
evaluated is limited.  Often the limit is 32 characters in total.  That
means if your path to zsh is long, e.g.
`<code>/home/users/psychology/research/dreams/freud/solaris_2.5/bin/zsh</code>'
the system won't be able to find the shell.  Your only recourse is to
find a shorter path, or execute the shell directly, or some sneakier
trick such as running the script under <code>/bin/sh</code> and making that start
zsh when it detects that zsh isn't running yet.  That's a fairly nasty
way of doing it, but just in case you find it necessary, here's an
example:
<pre>

  #!/bin/sh

  if [ x$ZSH_VERSION = x ]; then
    # Put the right path in here ---
    # or just rely on finding zsh in
    # $path, since `exec' handles that.
    exec /usr/local/bin/zsh $0 "$@"
  fi

  print $ZSH_VERSION
  print Hello, this is $0
  print with arguments $*.

</pre>

Note that first `<code>$0</code>', which passes down the name of the script that was
originally executed.  Running this as `<code>testexec foo bar</code>' gives me
<pre>

  3.1.9-dev-8
  Hello, this is /home/pws/tmp/testexec
  with arguments foo bar.

</pre>

I hope you won't have to resort to that.  By the way, really,
excruciatingly old versions of zsh didn't have <code>$ZSH_VERSION</code>.  Rather
than fix the script, I suggest you upgrade the shell.  Also, on some old
Bourne shells you might need to replace <code>"$@"</code> with <code>${1+"$@"}</code>,
which is more careful about only putting in arguments if there were any
(this is the sort of thing we'll see in chapter 5).  Usually this isn't
necessary.
<p>You can use the same trick on ancient versions of UNIX which didn't
handle `<code>#!</code>'.  On some such systems, anything with a `<code>:</code>' as the
first character is run with the Bourne shell, so this serves as an
alternative to `<code>#!/bin/sh</code>', while on some Berkeley systems, a plain
`<code>#</code>' caused csh to be used.  In the second case, you will need to
change the syntax of the first test to be understood by both zsh and
csh.  I'll leave that as an exercise for the reader.  If you have perl
(very probable these days) you can look at the <code>perlrun</code> manual page,
which discusses the corresponding problem of starting perl scripts from
a shell, for some ideas.
<p>There's one other glitch you may come across.  Sometimes if you type the
name of a script which you know is in your path and is executable, the
shell may tell you `<code>file not found</code>', or some equivalent message.
What this usually means is that the <em>interpreter</em> wasn't found,
because you mistyped the line after the `<code>#!</code>'.  This confusing
message isn't the shell's fault: a lot of operating systems return the
same system error in this case as if the script were really not found.
It's not worth the shell searching the path to see if the script is
there, because in the vast majority of cases the error refers to the
programme in the execution path.  If the operating system returned the
more natural error, `<code>exec format error</code>', then the shell would know
that there was something wrong with the file, and could investigate; but
unfortunately life's not that simple.

<p><a name="c4"></a>
    

<hr>
<ul>
    <li> <a href="zshguide04.html">Next chapter</a>
    <li> <a href="zshguide02.html">Previous chapter</a>
    <li> <a href="zshguide.html">Table of contents</a>
</ul>
<hr>
</body>
</html>