Sophie

Sophie

distrib > Mageia > 1 > i586 > by-pkgid > d92aa75c2d384ff9f513aed09a46f703 > files > 593

parrot-doc-3.1.0-2.mga1.i586.rpm

# Copyright (C) 2008, Parrot Foundation.

=head1 PCT Tutorial Episode 1: Introduction

=head2 Introduction

This is the first episode in a tutorial series on building a compiler with the
Parrot Compiler Tools. If you're interested in virtual machines, you've
probably heard of the Parrot virtual machine. Parrot is a generic virtual
machine designed for dynamic languages. This is in contrast with the Java
virtual machine (JVM) and Microsoft's B<C>ommon B<L>anguage B<R>untime (CLR), both of
which were designed to run static languages. Both the JVM and Microsoft
(through the Dynamic Language Runtime -- DLR) are adding support for dynamic
languages, but their primary focus is still static languages.

=head2 High Level Languages

The main purpose of a virtual machine is to run programs. These programs are
typically written in some B<H>igh B<L>evel B<L>anguage (HLL). Some well-known dynamic
languages (sometimes referred to as scripting languages) are Lua, Perl, PHP,
Python, Ruby, and Tcl. Parrot is designed to be able to run all these languages.
Each language that Parrot hosts, needs a compiler to parse the syntax of the
language and generate Parrot instructions.

If you've never implemented a programming language (and maybe even if you have
implemented a language), you might consider writing a compiler a bit of a black
art. I know I did when I became interested. And you know what, it is. Compilers
are complex programs, and implementing a language can be very difficult.

The Facts: 1) Parrot is suitable for running virtually any dynamic language
known, but before doing so, compilers must be written, and 2) writing compilers
from scratch is rather difficult.

=head2 The Parrot Compiler Toolkit

Enter the B<P>arrot B<C>ompiler B<T>oolkit (PCT). In order to make Parrot an interesting
target for language developers, the process of constructing a compiler should be
supported by the right tools. Just as any construction task becomes much easier
if you have the right tools (you wouldn't build a house using only your bare
hands, would you?), the same is true for constructing a compiler. The PCT was
designed to do just that: provide powerful tools to make writing a compiler for
Parrot childishly easy.

This tutorial introduces the PCT by showing how a simple case study language is
implemented for Parrot. The sample language is not
as complex as a real-world language, but is interesting enough to whet your
appetite and show the power of the PCT.

This tutorial also presents some
exercises which you can explore in order to learn more details of the PCT not
covered in this tutorial.

=head2 Squaak: A Simple Language

The case study language is named Squaak. We will be implementing on Parrot
a full-fledged compiler that can compile a Squaak program from source into
Parrot Intermediate Representation (PIR) or run a Squaak program immediately.
The compiler can also be used as an interactive interpreter, REPL, for Squaak.

Squaak demonstrates some common language constructs,
but is lacking some other, seemingly simple, features.
For instance, our language will have no return, break or
continue statements (or equivalents in your favorite syntax).

Squaak has the following features:

=over 4

=item * global and local variables

=item * basic types: integer, floating-point and strings

=item * aggregate types: arrays and hash tables

=item * operators: +, -, /, *, %, <, <=, >, >=, ==, !=, .., and, or, not

=item * subroutines and parameters

=item * assignments and various control statements, such as "if" and "while"

=back

As you can see, a number of common (more advanced) features are missing.
Most notable are:

=over 4

=item * classes and objects

=item * control flow statements such as break and return

=item * advanced control statements such as switch

=item * closures (nested subroutines and accessing local variables in an outer scope)

=back

=head2 The Tools

The Parrot Compiler Toolkit consists of the following tools:

=over 4

=item B<N>ot B<Q>uite B<P>erl (6) (NQP-rx).

NQP is a lightweight language inspired by Perl 6 which can be used to write the
methods that must be executed during the parsing phase, just as you can write
actions in a Yacc/Bison input file. It also provides the regular expression engine we'll use to
write our grammar. In addition to the capabilities of Perl 5's regexes, the Perl 6 regexes that NQP
implements can be used to define language grammars. (Check the references for the specification.)

=item B<P>arrot B<A>bstract B<S>yntax B<T>ree (PAST).

The PAST nodes are a set of classes defining generic abstract syntax tree nodes
that represent common language constructs.

=item HLL::Compiler class.

This class is the compiler driver for any PCT-based compiler.

=back

=head2 Getting Started

For this tutorial, it is assumed you have successfully compiled parrot
(and maybe even run the test suite).

If, after reading this tutorial, you feel like
contributing to one of the already implemented languages, you can check out the mailing list or
join IRC (see the references section for details).

Parrot comes with a Perl 5  script that generates the necessary files for a
language implementation. In order to generate these files for our sample language,
go the Parrot's root directory and type:

 $ perl tools/dev/mk_language_shell.pl Squaak ~/src/squaak

(Note: if you're on Windows, you should use backslashes.) This will generate the
files in the directory F<~/src/squaak>. The name of the language will be Squaak.

After this, go to the directory F<~/src/squaak> and type:

 $ parrot setup.pir test

This will compile the grammar and the actions and run the test suite.
For running F<setup.pir> you can either use an installed parrot executable
from your distribution or the one you have just compiled.

If you want more
information on what files are being generated, please check out the references
at the end of this episode or read the documentation included in the file
F<tools/dev/mk_language_shell.pl>.

Note that we didn't write a single line of code, and already we have the basic
infrastructure in place to get us started. Of course, the generated compiler
doesn't even look like the language we will be implementing, but that's ok for
now. Later we'll adapt the grammar to accept our language.

Now you might want to actually run a simple script with this compiler. Launch
your favorite editor, and put in this statement:

 say "Squaak!";

Save it the as file F<test.sq> and type:

 $ ./installable_squaak test.sq

"installable_squaak" is a "fake-cutable" an executable that bundles the Parrot interpreter and the
compiled bytecode for a program to allow treating a Parrot program as a normal executable program.
This will run Parrot, specifying squaak.pbc as the file to be run by Parrot,
which takes a single argument: the file test.sq. If all went well, you should
see the following output:

 $ ./installable_squaak test.sq
 Squaak!

Instead of running a script file, you can also run the Squaak compiler as an
interactive interpreter. Run the Squaak compiler without specifying a script
file, and type the same statement as you wrote in the file:

 $ ./installable_squaak
 say "Squaak!";

which will print:

 Squaak!

=head2 What's next?

This first episode of this tutorial is mainly an overview of what will be
coming. Hopefully you now have a global idea of what the Parrot Compiler Tools
are, and how they can be used to build a compiler targeting Parrot. If you want
to check out some serious usage of the PCT, check out Rakudo (Perl 6 on Parrot)
at http://rakudo.org/ or Pynie (Python on Parrot) at http://code.google.com/p/pynie/ .

The next episodes will focus on the step-by-step implementation of our language,
including the following topics:

=over 4

=item structure of PCT-based compilers

=item using NQP-rx rules to define the language grammar

=item implementing operator precedence using an operator precedence table

=item using NQP to write embedded parse actions

=item implementing language library routines

=back

In the mean time, experiment for yourself. You are welcome to join us on IRC
(see the References section for details). Any feedback on this tutorial is
appreciated.

=head2 Exercises

The exercises are provided at the end of each episode of this tutorial. In
order to keep the length of this tutorial somewhat acceptable, not everything
can be discussed in full detail. With episode 3 the answers and/or solutions
to these exercises are at the end of each episode. The answer of the exercise
from episode 1 is at the end of episode 2.

=head3 Advanced interactive mode.

Launch your favorite editor and look at the file Compiler.pm in the directory
F<~/src/squaak/src/Squaak/>. This file contains the main function (entry point) of the
compiler. The class HLLCcompiler defines methods to set a command-line banner
and prompt for your compiler when it is running in interactive mode. For
instance, when you run Python in interactive mode, you'll see:

 Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on
 win32 Type "help", "copyright", "credits" or "license" for more information.

or something similar (depending on your Python installation and version).
This text is called the command line banner. And while running in interactive
mode, each line will start with:

 >>>

which is called a prompt. For Squaak, we'd like to see the following when
running in interactive mode (of course you can change this according to your
personal taste):

 $ ./installable_squaak
 Squaak for Parrot VM.
 >

Add code to the file squaak.pir to achieve this.

Hint 1: Look in the INIT block.

Hint 2: Note that only double-quoted strings in NQP can interpret
escape-characters such as '\n'.

Hint 3: The functions to do this are documented in
F<compilers/pct/src/PCT/HLLCompiler.pir>.

=head2 References

=over 4

=item * Parrot mailing list: parrot-dev@lists.parrot.org

=item * IRC: join #parrot on irc.parrot.org

=item * Getting started with PCT: docs/pct/gettingstarted.pod

=item * Parrot Abstract Syntax Tree (PAST): docs/pct/past_building_blocks.pod

=item * Operator Precedence Parsing with PCT: docs/pct/pct_optable_guide.pod

=item * Perl 6/NQP rules syntax: Synopsis 5 at http://perlcabal.org/syn/S05.html or http://svn.pugscode.org/pugs/docs/Perl6/Spec/S05-regex.pod

=item * List of HLL projects: http://trac.parrot.org/parrot/wiki/Languages

=back

=cut