<HTML><HEAD><TITLE>Clara Book</TITLE></HEAD>
<BODY BGCOLOR=#D0D0D0>
<TABLE WIDTH=100% BORDER=1 BGCOLOR=#E2D3FC><TR><TD><CENTER><H1><BR>Clara OCR Developer's Guide<BR></H1></CENTER></TD></TR></TABLE>
<P>
<CENTER>
[<A href=index.html>Main</A>]
[<A href=clara-faq.html>FAQ</A>]
[<A href=clara-tut.html>Tutorial</A>]
[<A href=clara-adv.html>User's Manual</A>]
[<A href=clara-dev.html>Developer's Guide</A>]
</CENTER>

<P>
Welcome. Clara OCR is a free OCR program, written for systems supporting
the C library and the X Window System. Clara OCR is intended for the
cooperative OCR of books. There are some screenshots available at
<A HREF=http://www.claraocr.org/>http://www.claraocr.org/</A>.

<P>
This documentation is extracted automatically from the comments
of the Clara OCR source code. It is known as "The Clara OCR
Developer's Guide". It's currently unfinished. First-time users
are invited to read "The Clara OCR Tutorial". There is also an
advanced manual known as "The Clara OCR Advanced User's Manual".

<P>

<P>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#79BEC6><FONT SIZE=+1><B> CONTENTS</B></FONT></TD></TR></TABLE>
<UL>
<P>
<LI> <A HREF=#1.>1. Introducing the source code</A>
<UL>
<P>
<LI> <A HREF=#1.1>    1.1 Language and environment</A>
<LI> <A HREF=#1.2>    1.2 Modularization</A>
<LI> <A HREF=#1.3>    1.3 The memory allocator</A>
<LI> <A HREF=#1.4>    1.4 Security notes</A>
<LI> <A HREF=#1.5>    1.5 Runtime index checking</A>
<LI> <A HREF=#1.6>    1.6 Background operation</A>
<LI> <A HREF=#1.7>    1.7 Global variables</A>
<LI> <A HREF=#1.8>    1.8 Path variables</A>
<LI> <A HREF=#1.9>    1.9 Bitmaps</A>
<LI> <A HREF=#1.10>    1.10 Execution model</A>
<LI> <A HREF=#1.11>    1.11 Return codes</A>
<P>
</UL>
<LI> <A HREF=#2.>2. Internal representation of pages</A>
<UL>
<P>
<LI> <A HREF=#2.1>    2.1 Closures</A>
<LI> <A HREF=#2.2>    2.2 Symbols</A>
<LI> <A HREF=#2.3>    2.3 The sdesc structure and the mc array</A>
<LI> <A HREF=#2.4>    2.4 The preferred symbols</A>
<LI> <A HREF=#2.5>    2.5 Font size</A>
<LI> <A HREF=#2.6>    2.6 Symbol alignment</A>
<LI> <A HREF=#2.7>    2.7 Words and lines</A>
<LI> <A HREF=#2.8>    2.8 Acts and transliterations</A>
<LI> <A HREF=#2.9>    2.9 Symbol transliterations</A>
<LI> <A HREF=#2.10>    2.10 Transliteration preference</A>
<LI> <A HREF=#2.11>    2.11 Transliteration class computing</A>
<LI> <A HREF=#2.12>    2.12 The zone</A>
<P>
</UL>
<LI> <A HREF=#3.>3. Heuristics</A>
<UL>
<P>
<LI> <A HREF=#3.1>    3.1 Skeleton pixels</A>
<LI> <A HREF=#3.2>    3.2 Symbol pairing</A>
<LI> <A HREF=#3.3>    3.3 The build step</A>
<LI> <A HREF=#3.4>    3.4 The function review_tr</A>
<LI> <A HREF=#3.5>    3.5 Resetting</A>
<LI> <A HREF=#3.6>    3.6 Synchronization</A>
<LI> <A HREF=#3.7>    3.7 The function list_cl</A>
<P>
</UL>
<LI> <A HREF=#4.>4. The GUI</A>
<UL>
<P>
<LI> <A HREF=#4.1>    4.1 Main characteristics</A>
<LI> <A HREF=#4.2>    4.2 Geometry of the application window</A>
<LI> <A HREF=#4.3>    4.3 Geometry of windows</A>
<LI> <A HREF=#4.4>    4.4 Scrollbars</A>
<LI> <A HREF=#4.5>    4.5 Displaying bitmaps</A>
<LI> <A HREF=#4.6>    4.6 HTML windows overview</A>
<LI> <A HREF=#4.7>    4.7 Graphic elements</A>
<LI> <A HREF=#4.8>    4.8 XML support</A>
<LI> <A HREF=#4.9>    4.9 Auto-submission of forms</A>
<P>
</UL>
<LI> <A HREF=#5.>5. The Clara API</A>
<UL>
<P>
<LI> <A HREF=#5.1>    5.1 Redraw flags</A>
<LI> <A HREF=#5.2>    5.2 OCR statuses</A>
<LI> <A HREF=#5.3>    5.3 The function setview</A>
<LI> <A HREF=#5.4>    5.4 The function redraw (to be written)</A>
<LI> <A HREF=#5.5>    5.5 The function show_hint</A>
<LI> <A HREF=#5.6>    5.6 The function start_ocr</A>
<P>
</UL>
<LI> <A HREF=#6.>6. How to change the source code (examples)</A>
<UL>
<P>
<LI> <A HREF=#6.1>    6.1 How to add a bitmap comparison method</A>
<LI> <A HREF=#6.2>    6.2 How to write a bitmap comparison function</A>
<LI> <A HREF=#6.3>    6.3 How to add an application button</A>
<P>
</UL>
<LI> <A HREF=#7.>7. Bugs and TODO list</A>
<UL>
<P>
</UL>
<LI> <A HREF=#8.>8. AVAILABILITY</A>
<UL>
<P>
</UL>
<LI> <A HREF=#9.>9. CREDITS</A>
<UL>
</UL>
</UL>
<A NAME=1.>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#79BEC6><FONT SIZE=+1><B>1. Introducing the source code</B></FONT></TD></TR></TABLE>
<P>
This Guide is a collection of entry points to the Clara OCR
source code. Some notes explain specific details about how this
or that feature was implemented. Others are higher-level
descriptions of how an entire subsystem works.

<P>
<A NAME=1.1>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>1.1 Language and environment</B></FONT></TD></TR></TABLE>
<P>
Clara OCR is written in ANSI C (with some GNU extensions) and
requires the services of the C library and Xlib. Development
is done on 32-bit Intel GNU/Linux (various distributions),
using GCC, GNU Make, Bash, XFree86 and Perl 5 (required
for producing the documentation).

<P>
<A NAME=1.2>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>1.2 Modularization</B></FONT></TD></TR></TABLE>
<P>
The Clara source code started, of course, as a single file
named clara.c. At some point we divided it into smaller
pieces. Currently there are 16 files:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

  book.c     .. Documentation only
  build.c    .. The function build
  clara.c    .. Startup and OCR run control
  cml.c      .. ClaraML dumper and recover
  common.h   .. Common declarations
  consist.c  .. Consistency tests
  event.c    .. GUI initialization and event handler
  gui.h      .. Declarations that depend on X11
  html.c     .. HTML generation and parse
  pbm2cl.c   .. Import PBM
  pattern.c  .. Book font stuff
  redraw.c   .. The function redraw
  skel.c     .. Skeleton computation
  symbol.c   .. Symbol stuff
  revision.c .. Revision procedures
  welcome.c  .. Welcome stuff</PRE>
</TD></TR></TABLE></CENTER>
Throughout this document we refer not to these files, but to
identifiers (names of functions and variables).

<P>
Note that there are only two headers: common.h and gui.h.
Maintaining one header for each module would be complex. Most
functions are not prototyped, but we intend to prototype all of
them in the near future.

<P>

<P>
<A NAME=1.3>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>1.3 The memory allocator</B></FONT></TD></TR></TABLE>
<P>
Clara OCR relies on the memory allocator both for allocating and
resizing some large blocks used by the main data structures, and
for allocating a large number of small arrays. Currently Clara OCR
does not include or use a special memory allocator, but implements an
interface to realloc. The alloca function is also used in some places
in the code, generally to allocate buffers for sorting arrays.

<P>
The interface is the function c_realloc. The function c_free must be
used to free the blocks allocated or resized by c_realloc. In the near
future, c_realloc will build a list of the currently allocated blocks,
their sizes and some additional information, in order to help trace flaws.
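<P>
As an illustration only, a thin interface over realloc could look like the
sketch below. The names sketch_realloc and sketch_free, their signatures and
the policy of exiting on failure are assumptions made for this example; they
are not the actual c_realloc and c_free.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for c_realloc: a thin wrapper over realloc.
   The name, the signature and the failure policy are assumptions. */
void *sketch_realloc(void *p, size_t size)
{
    void *q = realloc(p, size);

    if (q == NULL && size != 0) {
        fprintf(stderr, "memory exhausted\n");
        exit(5);                /* 5 is "memory exhausted" in section 1.11 */
    }
    return q;
}

/* Hypothetical stand-in for c_free: releases a block obtained above. */
void sketch_free(void *p)
{
    free(p);
}
```

<P>
Routing every allocation through one pair of functions gives a single place
where bookkeeping (block list, sizes) could later be added.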

<P>

<P>
<A NAME=1.4>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>1.4 Security notes</B></FONT></TD></TR></TABLE>
<P>
Concerning security, the following criteria are being used:

<P>
1. String operations are generally performed using services that
accept a size parameter, like snprintf or strncpy, except when the code
itself is simple and guarantees that an overflow won't occur.

<P>
2. The CGI clara.pl invokes write privileges through sclara, a program
specially written to perform only a small set of simple operations
required for the operation of the Clara OCR web interface.

<P>
The following should be done:

<P>
1. Memory blocks should be cleared before calling free().
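<P>
The two habits above (size-limited string operations and clearing blocks
before freeing them) can be sketched as follows. The helper names
bounded_copy and clear_and_free are hypothetical and do not come from the
Clara OCR sources.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copy src into dst (capacity n bytes) without overflowing it.
   Returns 1 if the whole string fit, 0 if it was truncated. */
int bounded_copy(char *dst, size_t n, const char *src)
{
    int r = snprintf(dst, n, "%s", src);

    return r >= 0 && (size_t) r < n;
}

/* Clear a block of n bytes and then release it. */
void clear_and_free(void *p, size_t n)
{
    if (p != NULL) {
        memset(p, 0, n);
        free(p);
    }
}
```

<P>
Note that an optimizing compiler is allowed to drop a memset whose result is
never read, so code that depends on the clearing may need extra care.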

<P>

<P>
<A NAME=1.5>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>1.5 Runtime index checking</B></FONT></TD></TR></TABLE>
<P>
Naive support for runtime index checking is provided through the
macro checkidx. This checking is performed only if the code is
compiled with the macro MEMCHECK defined and the command-line switch
'-X 1' is used.

<P>
In fact, only those points in the source code where the macro checkidx
is explicitly used will perform index checking. We've added calls to
checkidx in some critical functions because of their complexity, or
because segfaults had already been detected there.
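<P>
A macro of this kind might be written as in the sketch below. The macro name
CHECKIDX, its parameters, the flag idx_check_enabled (standing in for the
effect of '-X 1') and the exit code are all assumptions; the real checkidx
may differ.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical runtime flag, standing in for the effect of '-X 1'. */
int idx_check_enabled = 0;

/* Returns 1 if 0 <= i < size, 0 otherwise. */
int idx_ok(long i, long size)
{
    return i >= 0 && i < size;
}

#ifdef MEMCHECK
/* Checking compiled in: test the index only when enabled at runtime. */
#define CHECKIDX(i, size, where) \
    do { \
        if (idx_check_enabled && !idx_ok((i), (size))) { \
            fprintf(stderr, "bad index %ld at %s\n", (long) (i), (where)); \
            exit(4); /* assumed: reported as an internal error */ \
        } \
    } while (0)
#else
/* Checking compiled out: the macro expands to nothing. */
#define CHECKIDX(i, size, where) ((void) 0)
#endif
```

<P>
A call such as CHECKIDX(k, n, "some function") placed before an array access
costs nothing when MEMCHECK is not defined.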

<P>
<A NAME=1.6>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>1.6 Background operation</B></FONT></TD></TR></TABLE>
<P>
Clara OCR decides at runtime whether the GUI will be used. So
even when using Clara OCR in batch mode (-b command-line switch),
linking with the X libraries is required.

<P>
When the -b command-line switch is used, Clara OCR just won't
make calls to X services. The source code tests the flag
"batch_mode" before calling X services. So it won't create the
application window on the X display, and automatically starts a
full OCR operation on all pages found (as if the "OCR" button was
pressed with the "work on all pages" option selected).  Upon
completion, Clara OCR will exit.

<P>

<P>
<A NAME=1.7>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>1.7 Global variables</B></FONT></TD></TR></TABLE>
<P>
Clara OCR uses a lot of global variables. Large data structures,
flags, paths, etc. are stored in global variables. In some cases we
use a naming strategy to make the code more readable. The important
cases are:

<P>
a. The main data structures of Clara OCR are global arrays that grow
as required. The following convention was created for the names
associated with these arrays:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

    structure    type    array    top    size
   --------------------------------------------
    act          adesc   act      topa   actsz
    closure      cldesc  cl       topcl  clsz
    symbol       sdesc   mc       tops   ssz
    pattern      pdesc   pattern  topp   psz
    link         ldesc   lk       toplk  lksz
    ptype        ptdesc  pt       toppt  ptsz</PRE>
</TD></TR></TABLE></CENTER>
The "top" is the last used entry (initial value -1). The "size"
is the total size of the allocated memory block for that array
(initial value 0). So the relation (top < size) must always be
true.

<P>
b. Menus are referred to by their registration indexes. These indexes are
stored in variables named CM_X. The registration indexes of menu items
are stored in variables named CM_X_SOMETHING (all uppercase). If the
item has an associated flag, the flag is named cm_x_something (all
lowercase).
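<P>
The growing-array convention of item (a) can be sketched as below. The
element type item_desc and the names item, topi, itemsz and new_item are
hypothetical, chosen only to mirror the "array / top / size" columns of the
table; the growth policy (doubling) is also an assumption.

```c
#include <stdlib.h>

/* Hypothetical element type; the real descriptors (adesc, cldesc, ...)
   are defined by Clara OCR and are not reproduced here. */
typedef struct { int id; } item_desc;

item_desc *item = NULL;   /* the array                        */
int topi = -1;            /* last used entry (initially -1)   */
int itemsz = 0;           /* allocated entries (initially 0)  */

/* Append one entry, growing the block when needed.
   Returns the index of the new entry, or -1 if memory is exhausted. */
int new_item(int id)
{
    if (topi + 1 >= itemsz) {
        int nsz = (itemsz == 0) ? 16 : 2 * itemsz;
        item_desc *p = realloc(item, nsz * sizeof(item_desc));

        if (p == NULL)
            return -1;
        item = p;
        itemsz = nsz;
    }
    item[++topi].id = id;
    return topi;          /* invariant: topi < itemsz */
}
```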

<P>

<P>
<A NAME=1.8>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>1.8 Path variables</B></FONT></TD></TR></TABLE>
<P>
Most path variables are computed from the path given through
the -f command-line option. The variable "pagename" is the filename
of the PBM image of the page being processed, not
including any path specified through the -f
switch. For instance, if the OCR is started with

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

    clara -f mydocs/test.pbm</PRE>
</TD></TR></TABLE></CENTER>
Then the value of the variable "pagename" will be just
"test.pbm". The variable pagebase is pagename without the
suffix ".pbm" ("test", in the example).

<P>
Clara stores in the variable pagelist the null-separated list of
the names of all pbm files found in this directory. Even in this
case, the variable pagename will store the filename of the page
being processed (at any moment Clara will be processing one and
only one page).

<P>
The directory that contains the pbm files that Clara will
process is stored in the variable pagesdir. In the example
above, the value of the variable pagesdir is "mydocs/".

<P>
The variable workdir stores the path of the directory where Clara
will create the files *.html, *.session, "patterns" and
"acts". This path is assumed to be equal to pagesdir, unless
another path is given through the -w switch. The variable
doubtsdir is the concatenation of workdir with the string
"doubts/" (doubtsdir is ignored if -W is not used).

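<P>
The relation between the -f argument and the variables pagesdir, pagename
and pagebase can be sketched as below. The function split_page_path is
hypothetical, and returning an empty directory string when the path has no
"/" is an assumption of this example.

```c
#include <string.h>

/* Split a path such as "mydocs/test.pbm" into directory, filename and
   filename without the ".pbm" suffix. All output buffers have capacity n.
   Returns 1 on success, 0 if a piece does not fit. */
int split_page_path(const char *path, char *dir, char *name, char *base,
                    size_t n)
{
    const char *slash = strrchr(path, '/');
    const char *file = (slash != NULL) ? slash + 1 : path;
    size_t dlen = (size_t) (file - path);   /* includes the trailing '/' */
    size_t flen = strlen(file);
    size_t blen = flen;

    if (dlen >= n || flen >= n)
        return 0;
    memcpy(dir, path, dlen);
    dir[dlen] = '\0';
    memcpy(name, file, flen + 1);
    if (flen >= 4 && strcmp(file + flen - 4, ".pbm") == 0)
        blen = flen - 4;                    /* drop the suffix */
    memcpy(base, file, blen);
    base[blen] = '\0';
    return 1;
}
```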
<P>

<P>
<A NAME=1.9>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>1.9 Bitmaps</B></FONT></TD></TR></TABLE>
<P>
Clara stores bitmaps in a linear array of bytes, following
closely the pbm raw format. The first line of a bitmap with width
w is stored in the first (w/8)+((w%8)!=0) bytes of the array. The
remaining bits of the last byte (if any) are left blank, and so on
for each following line. The leftmost
bit of each byte is the most significant one (black, or "on",
is 1, and white, or "off", is 0). An example follows:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

      +--------------+
      |              |
      |   XX XXXX    |
      |    XX   XX   |
      |    XX   XX   |
      |    XX   XX   |
      |    XX   XX   |
      |              |
      +--------------+

      stored as: 0 0 27 192 12 96 12 96 12 96 12 96 0 0</PRE>
</TD></TR></TABLE></CENTER>
Note that the array of bytes that encodes one bitmap contains
neither the bitmap width nor the height. So bitmaps must be
stored together with other data. This is done by structures where
the bitmap is one field and the geometric information is stored
in other fields. There are two such structures: bdesc and
cldesc.
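<P>
The layout described above can be exercised with two small helpers. The
function names bytes_per_line and get_pixel are hypothetical; only the byte
layout itself comes from the text.

```c
#include <stddef.h>

/* Bytes used by one line of a bitmap that is w pixels wide. */
size_t bytes_per_line(size_t w)
{
    return (w / 8) + ((w % 8) != 0);
}

/* Value (1 = black, 0 = white) of the pixel at column x of line y.
   The leftmost pixel of each byte is its most significant bit. */
int get_pixel(const unsigned char *bm, size_t w, size_t x, size_t y)
{
    const unsigned char *line = bm + y * bytes_per_line(w);

    return (line[x / 8] >> (7 - (x % 8))) & 1;
}
```

<P>
For the 14-pixel-wide example above, each line takes 2 bytes, and the second
line (bytes 27 and 192) has black pixels at columns 3, 4 and 6 to 9.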

<P>
<A NAME=1.10>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>1.10 Execution model</B></FONT></TD></TR></TABLE>
<P>
Clara does not use multiple threads to let the GUI refresh the
application window while an OCR run is in progress. Instead,
the main function alternates calls to xevents() to
receive input and to continue_ocr() to perform OCR. As the OCR
operations may take long to complete, a very simple model was
implemented to allow the OCR services to execute only partially.

<P>
Such services (for instance load_page()) accept a "reset" parameter
to allow resetting all static data, and they're expected to
return 0 when finished, or nonzero otherwise. Therefore, a caller of
such a service must loop until completion. The function continue_ocr()
calls the OCR steps using this model, and some OCR steps call other
services (like load_page()) that implement this model too.
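<P>
The model can be illustrated with a toy service that does a bounded amount
of work per call. Everything in this sketch (sum_service, poll_events,
run_to_completion and the per-call budget of 4 steps) is hypothetical; it
only mirrors the "reset" parameter and the 0/nonzero return convention
described above.

```c
/* Hypothetical resumable service: sums 0..n-1, a few terms per call.
   Returns 0 when finished, nonzero while work remains. */
int sum_service(int reset, int n, long *result)
{
    static int next;        /* static state survives between calls */
    static long acc;
    int budget = 4;         /* work units allowed per call */

    if (reset) {
        next = 0;
        acc = 0;
    }
    while (next < n && budget-- > 0)
        acc += next++;
    *result = acc;
    return next < n;
}

/* Stand-in for the event-handling half of the main loop. */
void poll_events(int *polls)
{
    ++*polls;
}

/* Loop until the service completes, handling events between slices. */
long run_to_completion(int n, int *polls)
{
    long r = 0;
    int reset = 1;

    while (sum_service(reset, n, &r)) {
        reset = 0;
        poll_events(polls);
    }
    return r;
}
```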

<P>

<P>

<P>

<P>
<A NAME=1.11>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>1.11 Return codes</B></FONT></TD></TR></TABLE>
<P>
When Clara OCR exits, the exit code indicates the
termination status:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

  0 clean
  1 data inconsistency
  2 buffer overflow
  3 invalid field
  4 internal error
  5 memory exhausted
  6 X error
  7 I/O error
  8 bad input</PRE>
</TD></TR></TABLE></CENTER>
<A NAME=2.>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#79BEC6><FONT SIZE=+1><B>2. Internal representation of pages</B></FONT></TD></TR></TABLE>
<P>
Even for non-developers, a knowledge of the internal data
structures used by Clara OCR is required for fine tuning and to
make simple diagnostics.

<P>
The basic elements stored are the "closures". Sets of one or more
closures are called "symbols". Symbols are arranged in lists
forming "words". The words are arranged in lists forming "lines".

<P>

<P>
<A NAME=2.1>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>2.1 Closures</B></FONT></TD></TR></TABLE>
<P>
Closures of black pixels by contiguity are a first attempt to
identify the atomic symbols of the document. The name "closure"
is of course due to the consideration of the contiguity as a
relation (in the mathematical sense of the word). Starting (for
instance) from (i,j), we compute the set of black pixels ("X" and
"*" in the figure). The limits (l,r,t,b) define the bounding box
of the closure.

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

          l i    r
      +---+-+----+---+
      |              |
    t +   XX XXXX    |
      |    XX   XX   |
    j +    X*   XX   |
      |    XX   XX   |
    b +    XX   XX   |
      |              |
      +--------------+</PRE>
</TD></TR></TABLE></CENTER>
When loading a document, the OCR computes all its closures and
uses an array to store them. When the session file is written, the
closures are stored in CML format. Note that, if required, the
closures may be recomputed from the document, because the
document and the closure computing algorithm determine the index
that each closure will have in the array.
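<P>
Computing one closure and its bounding box can be sketched with a flood
fill. This example is not the Clara OCR algorithm: it stores the image one
byte per pixel for simplicity, and it assumes that diagonal neighbours count
as contiguous (which is consistent with the figure above, where the two legs
touch only diagonally). The names bbox and closure_at are hypothetical.

```c
#include <stdlib.h>

typedef struct { int l, r, t, b, npix; } bbox;

/* Collect the closure containing (i,j) in a w*h image stored one byte
   per pixel (1 = black, 0 = white). Visited pixels are marked with 2.
   Returns 1 on success, 0 if (i,j) is not black or memory runs out. */
int closure_at(unsigned char *img, int w, int h, int i, int j, bbox *out)
{
    int *stack, top = 0, dx, dy;

    if (i < 0 || i >= w || j < 0 || j >= h || img[j * w + i] != 1)
        return 0;
    stack = malloc((size_t) w * h * sizeof(int));
    if (stack == NULL)
        return 0;
    out->l = out->r = i;
    out->t = out->b = j;
    out->npix = 0;
    img[j * w + i] = 2;
    stack[top++] = j * w + i;
    while (top > 0) {
        int p = stack[--top], x = p % w, y = p / w;

        ++out->npix;
        if (x < out->l) out->l = x;
        if (x > out->r) out->r = x;
        if (y < out->t) out->t = y;
        if (y > out->b) out->b = y;
        for (dy = -1; dy <= 1; ++dy)
            for (dx = -1; dx <= 1; ++dx) {
                int nx = x + dx, ny = y + dy;

                if (nx >= 0 && nx < w && ny >= 0 && ny < h &&
                    img[ny * w + nx] == 1) {
                    img[ny * w + nx] = 2;   /* mark when pushed */
                    stack[top++] = ny * w + nx;
                }
            }
    }
    free(stack);
    return 1;
}
```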

<P>

<P>
<A NAME=2.2>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>2.2 Symbols</B></FONT></TD></TR></TABLE>
<P>
As one character of the document may be composed of two or more
closures (for instance when it's broken), it's convenient to work not
with closures, but with sets of closures. So we define the concept of
"symbol" as being a set of one or more closures. Initially, the OCR
generates one unitary symbol for each closure. Subsequent steps may
define new symbols composed of two or more closures.

<P>
For instance, let's present three closures that do not correspond
to atomic symbols: "a" and "i" linked (one closure) and a broken
"u" (two closures). As a principle, Clara OCR does not try to break
closures into smaller closures. Instead, the
classification heuristic tries to compose various patterns to
resolve symbols like the "ai" in the figure. Concerning the "u",
the classification heuristic is expected to merge the two
closures into one symbol and apply a "u" pattern to resolve it.

<P>

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

            l            r     l r l    r
      +-----+------------+-----+-+-+----+--+
      |                                    |
    t +                XX                  |
      |                XX                  |
      |                                    |
      |      XXXXX    XXX      XXX   XXX   + t
      |     X     XX   XX       XX    XX   |
      |           XX   XX       XX    XX   |
      |      XXXXXXX   XX       XX    XX   |
      |     X     XX   XX       XX    XX   |
      |     X     XX   XX       XX    XX   |
    b +      XXXXX XXXXXXX       XX  XXXX  + b
      |                                    |
      +------------------------------------+</PRE>
</TD></TR></TABLE></CENTER>
As a principle, Clara OCR won't merge dots and accents into
characters. So an "i" will generally be formed by two individual
symbols (the dot and the body). The heuristics that build the OCR
output are expected to compose these two symbols into one ASC
character. The same applies to "j" and to the accents (acute,
grave, tilde, etc.) found in various European languages.

<P>

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

          l  r
      +---+--+-------+
      |              |
    t +    XX        |
      |    XX        |
      |              |
      |   XXX        |
      |    XX        |
      |    XX        |
      |    XX        |
      |    XX        |
    b +   XXXX       |
      |              |
      +--------------+</PRE>
</TD></TR></TABLE></CENTER>
<A NAME=2.3>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>2.3 The sdesc structure and the mc array</B></FONT></TD></TR></TABLE>
<P>
Each symbol is stored in a sdesc structure. Those structures form
the mc array. Once created, a symbol is never deleted. So its
index in the mc array identifies it (this is important for the
web-based revision procedure). Note that closures and symbols are
numbered on a document-related basis. The set of closures that
define one symbol never changes. So the symbol bounding box and
the total number of black pixels won't change either.

<P>
So two different entries of the mc array never have the same set
of closures. The entries of the mc array are created by the
new_mc service. When some procedure tries to create a new
symbol passing a list of closures for which a symbol already
exists, the service new_mc detects it and returns
to the caller not the index of a newly created symbol, but
the index of the existing one.
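<P>
The "find or create" behaviour described for new_mc can be sketched as
below. This is not the real new_mc: the structure sym_desc, the fixed
limits MAXCL and MAXSYM, and the function find_or_new_sym are hypothetical,
and the input list is assumed to contain no repeated closure.

```c
#include <stdlib.h>
#include <string.h>

#define MAXCL 8       /* hypothetical cap on closures per symbol */
#define MAXSYM 256    /* hypothetical cap on symbols             */

typedef struct { int ncl; int cl[MAXCL]; } sym_desc;

sym_desc sym[MAXSYM];
int topsym = -1;

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a, y = *(const int *) b;

    return (x > y) - (x < y);
}

/* Return the index of the symbol made of the given closures, creating
   it only if no existing entry has the same set. Returns -1 on bad
   input or when the table is full. */
int find_or_new_sym(const int *cl, int ncl)
{
    int sorted[MAXCL], k;

    if (ncl < 1 || ncl > MAXCL)
        return -1;
    memcpy(sorted, cl, ncl * sizeof(int));
    qsort(sorted, ncl, sizeof(int), cmp_int);  /* order-independent key */
    for (k = 0; k <= topsym; ++k)
        if (sym[k].ncl == ncl &&
            memcmp(sym[k].cl, sorted, ncl * sizeof(int)) == 0)
            return k;                          /* already exists */
    if (topsym + 1 >= MAXSYM)
        return -1;
    ++topsym;
    sym[topsym].ncl = ncl;
    memcpy(sym[topsym].cl, sorted, ncl * sizeof(int));
    return topsym;
}
```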

<P>

<P>
<A NAME=2.4>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>2.4 The preferred symbols</B></FONT></TD></TR></TABLE>
<P>
The same closure may belong to more than one symbol. This is
important in order to allow various heuristic trials. For
instance, the left closure of the "u" in the preceding section
could be identified as the body of an "i". In this case, however,
we would not find its dot. So the heuristic could try
another solution, for instance joining it with the nearest
closure (in that case, the right closure of the "u") and trying to
match the result with some pattern of the font.

<P>
So the OCR will need to choose, from all symbols that contain a
given closure, the one to be preferred. In fact, Clara OCR
dynamically maintains a partition of the set of closures into
"preferred" symbols. This is the ps array. Some manual
operations, like fragment merging and symbol disassembling
(activated by the context menu on the page tab), change that
partition dynamically, as do some automatic procedures, like
the merge step of the OCR run.

<P>

<P>
<A NAME=2.5>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>2.5 Font size</B></FONT></TD></TR></TABLE>
<P>
The font size is important for classifying all book symbols into
pattern "types". For instance, books generally use smaller
letters for footnotes. This classification is performed
automatically by Clara OCR and presented by the "PATTERN (types)"
window.

<P>
Clara OCR generally uses millimeters for presenting sizes, but
we'll soon express sizes in "points". Let's see an example. One
inch corresponds to 72.27 printer's points (pt) (The METAFONTbook,
p. 21, note). So when using 600 dpi, each pt will correspond to
600/72.27 = 8.3 pixels. For 10 point roman characters, Knuth
defines the height of lowercase letters as being 155/36 pt, so
35.7 pixels for us. Therefore, to compute the font size (f) from
the height in pixels (h) of one lowercase letter, the formula is
f = 10*h/35.7.
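<P>
The arithmetic above can be written out for an arbitrary resolution. The
function names pixels_per_pt and font_size_pt are hypothetical; the
constants 72.27 and 155/36 are the ones quoted in the text.

```c
/* Pixels per printer's point at the given resolution
   (72.27 pt per inch). */
double pixels_per_pt(double dpi)
{
    return dpi / 72.27;
}

/* Font size in points, from the height h (in pixels) of a lowercase
   letter. At 10 pt the lowercase height is taken as 155/36 pt. */
double font_size_pt(double h, double dpi)
{
    double h10 = (155.0 / 36.0) * pixels_per_pt(dpi);

    return 10.0 * h / h10;
}
```

<P>
At 600 dpi, h10 is about 35.7 pixels, so font_size_pt(h, 600) agrees with
f = 10*h/35.7 up to rounding.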

<P>

<P>
<A NAME=2.6>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>2.6 Symbol alignment</B></FONT></TD></TR></TABLE>
<P>
The vertical alignment of symbols is important for various
heuristics. For instance, the vertical line from a broken "p"
matches an "l", but using alignment tests we're able to refuse
this match.

<P>
The current Clara OCR alignment support was developed for the Latin
alphabet, and is being adapted for other alphabets. Four vertical
alignment positions are considered. Three of them are referred to by
their usual names (ascent, baseline and descent). We use Knuth's identifier
"x_height" to refer to the height of lowercase letters without ascenders.

<P>

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

  A XXX                     XXXXXXXXX         
     XX                      XX      X	       
     XX                      XX      XX       
     XX                      XX      XX       
  X  XX XXXXX   XX  XXXXX    XX      X      XXXX
     XXX     X   XXX     X   XXXXXXXX     XX    XX
     XX      XX  XX      XX  XX      X   XX      XX
     XX      XX  XX      XX  XX      XX  XXXXXXXXXX
     XX      XX  XX      XX  XX      XX  XX   
     XX      XX  XX      XX  XX      XX  XX   
     XXX     X   XXX     X   XX      X    XX    XX  XX
  B  XX XXXXX    XX XXXXX   XXXXXXXXX       XXXX    XXX
                 XX                                   X
                 XX                                   X
                 XX                                  X
  D             XXXX                          


  A (0) .. ascent (Knuth asc_height)
  X (1) .. x_height
  B (2) .. baseline
  D (3) .. descent (Knuth desc_depth)</PRE>
</TD></TR></TABLE></CENTER>
So in the figure we say that the alignment of "b" and "B" is 02, the
alignment of "p" is 13, the alignment of "e" is 12, and the alignment
of the comma is 23. A period has alignment 22. The dot of an "i" and
accents have alignment 00. In fact, positions 1 and 2 are usually
well defined: all lowercase letters have the same height, and all
symbols use the same baseline. However, positions 0 and 3 are not so
well defined. For instance, in some printed books "t" and "l" have
different heights.

<P>

<P>
<A NAME=2.7>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>2.7 Words and lines</B></FONT></TD></TR></TABLE>
<P>
Clara OCR applies the concept of "symbol" to atomic symbols like
letters, digits or punctuation signs. Words (such as "house" or
"peace") are handled by Clara OCR as sequences of symbols.

<P>
It's very important to compute the words of the page. They
provide a context both to the OCR and to the reviewer. For
instance, if the known symbols of some word were identified as
bold, then Clara will automatically turn the bold button on when
someone tries to review the unknown symbols of that word. In the
same way, Clara prefers to recognize one symbol as the digit
"1" instead of the character "l" if the known symbols of the
"word" are digits. Words are also the basis for revision based on
spelling. Each word is stored in a wdesc structure in the "word"
array.

<P>
When building the OCR output, Clara will combine words into
lines. Each line is a sequence of words (that is, wdesc
structures). The array "line" is the sequence of the heads of the
detected lines. Each entry of this array is a lndesc
structure. The left and right limits of words must be carefully
computed and compared so that the OCR can partition them into
columns when dealing with multi-column pages.

<P>

<P>
<A NAME=2.8>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>2.8 Acts and transliterations</B></FONT></TD></TR></TABLE>
<P>
The "acts" or "revision acts" are the human interventions for
training a symbol, merging a fragment to one symbol, etc.

<P>
As human interventions are the most precious
source of information, Clara logs all revision acts in
order to be able to reuse them.

<P>
The transliterations are obtained from the revision acts, so
each transliteration refers to one (or more) revision acts, and
also inherits some properties from that act (or those acts).

<P>
The acts have book scope, not page scope. The acts
are stored in the file "acts" in the work directory.

<P>
Each act stores some data about the reviewer and also the submission
date. As we plan to reuse revision data, each act also stores some
data about the "original reviewer" and the "original submission
date". These fields are meaningful only for reused acts.

<P>

<P>
<A NAME=2.9>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>2.9 Symbol transliterations</B></FONT></TD></TR></TABLE>
<P>
Clara OCR maintains a list of 0 or more proposed or deduced
transliterations for each symbol. Along the OCR process, each
transliteration receives "votes" from reviewers or from machine
deduction heuristics, based on shape similarity or on
spelling.

<P>
So the choice of the "best" transliteration is performed through
election. Votes are stored in structures of type vdesc, and
transliterations are stored in structures of type trdesc. Each
symbol stores a pointer to a (possibly empty) list of
transliterations and each transliteration stores a pointer
to a (possibly empty) list of votes.

<P>
As the total stored information about one symbol may be large,
Clara maintains for each symbol its "transliteration class", used
by the heuristics to categorize each symbol and also to test the
current transliteration status (is it known? is it dubious?).

<P>

<P>
<A NAME=2.10>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>2.10 Transliteration preference</B></FONT></TD></TR></TABLE>
<P>
The election process used to choose the "best" transliteration for one
symbol (from those obtained through human revision or heuristics based
on shape similarity or spelling) consists in computing the
"preference" of each transliteration and choosing the one with maximum
preference.

<P>
The transliteration preference is the integer

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

    UTSEAN</PRE>
</TD></TR></TABLE></CENTER>
where

<P>
U is 1 if the transliteration was confirmed by the arbiter,
or 0 otherwise.

<P>
T is 0 if this transliteration was confirmed by no trusted
source, 1 if it was confirmed by some trusted source.

<P>
S is 0 if this transliteration was not shape-deduced
from trusted input, or 1 if it was shape-deduced
from trusted input.

<P>
E is 1 if this transliteration was deduced from spelling,
or 0 otherwise.

<P>
A is 0 if this transliteration was confirmed by no anon
source, 1 if it was confirmed by some anon source.

<P>
N is 0 if this transliteration was not shape-deduced
from anon input, or 1 if it was shape-deduced from anon input.
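<P>
The election can be sketched as below. The structure tr_flags and the
functions preference and best_tr are hypothetical, and packing UTSEAN as
six binary digits is an assumption of this example (since every digit is 0
or 1, reading UTSEAN as decimal digits would give the same ordering).

```c
/* Hypothetical flags for one transliteration; the real trdesc layout
   is not reproduced here. Each field is 0 or 1. */
typedef struct { int u, t, s, e, a, n; } tr_flags;

/* Pack the flags so that U is the most significant digit and N the
   least significant one. */
int preference(tr_flags f)
{
    return (f.u << 5) | (f.t << 4) | (f.s << 3) |
           (f.e << 2) | (f.a << 1) | f.n;
}

/* Index of the transliteration with maximum preference, or -1 if the
   list is empty. On ties the first one wins. */
int best_tr(const tr_flags *tr, int ntr)
{
    int k, best = -1, bp = -1;

    for (k = 0; k < ntr; ++k) {
        int p = preference(tr[k]);

        if (p > bp) {
            bp = p;
            best = k;
        }
    }
    return best;
}
```

<P>
With this ordering, a single confirmation by a trusted source (T) outweighs
any combination of the S, E, A and N criteria.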

<P>

<P>
<A NAME=2.11>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>2.11 Transliteration class computing</B></FONT></TD></TR></TABLE>
<P>
Once we have computed the "best" transliteration, we can compute
its transliteration class, important for various heuristics. From
the transliteration class it's possible to test things like "do we
know the transliteration of this symbol?" or "is it an
alphanumeric character?"  or "concerning dimension and vertical
alignment could it be an alphanumeric character?", and others.

<P>
There are two moments where the transliteration class is
computed. The first is when a transliteration is added to
the symbol, and the second is when the CHAR class is
propagated.

<P>
The first uses the following criteria to compute the
transliteration class:

<P>
1. If the symbol has no transliteration at all, its class is
UNDEF.

<P>
2. In all other cases, the transliteration with largest
preference will be classified as DOT, COMMA, NOISE, ACCENT and
others. This search is implemented by the classify_tr function in
a straightforward way.

<P>
Just before the distribution of all symbols into words we propagate
CHARs. All CHAR symbols are searched, and for each one we look at
its neighbours that seem to belong to the same word. Such
neighbours, if untransliterated, will be classified as SCHARs.

<P>

<P>
<A NAME=2.12>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>2.12 The zone</B></FONT></TD></TR></TABLE>
<P>
Clara allows the creation of a zone within the document. By default the
zone is the entire document. The zone limits are given by the
"limits" array. The top left is (limits[0],limits[1]), as the
figure shows:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

    +-------------+
    |(0,1)   (6,7)|
    |             |
    |(4,5)   (2,3)|
    +-------------+</PRE>
</TD></TR></TABLE></CENTER>
Currently the zone is useful only to be saved as a new
document. The OCR operations always consider the entire document.
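<P>
A sketch of how the "limits" array could be filled for a rectangular zone
follows. The functions set_zone and in_zone are hypothetical, and two
things are assumed: that each pair in the figure is an (x,y) coordinate and
that the zone is an axis-aligned rectangle.

```c
/* Fill limits[8] for a rectangular zone, following the corner order of
   the figure: top left (0,1), bottom right (2,3), bottom left (4,5),
   top right (6,7). */
void set_zone(int limits[8], int l, int t, int r, int b)
{
    limits[0] = l; limits[1] = t;
    limits[2] = r; limits[3] = b;
    limits[4] = l; limits[5] = b;
    limits[6] = r; limits[7] = t;
}

/* 1 if (x,y) lies inside the rectangular zone, 0 otherwise. */
int in_zone(const int limits[8], int x, int y)
{
    return x >= limits[0] && x <= limits[2] &&
           y >= limits[1] && y <= limits[3];
}
```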

<P>

<P>
<P><A NAME=3.><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#79BEC6><FONT SIZE=+1><B>3. Heuristics</B></FONT></TD></TR></TABLE>
<A NAME=3.1>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>3.1 Skeleton pixels</B></FONT></TD></TR></TABLE>
<P>
The first method implemented by Clara OCR for symbol classification
was skeleton fitting. Two symbols are considered similar when each
one contains the skeleton of the other.

<P>
Clara OCR implements six heuristics to compute skeletons. The
heuristic to be used is selected through the command-line option
-k as the SA parameter. The value of SA may be 0, 1, 2, 3, 4 or 5.

<P>
Heuristics 0, 1 and 2 consider a pixel as being a skeleton pixel
if it is the center of a circle inscribed within the closure, and
tangent to the pattern boundary at more than one point.

<P>
The discrete implementation of this idea is as follows: for each
pixel p of the closure, compute the minimum distance d from p to
some boundary pixel. Now try to find two pixels on the closure
boundary such that the distance from each of them to p does not
differ too much from d (must be less than or equal to RR). These
pixels are called "BPs".

<P>
To make the algorithm faster, the maximum distance from p to the
boundary pixels considered is RX. In fact, if there exists a
square of size 2*BT+1 centered at p, then p is considered a
skeleton pixel.

<P>
As this criterion alone produces fat skeletons and isolated
skeleton pixels along the closure boundary, two other conditions
are imposed: the angular distance between the radii from p to
each of those two pixels must be "sufficiently large" (larger
than MA), and a short path joining these two boundary pixels
(built only with boundary pixels) must not exist (the "joined"
function computes heuristically the smallest boundary path
between the two pixels, and that distance is then compared to
MP).

<P>
The heuristics 1 and 2 are variants of heuristic 0:

<P>
1. (SA = 1) The minimum linear distance between the two BPs
is specified as a factor (ML) of the square of the radius. This
will avoid the conversion from rectangular to polar coordinates
and may save some CPU time, but the results will be slightly
different.

<P>
2. (SA = 2) No minimum distance checks are performed, but a
minimum of MB BPs is required to exist in order to consider the
pixel p a skeleton pixel.

<P>
Heuristic 3 is very simple. It computes the skeleton by removing
the boundary BT times.
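
<P>
The idea of removing the boundary repeatedly can be sketched as
follows. This is not the Clara code: the bitmap layout (one byte per
pixel, nonzero meaning black) and the definition of a boundary pixel
(a black pixel touching a white pixel or the bitmap edge through one
of its four sides) are assumptions made for the example, and
peel_boundary is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* 1 if (x,y) is black and touches a white pixel or the bitmap edge. */
static int is_boundary(const unsigned char *bm, int w, int h, int x, int y)
{
    if (!bm[y * w + x])
        return 0;
    if (x == 0 || y == 0 || x == w - 1 || y == h - 1)
        return 1;
    return !bm[y * w + x - 1] || !bm[y * w + x + 1] ||
           !bm[(y - 1) * w + x] || !bm[(y + 1) * w + x];
}

/*
 * Remove the boundary bt times, in place.
 * Returns 0 on success, -1 on bad size or allocation failure.
 */
int peel_boundary(unsigned char *bm, int w, int h, int bt)
{
    unsigned char *next;
    int i, x, y;

    assert(bm != NULL && bt >= 0);
    if (w <= 0 || h <= 0)
        return -1;
    next = malloc((size_t)w * (size_t)h);
    if (next == NULL)
        return -1;
    for (i = 0; i < bt; ++i) {
        /* decide every pixel from the previous pass, then commit */
        for (y = 0; y < h; ++y)
            for (x = 0; x < w; ++x)
                next[y * w + x] =
                    bm[y * w + x] && !is_boundary(bm, w, h, x, y);
        memcpy(bm, next, (size_t)w * (size_t)h);
    }
    free(next);
    return 0;
}
```

On a 5x5 black square, one pass leaves the inner 3x3 square and a
second pass leaves only the central pixel.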

<P>
Heuristic 4 uses "growing lines". For angles varying in steps
of approximately 22 degrees, a line of length RX pixels is drawn
from each pixel. The heuristic checks whether the line can be
entirely drawn using black pixels. Depending on the results, it
decides whether the pixel is a skeleton pixel. For instance,
if all lines can be drawn, then the pixel is the center of an
inscribed circle, so it is considered a skeleton pixel. All
the cases considered can be found in the source code.

<P>
Heuristic 5 computes the distance from each pixel to the
border, for some definition of distance. When the distance is
at least RX, the pixel is considered a skeleton pixel. Otherwise,
it will be considered a skeleton pixel if its distance to the
border is close to the maximum distance around it (see the code
for details).

<P>
All parameters for skeleton computation are passed to Clara
through the -k command-line option, as a list in the following
order: SA,RR,MA,MP,ML,MB,RX,BT. For instance:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

    clara -k 2,1.4,1.57,10,3.8,10,4,4</PRE>
</TD></TR></TABLE></CENTER>
The default values and the valid ranges for each parameter must
be checked on the source code (see the declaration of the
variables SA, RR, MA, MP, ML, MB, RX, and BT, and the function
skel_parms). Note that BT must be at most RX.
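
<P>
A list in that format could be parsed as sketched below. The field
types are guessed from the example values (integers for SA, MP, MB, RX
and BT, floating point for RR, MA and ML), all eight fields are assumed
to be mandatory, and only the rule that BT must be at most RX is
checked. The names skel_opts and parse_skel_opts are hypothetical; the
real parsing is done by skel_parms and may differ.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the eight -k fields. */
typedef struct {
    int sa;
    double rr, ma;
    int mp;
    double ml;
    int mb, rx, bt;
} skel_opts;

/*
 * Parse "SA,RR,MA,MP,ML,MB,RX,BT".
 * Returns 0 on success, -1 on a malformed list, -2 when BT exceeds RX.
 */
int parse_skel_opts(const char *arg, skel_opts *o)
{
    int n;

    assert(arg != NULL && o != NULL);
    n = sscanf(arg, "%d,%lf,%lf,%d,%lf,%d,%d,%d",
               &o->sa, &o->rr, &o->ma, &o->mp,
               &o->ml, &o->mb, &o->rx, &o->bt);
    if (n != 8)
        return -1;
    if (o->bt > o->rx)
        return -2;
    return 0;
}
```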

<P>
<A NAME=3.2>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>3.2 Symbol pairing</B></FONT></TD></TR></TABLE>
<P>
Pairing applies to letters and digits. We say that the symbols a and b
(in this order) are paired if the symbol b follows the symbol a within
one same word. For instance, "h" and "a" are paired on the word
"that", "3" and "4" are paired on "12345", but "o" and "b" are not
paired on "to be" (because they're not on the same word).

<P>
The function s_pair tests symbol pairing, and returns
the following diagnostics:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

    0 .. the symbols are paired
    1 .. insufficient vertical intersection
    2 .. one or both symbols above ascent
    3 .. one or both symbols below descent
    4 .. maximum horizontal distance exceeded
    5 .. incomplete data</PRE>
</TD></TR></TABLE>

<P>
If the parameter p is nonzero, s_pair stores the inferred alignment
for each symbol (a and b) on the va field of these symbols, except
when these symbols have the va field already defined.

<P>
If the parameter rd is non-null, s_pair returns the dot diameter in
*rd. If an estimate for the dot diameter cannot be computed, *rd is
left unchanged.

<P>

<P>

<P>
<A NAME=3.3>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>3.3 The build step</B></FONT></TD></TR></TABLE>
<P>
The "build" OCR step, implemented by the "build"
function, distributes the symbols on words
(analysing the distance, heights and relative
position for each pair of symbols), and the words
on lines (analysing the distance, heights and
relative position for each pair of words). Various
important heuristics take effect here.

<P>
1. Preparation

<P>
The first step of build is to distribute the symbols
on words. This is achieved by:

<P>
a. Undefining the next-symbol ("E" field) and previous-symbol
("W" field) links for each symbol, the surrounding word ("sw"
field) of each symbol, and the next signal ("sl" field) for
each symbol.

<P>
Obs. The next-symbol and previous-symbol links are used
to build the list of symbols of each word. For instance,
on the word "goal", "o" is the next for "g" and
the previous for "a", "g" has no previous and "l"
has no next.

<P>

<P>
b. Undefining the transliteration class of SCHARs and
the uncertain alignment information.

<P>

<P>
2. Distributing symbols on words

<P>
The second step is, for each CHAR not in any word, define
a unitary word around it and extend it to right
and left applying the symbol pairing test. When
extending, merge words when necessary.

<P>

<P>
3. Computing the alignment using the words

<P>
Some symbols do not have a well-defined alignment by
themselves. For instance, a dot may be baseline-aligned
(a final dot) or 0-aligned (the "i" dot). So when
computing their alignments, we need to analyse their
neighborhoods. This is performed in this step.

<P>

<P>
4. Validating the recognition

<P>
Shape-based recognitions must be validated by special
heuristics. For instance, the left column of a broken 
"u" may be recognized as the body of an "i" letter. A
validation heuristic may refuse this recognition for
instance because the dot was not found. These heuristics
are per-alphabet.

<P>

<P>
5. Creating fake words for punctuation signs

<P>
To produce a clean output, symbols that do not belong to
any word are not included on the OCR output. So we need
to create fake words for punctuation signs like commas
or final dots.

<P>

<P>
6. Aligning words

<P>
Words need to be aligned in order to detect the
page text lines. This is performed as follows:

<P>

<P>
a. Undefine the next-word and previous-word
links for each word. These are links for the
previous and next word within lines. For instance,
on the line "our goal is", "goal" is the next
for "our" and the previous for "is", "our" has
no previous and "is" has no next.

<P>
b. Distribution of the words on lines. This is just
a matter of computing, for each word, its "next" word.
So for each pair of words, we test if they're "paired"
in the sense of the function w_pair. If so,
we make the left word point to the right word
as its "next" and the right word point to the left one as its
"previous".

<P>
The function w_pair does not test the existence of
intermediary words. So on the line "our goal is" that
function will report pairing between "our" and "is".
So after detecting pairing, our loop also checks if the
detected pairing is eventually "better" than those
already detected.

<P>

<P>
c. Sort the lines. The lines are sorted based on the
comparison performed by the function "cmpln".

<P>

<P>
7. Computing word properties

<P>
Finally, word properties can be computed once we
have detected the words. Some of these properties are
applied to untransliterated symbols. The properties are:

<P>
1. The baseline left and right ordinates.

<P>
2. The italic and bold flags.

<P>
3. The alphabet.

<P>
4. The word bounding box.

<P>
All these properties are computed by the
function wprops.
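
<P>
The word distribution of step 6 (items a and b) can be sketched as
follows. The pairing test used here is only a stand-in for w_pair
(the words must overlap vertically and the second must start to the
right of the first), and "better" is taken to mean "closer to the left
word"; both are assumptions, as are the names word_box and link_words.

```c
#include <assert.h>

/* Hypothetical word record: bounding box plus line links (-1 = none). */
typedef struct {
    int left, right, top, bottom;
    int next, prev;
} word_box;

/* Stand-in for w_pair: b starts after a and they overlap vertically. */
static int paired(const word_box *a, const word_box *b)
{
    int top = a->top > b->top ? a->top : b->top;
    int bottom = a->bottom < b->bottom ? a->bottom : b->bottom;

    return a->right < b->left && top <= bottom;
}

void link_words(word_box *w, int n)
{
    int a, b;

    assert(w != 0 || n == 0);
    /* 6a: undefine the next-word and previous-word links */
    for (a = 0; a < n; ++a)
        w[a].next = w[a].prev = -1;
    /* 6b: among all words paired with a, keep the closest one as "next" */
    for (a = 0; a < n; ++a)
        for (b = 0; b < n; ++b)
            if (a != b && paired(&w[a], &w[b]) &&
                (w[a].next < 0 || w[b].left < w[w[a].next].left))
                w[a].next = b;
    /* derive the "previous" links from the "next" links */
    for (a = 0; a < n; ++a)
        if (w[a].next >= 0)
            w[w[a].next].prev = a;
}
```

With the words of "our goal is" given in any order, "our" is paired
with both "goal" and "is", but only "goal" is kept as its next word.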
<A NAME=3.4>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>3.4 The function review_tr</B></FONT></TD></TR></TABLE>
<P>
The function review_tr processes the submission of
transliterations. It also processes the actions that change the
properties of the current symbol. This is not a simple operation. In
order to make the interface powerful, the submission of
a transliteration may change the transliteration of the
current symbol, and also the transliterations of all
symbols on its class. Depending on the properties,
other actions may be performed as well. To avoid
asking the user what to do, a "protocol"
determines what to do in each case. This protocol tries
to emulate the "obvious" thing to do in each case,
but advanced users should be aware of
it:

<P>

<P>
0. Remove the current revision vote (if any).

<P>

<P>
1. Add a REVISION vote for the current symbol,
carrying the submitted transliteration, and compute
the preferred transliteration considering this new
vote.

<P>
2. If the symbol is unclassified and it is not bad,
then add its bitmap as a pattern.

<P>
3. If the symbol is classified and it was not bad and it's
still not bad, propagate the submitted transliteration
to the entire class.

<P>
4. If the symbol is classified, it is not the pattern of
the class, and it was not bad, but it became bad now,
then remove it from its class, and remove its SHAPE vote.

<P>
5. If the symbol is the pattern of the class, and it
was not bad but it became bad now, then remove it from
the patterns.
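
<P>
The protocol above can be read as a decision table. The sketch below
only encodes that table: it maps the state of the symbol to the set of
actions listed in items 0 to 5, without performing any of them. The
function review_actions and the ACT_* names are hypothetical and are
not taken from the Clara sources.

```c
#include <assert.h>

/* One bit per action of the protocol (hypothetical names). */
enum {
    ACT_REMOVE_OLD_VOTE   = 1 << 0,  /* item 0 */
    ACT_ADD_REVISION_VOTE = 1 << 1,  /* item 1 */
    ACT_ADD_PATTERN       = 1 << 2,  /* item 2 */
    ACT_PROPAGATE         = 1 << 3,  /* item 3 */
    ACT_LEAVE_CLASS       = 1 << 4,  /* item 4, also drops the SHAPE vote */
    ACT_REMOVE_PATTERN    = 1 << 5   /* item 5 */
};

/*
 * classified: the symbol belongs to a class
 * is_pattern: the symbol is the pattern of its class
 * was_bad / is_bad: "bad" status before and after the submission
 */
unsigned review_actions(int classified, int is_pattern,
                        int was_bad, int is_bad)
{
    unsigned acts = ACT_REMOVE_OLD_VOTE | ACT_ADD_REVISION_VOTE;

    if (!classified && !is_bad)
        acts |= ACT_ADD_PATTERN;
    if (classified && !was_bad && !is_bad)
        acts |= ACT_PROPAGATE;
    if (classified && !is_pattern && !was_bad && is_bad)
        acts |= ACT_LEAVE_CLASS;
    if (is_pattern && !was_bad && is_bad)
        acts |= ACT_REMOVE_PATTERN;
    /* items 4 and 5 exclude each other by construction */
    assert(!((acts & ACT_LEAVE_CLASS) && (acts & ACT_REMOVE_PATTERN)));
    return acts;
}
```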
<A NAME=3.5>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>3.5 Resetting</B></FONT></TD></TR></TABLE>
<P>
<A NAME=3.6>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>3.6 Synchronization</B></FONT></TD></TR></TABLE>
<P>

<P>

<P>
<A NAME=3.7>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>3.7 The function list_cl</B></FONT></TD></TR></TABLE>
<P>
The function list_cl lists all closures that intersect the rectangle
of height h and width w with top left (x,y). The result will be
available on the global array list_cl_r, on indexes
0..list_cl_sz-1. This service is used to clip the closures or symbols
(see list_s) currently visible on the PAGE window. It's also used by
OCR operations that require locating the neighbours of one closure or
symbol (see list_s).

<P>
The parameter reset must be zero on all calls, except on the very
first call of this function after loading one page.

<P>
Every time a new page is loaded, this service must be called
passing a nonzero value for the reset parameter. In this case,
the other parameters (x, y, w and h) are ignored, and the effect
will be preparing the page-specific static data structures used
to speed up the operation.

<P>
Closures are located by list_cl from the static lists of closures
clx and cly, ordered by leftmost and topmost coordinates. Small
and large closures are handled separately. The number of closures
with width larger than FS is counted on nlx. The number of
closures with height larger than FS is counted on nly.

<P>
The clx array is divided in two parts. The left one contains
(topcl+1)-nlx indexes for the closures with width not larger than
FS, sorted by the leftmost coordinate. The right one contains the
other indexes, in descending order.

<P>
The cly array is divided in two parts. The left one contains
(topcl+1)-nly indexes for the closures with height not larger
than FS, sorted by the topmost coordinate. The right one contains
the other indexes, in descending order.

<P>
So the small closures on the rectangle (x,y,w,h) may be located
through a combination of binary searches on both axes. The large
closures are located by a brute-force linear loop. As nlx and nly
are expected to be very small, this brute-force loop won't waste
CPU time.
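
<P>
The following sketch illustrates the strategy on the horizontal axis
only: a binary search on the small closures sorted by leftmost
coordinate, followed by a linear scan of the large ones. It is a
simplification (list_cl combines searches on both axes), and the names
box, hits, lower_bound and list_boxes are hypothetical.

```c
#include <assert.h>

typedef struct { int l, r, t, b; } box;   /* inclusive bounding box */

/* 1 if the box intersects the rectangle with top left (x,y), size w x h. */
static int hits(const box *c, int x, int y, int w, int h)
{
    return c->l <= x + w - 1 && c->r >= x &&
           c->t <= y + h - 1 && c->b >= y;
}

/* First position i in [0,n) with cl[idx[i]].l >= key (idx sorted by l). */
static int lower_bound(const box *cl, const int *idx, int n, int key)
{
    int lo = 0, hi = n;

    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (cl[idx[mid]].l < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/*
 * idx[0..nsmall-1]: boxes of width at most fs, sorted by leftmost coordinate.
 * idx[nsmall..ntotal-1]: the wider boxes, in any order.
 * Stores the indexes of the intersecting boxes in out[], returns the count.
 */
int list_boxes(const box *cl, const int *idx, int nsmall, int ntotal,
               int fs, int x, int y, int w, int h, int *out)
{
    int i, n = 0;

    assert(0 <= nsmall && nsmall <= ntotal);
    /* a box of width at most fs reaching column x starts at x-fs+1 or later */
    for (i = lower_bound(cl, idx, nsmall, x - fs + 1);
         i < nsmall && cl[idx[i]].l <= x + w - 1; ++i)
        if (hits(&cl[idx[i]], x, y, w, h))
            out[n++] = idx[i];
    /* brute force over the few large boxes */
    for (i = nsmall; i < ntotal; ++i)
        if (hits(&cl[idx[i]], x, y, w, h))
            out[n++] = idx[i];
    return n;
}
```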

<P>
<A NAME=4.>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#79BEC6><FONT SIZE=+1><B>4. The GUI</B></FONT></TD></TR></TABLE>
<P>

<P>
<A NAME=4.1>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>4.1 Main characteristics</B></FONT></TD></TR></TABLE>
<P>
1. Clara OCR GUI uses only 5 colors: white, gray, darkgray,
verydarkgray and black. The RGB value for each one is
customizable at startup (-c command-line option).

<P>
2. The I/O is buffered, but this feature may be switched off
(using the -u switch) to save memory on the X server side.

<P>
3. Only one font, used for all needs (button labels, menu
entries, HTML rendering, and messages).

<P>
4. Asynchronous refresh. The OCR operations just set the redraw
flags (redraw_button, redraw_wnd, redraw_int, etc) and let the
redraw() function make its work.

<P>
5. No toolkit is used. The graphic code is very specific to
Clara, and it was not written to be reusable. So it's very
small. The disadvantage of this approach is that Clara's look and
behaviour will be slightly different from the typical ones found
on popular environments like GNOME or KDE.

<P>

<P>

<P>
<A NAME=4.2>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>4.2 Geometry of the application window</B></FONT></TD></TR></TABLE>
<P>
The source code frequently refers to some global variables that define
the position and size of the main components (the plate, buttons,
etc). Most of these variables are set by comp_wnd_size. The variables
are:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

    WH  .. application window height
    WW  .. application window width
    PH  .. plate height
    PW  .. plate width
    BW  .. button width
    BH  .. button height
    MRF .. maximum reduction factor
    TW  .. tab width
    TH  .. tab height
    PM  .. plate horizontal margin
    PT  .. plate top margin
    RW  .. scrollbar width
    MH  .. menubar height</PRE>
</TD></TR></TABLE></CENTER>
MRF applies to the scanned document and to the web clip.

<P>

<P>
<A NAME=4.3>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>4.3 Geometry of windows</B></FONT></TD></TR></TABLE>
<P>
The current window is given by the CDW global variable
(set by the setview function). The variable CDW is an index into
the dw array of dwdesc structs. Some macros are used to refer to the
fields of the structure dw[CDW]. The list of all of them can be
found in the headers under the title "Parameters of the current
window".

<P>
The portion of the document being displayed is defined by the
macros X0, Y0, HR and VR, where (X0,Y0) is the top left and HR
and VR are the width and height, measured in pixels (graphic
documents) or characters (text documents):

<P>

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

         X0  X0+HR-1
         |     |
    +----+-----+--+
    |             |
    |             |
    |    +-----+  +- Y0
    |    |     |  |
    |    |     |  |
    |    |     |  |
    |    +-----+  +- Y0+VR-1
    |             |
    |             |
    |             |
    |             |
    |             |
    |             |
    +-------------+
     The document</PRE>
</TD></TR></TABLE></CENTER>
Regarding the application window, the document window is a
portion of the plate, defined by DM, DT, DW and DH, where (DM,DT)
is the top left and DW and DH are the width and height measured
in display pixels.

<P>

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

          DM              DM+DW-1
          |                 |
    +-----+-----------------+----+
    |                            |
    |                            |
    |                            |
    |     +-----------------+    +- DT
    |     |                 | |  |
    |     |                 | X  |
    |     |                 | X  |
    |     |    Document     | X  |
    |     |     window      | |  |
    |     |                 | |  |
    |     |                 | |  |
    |     |                 | |  |
    |     |                 | |  |
    |     +-----------------+    +- DT+DH-1
    |      -----XXXXXXXXXXX-     |
    |                            |
    |                            |
    +----------------------------+
         Application window</PRE>
</TD></TR></TABLE></CENTER>
The rectangle (X0,Y0,HR,VR) from the document is displayed in
the display rectangle (DM,DT,DW,DH). When displaying the scanned
page, the reduction factor RF applies. Each square RFxRF of
pixels from the document will be mapped to one display pixel.
When displaying the scanned page in fat bit mode, each document
pixel will be mapped to a square ZPSxZPS of display pixels, and a
grid will be displayed too.
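
<P>
For the reduced mode, the mapping between document and display
coordinates could look like the sketch below. It assumes that X0 and
Y0 are measured in document pixels and that the division simply
truncates; the structure view_sketch and both functions are
hypothetical, the real code uses the macros of dw[CDW].

```c
#include <assert.h>

/* Hypothetical snapshot of the relevant window parameters. */
typedef struct {
    int x0, y0;   /* top left of the displayed portion, document pixels */
    int dm, dt;   /* top left of the document window, display pixels */
    int rf;       /* reduction factor, at least 1 */
} view_sketch;

/* Document pixel (x,y) to display pixel (*dx,*dy). */
void doc_to_display(const view_sketch *v, int x, int y, int *dx, int *dy)
{
    assert(v->rf >= 1 && x >= v->x0 && y >= v->y0);
    *dx = v->dm + (x - v->x0) / v->rf;
    *dy = v->dt + (y - v->y0) / v->rf;
}

/* Display pixel to the top left document pixel of its RFxRF square. */
void display_to_doc(const view_sketch *v, int dx, int dy, int *x, int *y)
{
    assert(v->rf >= 1);
    *x = v->x0 + (dx - v->dm) * v->rf;
    *y = v->y0 + (dy - v->dt) * v->rf;
}
```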

<P>

<P>
<A NAME=4.4>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>4.4 Scrollbars</B></FONT></TD></TR></TABLE>
<P>
The scrollbars indicate which portion of the document is being
displayed, relative to the full document. The viewable region of
the document (in the sense just defined) is defined by X0, Y0, HR
and VR:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

              X0    X0+HR-1

         +----+-------+-------+ - 0
         |                    |
      Y0 +    +-------+       |
         |    |       |       |
         |    |       |       |
         |    |       |       |
         |    |       |       |
 Y0+VR-1 +    +-------+       |
         |                    |
         |                    |
         |                    |
         |                    |
         +--------------------+ - GRY-1

         |                    |
         0                   GRX-1</PRE>
</TD></TR></TABLE></CENTER>
The variables GRX and GRY contain the total width and height of
the full document, measured in pixels. The interpretation of the
contents of the variables X0, Y0, HR and VR is not simple. In some
cases, they will contain values measured in pixels. In other cases,
in characters. The variables HR and VR define the size of the
window. However, in some cases this size is the size
from the viewpoint of the document and, in others, of the display
(the difference is a reduction factor).

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

            +------------+  -
            |            |  |
            |            |  |
            |            |  X
            |            |  X
            |            |  X
            |            |  |
            |            |  |
            +------------+  -

            |---XXXX-----|</PRE>
</TD></TR></TABLE></CENTER>
Note that the parameters X0, Y0, HR, VR, GRX and GRY are macros
that refer to the corresponding fields of the structure dw[CDW],
which stores the parameters of the current DW.
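
<P>
One possible way to compute the position and length of a scrollbar
cursor from these values is sketched below. It assumes that the three
quantities (first visible unit, number of visible units, total size)
are expressed in the same unit, which, as noted above, is not always
true for X0, HR and GRX without conversion. The function
thumb_geometry is hypothetical.

```c
#include <assert.h>

/*
 * Position and length, in display pixels, of the cursor on a track of
 * `track` pixels when `visible` units starting at `first` are shown
 * out of `total` (e.g. X0, HR and GRX for the horizontal scrollbar).
 */
void thumb_geometry(int track, int first, int visible, int total,
                    int *pos, int *len)
{
    assert(track > 0 && total > 0 && first >= 0 && visible >= 0);
    if (visible >= total) {          /* everything fits: full-length cursor */
        *pos = 0;
        *len = track;
        return;
    }
    *pos = (int)((long)track * first / total);
    *len = (int)((long)track * visible / total);
    if (*len < 1)
        *len = 1;                    /* keep the cursor visible */
    if (*pos + *len > track)
        *pos = track - *len;         /* do not run past the track */
}
```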

<P>

<P>
<A NAME=4.5>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>4.5 Displaying bitmaps</B></FONT></TD></TR></TABLE>
<P>
Bitmaps on HTML windows and on the PAGE window are displayed
in "reduced" fashion (a square RFxRF of pixels from the bitmap is
mapped to one display pixel). If RF=1, then each bitmap pixel
will map to one display pixel.

<P>
The windows PATTERN, PAGE_FATBITS, and PAGE_MATCHES display
bitmaps in "zoomed" mode (one bitmap pixel maps to a ZPSxZPS
square of display pixels). In this case a grid is displayed to
make it easier to distinguish each pixel. The variables GW and GS
contain the grid width and the "grid separation" (GS=ZPS+GW).

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

                   ZPS     GS              GW
                |<---->|<----->|   --->||<---

               ++------++------++------++----
               ++------++------++------++----
               ||      ||      ||      ||
               ||      ||      ||      ||
               ||      ||      ||      ||
               ++------++------++------++----
               ++------++------++------++----
               ||      ||      ||      ||
               ||      ||      ||      ||
               ||      ||      ||      ||</PRE>
</TD></TR></TABLE></CENTER>
Note that the parameters RF, GS and GW are macros that refer to the
corresponding fields of the structure dw[CDW], which stores the
parameters of the current DW.
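
<P>
Using GS=ZPS+GW, the display offset of a zoomed pixel can be computed
as in the sketch below. It assumes, following the figure, that each
cell is preceded by its grid line, and it measures offsets from the
left edge of the grid. The functions cell_offset and cell_at are
hypothetical.

```c
#include <assert.h>

/* Display offset (from the left edge of the grid) where pixel i starts. */
int cell_offset(int i, int zps, int gw)
{
    int gs = zps + gw;               /* grid separation */

    assert(i >= 0 && zps >= 1 && gw >= 0);
    return i * gs + gw;              /* skip the grid line before the cell */
}

/* Bitmap pixel under display offset d, or -1 when d is on a grid line. */
int cell_at(int d, int zps, int gw)
{
    int gs = zps + gw;

    assert(d >= 0 && zps >= 1 && gw >= 0);
    return (d % gs < gw) ? -1 : d / gs;
}
```

The same computation applies to the vertical axis.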

<P>

<P>
<A NAME=4.6>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>4.6 HTML windows overview</B></FONT></TD></TR></TABLE>
<P>
Clara is able to read a piece of HTML code, render it, display
the rendered code, and handle events like selection of an anchor,
filling a text field, or submitting a form. Note that anchor
selection and form submission activate internal procedures, and
won't call external customizable CGI programs.

<P>
Most windows displayed by Clara are produced using this HTML
support. When the "show HTML source" option on the "View" menu is
selected, Clara will display unrendered HTML, and it will become
easier to identify the HTML windows. Note that all HTML is
produced by Clara itself. Clara won't read HTML from files or
through HTTP.

<P>
Perhaps you are asking why Clara implements these things. Clara
does not intend to be a web browser. Clara supports HTML because
we were in need of a forms interface, and HTML forms were
already there, ready to be used, and extensively proven in
practice as an easy and effective solution. Note that we're not
trying to achieve completeness. Clara HTML support is
partial. There is only one font available, tables cannot be
nested and most options are unavailable, PBM is the only graphic
format supported, etc. However, it meets our needs, and the
code is surprisingly small.

<P>
Let's understand how the HTML windows work. First of all, note
that there is an html flag on the structure that defines a window
(structure dwdesc). For instance, this flag is on for the window
OUTPUT (initialization code at function setview).

<P>
When the function redraw is called and the window OUTPUT is
visible on the plate, the service draw_dw will be called
informing OUTPUT through the global variable CDW (Current
Window). However, before making that, redraw will test the flag
RG to check if the HTML contents for the OUTPUT window must be
generated again, calling a function specific to that window. For
instance, when a symbol is trained, this flag must be set in
order to notify asynchronously the need to recompute the window
contents, and render it again.

<P>
HTML rendering is performed by the function html2ge. It will
create an array of graphic entities. Each such entity is a
structure describing the geometric position (x,y,width,height) of
something, and what this something is (a piece of text, a button
and its label and state, a PBM image, etc). Finally, the function
draw_dw will search the elements currently visible on the
portion of the document clipped by the window, and display them.

<P>
<A NAME=4.7>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>4.7 Graphic elements</B></FONT></TD></TR></TABLE>
<P>
The rendering of each element on the HTML page creates one graphic
element ("GE" for short).

<P>
Free text is rendered to one GE of type GE_TEXT per word. This is
a "feature". The rendering procedures are currently unable to put
more than one text word per GE.

<P>
IMG tags are rendered to one GE of type GE_IMG. Note that the
value of the SRC attribute cannot be the name of a file containing
an image, but must be "internal" or "pattern/n". These keywords
refer to the web clip and to the bitmap of the pattern "n". The
value of the SRC attribute is stored on the "txt" field of the
GE.

<P>
INPUT tags with TYPE=TEXT are rendered to one GE of type
GE_INPUT. The predefined value of the field (attribute VALUE) is
stored on the field "txt" of the GE. The name of the field
(attribute NAME) is stored on the field "arg" of the GE.

<P>
The Clara OCR HTML support added INPUT tags with TYPE=NUMBER. They're
rendered like TYPE=TEXT, but two steppers are added for faster
selection. So such tags will create three GEs (left stepper, input
field, and right stepper).

<P>
INPUT tags with TYPE=CHECKBOX are rendered to one GE of type
GE_CBOX. The variable name (attribute NAME) is stored on the "arg"
field. The argument to VALUE is stored on the field "txt". The status
of the checkbox is stored on the "iarg" field (2 means "checked", 0
means "not checked").

<P>
INPUT tags with TYPE=RADIO are rendered just like CHECKBOX. The
only difference is the type GE_RADIO instead of GE_CBOX.

<P>
SELECT tags (starting a SELECT element) are rendered to one GE of
type SELECT. In fact, the entire SELECT element is stored on only
one GE. Each SELECT element originates one standard context menu,
as implemented by the Clara GUI. The "iarg" field stores the menu
index. The free text on each OPTION element is stored as an item
label on the context menu. The implementation of the SELECT
element is currently buggy: (a) for each rendering, one entry
on the array of context menus will be allocated, and will never
be freed, and (b) the attribute NAME of the SELECT won't be
stored anywhere.

<P>
INPUT tags with TYPE=SUBMIT are rendered to one GE of type
GE_SUBMIT. The value of the attribute VALUE is stored on the "txt"
field. The value of the ACTION attribute is stored on the field
"arg". The field "a" will store HTA_SUBMIT.

<P>
TD tags are rendered to one GE of type GE_RECT. The value of the
BGCOLOR attribute is stored on the "bg" field as a code (only the
colors known by the Clara GUI are supported: WHITE, BLACK, GRAY,
DARKGRAY and VDGRAY). The coordinates of the cell within the table
are stored on the fields "tr" and "tc".

<P>
All other supported tags do not generate GEs.

<P>
<A NAME=4.8>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>4.8 XML support</B></FONT></TD></TR></TABLE>
<P>
We decided to use XML because of the facilities of using
non-binary encodings to store, analyse, change and transmit
information, and also because XML is a standard. Currently we do
not have DTDs, and until now we didn't try to load, using the
Clara parser, XML code not produced by Clara itself.

<P>

<P>
<A NAME=4.9>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>4.9 Auto-submission of forms</B></FONT></TD></TR></TABLE>
<P>
The Clara OCR GUI tries to apply immediately all actions taken by
the user. So the HTML forms (e.g. the PATTERN window) do not
contain SUBMIT buttons, because they're not required (some forms
contain a SUBMIT button disguised as a CONSIST facility, but this
is just for the user's convenience).

<P>
The editable input fields make auto-submission mechanisms a bit
harder, because we cannot apply consistency tests and process the
form before the user finishes filling the field, so
auto-submission must be triggered on selected events. The
triggers must be a bit smart, because some events must be
attended before submission (for instance toggle a CHECKBOX),
while others must be attended after submission (for instance
changing the current tab). So auto-submission must be carefully
studied. The current strategy follows:

<P>
a. When the window PAGE (symbol) or the window PATTERN is
visible, auto-submit just after attending the buttons that change
the current symbol/pattern data (buttons BOLD, ITALIC, ALPHABET
or PTYPE).

<P>
b. When the window PAGE (symbol) or the window PATTERN is
visible, auto-submit just before attending the left or right
arrows.

<P>
c. When the user presses ENTER and an active input field exists,
auto-submit.

<P>
d. Auto-submit as the first action taken by the setview service,
in order to flush the current form before changing the current
tab or tab mode.

<P>
e. Auto-submit just after opening any menu, in order to flush
data before some critical action like quitting the program or
starting some OCR step.

<P>
f. Auto-submit just after attending CHECKBOX or RADIO buttons.

<P>
Auto-submission happens when the service auto_submit_form is
called, so it's easy to locate all triggering points (just search
the string auto_submit_form). This service takes no action when
the current form is unchanged.

<P>
<A NAME=5.>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#79BEC6><FONT SIZE=+1><B>5. The Clara API</B></FONT></TD></TR></TABLE>
<P>
This section describes the variables and functions exported by
Clara OCR for extensibility purposes. Note that Clara OCR
currently does not have an interface for extensions. The first
such interface planned to be added will use the Guile
interpreter, available from the GNU Project.

<P>

<P>
<A NAME=5.1>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>5.1 Redraw flags</B></FONT></TD></TR></TABLE>
<P>
The redraw flags inform the function redraw about which portions
of the application window must be redrawn. The precise meaning of
each flag depends on the implementation of the redraw function,
which can be analysed directly in the source code.

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

    redraw_button .. one specific button or all buttons
    redraw_bg     .. redraw background
    redraw_grid   .. the grid on fatbits windows
    redraw_stline .. the status line
    redraw_dw     .. all visible windows
    redraw_inp    .. all text input fields
    redraw_tab    .. tabs and their labels
    redraw_zone   .. rectangle that defines the zone
    redraw_menu   .. menu bar and currently open menu
    redraw_j1     .. redraw junction 1 (page tab)
    redraw_j2     .. redraw junction 2 (page tab)
    redraw_pbar   .. progress bar
    redraw_map    .. alphabet map</PRE>
</TD></TR></TABLE></CENTER>
An individual button may be redrawn to reflect a status change
(on/off). The junction 1 is the junction of the top and middle
windows on the page tab, and the junction 2 is the junction of
the middle and bottom windows on the page tab. The corresponding
flags are used when resizing some window on the page tab.

<P>
If redraw_menu is 2, the menu is entirely redrawn. If redraw_menu is
1, then the draw_menu function will redraw only the last selected item
and the newly selected item, except if the menu is being drawn for the
first time.

<P>
The progress bar is displayed on the bottom of the window to
reflect the progress of some slow operation. For now, the
progress bar is unused.
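
<P>
The asynchronous use of these flags can be illustrated by the sketch
below: operations only raise flags, and a later call to the redraw
routine does the work once and clears them. Only the flag names
redraw_stline and redraw_menu come from the list above; the functions
and the counters (which stand in for the actual drawing) are
hypothetical, and the real redraw function is more involved.

```c
#include <assert.h>

static int redraw_stline, redraw_menu;   /* flag names from the list above */

/* Counters standing in for real drawing, for illustration only. */
static int drawn_status, drawn_full_menus, drawn_menu_items;

/* Called by any operation that changes the status line. */
void request_status(void)
{
    redraw_stline = 1;
}

/* whole != 0 asks for a full menu redraw (2), otherwise a partial one (1). */
void request_menu(int whole)
{
    int level = whole ? 2 : 1;

    if (level > redraw_menu)
        redraw_menu = level;             /* never downgrade a pending request */
}

/* Performs the pending work once, then clears the flags. */
void redraw_sketch(void)
{
    assert(redraw_menu >= 0 && redraw_menu <= 2);
    if (redraw_stline) {
        ++drawn_status;
        redraw_stline = 0;
    }
    if (redraw_menu == 2)
        ++drawn_full_menus;
    else if (redraw_menu == 1)
        drawn_menu_items += 2;           /* last and newly selected items */
    redraw_menu = 0;
}
```

Several requests issued before the next redraw are served by a single
drawing operation.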

<P>

<P>
<A NAME=5.2>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>5.2 OCR statuses</B></FONT></TD></TR></TABLE>
<P>
The OCR run in course (if any) stores various statuses on global
variables. For instance, the ocring macro will be nonzero if one
OCR run is in course. The GUI informs the OCR control routines
about what to do along the OCR run using various global
variables. Some of them drive the classification procedures:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

  justone      .. Classify only one symbol
  this_pattern .. Use only one pattern to classify
  recomp_cl    .. Ignore current classes</PRE>
</TD></TR></TABLE></CENTER>
The first two are used for testing purposes, for instance when
checking why the classification routines classified some symbol
in an unexpected way.

<P>
The stop_ocr variable is set by the GUI when the STOP button is
pressed. Its status will be tested by the routines that control
the OCR run in course. Note that the variable cannot_stop may be
set by the current OCR step in course. Its effect is to inhibit
the GUI from setting the stop_ocr status. It's used by routines that
cannot be stopped, otherwise the data structures they're handling
would be left in an unrecoverable inconsistent state.

<P>
The OCR control routines handle the following statuses:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

  ocr_all  .. OCR all pages
  starting .. continue_ocr was not called until now
  onlystep .. run only this OCR step</PRE>
</TD></TR></TABLE></CENTER>
The buttons CLASSIFY, BUILD, etc, start one specific OCR
step. The OCR step to be executed is stored on onlystep. The
to_ocr variable stores the page where the OCR run will be
executed.

<P>
The other to_* variables together with nopropag store information
about the revision operation requested from the GUI:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

  to_tr    .. the transliteration to submit to the current symbol
  to_rev   .. the type of revision
  nopropag .. propagation flag for the result
  to_arg   .. integer argument to revision operation</PRE>
</TD></TR></TABLE></CENTER>
The types of revision are: transliteration submission (1),
fragment merging (2), symbol disassemble (3) and word extension
(4).

<P>
The to_arg variable stores the fragment to merge to the current
symbol or the symbol to add to the current word.

<P>
The variable ocr_other stores which operation to perform by the
OCR_OTHER step. This step is reserved to operations that are outside
the OCR run main stream, but require the control provided by the
continue_ocr function.

<P>

<P>
<A NAME=5.3>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>5.3 The function setview</B></FONT></TD></TR></TABLE>
<P>
As each window is displayed in only one mode and each mode belongs
to only one tab, to set a given mode or a given tab
just call setview, passing as parameter one window that belongs to
that mode. That is the only parameter received by setview.
The geometry of each window is recomputed by setview, so
setview is called not only to change the current mode, but
also after operations that change the geometry of the windows,
such as resizing the application X window, hiding the
scrollbars, or resizing the PAGE window.

<P>
<P><A NAME=5.4><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>5.4 The function redraw (to be written)</B></FONT></TD></TR></TABLE>
<A NAME=5.5>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>5.5 The function show_hint</B></FONT></TD></TR></TABLE>
<P>
Messages are put on the status line (on the bottom of the
application X window) using the show_hint service. The show_hint
service receives two parameters: an integer f and a string (the
message).

<P>
If f is 0, then the message is "discardable". It won't be
displayed if a permanent message is currently being displayed.

<P>
If f is 1, then the message is "fixed". It won't be erased by a
subsequent show_hint call that passes the empty string as the message
(in practical terms, pointer motion won't clear the message).

<P>
If f is 2, then the message is "permanent" (the message will be
cleared only by another fixed or permanent message).

<P>
If f is 3, any permanent or fixed messages will be discarded.
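
<P>
The four values of f can be modelled by a tiny state machine. The
sketch below is not the Clara OCR implementation: model_show_hint,
status_line and status_kind are hypothetical names, and two points the
text leaves open are filled by assumption (a non-empty discardable
message replaces a fixed one, and a call with f equal to 3 stores its
own message as discardable after discarding the old one).

```c
#include <string.h>

#define HINT_MAX 128

enum { HINT_DISCARDABLE = 0, HINT_FIXED = 1, HINT_PERMANENT = 2, HINT_RESET = 3 };

static char status_line[HINT_MAX] = "";     /* text currently shown   */
static int  status_kind = HINT_DISCARDABLE; /* kind of that message   */

/* Store message s with the given kind, truncating if necessary. */
static void set_line(const char *s, int kind)
{
    strncpy(status_line, s, HINT_MAX - 1);
    status_line[HINT_MAX - 1] = '\0';
    status_kind = kind;
}

/* Model of the rules described above. */
static void model_show_hint(int f, const char *s)
{
    switch (f) {
    case HINT_DISCARDABLE:
        if (status_kind == HINT_PERMANENT)
            return;                    /* permanent message wins       */
        if (status_kind == HINT_FIXED && s[0] == '\0')
            return;                    /* empty string can't erase it  */
        set_line(s, HINT_DISCARDABLE);
        break;
    case HINT_FIXED:
    case HINT_PERMANENT:
        set_line(s, f);                /* replaces whatever is shown   */
        break;
    case HINT_RESET:
        set_line(s, HINT_DISCARDABLE); /* assumption: discard, then show s */
        break;
    }
}
```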

<P>
<A NAME=5.6>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>5.6 The function start_ocr</B></FONT></TD></TR></TABLE>
<P>
Starts a complete OCR run or some individual OCR step on one
given page, or on all pages. For instance, start_ocr is called by
the GUI when the user presses the "OCR" button or when the user
requests loading of one specific page.

<P>
In fact, almost every user-requested operation is performed as an
"OCR step" in order to take advantage of the execution model
implemented by the function continue_ocr. So start_ocr is the
starting point for handling almost all user requests.

<P>
If p is -1, process all pages; if p &lt; -1, process only the current
page (cpage); otherwise process only page p. If s &gt;= 0, run only
step s; otherwise run all steps.

<P>
If the flag r is nonzero, start_ocr ignores the current classes (if
any) and recomputes them (this is meaningful only for the
symbol classification step).
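
<P>
The parameter rules above can be summarized in code. This is a hedged
sketch rather than the actual body of start_ocr: the structure
ocr_request and the function decode_start_ocr are hypothetical, the
field names merely echo the control variables described in the
previous section, and the value stored in to_ocr when all pages are
processed (page 0) is an assumption.

```c
/* Hypothetical container for the decoded request. */
struct ocr_request {
    int ocr_all;    /* nonzero: OCR all pages                      */
    int to_ocr;     /* page to process (first page if ocr_all)     */
    int onlystep;   /* step to run, or -1 to run all steps         */
    int recomp_cl;  /* nonzero: ignore the current classes         */
};

/* Decode the parameters p, s and r; cpage is the current page. */
static struct ocr_request decode_start_ocr(int p, int s, int r, int cpage)
{
    struct ocr_request q;

    q.ocr_all = (p == -1);
    if (p == -1)
        q.to_ocr = 0;        /* assumption: start from the first page */
    else if (p < -1)
        q.to_ocr = cpage;    /* only the current page                 */
    else
        q.to_ocr = p;        /* only page p                           */

    q.onlystep = (s >= 0) ? s : -1;
    q.recomp_cl = (r != 0);
    return q;
}
```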

<P>
<P><A NAME=6.><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#79BEC6><FONT SIZE=+1><B>6. How to change the source code (examples)</B></FONT></TD></TR></TABLE>
<A NAME=6.1>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>6.1 How to add a bitmap comparison method</B></FONT></TD></TR></TABLE>
<P>
It's not hard to add a bitmap comparison method to Clara
OCR. This may become very important when the available
heuristics are unable to classify the symbols of some
book, so a new heuristic must be created. In order to exemplify
that, we'll add a naive bitmap comparison method. It'll
just compare the number of black pixels on each bitmap,
and consider that the bitmaps are similar when these
numbers do not differ too much.

<P>
Please remember that the code added or linked to Clara
OCR must be GPL.

<P>
In order to add the new bitmap comparison method, we need
to write a function that compares two bitmaps and returns how
similar they are, add this function as an alternative to
the Options menu, and call it when classifying the page symbols.
We'll perform all these steps for a naive comparison
method, one at a time. The most difficult step is writing
the bitmap comparison function. It is covered in the
subsection "How to write a bitmap comparison function".

<P>
Let's present the other two steps. To add the new method
to the Options menu, we need to:

<P>
a. Declare the macro CL_NBP:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

  #define CL_NBP 2</PRE>
</TD></TR></TABLE></CENTER>
b. Include the new classifier on the tune form (function
mk_tune):

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

  C = (classifier==CL_NBP) ? "CHECKED" : "";
  totext("&lt;BR&gt;&lt;INPUT TYPE=RADIO NAME=C %s VALUE=2&gt;NBP\n",C);</PRE>
</TD></TR></TABLE></CENTER>
Now add the call to this new method to the classifier. This is
just a matter of adding one more item to the function selbc:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

  else if (classifier == CL_NBP)
      r = classify(c,bmpcmp_nbp);</PRE>
</TD></TR></TABLE></CENTER>
Here bmpcmp_nbp is the function that will be discussed
in the subsection "How to write a bitmap comparison function".

<P>
To use the new method, recompile the sources, start
Clara OCR and select the new method on the tune tab.

<P>
<A NAME=6.2>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>6.2 How to write a bitmap comparison function</B></FONT></TD></TR></TABLE>
<P>
The bitmap comparison function required for the example we're
presenting has the following prototype:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

    int bmpcmp_nbp(int c,int st,int k,int d)</PRE>
</TD></TR></TABLE></CENTER>
The first parameter (c) is the symbol being compared, the
second (st) is the current status, the third (k) is the current
pattern and the fourth (d) will be discussed later.

<P>

<P>
Clara OCR will call bmpcmp_nbp once with status 1
every time a new symbol c is chosen, so bmpcmp_nbp
will be able to buffer symbol data in static variables.
Note that to classify each symbol, Clara will
perform several calls to the bitmap
comparison function, because it needs to check X events
(like the STOP button), and, when visual modes are
enabled, Clara will need to refresh the screen
to display the progress of the classification.

<P>
The block of bmpcmp_nbp corresponding to status 1 will
merely store in the static variable nbp the value of
the nbp field of the symbol structure.

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

    nbp = mc[c].nbp;</PRE>
</TD></TR></TABLE></CENTER>
Before trying to classify the symbol c as similar to the
pattern k, Clara allows the bitmap comparison method to
apply simple heuristics to filter out bad candidates in order
to save CPU cycles. This is done by passing status 2. The
bitmap comparison function is expected, in this case, to
return 1 if the pattern was accepted for further
processing, or 0 if it was rejected. For simplicity, the
bmpcmp_nbp function will return 1 in all cases.

<P>
When Clara OCR actually wants to ask whether the pattern
k matches the symbol c, it calls the bitmap comparison
function with status 3. The function must return
a similarity index ranging from 0 (no similarity) to
10 (identity). Now we must take care of the fourth
parameter (d). It tells whether Clara is asking for a direct
(d == 1) or an indirect (d == 0) comparison.
This matters for asymmetric comparison methods. For
instance, when using skeleton fitting, "direct" means
that the pattern skeleton fits the symbol, and "indirect"
means that the symbol skeleton fits the pattern. Clara
will make both calls to avoid false positives.

<P>
In our example we'll return 10 in both cases if the test
5 * abs(nbp-m) &lt;= (nbp+m) holds, and 0 otherwise, where m is the
number of black pixels of the pattern.

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

    m = pattern[k].nbp;
    if ((5*abs(nbp-m)) &lt;= (nbp+m))
        return(10);
    else
        return(0);</PRE>
</TD></TR></TABLE></CENTER>
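<P>
To get a feeling for how strict this test is, note that for two counts
a &gt;= b the inequality 5(a-b) &lt;= a+b is equivalent to 4a &lt;= 6b,
that is, a &lt;= 1.5 b. The bitmaps are therefore accepted when the
larger black pixel count is at most one and a half times the smaller
one. A minimal sketch of the test as a standalone function
(nbp_similarity is a hypothetical name used only for this
illustration):

```c
#include <stdlib.h>

/* Returns 10 when the black pixel counts nbp and m are close
   (larger count at most 1.5 times the smaller), 0 otherwise. */
static int nbp_similarity(int nbp, int m)
{
    return (5 * abs(nbp - m) <= nbp + m) ? 10 : 0;
}
```

For a symbol with 100 black pixels, patterns with 67 to 150 black
pixels are accepted, while patterns with 66 or 151 are rejected.
<P>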
Finally, every bitmap comparison method is expected to
produce a graphic image of the current status of the
comparison when called with status 0. That
image must be an FSxFS bitmap where each pixel may assume
the color WHITE, BLACK or GRAY. This bitmap must be
stored on the cfont array of bytes. The pixel on line i
and column j must be put on cfont[i+j*FS]. In our case
we'll just call the services copy_mc and bm2byte. The
effect is to copy the symbol bitmap to the cfont array:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

    unsigned char mcbm[BMS];
    [...]
    copy_mc(mcbm,c);
    bm2byte(cb,mcbm);</PRE>
</TD></TR></TABLE></CENTER>
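<P>
Putting the four statuses together, the whole function might look like
the sketch below. It is self-contained only because it replaces the
Clara OCR data structures by toy stand-ins: the struct toy_sym, the
tiny value of FS, the contents of mc and pattern, and the use of
memcpy instead of copy_mc and bm2byte are all assumptions made for
this illustration. The return values for statuses 0 and 1 are not
specified in the text and are chosen arbitrarily here.

```c
#include <stdlib.h>
#include <string.h>

#define FS    4   /* toy size; the real FS is defined by Clara OCR */
#define WHITE 0
#define BLACK 1

/* Toy stand-in for both the symbol and the pattern structures. */
struct toy_sym {
    int nbp;                     /* number of black pixels      */
    unsigned char bm[FS * FS];   /* one byte per pixel          */
};

static struct toy_sym mc[2];       /* symbols  */
static struct toy_sym pattern[2];  /* patterns */
static unsigned char cfont[FS * FS];

static int bmpcmp_nbp(int c, int st, int k, int d)
{
    static int nbp;  /* black pixels of the current symbol */
    int m;

    (void)d;  /* the method is symmetric, so direct == indirect */

    switch (st) {
    case 1:   /* new symbol chosen: buffer its data */
        nbp = mc[c].nbp;
        return 1;
    case 2:   /* cheap pre-filter: accept every pattern */
        return 1;
    case 3:   /* real comparison: similarity from 0 to 10 */
        m = pattern[k].nbp;
        return (5 * abs(nbp - m) <= nbp + m) ? 10 : 0;
    case 0:   /* produce the image of the comparison */
        memcpy(cfont, mc[c].bm, sizeof cfont);
        return 1;
    }
    return 0;
}
```

A caller following the protocol described in this subsection would
invoke the function with status 1 once per symbol, then with status 2
and status 3 (for d equal to 1 and to 0) for each candidate pattern.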
<A NAME=6.3>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#E2D3FC><FONT SIZE=+1><B>6.3 How to add an application button</B></FONT></TD></TR></TABLE>
<P>
These are the steps to add a new button:

<P>
1. Increase the value of BUTT_PER_COL by 1, and create a new
button macro after those already existing (bzoom, balpha,
etc). Note that each button macro is defined as a unique index
in the range 0..BUTT_PER_COL-1 (any permutation is ok).

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

  #define BUTT_PER_COL 14
  [...]
  #define bfoo 13</PRE>
</TD></TR></TABLE></CENTER>
2. Define the label for this button (just add the corresponding
entry to the initialization block of the array BL). Multi-state
buttons have multiple labels, specified as
"state1:state2:state3":

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

  char *BL[] = {
      [...]
      "foo"
  };</PRE>
</TD></TR></TABLE></CENTER>
The current state of the button is stored by button[bfoo]. When
the state is nonzero, the button is drawn using a dark
background.

<P>
3. Add a new block to handle this button in mactions_b1 and, if
desired, in mactions_b2 (just copy one existing block and adapt
it). Handling help requests is mandatory. On/off and
multi-state buttons must cycle through the acceptable values of the
respective entry of the array "button" in order to change the
current state, and set the redraw_button flag.

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

  if (i == bfoo) {
      if (help) {
          show_hint(0,"This is the FOO button");
          return;
      }
      show_hint(0,"You pressed the FOO button");
  }</PRE>
</TD></TR></TABLE></CENTER>
There is no need to declare the type of the button (on/off,
multi-state or event catcher). The behaviour is defined by the
label and by the handling block. If the handling block changes
the button state, it must request a redraw. Example:

<P>
<TABLE WIDTH=100%><TR><TD BGCOLOR=#E0E0E0><PRE>

      button[bfoo] = 1 - button[bfoo];
      redraw_button = bfoo;</PRE>
</TD></TR></TABLE></CENTER>
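<P>
For a multi-state button, the same idea extends to cycling through as
many states as the label declares. The sketch below assumes that the
states are numbered from 0 in the order they appear in a label of the
form "state1:state2:state3"; the helpers label_states and next_state
are hypothetical and are not part of the Clara OCR sources. An on/off
button with a single label, like the FOO button above, is still
toggled as shown in the previous example.

```c
/* Number of states declared by a label such as "a:b:c". */
static int label_states(const char *label)
{
    int n = 1;

    for (; *label != '\0'; ++label)
        if (*label == ':')
            ++n;
    return n;
}

/* State following "current", wrapping to 0 after the last one. */
static int next_state(int current, const char *label)
{
    return (current + 1) % label_states(label);
}
```

The handling block would then do something like
button[bfoo] = next_state(button[bfoo], BL[bfoo]) before setting
redraw_button.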
<A NAME=7.>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#79BEC6><FONT SIZE=+1><B>7. Bugs and TODO list</B></FONT></TD></TR></TABLE>
<P>
1. Check if all writings to the mb buffer are done through
snprintf.

<P>
2. Fix asymmetric behaviour of the function "joined".

<P>
3. Optimize some bitmap copies (not bits but words).

<P>
4. Support Multiple OCR zones.

<P>
5. Make sure that the access to the data structures is blocked
during OCR (all functions that change the data structures must
check the value of the flag "ocring").

<P>
6. Use 64-bit integers for bitmap comparisons and support
big-endian CPUs.

<P>
7. Must clear memory before freeing.

<P>
8. Allow the transliterations to refer to multiple acts (partially
done).

<P>
9. Rewrite composition of patterns for classification of linked
symbols.

<P>
10. Vertical segmentation (partially done).

<P>
11. Heuristics to merge fragments.

<P>

<P>
<A NAME=8.>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#79BEC6><FONT SIZE=+1><B>8. AVAILABILITY</B></FONT></TD></TR></TABLE>
<P>
Clara OCR is free software. Its source code is distributed under
the terms of the GNU GPL (General Public License), and is
available at <A HREF=http://www.claraocr.org/>http://www.claraocr.org/</A>. If you don't know what the GPL is,
please read it and check the GPL FAQ at
<A HREF=http://www.gnu.org/copyleft/gpl-faq.html>http://www.gnu.org/copyleft/gpl-faq.html</A>. You should have
received a copy of the GNU General Public License along with this
software; if not, write to the Free Software Foundation, Inc., 59
Temple Place - Suite 330, Boston, MA 02111-1307, USA. The Free
Software Foundation can be found at <A HREF=http://www.fsf.org>http://www.fsf.org</A>.

<P>

<P>
<A NAME=9.>
<P><TABLE BORDER=1 WIDTH=100%><TR><TD BGCOLOR=#79BEC6><FONT SIZE=+1><B>9. CREDITS</B></FONT></TD></TR></TABLE>
<P>
Clara OCR was written by Ricardo Ueda Karpischek. Imre Simon
contributed high-volume tests, discussions with experts,
selection of bibliographic resources, propaganda and many ideas
on how to make the software more useful.

<P>
Ricardo authored various free materials, some included in
Conectiva, Debian, FreeBSD and SuSE (the verb conjugator
"conjugue", the ispell dictionary br.ispell and the proxy
axw3). He recently ported the EiC interpreter to the Psion 5
handheld. Imre Simon promotes the usage and development of free
technologies and information from his research, teaching and
administrative labour at the University.

<P>
Ricardo Ueda Karpischek works as an independent developer and
instructor, and received no financial support to develop Clara
OCR. He's not an employee of any company or organization.

<P>
Roberto Hirata Junior and Marcelo Marcilio Silva contributed
ideas on character isolation and recognition. Richard Stallman
suggested improvements on how to generate HTML output. Marius
Vollmer is helping to add Guile support. Jacques Le Marois helped
with the announcement process. We acknowledge Mike O'Donnell and Junior
Barrera for their good criticism. We acknowledge Peter Lyman for
his remarks about the Berkeley Digital Library, and Wanderley
Antonio Cavassin, Janos Simon and Roberto Marcondes Cesar Junior
for some web and bibliographic pointers. Bruno Barbieri Gnecco
provided hints and explanations about GOCR (main author: Jorg
Schulenburg). Luis Jose Cearra Zabala (author of OCRE) is kindly
supporting our attempts to use portions of his code. Adriano
Nagelschmidt Rodrigues and Carlos Juiti Watanabe carefully tried
the tutorial before the first announcement. Eduardo Marcel Macan
packaged Clara OCR for Debian and suggested some
improvements. Mandrakesoft is hosting claraocr.org. We
acknowledge Conectiva and SuSE for providing copies of their
outstanding distributions. Finally, we acknowledge the late Jose
Hugo de Oliveira Bussab for his interest in our work.

<P>
The fonts used by the "view alphabet map" feature came from
Roman Czyborra's "The ISO 8859 Alphabet Soup" page at
<A HREF=http://czyborra.com/charsets/iso8859.html>http://czyborra.com/charsets/iso8859.html</A>.

<P>
Note: see also the Changelog (<A HREF=http://www.claraocr.org/CHANGELOG>http://www.claraocr.org/CHANGELOG</A>).

<P>
</HR></BODY></HTML>