%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%                                                                        %
%  Ocamlviz --- real-time profiling tools for Objective Caml             %
%  Copyright (C) by INRIA - CNRS - Universite Paris Sud                  %
%  Authors: Julien Robert                                                %
%           Guillaume Von Tokarski                                       %
%           Sylvain Conchon                                              %
%           Jean-Christophe Filliatre                                    %
%           Fabrice Le Fessant                                           %
%  GNU Library General Public License version 2                          %
%  See file LICENSE for details                                          %
%                                                                        %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\documentclass[a4paper,12pt]{article}

\usepackage[utf8]{inputenc}
\usepackage{url,fullpage,graphicx,alltt}

\newcommand{\ocaml}{\textsc{Objective Caml}}
\newcommand{\viz}{\textsc{Ocamlviz}}

\title{\viz}
\author{Julien Robert and Guillaume Von Tokarski}

\begin{document}

\maketitle

\tableofcontents

\section{Introduction}

\viz\ is free software funded by Jane Street Capital within the framework of the Jane Street Summer Project.
It monitors \ocaml\ programs and values in real time through the \viz\ library. \viz\ can also be used as a debugging tool.

\section{Installation}

\subsection{Prerequisites}

You need Objective Caml $\ge$ 3.10.0 to compile Ocamlviz.
To compile the GUI, you also need Lablgtk2 $\ge$ 2.10.1-2~\cite{lablgtk}
and Libcairo-ocaml $\ge$ 20070908-1build1~\cite{cairo}. (If one of these libraries is missing, 
the compilation will proceed but the GUI will not be compiled.)

To display trees within the GUI, Graphviz~\cite{graphviz} must be
installed. Graphviz is not required to compile Ocamlviz, however.

\subsection{Compiling from sources}

Within Ocamlviz sources, configure with
\begin{alltt}
  ./configure
\end{alltt}
Compile with
\begin{alltt}
  make
\end{alltt}
As superuser, install with
\begin{alltt}
  make install
\end{alltt}

\section{User Manual}
The documentation can be found at the following address: \url{http://ocamlviz.lri.fr/doc/Monitor_sig.Monitor.html}. It can also be generated with the command ``make doc'' (the index of the documentation is then the file \textit{index.html} inside the folder \textit{doc}).


\subsection{Instrumenting User Code for Monitoring}

Two libraries are provided:
\begin{itemize}
  \item Ocamlviz: This is the main library. It uses alarms to collect and send data.
    If the monitored program also uses alarms, use the Ocamlviz\_threads library instead.
    Ocamlviz may not work properly in native compilation if the monitored program doesn't trigger the GC.
    If this happens, compile the program to bytecode.
  \item Ocamlviz\_threads: This is the library to use if the user program already uses alarms.
    Note that \ocaml\ threads are not efficient, so this solution is only a workaround.
\end{itemize}

You have to call \texttt{Ocamlviz.init} (or \texttt{Ocamlviz\_threads.init}) in your code.

You may use \texttt{Ocamlviz.send\_now ()} (or \texttt{Ocamlviz\_threads.send\_now ()}) to force the data to be sent immediately.

\subsubsection{Module Point}
This module is a checkpoint tool.
A \texttt{Point.observe} annotation counts how many times the program passes through that line of code.
For instance:
\begin{alltt}
  let point1 = Point.create "observe f" 
  let _ = Point.observe point1
\end{alltt}


\subsubsection{Module Time}
This module is a chronograph tool.
The timer that was created can be started and stopped. Note that a stopped timer can be restarted.
For instance:
\begin{alltt}
  let timer1 = Time.create "time in function f"

  let f () =
    begin
      Time.start timer1;

      ...

      Time.stop timer1;
    end

  let g a b c d = a+b+c+d

  let _ = (Time.time "time in function g" g) 1 2 3 4
\end{alltt}

\subsubsection{Module Tag}
The module Tag allows creating sets of \ocaml\ data. The cardinality and the size of these sets can be monitored. A set can contain values of any type. For instance:

\begin{alltt}
  let tag = Tag.create ~size:true ~count:true ~period:1000 "tag example"

  let x = Tag.mark tag (true::[])
  let y = Tag.mark tag (6. +. 1., 6 + 1)
  let z = Tag.mark tag "string" 
\end{alltt}
The set \textit{tag} contains these 3 elements. The size and the cardinality of this set will be monitored.
The period is in milliseconds.
For each tag, \viz\ traverses its elements in the heap. The bigger the elements, the slower the program, so choose the period accordingly.


\subsubsection{Module Value}
This module allows the monitoring of values with the following \ocaml\ types:
\begin{itemize}
\item integers
\item floating point numbers
\item booleans
\item strings
\end{itemize}
\bigskip

For instance:

\begin{alltt}
  let f x = x *. 0.1
  let _ = Value.observe_float_fct ~period:2000 "f 2." (fun () -> f 2.)

  let s = "weak"
  let _ = Value.observe_string "s" s
  let _ = Value.observe_string_fct ~weak:true "fct_s" (fun () -> s)

  let a = Value.observe_int_ref "a" (ref 0)
  
  let b = ref true
  let _ = Value.observe_bool_ref "b" b
\end{alltt}
The argument \texttt{weak} means that the value can be attached to a weak pointer and garbage collected.


\subsubsection{Module Hashtable}
This module is meant to monitor \ocaml\ hash tables.
It monitors the following:
\begin{itemize}
\item hash table length (number of elements inside the table)
\item array length (number of entries of the table)
\item number of empty buckets
\item hash table filling rate
\item longest bucket length
\item mean bucket length
\end{itemize}
\bigskip
For instance:

\begin{alltt}
  let h = Hashtable.observe ~period:1000 "h" (Hashtbl.create 17)
\end{alltt}
\viz\ traverses the whole hash table in the heap. The bigger the table, the slower the program, so choose the period accordingly.
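
The quantities listed above can be approximated with the standard library alone. The following is a minimal sketch, assuming \ocaml\ 4.00 or later for \texttt{Hashtbl.stats}. The function \texttt{summary} and the exact definitions chosen for the filling rate and the mean bucket length are illustrative assumptions, not \viz's own code.
\begin{alltt}
  (* Illustrative only: derive the monitored quantities from Hashtbl.stats. *)
  let summary h =
    let s = Hashtbl.stats h in
    let buckets = s.Hashtbl.num_buckets in
    let empty = s.Hashtbl.bucket_histogram.(0) in
    let used = buckets - empty in
    (* Assumed definitions: rate = non-empty buckets / all buckets,
       mean = number of elements per non-empty bucket. *)
    let rate = float_of_int used /. float_of_int buckets in
    let mean =
      if used = 0 then 0.
      else float_of_int s.Hashtbl.num_bindings /. float_of_int used in
    (s.Hashtbl.num_bindings, buckets, empty, rate,
     s.Hashtbl.max_bucket_length, mean)

  let () =
    let h = Hashtbl.create 17 in
    List.iter (fun k -> Hashtbl.add h k (k * k)) [1; 2; 3; 4; 5];
    let (n, buckets, empty, rate, longest, mean) = summary h in
    Printf.printf "elements=%d buckets=%d empty=%d" n buckets empty;
    Printf.printf " rate=%.2f longest=%d mean=%.2f" rate longest mean;
    print_newline ()
\end{alltt}
Like the monitoring itself, \texttt{Hashtbl.stats} visits every bucket, which is why the cost grows with the size of the table.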

\subsubsection{Module Tree}
This module allows the monitoring of polymorphic variants, once they have been converted to the following type:
\begin{alltt}
  type variant = Node of string * variant list
\end{alltt}
\bigskip

For instance:

\begin{alltt}
  let tree1 = (Protocol.Node ("1",[
                                   Protocol.Node ("1.1",[]);
                                   Protocol.Node ("1.2",[]);
                                  ]))
  
  let _ = Tree.observe "tree1" (fun () -> tree1)
\end{alltt}
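
A user-defined type therefore has to be converted before it can be observed. The sketch below shows one way to write such a conversion. The type \texttt{expr} and the function \texttt{to\_variant} are hypothetical, and the \texttt{variant} type is redeclared locally only so that the example stands alone.
\begin{alltt}
  (* Local copy of the type shown above, so that the sketch stands alone. *)
  type variant = Node of string * variant list

  (* Hypothetical user type: arithmetic expressions. *)
  type expr =
    | Const of int
    | Add of expr * expr
    | Neg of expr

  (* Each constructor becomes a labelled node; its arguments become children. *)
  let rec to_variant e =
    match e with
    | Const n -> Node (string_of_int n, [])
    | Add (a, b) -> Node ("Add", [to_variant a; to_variant b])
    | Neg a -> Node ("Neg", [to_variant a])

  let example = to_variant (Add (Const 1, Neg (Const 2)))
\end{alltt}
The resulting value has the shape expected by \texttt{Tree.observe} in the example above.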


\subsubsection{Log}
This function builds a log and extends it.
Each call stores the formatted string along with the time of the call.

\begin{alltt}
  let _ = log "\%d This is how we use \%s in \%s" 1 "log" "ocamlviz";
          log "\%f It is \%b that log works like ocaml printf functions" 2. true
\end{alltt}
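
A printf-like logging function of this kind can be sketched with \texttt{Printf.ksprintf} from the standard library. This is an illustration of the general technique, not the \viz\ implementation: the \texttt{entries} reference is hypothetical, and the use of \texttt{Sys.time} (processor time) as the timestamp is an assumption.
\begin{alltt}
  (* Illustrative log store: most recent entry first. *)
  let entries : (float * string) list ref = ref []

  (* ksprintf formats the arguments, then hands the string to the callback. *)
  let log fmt =
    Printf.ksprintf
      (fun s -> entries := (Sys.time (), s) :: !entries)
      fmt

  let () =
    log "%d This is how we use %s" 1 "log";
    log "%f is a float and %b is a boolean" 2. true;
    List.iter
      (fun (t, s) -> print_endline (Printf.sprintf "[%.3f] %s" t s))
      (List.rev !entries)
\end{alltt}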

\subsubsection{Kill}
In some modules, there are functions called ``killed''. Calling such a function stops the monitoring of a piece of data. This can be useful if the data won't change anymore and monitoring it is costly.

\subsubsection{Wait\_for\_connected\_clients \& wait\_for\_killed\_clients}
\viz\ provides two functions to block the program execution:
\begin{itemize}
\item wait\_for\_connected\_clients $i$: this suspends the program execution until $i$ clients are connected
\item wait\_for\_killed\_clients (): this suspends the program execution until all clients are disconnected
\end{itemize}

\subsubsection{Automating Instrumentation using Camlp4}
It is possible to instrument a file automatically using \texttt{camlp4}.
For this purpose, a preprocessor called \texttt{pa\_ocamlviz} is provided.
It is used as follows:
\begin{alltt}
  ocamlopt -c -pp "camlp4 pa_o.cmo str.cma pa_ocamlviz.cmo pr_o.cmo" \emph{source_file.ml}
\end{alltt}



This will modify the following top-level instructions:
\begin{itemize}
\item References to integers, floating-point numbers, booleans, and strings
\item Hash tables
\item Functions (time and calls monitoring)
\end{itemize}
\bigskip

When the data of a file called ``file'' are visualized through the GUI, their names take the form ``file\_name''.
For example, a function ``f'' from a file ``g.ml'' will be displayed as ``g\_f''.

\subsection{Linking with Ocamlviz}

To link the user code with Ocamlviz, use
\begin{alltt}
  ocamlc unix.cma libocamlviz.cma \emph{<your files>}
\end{alltt}
in bytecode, and
\begin{alltt}
  ocamlopt unix.cmxa libocamlviz.cmxa \emph{<your files>}
\end{alltt}
in native-code.

\bigskip

Note that \texttt{Ocamlviz.init} (or \texttt{Ocamlviz\_threads.init}) must be called somewhere in the user code.
\bigskip

Once linked with Ocamlviz, the user code acts like a server. The default port used by this server is 51000. 
Another port can be specified using the \texttt{OCAMLVIZ\_PORT} environment variable.
\bigskip

The server's default timer is 0.1 seconds. You can specify another timer by setting the
\texttt{OCAMLVIZ\_PERIOD} environment variable. We advise keeping a timer greater than or equal to 0.1 seconds.
\bigskip

Computing the size of living data in the heap can be expensive and can considerably affect the program execution.
The complexity of this computation is $O(n)$, $n$ being the number of blocks in the heap.
The default period of this computation is 1.0 second. You can specify another period by setting the
\texttt{OCAMLVIZ\_GC\_PERIOD} environment variable. We advise keeping a period greater than or equal to 0.1 seconds.
NB: this doesn't affect the heap's total size, which is obtained at each tick of the server timer.
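
The pattern of an environment variable with a default value can be sketched as follows. Only the variable names and the default values come from the text above. The helper \texttt{env\_or} is hypothetical, and reading the two periods as floating-point numbers of seconds is an assumption, since the way \viz\ itself parses them is not described here.
\begin{alltt}
  (* Read an environment variable, falling back to a default when it is
     unset or cannot be parsed. *)
  let env_or name parse default =
    try parse (Sys.getenv name) with Not_found | Failure _ -> default

  let port = env_or "OCAMLVIZ_PORT" int_of_string 51000
  let period = env_or "OCAMLVIZ_PERIOD" float_of_string 0.1
  let gc_period = env_or "OCAMLVIZ_GC_PERIOD" float_of_string 1.0

  let () =
    Printf.printf "port=%d period=%g gc_period=%g" port period gc_period;
    print_newline ()
\end{alltt}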

\subsection{Visualizing Monitoring Results}
\viz\ provides two clients to visualize the monitored data.

\subsubsection{GUI}

The GUI is launched with
\begin{alltt}
  ocamlviz-gui [options]
\end{alltt}
Command line options are
\begin{description}
\item[\texttt{-server}] to specify the server machine (the default is the local host)
\item[\texttt{-port}] to specify the server port (the default value is 51000)
\end{description}
If no Ocamlviz server is running, the GUI fails with the error message
\begin{alltt}
  connection: couldn't connect to the server machine:port
\end{alltt}
Otherwise, it opens a main window which looks like:
\begin{center}
  \includegraphics[scale=0.5]{gc.png}
\end{center}

The data are displayed in a notebook, in the following pages:
\begin{itemize}
\item Stats: 
  displays \texttt{Point} and \texttt{Time}
\item Values: 
  displays \texttt{Value}
\item Tags: 
  displays \texttt{Tag}
\item Hash tables: 
  displays \texttt{Hashtable}
\item Trees: 
  displays \texttt{Tree}
\item Log: 
  displays the log
\item Gc: 
  displays garbage collector information: the size of the heap and the size of living data in the heap, along with their representation on a graph
\end{itemize}

Some cells contain a second piece of information: the last time the data was modified.
The color of the text can be red (value was killed) or green (value was garbage collected).
Cells can also contain check boxes. Once checked, these boxes let you create graphs and lists in new or existing pages, through the menu ``Visualize in'' or shortcuts.
A list can contain any data, but a graph can only display data of the same type, representing integers, floating-point numbers, percentages or bytes.

It is possible to pause the GUI and even to travel back in time through the record panel. The database stores one minute of data by default, but this can be changed in the preferences menu. The maximum window is one hour.

\bigskip
\begin{center} 
\includegraphics[scale=0.5]{tree.png}
\end{center}
\bigskip
\begin{center} 
\includegraphics[scale=0.5]{hash1.png}
\end{center}
\bigskip
\begin{center} 
\includegraphics[scale=0.5]{hash2.png}
\end{center}



\subsubsection{ASCII Client}
This client logs the monitored data into a file.

The ASCII client is launched with
\begin{alltt}
  ocamlviz-ascii [options]
\end{alltt}
Command line options are
\begin{description}
\item[\texttt{-server}] to specify the server machine (the default is the local host)
\item[\texttt{-port}] to specify the server port (the default value is 51000)
\item[\texttt{-o}] to specify the output file (the default value is \texttt{ascii.log})
\end{description}
If no Ocamlviz server is running, the ASCII client fails with the error message
\begin{alltt}
  connection: couldn't connect to the server machine:port
\end{alltt}

\section{Developer Manual}

\subsection{Source Files}
\begin{itemize}
\item \texttt{ascii.ml}: this is the ASCII client, it writes monitored data into a file
\item \texttt{binary.ml}: contains functions that encode and decode several \ocaml\ types in a buffer
\item \texttt{bproto.ml}: contains the functions that encode and decode the \viz\ messages (see protocol.mli)
\item \texttt{db.ml}: the client database that stores the data and provides functions to access them
\item \texttt{dot.ml}: contains functions that create dot files (graphviz) from a variant (see protocol.mli)
\item \texttt{graph.ml}: a module that creates a graph on a cairo canvas, and functions to manage the graph
\item \texttt{gui\_misc.ml}: contains miscellaneous functions for the GUI
\item \texttt{gui.ml}: the main file of the GUI, containing the main and the functions to build the notebook and export data into graphs and pages
\item \texttt{gui\_models.ml}: contains the functions that create the models and refresh them
\item \texttt{gui\_pref.ml}: contains the functions that create the preferences dialog windows, and manage preferences
\item \texttt{gui\_view.ml}: contains the functions that create the views associated to the models (see gui\_models.ml)
\item \texttt{monitor\_impl.ml}: contains the monitoring API
\item \texttt{net.ml}: contains the client-side network
\item \texttt{ocamlviz.ml}: includes monitor\_impl.ml and contains the server for alarms
\item \texttt{ocamlviz\_threads.ml}: includes monitor\_impl.ml and contains the server for threads
\item \texttt{preflexer.mll}: parses the file called ``preferences'' (if it exists) to apply the user preferences
\item \texttt{protocol.mli}: contains the protocol types
\item \texttt{timemap.ml}: a module to store data in an array and retrieve them in logarithmic time
\item \texttt{tree\_panel.ml}: contains the functions to create and display a tree container
\end{itemize}
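
Logarithmic retrieval from an array, as mentioned for \texttt{timemap.ml}, is commonly obtained by binary search over samples sorted by time. The sketch below illustrates that general technique under this assumption. It is not taken from the module, and \texttt{find\_at} is a hypothetical name.
\begin{alltt}
  (* Samples are (timestamp, value) pairs sorted by increasing timestamp.
     Return the value of the latest sample whose timestamp is <= t. *)
  let find_at samples t =
    let rec search lo hi best =
      if lo > hi then best
      else
        let mid = (lo + hi) / 2 in
        let (ts, v) = samples.(mid) in
        if ts <= t then search (mid + 1) hi (Some v)
        else search lo (mid - 1) best
    in
    search 0 (Array.length samples - 1) None

  let samples = [| (0.0, "a"); (0.5, "b"); (1.0, "c"); (2.5, "d") |]
  let at_1_2 = find_at samples 1.2
  let before_start = find_at samples (-1.0)
\end{alltt}
Each step halves the search interval, so a lookup in $n$ samples takes $O(\log n)$ comparisons.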

\subsection{Protocol}
The protocol is made of three types of messages:
\begin{itemize}
\item Declare, to declare a new tag to a client
\item Send, to send a tag's value (only after this tag was declared)
\item Bind, to bind tags together (optional)
\end{itemize}
\bigskip

These 3 messages have the following structure:

\begin{center}
  \begin{tabular}{|c|c|}
    \hline
    command & arguments \\\hline\hline
    \texttt{Declare} & tag, kind, name \\\hline
    \texttt{Send} & tag, value \\\hline
    \texttt{Bind} & tag list \\\hline
  \end{tabular}
\end{center}
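
The ordering constraint between \texttt{Declare} and \texttt{Send} can be expressed as a small check on a message sequence. The following sketch uses a simplified, hypothetical message type in which kinds and values are plain strings (the real types are in protocol.mli). Requiring that the tags of a \texttt{Bind} be declared beforehand is an additional assumption.
\begin{alltt}
  (* Hypothetical, simplified message type. *)
  type msg =
    | Declare of int * string * string   (* tag, kind, name *)
    | Send of int * string               (* tag, value *)
    | Bind of int list                   (* tags bound together *)

  (* Check that every Send and Bind refers only to tags declared earlier. *)
  let well_formed msgs =
    let declared = Hashtbl.create 16 in
    List.for_all
      (fun m ->
        match m with
        | Declare (tag, _, _) -> Hashtbl.replace declared tag (); true
        | Send (tag, _) -> Hashtbl.mem declared tag
        | Bind tags -> List.for_all (Hashtbl.mem declared) tags)
      msgs

  let ok = well_formed [Declare (1, "Point", "p"); Send (1, "3")]
  let bad = well_formed [Send (2, "3"); Declare (2, "Point", "p")]
\end{alltt}
Here \texttt{ok} is \texttt{true}, whereas \texttt{bad} is \texttt{false} because tag 2 is sent before being declared.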

\subsection{Binary Implementation of the Protocol}

\subsubsection{Tag}
A tag is an integer coded on 2 bytes.

\subsubsection{Kind}
Each kind is assigned an integer, which is coded on 1 byte.

\begin{center}
  \begin{tabular}{|c|c|}
    \hline
    & \texttt{Kind} \\
    \hline
    \hline
    0 & Point \\
    \hline
    1 & Time \\
    \hline
    2 & Value\_int \\
    \hline
    3 & Value\_float \\
    \hline
    4 & Value\_bool \\
    \hline
    5 & Value\_string \\
    \hline
    6 & Tag\_count \\
    \hline
    7 & Tag\_size \\
    \hline
    8 & Special \\
    \hline
    9 & KTree \\
    \hline
    10 & Hash \\
    \hline
    11 & KLog \\
    \hline
  \end{tabular}
\end{center}


\subsubsection{Name}
A name is coded into two parts, the first part being the string's length on 4 bytes, and the second being the string itself on \texttt{length} bytes.

\begin{center}
  \begin{tabular}{|c|c|c|}
    \hline
    \texttt{Bytes} & 4 & $n$ \\
    \hline
    \texttt{Value} & length ($n$) & contents \\\hline
  \end{tabular}
\end{center}
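
As an illustration, the layout above can be produced with the standard \texttt{Buffer} module. This is a sketch rather than the code of binary.ml: the helper names are hypothetical, and the big-endian byte order is an assumption, since the byte order is not specified here.
\begin{alltt}
  (* Append n as a big-endian integer on the given number of bytes.
     Big-endian order is an assumption made for this sketch. *)
  let add_int buf bytes n =
    for i = bytes - 1 downto 0 do
      Buffer.add_char buf (Char.chr ((n lsr (8 * i)) land 0xff))
    done

  (* A name: 4-byte length followed by the characters themselves. *)
  let encode_name s =
    let buf = Buffer.create (4 + String.length s) in
    add_int buf 4 (String.length s);
    Buffer.add_string buf s;
    Buffer.contents buf

  let encoded = encode_name "timer1"
\end{alltt}
The string \texttt{encoded} is 10 bytes long: 4 bytes for the length 6, then the 6 characters of the name.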

\subsubsection{List}
A list is coded into two parts, the first part being the list's length on 2 bytes, and the second being the elements.
The way the elements are coded will depend on their types.

\begin{center}
  \begin{tabular}{|c|c|c|c|c|}
    \hline
    \texttt{Bytes} & 2 & ? & ... & ?\\
    \hline
    \texttt{Value} & length & element \#1 & ... & element \#n \\\hline
  \end{tabular}
\end{center}
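
Since the coding of the elements depends on their type, a list encoder is naturally parameterized by an element encoder. The sketch below shows this for a list of tags, each coded on 2 bytes. The function names are hypothetical and the big-endian byte order is an assumption.
\begin{alltt}
  (* Big-endian integer on a given number of bytes (byte order assumed). *)
  let add_int buf bytes n =
    for i = bytes - 1 downto 0 do
      Buffer.add_char buf (Char.chr ((n lsr (8 * i)) land 0xff))
    done

  (* A list: 2-byte length, then each element written by encode_elt. *)
  let encode_list encode_elt buf l =
    add_int buf 2 (List.length l);
    List.iter (encode_elt buf) l

  (* Example: a list of three tags, each tag being a 2-byte integer. *)
  let encoded_tags =
    let buf = Buffer.create 16 in
    encode_list (fun b tag -> add_int b 2 tag) buf [1; 2; 258];
    Buffer.contents buf
\end{alltt}
The result is 8 bytes long: 2 bytes for the length 3, then 2 bytes per tag.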

\subsubsection{Value}

\begin{itemize}
\item Int

  \bigskip
  \begin{center}
    \begin{tabular}{|c|c|c|}
      \hline
      \texttt{Bytes} & 1 & 4   \\
      \hline
      \texttt{Native Int 31} & 0 & $i$ \\
      \hline
    \end{tabular}
  \end{center}
  \bigskip
  \begin{center}
    \begin{tabular}{|c|c|c|}
      \hline
      \texttt{Bytes} & 1 & 8   \\
      \hline
      \texttt{Native Int 63} & 1 & $i$ \\
      \hline
    \end{tabular}
  \end{center}

\item Float

  \bigskip
  \begin{center}
    \begin{tabular}{|c|c|c|}
      \hline
      \texttt{Bytes} & 1 & 8   \\
      \hline
      \texttt{Float} & 2 & $f$ \\
      \hline
    \end{tabular}
  \end{center}

\item String

  \bigskip
  \begin{center}
    \begin{tabular}{|c|c|c|c|}
      \hline
      \texttt{Bytes} & 1 & 4 & $n$ \\
      \hline
      \texttt{String} & 3 & length ($n$) & $s$ \\\hline
    \end{tabular}
  \end{center}

\item Bool

  \bigskip
  \begin{center}
    \begin{tabular}{|c|c|c|}
      \hline
      \texttt{Bytes} & 1 & 1 \\
      \hline
      \texttt{Bool} & 4 & $b$ \\
      \hline
    \end{tabular}
  \end{center}

\item Int64

  \bigskip
  \begin{center}
    \begin{tabular}{|c|c|c|}
      \hline
      \texttt{Bytes} & 1 & 8 \\
      \hline
      \texttt{Int64} & 5 & $i$ \\
      \hline
    \end{tabular}
  \end{center}

\item Collected

  \bigskip
  \begin{center}
    \begin{tabular}{|c|c|}
      \hline
      \texttt{Bytes} & 1 \\
      \hline
      \texttt{Collected} & 6 \\
      \hline
    \end{tabular}
  \end{center}
  
\item Killed

  \bigskip  
  \begin{center}
    \begin{tabular}{|c|c|}
      \hline
      \texttt{Bytes} & 1 \\
      \hline
      \texttt{Killed} & 7 \\
      \hline
    \end{tabular}
  \end{center}

\item Tree

  \bigskip
  \begin{center}
    \begin{tabular}{|c|c|c|c|}
      \hline
      \texttt{Bytes} & 1 & 1 & ? \\
      \hline
      \texttt{Tree} & 8 & \# nodes & \texttt{Node} \texttt{List} \\
      \hline
    \end{tabular}
  \end{center}
    \bigskip
  \begin{center}
    \begin{tabular}{|c|c|c|c|c|}
      \hline
      \texttt{Bytes} & 4 & length($s$) & 1 & ? \\
      \hline
      \texttt{Node ($s$,$l$)} & length($s$) & $s$ & length($l$) & $l$ (\texttt{Child} \texttt{List}) \\
      \hline
    \end{tabular}
  \end{center}
    \bigskip
  \begin{center}
    \begin{tabular}{|c|c|}
      \hline
      \texttt{Bytes} & 2 \\
      \hline
      \texttt{Child} & index \\
      \hline
    \end{tabular}
  \end{center}

  This encoding preserves sharing: a node referenced several times is encoded only once.

  Tree coding example:
  \begin{alltt}
            A
           / \textbackslash
          B   C
         / \textbackslash
        D   E
  \end{alltt}
   
  \begin{center}
    \begin{tabular}{|c|c|c|}
      \hline
      \texttt{Nodes} & \texttt{Value to code} & \texttt{Meaning} \\
      \hline
      \hline
      & 8 & Tree \\
      \hline
      & 5 & \# nodes \\
      \hline
      0 & 1 & length D \\
      \cline{2-3}
      & D & \\
      \cline{2-3}
      & 0 & 0 children \\
      \hline
      1 & 1 & length E \\
      \cline{2-3}
      & E & \\
      \cline{2-3}
      & 0 & 0 children \\
      \hline
      2 & 1 & length B \\
      \cline{2-3}
      & B & \\
      \cline{2-3}
      & 2 & 2 children \\
      \cline{2-3}
      & 0 & node 0 \\
      \cline{2-3}
      & 1 & node 1 \\
      \hline
      3 & 1 & length C \\
      \cline{2-3}
      & C & \\
      \cline{2-3}
      & 0 & 0 children \\
      \hline
      4 & 1 & length A \\
      \cline{2-3}
      & A & \\
      \cline{2-3}
      & 2 & 2 children \\
      \cline{2-3}
      & 2 & node 2 \\
      \cline{2-3}
      & 3 & node 3 \\
      \hline
    \end{tabular}
  \end{center}

 
\item Hashtable

  \bigskip
  \begin{center}
    \begin{tabular}{|c|c|c|c|c|c|}
      \hline
      \texttt{Bytes} & 1 & 4 & 4 & 4 & 4  \\
      \hline
      \texttt{Hashtable} & 9 & \# entries & \# elements & \# empty buckets & max bucket length \\
      \hline
    \end{tabular}
  \end{center}


\item Log

  \bigskip
  \begin{center}
    \begin{tabular}{|c|c|c|}
      \hline
      \texttt{Bytes} & 1 & ? \\
      \hline
      \texttt{Log} & 10 & \texttt{Float} * \texttt{String} \texttt{List} \\
      \hline
    \end{tabular}
  \end{center}


\end{itemize}
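
The node numbering used in the tree coding example can be reproduced by a post-order traversal, in which children receive their index before their parent. The following sketch is one possible way to obtain it, not the code of bproto.ml. The function \texttt{flatten} is hypothetical, and detecting shared nodes through physical equality is an assumption.
\begin{alltt}
  type variant = Node of string * variant list

  (* Flatten a tree in post-order: children are numbered before their parent,
     and a node already seen (physically equal) is reused, not duplicated. *)
  let flatten root =
    let seen = ref [] in     (* (node, index) pairs, compared with == *)
    let out = ref [] in      (* (label, child indices), newest first *)
    let count = ref 0 in
    let rec visit node =
      match List.filter (fun (n, _) -> n == node) !seen with
      | (_, i) :: _ -> i
      | [] ->
        let Node (label, children) = node in
        let indices = List.map visit children in
        let i = !count in
        incr count;
        seen := (node, i) :: !seen;
        out := (label, indices) :: !out;
        i
    in
    let _ = visit root in
    List.rev !out

  let d = Node ("D", [])
  let e = Node ("E", [])
  let b = Node ("B", [d; e])
  let c = Node ("C", [])
  let table = flatten (Node ("A", [b; c]))
\end{alltt}
For this tree, \texttt{table} lists D, E, B (children 0 and 1), C, and A (children 2 and 3), in the same order as the table of the tree coding example.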

\subsubsection{Command}

\begin{description}
\item[\texttt{Declare}] ~\par

\begin{center}
  \begin{tabular}{|c|c|c|c|c|}
    \hline
    \texttt{Bytes} & 1 & 2 & 1 & ? \\
    \hline
    \texttt{Value} & 0 & tag & kind & string \\
    \hline
  \end{tabular}
\end{center}

\item[\texttt{Send}] ~\par

\begin{center}
  \begin{tabular}{|c|c|c|c|}
    \hline
    \texttt{Bytes} & 1 & 2  & ? \\
    \hline
    \texttt{Value} & 1 & tag & value \\
    \hline
  \end{tabular}
\end{center}

\item[\texttt{Bind}] ~\par

\begin{center}
  \begin{tabular}{|c|c|c|}
    \hline
    \texttt{Bytes} & 1 & ?  \\
    \hline
    \texttt{Value} & 2 & tag list  \\
    \hline
  \end{tabular}
\end{center}


\end{description}
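
To make the layouts concrete, the sketch below builds a \texttt{Declare} message followed by a \texttt{Send} message for a boolean value. The field sizes and the codes follow the tables of this section. The function names are hypothetical, and both the big-endian byte order and the coding of \texttt{true} as 1 are assumptions.
\begin{alltt}
  (* Big-endian integer on a given number of bytes (byte order assumed). *)
  let add_int buf bytes n =
    for i = bytes - 1 downto 0 do
      Buffer.add_char buf (Char.chr ((n lsr (8 * i)) land 0xff))
    done

  (* Declare: command 0, 2-byte tag, 1-byte kind, then the name. *)
  let declare tag kind name =
    let buf = Buffer.create 16 in
    add_int buf 1 0;
    add_int buf 2 tag;
    add_int buf 1 kind;
    add_int buf 4 (String.length name);
    Buffer.add_string buf name;
    Buffer.contents buf

  (* Send of a boolean: command 1, 2-byte tag, value code 4, 1-byte payload.
     Coding true as 1 and false as 0 is an assumption. *)
  let send_bool tag b =
    let buf = Buffer.create 8 in
    add_int buf 1 1;
    add_int buf 2 tag;
    add_int buf 1 4;
    add_int buf 1 (if b then 1 else 0);
    Buffer.contents buf

  (* Tag 7 declared as a Value_bool (kind 4) named "b", then sent as true. *)
  let declared = declare 7 4 "b"
  let sent = send_bool 7 true
\end{alltt}
With these choices, \texttt{declared} is 9 bytes long and \texttt{sent} is 5 bytes long.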

\subsection{Architecture}
\bigskip
This is the architecture of \viz. When a program is monitored, a server is created that sends the binary data over the network to its clients. Each client decodes the binary data it receives and stores it in its own database.
\bigskip
\begin{center} 
\includegraphics{archi.mps}
\end{center}

\begin{thebibliography}{99}

\bibitem{lablgtk}
Jacques Garrigue,
\emph{Lablgtk}, an \ocaml\ interface to Gtk+

\url{http://wwwfun.kurims.kyoto-u.ac.jp/soft/lsl/lablgtk.html}


\bibitem{cairo}
\emph{Cairo}, a 2D graphics library with support for multiple output devices

\url{http://cairographics.org/cairo-ocaml/}

\bibitem{graphviz}
\emph{Graphviz}, an open source graph visualization software

\url{http://www.graphviz.org/}

\end{thebibliography}

\end{document}


%%% Local Variables:                                                            
%%% mode: latex                                                                 
%%% mode: whizzytex                                                             
%%% mode: flyspell                                                              
%%% ispell-local-dictionary: "francais-latin1"                                  
%%% End: