Sophie

Sophie

distrib > Fedora > 16 > i386 > by-pkgid > da5b00f3f86a431dc18211d71488251d > files > 13

coccinelle-doc-1.0.0-0.rc4.2.fc16.i686.rpm


\section{Examples}
%\label{sec:examples}

This section presents a range of examples.  Each
example is presented along with some C code to which it is
applied. The description explains the rules and the matching process.

\subsection{Function renaming}

One of the primary goals of Coccinelle is to perform software
evolution.  For instance, Coccinelle could be used to perform function
renaming. In the following example, every occurrence of a call to the
function \texttt{foo} is replaced by a call to the
function \texttt{bar}.\\

\begin{tabular}{ccc}
Before & Semantic patch & After \\
\begin{minipage}[t]{.3\linewidth}
\begin{lstlisting}
#DEFINE TEST "foo"

printf("foo");

int main(int i) {
//Test
  int k = foo();

  if(1) {
    foo();
  } else {
    foo();
  }

  foo();
}
\end{lstlisting}
\end{minipage}
&
\begin{minipage}[t]{.3\linewidth}
\begin{lstlisting}[language=Cocci]
@M@@

@@@M


@-- foo()
@++ bar()
\end{lstlisting}
\end{minipage}
&
\begin{minipage}[t]{.3\linewidth}
\begin{lstlisting}
#DEFINE TEST "foo"

printf("foo");

int main(int i) {
//Test
  int k = bar();

  if(1) {
    bar();
  } else {
    bar();
  }

  bar();
}
\end{lstlisting}
\end{minipage}\\
\end{tabular}

\newpage
\subsection{Removing a function argument}

Another important kind of evolution is the introduction or deletion of a
function argument. In the following example, the rule \texttt{rule1} looks
for definitions of functions having return type \texttt{irqreturn\_t} and
two parameters. A second \emph{anonymous} rule then looks for calls to the
previously matched functions that have three arguments. The third argument
is then removed to correspond to the new function prototype.\\

\begin{tabular}{c}
\begin{lstlisting}[language=Cocci,name=arg]
@M@ rule1 @
identifier fn;
identifier irq, dev_id;
typedef irqreturn_t;
@@@M

static irqreturn_t fn (int irq, void *dev_id)
{
   ...
}

@M@@
identifier rule1.fn;
expression E1, E2, E3;
@@@M

 fn(E1, E2
@--  ,E3
   )
\end{lstlisting}\\
\end{tabular}

\vspace{1cm}

\begin{tabular}{c}
  \texttt{drivers/atm/firestream.c} at line 1653 before transformation\\
\begin{lstlisting}[language=PatchC]
static void fs_poll (unsigned long data)
{
        struct fs_dev *dev = (struct fs_dev *) data;

@-        fs_irq (0, dev, NULL);
        dev->timer.expires = jiffies + FS_POLL_FREQ;
        add_timer (&dev->timer);
}
\end{lstlisting}\\
\vspace{1cm}
\\


  \texttt{drivers/atm/firestream.c} at line 1653 after transformation\\
\begin{lstlisting}[language=PatchC]
static void fs_poll (unsigned long data)
{
        struct fs_dev *dev = (struct fs_dev *) data;

@+        fs_irq (0, dev);
        dev->timer.expires = jiffies + FS_POLL_FREQ;
        add_timer (&dev->timer);
}
\end{lstlisting}\\
\end{tabular}

\newpage
\subsection{Introduction of a macro}

To avoid code duplication or error prone code, the kernel provides
macros such as \texttt{BUG\_ON}, \texttt{DIV\_ROUND\_UP} and
\texttt{FIELD\_SIZE}. In these cases, the semantic patches look for
the old code pattern and replace it by the new code.\\

A semantic patch to introduce uses of the \texttt{DIV\_ROUND\_UP} macro
looks for the corresponding expression, \emph{i.e.}, $(n + d - 1) /
d$. When some code is matched, the metavariables \texttt{n} and \texttt{d}
are bound to their corresponding expressions. Finally, Coccinelle rewrites
the code with the \texttt{DIV\_ROUND\_UP} macro using the values bound to
\texttt{n} and \texttt{d}, as illustrated in the patch that follows.\\

\begin{tabular}{c}
Semantic patch to introduce uses of the \texttt{DIV\_ROUND\_UP} macro\\
\begin{lstlisting}[language=Cocci,name=divround]
@M@ haskernel @
@@@M

#include <linux/kernel.h>

@M@ depends on haskernel @
expression n,d;
@@@M

(
@-- (((n) + (d)) - 1) / (d))
@++ DIV_ROUND_UP(n,d)
|
@-- (((n) + ((d) - 1)) / (d))
@++ DIV_ROUND_UP(n,d)
)
\end{lstlisting}
\end{tabular}\\

\vspace{1cm}

\begin{tabular}{c}
Example of a generated patch hunk\\
\begin{lstlisting}[language=PatchC]
@---- a/drivers/atm/horizon.c
@++++ b/drivers/atm/horizon.c
@M@@ -698,7 +698,7 @@ got_it:
                if (bits)
                        *bits = (div<<CLOCK_SELECT_SHIFT) | (pre-1);
                if (actual) {
@--                       *actual = (br + (pre<<div) - 1) / (pre<<div);
@++                       *actual = DIV_ROUND_UP(br, pre<<div);
                        PRINTD (DBG_QOS, "actual rate: %u", *actual);
                }
                return 0;
\end{lstlisting}
\end{tabular}\\

\newpage

The \texttt{BUG\_ON} macro makes a assertion about the value of an
expression. However, because some parts of the kernel define
\texttt{BUG\_ON} to be the empty statement when debugging is not wanted,
care must be taken when the asserted expression may have some side-effects,
as is the case of a function call. Thus, we create a rule introducing
\texttt{BUG\_ON} only in the case when the asserted expression does not
perform a function call.

On particular piece of code that has the form of a function call is a use
of \texttt{unlikely}, which informs the compiler that a particular
expression is unlikely to be true.  In this case, because \texttt{unlikely}
does not perform any side effects, it is safe to use \texttt{BUG\_ON}.  The
second rule takes care of this case.  It furthermore disables the
isomorphism that allows a call to \texttt{unlikely} be replaced with its
argument, as then the second rule would be the same as the first one.\\

\begin{tabular}{c}
\begin{lstlisting}[language=Cocci,name=bugon]
@M@@
expression E,f;
@@@M

(
  if (<+... f(...) ...+>) { BUG(); }
|
@-- if (E) { BUG(); }
@++ BUG_ON(E);
)

@M@ disable unlikely @
expression E,f;
@@@M

(
  if (<+... f(...) ...+>) { BUG(); }
|
@-- if (unlikely(E)) { BUG(); }
@++ BUG_ON(E);
)
\end{lstlisting}\\
\end{tabular}\\

For instance, using the semantic patch above, Coccinelle generates
patches like the following one.

\begin{tabular}{c}
\begin{lstlisting}[language=PatchC]
@---- a/fs/ext3/balloc.c
@++++ b/fs/ext3/balloc.c
@M@@ -232,8 +232,7 @@ restart:
                prev = rsv;
        }
        printk("Window map complete.\n");
@--       if (bad)
@--               BUG();
@++       BUG_ON(bad);
 }
 #define rsv_window_dump(root, verbose) \
        __rsv_window_dump((root), (verbose), __FUNCTION__)
\end{lstlisting}
\end{tabular}

\newpage
\subsection{Look for \texttt{NULL} dereference}

This SmPL match looks for \texttt{NULL} dereferences. Once an
expression has been compared to \texttt{NULL}, a dereference to this
expression is prohibited unless the pointer variable is reassigned.\\

\begin{tabular}{c}
    Original \\

\begin{lstlisting}
foo = kmalloc(1024);
if (!foo) {
  printk ("Error %s", foo->here);
  return;
}
foo->ok = 1;
return;
\end{lstlisting}\\
  \end{tabular}

\vspace{1cm}

\begin{tabular}{c}
  Semantic match\\

\begin{lstlisting}[language=Cocci]
@M@@
expression E, E1;
identifier f;
statement S1,S2,S3;
@@@M

@+* if (E == NULL)
{
  ... when != if (E == NULL) S1 else S2
      when != E = E1
@+* E->f
  ... when any
  return ...;
}
else S3
\end{lstlisting}\\
\end{tabular}

\vspace{1cm}

\begin{tabular}{c}
  Matched lines\\

\begin{lstlisting}[language=PatchC]
foo = kmalloc(1024);
@-if (!foo) {
@-  printk ("Error %s", foo->here);
  return;
}
foo->ok = 1;
return;
\end{lstlisting}\\
\end{tabular}

\newpage
\subsection{Reference counter: the of\_xxx API}

Coccinelle can embed Python code. Python code is used inside special
SmPL rule annotated with \texttt{script:python}.  Python rules inherit
metavariables, such as identifier or token positions, from other SmPL
rules. The inherited metavariables can then be manipulated by Python
code.

The following semantic match looks for a call to the
\texttt{of\_find\_node\_by\_name} function. This call increments a
counter which must be decremented to release the resource. Then, when
there is no call to \texttt{of\_node\_put}, no new assignment to the
\texttt{device\_node} variable \texttt{n} and a \texttt{return}
statement is reached, a bug is detected and the position \texttt{p1}
and \texttt{p2} are initialized. As the Python only depends on the
positions \texttt{p1} and \texttt{p2}, it is evaluated. In the
following case, some emacs Org mode data are produced.  This example
illustrates the various fields that can be accessed in the Python code from
a position variable.

\begin{tabular}{c}
\begin{lstlisting}[language=Cocci,breaklines=true]
@M@ r exists @
local idexpression struct device_node *n;
position p1, p2;
statement S1,S2;
expression E,E1;
@@@M

(
if (!(n@p1 = of_find_node_by_name(...))) S1
|
n@p1 = of_find_node_by_name(...)
)
<... when != of_node_put(n)
    when != if (...) { <+... of_node_put(n) ...+> }
    when != true !n  || ...
    when != n = E
    when != E = n
if (!n || ...) S2
...>
(
  return <+...n...+>;
|
return@p2 ...;
|
n = E1
|
E1 = n
)

@M@ script:python @
p1 << r.p1;
p2 << r.p2;
@@@M

print "* TODO [[view:%s::face=ovl-face1::linb=%s::colb=%s::cole=%s][inc. counter:%s::%s]]" % (p1[0].file,p1[0].line,p1[0].column,p1[0].column_end,p1[0].file,p1[0].line)
print "[[view:%s::face=ovl-face2::linb=%s::colb=%s::cole=%s][return]]" % (p2[0].file,p2[0].line,p2[0].column,p2[0].column_end)
\end{lstlisting}
\end{tabular}


\newpage

Lines 13 to 17 list a variety of constructs that should not appear
between a call to \texttt{of\_find\_node\_by\_name} and a buggy return
site. Examples are a call to \texttt{of\_node\_put} (line 13) and a
transition into the then branch of a conditional testing whether
\texttt{n} is \texttt{NULL} (line 15). Any number of conditionals
testing whether \texttt{n} is \texttt{NULL} are allowed as indicated
by the use of a nest \texttt{<...~~...>} to describe the path between
the call to \texttt{of\_find\_node\_by\_name}, the return and the
conditional in the pattern on line 18.\\

The previously semantic match has been used to generate the following
lines. They may be edited using the emacs Org mode to navigate in the code
from a site to another.

\begin{lstlisting}[language=,breaklines=true]
* TODO [[view:/linux-next/arch/powerpc/platforms/pseries/setup.c::face=ovl-face1::linb=236::colb=18::cole=20][inc. counter:/linux-next/arch/powerpc/platforms/pseries/setup.c::236]]
[[view:/linux-next/arch/powerpc/platforms/pseries/setup.c::face=ovl-face2::linb=250::colb=3::cole=9][return]]
* TODO [[view:/linux-next/arch/powerpc/platforms/pseries/setup.c::face=ovl-face1::linb=236::colb=18::cole=20][inc. counter:/linux-next/arch/powerpc/platforms/pseries/setup.c::236]]
[[view:/linux-next/arch/powerpc/platforms/pseries/setup.c::face=ovl-face2::linb=245::colb=3::cole=9][return]]
\end{lstlisting}

Note~: Coccinelle provides some predefined Python functions,
\emph{i.e.}, \texttt{cocci.print\_main}, \texttt{cocci.print\_sec} and
\texttt{cocci.print\_secs}. One could alternatively write the following
SmPL rule instead of the previously presented one.

\begin{tabular}{c}
\begin{lstlisting}[language=Cocci]
@M@ script:python @
p1 << r.p1;
p2 << r.p2;
@@@M

cocci.print_main(p1)
cocci.print_sec(p2,"return")
\end{lstlisting}
\end{tabular}\\

The function \texttt{cocci.print\_secs} is used when there is several
positions which are matched by a single position variable and that
every matched position should be printed.

Any metavariable could be inherited in the Python code. However,
accessible fields are not currently equally supported among them.

\newpage
\subsection{Filtering identifiers, declarers or iterators with regular expression}

If you consider the following SmPL file which uses the regexp functionality to
filter the identifiers that contain, begin or end by \texttt{foo},

\begin{tabular}{c@{\hspace{2cm}}c}
\begin{lstlisting}[language=Cocci, name=Regexp]
@M@anyid@
type t;
identifier id;
@@@M
t id () {...}

@M@script:python@
x << anyid.id;
@@@M
print "Identifier: %s" % x

@M@contains@
type t;
identifier foo ~= ".*foo";
@@@M
t foo () {...}

@M@script:python@
x << contains.foo;
@@@M
print "Contains foo: %s" % x

\end{lstlisting}
&
\begin{lstlisting}[language=Cocci,name=Regexp]
@M@endsby@
type t;
identifier foo ~= ".*foo$";
@@@M

t foo () {...}

@M@script:python@
x << endsby.foo;
@@@M
print "Ends by foo: %s" % x

@M@beginsby@
type t;
identifier foo ~= "^foo";
@@@M
t foo () {...}

@M@script:python@
x << beginsby.foo;
@@@M
print "Begins by foo: %s" % x
\end{lstlisting}
\end{tabular}\\

and the following C program, on the left, which defines the functions
\texttt{foo}, \texttt{bar}, \texttt{foobar}, \texttt{barfoobar} and
\texttt{barfoo}, you will get the result on the right.

\begin{tabular}{c@{\hspace{2cm}}c}
\begin{lstlisting}
int foo () { return 0; }
int bar () { return 0; }
int foobar () { return 0; }
int barfoobar () { return 0; }
int barfoo () { return 0; }
\end{lstlisting}
&
\begin{lstlisting}
Identifier: foo
Identifier: bar
Identifier: foobar
Identifier: barfoobar
Identifier: barfoo
Contains foo: foo
Contains foo: foobar
Contains foo: barfoobar
Contains foo: barfoo
Ends by foo: foo
Ends by foo: barfoo
Begins by foo: foo
Begins by foo: foobar
\end{lstlisting}
\end{tabular}

% \begin{tabular}{ccc}
% Before & Semantic patch & After \\
% \begin{minipage}[t]{.3\linewidth}
% \begin{lstlisting}
% \end{lstlisting}
% \end{minipage}
% &
% \begin{minipage}[t]{.3\linewidth}
% \begin{lstlisting}[language=Cocci]
% \end{lstlisting}
% \end{minipage}
% &
% \begin{minipage}[t]{.3\linewidth}
% \begin{lstlisting}
% \end{lstlisting}
% \end{minipage}\\
% \end{tabular}

%%% Local Variables:
%%% mode: LaTeX
%%% TeX-master: "main_grammar"
%%% coding: utf-8
%%% TeX-PDF-mode: t
%%% ispell-local-dictionary: "american"
%%% End: