<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>Parallel Algorithms &mdash; PyOpenCL 2013.1 documentation</title>
    
    <link rel="stylesheet" href="_static/default.css" type="text/css" />
    <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
    <link rel="stylesheet" href="_static/akdoc.css" type="text/css" />
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    './',
        VERSION:     '2013.1',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  false
      };
    </script>
    <script type="text/javascript" src="_static/jquery.js"></script>
    <script type="text/javascript" src="_static/underscore.js"></script>
    <script type="text/javascript" src="_static/doctools.js"></script>
    <link rel="top" title="PyOpenCL 2013.1 documentation" href="index.html" />
    <link rel="next" title="Built-in Utilities" href="tools.html" />
    <link rel="prev" title="Multi-dimensional arrays" href="array.html" /> 
  </head>
  <body>
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="genindex.html" title="General Index"
             accesskey="I">index</a></li>
        <li class="right" >
          <a href="py-modindex.html" title="Python Module Index"
             >modules</a> |</li>
        <li class="right" >
          <a href="tools.html" title="Built-in Utilities"
             accesskey="N">next</a> |</li>
        <li class="right" >
          <a href="array.html" title="Multi-dimensional arrays"
             accesskey="P">previous</a> |</li>
        <li><a href="index.html">PyOpenCL 2013.1 documentation</a> &raquo;</li> 
      </ul>
    </div>  

    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body">
            
  <div class="section" id="parallel-algorithms">
<h1>Parallel Algorithms<a class="headerlink" href="#parallel-algorithms" title="Permalink to this headline">¶</a></h1>
<div class="section" id="module-pyopencl.elementwise">
<span id="element-wise-expression-evalution-map"></span><h2>Element-wise expression evaluation (&#8220;map&#8221;)<a class="headerlink" href="#module-pyopencl.elementwise" title="Permalink to this headline">¶</a></h2>
<p>Evaluating involved expressions on <tt class="xref py py-class docutils literal"><span class="pre">pyopencl.array.Array</span></tt> instances by
using overloaded operators can be somewhat inefficient, because a new temporary
is created for each intermediate result. The functionality in the module
<a class="reference internal" href="#module-pyopencl.elementwise" title="pyopencl.elementwise"><tt class="xref py py-mod docutils literal"><span class="pre">pyopencl.elementwise</span></tt></a> contains tools to help generate kernels that
evaluate multi-stage expressions on one or several operands in a single pass.</p>
<p>Here&#8217;s a usage example:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">pyopencl</span> <span class="kn">as</span> <span class="nn">cl</span>
<span class="kn">import</span> <span class="nn">pyopencl.array</span> <span class="kn">as</span> <span class="nn">cl_array</span>
<span class="kn">import</span> <span class="nn">numpy</span>

<span class="n">ctx</span> <span class="o">=</span> <span class="n">cl</span><span class="o">.</span><span class="n">create_some_context</span><span class="p">()</span>
<span class="n">queue</span> <span class="o">=</span> <span class="n">cl</span><span class="o">.</span><span class="n">CommandQueue</span><span class="p">(</span><span class="n">ctx</span><span class="p">)</span>

<span class="n">n</span> <span class="o">=</span> <span class="mi">10</span>
<span class="n">a_gpu</span> <span class="o">=</span> <span class="n">cl_array</span><span class="o">.</span><span class="n">to_device</span><span class="p">(</span>
        <span class="n">queue</span><span class="p">,</span> <span class="n">numpy</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">randn</span><span class="p">(</span><span class="n">n</span><span class="p">)</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="n">numpy</span><span class="o">.</span><span class="n">float32</span><span class="p">))</span>
<span class="n">b_gpu</span> <span class="o">=</span> <span class="n">cl_array</span><span class="o">.</span><span class="n">to_device</span><span class="p">(</span>
        <span class="n">queue</span><span class="p">,</span> <span class="n">numpy</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">randn</span><span class="p">(</span><span class="n">n</span><span class="p">)</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="n">numpy</span><span class="o">.</span><span class="n">float32</span><span class="p">))</span>

<span class="kn">from</span> <span class="nn">pyopencl.elementwise</span> <span class="kn">import</span> <span class="n">ElementwiseKernel</span>
<span class="n">lin_comb</span> <span class="o">=</span> <span class="n">ElementwiseKernel</span><span class="p">(</span><span class="n">ctx</span><span class="p">,</span>
        <span class="s">&quot;float a, float *x, &quot;</span>
        <span class="s">&quot;float b, float *y, &quot;</span>
        <span class="s">&quot;float *z&quot;</span><span class="p">,</span>
        <span class="s">&quot;z[i] = a*x[i] + b*y[i]&quot;</span><span class="p">,</span>
        <span class="s">&quot;linear_combination&quot;</span><span class="p">)</span>

<span class="n">c_gpu</span> <span class="o">=</span> <span class="n">cl_array</span><span class="o">.</span><span class="n">empty_like</span><span class="p">(</span><span class="n">a_gpu</span><span class="p">)</span>
<span class="n">lin_comb</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="n">a_gpu</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="n">b_gpu</span><span class="p">,</span> <span class="n">c_gpu</span><span class="p">)</span>

<span class="kn">import</span> <span class="nn">numpy.linalg</span> <span class="kn">as</span> <span class="nn">la</span>
<span class="k">assert</span> <span class="n">la</span><span class="o">.</span><span class="n">norm</span><span class="p">((</span><span class="n">c_gpu</span> <span class="o">-</span> <span class="p">(</span><span class="mi">5</span><span class="o">*</span><span class="n">a_gpu</span><span class="o">+</span><span class="mi">6</span><span class="o">*</span><span class="n">b_gpu</span><span class="p">))</span><span class="o">.</span><span class="n">get</span><span class="p">())</span> <span class="o">&lt;</span> <span class="mf">1e-5</span>
</pre></div>
</div>
<p>(You can find this example as <tt class="file docutils literal"><span class="pre">examples/demo_elementwise.py</span></tt> in the PyOpenCL
distribution.)</p>
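<p>The cost of temporaries can be pictured on the host with plain NumPy. The following is a minimal sketch, not PyOpenCL code: both function names are made up for illustration, and the explicit Python loop only mirrors the structure of the single-pass kernel body (it is slow when run on the host).</p>
<div class="highlight-python"><div class="highlight"><pre>
```python
import numpy as np


def lin_comb_with_temporaries(a, x, b, y):
    """Operator-style evaluation: every intermediate result is a new array."""
    t1 = a * x        # first temporary
    t2 = b * y        # second temporary
    return t1 + t2    # a third array holds the final result


def lin_comb_single_pass(a, x, b, y):
    """Single pass over the data, mirroring "z[i] = a*x[i] + b*y[i]"."""
    z = np.empty_like(x)
    for i in range(len(x)):
        z[i] = a * x[i] + b * y[i]
    return z


x = np.arange(4, dtype=np.float32)
y = np.ones(4, dtype=np.float32)

# Both strategies produce the same values; only the memory traffic differs.
assert np.allclose(lin_comb_with_temporaries(5, x, 6, y),
                   lin_comb_single_pass(5, x, 6, y))
```
</pre></div>
</div>
<p>On a device, the single-pass form reads each operand once and writes the result once, which is the pattern that the operand string and operation string passed to <tt class="docutils literal"><span class="pre">ElementwiseKernel</span></tt> describe.</p>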
</div>
<div class="section" id="module-pyopencl.reduction">
<span id="sums-and-counts-reduce"></span><span id="custom-reductions"></span><h2>Sums and counts (&#8220;reduce&#8221;)<a class="headerlink" href="#module-pyopencl.reduction" title="Permalink to this headline">¶</a></h2>
<dl class="class">
<dt id="pyopencl.reduction.ReductionKernel">
<em class="property">class </em><tt class="descclassname">pyopencl.reduction.</tt><tt class="descname">ReductionKernel</tt><big>(</big><em>ctx</em>, <em>dtype_out</em>, <em>neutral</em>, <em>reduce_expr</em>, <em>map_expr=None</em>, <em>arguments=None</em>, <em>name=&quot;reduce_kernel&quot;</em>, <em>options=[]</em>, <em>preamble=&quot;&quot;</em><big>)</big><a class="headerlink" href="#pyopencl.reduction.ReductionKernel" title="Permalink to this definition">¶</a></dt>
<dd><p>Generate a kernel that takes a number of scalar or vector <em>arguments</em>
(at least one vector argument), performs the <em>map_expr</em> on each entry of
the vector argument and then the <em>reduce_expr</em> on the outcome of that.
<em>neutral</em> serves as an initial value. <em>preamble</em> can hold
preprocessor directives and other code (such as helper functions) that is
inserted before the actual reduction kernel code.</p>
<p>Vectors in <em>map_expr</em> should be indexed by the variable <em>i</em>. <em>reduce_expr</em>
uses the formal values &#8220;a&#8221; and &#8220;b&#8221; to indicate two operands of a binary
reduction operation. If you do not specify a <em>map_expr</em>, &#8220;in[i]&#8221; &#8211; and
therefore the presence of only one input argument &#8211; is automatically
assumed.</p>
<p><em>dtype_out</em> specifies the <a class="reference external" href="http://docs.scipy.org/doc/numpy/reference/generated/numpy.dtype.html#numpy.dtype" title="(in NumPy v1.8)"><tt class="xref py py-class docutils literal"><span class="pre">numpy.dtype</span></tt></a> in which the reduction is
performed and in which the result is returned. <em>neutral</em> is specified as
float or integer formatted as string. <em>reduce_expr</em> and <em>map_expr</em> are
specified as string formatted operations and <em>arguments</em> is specified as a
string formatted as a C argument list. <em>name</em> specifies the name under which
the kernel is compiled. <em>options</em> are passed unmodified to
<a class="reference internal" href="runtime.html#pyopencl.Program.build" title="pyopencl.Program.build"><tt class="xref py py-meth docutils literal"><span class="pre">pyopencl.Program.build()</span></tt></a>. <em>preamble</em> specifies a string of code that
is inserted before the actual kernels.</p>
<dl class="method">
<dt id="pyopencl.reduction.ReductionKernel.__call__">
<tt class="descname">__call__</tt><big>(</big><em>*args</em>, <em>queue=None</em>, <em>wait_for=None</em>, <em>return_event=False</em><big>)</big><a class="headerlink" href="#pyopencl.reduction.ReductionKernel.__call__" title="Permalink to this definition">¶</a></dt>
<dd><p><em>wait_for</em>
may either be <em>None</em> or a list of <a class="reference internal" href="runtime.html#pyopencl.Event" title="pyopencl.Event"><tt class="xref py py-class docutils literal"><span class="pre">pyopencl.Event</span></tt></a> instances for
whose completion this command waits before starting execution.</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">the resulting scalar as a single-entry <tt class="xref py py-class docutils literal"><span class="pre">pyopencl.array.Array</span></tt>
if <em>return_event</em> is <em>False</em>, otherwise a tuple <tt class="docutils literal"><span class="pre">(scalar_array,</span> <span class="pre">event)</span></tt>.</td>
</tr>
</tbody>
</table>
<div class="admonition note">
<p class="first admonition-title">Note</p>
<p class="last">The returned <a class="reference internal" href="runtime.html#pyopencl.Event" title="pyopencl.Event"><tt class="xref py py-class docutils literal"><span class="pre">pyopencl.Event</span></tt></a> corresponds only to part of the
execution of the reduction. It is not suitable for profiling.</p>
</div>
</dd></dl>

</dd></dl>

<p>Here&#8217;s a usage example:</p>
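<div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">numpy</span>
<span class="kn">import</span> <span class="nn">pyopencl.array</span>
<span class="kn">from</span> <span class="nn">pyopencl.reduction</span> <span class="kn">import</span> <span class="n">ReductionKernel</span>

<span class="c"># ctx and queue are a context and command queue, created as in the element-wise example</span>
<span class="n">a</span> <span class="o">=</span> <span class="n">pyopencl</span><span class="o">.</span><span class="n">array</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="n">queue</span><span class="p">,</span> <span class="mi">400</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">numpy</span><span class="o">.</span><span class="n">float32</span><span class="p">)</span>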
<span class="n">b</span> <span class="o">=</span> <span class="n">pyopencl</span><span class="o">.</span><span class="n">array</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="n">queue</span><span class="p">,</span> <span class="mi">400</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">numpy</span><span class="o">.</span><span class="n">float32</span><span class="p">)</span>

<span class="n">krnl</span> <span class="o">=</span> <span class="n">ReductionKernel</span><span class="p">(</span><span class="n">ctx</span><span class="p">,</span> <span class="n">numpy</span><span class="o">.</span><span class="n">float32</span><span class="p">,</span> <span class="n">neutral</span><span class="o">=</span><span class="s">&quot;0&quot;</span><span class="p">,</span>
        <span class="n">reduce_expr</span><span class="o">=</span><span class="s">&quot;a+b&quot;</span><span class="p">,</span> <span class="n">map_expr</span><span class="o">=</span><span class="s">&quot;x[i]*y[i]&quot;</span><span class="p">,</span>
        <span class="n">arguments</span><span class="o">=</span><span class="s">&quot;__global float *x, __global float *y&quot;</span><span class="p">)</span>

<span class="n">my_dot_prod</span> <span class="o">=</span> <span class="n">krnl</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span><span class="o">.</span><span class="n">get</span><span class="p">()</span>
</pre></div>
</div>
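<p>The roles of <em>neutral</em>, <em>map_expr</em> and <em>reduce_expr</em> can be modelled on the host with the standard library. This is only a sequential sketch under stated assumptions: <tt class="docutils literal"><span class="pre">reduce_on_host</span></tt> is a hypothetical helper, not part of PyOpenCL, and it works with Python integers instead of <tt class="docutils literal"><span class="pre">float32</span></tt> device data.</p>
<div class="highlight-python"><div class="highlight"><pre>
```python
from functools import reduce


def reduce_on_host(neutral, reduce_fn, map_fn, n):
    """Apply map_fn to every index i, then fold the results with reduce_fn."""
    mapped = (map_fn(i) for i in range(n))
    return reduce(reduce_fn, mapped, neutral)


x = list(range(400))
y = list(range(400))

dot = reduce_on_host(
    neutral=0,                       # role of neutral="0"
    reduce_fn=lambda a, b: a + b,    # role of reduce_expr="a+b"
    map_fn=lambda i: x[i] * y[i],    # role of map_expr="x[i]*y[i]"
    n=len(x))

# The fold agrees with a direct sum of products.
assert dot == sum(xi * yi for xi, yi in zip(x, y))
```
</pre></div>
</div>
<p>A parallel reduction combines partial results in a different order than this left-to-right fold, so <em>reduce_expr</em> should be associative, and floating-point results may differ slightly from a sequential sum.</p>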
</div>
<div class="section" id="module-pyopencl.scan">
<span id="prefix-sums-scan"></span><span id="custom-scan"></span><h2>Prefix Sums (&#8220;scan&#8221;)<a class="headerlink" href="#module-pyopencl.scan" title="Permalink to this headline">¶</a></h2>
<p>A prefix sum is a running sum of an array, as provided by
e.g. <tt class="xref py py-mod docutils literal"><span class="pre">numpy.cumsum</span></tt>:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="gp">&gt;&gt;&gt; </span><span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">a</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">]</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">np</span><span class="o">.</span><span class="n">cumsum</span><span class="p">(</span><span class="n">a</span><span class="p">)</span>
<span class="go">array([ 1,  2,  3,  4,  5,  7,  9, 11, 13, 15])</span>
</pre></div>
</div>
<p>This is a very simple example of what a scan can do. It turns out that scans
are significantly more versatile. They are a basic building block of many
non-trivial parallel algorithms. Many of the operations enabled by scans seem
difficult to parallelize because of loop-carried dependencies.</p>
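<p>To see why the loop-carried dependency is not a fundamental obstacle, here is a minimal NumPy sketch of a Hillis-Steele style scan, in which every step is a whole-array operation and only about log2(n) steps are needed. It is an illustration of the general idea and is not claimed to be the algorithm PyOpenCL uses; the function name is made up.</p>
<div class="highlight-python"><div class="highlight"><pre>
```python
import numpy as np


def inclusive_scan_in_steps(values):
    """Inclusive prefix sum built from data-parallel shift-and-add steps."""
    out = np.array(values, copy=True)
    shift = 1
    while len(out) > shift:
        # Every element adds the element `shift` positions to its left.
        # All additions within one step are independent of each other.
        shifted = np.zeros_like(out)
        shifted[shift:] = out[:-shift]
        out = out + shifted
        shift *= 2
    return out


a = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
assert (inclusive_scan_in_steps(a) == np.cumsum(a)).all()
```
</pre></div>
</div>
<p>This variant performs more additions in total than the sequential loop; it trades extra work for a short chain of dependent steps.</p>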
<div class="admonition seealso">
<p class="first admonition-title">See also</p>
<dl class="last docutils">
<dt><a class="reference external" href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.128.6230">Prefix sums and their applications</a>, by Guy Blelloch.</dt>
<dd>This article gives an overview of some surprising applications of scans.</dd>
<dt><a class="reference internal" href="#predefined-scans"><em>Simple / Legacy Interface</em></a></dt>
<dd>These operations built into PyOpenCL are realized using <tt class="xref py py-class docutils literal"><span class="pre">GenericScanKernel</span></tt>.</dd>
</dl>
</div>
<div class="section" id="usage-example">
<h3>Usage Example<a class="headerlink" href="#usage-example" title="Permalink to this headline">¶</a></h3>
<p>This example illustrates the implementation of a simplified version of <tt class="xref py py-func docutils literal"><span class="pre">copy_if()</span></tt>,
which copies integers from an array into the (variable-size) output if they are
greater than 300:</p>
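<div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="kn">from</span> <span class="nn">pyopencl.scan</span> <span class="kn">import</span> <span class="n">GenericScanKernel</span>

<span class="c"># a is assumed to be an existing int32 pyopencl.array.Array</span>
<span class="n">knl</span> <span class="o">=</span> <span class="n">GenericScanKernel</span><span class="p">(</span>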
        <span class="n">ctx</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">int32</span><span class="p">,</span>
        <span class="n">arguments</span><span class="o">=</span><span class="s">&quot;__global int *ary, __global int *out&quot;</span><span class="p">,</span>
        <span class="n">input_expr</span><span class="o">=</span><span class="s">&quot;(ary[i] &gt; 300) ? 1 : 0&quot;</span><span class="p">,</span>
        <span class="n">scan_expr</span><span class="o">=</span><span class="s">&quot;a+b&quot;</span><span class="p">,</span> <span class="n">neutral</span><span class="o">=</span><span class="s">&quot;0&quot;</span><span class="p">,</span>
        <span class="n">output_statement</span><span class="o">=</span><span class="s">&quot;&quot;&quot;</span>
<span class="s">            if (prev_item != item) out[item-1] = ary[i];</span>
<span class="s">            &quot;&quot;&quot;</span><span class="p">)</span>

<span class="n">out</span> <span class="o">=</span> <span class="n">a</span><span class="o">.</span><span class="n">copy</span><span class="p">()</span>
<span class="n">knl</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">out</span><span class="p">)</span>

<span class="n">a_host</span> <span class="o">=</span> <span class="n">a</span><span class="o">.</span><span class="n">get</span><span class="p">()</span>
<span class="n">out_host</span> <span class="o">=</span> <span class="n">a_host</span><span class="p">[</span><span class="n">a_host</span> <span class="o">&gt;</span> <span class="mi">300</span><span class="p">]</span>

<span class="k">assert</span> <span class="p">(</span><span class="n">out_host</span> <span class="o">==</span> <span class="n">out</span><span class="o">.</span><span class="n">get</span><span class="p">()[:</span><span class="nb">len</span><span class="p">(</span><span class="n">out_host</span><span class="p">)])</span><span class="o">.</span><span class="n">all</span><span class="p">()</span>
</pre></div>
</div>
<p>The value being scanned over is a number of flags indicating whether each array
element is greater than 300. These flags are computed by <em>input_expr</em>. The
prefix sum over this array gives a running count of array items greater than
300. The <em>output_statement</em> then compares <cite>prev_item</cite> (the previous item&#8217;s scan
result, i.e. index) to <cite>item</cite> (the current item&#8217;s scan result, i.e.
index). If they differ, i.e. if the predicate was satisfied at this
position, then the item is stored in the output at the computed index.</p>
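<p>The interplay of the flags, the running count and the comparison of <cite>prev_item</cite> with <cite>item</cite> can be traced with a sequential host-side model. This is a sketch only: <tt class="docutils literal"><span class="pre">copy_if_on_host</span></tt> is a hypothetical helper, and it assumes that <cite>item</cite> is the inclusive scan result and that <cite>prev_item</cite> equals the neutral element at the first position.</p>
<div class="highlight-python"><div class="highlight"><pre>
```python
import numpy as np


def copy_if_on_host(ary, threshold=300):
    """Sequential model of the scan-based copy_if described above."""
    flags = (ary > threshold).astype(np.int32)   # role of input_expr
    scan = np.cumsum(flags)                      # inclusive scan with "a+b"
    out = np.zeros_like(ary)
    prev_item = 0                                # neutral element
    for i in range(len(ary)):
        item = scan[i]                           # running count up to and including i
        if prev_item != item:                    # role of output_statement
            out[item - 1] = ary[i]
        prev_item = item
    return out, int(prev_item)                   # output buffer, number of items kept


ary = np.array([100, 400, 250, 301, 300, 999], dtype=np.int32)
out, count = copy_if_on_host(ary)

# Only the first `count` entries of the output are meaningful.
assert (out[:count] == ary[ary > 300]).all()
```
</pre></div>
</div>
<p>Because the running count increases exactly at the positions where the predicate holds, each kept element gets a distinct output index and no two writes collide.</p>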
<p>This example does not make use of the following advanced features also available
in PyOpenCL:</p>
<ul class="simple">
<li>Segmented scans</li>
<li>Access to the previous item in <em>input_expr</em> (e.g. for comparisons).
See the <a class="reference external" href="https://github.com/inducer/pyopencl/blob/master/pyopencl/scan.py#L1353">implementation</a> of <tt class="xref py py-func docutils literal"><span class="pre">unique()</span></tt> for an example.</li>
</ul>
</div>
<div class="section" id="making-custom-scan-kernels">
<h3>Making Custom Scan Kernels<a class="headerlink" href="#making-custom-scan-kernels" title="Permalink to this headline">¶</a></h3>
<div class="section" id="debugging-aids">
<h4>Debugging aids<a class="headerlink" href="#debugging-aids" title="Permalink to this headline">¶</a></h4>
<dl class="class">
<dt id="pyopencl.scan.GenericDebugScanKernel">
<em class="property">class </em><tt class="descclassname">pyopencl.scan.</tt><tt class="descname">GenericDebugScanKernel</tt><a class="headerlink" href="#pyopencl.scan.GenericDebugScanKernel" title="Permalink to this definition">¶</a></dt>
<dd><p>Performs the same function and has the same interface as
<tt class="xref py py-class docutils literal"><span class="pre">GenericScanKernel</span></tt>, but uses a dead-simple, sequential scan.  Works
best on CPU platforms, and helps isolate bugs in scans by removing the
potential for issues originating in parallel execution.</p>
</dd></dl>

</div>
</div>
<div class="section" id="simple-legacy-interface">
<span id="predefined-scans"></span><h3>Simple / Legacy Interface<a class="headerlink" href="#simple-legacy-interface" title="Permalink to this headline">¶</a></h3>
<dl class="class">
<dt id="pyopencl.scan.ExclusiveScanKernel">
<em class="property">class </em><tt class="descclassname">pyopencl.scan.</tt><tt class="descname">ExclusiveScanKernel</tt><big>(</big><em>ctx</em>, <em>dtype</em>, <em>scan_expr</em>, <em>neutral</em>, <em>name_prefix=&quot;scan&quot;</em>, <em>options=[]</em>, <em>preamble=&quot;&quot;</em>, <em>devices=None</em><big>)</big><a class="headerlink" href="#pyopencl.scan.ExclusiveScanKernel" title="Permalink to this definition">¶</a></dt>
<dd><p>Generates a kernel that can compute a <a class="reference external" href="https://secure.wikimedia.org/wikipedia/en/wiki/Prefix_sum">prefix sum</a>
using any associative operation given as <em>scan_expr</em>.
<em>scan_expr</em> uses the formal values &#8220;a&#8221; and &#8220;b&#8221; to indicate two operands of
an associative binary operation. <em>neutral</em> is the neutral element
of <em>scan_expr</em>, obeying <em>scan_expr(a, neutral) == a</em>.</p>
<p><em>dtype</em> specifies the type of the arrays being operated on.
<em>name_prefix</em> is used for kernel names to ensure recognizability
in profiles and logs. <em>options</em> is a list of compiler options to use
when building. <em>preamble</em> specifies a string of code that is
inserted before the actual kernels. <em>devices</em> may be used to restrict
the set of devices on which the kernel is meant to run (defaults
to all devices in the context <em>ctx</em>).</p>
<dl class="method">
<dt id="pyopencl.scan.ExclusiveScanKernel.__call__">
<tt class="descname">__call__</tt><big>(</big><em>self</em>, <em>input_ary</em>, <em>output_ary=None</em>, <em>allocator=None</em>, <em>queue=None</em><big>)</big><a class="headerlink" href="#pyopencl.scan.ExclusiveScanKernel.__call__" title="Permalink to this definition">¶</a></dt>
<dd></dd></dl>

</dd></dl>

<dl class="class">
<dt id="pyopencl.scan.InclusiveScanKernel">
<em class="property">class </em><tt class="descclassname">pyopencl.scan.</tt><tt class="descname">InclusiveScanKernel</tt><big>(</big><em>ctx</em>, <em>dtype</em>, <em>scan_expr</em>, <em>neutral=None</em>, <em>name_prefix=&quot;scan&quot;</em>, <em>options=[]</em>, <em>preamble=&quot;&quot;</em>, <em>devices=None</em><big>)</big><a class="headerlink" href="#pyopencl.scan.InclusiveScanKernel" title="Permalink to this definition">¶</a></dt>
<dd><p>Works like <a class="reference internal" href="#pyopencl.scan.ExclusiveScanKernel" title="pyopencl.scan.ExclusiveScanKernel"><tt class="xref py py-class docutils literal"><span class="pre">ExclusiveScanKernel</span></tt></a>.</p>
<div class="versionchanged">
<p><span class="versionmodified">Changed in version 2013.1: </span><em>neutral</em> is now always required.</p>
</div>
</dd></dl>

<p>For the array <cite>[1,2,3]</cite>, inclusive scan results in <cite>[1,3,6]</cite>, and exclusive
scan results in <cite>[0,1,3]</cite>.</p>
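<p>The difference between the two variants can be checked on the host with NumPy. This is a minimal sketch for the additive case; the two function names are made up for illustration and are not part of PyOpenCL.</p>
<div class="highlight-python"><div class="highlight"><pre>
```python
import numpy as np


def inclusive_scan(values):
    """Entry i is the sum of values[0] through values[i]."""
    return np.cumsum(values)


def exclusive_scan(values, neutral=0):
    """Entry i is the sum of values[0] through values[i-1]; entry 0 is neutral."""
    values = np.asarray(values)
    # Shift the input right by one, filling the gap with the neutral element.
    shifted = np.concatenate(([neutral], values))[:len(values)]
    return np.cumsum(shifted)


assert inclusive_scan([1, 2, 3]).tolist() == [1, 3, 6]
assert exclusive_scan([1, 2, 3]).tolist() == [0, 1, 3]
```
</pre></div>
</div>
<p>The exclusive result is the inclusive result shifted right by one position, with the neutral element in the first slot.</p>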
<p>Here&#8217;s a usage example:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="n">knl</span> <span class="o">=</span> <span class="n">InclusiveScanKernel</span><span class="p">(</span><span class="n">ctx</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">int32</span><span class="p">,</span> <span class="s">&quot;a+b&quot;</span><span class="p">,</span> <span class="n">neutral</span><span class="o">=</span><span class="s">&quot;0&quot;</span><span class="p">)</span>

<span class="n">n</span> <span class="o">=</span> <span class="mi">2</span><span class="o">**</span><span class="mi">20</span><span class="o">-</span><span class="mi">2</span><span class="o">**</span><span class="mi">18</span><span class="o">+</span><span class="mi">5</span>
<span class="n">host_data</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">int32</span><span class="p">)</span>
<span class="n">dev_data</span> <span class="o">=</span> <span class="n">cl_array</span><span class="o">.</span><span class="n">to_device</span><span class="p">(</span><span class="n">queue</span><span class="p">,</span> <span class="n">host_data</span><span class="p">)</span>

<span class="n">knl</span><span class="p">(</span><span class="n">dev_data</span><span class="p">)</span>
<span class="k">assert</span> <span class="p">(</span><span class="n">dev_data</span><span class="o">.</span><span class="n">get</span><span class="p">()</span> <span class="o">==</span> <span class="n">np</span><span class="o">.</span><span class="n">cumsum</span><span class="p">(</span><span class="n">host_data</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">0</span><span class="p">))</span><span class="o">.</span><span class="n">all</span><span class="p">()</span>
</pre></div>
</div>
</div>
</div>
<div class="section" id="module-pyopencl.algorithm">
<span id="predicated-copies-partition-unique"></span><h2>Predicated copies (&#8220;partition&#8221;, &#8220;unique&#8221;, ...)<a class="headerlink" href="#module-pyopencl.algorithm" title="Permalink to this headline">¶</a></h2>
</div>
<div class="section" id="sorting-radix-sort">
<h2>Sorting (radix sort)<a class="headerlink" href="#sorting-radix-sort" title="Permalink to this headline">¶</a></h2>
</div>
<div class="section" id="building-many-variable-size-lists">
<h2>Building many variable-size lists<a class="headerlink" href="#building-many-variable-size-lists" title="Permalink to this headline">¶</a></h2>
</div>
</div>


          </div>
        </div>
      </div>
      <div class="sphinxsidebar">
        <div class="sphinxsidebarwrapper">
  <h3><a href="index.html">Table Of Contents</a></h3>
  <ul>
<li><a class="reference internal" href="#">Parallel Algorithms</a><ul>
<li><a class="reference internal" href="#module-pyopencl.elementwise">Element-wise expression evaluation (&#8220;map&#8221;)</a></li>
<li><a class="reference internal" href="#module-pyopencl.reduction">Sums and counts (&#8220;reduce&#8221;)</a></li>
<li><a class="reference internal" href="#module-pyopencl.scan">Prefix Sums (&#8220;scan&#8221;)</a><ul>
<li><a class="reference internal" href="#usage-example">Usage Example</a></li>
<li><a class="reference internal" href="#making-custom-scan-kernels">Making Custom Scan Kernels</a><ul>
<li><a class="reference internal" href="#debugging-aids">Debugging aids</a></li>
</ul>
</li>
<li><a class="reference internal" href="#simple-legacy-interface">Simple / Legacy Interface</a></li>
</ul>
</li>
<li><a class="reference internal" href="#module-pyopencl.algorithm">Predicated copies (&#8220;partition&#8221;, &#8220;unique&#8221;, ...)</a></li>
<li><a class="reference internal" href="#sorting-radix-sort">Sorting (radix sort)</a></li>
<li><a class="reference internal" href="#building-many-variable-size-lists">Building many variable-size lists</a></li>
</ul>
</li>
</ul>

  <h4>Previous topic</h4>
  <p class="topless"><a href="array.html"
                        title="previous chapter">Multi-dimensional arrays</a></p>
  <h4>Next topic</h4>
  <p class="topless"><a href="tools.html"
                        title="next chapter">Built-in Utilities</a></p>
<div id="searchbox" style="display: none">
  <h3>Quick search</h3>
    <form class="search" action="search.html" method="get">
      <input type="text" name="q" />
      <input type="submit" value="Go" />
      <input type="hidden" name="check_keywords" value="yes" />
      <input type="hidden" name="area" value="default" />
    </form>
    <p class="searchtip" style="font-size: 90%">
    Enter search terms or a module, class or function name.
    </p>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
        </div>
      </div>
      <div class="clearer"></div>
    </div>
    <div class="footer">
        &copy; Copyright 2009, Andreas Kloeckner.
      Last updated on Oct 15, 2014.
      Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.2.3.
    </div>
  </body>
</html>