<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>Parallel Algorithms — PyOpenCL 2013.1 documentation</title> <link rel="stylesheet" href="_static/default.css" type="text/css" /> <link rel="stylesheet" href="_static/pygments.css" type="text/css" /> <link rel="stylesheet" href="_static/akdoc.css" type="text/css" /> <script type="text/javascript"> var DOCUMENTATION_OPTIONS = { URL_ROOT: './', VERSION: '2013.1', COLLAPSE_INDEX: false, FILE_SUFFIX: '.html', HAS_SOURCE: false }; </script> <script type="text/javascript" src="_static/jquery.js"></script> <script type="text/javascript" src="_static/underscore.js"></script> <script type="text/javascript" src="_static/doctools.js"></script> <link rel="top" title="PyOpenCL 2013.1 documentation" href="index.html" /> <link rel="next" title="Built-in Utilities" href="tools.html" /> <link rel="prev" title="Multi-dimensional arrays" href="array.html" /> </head> <body> <div class="related"> <h3>Navigation</h3> <ul> <li class="right" style="margin-right: 10px"> <a href="genindex.html" title="General Index" accesskey="I">index</a></li> <li class="right" > <a href="py-modindex.html" title="Python Module Index" >modules</a> |</li> <li class="right" > <a href="tools.html" title="Built-in Utilities" accesskey="N">next</a> |</li> <li class="right" > <a href="array.html" title="Multi-dimensional arrays" accesskey="P">previous</a> |</li> <li><a href="index.html">PyOpenCL 2013.1 documentation</a> »</li> </ul> </div> <div class="document"> <div class="documentwrapper"> <div class="bodywrapper"> <div class="body"> <div class="section" id="parallel-algorithms"> <h1>Parallel Algorithms<a class="headerlink" href="#parallel-algorithms" title="Permalink to this headline">¶</a></h1> <div class="section" id="module-pyopencl.elementwise"> <span 
id="element-wise-expression-evalution-map"></span><h2>Element-wise expression evaluation (“map”)<a class="headerlink" href="#module-pyopencl.elementwise" title="Permalink to this headline">¶</a></h2> <p>Evaluating involved expressions on <tt class="xref py py-class docutils literal"><span class="pre">pyopencl.array.Array</span></tt> instances by using overloaded operators can be somewhat inefficient, because a new temporary is created for each intermediate result. The module <a class="reference internal" href="#module-pyopencl.elementwise" title="pyopencl.elementwise"><tt class="xref py py-mod docutils literal"><span class="pre">pyopencl.elementwise</span></tt></a> contains tools to help generate kernels that evaluate multi-stage expressions on one or several operands in a single pass.</p> <p>Here’s a usage example:</p> <div class="highlight-python"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">pyopencl</span> <span class="kn">as</span> <span class="nn">cl</span> <span class="kn">import</span> <span class="nn">pyopencl.array</span> <span class="kn">as</span> <span class="nn">cl_array</span> <span class="kn">import</span> <span class="nn">numpy</span> <span class="n">ctx</span> <span class="o">=</span> <span class="n">cl</span><span class="o">.</span><span class="n">create_some_context</span><span class="p">()</span> <span class="n">queue</span> <span class="o">=</span> <span class="n">cl</span><span class="o">.</span><span class="n">CommandQueue</span><span class="p">(</span><span class="n">ctx</span><span class="p">)</span> <span class="n">n</span> <span class="o">=</span> <span class="mi">10</span> <span class="n">a_gpu</span> <span class="o">=</span> <span class="n">cl_array</span><span class="o">.</span><span class="n">to_device</span><span class="p">(</span> <span class="n">queue</span><span class="p">,</span> <span class="n">numpy</span><span 
class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">randn</span><span class="p">(</span><span class="n">n</span><span class="p">)</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="n">numpy</span><span class="o">.</span><span class="n">float32</span><span class="p">))</span> <span class="n">b_gpu</span> <span class="o">=</span> <span class="n">cl_array</span><span class="o">.</span><span class="n">to_device</span><span class="p">(</span> <span class="n">queue</span><span class="p">,</span> <span class="n">numpy</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">randn</span><span class="p">(</span><span class="n">n</span><span class="p">)</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="n">numpy</span><span class="o">.</span><span class="n">float32</span><span class="p">))</span> <span class="kn">from</span> <span class="nn">pyopencl.elementwise</span> <span class="kn">import</span> <span class="n">ElementwiseKernel</span> <span class="n">lin_comb</span> <span class="o">=</span> <span class="n">ElementwiseKernel</span><span class="p">(</span><span class="n">ctx</span><span class="p">,</span> <span class="s">"float a, float *x, "</span> <span class="s">"float b, float *y, "</span> <span class="s">"float *z"</span><span class="p">,</span> <span class="s">"z[i] = a*x[i] + b*y[i]"</span><span class="p">,</span> <span class="s">"linear_combination"</span><span class="p">)</span> <span class="n">c_gpu</span> <span class="o">=</span> <span class="n">cl_array</span><span class="o">.</span><span class="n">empty_like</span><span class="p">(</span><span class="n">a_gpu</span><span class="p">)</span> <span class="n">lin_comb</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="n">a_gpu</span><span class="p">,</span> 
<span class="mi">6</span><span class="p">,</span> <span class="n">b_gpu</span><span class="p">,</span> <span class="n">c_gpu</span><span class="p">)</span> <span class="kn">import</span> <span class="nn">numpy.linalg</span> <span class="kn">as</span> <span class="nn">la</span> <span class="k">assert</span> <span class="n">la</span><span class="o">.</span><span class="n">norm</span><span class="p">((</span><span class="n">c_gpu</span> <span class="o">-</span> <span class="p">(</span><span class="mi">5</span><span class="o">*</span><span class="n">a_gpu</span><span class="o">+</span><span class="mi">6</span><span class="o">*</span><span class="n">b_gpu</span><span class="p">))</span><span class="o">.</span><span class="n">get</span><span class="p">())</span> <span class="o"><</span> <span class="mf">1e-5</span> </pre></div> </div> <p>(You can find this example as <tt class="file docutils literal"><span class="pre">examples/demo_elementwise.py</span></tt> in the PyOpenCL distribution.)</p> </div> <div class="section" id="module-pyopencl.reduction"> <span id="sums-and-counts-reduce"></span><span id="custom-reductions"></span><h2>Sums and counts (“reduce”)<a class="headerlink" href="#module-pyopencl.reduction" title="Permalink to this headline">¶</a></h2> <dl class="class"> <dt id="pyopencl.reduction.ReductionKernel"> <em class="property">class </em><tt class="descclassname">pyopencl.reduction.</tt><tt class="descname">ReductionKernel</tt><big>(</big><em>ctx</em>, <em>dtype_out</em>, <em>neutral</em>, <em>reduce_expr</em>, <em>map_expr=None</em>, <em>arguments=None</em>, <em>name="reduce_kernel"</em>, <em>options=[]</em>, <em>preamble=""</em><big>)</big><a class="headerlink" href="#pyopencl.reduction.ReductionKernel" title="Permalink to this definition">¶</a></dt> <dd><p>Generate a kernel that takes a number of scalar or vector <em>arguments</em> (at least one vector argument), performs the <em>map_expr</em> on each entry of the vector argument and then the 
<em>reduce_expr</em> on the outcome of that. <em>neutral</em> serves as an initial value. <em>preamble</em> lets you add preprocessor directives and other code (such as helper functions) before the actual reduction kernel code.</p> <p>Vectors in <em>map_expr</em> should be indexed by the variable <em>i</em>. <em>reduce_expr</em> uses the formal values “a” and “b” to indicate two operands of a binary reduction operation. If you do not specify a <em>map_expr</em>, “in[i]” – and therefore the presence of only one input argument – is automatically assumed.</p> <p><em>dtype_out</em> specifies the <a class="reference external" href="http://docs.scipy.org/doc/numpy/reference/generated/numpy.dtype.html#numpy.dtype" title="(in NumPy v1.8)"><tt class="xref py py-class docutils literal"><span class="pre">numpy.dtype</span></tt></a> in which the reduction is performed and in which the result is returned. <em>neutral</em> is specified as a float or integer formatted as a string. <em>reduce_expr</em> and <em>map_expr</em> are specified as operations formatted as strings, and <em>arguments</em> is specified as a string formatted as a C argument list. <em>name</em> specifies the name under which the kernel is compiled. <em>options</em> are passed unmodified to <a class="reference internal" href="runtime.html#pyopencl.Program.build" title="pyopencl.Program.build"><tt class="xref py py-meth docutils literal"><span class="pre">pyopencl.Program.build()</span></tt></a>. 
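</p> <p>The roles of <em>neutral</em>, <em>map_expr</em> and <em>reduce_expr</em> described above can be modelled on the host in a few lines of plain Python. This is only a sketch of the semantics, not of how the kernel is generated or executed: the helper name <tt class="docutils literal"><span class="pre">model_reduction</span></tt> is hypothetical, and a parallel reduction may combine values in a different order than this left-to-right fold.</p> <div class="highlight-python"><div class="highlight"><pre>from functools import reduce


def model_reduction(neutral, reduce_fn, map_fn, *vectors):
    """Host-side model: evaluate map_fn at every index i, then fold with reduce_fn.

    Hypothetical helper for illustration only; it is not part of PyOpenCL.
    """
    n = len(vectors[0])
    mapped = (map_fn(i, *vectors) for i in range(n))
    return reduce(reduce_fn, mapped, neutral)


x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]

# Mirrors neutral="0", reduce_expr="a+b", map_expr="x[i]*y[i]"
dot = model_reduction(0.0, lambda a, b: a + b,
                      lambda i, x, y: x[i] * y[i], x, y)

# Mirrors the default map_expr "in[i]" with a single input vector
total = model_reduction(0.0, lambda a, b: a + b,
                        lambda i, v: v[i], x)

print(dot, total)  # 32.0 6.0
</pre></div> </div> <p>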
<em>preamble</em> specifies a string of code that is inserted before the actual kernels.</p> <dl class="method"> <dt id="pyopencl.reduction.ReductionKernel.__call__"> <tt class="descname">__call__</tt><big>(</big><em>*args</em>, <em>queue=None</em>, <em>wait_for=None</em>, <em>return_event=False</em><big>)</big><a class="headerlink" href="#pyopencl.reduction.ReductionKernel.__call__" title="Permalink to this definition">¶</a></dt> <dd><p><em>wait_for</em> may be either <em>None</em> or a list of <a class="reference internal" href="runtime.html#pyopencl.Event" title="pyopencl.Event"><tt class="xref py py-class docutils literal"><span class="pre">pyopencl.Event</span></tt></a> instances for whose completion this command waits before starting execution.</p> <table class="docutils field-list" frame="void" rules="none"> <col class="field-name" /> <col class="field-body" /> <tbody valign="top"> <tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">the resulting scalar as a single-entry <tt class="xref py py-class docutils literal"><span class="pre">pyopencl.array.Array</span></tt> if <em>return_event</em> is <em>False</em>, otherwise a tuple <tt class="docutils literal"><span class="pre">(scalar_array,</span> <span class="pre">event)</span></tt>.</td> </tr> </tbody> </table> <div class="admonition note"> <p class="first admonition-title">Note</p> <p class="last">The returned <a class="reference internal" href="runtime.html#pyopencl.Event" title="pyopencl.Event"><tt class="xref py py-class docutils literal"><span class="pre">pyopencl.Event</span></tt></a> corresponds only to part of the execution of the reduction. 
It is not suitable for profiling.</p> </div> </dd></dl> </dd></dl> <p>Here’s a usage example:</p> <div class="highlight-python"><div class="highlight"><pre><span class="n">a</span> <span class="o">=</span> <span class="n">pyopencl</span><span class="o">.</span><span class="n">array</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="n">queue</span><span class="p">,</span> <span class="mi">400</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">numpy</span><span class="o">.</span><span class="n">float32</span><span class="p">)</span> <span class="n">b</span> <span class="o">=</span> <span class="n">pyopencl</span><span class="o">.</span><span class="n">array</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="n">queue</span><span class="p">,</span> <span class="mi">400</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">numpy</span><span class="o">.</span><span class="n">float32</span><span class="p">)</span> <span class="n">krnl</span> <span class="o">=</span> <span class="n">ReductionKernel</span><span class="p">(</span><span class="n">ctx</span><span class="p">,</span> <span class="n">numpy</span><span class="o">.</span><span class="n">float32</span><span class="p">,</span> <span class="n">neutral</span><span class="o">=</span><span class="s">"0"</span><span class="p">,</span> <span class="n">reduce_expr</span><span class="o">=</span><span class="s">"a+b"</span><span class="p">,</span> <span class="n">map_expr</span><span class="o">=</span><span class="s">"x[i]*y[i]"</span><span class="p">,</span> <span class="n">arguments</span><span class="o">=</span><span class="s">"__global float *x, __global float *y"</span><span class="p">)</span> <span class="n">my_dot_prod</span> <span class="o">=</span> <span class="n">krnl</span><span class="p">(</span><span class="n">a</span><span 
class="p">,</span> <span class="n">b</span><span class="p">)</span><span class="o">.</span><span class="n">get</span><span class="p">()</span> </pre></div> </div> </div> <div class="section" id="module-pyopencl.scan"> <span id="prefix-sums-scan"></span><span id="custom-scan"></span><h2>Prefix Sums (“scan”)<a class="headerlink" href="#module-pyopencl.scan" title="Permalink to this headline">¶</a></h2> <p>A prefix sum is a running sum of an array, as provided by e.g. <tt class="xref py py-mod docutils literal"><span class="pre">numpy.cumsum</span></tt>:</p> <div class="highlight-python"><div class="highlight"><pre><span class="gp">>>> </span><span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span> <span class="gp">>>> </span><span class="n">a</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">]</span> <span class="gp">>>> </span><span class="n">np</span><span class="o">.</span><span class="n">cumsum</span><span class="p">(</span><span class="n">a</span><span class="p">)</span> <span class="go">array([ 1, 2, 3, 4, 5, 7, 9, 11, 13, 15])</span> </pre></div> </div> <p>This is a very simple example of what a scan can do. It turns out that scans are significantly more versatile. They are a basic building block of many non-trivial parallel algorithms. 
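</p> <p>As a small illustration of that versatility, a scan is not tied to addition: any associative binary operator can take the place of the sum. The following is a minimal host-side sketch in plain Python (standard library only, Python 3.8 or later). It does not use the PyOpenCL API, and the helper names <tt class="docutils literal"><span class="pre">inclusive_scan</span></tt> and <tt class="docutils literal"><span class="pre">exclusive_scan</span></tt> are made up for this illustration.</p> <div class="highlight-python"><div class="highlight"><pre>from itertools import accumulate
import operator


def inclusive_scan(values, op=operator.add):
    """out[k] combines values[0] .. values[k] with op."""
    return list(accumulate(values, op))


def exclusive_scan(values, neutral, op=operator.add):
    """out[k] combines values[0] .. values[k-1]; out[0] is the neutral element."""
    return list(accumulate(values, op, initial=neutral))[:-1]


data = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
running_sum = inclusive_scan(data)                  # [1, 2, 3, 4, 5, 7, 9, 11, 13, 15]
running_max = inclusive_scan([3, 1, 4, 1, 5], max)  # [3, 3, 4, 4, 5]
shifted_sum = exclusive_scan([1, 2, 3], neutral=0)  # [0, 1, 3]
</pre></div> </div> <p>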
Many of the operations enabled by scans seem difficult to parallelize because of loop-carried dependencies.</p> <div class="admonition seealso"> <p class="first admonition-title">See also</p> <dl class="last docutils"> <dt><a class="reference external" href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.128.6230">Prefix sums and their applications</a>, by Guy Blelloch.</dt> <dd>This article gives an overview of some surprising applications of scans.</dd> <dt><a class="reference internal" href="#predefined-scans"><em>Simple / Legacy Interface</em></a></dt> <dd>These operations built into PyOpenCL are realized using <tt class="xref py py-class docutils literal"><span class="pre">GenericScanKernel</span></tt>.</dd> </dl> </div> <div class="section" id="usage-example"> <h3>Usage Example<a class="headerlink" href="#usage-example" title="Permalink to this headline">¶</a></h3> <p>This example illustrates the implementation of a simplified version of <tt class="xref py py-func docutils literal"><span class="pre">copy_if()</span></tt>, which copies integers from an array into the (variable-size) output if they are greater than 300:</p> <div class="highlight-python"><div class="highlight"><pre><span class="n">knl</span> <span class="o">=</span> <span class="n">GenericScanKernel</span><span class="p">(</span> <span class="n">ctx</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">int32</span><span class="p">,</span> <span class="n">arguments</span><span class="o">=</span><span class="s">"__global int *ary, __global int *out"</span><span class="p">,</span> <span class="n">input_expr</span><span class="o">=</span><span class="s">"(ary[i] > 300) ? 
1 : 0"</span><span class="p">,</span> <span class="n">scan_expr</span><span class="o">=</span><span class="s">"a+b"</span><span class="p">,</span> <span class="n">neutral</span><span class="o">=</span><span class="s">"0"</span><span class="p">,</span> <span class="n">output_statement</span><span class="o">=</span><span class="s">"""</span> <span class="s"> if (prev_item != item) out[item-1] = ary[i];</span> <span class="s"> """</span><span class="p">)</span> <span class="n">out</span> <span class="o">=</span> <span class="n">a</span><span class="o">.</span><span class="n">copy</span><span class="p">()</span> <span class="n">knl</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">out</span><span class="p">)</span> <span class="n">a_host</span> <span class="o">=</span> <span class="n">a</span><span class="o">.</span><span class="n">get</span><span class="p">()</span> <span class="n">out_host</span> <span class="o">=</span> <span class="n">a_host</span><span class="p">[</span><span class="n">a_host</span> <span class="o">></span> <span class="mi">300</span><span class="p">]</span> <span class="k">assert</span> <span class="p">(</span><span class="n">out_host</span> <span class="o">==</span> <span class="n">out</span><span class="o">.</span><span class="n">get</span><span class="p">()[:</span><span class="nb">len</span><span class="p">(</span><span class="n">out_host</span><span class="p">)])</span><span class="o">.</span><span class="n">all</span><span class="p">()</span> </pre></div> </div> <p>The values being scanned over are flags (0 or 1) indicating whether each array element is greater than 300. These flags are computed by <em>input_expr</em>. The prefix sum over this array gives a running count of array items greater than 300. The <em>output_statement</em> then compares <cite>prev_item</cite> (the previous item’s scan result, i.e. index) to <cite>item</cite> (the current item’s scan result, i.e. index). 
If they differ, i.e. if the predicate was satisfied at this position, then the item is stored in the output at the computed index.</p> <p>This example does not make use of the following advanced features also available in PyOpenCL:</p> <ul class="simple"> <li>Segmented scans</li> <li>Access to the previous item in <em>input_expr</em> (e.g. for comparisons). See the <a class="reference external" href="https://github.com/inducer/pyopencl/blob/master/pyopencl/scan.py#L1353">implementation</a> of <tt class="xref py py-func docutils literal"><span class="pre">unique()</span></tt> for an example.</li> </ul> </div> <div class="section" id="making-custom-scan-kernels"> <h3>Making Custom Scan Kernels<a class="headerlink" href="#making-custom-scan-kernels" title="Permalink to this headline">¶</a></h3> <div class="section" id="debugging-aids"> <h4>Debugging aids<a class="headerlink" href="#debugging-aids" title="Permalink to this headline">¶</a></h4> <dl class="class"> <dt id="pyopencl.scan.GenericDebugScanKernel"> <em class="property">class </em><tt class="descclassname">pyopencl.scan.</tt><tt class="descname">GenericDebugScanKernel</tt><a class="headerlink" href="#pyopencl.scan.GenericDebugScanKernel" title="Permalink to this definition">¶</a></dt> <dd><p>Performs the same function and has the same interface as <tt class="xref py py-class docutils literal"><span class="pre">GenericScanKernel</span></tt>, but uses a dead-simple, sequential scan. 
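</p> <p>For intuition, a sequential scan of this kind can be sketched in plain Python for the <tt class="docutils literal"><span class="pre">copy_if()</span></tt> usage example above. This is an illustrative host-side model under stated assumptions, not the code that <tt class="docutils literal"><span class="pre">GenericDebugScanKernel</span></tt> generates; the function name <tt class="docutils literal"><span class="pre">sequential_copy_if</span></tt> and its <tt class="docutils literal"><span class="pre">threshold</span></tt> parameter are hypothetical.</p> <div class="highlight-python"><div class="highlight"><pre>def sequential_copy_if(ary, threshold=300):
    """Sequential model of the copy_if example: scan 0/1 flags, write on change.

    Hypothetical helper for illustration only; it is not part of PyOpenCL.
    """
    out = [None] * len(ary)   # same length as the input, like out = a.copy()
    prev_item = 0             # neutral element of the scan_expr "a+b"
    for value in ary:
        flag = 1 if value > threshold else 0  # plays the role of input_expr
        item = prev_item + flag               # plays the role of scan_expr
        if prev_item != item:                 # plays the role of output_statement
            out[item - 1] = value
        prev_item = item
    # After the loop, prev_item holds the number of items that were kept.
    return out[:prev_item]


print(sequential_copy_if([100, 400, 250, 301, 300, 999]))  # [400, 301, 999]
</pre></div> </div> <p>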
Works best on CPU platforms, and helps isolate bugs in scans by removing the potential for issues originating in parallel execution.</p> </dd></dl> </div> </div> <div class="section" id="simple-legacy-interface"> <span id="predefined-scans"></span><h3>Simple / Legacy Interface<a class="headerlink" href="#simple-legacy-interface" title="Permalink to this headline">¶</a></h3> <dl class="class"> <dt id="pyopencl.scan.ExclusiveScanKernel"> <em class="property">class </em><tt class="descclassname">pyopencl.scan.</tt><tt class="descname">ExclusiveScanKernel</tt><big>(</big><em>ctx</em>, <em>dtype</em>, <em>scan_expr</em>, <em>neutral</em>, <em>name_prefix="scan"</em>, <em>options=[]</em>, <em>preamble=""</em>, <em>devices=None</em><big>)</big><a class="headerlink" href="#pyopencl.scan.ExclusiveScanKernel" title="Permalink to this definition">¶</a></dt> <dd><p>Generates a kernel that can compute a <a class="reference external" href="https://secure.wikimedia.org/wikipedia/en/wiki/Prefix_sum">prefix sum</a> using any associative operation given as <em>scan_expr</em>. <em>scan_expr</em> uses the formal values “a” and “b” to indicate two operands of an associative binary operation. <em>neutral</em> is the neutral element of <em>scan_expr</em>, obeying <em>scan_expr(a, neutral) == a</em>.</p> <p><em>dtype</em> specifies the type of the arrays being operated on. <em>name_prefix</em> is used for kernel names to ensure recognizability in profiles and logs. <em>options</em> is a list of compiler options to use when building. <em>preamble</em> specifies a string of code that is inserted before the actual kernels. <em>devices</em> may be used to restrict the set of devices on which the kernel is meant to run. 
(Defaults to all devices in the context <em>ctx</em>.)</p> <dl class="method"> <dt id="pyopencl.scan.ExclusiveScanKernel.__call__"> <tt class="descname">__call__</tt><big>(</big><em>self</em>, <em>input_ary</em>, <em>output_ary=None</em>, <em>allocator=None</em>, <em>queue=None</em><big>)</big><a class="headerlink" href="#pyopencl.scan.ExclusiveScanKernel.__call__" title="Permalink to this definition">¶</a></dt> <dd></dd></dl> </dd></dl> <dl class="class"> <dt id="pyopencl.scan.InclusiveScanKernel"> <em class="property">class </em><tt class="descclassname">pyopencl.scan.</tt><tt class="descname">InclusiveScanKernel</tt><big>(</big><em>ctx</em>, <em>dtype</em>, <em>scan_expr</em>, <em>neutral=None</em>, <em>name_prefix="scan"</em>, <em>options=[]</em>, <em>preamble=""</em>, <em>devices=None</em><big>)</big><a class="headerlink" href="#pyopencl.scan.InclusiveScanKernel" title="Permalink to this definition">¶</a></dt> <dd><p>Works like <a class="reference internal" href="#pyopencl.scan.ExclusiveScanKernel" title="pyopencl.scan.ExclusiveScanKernel"><tt class="xref py py-class docutils literal"><span class="pre">ExclusiveScanKernel</span></tt></a>.</p> <div class="versionchanged"> <p><span class="versionmodified">Changed in version 2013.1: </span><em>neutral</em> is now always required.</p> </div> </dd></dl> <p>For the array <cite>[1,2,3]</cite>, an inclusive scan results in <cite>[1,3,6]</cite>, and an exclusive scan results in <cite>[0,1,3]</cite>.</p> <p>Here’s a usage example:</p> <div class="highlight-python"><div class="highlight"><pre><span class="n">knl</span> <span class="o">=</span> <span class="n">InclusiveScanKernel</span><span class="p">(</span><span class="n">context</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">int32</span><span class="p">,</span> <span class="s">"a+b"</span><span class="p">)</span> <span class="n">n</span> <span class="o">=</span> <span class="mi">2</span><span class="o">**</span><span 
class="mi">20</span><span class="o">-</span><span class="mi">2</span><span class="o">**</span><span class="mi">18</span><span class="o">+</span><span class="mi">5</span> <span class="n">host_data</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">int32</span><span class="p">)</span> <span class="n">dev_data</span> <span class="o">=</span> <span class="n">cl_array</span><span class="o">.</span><span class="n">to_device</span><span class="p">(</span><span class="n">queue</span><span class="p">,</span> <span class="n">host_data</span><span class="p">)</span> <span class="n">knl</span><span class="p">(</span><span class="n">dev_data</span><span class="p">)</span> <span class="k">assert</span> <span class="p">(</span><span class="n">dev_data</span><span class="o">.</span><span class="n">get</span><span class="p">()</span> <span class="o">==</span> <span class="n">np</span><span class="o">.</span><span class="n">cumsum</span><span class="p">(</span><span class="n">host_data</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">0</span><span class="p">))</span><span class="o">.</span><span class="n">all</span><span class="p">()</span> </pre></div> </div> </div> </div> <div class="section" id="module-pyopencl.algorithm"> <span id="predicated-copies-partition-unique"></span><h2>Predicated copies (“partition”, “unique”, ...)<a class="headerlink" href="#module-pyopencl.algorithm" title="Permalink to this headline">¶</a></h2> </div> <div class="section" id="sorting-radix-sort"> <h2>Sorting (radix sort)<a 
class="headerlink" href="#sorting-radix-sort" title="Permalink to this headline">¶</a></h2> </div> <div class="section" id="building-many-variable-size-lists"> <h2>Building many variable-size lists<a class="headerlink" href="#building-many-variable-size-lists" title="Permalink to this headline">¶</a></h2> </div> </div> </div> </div> </div> <div class="sphinxsidebar"> <div class="sphinxsidebarwrapper"> <h3><a href="index.html">Table Of Contents</a></h3> <ul> <li><a class="reference internal" href="#">Parallel Algorithms</a><ul> <li><a class="reference internal" href="#module-pyopencl.elementwise">Element-wise expression evaluation (“map”)</a></li> <li><a class="reference internal" href="#module-pyopencl.reduction">Sums and counts (“reduce”)</a></li> <li><a class="reference internal" href="#module-pyopencl.scan">Prefix Sums (“scan”)</a><ul> <li><a class="reference internal" href="#usage-example">Usage Example</a></li> <li><a class="reference internal" href="#making-custom-scan-kernels">Making Custom Scan Kernels</a><ul> <li><a class="reference internal" href="#debugging-aids">Debugging aids</a></li> </ul> </li> <li><a class="reference internal" href="#simple-legacy-interface">Simple / Legacy Interface</a></li> </ul> </li> <li><a class="reference internal" href="#module-pyopencl.algorithm">Predicated copies (“partition”, “unique”, ...)</a></li> <li><a class="reference internal" href="#sorting-radix-sort">Sorting (radix sort)</a></li> <li><a class="reference internal" href="#building-many-variable-size-lists">Building many variable-size lists</a></li> </ul> </li> </ul> <h4>Previous topic</h4> <p class="topless"><a href="array.html" title="previous chapter">Multi-dimensional arrays</a></p> <h4>Next topic</h4> <p class="topless"><a href="tools.html" title="next chapter">Built-in Utilities</a></p> <div id="searchbox" style="display: none"> <h3>Quick search</h3> <form class="search" action="search.html" method="get"> <input type="text" name="q" /> <input type="submit" 
value="Go" /> <input type="hidden" name="check_keywords" value="yes" /> <input type="hidden" name="area" value="default" /> </form> <p class="searchtip" style="font-size: 90%"> Enter search terms or a module, class or function name. </p> </div> <script type="text/javascript">$('#searchbox').show(0);</script> </div> </div> <div class="clearer"></div> </div> <div class="related"> <h3>Navigation</h3> <ul> <li class="right" style="margin-right: 10px"> <a href="genindex.html" title="General Index" >index</a></li> <li class="right" > <a href="py-modindex.html" title="Python Module Index" >modules</a> |</li> <li class="right" > <a href="tools.html" title="Built-in Utilities" >next</a> |</li> <li class="right" > <a href="array.html" title="Multi-dimensional arrays" >previous</a> |</li> <li><a href="index.html">PyOpenCL 2013.1 documentation</a> »</li> </ul> </div> <div class="footer"> © Copyright 2009, Andreas Kloeckner. Last updated on Oct 15, 2014. Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.2.3. </div> </body> </html>