<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-us" xml:lang="en-us">
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8"></meta>
      <meta http-equiv="X-UA-Compatible" content="IE=edge"></meta>
      <meta name="copyright" content="(C) Copyright 2005"></meta>
      <meta name="DC.rights.owner" content="(C) Copyright 2005"></meta>
      <meta name="DC.Type" content="concept"></meta>
      <meta name="DC.Title" content="Host API Overview"></meta>
      <meta name="DC.Format" content="XHTML"></meta>
      <meta name="DC.Identifier" content="host-api-overview"></meta>
      <meta name="DC.Language" content="en-us"></meta>
      <link rel="stylesheet" type="text/css" href="../common/formatting/commonltr.css"></link>
      <link rel="stylesheet" type="text/css" href="../common/formatting/site.css"></link>
      <title>CURAND :: CUDA Toolkit Documentation</title>
      <!--[if lt IE 9]>
      <script src="../common/formatting/html5shiv-printshiv.min.js"></script>
      <![endif]-->
      <script type="text/javascript" charset="utf-8" src="../common/formatting/jquery.min.js"></script>
      <script type="text/javascript" charset="utf-8" src="../common/formatting/jquery.ba-hashchange.min.js"></script>
      <link rel="canonical" href="http://docs.nvidia.com/cuda/curand/index.html"></link>
      <link rel="stylesheet" type="text/css" href="../common/formatting/qwcode.highlight.css"></link>
   </head>
   <body>
      
      <article id="contents">
         <div id="eqn-warning">This document includes math equations
            (highlighted in red) which are best viewed with <a target="_blank" href="https://www.mozilla.org/firefox">Firefox</a> version 4.0
            or higher, or another <a target="_blank" href="http://www.w3.org/Math/Software/mathml_software_cat_browsers.html">MathML-aware
               browser</a>. There is also a <a href="../../pdf/CURAND_Library.pdf">PDF version of this
               document</a>.
            
         </div>
         <div id="eqn-warning-buf"></div>
         <div id="release-info">CURAND
            (<a href="../../pdf/CURAND_Library.pdf">PDF</a>)
            -
            CUDA Toolkit v5.5
            (<a href="https://developer.nvidia.com/cuda-toolkit-archive">older</a>)
            -
            Last updated 
            July 19, 2013
            -
            <a href="mailto:cudatools@nvidia.com?subject=CUDA Tools Documentation Feedback: curand">Send Feedback</a></div>
         <div class="topic nested1" id="host-api-overview"><a name="host-api-overview" shape="rect">
               <!-- --></a><h2 class="topictitle2">2.&nbsp;Host API Overview</h2>
            <div class="body conbody">
               <p class="p">To use the host API, user code should include the library header file <samp class="ph codeph">curand.h</samp> and dynamically link against the CURAND library. The library uses the CUDA runtime, so user code must also use the runtime.
                  The CUDA driver API is not supported by CURAND.
               </p>
               <p class="p">Random numbers are produced by generators. A generator in CURAND encapsulates all the internal state necessary to produce
                  a sequence of pseudorandom or quasirandom numbers. The normal sequence of operations is as follows:
               </p>
               <p class="p">1. Create a new generator of the desired type (see <a class="xref" href="host-api-overview.html#generator-types" shape="rect">Generator Types</a>) with <samp class="ph codeph">curandCreateGenerator()</samp>.
               </p>
               <p class="p">2. Set the generator options (see <a class="xref" href="host-api-overview.html#generator-options" shape="rect">Generator Options</a>); for example, use <samp class="ph codeph">curandSetPseudoRandomGeneratorSeed()</samp> to set the seed.
               </p>
               <p class="p">3. Allocate memory on the device with <samp class="ph codeph">cudaMalloc()</samp>.
               </p>
               <p class="p">4. Generate random numbers with <samp class="ph codeph">curandGenerate()</samp> or another generation function.
               </p>
               <p class="p">5. Use the results.</p>
               <p class="p">6. If desired, generate more random numbers with more calls to <samp class="ph codeph">curandGenerate()</samp>.
               </p>
               <p class="p">7. Clean up with <samp class="ph codeph">curandDestroyGenerator()</samp>.
               </p>
               <p class="p">To generate random numbers on the host CPU, in step 1 above call <samp class="ph codeph">curandCreateGeneratorHost()</samp>, and in step 3 allocate a host memory buffer to receive the results. All other calls work identically whether you are
                  generating random numbers on the device or on the host CPU.
               </p>
               <p class="p">It is legal to create several generators at the same time. Each generator encapsulates a separate state and is independent
                  of all other generators. The sequence of numbers produced by each generator is deterministic. Given the same set-up parameters,
                  the same sequence will be generated with every run of the program. Generating random numbers on the device will result in
                  the same sequence as generating them on the host CPU.
               </p>
               <p class="p">Note that <samp class="ph codeph">curandGenerate()</samp> in step 4 above launches a kernel and returns asynchronously. If you launch another kernel in a different stream, and that
                  kernel needs to use the results of <samp class="ph codeph">curandGenerate()</samp>, you must either call <samp class="ph codeph">cudaDeviceSynchronize()</samp> (which replaces the deprecated <samp class="ph codeph">cudaThreadSynchronize()</samp>) or use the stream management/event management routines to ensure that the random generation kernel has finished execution
                  before the new kernel is launched.
               </p>
               <p class="p">Note that it is not valid to pass a host memory pointer to a generator that is running on the device, and it is not valid
                  to pass a device memory pointer to a generator that is running on the CPU. Behavior in these cases is undefined.
               </p>
            </div>
            <div class="topic concept nested1" xml:lang="en-us" id="generator-types"><a name="generator-types" shape="rect">
                  <!-- --></a><h3 class="topictitle3">2.1.&nbsp;Generator Types</h3>
               <div class="body conbody">
                  <p class="p">Random number generators are created by passing a type to <samp class="ph codeph">curandCreateGenerator()</samp>. There are seven types of random number generators in CURAND, which fall into two categories: pseudorandom and quasirandom.
                  </p>
                  <p class="p"><samp class="ph codeph">CURAND_RNG_PSEUDO_XORWOW</samp>, <samp class="ph codeph">CURAND_RNG_PSEUDO_MRG32K3A</samp>, and <samp class="ph codeph">CURAND_RNG_PSEUDO_MTGP32</samp> are pseudorandom number generators. <samp class="ph codeph">CURAND_RNG_PSEUDO_XORWOW</samp> is implemented using the XORWOW algorithm, a member of the xor-shift family of pseudorandom number generators. <samp class="ph codeph">CURAND_RNG_PSEUDO_MRG32K3A</samp> is a member of the Combined Multiple Recursive family of pseudorandom number generators. <samp class="ph codeph">CURAND_RNG_PSEUDO_MTGP32</samp> is a member of the Mersenne Twister family of pseudorandom number generators, with parameters customized for operation on
                     the GPU.
                  </p>
                  <p class="p">The remaining four types are variants of the basic Sobol’ quasirandom number generator: <samp class="ph codeph">CURAND_RNG_QUASI_SOBOL32</samp>, <samp class="ph codeph">CURAND_RNG_QUASI_SCRAMBLED_SOBOL32</samp>, <samp class="ph codeph">CURAND_RNG_QUASI_SOBOL64</samp>, and <samp class="ph codeph">CURAND_RNG_QUASI_SCRAMBLED_SOBOL64</samp>. <samp class="ph codeph">CURAND_RNG_QUASI_SOBOL32</samp> is a Sobol’ generator of 32-bit sequences. <samp class="ph codeph">CURAND_RNG_QUASI_SCRAMBLED_SOBOL32</samp> is a scrambled Sobol’ generator of 32-bit sequences. <samp class="ph codeph">CURAND_RNG_QUASI_SOBOL64</samp> is a Sobol’ generator of 64-bit sequences. <samp class="ph codeph">CURAND_RNG_QUASI_SCRAMBLED_SOBOL64</samp> is a scrambled Sobol’ generator of 64-bit sequences. All of the variants generate sequences in
                     up to 20,000 dimensions.
                  </p>
               </div>
            </div>
            <div class="topic concept nested1" xml:lang="en-us" id="generator-options"><a name="generator-options" shape="rect">
                  <!-- --></a><h3 class="topictitle3">2.2.&nbsp;Generator Options</h3>
               <div class="body conbody">
                  <p class="p">Once created, random number generators can be configured using the general options seed, offset, and order.</p>
               </div>
               <div class="topic concept nested2" xml:lang="en-us" id="seed"><a name="seed" shape="rect">
                     <!-- --></a><h4 class="topictitle4">2.2.1.&nbsp;Seed</h4>
                  <div class="body conbody">
                     <p class="p">The seed parameter is a 64-bit integer that initializes the starting state of a pseudorandom number generator. The same seed
                        always produces the same sequence of results.
                     </p>
                  </div>
               </div>
               <div class="topic concept nested2" xml:lang="en-us" id="offset"><a name="offset" shape="rect">
                     <!-- --></a><h4 class="topictitle4">2.2.2.&nbsp;Offset</h4>
                  <div class="body conbody">
                     <p class="p">The offset parameter is used to skip ahead in the sequence. If offset = 100, the first random number generated will be the
                        100th in the sequence. This allows multiple runs of the same program to continue generating results from the same sequence
                        without overlap. Note that the skip-ahead function is not available for the <samp class="ph codeph">CURAND_RNG_PSEUDO_MTGP32</samp> generator.
                     </p>
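                     <p class="p">The effect of an offset can be sketched with a toy generator. The code below is not a CURAND algorithm: <samp class="ph codeph">toy_element()</samp> and <samp class="ph codeph">toy_generate()</samp> are hypothetical names, the mixing function is a SplitMix64-style hash chosen only for illustration, and positions are counted from zero as an assumption of this sketch. It shows how a second run that starts at the offset where the first run stopped continues the same sequence without overlap.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Toy counter-based generator, NOT a CURAND algorithm: element i of the
 * sequence is a SplitMix64-style hash of (seed, i), so any position can
 * be computed directly without producing the earlier ones. */
static uint64_t toy_element(uint64_t seed, uint64_t i)
{
    uint64_t z = seed + (i + 1) * 0x9E3779B97F4A7C15ULL;
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Hypothetical helper: fill out[0..count-1] with the elements at
 * positions offset, offset + 1, ..., offset + count - 1. */
static void toy_generate(uint64_t seed, uint64_t offset,
                         uint64_t *out, size_t count)
{
    size_t k;
    for (k = 0; k < count; k++)
        out[k] = toy_element(seed, offset + k);
}
```

                     <p class="p">With the same seed, calling <samp class="ph codeph">toy_generate()</samp> with offset 0 and count 100, then with offset 100 and count 100, fills the two buffers with the same values as a single call with offset 0 and count 200.</p>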
                  </div>
               </div>
               <div class="topic concept nested2" xml:lang="en-us" id="order"><a name="order" shape="rect">
                     <!-- --></a><h4 class="topictitle4">2.2.3.&nbsp;Order</h4>
                  <div class="body conbody">
                     <p class="p">The order parameter is used to choose how the results are ordered in global memory. There are three ordering choices for pseudorandom
                        sequences: <samp class="ph codeph">CURAND_ORDERING_PSEUDO_DEFAULT</samp>, <samp class="ph codeph">CURAND_ORDERING_PSEUDO_BEST</samp>, and <samp class="ph codeph">CURAND_ORDERING_PSEUDO_SEEDED</samp>. There is one ordering choice for quasirandom numbers, <samp class="ph codeph">CURAND_ORDERING_QUASI_DEFAULT</samp>. The default ordering for pseudorandom number generators is <samp class="ph codeph">CURAND_ORDERING_PSEUDO_DEFAULT</samp>, while the default ordering for quasirandom number generators is <samp class="ph codeph">CURAND_ORDERING_QUASI_DEFAULT</samp>.
                     </p>
                     <p class="p">Currently, the two pseudorandom orderings <samp class="ph codeph">CURAND_ORDERING_PSEUDO_DEFAULT</samp> and <samp class="ph codeph">CURAND_ORDERING_PSEUDO_BEST</samp> produce the same output ordering for all pseudorandom generators. However, future releases of CURAND may change the ordering
                        associated with <samp class="ph codeph">CURAND_ORDERING_PSEUDO_BEST</samp> to improve either performance or the quality of the results. It will always be the case that the ordering obtained with <samp class="ph codeph">CURAND_ORDERING_PSEUDO_BEST</samp> is deterministic and is the same for each run of the program. The ordering returned by <samp class="ph codeph">CURAND_ORDERING_PSEUDO_DEFAULT</samp> is guaranteed to remain the same for all CURAND releases. In the current release, only the XORWOW generator has more than
                        one ordering.
                     </p>
                     <p class="p">The behavior of the ordering parameters for each generator type is outlined below:</p>
                     <ul class="ul">
                        <li class="li">
                           <p class="p">XORWOW pseudorandom generator</p>
                           <ul class="ul">
                              <li class="li">
                                 <p class="p"><samp class="ph codeph">CURAND_ORDERING_PSEUDO_BEST</samp></p>
                                 <p class="p">The output ordering of <samp class="ph codeph">CURAND_ORDERING_PSEUDO_BEST</samp> is the same as <samp class="ph codeph">CURAND_ORDERING_PSEUDO_DEFAULT</samp> in the current release.
                                 </p>
                              </li>
                              <li class="li">
                                 <p class="p"><samp class="ph codeph">CURAND_ORDERING_PSEUDO_DEFAULT</samp></p>
                                 <p class="p">The result at offset 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>n</mi>
                                    </math> in global memory is from position
                                 </p>
                                 <math xmlns="http://www.w3.org/1998/Math/MathML">
                                    <mo stretchy="false">(</mo>
                                    <mi>n</mi>
                                    <mo>mod</mo>
                                    <mn>4096</mn>
                                    <mo stretchy="false">)</mo>
                                    <mo>⋅</mo>
                                    <msup>
                                       <mn>2</mn>
                                       <mn>67</mn>
                                    </msup>
                                    <mo>+</mo>
                                    <mo fence="false" stretchy="false">⌊</mo>
                                    <mi>n</mi>
                                    <mo>/</mo>
                                    <mn>4096</mn>
                                    <mo fence="false" stretchy="false">⌋</mo>
                                 </math>
                                 <p class="p">in the original XORWOW sequence.</p>
                              </li>
                              <li class="li">
                                 <p class="p"><samp class="ph codeph">CURAND_ORDERING_PSEUDO_SEEDED</samp></p>
                                 <p class="p">The result at offset 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>n</mi>
                                    </math> in global memory is from position 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mo fence="false" stretchy="false">⌊</mo>
                                       <mi>n</mi>
                                       <mo>/</mo>
                                       <mn>4096</mn>
                                       <mo fence="false" stretchy="false">⌋</mo>
                                    </math> in the XORWOW sequence seeded with a combination of the user seed and the number 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>n</mi>
                                       <mo>mod</mo>
                                       <mn>4096</mn>
                                    </math>. In other words, each of 4096 threads uses a different seed. This seeding method reduces state setup time but may result
                                    in statistical weaknesses of the pseudorandom output for some user seed values.
                                 </p>
                              </li>
                           </ul>
                        </li>
                        <li class="li">
                           <p class="p">MRG32k3a pseudorandom generator</p>
                           <ul class="ul">
                              <li class="li">
                                 <p class="p"><samp class="ph codeph">CURAND_ORDERING_PSEUDO_BEST</samp></p>
                                 <p class="p">The output ordering of <samp class="ph codeph">CURAND_ORDERING_PSEUDO_BEST</samp> is the same as <samp class="ph codeph">CURAND_ORDERING_PSEUDO_DEFAULT</samp> in the current release.
                                 </p>
                              </li>
                              <li class="li">
                                 <p class="p"><samp class="ph codeph">CURAND_ORDERING_PSEUDO_DEFAULT</samp></p>
                                 <p class="p">The result at offset 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>n</mi>
                                    </math> in global memory is from position
                                 </p>
                                 <math xmlns="http://www.w3.org/1998/Math/MathML">
                                    <mo stretchy="false">(</mo>
                                    <mi>n</mi>
                                    <mo>mod</mo>
                                    <mn>4096</mn>
                                    <mo stretchy="false">)</mo>
                                    <mo>⋅</mo>
                                    <msup>
                                       <mn>2</mn>
                                       <mn>76</mn>
                                    </msup>
                                    <mo>+</mo>
                                    <mo fence="false" stretchy="false">⌊</mo>
                                    <mi>n</mi>
                                    <mo>/</mo>
                                    <mn>4096</mn>
                                    <mo fence="false" stretchy="false">⌋</mo>
                                 </math>
                                 <p class="p">in the original MRG32k3a sequence. Note that the stride between subsequent samples for MRG32k3a is not the same as for XORWOW.</p>
                              </li>
                           </ul>
                        </li>
                        <li class="li">
                           <p class="p">MTGP32 pseudorandom generator</p>
                           <ul class="ul">
                              <li class="li">
                                 <p class="p"><samp class="ph codeph">CURAND_ORDERING_PSEUDO_BEST</samp></p>
                                 <p class="p">The output ordering of <samp class="ph codeph">CURAND_ORDERING_PSEUDO_BEST</samp> is the same as <samp class="ph codeph">CURAND_ORDERING_PSEUDO_DEFAULT</samp> in the current release.
                                 </p>
                              </li>
                              <li class="li">
                                 <p class="p"><samp class="ph codeph">CURAND_ORDERING_PSEUDO_DEFAULT</samp></p>
                                 <p class="p">The MTGP32 generator actually generates 64 distinct sequences based on different parameter sets for the basic algorithm. Let
                                    
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>S</mi>
                                       <mo stretchy="false">(</mo>
                                       <mi>p</mi>
                                       <mo stretchy="false">)</mo>
                                    </math> be the sequence for parameter set 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>p</mi>
                                    </math>.
                                 </p>
                                 <p class="p">The result at offset 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>n</mi>
                                    </math> in global memory is from position 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>n</mi>
                                       <mo>mod</mo>
                                       <mn>256</mn>
                                    </math> from the sequence
                                 </p>
                                 <math xmlns="http://www.w3.org/1998/Math/MathML">
                                    <mi>S</mi>
                                    <mo stretchy="false">(</mo>
                                    <mo fence="false" stretchy="false">⌊</mo>
                                    <mi>n</mi>
                                    <mo>/</mo>
                                    <mn>256</mn>
                                    <mo fence="false" stretchy="false">⌋</mo>
                                    <mo>mod</mo>
                                    <mn>64</mn>
                                    <mo stretchy="false">)</mo>
                                 </math>
                                 <p class="p">In other words, 256 samples from 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>S</mi>
                                       <mo stretchy="false">(</mo>
                                       <mn>0</mn>
                                       <mo stretchy="false">)</mo>
                                    </math> are followed by 256 samples from 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>S</mi>
                                       <mo stretchy="false">(</mo>
                                       <mn>1</mn>
                                       <mo stretchy="false">)</mo>
                                    </math> and so on, up to 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>S</mi>
                                       <mo stretchy="false">(</mo>
                                       <mn>63</mn>
                                       <mo stretchy="false">)</mo>
                                    </math>. This pattern repeats, so the subsequent 256 samples are from 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>S</mi>
                                       <mo stretchy="false">(</mo>
                                       <mn>0</mn>
                                       <mo stretchy="false">)</mo>
                                    </math>, followed by 256 samples from 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>S</mi>
                                       <mo stretchy="false">(</mo>
                                       <mn>1</mn>
                                       <mo stretchy="false">)</mo>
                                    </math>, and so on.
                                 </p>
                              </li>
                           </ul>
                        </li>
                        <li class="li">
                           <p class="p">Philox_4x32_10 pseudorandom generator</p>
                           <ul class="ul">
                              <li class="li">
                                 <p class="p"><samp class="ph codeph">CURAND_ORDERING_PSEUDO_BEST</samp></p>
                                 <p class="p">The output ordering of <samp class="ph codeph">CURAND_ORDERING_PSEUDO_BEST</samp> is the same as <samp class="ph codeph">CURAND_ORDERING_PSEUDO_DEFAULT</samp> in the current release.
                                 </p>
                              </li>
                              <li class="li">
                                 <p class="p"><samp class="ph codeph">CURAND_ORDERING_PSEUDO_DEFAULT</samp></p>
                                 <p class="p">Each thread in the Philox_4x32_10 generator generates a distinct sequence based on a different parameter set for the basic algorithm.
                                    In the host API there are 8192 different sequences. Each group of four values from one sequence is followed by four values from the next sequence.
                                 </p>
                              </li>
                           </ul>
                        </li>
                        <li class="li">
                           <p class="p">32-bit and 64-bit Sobol’ and scrambled Sobol’ quasirandom generators</p>
                           <ul class="ul">
                              <li class="li">
                                 <p class="p"><samp class="ph codeph">CURAND_ORDERING_QUASI_DEFAULT</samp></p>
                                 <p class="p">When generating 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>n</mi>
                                    </math> results in 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>d</mi>
                                    </math> dimensions, the output will consist of 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>n</mi>
                                       <mo>/</mo>
                                       <mi>d</mi>
                                    </math> results from dimension 1, followed by 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>n</mi>
                                       <mo>/</mo>
                                       <mi>d</mi>
                                    </math> results from dimension 2, and so on up to dimension 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>d</mi>
                                    </math>. Only exact multiples of the dimension size may be generated. The dimension parameter 
                                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                                       <mi>d</mi>
                                    </math> is set with <samp class="ph codeph">curandSetQuasiRandomGeneratorDimensions()</samp> and defaults to 1.
                                 </p>
                              </li>
                           </ul>
                        </li>
                     </ul>
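                     <p class="p">The index arithmetic for the pseudorandom orderings described above can be sketched in plain C. This is an illustration only and does not call CURAND: the type <samp class="ph codeph">output_slot</samp> and the functions <samp class="ph codeph">interleaved_slot()</samp> and <samp class="ph codeph">mtgp32_slot()</samp> are hypothetical names. For XORWOW and MRG32k3a the sketch reports which of the 4096 interleaved subsequences an output element belongs to, rather than the absolute position, because the strides of 2<sup>67</sup> and 2<sup>76</sup> do not fit in a 64-bit integer.</p>

```c
#include <stdint.h>

/* Hypothetical helper type, for illustration only: identifies which
 * subsequence an output element comes from and where in it. */
typedef struct {
    uint64_t sequence;  /* index of the subsequence or parameter set */
    uint64_t position;  /* position within that subsequence */
} output_slot;

/* XORWOW and MRG32k3a, CURAND_ORDERING_PSEUDO_DEFAULT: element n is
 * taken from subsequence (n mod 4096) at position floor(n / 4096).
 * Subsequence k starts at k * 2^67 (XORWOW) or k * 2^76 (MRG32k3a) in
 * the original sequence; that start is not computed here because it
 * does not fit in 64 bits. */
static output_slot interleaved_slot(uint64_t n)
{
    output_slot s;
    s.sequence = n % 4096;
    s.position = n / 4096;
    return s;
}

/* MTGP32, following the description in the text: element n comes from
 * position (n mod 256) of sequence S(floor(n / 256) mod 64). */
static output_slot mtgp32_slot(uint64_t n)
{
    output_slot s;
    s.sequence = (n / 256) % 64;
    s.position = n % 256;
    return s;
}
```

                     <p class="p">For example, <samp class="ph codeph">interleaved_slot(4097)</samp> yields subsequence 1 at position 1, and <samp class="ph codeph">mtgp32_slot(256)</samp> yields sequence 1 at position 0, which is the first sample of the second block of 256.</p>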
                  </div>
               </div>
            </div>
            <div class="topic concept nested1" xml:lang="en-us" id="return-values"><a name="return-values" shape="rect">
                  <!-- --></a><h3 class="topictitle3">2.3.&nbsp;Return Values</h3>
               <div class="body conbody">
                  <p class="p">All CURAND host library calls have a return value of <samp class="ph codeph">curandStatus_t</samp>. Calls that succeed without errors return <samp class="ph codeph">CURAND_STATUS_SUCCESS</samp>. If errors occur, other values are returned depending on the error. Because CUDA allows kernels to execute asynchronously
                     from CPU code, it is possible that errors in a non-CURAND kernel will be detected during a call to a library function. In
                     this case, <samp class="ph codeph">CURAND_STATUS_PREEXISTING_ERROR</samp> is returned.
                  </p>
               </div>
            </div>
            <div class="topic concept nested1" xml:lang="en-us" id="generation-functions"><a name="generation-functions" shape="rect">
                  <!-- --></a><h3 class="topictitle3">2.4.&nbsp;Generation Functions</h3>
               <div class="body conbody"><pre xml:space="preserve">
curandStatus_t 
curandGenerate(
    curandGenerator_t generator, 
    unsigned int *outputPtr, size_t num)
</pre><p class="p">The <samp class="ph codeph">curandGenerate()</samp> function is used to generate pseudo- or quasirandom bits of output. For XORWOW, MRG32k3a, MTGP32, and SOBOL32 generators,
                     each output element is a 32-bit unsigned int where all bits are random. For SOBOL64 generators, each output element is a 64-bit
                     unsigned long long where all bits are random.
                  </p><pre xml:space="preserve">
curandStatus_t 
curandGenerateUniform(
    curandGenerator_t generator, 
    float *outputPtr, size_t num)
</pre><p class="p">The <samp class="ph codeph">curandGenerateUniform()</samp> function is used to generate uniformly distributed floating point values between 0.0 and 1.0, where 0.0 is excluded and 1.0
                     is included.
                  </p><pre xml:space="preserve">
curandStatus_t 
curandGenerateNormal(
    curandGenerator_t generator, 
    float *outputPtr, size_t n, 
    float mean, float stddev)
</pre><p class="p">The <samp class="ph codeph">curandGenerateNormal()</samp> function is used to generate normally distributed floating point values with the given mean and standard deviation.
                  </p><pre xml:space="preserve">
curandStatus_t 
curandGenerateLogNormal(
    curandGenerator_t generator, 
    float *outputPtr, size_t n, 
    float mean, float stddev)
</pre><p class="p">The <samp class="ph codeph">curandGenerateLogNormal()</samp> function is used to generate log-normally distributed floating point values based on a normal distribution with the given
                     mean and standard deviation.
                  </p><pre xml:space="preserve">
curandStatus_t 
curandGeneratePoisson(
    curandGenerator_t generator, 
    unsigned int *outputPtr, size_t n, 
    double lambda)
</pre><p class="p">The <samp class="ph codeph">curandGeneratePoisson()</samp> function is used to generate Poisson-distributed integer values based on a Poisson distribution with the given lambda.
                  </p><pre xml:space="preserve">
curandStatus_t
curandGenerateUniformDouble(
    curandGenerator_t generator, 
    double *outputPtr, size_t num)
</pre><p class="p">The <samp class="ph codeph">curandGenerateUniformDouble()</samp> function generates uniformly distributed random numbers in double precision.
                  </p><pre xml:space="preserve">
curandStatus_t
curandGenerateNormalDouble(
    curandGenerator_t generator,
    double *outputPtr, size_t n, 
    double mean, double stddev)
</pre><p class="p"><samp class="ph codeph">curandGenerateNormalDouble()</samp> generates normally distributed results in double precision with the given mean and standard deviation. Double precision results
                     can only be generated on devices of compute capability 1.3 or above, or on the host.
                  </p><pre xml:space="preserve">
curandStatus_t
curandGenerateLogNormalDouble(
    curandGenerator_t generator,
    double *outputPtr, size_t n, 
    double mean, double stddev)
</pre><p class="p"><samp class="ph codeph">curandGenerateLogNormalDouble()</samp> generates log-normally distributed results in double precision, based on a normal distribution with the given mean and standard
                     deviation.
                  </p>
                  <p class="p">For quasirandom generation, the number of results returned must be a multiple of the dimension of the generator.</p>
                  <p class="p">Generation functions can be called multiple times on the same generator to generate successive blocks of results. For pseudorandom
                     generators, multiple calls to generation functions yield the same results as a single call that requests the same total number
                     of results. For quasirandom generators, because of the ordering of dimensions in memory, many shorter calls will not produce
                     the same layout of results in memory as one larger call; however, the generated 
                     <math xmlns="http://www.w3.org/1998/Math/MathML">
                        <mi>n</mi>
                     </math>-dimensional vectors will be the same.
                  </p>
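                  <p class="p">The quasirandom behavior described above can be sketched in plain C without CURAND. This is an illustration only: <samp class="ph codeph">point_value()</samp>, <samp class="ph codeph">fake_generate()</samp>, and <samp class="ph codeph">coordinate()</samp> are hypothetical stand-ins, not library functions, and the sketch assumes the default quasirandom ordering, in which each call writes all of its results for the first dimension, then all of its results for the second dimension, and so on.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a quasirandom sequence: the coordinate of
 * point `index` in dimension `dim`. A real generator returns Sobol values. */
static unsigned int point_value(size_t dim, size_t index)
{
    return (unsigned int)(1000 * dim + index);
}

/* Mimics one generation call. Writes n results, where n must be a multiple
 * of dims, as n/dims values for dimension 0, then n/dims values for
 * dimension 1, and so on. *next_point counts the points that earlier
 * calls on the same (simulated) generator have already produced. */
static void fake_generate(unsigned int *out, size_t n, size_t dims,
                          size_t *next_point)
{
    size_t points, d, p;

    assert(dims > 0 && n % dims == 0);
    points = n / dims;
    for (d = 0; d < dims; d++) {
        for (p = 0; p < points; p++) {
            out[d * points + p] = point_value(d, *next_point + p);
        }
    }
    *next_point += points;
}

/* Reads coordinate `dim` of point `p` from a buffer that a single call
 * filled with `points` points. */
static unsigned int coordinate(const unsigned int *buf, size_t points,
                               size_t dim, size_t p)
{
    return buf[dim * points + p];
}
```

                  <p class="p">With two dimensions, one call for 8 results stores the four first-dimension values followed by the four second-dimension values. Two calls for 4 results each store two values per dimension per call, so the bytes in memory differ, yet reading point 2 through <samp class="ph codeph">coordinate()</samp> gives the same two-dimensional vector in both cases.</p>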
                  <p class="p">Double precision results can only be generated on devices of compute capability 1.3 or above, or on the host.</p>
               </div>
            </div>
            <div class="topic concept nested1" xml:lang="en-us" id="host-api-example"><a name="host-api-example" shape="rect">
                  <!-- --></a><h3 class="topictitle3">2.5.&nbsp;Host API Example</h3>
               <div class="body conbody"><pre xml:space="preserve">

/*
 * This program uses the host CURAND API to generate 100 
 * pseudorandom floats.
 */
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;cuda_runtime.h&gt;
#include &lt;curand.h&gt;

#define CUDA_CALL(x) do { if((x)!=cudaSuccess) { \
    printf("Error at %s:%d\n",__FILE__,__LINE__);\
    return EXIT_FAILURE;}} while(0)
#define CURAND_CALL(x) do { if((x)!=CURAND_STATUS_SUCCESS) { \
    printf("Error at %s:%d\n",__FILE__,__LINE__);\
    return EXIT_FAILURE;}} while(0)

int main(int argc, char *argv[])
{
    size_t n = 100;
    size_t i;
    curandGenerator_t gen;
    float *devData, *hostData;

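    /* Allocate n floats on host and stop early if allocation fails */
    hostData = (float *)calloc(n, sizeof(float));
    if (hostData == NULL) {
        printf("Host allocation failed\n");
        return EXIT_FAILURE;
    }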

    /* Allocate n floats on device */
    CUDA_CALL(cudaMalloc((void **)&amp;devData, n*sizeof(float)));

    /* Create pseudo-random number generator */
    CURAND_CALL(curandCreateGenerator(&amp;gen, 
                CURAND_RNG_PSEUDO_DEFAULT));
    
    /* Set seed */
    CURAND_CALL(curandSetPseudoRandomGeneratorSeed(gen, 
                1234ULL));

    /* Generate n floats on device */
    CURAND_CALL(curandGenerateUniform(gen, devData, n));

    /* Copy device memory to host */
    CUDA_CALL(cudaMemcpy(hostData, devData, n * sizeof(float),
        cudaMemcpyDeviceToHost));

    /* Show result */
    for(i = 0; i &lt; n; i++) {
        printf("%1.4f ", hostData[i]);
    }
    printf("\n");

    /* Cleanup */
    CURAND_CALL(curandDestroyGenerator(gen));
    CUDA_CALL(cudaFree(devData));
    free(hostData);    
    return EXIT_SUCCESS;
}

</pre></div>
            </div>
            <div class="topic concept nested1" xml:lang="en-us" id="performance-notes2"><a name="performance-notes2" shape="rect">
                  <!-- --></a><h3 class="topictitle3">2.6.&nbsp;Performance Notes</h3>
               <div class="body conbody">
                  <p class="p">In general, you will get the best performance from the CURAND library by generating blocks of random numbers that are as large
                     as possible. A few calls that each generate many random numbers are more efficient than many calls that each generate only a few.
                     The default pseudorandom generator, XORWOW, with the default ordering takes some time to set up the first time it is called.
                     Subsequent generation calls do not require this setup. To avoid this setup time, use the <samp class="ph codeph">CURAND_ORDERING_PSEUDO_SEEDED</samp> ordering.
                  </p>
                  <p class="p">The MTGP32 Mersenne Twister algorithm is closely tied to the thread and block count. The state structure for MTGP32 actually
                     contains the state for 256 consecutive samples from a given sequence, as determined by a specific parameter set. Each of 64
                     blocks uses a different parameter set, and each of 256 threads generates one sample from the state and then updates the state.
                     Hence the most efficient use of MTGP32 is to generate a multiple of 16384 (64 × 256) samples.
                  </p>
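                  <p class="p">One way to apply the MTGP32 guidance is to round a requested sample count up to the next multiple of 16384 before allocating the output buffer and calling a generation function. The following is a minimal sketch in plain C; the macro names and <samp class="ph codeph">mtgp32_padded_count()</samp> are hypothetical helpers for illustration and are not part of the CURAND API.</p>

```c
#include <assert.h>
#include <stddef.h>

/* 64 blocks, each with 256 threads, as described in the text above. */
#define MTGP32_BLOCKS 64
#define MTGP32_THREADS_PER_BLOCK 256
#define MTGP32_BATCH ((size_t)MTGP32_BLOCKS * MTGP32_THREADS_PER_BLOCK)

/* Hypothetical helper: returns the smallest multiple of 16384 that is
 * greater than or equal to `wanted`. A request of 0 stays 0. */
static size_t mtgp32_padded_count(size_t wanted)
{
    /* Count whole batches, plus one more if there is a remainder. */
    size_t batches = wanted / MTGP32_BATCH + (wanted % MTGP32_BATCH != 0);
    size_t padded = batches * MTGP32_BATCH;

    assert(padded >= wanted);   /* guards against size_t overflow */
    return padded;
}
```

                  <p class="p">For example, a request for 100000 samples would be padded to 114688 (7 × 16384). The caller can generate the padded count and simply ignore the extra values at the end of the buffer.</p>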
               </div>
            </div>
         </div>
         
         <hr id="contents-end"></hr>
         <div id="release-info">CURAND
            (<a href="../../pdf/CURAND_Library.pdf">PDF</a>)
            -
            CUDA Toolkit v5.5
            (<a href="https://developer.nvidia.com/cuda-toolkit-archive">older</a>)
            -
            Last updated 
            July 19, 2013
            -
            <a href="mailto:cudatools@nvidia.com?subject=CUDA Tools Documentation Feedback: curand">Send Feedback</a></div>
         
      </article>
      
      <nav id="site-nav">
         <div class="category"><span class="twiddle">▼</span><a href="index.html" title="CURAND">CURAND</a></div>
         <ul>
            <li><a href="introduction.html#introduction">Introduction</a></li>
            <li><a href="compatibility-and-versioning.html#compatibility-and-versioning">1.&nbsp;Compatibility and Versioning</a></li>
            <li><a href="host-api-overview.html#host-api-overview">2.&nbsp;Host API Overview</a><ul>
                  <li><a href="host-api-overview.html#generator-types">2.1.&nbsp;Generator Types</a></li>
                  <li><a href="host-api-overview.html#generator-options">2.2.&nbsp;Generator Options</a><ul>
                        <li><a href="host-api-overview.html#seed">2.2.1.&nbsp;Seed</a></li>
                        <li><a href="host-api-overview.html#offset">2.2.2.&nbsp;Offset</a></li>
                        <li><a href="host-api-overview.html#order">2.2.3.&nbsp;Order</a></li>
                     </ul>
                  </li>
                  <li><a href="host-api-overview.html#return-values">2.3.&nbsp;Return Values</a></li>
                  <li><a href="host-api-overview.html#generation-functions">2.4.&nbsp;Generation Functions</a></li>
                  <li><a href="host-api-overview.html#host-api-example">2.5.&nbsp;Host API Example</a></li>
                  <li><a href="host-api-overview.html#performance-notes2">2.6.&nbsp;Performance Notes</a></li>
               </ul>
            </li>
            <li><a href="device-api-overview.html#device-api-overview">3.&nbsp;Device API Overview</a><ul>
                  <li><a href="device-api-overview.html#pseudorandom-sequences">3.1.&nbsp;Pseudorandom Sequences</a><ul>
                        <li><a href="device-api-overview.html#bit-generation-1">3.1.1.&nbsp;Bit Generation with XORWOW and MRG32k3a generators</a></li>
                        <li><a href="device-api-overview.html#bit-generation-2">3.1.2.&nbsp;Bit Generation with the MTGP32 generator</a></li>
                        <li><a href="device-api-overview.html#distributions">3.1.3.&nbsp;Distributions</a></li>
                     </ul>
                  </li>
                  <li><a href="device-api-overview.html#quasirandom-sequences">3.2.&nbsp;Quasirandom Sequences</a></li>
                  <li><a href="device-api-overview.html#skip-ahead">3.3.&nbsp;Skip-Ahead</a></li>
                  <li><a href="device-api-overview.html#device-api-for-discrete-distributions">3.4.&nbsp;Device API for discrete distributions</a></li>
                  <li><a href="device-api-overview.html#performance-notes">3.5.&nbsp;Performance Notes</a></li>
                  <li><a href="device-api-overview.html#device-api-example">3.6.&nbsp;Device API Examples</a></li>
                  <li><a href="device-api-overview.html#thrust-and-curand-example">3.7.&nbsp;Thrust and CURAND Example</a></li>
                  <li><a href="device-api-overview.html#poisson-api-example">3.8.&nbsp;Poisson API Example</a></li>
               </ul>
            </li>
            <li><a href="testing.html#testing">4.&nbsp;Testing</a></li>
            <li><a href="modules.html#modules">5.&nbsp;Modules</a><ul>
                  <li><a href="group__HOST.html#group__HOST">5.1.&nbsp;Host API</a></li>
                  <li><a href="group__DEVICE.html#group__DEVICE">5.2.&nbsp;Device API</a></li>
               </ul>
            </li>
            <li><a href="bibliography.html#bibliography">A.&nbsp;Bibliography</a></li>
            <li><a href="acknowledgements.html#acknowledgements">B.&nbsp;Acknowledgements</a></li>
            <li><a href="notices-header.html#notices-header">Notices</a><ul></ul>
            </li>
         </ul>
      </nav>
      <script language="JavaScript" type="text/javascript" charset="utf-8" src="../common/formatting/common.min.js"></script>
      </body>
</html>