<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-us" xml:lang="en-us">
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8"></meta>
      <meta http-equiv="X-UA-Compatible" content="IE=edge"></meta>
      <meta name="copyright" content="(C) Copyright 2005"></meta>
      <meta name="DC.rights.owner" content="(C) Copyright 2005"></meta>
      <meta name="DC.Type" content="cuda_reference"></meta>
      <meta name="DC.Title" content="Introduction"></meta>
      <meta name="DC.Format" content="XHTML"></meta>
      <meta name="DC.Identifier" content="r_main"></meta>
      <link rel="stylesheet" type="text/css" href="../common/formatting/commonltr.css"></link>
      <link rel="stylesheet" type="text/css" href="../common/formatting/site.css"></link>
      <title>Debugger API :: CUDA Toolkit Documentation</title>
      <!--[if lt IE 9]>
      <script src="../common/formatting/html5shiv-printshiv.min.js"></script>
      <![endif]-->
      <script type="text/javascript" charset="utf-8" src="../common/formatting/jquery.min.js"></script>
      <script type="text/javascript" charset="utf-8" src="../common/formatting/jquery.ba-hashchange.min.js"></script>
      <link rel="canonical" href="http://docs.nvidia.com/cuda/debugger-api/index.html"></link>
      <link rel="stylesheet" type="text/css" href="../common/formatting/qwcode.highlight.css"></link>
   </head>
   <body>
      
      <article id="contents">
         <div id="release-info">Debugger API
            (<a href="../../pdf/CUDA_Debugger_API.pdf">PDF</a>)
            -
            CUDA Toolkit v5.5
            (<a href="https://developer.nvidia.com/cuda-toolkit-archive">older</a>)
            -
            Last updated 
            July 19, 2013
            -
            <a href="mailto:cudatools@nvidia.com?subject=CUDA Tools Documentation Feedback: debugger-api">Send Feedback</a></div>
         <div class="topic nested1" id="r_main"><a name="r_main" shape="rect">
               <!-- --></a><h2 class="topictitle2">1.&nbsp;Introduction</h2>
            <div class="body refbody">
               <div class="section">
                  <p class="p">This document describes the API for the set of routines and data structures available in
                     the CUDA library to any debugger.
                  </p>
               </div>
               <div class="section">
                  <div class="p">Starting with 3.0, the CUDA debugger API includes several major changes, only a few of which
                     are directly visible to end users: 
                     <ul class="ul">
                        <li class="li">
                           <p class="p">Performance is greatly improved, both with respect to interactions with the
                              debugger and the performance of applications being debugged. 
                           </p>
                        </li>
                        <li class="li">
                           <p class="p">The format of cubins has changed to ELF and, as a consequence, most
                              restrictions on debug compilations have been lifted. More information about the
                              new object format is included below.
                           </p>
                        </li>
                     </ul>
                     The debugger API itself has also changed significantly, as reflected in the CUDA-GDB sources.
                  </div>
               </div>
            </div>
            <div class="topic reference cuda_reference nested1" id="r_api"><a name="r_api" shape="rect">
                  <!-- --></a><h3 class="topictitle3">1.1.&nbsp;Debugger API</h3>
               <div class="body refbody">
                  <div class="section">
                     <p class="p">The CUDA Debugger API was developed with the goal of adhering to the following principles:</p>
                     <div class="p">
                        <ul class="ul">
                           <li class="li">
                              <p class="p">Policy free </p>
                           </li>
                           <li class="li">
                              <p class="p">Explicit </p>
                           </li>
                           <li class="li">
                              <p class="p">Axiomatic </p>
                           </li>
                           <li class="li">
                              <p class="p">Extensible </p>
                           </li>
                           <li class="li">
                              <p class="p">Machine oriented</p>
                           </li>
                        </ul>
                        
                        Being explicit is another way of saying that we minimize the assumptions we make. As much as possible, the API reflects machine
                        state, not internal state.
                     </div>
                     <p class="p">A device is in one of two major "modes": stopped or running. We switch between these modes explicitly with suspendDevice
                        and resumeDevice, though the machine may suspend of its own accord, for example when hitting a breakpoint.
                     </p>
                     <p class="p">The machine's state can be queried only when the device is stopped. Warp state includes which function the warp is running,
                        which block it belongs to, which lanes are valid, etc.
                     </p>
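                     <p class="p">The stopped/running rule can be illustrated with a small toy model. This is only a sketch: the <code>Device</code> class and
                        its <code>suspend</code>, <code>resume</code>, <code>hit_breakpoint</code> and <code>read_warp_state</code> methods are hypothetical stand-ins,
                        not the real suspendDevice and resumeDevice entry points, and the warp state values are placeholders.
                     </p>

```python
from enum import Enum


class Mode(Enum):
    RUNNING = "running"
    STOPPED = "stopped"


class Device:
    """Toy model of one device; hypothetical, not the real debugger API."""

    def __init__(self):
        self.mode = Mode.RUNNING
        # Placeholder warp state; the fields mirror the examples in the text.
        self._warp_state = {
            "function": "kernel",
            "block": (0, 0, 0),
            "valid_lanes": 0xFFFFFFFF,
        }

    def suspend(self):
        """Explicit stop, standing in for suspendDevice."""
        self.mode = Mode.STOPPED

    def resume(self):
        """Explicit restart, standing in for resumeDevice."""
        self.mode = Mode.RUNNING

    def hit_breakpoint(self):
        """The machine may also stop without being asked to."""
        self.mode = Mode.STOPPED

    def read_warp_state(self):
        """Return a copy of the warp state, but only while stopped."""
        if self.mode is not Mode.STOPPED:
            raise RuntimeError("device must be stopped before its state is queried")
        return dict(self._warp_state)
```

                     <p class="p">Calling <code>read_warp_state()</code> on a freshly created <code>Device</code> raises an error; after <code>suspend()</code>
                        or <code>hit_breakpoint()</code> it returns the placeholder state.
                     </p>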
                  </div>
               </div>
            </div>
            <div class="topic reference cuda_reference nested1" id="r_elf"><a name="r_elf" shape="rect">
                  <!-- --></a><h3 class="topictitle3">1.2.&nbsp;ELF and DWARF</h3>
               <div class="body refbody">
                  <div class="section">
                     <p class="p">CUDA applications are compiled in ELF binary format.</p>
                     <p class="p">DWARF device information is obtained through a <a href="structCUDBGEvent.html" shape="rect">CUDBGEvent</a> of type CUDBG_EVENT_ELF_IMAGE_LOADED. This means that the information is not available until runtime, after the CUDA driver
                        has loaded.
                     </p>
                     <p class="p">DWARF device information contains physical addresses for all device memory regions except for code memory. The address class
                        field (DW_AT_address_class) is set for all device variables, and is used to indicate the memory segment type (ptxStorageKind).
                        The physical addresses must be accessed using several segment-specific API calls.
                     </p>
                     <div class="p">For memory reads, see: 
                        <ul class="ul">
                           <li class="li">
                              <p class="p"><a href="group__READ.html#group__READ_1g96d8d7f7158aacc75b0013fb14e070df" shape="rect">CUDBGAPI_st::readCodeMemory()</a></p>
                           </li>
                           <li class="li">
                              <p class="p"><a href="group__READ.html#group__READ_1g403214b4c091fa8f1805e652fa720717" shape="rect">CUDBGAPI_st::readConstMemory()</a></p>
                           </li>
                           <li class="li">
                              <p class="p"><a href="group__READ.html#group__READ_1g3a55358b9bdbc9284a1f3d8a673627c7" shape="rect">CUDBGAPI_st::readGlobalMemory()</a></p>
                           </li>
                           <li class="li">
                              <p class="p"><a href="group__READ.html#group__READ_1g2f5bf430b5202e893f896a4e53e7473e" shape="rect">CUDBGAPI_st::readParamMemory()</a></p>
                           </li>
                           <li class="li">
                              <p class="p"><a href="group__READ.html#group__READ_1ge242a5b3d2877bb06e69b29e08079d04" shape="rect">CUDBGAPI_st::readSharedMemory()</a></p>
                           </li>
                           <li class="li">
                              <p class="p"><a href="group__READ.html#group__READ_1g81729a1eb1b4d90e63f505bc4e407917" shape="rect">CUDBGAPI_st::readLocalMemory()</a></p>
                           </li>
                           <li class="li">
                              <p class="p"><a href="group__READ.html#group__READ_1g0d793af43e61047ee8069835d4407819" shape="rect">CUDBGAPI_st::readTextureMemory()</a></p>
                           </li>
                        </ul>
                        
                        For memory writes, see: 
                        <ul class="ul">
                           <li class="li">
                              <p class="p"><a href="group__WRITE.html#group__WRITE_1g8ff20f825d68bee5174c0c83b443943d" shape="rect">CUDBGAPI_st::writeGlobalMemory()</a></p>
                           </li>
                           <li class="li">
                              <p class="p"><a href="group__WRITE.html#group__WRITE_1gf2396f87598ff9edb9c2cf0d0d9e51c2" shape="rect">CUDBGAPI_st::writeParamMemory()</a></p>
                           </li>
                           <li class="li">
                              <p class="p"><a href="group__WRITE.html#group__WRITE_1gec20ac438034a0a94f548ee9c1cf13cc" shape="rect">CUDBGAPI_st::writeSharedMemory()</a></p>
                           </li>
                           <li class="li">
                              <p class="p"><a href="group__WRITE.html#group__WRITE_1g8552847e859c3014eb5cf021ca1a07ff" shape="rect">CUDBGAPI_st::writeLocalMemory()</a></p>
                           </li>
                        </ul>
                        
                        Access to code memory requires a virtual address. This virtual address is embedded for all device code sections in the device
                        ELF image. See the API call: 
                        <ul class="ul">
                           <li class="li">
                              <p class="p"><a href="group__READ.html#group__READ_1gb8f4830c29701bae9198de2351f51985" shape="rect">CUDBGAPI_st::readVirtualPC()</a></p>
                           </li>
                        </ul>
                        
                        Here is a typical DWARF entry for a device variable located in memory:
                     </div>
                     <div class="p"><pre xml:space="preserve">&lt;2&gt;&lt;321&gt;: Abbrev Number: 18 (DW_TAG_formal_parameter)
     DW_AT_decl_file   : 27
     DW_AT_decl_line   : 5
     DW_AT_name        : res
     DW_AT_type        : &lt;2c6&gt;
     DW_AT_location    : 9 byte block: 3 18 0 0 0 0 0 0 0       (DW_OP_addr: 18)
     DW_AT_address_class: 7</pre></div>
                     <p class="p">The above shows that variable 'res' has an address class of 7 (ptxParamStorage). Its location information shows it is located
                        at address 18 within the parameter memory segment.
                     </p>
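                     <p class="p">As a rough illustration of how the entry above can be read, the sketch below decodes the 9-byte location block and pairs
                        the address class with a read call. It assumes the dump prints bytes in hexadecimal (so "18" is 0x18) and that the DW_OP_addr
                        operand is little-endian. The helper names and both lookup tables are hypothetical; only the two address classes named in this
                        document are listed, and the pairing with readParamMemory is an illustrative guess at how a client might dispatch.
                     </p>

```python
DW_OP_ADDR = 0x03  # DWARF opcode for DW_OP_addr

# Only the two address classes named in this document are listed here.
ADDRESS_CLASS_NAMES = {2: "ptxRegStorage", 7: "ptxParamStorage"}

# Illustrative pairing of a storage kind with a segment-specific read call.
READ_CALL_FOR_CLASS = {"ptxParamStorage": "readParamMemory"}


def parse_addr_block(hex_bytes):
    """Decode a DW_AT_location block that holds a single DW_OP_addr.

    hex_bytes is the space-separated hex dump as printed in the DWARF
    listing, for example "3 18 0 0 0 0 0 0 0".
    """
    raw = bytes(int(token, 16) for token in hex_bytes.split())
    if not raw or raw[0] != DW_OP_ADDR:
        raise ValueError("block does not start with DW_OP_addr")
    # Assumption: the operand is a little-endian address filling the rest of the block.
    return int.from_bytes(raw[1:], "little")


def describe(name, hex_bytes, address_class):
    """Summarize where a variable lives and which read call would fetch it."""
    storage = ADDRESS_CLASS_NAMES.get(address_class, "unknown")
    address = parse_addr_block(hex_bytes)
    reader = READ_CALL_FOR_CLASS.get(storage, "unknown")
    return f"{name}: {storage} at {address:#x}, read with {reader}"


print(describe("res", "3 18 0 0 0 0 0 0 0", 7))
# prints: res: ptxParamStorage at 0x18, read with readParamMemory
```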
                     <p class="p">Local variables are no longer spilled to local memory by default. The DWARF now contains variable-to-register mapping and
                        liveness information for all variables. Variables may still be spilled to local memory; when they are, this is also recorded
                        in the DWARF information, which is ULEB128 encoded (as a DW_OP_regx stack operation in the DW_AT_location attribute).
                     </p>
                     <p class="p">Here is a typical DWARF entry for a variable located in a local register:</p>
                     <div class="p"><pre xml:space="preserve">&lt;3&gt;&lt;359&gt;: Abbrev Number: 20 (DW_TAG_variable)
     DW_AT_decl_file   : 27
     DW_AT_decl_line   : 7
     DW_AT_name        : c
     DW_AT_type        : &lt;1aa&gt;
     DW_AT_location    : 7 byte block: 90 b9 e2 90 b3 d6 4      (DW_OP_regx: 160631632185)
     DW_AT_address_class: 2</pre></div>
                     <p class="p">This shows variable 'c' has address class 2 (ptxRegStorage) and its location can be found by decoding the ULEB128 value, DW_OP_regx:
                        160631632185. See cuda-tdep.c in the cuda-gdb source drop for information on decoding this value and how to obtain which physical
                        register holds this variable during a specific device PC range.
                     </p>
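                     <p class="p">The ULEB128 step can be checked by hand. The sketch below decodes the 7-byte block from the listing: the first byte (0x90)
                        is the DW_OP_regx opcode and the remaining six bytes are the ULEB128 operand. The function names are hypothetical, and this
                        covers only the integer decoding; how the value maps to a physical register over a PC range is handled in cuda-tdep.c and is
                        not reproduced here.
                     </p>

```python
DW_OP_REGX = 0x90  # DWARF opcode for DW_OP_regx


def decode_uleb128(data):
    """Return (value, bytes_consumed) for the ULEB128 number at the start of data."""
    value = 0
    for index, byte in enumerate(data):
        # divmod splits the continuation flag (top bit) from the 7 payload bits.
        more, payload = divmod(byte, 0x80)
        # Each successive byte carries the next 7 bits, least significant first.
        value += payload * 128 ** index
        if not more:
            return value, index + 1
    raise ValueError("truncated ULEB128 value")


def parse_regx_block(hex_bytes):
    """Decode a DW_AT_location block that holds a single DW_OP_regx."""
    raw = bytes(int(token, 16) for token in hex_bytes.split())
    if not raw or raw[0] != DW_OP_REGX:
        raise ValueError("block does not start with DW_OP_regx")
    value, _ = decode_uleb128(raw[1:])
    return value


encoded = parse_regx_block("90 b9 e2 90 b3 d6 4")
print(encoded)  # 160631632185, the value shown in the listing

# Observation only, not documented behavior: the integer's bytes are printable ASCII.
print(encoded.to_bytes(5, "big").decode("ascii"))  # %fd19
```

                     <p class="p">The second print is an observation rather than a documented rule: read as big-endian bytes, the decoded value spells
                        <code>%fd19</code>, which resembles a PTX virtual register name. Consult cuda-tdep.c before relying on that interpretation.
                     </p>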
                     <div class="p">Access to physical register liveness information requires a 0-based physical PC. See the API call: 
                        <ul class="ul">
                           <li class="li">
                              <p class="p"><a href="group__READ.html#group__READ_1g4e5d98dced2544bbe90d0a9483527f3f" shape="rect">CUDBGAPI_st::readPC()</a></p>
                           </li>
                        </ul>
                     </div>
                  </div>
               </div>
            </div>
            <div class="topic reference cuda_reference nested1" id="r_abi31"><a name="r_abi31" shape="rect">
                  <!-- --></a><h3 class="topictitle3">1.3.&nbsp;ABI Support</h3>
               <div class="body refbody">
                  <div class="section">
                     <div class="p">ABI support is handled through the following thread API calls: 
                        <ul class="ul">
                           <li class="li">
                              <p class="p"><a href="group__READ.html#group__READ_1g6b438d1fd6d089bc430dd8ba6b53daf8" shape="rect">CUDBGAPI_st::readCallDepth()</a></p>
                           </li>
                           <li class="li">
                              <p class="p"><a href="group__READ.html#group__READ_1g31dee949a5b53d5c509668c764ec9171" shape="rect">CUDBGAPI_st::readReturnAddress()</a></p>
                           </li>
                           <li class="li">
                              <p class="p"><a href="group__READ.html#group__READ_1ga2b518d57cfab4feba42e1fcfffb5913" shape="rect">CUDBGAPI_st::readVirtualReturnAddress()</a></p>
                           </li>
                        </ul>
                        
                        The return address is not accessible on the local stack and the API call must be used to access its value.
                     </div>
                     <p class="p">For more information, please refer to the ABI documentation titled "Fermi ABI: Application Binary Interface".</p>
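                     <p class="p">Because the return address cannot be read from the local stack, a client that wants one return address per call level
                        has to ask the API for each level. The sketch below shows that loop against a stand-in object. <code>FakeDebugger</code>, its
                        Python-style method names, and the (dev, sm, wp, ln) coordinate parameters are assumptions made for illustration; they borrow
                        the names of the calls listed above but are not bindings to the real C entry points, and the addresses are made up.
                     </p>

```python
class FakeDebugger:
    """Stand-in that serves canned data; hypothetical, not the real API."""

    def __init__(self, return_addresses):
        # return_addresses[level] is the return address of the frame at that level.
        self._return_addresses = list(return_addresses)

    def read_call_depth(self, dev, sm, wp, ln):
        """Stand-in for readCallDepth: number of call levels for this thread."""
        return len(self._return_addresses)

    def read_return_address(self, dev, sm, wp, ln, level):
        """Stand-in for readReturnAddress at the given call level."""
        return self._return_addresses[level]


def collect_return_addresses(dbg, dev, sm, wp, ln):
    """Ask the API for each level's return address instead of reading the stack."""
    depth = dbg.read_call_depth(dev, sm, wp, ln)
    return [dbg.read_return_address(dev, sm, wp, ln, level) for level in range(depth)]


dbg = FakeDebugger([0x1A8, 0x90])  # made-up addresses
frames = collect_return_addresses(dbg, dev=0, sm=0, wp=0, ln=0)
print([hex(address) for address in frames])  # ['0x1a8', '0x90']
```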
                  </div>
               </div>
            </div>
            <div class="topic reference cuda_reference nested1" id="r_exceptions31"><a name="r_exceptions31" shape="rect">
                  <!-- --></a><h3 class="topictitle3">1.4.&nbsp;Exception Reporting</h3>
               <div class="body refbody">
                  <div class="section">
                     <div class="p">Some kernel exceptions are reported as device events and accessible via the API call: 
                        <ul class="ul">
                           <li class="li">
                              <p class="p"><a href="group__READ.html#group__READ_1g67afc64c7b4e87e14bad401242a2077a" shape="rect">CUDBGAPI_st::readLaneException()</a></p>
                           </li>
                        </ul>
                        
                        The reported exceptions are listed in the CUDBGException_t enum type. Each prefix (Device, Warp, Lane) refers to the precision
                        of the exception, that is, the lowest known execution unit responsible for the exception. All lane errors
                        are precise; the exact instruction and lane that caused the error are known. Warp errors are typically reported within a few instructions
                        of where the actual error occurred, but the exact lane within the warp is not known. On device errors, we <em class="ph i">may</em> know the <em class="ph i">kernel</em> that caused it. Each exception type is explained in the documentation of the enum.
                     </div>
                     <p class="p">Exception reporting is only supported on Fermi and later GPUs (sm_20 or greater). </p>
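                     <p class="p">The prefix convention lends itself to a simple lookup. The sketch below derives the precision from an enumerator name
                        and summarizes what the text says is known at that precision. Both functions are hypothetical helpers, and the enumerator
                        names in the usage lines are written from memory of the CUDBGException_t enum; treat them as assumptions and check them
                        against the header before relying on them.
                     </p>

```python
def exception_precision(name):
    """Return 'device', 'warp', or 'lane' from an exception enumerator name."""
    prefix = "CUDBG_EXCEPTION_"
    if not name.startswith(prefix):
        raise ValueError(f"not an exception name: {name}")
    # The word right after the common prefix names the execution unit.
    scope = name[len(prefix):].split("_", 1)[0].lower()
    if scope not in ("device", "warp", "lane"):
        raise ValueError(f"no precision prefix in: {name}")
    return scope


def known_location(precision):
    """Summarize what the text says is known at each precision."""
    return {
        "lane": "exact instruction and lane",
        "warp": "within a few instructions; lane unknown",
        "device": "possibly the kernel",
    }[precision]


# Enumerator names below are assumptions; verify them against the header.
for name in (
    "CUDBG_EXCEPTION_LANE_ILLEGAL_ADDRESS",
    "CUDBG_EXCEPTION_WARP_ILLEGAL_INSTRUCTION",
    "CUDBG_EXCEPTION_DEVICE_HARDWARE_STACK_OVERFLOW",
):
    precision = exception_precision(name)
    print(f"{name}: {precision} ({known_location(precision)})")
```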
                  </div>
               </div>
            </div>
            <div class="topic reference cuda_reference nested1" id="r_attach"><a name="r_attach" shape="rect">
                  <!-- --></a><h3 class="topictitle3">1.5.&nbsp;Attaching and Detaching</h3>
               <div class="body refbody">
                  <div class="section">
                     <p class="p">The debug client must take the following steps to attach to a running CUDA application:</p>
                     <ol class="ol">
                        <li class="li">
                           <p class="p">Attach to the CPU process corresponding to the CUDA application. The CPU part of the application will be frozen at this point.</p>
                        </li>
                        <li class="li">
                           <p class="p">Check to see if the CUDBG_IPC_FLAG_NAME variable is accessible from the memory space of the application. If not, it implies
                              that the application has not loaded the CUDA driver, and attaching to the application is complete.
                           </p>
                        </li>
                        <li class="li">
                           <p class="p">Make a dynamic function call to the function cudbgApiInit() with an argument of "2", i.e., "cudbgApiInit(2)". This causes
                              a helper process to be forked off from the application, which assists in attaching to the CUDA process.
                           </p>
                        </li>
                        <li class="li">
                           <p class="p">Ensure that the initialization of the CUDA debug API is complete, or wait until API initialization is successful.</p>
                        </li>
                        <li class="li">
                           <p class="p">Make the "initializeAttachStub()" API call to initialize the helper process that was forked off from the application earlier.</p>
                        </li>
                        <li class="li">
                           <p class="p">Read the value of the CUDBG_ATTACH_HANDLER_AVAILABLE variable from the memory space of the application:</p>
                           <ul class="ul">
                              <li class="li">
                                 <p class="p">If the value is non-zero, resume the CUDA application so that more data can be collected about the application and sent to
                                    the debugger. When the application is resumed, the debug client can expect to receive various CUDA events from the CUDA application.
                                    Once all state has been collected, the debug client will receive the event CUDBG_EVENT_ATTACH_COMPLETE.
                                 </p>
                              </li>
                              <li class="li">
                                 <p class="p">If the value is zero, there is no more attach data to collect. Set the CUDBG_IPC_FLAG_NAME variable to 1 in the application's
                                    process space, which enables further events from the CUDA application.
                                 </p>
                              </li>
                           </ul>
                        </li>
                        <li class="li">
                           <p class="p">At this point, attaching to the CUDA application is complete and all GPUs belonging to the CUDA application will be suspended.</p>
                        </li>
                     </ol>
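                     <p class="p">The attach steps above can be sketched as a single control flow. This is only an outline: <code>RecordingClient</code> is a
                        hypothetical stand-in that logs each action as a string instead of touching a real process, and its methods and constructor
                        flags are assumptions. The variable, function, and event names in the strings are the ones used in the steps.
                     </p>

```python
class RecordingClient:
    """Hypothetical debug client that logs each action instead of performing it."""

    def __init__(self, driver_loaded=True, handler_available=1):
        self.driver_loaded = driver_loaded
        self.handler_available = handler_available
        self.log = []

    def do(self, action):
        self.log.append(action)

    def symbol_accessible(self, name):
        # Stand-in for probing the application's memory space for a variable.
        return self.driver_loaded

    def read_variable(self, name):
        # Stand-in for reading a variable from the application's memory space.
        return self.handler_available


def attach(client):
    client.do("attach to CPU process")                                # step 1
    if not client.symbol_accessible("CUDBG_IPC_FLAG_NAME"):           # step 2
        return "cpu-only"
    client.do("call cudbgApiInit(2)")                                 # step 3
    client.do("wait for debug API initialization")                    # step 4
    client.do("call initializeAttachStub()")                          # step 5
    if client.read_variable("CUDBG_ATTACH_HANDLER_AVAILABLE") != 0:   # step 6
        client.do("resume application")
        client.do("wait for CUDBG_EVENT_ATTACH_COMPLETE")
    else:
        client.do("set CUDBG_IPC_FLAG_NAME = 1")
    return "attached"                                                 # step 7


client = RecordingClient(handler_available=0)
print(attach(client))
print(client.log[-1])
```

                     <p class="p">With <code>handler_available=0</code> the sketch skips the resume and ends by setting CUDBG_IPC_FLAG_NAME; with
                        <code>driver_loaded=False</code> it stops after the CPU attach.
                     </p>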
                  </div>
                  <div class="section">
                     <p class="p">The debug client must take the following steps to detach from a running CUDA application:</p>
                     <ol class="ol">
                        <li class="li">
                           <p class="p">Check to see if the CUDBG_IPC_FLAG_NAME variable is accessible from the memory space of the application, and that the CUDA
                              debug API is initialized. If either of these conditions is not met, treat the application as CPU-only and detach from the
                              application.
                           </p>
                        </li>
                        <li class="li">
                           <p class="p">Next, make the "clearAttachState" API call to prepare the CUDA application for detach.</p>
                        </li>
                        <li class="li">
                           <p class="p">Read the value of the CUDBG_ATTACH_HANDLER_AVAILABLE variable from the memory space of the application. If the value is non-zero,
                              make the "requestCleanupOnDetach" API call.
                           </p>
                        </li>
                        <li class="li">
                           <p class="p">Set the CUDBG_DEBUGGER_INITIALIZED variable to 0 in the memory space of the application. This makes sure the debugger is reinitialized
                              from scratch if the debug client re-attaches to the application in the future.
                           </p>
                        </li>
                        <li class="li">
                           <p class="p">If the value of the CUDBG_ATTACH_HANDLER_AVAILABLE variable was found to be non-zero in step 3, delete all breakpoints and
                              resume the CUDA application. This allows the CUDA driver to perform cleanups before the debug client detaches from it. Once
                              the cleanup is complete, the debug client will receive the event CUDBG_EVENT_DETACH_COMPLETE.
                           </p>
                        </li>
                        <li class="li">
                           <p class="p">Set the CUDBG_IPC_FLAG_NAME variable to zero in the memory space of the application. This prevents any more callbacks from
                              the CUDA application to the debugger.
                           </p>
                        </li>
                        <li class="li">
                           <p class="p">The client must then finalize the CUDA debug API.</p>
                        </li>
                        <li class="li">
                           <p class="p">Finally, detach from the CPU part of the CUDA application. At this point all GPUs belonging to the CUDA application will be
                              resumed.
                           </p>
                        </li>
                     </ol>
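                     <p class="p">The detach steps above follow a similar pattern and can be outlined the same way. As before, <code>RecordingClient</code>
                        is a hypothetical stand-in that only records actions, and its constructor flags are assumptions; <code>cuda_active</code>
                        stands for the combined check in step 1 (IPC flag accessible and debug API initialized).
                     </p>

```python
class RecordingClient:
    """Hypothetical debug client that logs each action instead of performing it."""

    def __init__(self, cuda_active=True, handler_available=1):
        self.cuda_active = cuda_active
        self.handler_available = handler_available
        self.log = []

    def do(self, action):
        self.log.append(action)


def detach(client):
    # Step 1: without the IPC flag or an initialized API, treat as CPU-only.
    if not client.cuda_active:
        client.do("detach from CPU process")
        return "cpu-only"
    client.do("call clearAttachState")                       # step 2
    cleanup = client.handler_available != 0                  # step 3
    if cleanup:
        client.do("call requestCleanupOnDetach")
    client.do("set CUDBG_DEBUGGER_INITIALIZED = 0")          # step 4
    if cleanup:                                              # step 5
        client.do("delete all breakpoints")
        client.do("resume application")
        client.do("wait for CUDBG_EVENT_DETACH_COMPLETE")
    client.do("set CUDBG_IPC_FLAG_NAME = 0")                 # step 6
    client.do("finalize debug API")                          # step 7
    client.do("detach from CPU process")                     # step 8
    return "detached"


client = RecordingClient(handler_available=1)
print(detach(client))
print(len(client.log))
```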
                  </div>
               </div>
            </div>
         </div>
         
         <hr id="contents-end"></hr>
         
      </article>
      
      <nav id="site-nav">
         <div class="category closed"><span class="twiddle">▷</span><a href="../index.html" title="The root of the site.">CUDA Toolkit</a></div>
         <ul class="closed">
            <li><a href="../cuda-toolkit-release-notes/index.html" title="The Release Notes for the CUDA Toolkit from v4.0 to today.">Release Notes</a></li>
            <li><a href="../eula/index.html" title="The End User License Agreements for the NVIDIA CUDA Toolkit, the NVIDIA CUDA Samples, the NVIDIA Display Driver, and NVIDIA NSight (Visual Studio Edition).">EULA</a></li>
            <li><a href="../cuda-getting-started-guide-for-linux/index.html" title="This guide discusses how to install and check for correct operation of the CUDA Development Tools on GNU/Linux systems.">Getting Started Linux</a></li>
            <li><a href="../cuda-getting-started-guide-for-mac-os-x/index.html" title="This guide discusses how to install and check for correct operation of the CUDA Development Tools on Mac OS X systems.">Getting Started Mac OS X</a></li>
            <li><a href="../cuda-getting-started-guide-for-microsoft-windows/index.html" title="This guide discusses how to install and check for correct operation of the CUDA Development Tools on Microsoft Windows systems.">Getting Started Windows</a></li>
            <li><a href="../cuda-c-programming-guide/index.html" title="This guide provides a detailed discussion of the CUDA programming model and programming interface. It then describes the hardware implementation, and provides guidance on how to achieve maximum performance. The Appendixes include a list of all CUDA-enabled devices, detailed description of all extensions to the C language, listings of supported mathematical functions, C++ features supported in host and device code, details on texture fetching, technical specifications of various devices, and concludes by introducing the low-level driver API.">Programming Guide</a></li>
            <li><a href="../cuda-c-best-practices-guide/index.html" title="This guide presents established parallelization and optimization techniques and explains coding metaphors and idioms that can greatly simplify programming for CUDA-capable GPU architectures. The intent is to provide guidelines for obtaining the best performance from NVIDIA GPUs using the CUDA Toolkit.">Best Practices Guide</a></li>
            <li><a href="../kepler-compatibility-guide/index.html" title="This application note is intended to help developers ensure that their NVIDIA CUDA applications will run effectively on GPUs based on the NVIDIA Kepler Architecture. This document provides guidance to ensure that your software applications are compatible with Kepler.">Kepler Compatibility Guide</a></li>
            <li><a href="../kepler-tuning-guide/index.html" title="Kepler is NVIDIA's next-generation architecture for CUDA compute applications. Applications that follow the best practices for the Fermi architecture should typically see speedups on the Kepler architecture without any code changes. This guide summarizes the ways that an application can be fine-tuned to gain additional speedups by leveraging Kepler architectural features.">Kepler Tuning Guide</a></li>
            <li><a href="../parallel-thread-execution/index.html" title="This guide provides detailed instructions on the use of PTX, a low-level parallel thread execution virtual machine and instruction set architecture (ISA). PTX exposes the GPU as a data-parallel computing device.">PTX ISA</a></li>
            <li><a href="../optimus-developer-guide/index.html" title="This document explains how CUDA APIs can be used to query for GPU capabilities in NVIDIA Optimus systems.">Developer Guide for Optimus</a></li>
            <li><a href="../video-decoder/index.html" title="This document provides the video decoder API specification and the format conversion and display using DirectX or OpenGL following decode.">Video Decoder</a></li>
            <li><a href="../video-encoder/index.html" title="This document provides the CUDA video encoder specifications, including the C-library API functions and encoder query parameters.">Video Encoder</a></li>
            <li><a href="../inline-ptx-assembly/index.html" title="This document shows how to inline PTX (parallel thread execution) assembly language statements into CUDA code. It describes available assembler statement parameters and constraints, and the document also provides a list of some pitfalls that you may encounter.">Inline PTX Assembly</a></li>
            <li><a href="../cuda-runtime-api/index.html" title="The CUDA runtime API.">CUDA Runtime API</a></li>
            <li><a href="../cuda-driver-api/index.html" title="The CUDA driver API.">CUDA Driver API</a></li>
            <li><a href="../cuda-math-api/index.html" title="The CUDA math API.">CUDA Math API</a></li>
            <li><a href="../cublas/index.html" title="The CUBLAS library is an implementation of BLAS (Basic Linear Algebra Subprograms) on top of the NVIDIA CUDA runtime. It allows the user to access the computational resources of NVIDIA Graphical Processing Unit (GPU), but does not auto-parallelize across multiple GPUs.">CUBLAS</a></li>
            <li><a href="../cufft/index.html" title="The CUFFT library user guide.">CUFFT</a></li>
            <li><a href="../curand/index.html" title="The CURAND library user guide.">CURAND</a></li>
            <li><a href="../cusparse/index.html" title="The CUSPARSE library user guide.">CUSPARSE</a></li>
            <li><a href="../npp/index.html" title="NVIDIA NPP is a library of functions for performing CUDA accelerated processing. The initial set of functionality in the library focuses on imaging and video processing and is widely applicable for developers in these areas. NPP will evolve over time to encompass more of the compute heavy tasks in a variety of problem domains. The NPP library is written to maximize flexibility, while maintaining high performance.">NPP</a></li>
            <li><a href="../thrust/index.html" title="The Thrust getting started guide.">Thrust</a></li>
            <li><a href="../cuda-samples/index.html" title="This document contains a complete listing of the code samples that are included with the NVIDIA CUDA Toolkit. It describes each code sample, lists the minimum GPU specification, and provides links to the source code and white papers if available.">CUDA Samples</a></li>
            <li><a href="../cuda-compiler-driver-nvcc/index.html" title="This document is a reference guide on the use of the CUDA compiler driver nvcc. Instead of being a specific CUDA compilation driver, nvcc mimics the behavior of the GNU compiler gcc, accepting a range of conventional compiler options, such as for defining macros and include/library paths, and for steering the compilation process.">NVCC</a></li>
            <li><a href="../cuda-gdb/index.html" title="The NVIDIA tool for debugging CUDA applications running on Linux and Mac, providing developers with a mechanism for debugging CUDA applications running on actual hardware. CUDA-GDB is an extension to the x86-64 port of GDB, the GNU Project debugger.">CUDA-GDB</a></li>
            <li><a href="../cuda-memcheck/index.html" title="CUDA-MEMCHECK is a suite of run time tools capable of precisely detecting out of bounds and misaligned memory access errors, checking device allocation leaks, reporting hardware errors and identifying shared memory data access hazards.">CUDA-MEMCHECK</a></li>
            <li><a href="../nsight-eclipse-edition-getting-started-guide/index.html" title="Nsight Eclipse Edition getting started guide">Nsight Eclipse Edition</a></li>
            <li><a href="../profiler-users-guide/index.html" title="This is the guide to the Profiler.">Profiler</a></li>
            <li><a href="../cuda-binary-utilities/index.html" title="The application notes for cuobjdump and nvdisasm.">CUDA Binary Utilities</a></li>
            <li><a href="../floating-point/index.html" title="A number of issues related to floating point accuracy and compliance are a frequent source of confusion on both CPUs and GPUs. The purpose of this white paper is to discuss the most common issues related to NVIDIA GPUs and to supplement the documentation in the CUDA C Programming Guide.">Floating Point and IEEE 754</a></li>
            <li><a href="../incomplete-lu-cholesky/index.html" title="In this white paper we show how to use the CUSPARSE and CUBLAS libraries to achieve a 2x speedup over CPU in the incomplete-LU and Cholesky preconditioned iterative methods. We focus on the Bi-Conjugate Gradient Stabilized and Conjugate Gradient iterative methods, which can be used to solve large sparse nonsymmetric and symmetric positive definite linear systems, respectively. Also, we comment on the parallel sparse triangular solve, which is an essential building block in these algorithms.">Incomplete-LU and Cholesky Preconditioned Iterative Methods</a></li>
            <li><a href="../libnvvm-api/index.html" title="The libNVVM API.">libNVVM API</a></li>
            <li><a href="../libdevice-users-guide/index.html" title="The libdevice library is an LLVM bitcode library that implements common functions for GPU kernels.">libdevice User's Guide</a></li>
            <li><a href="../nvvm-ir-spec/index.html" title="NVVM IR is a compiler IR (internal representation) based on the LLVM IR. The NVVM IR is designed to represent GPU compute kernels (for example, CUDA kernels). High-level language front-ends, like the CUDA C compiler front-end, can generate NVVM IR.">NVVM IR</a></li>
            <li><a href="../cupti/index.html" title="The CUPTI API.">CUPTI</a></li>
            <li><a href="../debugger-api/index.html" title="The CUDA debugger API.">Debugger API</a></li>
            <li><a href="../gpudirect-rdma/index.html" title="A tool for Kepler-class GPUs and CUDA 5.0 enabling a direct path for communication between the GPU and a peer device on the PCI Express bus when the devices share the same upstream root complex using standard features of PCI Express. This document introduces the technology and describes the steps necessary to enable an RDMA for GPUDirect connection to NVIDIA GPUs within the Linux device driver model.">RDMA for GPUDirect</a></li>
         </ul>
         <div class="category"><span class="twiddle">▼</span><a href="index.html" title="Debugger API">Debugger API</a></div>
         <ul>
            <li><a href="r_main.html#r_main">1.&nbsp;Introduction</a><ul>
                  <li><a href="r_main.html#r_api">1.1.&nbsp;Debugger API</a></li>
                  <li><a href="r_main.html#r_elf">1.2.&nbsp;ELF and DWARF</a></li>
                  <li><a href="r_main.html#r_abi31">1.3.&nbsp;ABI Support</a></li>
                  <li><a href="r_main.html#r_exceptions31">1.4.&nbsp;Exception Reporting</a></li>
                  <li><a href="r_main.html#r_attach">1.5.&nbsp;Attaching and Detaching</a></li>
               </ul>
            </li>
            <li><a href="modules.html#modules">2.&nbsp;Modules</a><ul>
                  <li><a href="group__GENERAL.html#group__GENERAL">2.1.&nbsp;General</a></li>
                  <li><a href="group__INIT.html#group__INIT">2.2.&nbsp;Initialization</a></li>
                  <li><a href="group__EXEC.html#group__EXEC">2.3.&nbsp;Device Execution Control</a></li>
                  <li><a href="group__BP.html#group__BP">2.4.&nbsp;Breakpoints</a></li>
                  <li><a href="group__READ.html#group__READ">2.5.&nbsp;Device State Inspection</a></li>
                  <li><a href="group__WRITE.html#group__WRITE">2.6.&nbsp;Device State Alteration</a></li>
                  <li><a href="group__GRID.html#group__GRID">2.7.&nbsp;Grid Properties</a></li>
                  <li><a href="group__DEV.html#group__DEV">2.8.&nbsp;Device Properties</a></li>
                  <li><a href="group__DWARF.html#group__DWARF">2.9.&nbsp;DWARF Utilities</a></li>
                  <li><a href="group__EVENT.html#group__EVENT">2.10.&nbsp;Events</a></li>
               </ul>
            </li>
            <li><a href="annotated.html#annotated">3.&nbsp;Data Structures</a><ul>
                  <li><a href="structCUDBGAPI__st.html#structCUDBGAPI__st">3.1.&nbsp;CUDBGAPI_st</a></li>
                  <li><a href="structCUDBGEvent.html#structCUDBGEvent">3.2.&nbsp;CUDBGEvent</a></li>
                  <li><a href="unionCUDBGEvent_1_1cases__st.html#unionCUDBGEvent_1_1cases__st">3.3.&nbsp;CUDBGEvent::cases_st</a></li>
                  <li><a href="structCUDBGEvent_1_1cases__st_1_1contextCreate__st.html#structCUDBGEvent_1_1cases__st_1_1contextCreate__st">3.4.&nbsp;CUDBGEvent::cases_st::contextCreate_st</a></li>
                  <li><a href="structCUDBGEvent_1_1cases__st_1_1contextDestroy__st.html#structCUDBGEvent_1_1cases__st_1_1contextDestroy__st">3.5.&nbsp;CUDBGEvent::cases_st::contextDestroy_st</a></li>
                  <li><a href="structCUDBGEvent_1_1cases__st_1_1contextPop__st.html#structCUDBGEvent_1_1cases__st_1_1contextPop__st">3.6.&nbsp;CUDBGEvent::cases_st::contextPop_st</a></li>
                  <li><a href="structCUDBGEvent_1_1cases__st_1_1contextPush__st.html#structCUDBGEvent_1_1cases__st_1_1contextPush__st">3.7.&nbsp;CUDBGEvent::cases_st::contextPush_st</a></li>
                  <li><a href="structCUDBGEvent_1_1cases__st_1_1elfImageLoaded__st.html#structCUDBGEvent_1_1cases__st_1_1elfImageLoaded__st">3.8.&nbsp;CUDBGEvent::cases_st::elfImageLoaded_st</a></li>
                  <li><a href="structCUDBGEvent_1_1cases__st_1_1internalError__st.html#structCUDBGEvent_1_1cases__st_1_1internalError__st">3.9.&nbsp;CUDBGEvent::cases_st::internalError_st</a></li>
                  <li><a href="structCUDBGEvent_1_1cases__st_1_1kernelFinished__st.html#structCUDBGEvent_1_1cases__st_1_1kernelFinished__st">3.10.&nbsp;CUDBGEvent::cases_st::kernelFinished_st</a></li>
                  <li><a href="structCUDBGEvent_1_1cases__st_1_1kernelReady__st.html#structCUDBGEvent_1_1cases__st_1_1kernelReady__st">3.11.&nbsp;CUDBGEvent::cases_st::kernelReady_st</a></li>
                  <li><a href="structCUDBGEventCallbackData.html#structCUDBGEventCallbackData">3.12.&nbsp;CUDBGEventCallbackData</a></li>
                  <li><a href="structCUDBGEventCallbackData40.html#structCUDBGEventCallbackData40">3.13.&nbsp;CUDBGEventCallbackData40</a></li>
                  <li><a href="structCUDBGGridInfo.html#structCUDBGGridInfo">3.14.&nbsp;CUDBGGridInfo</a></li>
               </ul>
            </li>
            <li><a href="functions.html#functions">4.&nbsp;Data Fields</a></li>
            <li><a href="files.html#files">5.&nbsp;File List</a><ul>
                  <li><a href="cudadebugger_8h.html#cudadebugger_8h">5.1.&nbsp;cudadebugger.h</a></li>
               </ul>
            </li>
            <li><a href="globals.html#globals">6.&nbsp;Globals</a><ul>
                  <li><a href="globals_func.html#globals_func">6.1.&nbsp;Globals - Functions</a></li>
                  <li><a href="globals_type.html#globals_type">6.2.&nbsp;Globals - Typedefs</a></li>
                  <li><a href="globals_enum.html#globals_enum">6.3.&nbsp;Globals - Enumerations</a></li>
                  <li><a href="globals_eval.html#globals_eval">6.4.&nbsp;Globals - Enumerator</a></li>
               </ul>
            </li>
            <li><a href="deprecated.html#deprecated">7.&nbsp;Deprecated List</a></li>
            <li><a href="notices-header.html#notices-header">Notices</a></li>
         </ul>
      </nav>
      <script language="JavaScript" type="text/javascript" charset="utf-8" src="../common/formatting/common.min.js"></script>
      </body>
</html>