.TH "Occupancy" 3 "24 Apr 2019" "Version 6.0" "Doxygen" \" -*- nroff -*-
.ad l
.nh
.SH NAME
Occupancy \- occupancy calculation functions of the CUDA runtime API
.SS "Functions"

.in +1c
.ti -1c
.RI "__cudart_builtin__ \fBcudaError_t\fP \fBcudaOccupancyMaxActiveBlocksPerMultiprocessor\fP (int *numBlocks, const void *func, int blockSize, size_t dynamicSMemSize)"
.br
.RI "\fIReturns occupancy for a device function. \fP"
.ti -1c
.RI "__cudart_builtin__ \fBcudaError_t\fP \fBcudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags\fP (int *numBlocks, const void *func, int blockSize, size_t dynamicSMemSize, unsigned int flags)"
.br
.RI "\fIReturns occupancy for a device function with the specified flags. \fP"
.in -1c
.SH "Detailed Description"
.PP 
This section describes the occupancy calculation functions of the CUDA runtime application programming interface (cuda_runtime_api.h).
.PP
Besides the occupancy calculator functions (\fBcudaOccupancyMaxActiveBlocksPerMultiprocessor\fP and \fBcudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags\fP), there are also C++-only, occupancy-based launch configuration functions, documented in the \fBC++ API Routines\fP module.
.PP
See \fBcudaOccupancyMaxPotentialBlockSize (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeWithFlags (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeVariableSMem (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeVariableSMemWithFlags (C++ API)\fP.
.SH "Function Documentation"
.PP 
.SS "__cudart_builtin__ \fBcudaError_t\fP cudaOccupancyMaxActiveBlocksPerMultiprocessor (int * numBlocks, const void * func, int blockSize, size_t dynamicSMemSize)"
.PP
Returns in \fC*numBlocks\fP the maximum number of active blocks per streaming multiprocessor for the device function.
.PP
\fBParameters:\fP
.RS 4
\fInumBlocks\fP - Returned occupancy 
.br
\fIfunc\fP - Kernel function for which occupancy is calculated 
.br
\fIblockSize\fP - Block size the kernel is intended to be launched with 
.br
\fIdynamicSMemSize\fP - Per-block dynamic shared memory usage intended, in bytes
.RE
.PP
\fBReturns:\fP
.RS 4
\fBcudaSuccess\fP, \fBcudaErrorInvalidDevice\fP, \fBcudaErrorInvalidDeviceFunction\fP, \fBcudaErrorInvalidValue\fP, \fBcudaErrorUnknown\fP
.RE
.PP
\fBNote:\fP
.RS 4
This function may also return error codes from previous, asynchronous launches.
.RE
.PP
\fBSee also:\fP
.RS 4
\fBcudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags\fP, \fBcudaOccupancyMaxPotentialBlockSize (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeWithFlags (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeVariableSMem (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeVariableSMemWithFlags (C++ API)\fP, cuOccupancyMaxActiveBlocksPerMultiprocessor 
.RE
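.PP
\fBExample:\fP
.RS 4
The following sketch is illustrative and not part of the generated reference. It queries the active block count for a hypothetical kernel, \fCscaleKernel\fP, on the current device and converts the result into a theoretical occupancy fraction. The kernel and the helper names \fCoccupancyFraction\fP and \fCreportOccupancy\fP are assumptions made for this example, and error handling is reduced to returning the first failing status.
.RE
.PP
.nf
```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel used only to have something to query.
__global__ void scaleKernel(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

// Fraction of an SM's thread slots occupied when numBlocks blocks of
// blockSize threads are resident at once.
static double occupancyFraction(int numBlocks, int blockSize, int maxThreadsPerSM)
{
    return static_cast<double>(numBlocks) * blockSize / maxThreadsPerSM;
}

// Queries occupancy for scaleKernel at the given block size on the
// current device and prints the result.
static cudaError_t reportOccupancy(int blockSize, size_t dynamicSMemSize)
{
    int numBlocks = 0;
    cudaError_t err = cudaOccupancyMaxActiveBlocksPerMultiprocessor(
        &numBlocks, (const void *)scaleKernel, blockSize, dynamicSMemSize);
    if (err != cudaSuccess)
        return err;

    int device = 0;
    err = cudaGetDevice(&device);
    if (err != cudaSuccess)
        return err;

    cudaDeviceProp prop;
    err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess)
        return err;

    std::printf("%d active blocks per SM, theoretical occupancy %.2f",
                numBlocks,
                occupancyFraction(numBlocks, blockSize,
                                  prop.maxThreadsPerMultiProcessor));
    std::puts("");  // puts supplies the trailing newline
    return cudaSuccess;
}
```
.fi
.PP
Calling \fCreportOccupancy(256, 0)\fP would report the figure for 256-thread blocks with no dynamic shared memory. A fraction of 1.00 means the resident blocks fill every thread slot the multiprocessor exposes; lower values suggest that registers, shared memory, or the block size limit how many blocks can be resident.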
.PP

.SS "__cudart_builtin__ \fBcudaError_t\fP cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags (int * numBlocks, const void * func, int blockSize, size_t dynamicSMemSize, unsigned int flags)"
.PP
Returns in \fC*numBlocks\fP the maximum number of active blocks per streaming multiprocessor for the device function.
.PP
The \fCflags\fP parameter controls how special cases are handled. Valid flags include:
.PP
.IP "\(bu" 2
\fBcudaOccupancyDefault\fP: keeps the same default behavior as \fBcudaOccupancyMaxActiveBlocksPerMultiprocessor\fP.
.PP
.PP
.IP "\(bu" 2
\fBcudaOccupancyDisableCachingOverride\fP: suppresses the default behavior on platforms where global caching affects occupancy. On such platforms, if caching is enabled but per-block SM resource usage would result in zero occupancy, the occupancy calculator by default calculates the occupancy as if caching were disabled. Setting this flag makes the occupancy calculator return 0 in such cases. More information about this feature can be found in the 'Unified L1/Texture Cache' section of the Maxwell tuning guide.
.PP
.PP
\fBParameters:\fP
.RS 4
\fInumBlocks\fP - Returned occupancy 
.br
\fIfunc\fP - Kernel function for which occupancy is calculated 
.br
\fIblockSize\fP - Block size the kernel is intended to be launched with 
.br
\fIdynamicSMemSize\fP - Per-block dynamic shared memory usage intended, in bytes 
.br
\fIflags\fP - Requested behavior for the occupancy calculator
.RE
.PP
\fBReturns:\fP
.RS 4
\fBcudaSuccess\fP, \fBcudaErrorInvalidDevice\fP, \fBcudaErrorInvalidDeviceFunction\fP, \fBcudaErrorInvalidValue\fP, \fBcudaErrorUnknown\fP
.RE
.PP
\fBNote:\fP
.RS 4
This function may also return error codes from previous, asynchronous launches.
.RE
.PP
\fBSee also:\fP
.RS 4
\fBcudaOccupancyMaxActiveBlocksPerMultiprocessor\fP, \fBcudaOccupancyMaxPotentialBlockSize (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeWithFlags (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeVariableSMem (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeVariableSMemWithFlags (C++ API)\fP, cuOccupancyMaxActiveBlocksPerMultiprocessorWithFlags 
.RE
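.PP
\fBExample:\fP
.RS 4
The following sketch is illustrative and not part of the generated reference. It runs the query once with \fBcudaOccupancyDefault\fP and once with \fBcudaOccupancyDisableCachingOverride\fP, then reports whether the default result was nonzero only because the calculator treated caching as disabled. The kernel \fCcopyKernel\fP and the helper names \fCcachingOverrideApplied\fP and \fCcheckCachingOverride\fP are assumptions made for this example.
.RE
.PP
.nf
```cuda
#include <cuda_runtime.h>

// Hypothetical kernel used only as a query target.
__global__ void copyKernel(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = in[i];
}

// True when the default query reported a nonzero block count while the
// query with the override disabled reported zero.
static bool cachingOverrideApplied(int blocksDefault, int blocksNoOverride)
{
    return blocksDefault > 0 && blocksNoOverride == 0;
}

// Runs the query with both flag values and stores in *overrideApplied
// whether the results differ in the way described above.
static cudaError_t checkCachingOverride(int blockSize, size_t dynamicSMemSize,
                                        bool *overrideApplied)
{
    int blocksDefault = 0;
    int blocksNoOverride = 0;

    cudaError_t err = cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags(
        &blocksDefault, (const void *)copyKernel, blockSize, dynamicSMemSize,
        cudaOccupancyDefault);
    if (err != cudaSuccess)
        return err;

    err = cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags(
        &blocksNoOverride, (const void *)copyKernel, blockSize, dynamicSMemSize,
        cudaOccupancyDisableCachingOverride);
    if (err != cudaSuccess)
        return err;

    *overrideApplied = cachingOverrideApplied(blocksDefault, blocksNoOverride);
    return cudaSuccess;
}
```
.fi
.PP
On platforms where global caching does not affect occupancy, both queries are expected to return the same count, so \fC*overrideApplied\fP would stay false.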
.PP

.SH "Author"
.PP 
Generated automatically by Doxygen from the source code.