.TH "Occupancy" 3 "24 Apr 2019" "Version 6.0" "Doxygen" \" -*- nroff -*-
.ad l
.nh
.SH NAME
Occupancy \- occupancy calculation functions of the CUDA runtime API (cuda_runtime_api.h)
.SS "Functions"
.in +1c
.ti -1c
.RI "__cudart_builtin__ \fBcudaError_t\fP \fBcudaOccupancyMaxActiveBlocksPerMultiprocessor\fP (int *numBlocks, const void *func, int blockSize, size_t dynamicSMemSize)"
.br
.RI "\fIReturns occupancy for a device function. \fP"
.ti -1c
.RI "__cudart_builtin__ \fBcudaError_t\fP \fBcudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags\fP (int *numBlocks, const void *func, int blockSize, size_t dynamicSMemSize, unsigned int flags)"
.br
.RI "\fIReturns occupancy for a device function with the specified flags. \fP"
.in -1c
.SH "Detailed Description"
.PP
This section describes the occupancy calculation functions of the CUDA runtime application programming interface.
.PP
Besides the occupancy calculator functions (\fBcudaOccupancyMaxActiveBlocksPerMultiprocessor\fP and \fBcudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags\fP), there are also C++\-only occupancy\-based launch configuration functions, documented in the \fBC++ API Routines\fP module.
.PP
See \fBcudaOccupancyMaxPotentialBlockSize (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeWithFlags (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeVariableSMem (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeVariableSMemWithFlags (C++ API)\fP
.SH "Function Documentation"
.PP
.SS "__cudart_builtin__ \fBcudaError_t\fP cudaOccupancyMaxActiveBlocksPerMultiprocessor (int * numBlocks, const void * func, int blockSize, size_t dynamicSMemSize)"
.PP
Returns in \fC*numBlocks\fP the maximum number of active blocks per streaming multiprocessor for the device function.
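.PP
The following is a minimal sketch, not taken from the original reference, of how the returned block count might be turned into an occupancy estimate. The kernel \fCscaleKernel\fP, the helper \fCoccupancyFraction\fP and the block size of 256 are hypothetical names and values chosen for illustration. The sketch assumes that a CUDA\-capable device is present and that the file is compiled with nvcc.
.PP
.nf
#include <cuda_runtime.h>
#include <iostream>

// Hypothetical kernel used only for illustration.
__global__ void scaleKernel(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

// Fraction of an SM's thread slots covered by numBlocks resident blocks.
double occupancyFraction(int numBlocks, int blockSize, int maxThreadsPerSM)
{
    return static_cast<double>(numBlocks) * blockSize / maxThreadsPerSM;
}

int main()
{
    const int blockSize = 256;          // assumed launch block size
    const size_t dynamicSMemSize = 0;   // no dynamic shared memory
    int numBlocks = 0;

    // Ask the runtime how many blocks of this kernel fit on one SM.
    cudaError_t err = cudaOccupancyMaxActiveBlocksPerMultiprocessor(
        &numBlocks, (const void *)scaleKernel, blockSize, dynamicSMemSize);
    if (err != cudaSuccess) {
        std::cerr << "occupancy query failed: "
                  << cudaGetErrorString(err) << std::endl;
        return 1;
    }

    // Look up the per-SM thread limit of the current device.
    int device = 0;
    cudaDeviceProp prop;
    err = cudaGetDevice(&device);
    if (err == cudaSuccess) {
        err = cudaGetDeviceProperties(&prop, device);
    }
    if (err != cudaSuccess) {
        std::cerr << "device query failed: "
                  << cudaGetErrorString(err) << std::endl;
        return 1;
    }

    std::cout << "active blocks per SM: " << numBlocks << std::endl;
    std::cout << "estimated occupancy: "
              << occupancyFraction(numBlocks, blockSize,
                                   prop.maxThreadsPerMultiProcessor)
              << std::endl;
    return 0;
}
.fi
.PP
The ratio printed at the end is only an estimate derived from thread slots; the value actually returned by the function is the block count in \fC*numBlocks\fP.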
.PP
\fBParameters:\fP
.RS 4
\fInumBlocks\fP \- Returned occupancy
.br
\fIfunc\fP \- Kernel function for which occupancy is calculated
.br
\fIblockSize\fP \- Block size the kernel is intended to be launched with
.br
\fIdynamicSMemSize\fP \- Per\-block dynamic shared memory usage intended, in bytes
.RE
.PP
\fBReturns:\fP
.RS 4
\fBcudaSuccess\fP, \fBcudaErrorInvalidDevice\fP, \fBcudaErrorInvalidDeviceFunction\fP, \fBcudaErrorInvalidValue\fP, \fBcudaErrorUnknown\fP
.RE
.PP
\fBNote:\fP
.RS 4
This function may also return error codes from previous, asynchronous launches.
.RE
.PP
\fBSee also:\fP
.RS 4
\fBcudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags\fP, \fBcudaOccupancyMaxPotentialBlockSize (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeWithFlags (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeVariableSMem (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeVariableSMemWithFlags (C++ API)\fP, cuOccupancyMaxActiveBlocksPerMultiprocessor
.RE
.PP
.SS "__cudart_builtin__ \fBcudaError_t\fP cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags (int * numBlocks, const void * func, int blockSize, size_t dynamicSMemSize, unsigned int flags)"
.PP
Returns in \fC*numBlocks\fP the maximum number of active blocks per streaming multiprocessor for the device function.
.PP
The \fCflags\fP parameter controls how special cases are handled. Valid flags include:
.PP
.IP "\(bu" 2
\fBcudaOccupancyDefault\fP: keeps the default behavior of \fBcudaOccupancyMaxActiveBlocksPerMultiprocessor\fP
.PP
.PP
.IP "\(bu" 2
\fBcudaOccupancyDisableCachingOverride\fP: suppresses the default behavior on platforms where global caching affects occupancy. On such platforms, if caching is enabled but per\-block SM resource usage would result in zero occupancy, the occupancy calculator by default calculates the occupancy as if caching were disabled. Setting this flag makes the occupancy calculator return 0 in such cases.
More information about this feature can be found in the 'Unified L1/Texture Cache' section of the Maxwell tuning guide.
.PP
.PP
\fBParameters:\fP
.RS 4
\fInumBlocks\fP \- Returned occupancy
.br
\fIfunc\fP \- Kernel function for which occupancy is calculated
.br
\fIblockSize\fP \- Block size the kernel is intended to be launched with
.br
\fIdynamicSMemSize\fP \- Per\-block dynamic shared memory usage intended, in bytes
.br
\fIflags\fP \- Requested behavior for the occupancy calculator
.RE
.PP
\fBReturns:\fP
.RS 4
\fBcudaSuccess\fP, \fBcudaErrorInvalidDevice\fP, \fBcudaErrorInvalidDeviceFunction\fP, \fBcudaErrorInvalidValue\fP, \fBcudaErrorUnknown\fP
.RE
.PP
\fBNote:\fP
.RS 4
This function may also return error codes from previous, asynchronous launches.
.RE
.PP
\fBSee also:\fP
.RS 4
\fBcudaOccupancyMaxActiveBlocksPerMultiprocessor\fP, \fBcudaOccupancyMaxPotentialBlockSize (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeWithFlags (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeVariableSMem (C++ API)\fP, \fBcudaOccupancyMaxPotentialBlockSizeVariableSMemWithFlags (C++ API)\fP, cuOccupancyMaxActiveBlocksPerMultiprocessorWithFlags
.RE
.PP
.SH "Author"
.PP
Generated automatically by Doxygen from the source code.
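.SH "Example"
.PP
As an illustration of the \fCflags\fP parameter of \fBcudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags\fP, the sketch below queries the same kernel once with \fBcudaOccupancyDefault\fP and once with \fBcudaOccupancyDisableCachingOverride\fP and compares the results. It is not part of the generated reference: \fCcopyKernel\fP, \fCqueryOccupancy\fP, \fCcachingOverrideApplied\fP and the block size of 128 are hypothetical, and the two results are expected to differ only on platforms where global caching affects occupancy.
.PP
.nf
#include <cuda_runtime.h>
#include <iostream>

// Hypothetical kernel used only for illustration.
__global__ void copyKernel(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = in[i];
    }
}

// True when the default query reports a nonzero occupancy while the
// query with the caching override disabled reports zero.
bool cachingOverrideApplied(int defaultBlocks, int strictBlocks)
{
    return defaultBlocks > 0 && strictBlocks == 0;
}

// Runs one occupancy query; reports the error and returns false on failure.
bool queryOccupancy(unsigned int flags, int blockSize,
                    size_t dynamicSMemSize, int *numBlocks)
{
    cudaError_t err = cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags(
        numBlocks, (const void *)copyKernel, blockSize, dynamicSMemSize, flags);
    if (err != cudaSuccess) {
        std::cerr << "occupancy query failed: "
                  << cudaGetErrorString(err) << std::endl;
        return false;
    }
    return true;
}

int main()
{
    const int blockSize = 128;          // assumed launch block size
    const size_t dynamicSMemSize = 0;   // no dynamic shared memory
    int defaultBlocks = 0;
    int strictBlocks = 0;

    if (!queryOccupancy(cudaOccupancyDefault, blockSize,
                        dynamicSMemSize, &defaultBlocks) ||
        !queryOccupancy(cudaOccupancyDisableCachingOverride, blockSize,
                        dynamicSMemSize, &strictBlocks)) {
        return 1;
    }

    std::cout << "default flags: " << defaultBlocks
              << " blocks per SM" << std::endl;
    std::cout << "caching override disabled: " << strictBlocks
              << " blocks per SM" << std::endl;
    if (cachingOverrideApplied(defaultBlocks, strictBlocks)) {
        std::cout << "default result assumed caching was disabled" << std::endl;
    }
    return 0;
}
.fi
.PP
On devices where caching does not affect occupancy, both queries should report the same block count and the final message is not printed.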