.TH "Stream memory operations" 3 "24 Apr 2019" "Version 6.0" "Doxygen" \" -*- nroff -*-
.ad l
.nh
.SH NAME
Stream memory operations \- stream memory operations of the low-level CUDA driver API (cuda.h)
.SS "Functions"

.in +1c
.ti -1c
.RI "\fBCUresult\fP \fBcuStreamBatchMemOp\fP (\fBCUstream\fP stream, unsigned int count, \fBCUstreamBatchMemOpParams\fP *paramArray, unsigned int flags)"
.br
.RI "\fIBatch operations to synchronize the stream via memory operations. \fP"
.ti -1c
.RI "\fBCUresult\fP \fBcuStreamWaitValue32\fP (\fBCUstream\fP stream, \fBCUdeviceptr\fP addr, cuuint32_t value, unsigned int flags)"
.br
.RI "\fIWait on a memory location. \fP"
.ti -1c
.RI "\fBCUresult\fP \fBcuStreamWaitValue64\fP (\fBCUstream\fP stream, \fBCUdeviceptr\fP addr, cuuint64_t value, unsigned int flags)"
.br
.RI "\fIWait on a memory location. \fP"
.ti -1c
.RI "\fBCUresult\fP \fBcuStreamWriteValue32\fP (\fBCUstream\fP stream, \fBCUdeviceptr\fP addr, cuuint32_t value, unsigned int flags)"
.br
.RI "\fIWrite a value to memory. \fP"
.ti -1c
.RI "\fBCUresult\fP \fBcuStreamWriteValue64\fP (\fBCUstream\fP stream, \fBCUdeviceptr\fP addr, cuuint64_t value, unsigned int flags)"
.br
.RI "\fIWrite a value to memory. \fP"
.in -1c
.SH "Detailed Description"
.PP 
Stream memory operations of the low-level CUDA driver API (\fBcuda.h\fP).
.PP
This section describes the stream memory operations of the low-level CUDA driver application programming interface.
.PP
The whole set of operations is disabled by default. Users are required to explicitly enable them, e.g. on Linux by passing the kernel module parameter shown below:
.PP
.nf
modprobe nvidia NVreg_EnableStreamMemOPs=1
.fi
.PP
There is currently no way to enable these operations on other operating systems.
.PP
Users can programmatically query whether the device supports these operations with \fBcuDeviceGetAttribute()\fP and \fBCU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_MEM_OPS\fP.
.PP
Support for the \fBCU_STREAM_WAIT_VALUE_NOR\fP flag can be queried with \fBCU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_WAIT_VALUE_NOR\fP.
.PP
Support for the \fBcuStreamWriteValue64()\fP and \fBcuStreamWaitValue64()\fP functions, as well as for the \fBCU_STREAM_MEM_OP_WAIT_VALUE_64\fP and \fBCU_STREAM_MEM_OP_WRITE_VALUE_64\fP flags, can be queried with \fBCU_DEVICE_ATTRIBUTE_CAN_USE_64_BIT_STREAM_MEM_OPS\fP.
.PP
Support for both \fBCU_STREAM_WAIT_VALUE_FLUSH\fP and \fBCU_STREAM_MEM_OP_FLUSH_REMOTE_WRITES\fP requires dedicated platform hardware features and can be queried with \fBcuDeviceGetAttribute()\fP and \fBCU_DEVICE_ATTRIBUTE_CAN_FLUSH_REMOTE_WRITES\fP.
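.PP
As a rough sketch of how these queries combine, the plain C helper below decides whether a planned operation may be issued, given attribute values that were already read with \fBcuDeviceGetAttribute()\fP. The names mem_op_caps, mem_op_request and mem_op_allowed are hypothetical and not part of the driver API. Treating \fBCU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_MEM_OPS\fP as a prerequisite for every other feature is an assumption of this sketch.
.PP
.nf
```c
#include <stdbool.h>

/* Hypothetical container for the values reported by cuDeviceGetAttribute()
 * for the four attributes named above (nonzero means supported). */
struct mem_op_caps {
    int stream_mem_ops;      /* CU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_MEM_OPS */
    int wait_value_nor;      /* CU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_WAIT_VALUE_NOR */
    int mem_ops_64bit;       /* CU_DEVICE_ATTRIBUTE_CAN_USE_64_BIT_STREAM_MEM_OPS */
    int flush_remote_writes; /* CU_DEVICE_ATTRIBUTE_CAN_FLUSH_REMOTE_WRITES */
};

/* Hypothetical description of what one planned operation needs. */
struct mem_op_request {
    bool is_64bit;   /* a 64-bit wait or write */
    bool uses_nor;   /* wait with CU_STREAM_WAIT_VALUE_NOR */
    bool uses_flush; /* CU_STREAM_WAIT_VALUE_FLUSH or a flush batch op */
};

/* Returns true when every capability the request relies on is reported.
 * Assumption: basic stream memory op support gates all other features. */
bool mem_op_allowed(const struct mem_op_caps *caps,
                    const struct mem_op_request *req)
{
    if (!caps->stream_mem_ops)
        return false;
    if (req->is_64bit && !caps->mem_ops_64bit)
        return false;
    if (req->uses_nor && !caps->wait_value_nor)
        return false;
    if (req->uses_flush && !caps->flush_remote_writes)
        return false;
    return true;
}
```
.fi
.PP
A caller would fill mem_op_caps once per device and consult mem_op_allowed before enqueueing, falling back to another synchronization mechanism when it returns false.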
.PP
Note that all memory pointers passed as parameters to these operations are device pointers. Where necessary, a device pointer should be obtained, for example with \fBcuMemHostGetDevicePointer()\fP.
.PP
None of the operations accepts pointers to managed memory buffers (\fBcuMemAllocManaged\fP). 
.SH "Function Documentation"
.PP 
.SS "\fBCUresult\fP cuStreamBatchMemOp (\fBCUstream\fP stream, unsigned int count, \fBCUstreamBatchMemOpParams\fP * paramArray, unsigned int flags)"
.PP
This is a batch version of \fBcuStreamWaitValue32()\fP and \fBcuStreamWriteValue32()\fP. Batching operations may avoid some performance overhead in both the API call and the device execution versus adding them to the stream in separate API calls. The operations are enqueued in the order they appear in the array.
.PP
See \fBCUstreamBatchMemOpType\fP for the full set of supported operations, and \fBcuStreamWaitValue32()\fP, \fBcuStreamWaitValue64()\fP, \fBcuStreamWriteValue32()\fP, and \fBcuStreamWriteValue64()\fP for details of specific operations.
.PP
Basic support for this can be queried with \fBcuDeviceGetAttribute()\fP and \fBCU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_MEM_OPS\fP. See related APIs for details on querying support for specific operations.
.PP
\fBParameters:\fP
.RS 4
\fIstream\fP The stream to enqueue the operations in. 
.br
\fIcount\fP The number of operations in the array. Must be less than 256. 
.br
\fIparamArray\fP The types and parameters of the individual operations. 
.br
\fIflags\fP Reserved for future expansion; must be 0.
.RE
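.PP
Because \fIcount\fP must be less than 256, a longer list of operations has to be split across several calls. The helper below is a minimal sketch of that arithmetic in plain C. The names MAX_OPS_PER_BATCH, batch_calls_needed and batch_count_for_call are hypothetical and not part of the driver API.
.PP
.nf
```c
#include <assert.h>
#include <stddef.h>

/* Largest count accepted per call, since count must be less than 256. */
#define MAX_OPS_PER_BATCH 255u

/* Number of cuStreamBatchMemOp() calls needed for total_ops operations. */
size_t batch_calls_needed(size_t total_ops)
{
    return total_ops / MAX_OPS_PER_BATCH
         + (total_ops % MAX_OPS_PER_BATCH != 0);
}

/* Value to pass as count in the call with 0-based index call_index.
 * Every call is full except possibly the last one. */
unsigned int batch_count_for_call(size_t total_ops, size_t call_index)
{
    size_t remaining;

    assert(call_index < batch_calls_needed(total_ops));
    remaining = total_ops - call_index * MAX_OPS_PER_BATCH;
    return (unsigned int)(remaining < MAX_OPS_PER_BATCH
                              ? remaining
                              : MAX_OPS_PER_BATCH);
}
```
.fi
.PP
For call index i, the caller would pass a pointer to element i * 255 of its parameter array as \fIparamArray\fP. Splitting this way gives up part of the overhead saving that batching is meant to provide, so it is best kept for lists that genuinely exceed the limit.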
.PP
\fBReturns:\fP
.RS 4
\fBCUDA_SUCCESS\fP, \fBCUDA_ERROR_INVALID_VALUE\fP, \fBCUDA_ERROR_NOT_SUPPORTED\fP 
.RE
.PP
\fBNote:\fP
.RS 4
Note that this function may also return error codes from previous, asynchronous launches.
.RE
.PP
\fBSee also:\fP
.RS 4
\fBcuStreamWaitValue32\fP, \fBcuStreamWaitValue64\fP, \fBcuStreamWriteValue32\fP, \fBcuStreamWriteValue64\fP, \fBcuMemHostRegister\fP 
.RE
.PP

.SS "\fBCUresult\fP cuStreamWaitValue32 (\fBCUstream\fP stream, \fBCUdeviceptr\fP addr, cuuint32_t value, unsigned int flags)"
.PP
Enqueues a synchronization of the stream on the given memory location. Work ordered after the operation will block until the given condition on the memory is satisfied. By default, the condition is to wait for (int32_t)(*addr - value) >= 0, a cyclic greater-or-equal. Other condition types can be specified via \fCflags\fP.
.PP
If the memory was registered via \fBcuMemHostRegister()\fP, the device pointer should be obtained with \fBcuMemHostGetDevicePointer()\fP. This function cannot be used with managed memory (\fBcuMemAllocManaged\fP).
.PP
Support for this can be queried with \fBcuDeviceGetAttribute()\fP and \fBCU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_MEM_OPS\fP.
.PP
Support for CU_STREAM_WAIT_VALUE_NOR can be queried with \fBcuDeviceGetAttribute()\fP and \fBCU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_WAIT_VALUE_NOR\fP.
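.PP
The wait conditions can be modelled on the host to see how the cyclic comparison behaves around wraparound. The sketch below is plain C and does not call the driver. The enumerator and function names are hypothetical, and the EQ, AND and NOR predicates follow the descriptions of \fBCUstreamWaitValue_flags\fP in \fBcuda.h\fP, which are not reproduced on this page.
.PP
.nf
```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical local names for the four condition types; these are not the
 * driver's own symbols. */
enum wait_cond { WAIT_GEQ, WAIT_EQ, WAIT_AND, WAIT_NOR };

/* Returns true when a wait on a 32-bit word currently holding mem would be
 * satisfied for the comparison value given. */
bool wait_satisfied32(enum wait_cond cond, uint32_t mem, uint32_t value)
{
    switch (cond) {
    case WAIT_GEQ:
        /* Cyclic greater-or-equal: subtract modulo 2^32, then test the
         * sign bit, which matches (int32_t)(mem - value) >= 0. */
        return ((uint32_t)(mem - value) & UINT32_C(0x80000000)) == 0;
    case WAIT_EQ:
        return mem == value;
    case WAIT_AND:
        return (mem & value) != 0;
    case WAIT_NOR:
        return (uint32_t)~(mem | value) != 0;
    }
    return false;
}
```
.fi
.PP
With the default condition, a word holding 5 satisfies a wait for 0xFFFFFFF0 because the difference modulo 2^32 is 21, while a word holding 0xFFFFFFF0 does not satisfy a wait for 5. This is what lets a wrapping counter serve as a sequence number, as long as the waiter never falls 2^31 or more steps behind.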
.PP
\fBParameters:\fP
.RS 4
\fIstream\fP The stream to synchronize on the memory location. 
.br
\fIaddr\fP The memory location to wait on. 
.br
\fIvalue\fP The value to compare with the memory location. 
.br
\fIflags\fP See \fBCUstreamWaitValue_flags\fP.
.RE
.PP
\fBReturns:\fP
.RS 4
\fBCUDA_SUCCESS\fP, \fBCUDA_ERROR_INVALID_VALUE\fP, \fBCUDA_ERROR_NOT_SUPPORTED\fP 
.RE
.PP
\fBNote:\fP
.RS 4
Note that this function may also return error codes from previous, asynchronous launches.
.RE
.PP
\fBSee also:\fP
.RS 4
\fBcuStreamWaitValue64\fP, \fBcuStreamWriteValue32\fP, \fBcuStreamWriteValue64\fP, \fBcuStreamBatchMemOp\fP, \fBcuMemHostRegister\fP, \fBcuStreamWaitEvent\fP 
.RE
.PP

.SS "\fBCUresult\fP cuStreamWaitValue64 (\fBCUstream\fP stream, \fBCUdeviceptr\fP addr, cuuint64_t value, unsigned int flags)"
.PP
Enqueues a synchronization of the stream on the given memory location. Work ordered after the operation will block until the given condition on the memory is satisfied. By default, the condition is to wait for (int64_t)(*addr - value) >= 0, a cyclic greater-or-equal. Other condition types can be specified via \fCflags\fP.
.PP
If the memory was registered via \fBcuMemHostRegister()\fP, the device pointer should be obtained with \fBcuMemHostGetDevicePointer()\fP.
.PP
Support for this can be queried with \fBcuDeviceGetAttribute()\fP and \fBCU_DEVICE_ATTRIBUTE_CAN_USE_64_BIT_STREAM_MEM_OPS\fP.
.PP
\fBParameters:\fP
.RS 4
\fIstream\fP The stream to synchronize on the memory location. 
.br
\fIaddr\fP The memory location to wait on. 
.br
\fIvalue\fP The value to compare with the memory location. 
.br
\fIflags\fP See \fBCUstreamWaitValue_flags\fP.
.RE
.PP
\fBReturns:\fP
.RS 4
\fBCUDA_SUCCESS\fP, \fBCUDA_ERROR_INVALID_VALUE\fP, \fBCUDA_ERROR_NOT_SUPPORTED\fP 
.RE
.PP
\fBNote:\fP
.RS 4
Note that this function may also return error codes from previous, asynchronous launches.
.RE
.PP
\fBSee also:\fP
.RS 4
\fBcuStreamWaitValue32\fP, \fBcuStreamWriteValue32\fP, \fBcuStreamWriteValue64\fP, \fBcuStreamBatchMemOp\fP, \fBcuMemHostRegister\fP, \fBcuStreamWaitEvent\fP 
.RE
.PP

.SS "\fBCUresult\fP cuStreamWriteValue32 (\fBCUstream\fP stream, \fBCUdeviceptr\fP addr, cuuint32_t value, unsigned int flags)"
.PP
Write a value to memory. Unless the \fBCU_STREAM_WRITE_VALUE_NO_MEMORY_BARRIER\fP flag is passed, the write is preceded by a system-wide memory fence, equivalent to a __threadfence_system() but scoped to the stream rather than a CUDA thread.
.PP
If the memory was registered via \fBcuMemHostRegister()\fP, the device pointer should be obtained with \fBcuMemHostGetDevicePointer()\fP. This function cannot be used with managed memory (\fBcuMemAllocManaged\fP).
.PP
Support for this can be queried with \fBcuDeviceGetAttribute()\fP and \fBCU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_MEM_OPS\fP.
.PP
\fBParameters:\fP
.RS 4
\fIstream\fP The stream to do the write in. 
.br
\fIaddr\fP The device address to write to. 
.br
\fIvalue\fP The value to write. 
.br
\fIflags\fP See \fBCUstreamWriteValue_flags\fP.
.RE
.PP
\fBReturns:\fP
.RS 4
\fBCUDA_SUCCESS\fP, \fBCUDA_ERROR_INVALID_VALUE\fP, \fBCUDA_ERROR_NOT_SUPPORTED\fP 
.RE
.PP
\fBNote:\fP
.RS 4
Note that this function may also return error codes from previous, asynchronous launches.
.RE
.PP
\fBSee also:\fP
.RS 4
\fBcuStreamWriteValue64\fP, \fBcuStreamWaitValue32\fP, \fBcuStreamWaitValue64\fP, \fBcuStreamBatchMemOp\fP, \fBcuMemHostRegister\fP, \fBcuEventRecord\fP 
.RE
.PP

.SS "\fBCUresult\fP cuStreamWriteValue64 (\fBCUstream\fP stream, \fBCUdeviceptr\fP addr, cuuint64_t value, unsigned int flags)"
.PP
Write a value to memory. Unless the \fBCU_STREAM_WRITE_VALUE_NO_MEMORY_BARRIER\fP flag is passed, the write is preceded by a system-wide memory fence, equivalent to a __threadfence_system() but scoped to the stream rather than a CUDA thread.
.PP
If the memory was registered via \fBcuMemHostRegister()\fP, the device pointer should be obtained with \fBcuMemHostGetDevicePointer()\fP.
.PP
Support for this can be queried with \fBcuDeviceGetAttribute()\fP and \fBCU_DEVICE_ATTRIBUTE_CAN_USE_64_BIT_STREAM_MEM_OPS\fP.
.PP
\fBParameters:\fP
.RS 4
\fIstream\fP The stream to do the write in. 
.br
\fIaddr\fP The device address to write to. 
.br
\fIvalue\fP The value to write. 
.br
\fIflags\fP See \fBCUstreamWriteValue_flags\fP.
.RE
.PP
\fBReturns:\fP
.RS 4
\fBCUDA_SUCCESS\fP, \fBCUDA_ERROR_INVALID_VALUE\fP, \fBCUDA_ERROR_NOT_SUPPORTED\fP 
.RE
.PP
\fBNote:\fP
.RS 4
Note that this function may also return error codes from previous, asynchronous launches.
.RE
.PP
\fBSee also:\fP
.RS 4
\fBcuStreamWriteValue32\fP, \fBcuStreamWaitValue32\fP, \fBcuStreamWaitValue64\fP, \fBcuStreamBatchMemOp\fP, \fBcuMemHostRegister\fP, \fBcuEventRecord\fP 
.RE
.PP

.SH "Author"
.PP 
Generated automatically by Doxygen from the source code.