.TH "Unified Addressing" 3 "24 Apr 2019" "Version 6.0" "Doxygen" \" -*- nroff -*-
.ad l
.nh
.SH NAME
Unified Addressing \- unified addressing functions of the low-level CUDA driver API (cuda.h)
.SS "Functions"

.in +1c
.ti -1c
.RI "\fBCUresult\fP \fBcuMemAdvise\fP (\fBCUdeviceptr\fP devPtr, size_t count, \fBCUmem_advise\fP advice, \fBCUdevice\fP device)"
.br
.RI "\fIAdvise about the usage of a given memory range. \fP"
.ti -1c
.RI "\fBCUresult\fP \fBcuMemPrefetchAsync\fP (\fBCUdeviceptr\fP devPtr, size_t count, \fBCUdevice\fP dstDevice, \fBCUstream\fP hStream)"
.br
.RI "\fIPrefetches memory to the specified destination device. \fP"
.ti -1c
.RI "\fBCUresult\fP \fBcuMemRangeGetAttribute\fP (void *data, size_t dataSize, \fBCUmem_range_attribute\fP attribute, \fBCUdeviceptr\fP devPtr, size_t count)"
.br
.RI "\fIQuery an attribute of a given memory range. \fP"
.ti -1c
.RI "\fBCUresult\fP \fBcuMemRangeGetAttributes\fP (void **data, size_t *dataSizes, \fBCUmem_range_attribute\fP *attributes, size_t numAttributes, \fBCUdeviceptr\fP devPtr, size_t count)"
.br
.RI "\fIQuery attributes of a given memory range. \fP"
.ti -1c
.RI "\fBCUresult\fP \fBcuPointerGetAttribute\fP (void *data, \fBCUpointer_attribute\fP attribute, \fBCUdeviceptr\fP ptr)"
.br
.RI "\fIReturns information about a pointer. \fP"
.ti -1c
.RI "\fBCUresult\fP \fBcuPointerGetAttributes\fP (unsigned int numAttributes, \fBCUpointer_attribute\fP *attributes, void **data, \fBCUdeviceptr\fP ptr)"
.br
.RI "\fIReturns information about a pointer. \fP"
.ti -1c
.RI "\fBCUresult\fP \fBcuPointerSetAttribute\fP (const void *value, \fBCUpointer_attribute\fP attribute, \fBCUdeviceptr\fP ptr)"
.br
.RI "\fISet attributes on a previously allocated memory region. \fP"
.in -1c
.SH "Detailed Description"
.PP 
Unified addressing functions of the low-level CUDA driver API (\fBcuda.h\fP)
.PP
This section describes the unified addressing functions of the low-level CUDA driver application programming interface.
.SH "Overview"
.PP
CUDA devices can share a unified address space with the host. For these devices there is no distinction between a device pointer and a host pointer -- the same pointer value may be used to access memory from the host program and from a kernel running on the device (with exceptions enumerated below).
.SH "Supported Platforms"
.PP
Whether or not a device supports unified addressing may be queried by calling \fBcuDeviceGetAttribute()\fP with the device attribute \fBCU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING\fP.
.PP
Unified addressing is automatically enabled in 64-bit processes.
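.PP
For illustration, a minimal sketch (error handling omitted) of checking at runtime whether a device supports unified addressing:
.PP
.nf
#include <cuda.h>
#include <stdio.h>

/* Sketch: query whether device 0 shares a unified address space
   with the host. Error handling omitted for brevity. */
int main(void)
{
    CUdevice dev;
    int unified = 0;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuDeviceGetAttribute(&unified,
                         CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING, dev);

    puts(unified ? "unified addressing: yes" : "unified addressing: no");
    return 0;
}
.fi
.PP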
.SH "Looking Up Information from Pointer Values"
.PP
It is possible to look up information about the memory which backs a pointer value. For instance, one may want to know if a pointer points to host or device memory. As another example, in the case of device memory, one may want to know on which CUDA device the memory resides. These properties may be queried using the function \fBcuPointerGetAttribute()\fP.
.PP
Since pointers are unique, it is not necessary to specify information about the pointers specified to the various copy functions in the CUDA API. The function \fBcuMemcpy()\fP may be used to perform a copy between two pointers, ignoring whether they point to host or device memory (making \fBcuMemcpyHtoD()\fP, \fBcuMemcpyDtoD()\fP, and \fBcuMemcpyDtoH()\fP unnecessary for devices supporting unified addressing). For multidimensional copies, the memory type \fBCU_MEMORYTYPE_UNIFIED\fP may be used to specify that the CUDA driver should infer the location of the pointer from its value.
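.PP
As a minimal sketch (assuming a context is current on a device that supports unified addressing; error handling omitted), the same \fBcuMemcpy()\fP call works in either direction and \fBcuPointerGetAttribute()\fP reports where a pointer resides:
.PP
.nf
#include <cuda.h>
#include <stdint.h>

static void copy_and_inspect(void)
{
    CUdeviceptr d_buf;
    void *h_buf;
    unsigned int memType = 0;

    cuMemAlloc(&d_buf, 4096);        /* device memory           */
    cuMemAllocHost(&h_buf, 4096);    /* page-locked host memory */

    /* The driver infers each pointer's location from its value. */
    cuMemcpy(d_buf, (CUdeviceptr)(uintptr_t)h_buf, 4096);  /* host -> device */
    cuMemcpy((CUdeviceptr)(uintptr_t)h_buf, d_buf, 4096);  /* device -> host */

    /* Expect CU_MEMORYTYPE_DEVICE for d_buf. */
    cuPointerGetAttribute(&memType, CU_POINTER_ATTRIBUTE_MEMORY_TYPE, d_buf);

    cuMemFreeHost(h_buf);
    cuMemFree(d_buf);
}
.fi
.PP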
.SH "Automatic Mapping of Host Allocated Host Memory"
.PP
All host memory allocated in all contexts using \fBcuMemAllocHost()\fP and \fBcuMemHostAlloc()\fP is always directly accessible from all contexts on all devices that support unified addressing. This is the case regardless of whether or not the flags \fBCU_MEMHOSTALLOC_PORTABLE\fP and \fBCU_MEMHOSTALLOC_DEVICEMAP\fP are specified.
.PP
The pointer value through which allocated host memory may be accessed in kernels on all devices that support unified addressing is the same as the pointer value through which that memory is accessed on the host, so it is not necessary to call \fBcuMemHostGetDevicePointer()\fP to get the device pointer for these allocations.
.PP
Note that this is not the case for memory allocated using the flag \fBCU_MEMHOSTALLOC_WRITECOMBINED\fP, as discussed below.
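.PP
A minimal sketch (the kernel handle and stream are assumed to come from elsewhere; error handling omitted) of passing a \fBcuMemAllocHost()\fP pointer directly to a kernel:
.PP
.nf
#include <cuda.h>

static void host_alloc_direct_use(CUfunction kernel, CUstream stream)
{
    void *h_buf;
    cuMemAllocHost(&h_buf, 1 << 20);

    /* No cuMemHostGetDevicePointer() needed: on a unified-addressing
       device the host pointer value is also the device-visible value. */
    void *args[] = { &h_buf };
    cuLaunchKernel(kernel, 1, 1, 1, 256, 1, 1, 0, stream, args, NULL);

    cuStreamSynchronize(stream);
    cuMemFreeHost(h_buf);
}
.fi
.PP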
.SH "Automatic Registration of Peer Memory"
.PP
Upon enabling direct access from a context that supports unified addressing to another peer context that supports unified addressing using \fBcuCtxEnablePeerAccess()\fP all memory allocated in the peer context using \fBcuMemAlloc()\fP and \fBcuMemAllocPitch()\fP will immediately be accessible by the current context. The device pointer value through which any peer memory may be accessed in the current context is the same pointer value through which that memory may be accessed in the peer context.
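.PP
A minimal sketch (assuming both contexts support unified addressing; error handling omitted) of enabling peer access:
.PP
.nf
#include <cuda.h>

static CUresult enable_peer(CUcontext peerCtx, CUdevice dev, CUdevice peerDev)
{
    int canAccess = 0;

    cuDeviceCanAccessPeer(&canAccess, dev, peerDev);
    if (!canAccess)
        return CUDA_ERROR_PEER_ACCESS_UNSUPPORTED;

    /* After this call, cuMemAlloc()/cuMemAllocPitch() allocations made in
       peerCtx are accessible from the current context through the same
       pointer values. The Flags argument must be 0. */
    return cuCtxEnablePeerAccess(peerCtx, 0);
}
.fi
.PP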
.SH "Exceptions, Disjoint Addressing"
.PP
Not all memory can be accessed on devices through the same pointer value through which it is accessed on the host. The exceptions are host memory registered using \fBcuMemHostRegister()\fP and host memory allocated using the flag \fBCU_MEMHOSTALLOC_WRITECOMBINED\fP. For these exceptions, there exist distinct host and device addresses for the memory. The device address is guaranteed not to overlap any valid host pointer range and is guaranteed to have the same value across all contexts that support unified addressing.
.PP
This device address may be queried using \fBcuMemHostGetDevicePointer()\fP when a context using unified addressing is current. Either the host or the unified device pointer value may be used to refer to this memory through \fBcuMemcpy()\fP and similar functions using the \fBCU_MEMORYTYPE_UNIFIED\fP memory type. 
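.PP
A minimal sketch of the disjoint-addressing case (assuming a current context; error handling omitted):
.PP
.nf
#include <cuda.h>
#include <stdlib.h>

static void register_and_map(size_t bytes)
{
    void *h_buf = NULL;
    CUdeviceptr d_view;

    /* Page-aligned buffer; registration pins whole pages. */
    posix_memalign(&h_buf, 4096, bytes);
    cuMemHostRegister(h_buf, bytes, CU_MEMHOSTREGISTER_DEVICEMAP);

    /* Registered host memory has a distinct device address. */
    cuMemHostGetDevicePointer(&d_view, h_buf, 0);

    /* Either h_buf or d_view may be used with cuMemcpy() and related
       functions using the CU_MEMORYTYPE_UNIFIED memory type. */

    cuMemHostUnregister(h_buf);
    free(h_buf);
}
.fi
.PP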
.SH "Function Documentation"
.PP 
.SS "\fBCUresult\fP cuMemAdvise (\fBCUdeviceptr\fP devPtr, size_t count, \fBCUmem_advise\fP advice, \fBCUdevice\fP device)"
.PP
Advise the Unified Memory subsystem about the usage pattern for the memory range starting at \fCdevPtr\fP with a size of \fCcount\fP bytes. The start address and end address of the memory range will be rounded down and rounded up respectively to be aligned to CPU page size before the advice is applied. The memory range must refer to managed memory allocated via \fBcuMemAllocManaged\fP or declared via __managed__ variables. The memory range could also refer to system-allocated pageable memory provided it represents a valid, host-accessible region of memory and all additional constraints imposed by \fCadvice\fP as outlined below are also satisfied. Specifying an invalid system-allocated pageable memory range results in an error being returned.
.PP
The \fCadvice\fP parameter can take the following values:
.IP "\(bu" 2
\fBCU_MEM_ADVISE_SET_READ_MOSTLY\fP: This implies that the data is mostly going to be read from and only occasionally written to. Any read accesses from any processor to this region will create a read-only copy of at least the accessed pages in that processor's memory. Additionally, if \fBcuMemPrefetchAsync\fP is called on this region, it will create a read-only copy of the data on the destination processor. If any processor writes to this region, all copies of the corresponding page will be invalidated except for the one where the write occurred. The \fCdevice\fP argument is ignored for this advice. Note that for a page to be read-duplicated, the accessing processor must either be the CPU or a GPU that has a non-zero value for the device attribute \fBCU_DEVICE_ATTRIBUTE_CONCURRENT_MANAGED_ACCESS\fP. Also, if a context is created on a device that does not have the device attribute \fBCU_DEVICE_ATTRIBUTE_CONCURRENT_MANAGED_ACCESS\fP set, then read-duplication will not occur until all such contexts are destroyed. If the memory region refers to valid system-allocated pageable memory, then the accessing device must have a non-zero value for the device attribute \fBCU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS\fP for a read-only copy to be created on that device. Note however that if the accessing device also has a non-zero value for the device attribute \fBCU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS_USES_HOST_PAGE_TABLES\fP, then setting this advice will not create a read-only copy when that device accesses this memory region.
.PP
.PP
.IP "\(bu" 2
\fBCU_MEM_ADVISE_UNSET_READ_MOSTLY\fP: Undoes the effect of \fBCU_MEM_ADVISE_SET_READ_MOSTLY\fP and also prevents the Unified Memory driver from attempting heuristic read-duplication on the memory range. Any read-duplicated copies of the data will be collapsed into a single copy. The location for the collapsed copy will be the preferred location if the page has a preferred location and one of the read-duplicated copies was resident at that location. Otherwise, the location chosen is arbitrary.
.PP
.PP
.IP "\(bu" 2
\fBCU_MEM_ADVISE_SET_PREFERRED_LOCATION\fP: This advice sets the preferred location for the data to be the memory belonging to \fCdevice\fP. Passing in CU_DEVICE_CPU for \fCdevice\fP sets the preferred location as host memory. If \fCdevice\fP is a GPU, then it must have a non-zero value for the device attribute \fBCU_DEVICE_ATTRIBUTE_CONCURRENT_MANAGED_ACCESS\fP. Setting the preferred location does not cause data to migrate to that location immediately. Instead, it guides the migration policy when a fault occurs on that memory region. If the data is already in its preferred location and the faulting processor can establish a mapping without requiring the data to be migrated, then data migration will be avoided. On the other hand, if the data is not in its preferred location or if a direct mapping cannot be established, then it will be migrated to the processor accessing it. It is important to note that setting the preferred location does not prevent data prefetching done using \fBcuMemPrefetchAsync\fP. Having a preferred location can override the page thrash detection and resolution logic in the Unified Memory driver. Normally, if a page is detected to be constantly thrashing between for example host and device memory, the page may eventually be pinned to host memory by the Unified Memory driver. But if the preferred location is set as device memory, then the page will continue to thrash indefinitely. If \fBCU_MEM_ADVISE_SET_READ_MOSTLY\fP is also set on this memory region or any subset of it, then the policies associated with that advice will override the policies of this advice, unless read accesses from \fCdevice\fP will not result in a read-only copy being created on that device as outlined in description for the advice \fBCU_MEM_ADVISE_SET_READ_MOSTLY\fP. If the memory region refers to valid system-allocated pageable memory, then \fCdevice\fP must have a non-zero value for the device attribute \fBCU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS\fP. Additionally, if \fCdevice\fP has a non-zero value for the device attribute \fBCU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS_USES_HOST_PAGE_TABLES\fP, then this call has no effect. Note however that this behavior may change in the future.
.PP
.PP
.IP "\(bu" 2
\fBCU_MEM_ADVISE_UNSET_PREFERRED_LOCATION\fP: Undoes the effect of \fBCU_MEM_ADVISE_SET_PREFERRED_LOCATION\fP and changes the preferred location to none.
.PP
.PP
.IP "\(bu" 2
\fBCU_MEM_ADVISE_SET_ACCESSED_BY\fP: This advice implies that the data will be accessed by \fCdevice\fP. Passing in \fBCU_DEVICE_CPU\fP for \fCdevice\fP will set the advice for the CPU. If \fCdevice\fP is a GPU, then the device attribute \fBCU_DEVICE_ATTRIBUTE_CONCURRENT_MANAGED_ACCESS\fP must be non-zero. This advice does not cause data migration and has no impact on the location of the data per se. Instead, it causes the data to always be mapped in the specified processor's page tables, as long as the location of the data permits a mapping to be established. If the data gets migrated for any reason, the mappings are updated accordingly. This advice is recommended in scenarios where data locality is not important, but avoiding faults is. Consider for example a system containing multiple GPUs with peer-to-peer access enabled, where the data located on one GPU is occasionally accessed by peer GPUs. In such scenarios, migrating data over to the other GPUs is not as important because the accesses are infrequent and the overhead of migration may be too high. But preventing faults can still help improve performance, and so having a mapping set up in advance is useful. Note that on CPU access of this data, the data may be migrated to host memory because the CPU typically cannot access device memory directly. Any GPU that had the \fBCU_MEM_ADVISE_SET_ACCESSED_BY\fP flag set for this data will now have its mapping updated to point to the page in host memory. If \fBCU_MEM_ADVISE_SET_READ_MOSTLY\fP is also set on this memory region or any subset of it, then the policies associated with that advice will override the policies of this advice. Additionally, if the preferred location of this memory region or any subset of it is also \fCdevice\fP, then the policies associated with \fBCU_MEM_ADVISE_SET_PREFERRED_LOCATION\fP will override the policies of this advice. If the memory region refers to valid system-allocated pageable memory, then \fCdevice\fP must have a non-zero value for the device attribute \fBCU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS\fP. Additionally, if \fCdevice\fP has a non-zero value for the device attribute \fBCU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS_USES_HOST_PAGE_TABLES\fP, then this call has no effect.
.PP
.PP
.IP "\(bu" 2
\fBCU_MEM_ADVISE_UNSET_ACCESSED_BY\fP: Undoes the effect of \fBCU_MEM_ADVISE_SET_ACCESSED_BY\fP. Any mappings to the data from \fCdevice\fP may be removed at any time causing accesses to result in non-fatal page faults. If the memory region refers to valid system-allocated pageable memory, then \fCdevice\fP must have a non-zero value for the device attribute \fBCU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS\fP. Additionally, if \fCdevice\fP has a non-zero value for the device attribute \fBCU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS_USES_HOST_PAGE_TABLES\fP, then this call has no effect.
.PP
.PP
\fBParameters:\fP
.RS 4
\fIdevPtr\fP - Pointer to memory to set the advice for 
.br
\fIcount\fP - Size in bytes of the memory range 
.br
\fIadvice\fP - Advice to be applied for the specified memory range 
.br
\fIdevice\fP - Device to apply the advice for
.RE
.PP
\fBReturns:\fP
.RS 4
\fBCUDA_SUCCESS\fP, \fBCUDA_ERROR_INVALID_VALUE\fP, \fBCUDA_ERROR_INVALID_DEVICE\fP 
.RE
.PP
\fBNote:\fP
.RS 4
Note that this function may also return error codes from previous, asynchronous launches. 
.PP
This function exhibits  behavior for most use cases. 
.PP
This function uses standard  semantics.
.RE
.PP
\fBSee also:\fP
.RS 4
\fBcuMemcpy\fP, \fBcuMemcpyPeer\fP, \fBcuMemcpyAsync\fP, \fBcuMemcpy3DPeerAsync\fP, \fBcuMemPrefetchAsync\fP, cudaMemAdvise 
.RE
.PP
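A minimal usage sketch (assuming \fCmanaged\fP was allocated with \fBcuMemAllocManaged()\fP and \fCdev\fP has a non-zero \fBCU_DEVICE_ATTRIBUTE_CONCURRENT_MANAGED_ACCESS\fP; error handling omitted):
.PP
.nf
#include <cuda.h>

static void advise_managed(CUdeviceptr managed, size_t bytes, CUdevice dev)
{
    /* Mostly-read data: read-only copies may be created per processor.
       The device argument is ignored for this advice. */
    cuMemAdvise(managed, bytes, CU_MEM_ADVISE_SET_READ_MOSTLY, dev);

    /* Prefer to keep the backing pages in host memory... */
    cuMemAdvise(managed, bytes, CU_MEM_ADVISE_SET_PREFERRED_LOCATION,
                CU_DEVICE_CPU);

    /* ...but keep a mapping for dev so its accesses do not fault. */
    cuMemAdvise(managed, bytes, CU_MEM_ADVISE_SET_ACCESSED_BY, dev);
}
.fi
.PP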

.SS "\fBCUresult\fP cuMemPrefetchAsync (\fBCUdeviceptr\fP devPtr, size_t count, \fBCUdevice\fP dstDevice, \fBCUstream\fP hStream)"
.PP
Prefetches memory to the specified destination device. \fCdevPtr\fP is the base device pointer of the memory to be prefetched and \fCdstDevice\fP is the destination device. \fCcount\fP specifies the number of bytes to prefetch. \fChStream\fP is the stream in which the operation is enqueued. The memory range must refer to managed memory allocated via \fBcuMemAllocManaged\fP or declared via __managed__ variables.
.PP
Passing in CU_DEVICE_CPU for \fCdstDevice\fP will prefetch the data to host memory. If \fCdstDevice\fP is a GPU, then the device attribute \fBCU_DEVICE_ATTRIBUTE_CONCURRENT_MANAGED_ACCESS\fP must be non-zero. Additionally, \fChStream\fP must be associated with a device that has a non-zero value for the device attribute \fBCU_DEVICE_ATTRIBUTE_CONCURRENT_MANAGED_ACCESS\fP.
.PP
The start address and end address of the memory range will be rounded down and rounded up respectively to be aligned to CPU page size before the prefetch operation is enqueued in the stream.
.PP
If no physical memory has been allocated for this region, then this memory region will be populated and mapped on the destination device. If there's insufficient memory to prefetch the desired region, the Unified Memory driver may evict pages from other \fBcuMemAllocManaged\fP allocations to host memory in order to make room. Device memory allocated using \fBcuMemAlloc\fP or \fBcuArrayCreate\fP will not be evicted.
.PP
By default, any mappings to the previous location of the migrated pages are removed and mappings for the new location are only setup on \fCdstDevice\fP. The exact behavior however also depends on the settings applied to this memory range via \fBcuMemAdvise\fP as described below:
.PP
If \fBCU_MEM_ADVISE_SET_READ_MOSTLY\fP was set on any subset of this memory range, then that subset will create a read-only copy of the pages on \fCdstDevice\fP.
.PP
If \fBCU_MEM_ADVISE_SET_PREFERRED_LOCATION\fP was called on any subset of this memory range, then the pages will be migrated to \fCdstDevice\fP even if \fCdstDevice\fP is not the preferred location of any pages in the memory range.
.PP
If \fBCU_MEM_ADVISE_SET_ACCESSED_BY\fP was called on any subset of this memory range, then mappings to those pages from all the appropriate processors are updated to refer to the new location if establishing such a mapping is possible. Otherwise, those mappings are cleared.
.PP
Note that this API is not required for functionality and only serves to improve performance by allowing the application to migrate data to a suitable location before it is accessed. Memory accesses to this range are always coherent and are allowed even when the data is actively being migrated.
.PP
Note that this function is asynchronous with respect to the host and all work on other devices.
.PP
\fBParameters:\fP
.RS 4
\fIdevPtr\fP - Pointer to be prefetched 
.br
\fIcount\fP - Size in bytes 
.br
\fIdstDevice\fP - Destination device to prefetch to 
.br
\fIhStream\fP - Stream to enqueue prefetch operation
.RE
.PP
\fBReturns:\fP
.RS 4
\fBCUDA_SUCCESS\fP, \fBCUDA_ERROR_INVALID_VALUE\fP, \fBCUDA_ERROR_INVALID_DEVICE\fP 
.RE
.PP
\fBNote:\fP
.RS 4
Note that this function may also return error codes from previous, asynchronous launches. 
.PP
This function exhibits asynchronous behavior for most use cases. 
.PP
This function uses standard default stream semantics.
.RE
.PP
\fBSee also:\fP
.RS 4
\fBcuMemcpy\fP, \fBcuMemcpyPeer\fP, \fBcuMemcpyAsync\fP, \fBcuMemcpy3DPeerAsync\fP, \fBcuMemAdvise\fP, cudaMemPrefetchAsync 
.RE
.PP
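A minimal usage sketch (assuming \fCmanaged\fP refers to managed memory and \fCdev\fP has a non-zero \fBCU_DEVICE_ATTRIBUTE_CONCURRENT_MANAGED_ACCESS\fP; error handling omitted):
.PP
.nf
#include <cuda.h>

static void prefetch_round_trip(CUdeviceptr managed, size_t bytes,
                                CUdevice dev, CUstream stream)
{
    /* Migrate the pages to the GPU ahead of the kernels queued next. */
    cuMemPrefetchAsync(managed, bytes, dev, stream);

    /* ... enqueue kernels that access 'managed' on 'stream' ... */

    /* Bring the data back to host memory for CPU post-processing. */
    cuMemPrefetchAsync(managed, bytes, CU_DEVICE_CPU, stream);
    cuStreamSynchronize(stream);
}
.fi
.PP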

.SS "\fBCUresult\fP cuMemRangeGetAttribute (void * data, size_t dataSize, \fBCUmem_range_attribute\fP attribute, \fBCUdeviceptr\fP devPtr, size_t count)"
.PP
Query an attribute about the memory range starting at \fCdevPtr\fP with a size of \fCcount\fP bytes. The memory range must refer to managed memory allocated via \fBcuMemAllocManaged\fP or declared via __managed__ variables.
.PP
The \fCattribute\fP parameter can take the following values:
.IP "\(bu" 2
\fBCU_MEM_RANGE_ATTRIBUTE_READ_MOSTLY\fP: If this attribute is specified, \fCdata\fP will be interpreted as a 32-bit integer, and \fCdataSize\fP must be 4. The result returned will be 1 if all pages in the given memory range have read-duplication enabled, or 0 otherwise.
.IP "\(bu" 2
\fBCU_MEM_RANGE_ATTRIBUTE_PREFERRED_LOCATION\fP: If this attribute is specified, \fCdata\fP will be interpreted as a 32-bit integer, and \fCdataSize\fP must be 4. The result returned will be a GPU device id if all pages in the memory range have that GPU as their preferred location, or it will be CU_DEVICE_CPU if all pages in the memory range have the CPU as their preferred location, or it will be CU_DEVICE_INVALID if either all the pages don't have the same preferred location or some of the pages don't have a preferred location at all. Note that the actual location of the pages in the memory range at the time of the query may be different from the preferred location.
.IP "\(bu" 2
\fBCU_MEM_RANGE_ATTRIBUTE_ACCESSED_BY\fP: If this attribute is specified, \fCdata\fP will be interpreted as an array of 32-bit integers, and \fCdataSize\fP must be a non-zero multiple of 4. The result returned will be a list of device ids that had \fBCU_MEM_ADVISE_SET_ACCESSED_BY\fP set for that entire memory range. If any device does not have that advice set for the entire memory range, that device will not be included. If \fCdata\fP is larger than the number of devices that have that advice set for that memory range, CU_DEVICE_INVALID will be returned in all the extra space provided. For example, if \fCdataSize\fP is 12 (i.e., \fCdata\fP has 3 elements) and only device 0 has the advice set, then the result returned will be { 0, CU_DEVICE_INVALID, CU_DEVICE_INVALID }. If \fCdata\fP is smaller than the number of devices that have that advice set, then only as many devices will be returned as can fit in the array. There is no guarantee on which specific devices will be returned, however.
.IP "\(bu" 2
\fBCU_MEM_RANGE_ATTRIBUTE_LAST_PREFETCH_LOCATION\fP: If this attribute is specified, \fCdata\fP will be interpreted as a 32-bit integer, and \fCdataSize\fP must be 4. The result returned will be the last location to which all pages in the memory range were prefetched explicitly via \fBcuMemPrefetchAsync\fP. This will either be a GPU id or CU_DEVICE_CPU depending on whether the last location for prefetch was a GPU or the CPU, respectively. If any page in the memory range was never explicitly prefetched or if all pages were not prefetched to the same location, CU_DEVICE_INVALID will be returned. Note that this simply returns the last location that the application requested to prefetch the memory range to. It gives no indication as to whether the prefetch operation to that location has completed or even begun.
.PP
.PP
\fBParameters:\fP
.RS 4
\fIdata\fP - A pointer to a memory location where the result of the attribute query will be written 
.br
\fIdataSize\fP - The size of \fCdata\fP, in bytes 
.br
\fIattribute\fP - The attribute to query 
.br
\fIdevPtr\fP - Start of the range to query 
.br
\fIcount\fP - Size of the range to query
.RE
.PP
\fBReturns:\fP
.RS 4
\fBCUDA_SUCCESS\fP, \fBCUDA_ERROR_INVALID_VALUE\fP, \fBCUDA_ERROR_INVALID_DEVICE\fP 
.RE
.PP
\fBNote:\fP
.RS 4
Note that this function may also return error codes from previous, asynchronous launches. 
.PP
This function exhibits  behavior for most use cases. 
.PP
This function uses standard  semantics.
.RE
.PP
\fBSee also:\fP
.RS 4
\fBcuMemRangeGetAttributes\fP, \fBcuMemPrefetchAsync\fP, \fBcuMemAdvise\fP, cudaMemRangeGetAttribute 
.RE
.PP
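A minimal usage sketch (assuming \fCmanaged\fP refers to managed memory; both attributes are 32-bit, so \fCdataSize\fP is 4; error handling omitted):
.PP
.nf
#include <cuda.h>

static void inspect_range(CUdeviceptr managed, size_t bytes)
{
    int readMostly   = 0;  /* 1 if the whole range is read-duplicated    */
    int lastPrefetch = 0;  /* GPU id, CU_DEVICE_CPU or CU_DEVICE_INVALID */

    cuMemRangeGetAttribute(&readMostly, 4,
                           CU_MEM_RANGE_ATTRIBUTE_READ_MOSTLY,
                           managed, bytes);

    cuMemRangeGetAttribute(&lastPrefetch, 4,
                           CU_MEM_RANGE_ATTRIBUTE_LAST_PREFETCH_LOCATION,
                           managed, bytes);
}
.fi
.PP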

.SS "\fBCUresult\fP cuMemRangeGetAttributes (void ** data, size_t * dataSizes, \fBCUmem_range_attribute\fP * attributes, size_t numAttributes, \fBCUdeviceptr\fP devPtr, size_t count)"
.PP
Query attributes of the memory range starting at \fCdevPtr\fP with a size of \fCcount\fP bytes. The memory range must refer to managed memory allocated via \fBcuMemAllocManaged\fP or declared via __managed__ variables. The \fCattributes\fP array will be interpreted to have \fCnumAttributes\fP entries. The \fCdataSizes\fP array will also be interpreted to have \fCnumAttributes\fP entries. The results of the query will be stored in \fCdata\fP.
.PP
The list of supported attributes are given below. Please refer to \fBcuMemRangeGetAttribute\fP for attribute descriptions and restrictions.
.PP
.IP "\(bu" 2
\fBCU_MEM_RANGE_ATTRIBUTE_READ_MOSTLY\fP
.IP "\(bu" 2
\fBCU_MEM_RANGE_ATTRIBUTE_PREFERRED_LOCATION\fP
.IP "\(bu" 2
\fBCU_MEM_RANGE_ATTRIBUTE_ACCESSED_BY\fP
.IP "\(bu" 2
\fBCU_MEM_RANGE_ATTRIBUTE_LAST_PREFETCH_LOCATION\fP
.PP
.PP
\fBParameters:\fP
.RS 4
\fIdata\fP - An array of pointers to memory locations where the result of each attribute query will be written 
.br
\fIdataSizes\fP - Array containing the sizes of each result 
.br
\fIattributes\fP - An array of attributes to query (numAttributes and the number of attributes in this array should match) 
.br
\fInumAttributes\fP - Number of attributes to query 
.br
\fIdevPtr\fP - Start of the range to query 
.br
\fIcount\fP - Size of the range to query
.RE
.PP
\fBReturns:\fP
.RS 4
\fBCUDA_SUCCESS\fP, \fBCUDA_ERROR_DEINITIALIZED\fP, \fBCUDA_ERROR_INVALID_CONTEXT\fP, \fBCUDA_ERROR_INVALID_VALUE\fP, \fBCUDA_ERROR_INVALID_DEVICE\fP 
.RE
.PP
\fBNote:\fP
.RS 4
Note that this function may also return error codes from previous, asynchronous launches.
.RE
.PP
\fBSee also:\fP
.RS 4
\fBcuMemRangeGetAttribute\fP, \fBcuMemAdvise\fP, \fBcuMemPrefetchAsync\fP, cudaMemRangeGetAttributes 
.RE
.PP
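A minimal usage sketch batching two queries into one call (error handling omitted):
.PP
.nf
#include <cuda.h>

static void inspect_range_batched(CUdeviceptr managed, size_t bytes)
{
    int readMostly = 0, preferredLoc = 0;

    CUmem_range_attribute attrs[2] = {
        CU_MEM_RANGE_ATTRIBUTE_READ_MOSTLY,
        CU_MEM_RANGE_ATTRIBUTE_PREFERRED_LOCATION
    };
    void  *data[2]      = { &readMostly, &preferredLoc };
    size_t dataSizes[2] = { 4, 4 };   /* both results are 32-bit */

    cuMemRangeGetAttributes(data, dataSizes, attrs, 2, managed, bytes);
}
.fi
.PP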

.SS "\fBCUresult\fP cuPointerGetAttribute (void * data, \fBCUpointer_attribute\fP attribute, \fBCUdeviceptr\fP ptr)"
.PP
The supported attributes are:
.PP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_CONTEXT\fP:
.PP
.PP
Returns in \fC*data\fP the \fBCUcontext\fP in which \fCptr\fP was allocated or registered. The type of \fCdata\fP must be \fBCUcontext\fP *.
.PP
If \fCptr\fP was not allocated by, mapped by, or registered with a \fBCUcontext\fP which uses unified virtual addressing then \fBCUDA_ERROR_INVALID_VALUE\fP is returned.
.PP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_MEMORY_TYPE\fP:
.PP
.PP
Returns in \fC*data\fP the physical memory type of the memory that \fCptr\fP addresses as a \fBCUmemorytype\fP enumerated value. The type of \fCdata\fP must be unsigned int.
.PP
If \fCptr\fP addresses device memory then \fC*data\fP is set to \fBCU_MEMORYTYPE_DEVICE\fP. The particular \fBCUdevice\fP on which the memory resides is the \fBCUdevice\fP of the \fBCUcontext\fP returned by the \fBCU_POINTER_ATTRIBUTE_CONTEXT\fP attribute of \fCptr\fP.
.PP
If \fCptr\fP addresses host memory then \fC*data\fP is set to \fBCU_MEMORYTYPE_HOST\fP.
.PP
If \fCptr\fP was not allocated by, mapped by, or registered with a \fBCUcontext\fP which uses unified virtual addressing then \fBCUDA_ERROR_INVALID_VALUE\fP is returned.
.PP
If the current \fBCUcontext\fP does not support unified virtual addressing then \fBCUDA_ERROR_INVALID_CONTEXT\fP is returned.
.PP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_DEVICE_POINTER\fP:
.PP
.PP
Returns in \fC*data\fP the device pointer value through which \fCptr\fP may be accessed by kernels running in the current \fBCUcontext\fP. The type of \fCdata\fP must be CUdeviceptr *.
.PP
If there exists no device pointer value through which kernels running in the current \fBCUcontext\fP may access \fCptr\fP then \fBCUDA_ERROR_INVALID_VALUE\fP is returned.
.PP
If there is no current \fBCUcontext\fP then \fBCUDA_ERROR_INVALID_CONTEXT\fP is returned.
.PP
Except in the exceptional disjoint addressing cases discussed below, the value returned in \fC*data\fP will equal the input value \fCptr\fP.
.PP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_HOST_POINTER\fP:
.PP
.PP
Returns in \fC*data\fP the host pointer value through which \fCptr\fP may be accessed by the host program. The type of \fCdata\fP must be void **. If there exists no host pointer value through which the host program may directly access \fCptr\fP then \fBCUDA_ERROR_INVALID_VALUE\fP is returned.
.PP
Except in the exceptional disjoint addressing cases discussed below, the value returned in \fC*data\fP will equal the input value \fCptr\fP.
.PP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_P2P_TOKENS\fP:
.PP
.PP
Returns in \fC*data\fP two tokens for use with the nv-p2p.h Linux kernel interface. \fCdata\fP must be a struct of type \fBCUDA_POINTER_ATTRIBUTE_P2P_TOKENS\fP.
.PP
\fCptr\fP must be a pointer to memory obtained from \fBcuMemAlloc()\fP. Note that p2pToken and vaSpaceToken are only valid for the lifetime of the source allocation. A subsequent allocation at the same address may return completely different tokens. Querying this attribute has a side effect of setting the attribute \fBCU_POINTER_ATTRIBUTE_SYNC_MEMOPS\fP for the region of memory that \fCptr\fP points to.
.PP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_SYNC_MEMOPS\fP:
.PP
.PP
A boolean attribute which, when set, ensures that synchronous memory operations initiated on the region of memory that \fCptr\fP points to will always synchronize. See further documentation in the section titled 'API synchronization behavior' to learn more about cases when synchronous memory operations can exhibit asynchronous behavior.
.PP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_BUFFER_ID\fP:
.PP
.PP
Returns in \fC*data\fP a buffer ID which is guaranteed to be unique within the process. \fCdata\fP must point to an unsigned long long.
.PP
\fCptr\fP must be a pointer to memory obtained from a CUDA memory allocation API. Every memory allocation from any of the CUDA memory allocation APIs will have a unique ID over a process lifetime. Subsequent allocations do not reuse IDs from previous freed allocations. IDs are only unique within a single process.
.PP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_IS_MANAGED\fP:
.PP
.PP
Returns in \fC*data\fP a boolean that indicates whether the pointer points to managed memory or not.
.PP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_DEVICE_ORDINAL\fP:
.PP
.PP
Returns in \fC*data\fP an integer representing a device ordinal of a device against which the memory was allocated or registered.
.PP
.PP
Note that for most allocations in the unified virtual address space the host and device pointer for accessing the allocation will be the same. The exceptions to this are
.IP "\(bu" 2
user memory registered using \fBcuMemHostRegister\fP
.IP "\(bu" 2
host memory allocated using \fBcuMemHostAlloc\fP with the \fBCU_MEMHOSTALLOC_WRITECOMBINED\fP flag
.PP
For these types of allocation there will exist separate, disjoint host and device addresses for accessing the allocation. In particular:
.IP "\(bu" 2
The host address will correspond to an invalid unmapped device address (which will result in an exception if accessed from the device)
.IP "\(bu" 2
The device address will correspond to an invalid unmapped host address (which will result in an exception if accessed from the host).
.PP
For these types of allocations, querying \fBCU_POINTER_ATTRIBUTE_HOST_POINTER\fP and \fBCU_POINTER_ATTRIBUTE_DEVICE_POINTER\fP may be used to retrieve the host and device addresses from either address.
.PP
.PP
\fBParameters:\fP
.RS 4
\fIdata\fP - Returned pointer attribute value 
.br
\fIattribute\fP - Pointer attribute to query 
.br
\fIptr\fP - Pointer
.RE
.PP
\fBReturns:\fP
.RS 4
\fBCUDA_SUCCESS\fP, \fBCUDA_ERROR_DEINITIALIZED\fP, \fBCUDA_ERROR_NOT_INITIALIZED\fP, \fBCUDA_ERROR_INVALID_CONTEXT\fP, \fBCUDA_ERROR_INVALID_VALUE\fP, \fBCUDA_ERROR_INVALID_DEVICE\fP 
.RE
.PP
\fBNote:\fP
.RS 4
Note that this function may also return error codes from previous, asynchronous launches.
.RE
.PP
\fBSee also:\fP
.RS 4
\fBcuPointerSetAttribute\fP, \fBcuMemAlloc\fP, \fBcuMemFree\fP, \fBcuMemAllocHost\fP, \fBcuMemFreeHost\fP, \fBcuMemHostAlloc\fP, \fBcuMemHostRegister\fP, \fBcuMemHostUnregister\fP, cudaPointerGetAttributes 
.RE
.PP
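A minimal usage sketch (error handling omitted) that classifies an arbitrary pointer:
.PP
.nf
#include <cuda.h>

static void classify_pointer(CUdeviceptr ptr)
{
    unsigned int memType   = 0;     /* CUmemorytype value            */
    unsigned int isManaged = 0;
    void *hostView         = NULL;  /* disjoint host address, if any */

    cuPointerGetAttribute(&memType, CU_POINTER_ATTRIBUTE_MEMORY_TYPE, ptr);
    cuPointerGetAttribute(&isManaged, CU_POINTER_ATTRIBUTE_IS_MANAGED, ptr);

    /* Returns CUDA_ERROR_INVALID_VALUE if no host mapping exists. */
    cuPointerGetAttribute(&hostView, CU_POINTER_ATTRIBUTE_HOST_POINTER, ptr);
}
.fi
.PP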

.SS "\fBCUresult\fP cuPointerGetAttributes (unsigned int numAttributes, \fBCUpointer_attribute\fP * attributes, void ** data, \fBCUdeviceptr\fP ptr)"
.PP
The supported attributes are (refer to \fBcuPointerGetAttribute\fP for attribute descriptions and restrictions):
.PP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_CONTEXT\fP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_MEMORY_TYPE\fP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_DEVICE_POINTER\fP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_HOST_POINTER\fP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_SYNC_MEMOPS\fP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_BUFFER_ID\fP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_IS_MANAGED\fP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_DEVICE_ORDINAL\fP
.PP
.PP
\fBParameters:\fP
.RS 4
\fInumAttributes\fP - Number of attributes to query 
.br
\fIattributes\fP - An array of attributes to query (numAttributes and the number of attributes in this array should match) 
.br
\fIdata\fP - An array of pointers to memory locations where the result of each attribute query will be written 
.br
\fIptr\fP - Pointer to query
.RE
.PP
Unlike \fBcuPointerGetAttribute\fP, this function will not return an error when the \fCptr\fP encountered is not a valid CUDA pointer. Instead, the attributes are assigned default NULL values and CUDA_SUCCESS is returned.
.PP
If \fCptr\fP was not allocated by, mapped by, or registered with a \fBCUcontext\fP which uses UVA (Unified Virtual Addressing), \fBCUDA_ERROR_INVALID_CONTEXT\fP is returned.
.PP
\fBReturns:\fP
.RS 4
\fBCUDA_SUCCESS\fP, \fBCUDA_ERROR_DEINITIALIZED\fP, \fBCUDA_ERROR_INVALID_CONTEXT\fP, \fBCUDA_ERROR_INVALID_VALUE\fP, \fBCUDA_ERROR_INVALID_DEVICE\fP 
.RE
.PP
\fBNote:\fP
.RS 4
Note that this function may also return error codes from previous, asynchronous launches.
.RE
.PP
\fBSee also:\fP
.RS 4
\fBcuPointerGetAttribute\fP, \fBcuPointerSetAttribute\fP, cudaPointerGetAttributes 
.RE
.PP
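A minimal usage sketch (error handling omitted) querying several attributes in one call:
.PP
.nf
#include <cuda.h>

static void query_pointer(CUdeviceptr ptr)
{
    CUcontext ctx        = NULL;
    unsigned int memType = 0;
    CUdeviceptr devView  = 0;

    CUpointer_attribute attrs[3] = {
        CU_POINTER_ATTRIBUTE_CONTEXT,
        CU_POINTER_ATTRIBUTE_MEMORY_TYPE,
        CU_POINTER_ATTRIBUTE_DEVICE_POINTER
    };
    void *data[3] = { &ctx, &memType, &devView };

    /* An unrecognized pointer does not fail; outputs get default values. */
    cuPointerGetAttributes(3, attrs, data, ptr);
}
.fi
.PP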

.SS "\fBCUresult\fP cuPointerSetAttribute (const void * value, \fBCUpointer_attribute\fP attribute, \fBCUdeviceptr\fP ptr)"
.PP
The supported attributes are:
.PP
.IP "\(bu" 2
\fBCU_POINTER_ATTRIBUTE_SYNC_MEMOPS\fP:
.PP
.PP
A boolean attribute that can either be set (1) or unset (0). When set, the region of memory that \fCptr\fP points to is guaranteed to always synchronize memory operations that are synchronous. If there are some previously initiated synchronous memory operations that are pending when this attribute is set, the function does not return until those memory operations are complete. See further documentation in the section titled 'API synchronization behavior' to learn more about cases when synchronous memory operations can exhibit asynchronous behavior. \fCvalue\fP will be considered as a pointer to an unsigned integer to which this attribute is to be set.
.PP
\fBParameters:\fP
.RS 4
\fIvalue\fP - Pointer to memory containing the value to be set 
.br
\fIattribute\fP - Pointer attribute to set 
.br
\fIptr\fP - Pointer to a memory region allocated using CUDA memory allocation APIs
.RE
.PP
\fBReturns:\fP
.RS 4
\fBCUDA_SUCCESS\fP, \fBCUDA_ERROR_DEINITIALIZED\fP, \fBCUDA_ERROR_NOT_INITIALIZED\fP, \fBCUDA_ERROR_INVALID_CONTEXT\fP, \fBCUDA_ERROR_INVALID_VALUE\fP, \fBCUDA_ERROR_INVALID_DEVICE\fP 
.RE
.PP
\fBNote:\fP
.RS 4
Note that this function may also return error codes from previous, asynchronous launches.
.RE
.PP
\fBSee also:\fP
.RS 4
\fBcuPointerGetAttribute\fP, \fBcuPointerGetAttributes\fP, \fBcuMemAlloc\fP, \fBcuMemFree\fP, \fBcuMemAllocHost\fP, \fBcuMemFreeHost\fP, \fBcuMemHostAlloc\fP, \fBcuMemHostRegister\fP, \fBcuMemHostUnregister\fP 
.RE
.PP
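A minimal usage sketch (assuming \fCptr\fP was obtained from a CUDA memory allocation API; error handling omitted):
.PP
.nf
#include <cuda.h>

static CUresult require_sync_memops(CUdeviceptr ptr)
{
    unsigned int enable = 1;   /* 1 = set, 0 = unset */

    /* Force synchronous memory operations on this allocation to
       always synchronize. */
    return cuPointerSetAttribute(&enable,
                                 CU_POINTER_ATTRIBUTE_SYNC_MEMOPS, ptr);
}
.fi
.PP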

.SH "Author"
.PP 
Generated automatically by Doxygen from the source code.