rocBLAS Types¶
Definitions¶
rocblas_int¶

typedef int32_t
rocblas_int
¶ Specifies whether int32 is used (LP64) or int64 is used (ILP64).
rocblas_stride¶

typedef int64_t
rocblas_stride
¶ Stride between matrices or vectors in strided_batched functions.
rocblas_bfloat16¶

struct
rocblas_bfloat16
¶ Struct to represent a 16-bit brain floating-point number.
rocblas_float_complex¶

struct
rocblas_float_complex
¶ Struct to represent a complex number with single precision real and imaginary parts.
rocblas_double_complex¶

struct
rocblas_double_complex
¶ Struct to represent a complex number with double precision real and imaginary parts.
rocblas_handle¶

typedef struct _rocblas_handle *
rocblas_handle
¶ rocblas_handle is a structure holding the rocblas library context. It must be initialized using rocblas_create_handle() and the returned handle must be passed to all subsequent library function calls. It should be destroyed at the end using rocblas_destroy_handle().
Enums¶
Enumeration constants have numbering that is consistent with CBLAS, ACML and most standard C BLAS libraries.
rocblas_operation¶

enum
rocblas_operation
¶ Used to specify whether the matrix is to be transposed or not.
Values:

enumerator
rocblas_operation_none
¶ Operate with the matrix.

enumerator
rocblas_operation_transpose
¶ Operate with the transpose of the matrix.

enumerator
rocblas_operation_conjugate_transpose
¶ Operate with the conjugate transpose of the matrix.

rocblas_fill¶
rocblas_diagonal¶
rocblas_side¶

enum
rocblas_side
¶ Indicates the side matrix A is located relative to matrix B during multiplication.
Values:

enumerator
rocblas_side_left
¶ Multiply general matrix by symmetric, Hermitian or triangular matrix on the left.

enumerator
rocblas_side_right
¶ Multiply general matrix by symmetric, Hermitian or triangular matrix on the right.

enumerator
rocblas_side_both
¶

rocblas_status¶

enum
rocblas_status
¶ rocblas status codes definition
Values:

enumerator
rocblas_status_success
¶ success

enumerator
rocblas_status_invalid_handle
¶ handle not initialized, invalid or null

enumerator
rocblas_status_not_implemented
¶ function is not implemented

enumerator
rocblas_status_invalid_pointer
¶ invalid pointer argument

enumerator
rocblas_status_invalid_size
¶ invalid size argument

enumerator
rocblas_status_memory_error
¶ failed internal memory allocation, copy or dealloc

enumerator
rocblas_status_internal_error
¶ other internal library failure

enumerator
rocblas_status_perf_degraded
¶ performance degraded due to low device memory

enumerator
rocblas_status_size_query_mismatch
¶ unmatched start/stop size query

enumerator
rocblas_status_size_increased
¶ queried device memory size increased

enumerator
rocblas_status_size_unchanged
¶ queried device memory size unchanged

enumerator
rocblas_status_invalid_value
¶ passed argument not valid

enumerator
rocblas_status_continue
¶ nothing prevents the function from proceeding

enumerator
rocblas_status_check_numerics_fail
¶ will be set if the vector/matrix has a NaN or an Infinity

rocblas_datatype¶

enum
rocblas_datatype
¶ Indicates the precision width of data stored in a blas type.
Values:

enumerator
rocblas_datatype_f16_r
¶ 16 bit floating point, real

enumerator
rocblas_datatype_f32_r
¶ 32 bit floating point, real

enumerator
rocblas_datatype_f64_r
¶ 64 bit floating point, real

enumerator
rocblas_datatype_f16_c
¶ 16 bit floating point, complex

enumerator
rocblas_datatype_f32_c
¶ 32 bit floating point, complex

enumerator
rocblas_datatype_f64_c
¶ 64 bit floating point, complex

enumerator
rocblas_datatype_i8_r
¶ 8 bit signed integer, real

enumerator
rocblas_datatype_u8_r
¶ 8 bit unsigned integer, real

enumerator
rocblas_datatype_i32_r
¶ 32 bit signed integer, real

enumerator
rocblas_datatype_u32_r
¶ 32 bit unsigned integer, real

enumerator
rocblas_datatype_i8_c
¶ 8 bit signed integer, complex

enumerator
rocblas_datatype_u8_c
¶ 8 bit unsigned integer, complex

enumerator
rocblas_datatype_i32_c
¶ 32 bit signed integer, complex

enumerator
rocblas_datatype_u32_c
¶ 32 bit unsigned integer, complex

enumerator
rocblas_datatype_bf16_r
¶ 16 bit bfloat, real

enumerator
rocblas_datatype_bf16_c
¶ 16 bit bfloat, complex

rocblas_pointer_mode¶

enum
rocblas_pointer_mode
¶ Indicates if scalar pointers are on host or device. This is used for scalars alpha and beta and for scalar function return values.
Values:

enumerator
rocblas_pointer_mode_host
¶ Scalar values affected by this variable will be located on the host.

enumerator
rocblas_pointer_mode_device
¶ Scalar values affected by this variable will be located on the device.

rocblas_atomics_mode¶

enum
rocblas_atomics_mode
¶ Indicates whether atomic operations are allowed. Disallowing atomic operations may generally improve determinism and repeatability of results at a cost of performance.
Values:

enumerator
rocblas_atomics_not_allowed
¶ Algorithms will refrain from atomics where applicable.

enumerator
rocblas_atomics_allowed
¶ Algorithms will take advantage of atomics where applicable.

rocblas_layer_mode¶

enum
rocblas_layer_mode
¶ Bitmask indicating which logging layers are active.
Values:

enumerator
rocblas_layer_mode_none
¶ No logging will take place.

enumerator
rocblas_layer_mode_log_trace
¶ A line containing the function name and value of arguments passed will be printed with each rocBLAS function call.

enumerator
rocblas_layer_mode_log_bench
¶ Outputs a line each time a rocBLAS function is called. This line can be used with rocblas-bench to make the same call again.

enumerator
rocblas_layer_mode_log_profile
¶ Outputs a YAML description of each rocBLAS function called, along with its arguments and number of times it was called.
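Because rocblas_layer_mode is a bitmask, several logging layers can be combined with bitwise OR and tested with bitwise AND. The sketch below illustrates the idea with a local stand-in enum, demo_layer_mode. The numeric values 0x0, 0x1, 0x2 and 0x4 and the helper demo_layer_is_active are assumptions made for illustration, and real code should use the enumerators from the rocBLAS header.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for rocblas_layer_mode; the numeric values are assumed. */
typedef enum demo_layer_mode_ {
    demo_layer_mode_none        = 0x0,
    demo_layer_mode_log_trace   = 0x1,
    demo_layer_mode_log_bench   = 0x2,
    demo_layer_mode_log_profile = 0x4
} demo_layer_mode;

/* Returns true when every bit of `layer` is set in `mask`. */
bool demo_layer_is_active(unsigned mask, demo_layer_mode layer)
{
    assert(layer != demo_layer_mode_none); /* "none" means no bits are set */
    return (mask & (unsigned)layer) == (unsigned)layer;
}
```

With mask = demo_layer_mode_log_trace | demo_layer_mode_log_profile, the helper reports the trace and profile layers as active and the bench layer as inactive.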

rocBLAS Functions¶
Level 1 BLAS¶
rocblas_iXamax + batched, strided_batched¶

rocblas_status
rocblas_isamax
(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, rocblas_int *result)¶

rocblas_status
rocblas_idamax
(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, rocblas_int *result)¶

rocblas_status
rocblas_icamax
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_int *result)¶

rocblas_status
rocblas_izamax
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_int *result)¶ BLAS Level 1 API.
amax finds the first index of the element of maximum magnitude of a vector x.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in x.
x – [in] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
result – [inout] device pointer or host pointer to store the amax index. return is 0 if n <= 0 or incx <= 0.
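The amax semantics can be sketched on the host as follows. This is a reference illustration, not the rocBLAS implementation: ref_isamax is a hypothetical name, and the 1-based result index is an assumption taken from the reference BLAS convention (it is consistent with 0 being returned for n <= 0 or incx <= 0).

```c
#include <assert.h>
#include <stddef.h>

static float abs_f32(float v) { return v < 0.0f ? -v : v; }

/* Hypothetical host-side reference for isamax: returns the 1-based index
   (assumed convention) of the first element with the largest absolute value,
   or 0 if n <= 0 or incx <= 0. */
int ref_isamax(int n, const float *x, int incx)
{
    if (n <= 0 || incx <= 0)
        return 0;
    assert(x != NULL);
    int best = 1;
    float best_mag = abs_f32(x[0]);
    for (int i = 1; i < n; ++i) {
        float mag = abs_f32(x[(size_t)i * (size_t)incx]);
        if (mag > best_mag) { /* strict '>' keeps the first occurrence */
            best_mag = mag;
            best = i + 1;
        }
    }
    return best;
}
```

For x = {1, -7, 3, 7} with n = 4 and incx = 1, the sketch returns 2, because -7 and 7 tie in magnitude and the first one wins.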

rocblas_status
rocblas_isamax_batched
(rocblas_handle handle, rocblas_int n, const float *const x[], rocblas_int incx, rocblas_int batch_count, rocblas_int *result)¶

rocblas_status
rocblas_idamax_batched
(rocblas_handle handle, rocblas_int n, const double *const x[], rocblas_int incx, rocblas_int batch_count, rocblas_int *result)¶

rocblas_status
rocblas_icamax_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *const x[], rocblas_int incx, rocblas_int batch_count, rocblas_int *result)¶

rocblas_status
rocblas_izamax_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *const x[], rocblas_int incx, rocblas_int batch_count, rocblas_int *result)¶ BLAS Level 1 API.
amax_batched finds the first index of the element of maximum magnitude of each vector x_i in a batch, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] number of elements in each vector x_i
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. incx must be > 0.
batch_count – [in] [rocblas_int] number of instances in the batch, must be > 0.
result – [out] device or host array of batch_count size for results. return is 0 if n <= 0 or incx <= 0.

rocblas_status
rocblas_isamax_strided_batched
(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, rocblas_int *result)¶

rocblas_status
rocblas_idamax_strided_batched
(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, rocblas_int *result)¶

rocblas_status
rocblas_icamax_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, rocblas_int *result)¶

rocblas_status
rocblas_izamax_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, rocblas_int *result)¶ BLAS Level 1 API.
amax_strided_batched finds the first index of the element of maximum magnitude of each vector x_i in a batch, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] number of elements in each vector x_i
x – [in] device pointer to the first vector x_1.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. incx must be > 0.
stridex – [in] [rocblas_stride] specifies the pointer increment between one x_i and the next x_(i + 1).
batch_count – [in] [rocblas_int] number of instances in the batch
result – [out] device or host pointer for storing contiguous batch_count results. return is 0 if n <= 0 or incx <= 0.
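The strided_batched layout can be illustrated with a host-side sketch: vector x_i starts stridex elements after the previous vector, and result holds one index per batch instance. The name ref_isamax_strided_batched is hypothetical, the 1-based index is an assumed convention, and the loop is a plain CPU reference rather than the library's GPU kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static float abs_f32(float v) { return v < 0.0f ? -v : v; }

/* Hypothetical host-side reference: batch b reads from x + b * stridex
   and writes its (assumed 1-based) index to result[b]. */
void ref_isamax_strided_batched(int n, const float *x, int incx,
                                int64_t stridex, int batch_count, int *result)
{
    assert(result != NULL || batch_count <= 0);
    for (int b = 0; b < batch_count; ++b) {
        result[b] = 0; /* value reported when n <= 0 or incx <= 0 */
        if (n <= 0 || incx <= 0)
            continue;
        const float *xb = x + (int64_t)b * stridex;
        float best = abs_f32(xb[0]);
        result[b] = 1;
        for (int i = 1; i < n; ++i) {
            float mag = abs_f32(xb[(size_t)i * (size_t)incx]);
            if (mag > best) {
                best = mag;
                result[b] = i + 1;
            }
        }
    }
}
```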
rocblas_iXamin + batched, strided_batched¶

rocblas_status
rocblas_isamin
(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, rocblas_int *result)¶

rocblas_status
rocblas_idamin
(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, rocblas_int *result)¶

rocblas_status
rocblas_icamin
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_int *result)¶

rocblas_status
rocblas_izamin
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_int *result)¶ BLAS Level 1 API.
amin finds the first index of the element of minimum magnitude of a vector x.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in x.
x – [in] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
result – [inout] device pointer or host pointer to store the amin index. return is 0 if n <= 0 or incx <= 0.
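For the complex variants, a host-side sketch of amin might look like the following. It assumes that the magnitude of a complex element is |re| + |im|, as in reference BLAS, and that the returned index is 1-based. Both points are assumptions, as are the stand-in type demo_float_complex (used here in place of rocblas_float_complex) and the name ref_icamin.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for rocblas_float_complex. */
typedef struct { float re, im; } demo_float_complex;

static float abs_f32(float v) { return v < 0.0f ? -v : v; }

/* Assumed magnitude measure: |re| + |im| (reference BLAS convention). */
static float demo_abs1(demo_float_complex z) { return abs_f32(z.re) + abs_f32(z.im); }

/* Hypothetical host-side reference for icamin: (assumed 1-based) index of
   the first element of minimum magnitude, or 0 if n <= 0 or incx <= 0. */
int ref_icamin(int n, const demo_float_complex *x, int incx)
{
    if (n <= 0 || incx <= 0)
        return 0;
    assert(x != NULL);
    int best = 1;
    float best_mag = demo_abs1(x[0]);
    for (int i = 1; i < n; ++i) {
        float mag = demo_abs1(x[(size_t)i * (size_t)incx]);
        if (mag < best_mag) { /* strict '<' keeps the first occurrence */
            best_mag = mag;
            best = i + 1;
        }
    }
    return best;
}
```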

rocblas_status
rocblas_isamin_batched
(rocblas_handle handle, rocblas_int n, const float *const x[], rocblas_int incx, rocblas_int batch_count, rocblas_int *result)¶

rocblas_status
rocblas_idamin_batched
(rocblas_handle handle, rocblas_int n, const double *const x[], rocblas_int incx, rocblas_int batch_count, rocblas_int *result)¶

rocblas_status
rocblas_icamin_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *const x[], rocblas_int incx, rocblas_int batch_count, rocblas_int *result)¶

rocblas_status
rocblas_izamin_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *const x[], rocblas_int incx, rocblas_int batch_count, rocblas_int *result)¶ BLAS Level 1 API.
amin_batched finds the first index of the element of minimum magnitude of each vector x_i in a batch, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] number of elements in each vector x_i
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. incx must be > 0.
batch_count – [in] [rocblas_int] number of instances in the batch, must be > 0.
result – [out] device or host array of batch_count size for results. return is 0 if n <= 0 or incx <= 0.

rocblas_status
rocblas_isamin_strided_batched
(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, rocblas_int *result)¶

rocblas_status
rocblas_idamin_strided_batched
(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, rocblas_int *result)¶

rocblas_status
rocblas_icamin_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, rocblas_int *result)¶

rocblas_status
rocblas_izamin_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, rocblas_int *result)¶ BLAS Level 1 API.
amin_strided_batched finds the first index of the element of minimum magnitude of each vector x_i in a batch, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] number of elements in each vector x_i
x – [in] device pointer to the first vector x_1.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. incx must be > 0.
stridex – [in] [rocblas_stride] specifies the pointer increment between one x_i and the next x_(i + 1)
batch_count – [in] [rocblas_int] number of instances in the batch
result – [out] device or host pointer to array for storing contiguous batch_count results. return is 0 if n <= 0 or incx <= 0.
rocblas_Xasum + batched, strided_batched¶

rocblas_status
rocblas_sasum
(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, float *result)¶

rocblas_status
rocblas_dasum
(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, double *result)¶

rocblas_status
rocblas_scasum
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, float *result)¶

rocblas_status
rocblas_dzasum
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, double *result)¶ BLAS Level 1 API.
asum computes the sum of the magnitudes of elements of a real vector x, or the sum of magnitudes of the real and imaginary parts of elements if x is a complex vector
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in x.
x – [in] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment for the elements of x. incx must be > 0.
result – [inout] device pointer or host pointer to store the asum result. return is 0.0 if n <= 0.
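The complex case of asum, where the real and imaginary magnitudes are summed separately, can be sketched on the host as follows. The type demo_float_complex is a hypothetical stand-in for rocblas_float_complex and ref_scasum is a hypothetical name. The code is a CPU reference, not the library implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for rocblas_float_complex. */
typedef struct { float re, im; } demo_float_complex;

static float abs_f32(float v) { return v < 0.0f ? -v : v; }

/* Hypothetical host-side reference for scasum: sum of |re| + |im| over the
   n strided elements of x; returns 0.0 if n <= 0 or incx <= 0. */
float ref_scasum(int n, const demo_float_complex *x, int incx)
{
    if (n <= 0 || incx <= 0)
        return 0.0f;
    assert(x != NULL);
    float sum = 0.0f;
    for (int i = 0; i < n; ++i) {
        const demo_float_complex z = x[(size_t)i * (size_t)incx];
        sum += abs_f32(z.re) + abs_f32(z.im);
    }
    return sum;
}
```

For x = {3 - 4i, 1 + 2i} the sketch yields 3 + 4 + 1 + 2 = 10, which differs from the sum of Euclidean moduli (5 + sqrt(5)).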

rocblas_status
rocblas_sasum_batched
(rocblas_handle handle, rocblas_int n, const float *const x[], rocblas_int incx, rocblas_int batch_count, float *results)¶

rocblas_status
rocblas_dasum_batched
(rocblas_handle handle, rocblas_int n, const double *const x[], rocblas_int incx, rocblas_int batch_count, double *results)¶

rocblas_status
rocblas_scasum_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *const x[], rocblas_int incx, rocblas_int batch_count, float *results)¶

rocblas_status
rocblas_dzasum_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *const x[], rocblas_int incx, rocblas_int batch_count, double *results)¶ BLAS Level 1 API.
asum_batched computes the sum of the magnitudes of the elements in a batch of real vectors x_i, or the sum of magnitudes of the real and imaginary parts of elements if x_i is a complex vector, for i = 1, …, batch_count
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] number of elements in each vector x_i
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. incx must be > 0.
results – [out] device array or host array of batch_count size for results. return is 0.0 if n <= 0 or incx <= 0.
batch_count – [in] [rocblas_int] number of instances in the batch.

rocblas_status
rocblas_sasum_strided_batched
(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, float *results)¶

rocblas_status
rocblas_dasum_strided_batched
(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, double *results)¶

rocblas_status
rocblas_scasum_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, float *results)¶

rocblas_status
rocblas_dzasum_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, double *results)¶ BLAS Level 1 API.
asum_strided_batched computes the sum of the magnitudes of elements of real vectors x_i, or the sum of magnitudes of the real and imaginary parts of elements if x_i is a complex vector, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] number of elements in each vector x_i
x – [in] device pointer to the first vector x_1.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. incx must be > 0.
stridex – [in] [rocblas_stride] stride from the start of one vector (x_i) to the next one (x_i+1). There are no restrictions placed on stridex. However, the user should take care to ensure that stridex is of appropriate size; for a typical case this means stridex >= n * incx.
results – [out] device pointer or host pointer to array for storing contiguous batch_count results. return is 0.0 if n <= 0 or incx <= 0.
batch_count – [in] [rocblas_int] number of instances in the batch
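The guidance on stridex can be turned into two small host-side checks. The sketch below computes how many elements a strided batch spans, counted from the first element of x_1, and tests the typical condition stridex >= n * incx. Both helper names are hypothetical, and the span formula assumes positive n, incx and batch_count and a non-negative stridex.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of elements from the first element of x_1 to
   the last element of the last vector, inclusive. Assumes n > 0, incx > 0,
   stridex >= 0 and batch_count > 0. */
int64_t strided_batch_extent(int n, int incx, int64_t stridex, int batch_count)
{
    assert(n > 0 && incx > 0 && stridex >= 0 && batch_count > 0);
    return (int64_t)(batch_count - 1) * stridex + (int64_t)(n - 1) * incx + 1;
}

/* Hypothetical helper: nonzero when the typical condition
   stridex >= n * incx holds. */
int strided_batch_meets_typical_stride(int n, int incx, int64_t stridex)
{
    return stridex >= (int64_t)n * incx;
}
```

For n = 3, incx = 2, stridex = 6 and batch_count = 4, the span is 3 * 6 + 2 * 2 + 1 = 23 elements, so a buffer of at least that many elements would be needed under these assumptions.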
rocblas_Xaxpy + batched, strided_batched¶

rocblas_status
rocblas_saxpy
(rocblas_handle handle, rocblas_int n, const float *alpha, const float *x, rocblas_int incx, float *y, rocblas_int incy)¶

rocblas_status
rocblas_daxpy
(rocblas_handle handle, rocblas_int n, const double *alpha, const double *x, rocblas_int incx, double *y, rocblas_int incy)¶

rocblas_status
rocblas_haxpy
(rocblas_handle handle, rocblas_int n, const rocblas_half *alpha, const rocblas_half *x, rocblas_int incx, rocblas_half *y, rocblas_int incy)¶

rocblas_status
rocblas_caxpy
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *x, rocblas_int incx, rocblas_float_complex *y, rocblas_int incy)¶

rocblas_status
rocblas_zaxpy
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *x, rocblas_int incx, rocblas_double_complex *y, rocblas_int incy)¶ BLAS Level 1 API.
axpy computes constant alpha multiplied by vector x, plus vector y
y := alpha * x + y
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in x and y.
alpha – [in] device pointer or host pointer to specify the scalar alpha.
x – [in] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
y – [inout] device pointer storing vector y.
incy – [in] [rocblas_int] specifies the increment for the elements of y.
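The update y := alpha * x + y with element increments can be sketched on the host as below. This is a reference illustration only: ref_saxpy is a hypothetical name, alpha is passed by value rather than through a host or device pointer, and the sketch is restricted to positive increments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side reference for saxpy. y is both read and written.
   Sketch assumption: incx > 0 and incy > 0. */
void ref_saxpy(int n, float alpha, const float *x, int incx, float *y, int incy)
{
    if (n <= 0)
        return;
    assert(incx > 0 && incy > 0);
    for (int i = 0; i < n; ++i)
        y[(size_t)i * (size_t)incy] += alpha * x[(size_t)i * (size_t)incx];
}
```

With alpha = 2, x = {1, 2, 3} and y = {10, 20, 30}, the sketch leaves y = {12, 24, 36}.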

rocblas_status
rocblas_saxpy_batched
(rocblas_handle handle, rocblas_int n, const float *alpha, const float *const x[], rocblas_int incx, float *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_daxpy_batched
(rocblas_handle handle, rocblas_int n, const double *alpha, const double *const x[], rocblas_int incx, double *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_haxpy_batched
(rocblas_handle handle, rocblas_int n, const rocblas_half *alpha, const rocblas_half *const x[], rocblas_int incx, rocblas_half *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_caxpy_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *const x[], rocblas_int incx, rocblas_float_complex *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_zaxpy_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *const x[], rocblas_int incx, rocblas_double_complex *const y[], rocblas_int incy, rocblas_int batch_count)¶ BLAS Level 1 API.
axpy_batched computes y_i := alpha * x_i + y_i for each pair of vectors (x_i, y_i) in the batch, for i = 1, …, batch_count.
 Parameters
handle – [in] rocblas_handle handle to the rocblas library context queue.
n – [in] rocblas_int the number of elements in each x_i and y_i.
alpha – [in] specifies the scalar alpha.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] rocblas_int specifies the increment for the elements of each x_i.
y – [inout] device array of device pointers storing each vector y_i.
incy – [in] rocblas_int specifies the increment for the elements of each y_i.
batch_count – [in] rocblas_int number of instances in the batch
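In the batched form, x and y are arrays of pointers, one pointer per batch instance, so the vectors need not be contiguous in memory. A host-side sketch of this layout follows. The name ref_saxpy_batched is hypothetical, alpha is passed by value for brevity, and only positive increments are handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side reference for saxpy_batched: x[b] and y[b] point
   to the vectors of batch instance b. Sketch assumption: incx, incy > 0. */
void ref_saxpy_batched(int n, float alpha, const float *const x[], int incx,
                       float *const y[], int incy, int batch_count)
{
    if (n <= 0 || batch_count <= 0)
        return;
    assert(incx > 0 && incy > 0);
    for (int b = 0; b < batch_count; ++b)
        for (int i = 0; i < n; ++i)
            y[b][(size_t)i * (size_t)incy] += alpha * x[b][(size_t)i * (size_t)incx];
}
```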

rocblas_status
rocblas_saxpy_strided_batched
(rocblas_handle handle, rocblas_int n, const float *alpha, const float *x, rocblas_int incx, rocblas_stride stridex, float *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_daxpy_strided_batched
(rocblas_handle handle, rocblas_int n, const double *alpha, const double *x, rocblas_int incx, rocblas_stride stridex, double *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_haxpy_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_half *alpha, const rocblas_half *x, rocblas_int incx, rocblas_stride stridex, rocblas_half *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_caxpy_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_float_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_zaxpy_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_double_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶ BLAS Level 1 API.
axpy_strided_batched computes y_i := alpha * x_i + y_i for each pair of vectors (x_i, y_i) in the strided batch, for i = 1, …, batch_count.
 Parameters
handle – [in] rocblas_handle handle to the rocblas library context queue.
n – [in] rocblas_int the number of elements in each x_i and y_i.
alpha – [in] specifies the scalar alpha.
x – [in] device pointer to the first vector x_1.
incx – [in] rocblas_int specifies the increment for the elements of each x_i.
stridex – [in] rocblas_stride specifies the stride from the start of one vector (x_i) to the next one (x_i+1).
y – [inout] device pointer to the first vector y_1.
incy – [in] rocblas_int specifies the increment for the elements of each y_i.
stridey – [in] rocblas_stride specifies the stride from the start of one vector (y_i) to the next one (y_i+1).
batch_count – [in] rocblas_int number of instances in the batch
rocblas_Xcopy + batched, strided_batched¶

rocblas_status
rocblas_scopy
(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, float *y, rocblas_int incy)¶

rocblas_status
rocblas_dcopy
(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, double *y, rocblas_int incy)¶

rocblas_status
rocblas_ccopy
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_float_complex *y, rocblas_int incy)¶

rocblas_status
rocblas_zcopy
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_double_complex *y, rocblas_int incy)¶ BLAS Level 1 API.
copy copies each element x[i] into y[i], for i = 1 , … , n
y := x
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in x to be copied to y.
x – [in] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
y – [out] device pointer storing vector y.
incy – [in] [rocblas_int] specifies the increment for the elements of y.
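The effect of different increments on copy can be seen in a small host-side sketch. The name ref_scopy is hypothetical, and the sketch handles only positive increments. It is a CPU reference, not the rocBLAS implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side reference for scopy: copies n elements, reading
   every incx-th element of x and writing every incy-th element of y.
   Sketch assumption: incx > 0 and incy > 0. */
void ref_scopy(int n, const float *x, int incx, float *y, int incy)
{
    if (n <= 0)
        return;
    assert(incx > 0 && incy > 0);
    for (int i = 0; i < n; ++i)
        y[(size_t)i * (size_t)incy] = x[(size_t)i * (size_t)incx];
}
```

With x = {1, 2, 3, 4, 5, 6}, n = 3, incx = 2 and incy = 1, the sketch gathers the strided elements into y = {1, 3, 5}.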

rocblas_status
rocblas_scopy_batched
(rocblas_handle handle, rocblas_int n, const float *const x[], rocblas_int incx, float *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_dcopy_batched
(rocblas_handle handle, rocblas_int n, const double *const x[], rocblas_int incx, double *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_ccopy_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *const x[], rocblas_int incx, rocblas_float_complex *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_zcopy_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *const x[], rocblas_int incx, rocblas_double_complex *const y[], rocblas_int incy, rocblas_int batch_count)¶ BLAS Level 1 API.
copy_batched copies each element x_i[j] into y_i[j], for j = 1 , … , n; i = 1 , … , batch_count
y_i := x_i,
where (x_i, y_i) is the ith instance of the batch. x_i and y_i are vectors.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in each x_i to be copied to y_i.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment for the elements of each vector x_i.
y – [out] device array of device pointers storing each vector y_i.
incy – [in] [rocblas_int] specifies the increment for the elements of each vector y_i.
batch_count – [in] [rocblas_int] number of instances in the batch

rocblas_status
rocblas_scopy_strided_batched
(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, rocblas_stride stridex, float *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_dcopy_strided_batched
(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, rocblas_stride stridex, double *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_ccopy_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_float_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_zcopy_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_double_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶ BLAS Level 1 API.
copy_strided_batched copies each element x_i[j] into y_i[j], for j = 1 , … , n; i = 1 , … , batch_count
y_i := x_i,
where (x_i, y_i) is the ith instance of the batch. x_i and y_i are vectors.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in each x_i to be copied to y_i.
x – [in] device pointer to the first vector (x_1) in the batch.
incx – [in] [rocblas_int] specifies the increments for the elements of vectors x_i.
stridex – [in] [rocblas_stride] stride from the start of one vector (x_i) to the next one (x_i+1). There are no restrictions placed on stridex. However, the user should take care to ensure that stridex is of appropriate size; for a typical case this means stridex >= n * incx.
y – [out] device pointer to the first vector (y_1) in the batch.
incy – [in] [rocblas_int] specifies the increment for the elements of vectors y_i.
stridey – [in] [rocblas_stride] stride from the start of one vector (y_i) to the next one (y_i+1). There are no restrictions placed on stridey. However, the user should take care to ensure that stridey is of appropriate size; for a typical case this means stridey >= n * incy. stridey should be nonzero.
batch_count – [in] [rocblas_int] number of instances in the batch
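Source and destination batches may use different strides. The host-side sketch below copies two vectors packed back to back (stridex = 2) into a destination with padding between vectors (stridey = 3). The name ref_scopy_strided_batched is hypothetical and only positive increments are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side reference for scopy_strided_batched: batch b reads
   from x + b * stridex and writes to y + b * stridey.
   Sketch assumption: incx > 0 and incy > 0. */
void ref_scopy_strided_batched(int n, const float *x, int incx, int64_t stridex,
                               float *y, int incy, int64_t stridey, int batch_count)
{
    if (n <= 0 || batch_count <= 0)
        return;
    assert(incx > 0 && incy > 0);
    for (int b = 0; b < batch_count; ++b) {
        const float *xb = x + (int64_t)b * stridex;
        float *yb = y + (int64_t)b * stridey;
        for (int i = 0; i < n; ++i)
            yb[(size_t)i * (size_t)incy] = xb[(size_t)i * (size_t)incx];
    }
}
```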
rocblas_Xdot + batched, strided_batched¶

rocblas_status
rocblas_sdot
(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, const float *y, rocblas_int incy, float *result)¶

rocblas_status
rocblas_ddot
(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, const double *y, rocblas_int incy, double *result)¶

rocblas_status
rocblas_hdot
(rocblas_handle handle, rocblas_int n, const rocblas_half *x, rocblas_int incx, const rocblas_half *y, rocblas_int incy, rocblas_half *result)¶

rocblas_status
rocblas_bfdot
(rocblas_handle handle, rocblas_int n, const rocblas_bfloat16 *x, rocblas_int incx, const rocblas_bfloat16 *y, rocblas_int incy, rocblas_bfloat16 *result)¶

rocblas_status
rocblas_cdotu
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, const rocblas_float_complex *y, rocblas_int incy, rocblas_float_complex *result)¶

rocblas_status
rocblas_cdotc
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, const rocblas_float_complex *y, rocblas_int incy, rocblas_float_complex *result)¶

rocblas_status
rocblas_zdotu
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, const rocblas_double_complex *y, rocblas_int incy, rocblas_double_complex *result)¶

rocblas_status
rocblas_zdotc
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, const rocblas_double_complex *y, rocblas_int incy, rocblas_double_complex *result)¶ BLAS Level 1 API.
dot(u) performs the dot product of vectors x and y
result = x * y;
dotc performs the dot product of the conjugate of complex vector x and complex vector y
result = conjugate (x) * y;
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in x and y.
x – [in] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
y – [in] device pointer storing vector y.
incy – [in] [rocblas_int] specifies the increment for the elements of y.
result – [inout] device pointer or host pointer to store the dot product. return is 0.0 if n <= 0.
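The difference between dotu and dotc can be sketched on the host for complex inputs. The type demo_float_complex is a hypothetical stand-in for rocblas_float_complex, ref_cdot is a hypothetical name, and its conjugate_x flag selects between the two variants purely for illustration (the library exposes them as separate functions). Only positive increments are handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for rocblas_float_complex. */
typedef struct { float re, im; } demo_float_complex;

/* Hypothetical host-side reference: conjugate_x == 0 mimics dotu
   (sum of x_i * y_i), nonzero mimics dotc (sum of conj(x_i) * y_i).
   Returns 0 if n <= 0. Sketch assumption: incx > 0 and incy > 0. */
demo_float_complex ref_cdot(int n, const demo_float_complex *x, int incx,
                            const demo_float_complex *y, int incy, int conjugate_x)
{
    demo_float_complex acc = {0.0f, 0.0f};
    if (n <= 0)
        return acc;
    assert(incx > 0 && incy > 0);
    for (int i = 0; i < n; ++i) {
        demo_float_complex a = x[(size_t)i * (size_t)incx];
        const demo_float_complex b = y[(size_t)i * (size_t)incy];
        if (conjugate_x)
            a.im = -a.im;
        acc.re += a.re * b.re - a.im * b.im;
        acc.im += a.re * b.im + a.im * b.re;
    }
    return acc;
}
```

For x = {1 + 2i} and y = {3 + 4i}, the dotu path gives -5 + 10i and the dotc path gives 11 - 2i.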

rocblas_status
rocblas_sdot_batched
(rocblas_handle handle, rocblas_int n, const float *const x[], rocblas_int incx, const float *const y[], rocblas_int incy, rocblas_int batch_count, float *result)¶

rocblas_status
rocblas_ddot_batched
(rocblas_handle handle, rocblas_int n, const double *const x[], rocblas_int incx, const double *const y[], rocblas_int incy, rocblas_int batch_count, double *result)¶

rocblas_status
rocblas_hdot_batched
(rocblas_handle handle, rocblas_int n, const rocblas_half *const x[], rocblas_int incx, const rocblas_half *const y[], rocblas_int incy, rocblas_int batch_count, rocblas_half *result)¶

rocblas_status
rocblas_bfdot_batched
(rocblas_handle handle, rocblas_int n, const rocblas_bfloat16 *const x[], rocblas_int incx, const rocblas_bfloat16 *const y[], rocblas_int incy, rocblas_int batch_count, rocblas_bfloat16 *result)¶

rocblas_status
rocblas_cdotu_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *const x[], rocblas_int incx, const rocblas_float_complex *const y[], rocblas_int incy, rocblas_int batch_count, rocblas_float_complex *result)¶

rocblas_status
rocblas_cdotc_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *const x[], rocblas_int incx, const rocblas_float_complex *const y[], rocblas_int incy, rocblas_int batch_count, rocblas_float_complex *result)¶

rocblas_status
rocblas_zdotu_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *const x[], rocblas_int incx, const rocblas_double_complex *const y[], rocblas_int incy, rocblas_int batch_count, rocblas_double_complex *result)¶

rocblas_status
rocblas_zdotc_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *const x[], rocblas_int incx, const rocblas_double_complex *const y[], rocblas_int incy, rocblas_int batch_count, rocblas_double_complex *result)¶ BLAS Level 1 API.
dot_batched(u) performs a batch of dot products of vectors x and y
result_i = x_i * y_i;
dotc_batched performs a batch of dot products of the conjugate of complex vector x and complex vector y
result_i = conjugate (x_i) * y_i;
where (x_i, y_i) is the ith instance of the batch. x_i and y_i are vectors, for i = 1, …, batch_count
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in each x_i and y_i.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i.
y – [in] device array of device pointers storing each vector y_i.
incy – [in] [rocblas_int] specifies the increment for the elements of each y_i.
batch_count – [in] [rocblas_int] number of instances in the batch
result – [inout] device array or host array of batch_count size to store the dot products of each batch. return 0.0 for each element if n <= 0.

rocblas_status
rocblas_sdot_strided_batched
(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, rocblas_stride stridex, const float *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count, float *result)¶

rocblas_status
rocblas_ddot_strided_batched
(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, rocblas_stride stridex, const double *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count, double *result)¶

rocblas_status
rocblas_hdot_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_half *x, rocblas_int incx, rocblas_stride stridex, const rocblas_half *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count, rocblas_half *result)¶

rocblas_status
rocblas_bfdot_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_bfloat16 *x, rocblas_int incx, rocblas_stride stridex, const rocblas_bfloat16 *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count, rocblas_bfloat16 *result)¶

rocblas_status
rocblas_cdotu_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_float_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count, rocblas_float_complex *result)¶

rocblas_status
rocblas_cdotc_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_float_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count, rocblas_float_complex *result)¶

rocblas_status
rocblas_zdotu_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_double_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count, rocblas_double_complex *result)¶

rocblas_status
rocblas_zdotc_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_double_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count, rocblas_double_complex *result)¶ BLAS Level 1 API.
dot_strided_batched(u) performs a batch of dot products of vectors x and y
result_i = x_i * y_i;
dotc_strided_batched performs a batch of dot products of the conjugate of complex vector x and complex vector y
result_i = conjugate (x_i) * y_i;
where (x_i, y_i) is the ith instance of the batch. x_i and y_i are vectors, for i = 1, …, batch_count
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in each x_i and y_i.
x – [in] device pointer to the first vector (x_1) in the batch.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i.
stridex – [in] [rocblas_stride] stride from the start of one vector (x_i) to the start of the next one (x_(i+1)).
y – [in] device pointer to the first vector (y_1) in the batch.
incy – [in] [rocblas_int] specifies the increment for the elements of each y_i.
stridey – [in] [rocblas_stride] stride from the start of one vector (y_i) to the start of the next one (y_(i+1)).
batch_count – [in] [rocblas_int] number of instances in the batch
result – [inout] device array or host array of batch_count size to store the dot products of each batch. return 0.0 for each element if n <= 0.
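The strided batched addressing and the difference between the dotu and dotc variants can be illustrated with a small CPU reference. This is only a sketch of the documented semantics, not rocBLAS code: the function name `ref_cdot_strided_batched` and the `conjugate_x` switch are hypothetical, and positive increments are assumed.

```cpp
#include <cassert>
#include <complex>
#include <cstdint>

using cfloat = std::complex<float>;

// Hypothetical CPU reference (not rocBLAS code) for the strided batched complex dot product.
// conjugate_x = false models dotu, conjugate_x = true models dotc.
// Assumes incx > 0 and incy > 0 to keep the indexing simple.
void ref_cdot_strided_batched(int n,
                              const cfloat* x, int incx, std::int64_t stridex,
                              const cfloat* y, int incy, std::int64_t stridey,
                              int batch_count, bool conjugate_x, cfloat* result)
{
    assert(incx > 0 && incy > 0);
    for (int b = 0; b < batch_count; ++b) {
        const cfloat* xb = x + b * stridex; // start of vector x_i
        const cfloat* yb = y + b * stridey; // start of vector y_i
        cfloat acc(0.0f, 0.0f);             // stays 0 when n <= 0
        for (int k = 0; k < n; ++k) {
            const cfloat xv = xb[static_cast<std::int64_t>(k) * incx];
            const cfloat yv = yb[static_cast<std::int64_t>(k) * incy];
            acc += (conjugate_x ? std::conj(xv) : xv) * yv;
        }
        result[b] = acc; // one result per batch instance
    }
}
```

With n = 2, unit increments and both strides equal to 2, two vectors of each batch sit back to back in memory and `result` receives two values.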
rocblas_Xnrm2 + batched, strided_batched¶

rocblas_status
rocblas_snrm2
(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, float *result)¶

rocblas_status
rocblas_dnrm2
(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, double *result)¶

rocblas_status
rocblas_scnrm2
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, float *result)¶

rocblas_status
rocblas_dznrm2
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, double *result)¶ BLAS Level 1 API.
nrm2 computes the euclidean norm of a real or complex vector
result := sqrt( x'*x ) for real vectors
result := sqrt( x**H*x ) for complex vectors
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in x.
x – [in] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
result – [inout] device pointer or host pointer to store the nrm2 result. The result is 0.0 if n <= 0 or incx <= 0.
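As an illustration of the formula for the complex case, the following CPU sketch accumulates |x_k|^2 over the strided elements and takes the square root. The name `ref_dznrm2` is hypothetical and this is not how rocBLAS computes the norm on the GPU. A plain sum of squares is used here for clarity, so very large or very small inputs could overflow or underflow where a scaled algorithm would not.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Hypothetical CPU reference (not rocBLAS code): Euclidean norm of a strided complex vector.
// std::norm(z) returns |z|^2, which is the contribution of one element to x**H*x.
double ref_dznrm2(int n, const std::complex<double>* x, int incx)
{
    if (n <= 0 || incx <= 0)
        return 0.0; // mirrors the documented result for n <= 0 or incx <= 0
    assert(x != nullptr);
    double sum = 0.0;
    for (int k = 0; k < n; ++k)
        sum += std::norm(x[static_cast<long long>(k) * incx]);
    return std::sqrt(sum);
}
```

Note that the result type is real (`double`) even though the input is complex, matching the `rocblas_dznrm2` signature.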

rocblas_status
rocblas_snrm2_batched
(rocblas_handle handle, rocblas_int n, const float *const x[], rocblas_int incx, rocblas_int batch_count, float *results)¶

rocblas_status
rocblas_dnrm2_batched
(rocblas_handle handle, rocblas_int n, const double *const x[], rocblas_int incx, rocblas_int batch_count, double *results)¶

rocblas_status
rocblas_scnrm2_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *const x[], rocblas_int incx, rocblas_int batch_count, float *results)¶

rocblas_status
rocblas_dznrm2_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *const x[], rocblas_int incx, rocblas_int batch_count, double *results)¶ BLAS Level 1 API.
nrm2_batched computes the euclidean norm over a batch of real or complex vectors
result_i := sqrt( x_i'*x_i ) for real vectors x_i, for i = 1, ..., batch_count
result_i := sqrt( x_i**H*x_i ) for complex vectors x_i, for i = 1, ..., batch_count
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] number of elements in each x_i.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. incx must be > 0.
batch_count – [in] [rocblas_int] number of instances in the batch
results – [out] device pointer or host pointer to an array of batch_count elements for the nrm2 results. Each element is 0.0 if n <= 0 or incx <= 0.
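The batched variants take an array of pointers rather than one base pointer, so the vectors x_i do not need to be contiguous or equally spaced. A CPU sketch of that layout, with the hypothetical name `ref_snrm2_batched` and ordinary host memory standing in for device memory:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical CPU reference (not rocBLAS code) for the batched pointer-array layout.
// x[b] points to the first element of vector x_b; results[b] receives its Euclidean norm.
void ref_snrm2_batched(int n, const float* const x[], int incx,
                       int batch_count, float* results)
{
    assert(batch_count <= 0 || results != nullptr);
    for (int b = 0; b < batch_count; ++b) {
        float sum = 0.0f; // stays 0 when n <= 0 or incx <= 0
        if (incx > 0) {
            for (int k = 0; k < n; ++k) {
                const float v = x[b][static_cast<long long>(k) * incx];
                sum += v * v;
            }
        }
        results[b] = std::sqrt(sum);
    }
}
```

In the real API both the pointer array and the vectors it refers to live in device memory; only the indexing pattern carries over from this sketch.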

rocblas_status
rocblas_snrm2_strided_batched
(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, float *results)¶

rocblas_status
rocblas_dnrm2_strided_batched
(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, double *results)¶

rocblas_status
rocblas_scnrm2_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, float *results)¶

rocblas_status
rocblas_dznrm2_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, double *results)¶ BLAS Level 1 API.
nrm2_strided_batched computes the euclidean norm over a batch of real or complex vectors
result_i := sqrt( x_i'*x_i ) for real vectors x_i, for i = 1, ..., batch_count
result_i := sqrt( x_i**H*x_i ) for complex vectors x_i, for i = 1, ..., batch_count
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] number of elements in each x_i.
x – [in] device pointer to the first vector x_1.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. incx must be > 0.
stridex – [in] [rocblas_stride] stride from the start of one vector (x_i) to the start of the next one (x_(i+1)). There are no restrictions placed on stridex; however, the user should take care to ensure that stridex is of appropriate size. For a typical case this means stridex >= n * incx.
batch_count – [in] [rocblas_int] number of instances in the batch
results – [out] device pointer or host pointer to an array for storing batch_count contiguous results. Each element is 0.0 if n <= 0 or incx <= 0.
rocblas_Xrot + batched, strided_batched¶

rocblas_status
rocblas_srot
(rocblas_handle handle, rocblas_int n, float *x, rocblas_int incx, float *y, rocblas_int incy, const float *c, const float *s)¶

rocblas_status
rocblas_drot
(rocblas_handle handle, rocblas_int n, double *x, rocblas_int incx, double *y, rocblas_int incy, const double *c, const double *s)¶

rocblas_status
rocblas_crot
(rocblas_handle handle, rocblas_int n, rocblas_float_complex *x, rocblas_int incx, rocblas_float_complex *y, rocblas_int incy, const float *c, const rocblas_float_complex *s)¶

rocblas_status
rocblas_csrot
(rocblas_handle handle, rocblas_int n, rocblas_float_complex *x, rocblas_int incx, rocblas_float_complex *y, rocblas_int incy, const float *c, const float *s)¶

rocblas_status
rocblas_zrot
(rocblas_handle handle, rocblas_int n, rocblas_double_complex *x, rocblas_int incx, rocblas_double_complex *y, rocblas_int incy, const double *c, const rocblas_double_complex *s)¶

rocblas_status
rocblas_zdrot
(rocblas_handle handle, rocblas_int n, rocblas_double_complex *x, rocblas_int incx, rocblas_double_complex *y, rocblas_int incy, const double *c, const double *s)¶ BLAS Level 1 API.
rot applies the Givens rotation matrix defined by c=cos(alpha) and s=sin(alpha) to vectors x and y. Scalars c and s may be stored in either host or device memory, location is specified by calling rocblas_set_pointer_mode.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] number of elements in the x and y vectors.
x – [inout] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment between elements of x.
y – [inout] device pointer storing vector y.
incy – [in] [rocblas_int] specifies the increment between elements of y.
c – [in] device pointer or host pointer storing scalar cosine component of the rotation matrix.
s – [in] device pointer or host pointer storing scalar sine component of the rotation matrix.
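For the real-valued variants, applying the rotation amounts to replacing each pair (x_k, y_k) with (c*x_k + s*y_k, c*y_k - s*x_k), which is the standard BLAS rot update. The following CPU sketch shows this; `ref_drot` is a hypothetical name, positive increments are assumed, and the complex variants (which involve a conjugate of s) are not covered.

```cpp
#include <cassert>

// Hypothetical CPU reference (not rocBLAS code) for the real Givens rotation.
// Each pair (x_k, y_k) is replaced by (c*x_k + s*y_k, c*y_k - s*x_k).
// Assumes incx > 0 and incy > 0.
void ref_drot(int n, double* x, int incx, double* y, int incy, double c, double s)
{
    assert(incx > 0 && incy > 0);
    for (int k = 0; k < n; ++k) {
        double& xk = x[static_cast<long long>(k) * incx];
        double& yk = y[static_cast<long long>(k) * incy];
        const double tmp = c * xk + s * yk; // new x_k, kept until y_k is updated
        yk = c * yk - s * xk;               // uses the old x_k
        xk = tmp;
    }
}
```

With c = 0 and s = 1 the update maps (x_k, y_k) to (y_k, -x_k), which is a quick sanity check for the sign convention.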

rocblas_status
rocblas_srot_batched
(rocblas_handle handle, rocblas_int n, float *const x[], rocblas_int incx, float *const y[], rocblas_int incy, const float *c, const float *s, rocblas_int batch_count)¶

rocblas_status
rocblas_drot_batched
(rocblas_handle handle, rocblas_int n, double *const x[], rocblas_int incx, double *const y[], rocblas_int incy, const double *c, const double *s, rocblas_int batch_count)¶

rocblas_status
rocblas_crot_batched
(rocblas_handle handle, rocblas_int n, rocblas_float_complex *const x[], rocblas_int incx, rocblas_float_complex *const y[], rocblas_int incy, const float *c, const rocblas_float_complex *s, rocblas_int batch_count)¶

rocblas_status
rocblas_csrot_batched
(rocblas_handle handle, rocblas_int n, rocblas_float_complex *const x[], rocblas_int incx, rocblas_float_complex *const y[], rocblas_int incy, const float *c, const float *s, rocblas_int batch_count)¶

rocblas_status
rocblas_zrot_batched
(rocblas_handle handle, rocblas_int n, rocblas_double_complex *const x[], rocblas_int incx, rocblas_double_complex *const y[], rocblas_int incy, const double *c, const rocblas_double_complex *s, rocblas_int batch_count)¶

rocblas_status
rocblas_zdrot_batched
(rocblas_handle handle, rocblas_int n, rocblas_double_complex *const x[], rocblas_int incx, rocblas_double_complex *const y[], rocblas_int incy, const double *c, const double *s, rocblas_int batch_count)¶ BLAS Level 1 API.
rot_batched applies the Givens rotation matrix defined by c=cos(alpha) and s=sin(alpha) to batched vectors x_i and y_i, for i = 1, …, batch_count. Scalars c and s may be stored in either host or device memory, location is specified by calling rocblas_set_pointer_mode.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] number of elements in each of the vectors x_i and y_i.
x – [inout] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment between elements of each x_i.
y – [inout] device array of device pointers storing each vector y_i.
incy – [in] [rocblas_int] specifies the increment between elements of each y_i.
c – [in] device pointer or host pointer to scalar cosine component of the rotation matrix.
s – [in] device pointer or host pointer to scalar sine component of the rotation matrix.
batch_count – [in] [rocblas_int] the number of x and y arrays, i.e. the number of batches.

rocblas_status
rocblas_srot_strided_batched
(rocblas_handle handle, rocblas_int n, float *x, rocblas_int incx, rocblas_stride stride_x, float *y, rocblas_int incy, rocblas_stride stride_y, const float *c, const float *s, rocblas_int batch_count)¶

rocblas_status
rocblas_drot_strided_batched
(rocblas_handle handle, rocblas_int n, double *x, rocblas_int incx, rocblas_stride stride_x, double *y, rocblas_int incy, rocblas_stride stride_y, const double *c, const double *s, rocblas_int batch_count)¶

rocblas_status
rocblas_crot_strided_batched
(rocblas_handle handle, rocblas_int n, rocblas_float_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_float_complex *y, rocblas_int incy, rocblas_stride stride_y, const float *c, const rocblas_float_complex *s, rocblas_int batch_count)¶

rocblas_status
rocblas_csrot_strided_batched
(rocblas_handle handle, rocblas_int n, rocblas_float_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_float_complex *y, rocblas_int incy, rocblas_stride stride_y, const float *c, const float *s, rocblas_int batch_count)¶

rocblas_status
rocblas_zrot_strided_batched
(rocblas_handle handle, rocblas_int n, rocblas_double_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_double_complex *y, rocblas_int incy, rocblas_stride stride_y, const double *c, const rocblas_double_complex *s, rocblas_int batch_count)¶

rocblas_status
rocblas_zdrot_strided_batched
(rocblas_handle handle, rocblas_int n, rocblas_double_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_double_complex *y, rocblas_int incy, rocblas_stride stride_y, const double *c, const double *s, rocblas_int batch_count)¶ BLAS Level 1 API.
rot_strided_batched applies the Givens rotation matrix defined by c=cos(alpha) and s=sin(alpha) to strided batched vectors x_i and y_i, for i = 1, …, batch_count. Scalars c and s may be stored in either host or device memory, location is specified by calling rocblas_set_pointer_mode.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] number of elements in each x_i and y_i vectors.
x – [inout] device pointer to the first vector x_1.
incx – [in] [rocblas_int] specifies the increment between elements of each x_i.
stride_x – [in] [rocblas_stride] specifies the increment from the beginning of x_i to the beginning of x_(i+1)
y – [inout] device pointer to the first vector y_1.
incy – [in] [rocblas_int] specifies the increment between elements of each y_i.
stride_y – [in] [rocblas_stride] specifies the increment from the beginning of y_i to the beginning of y_(i+1)
c – [in] device pointer or host pointer to scalar cosine component of the rotation matrix.
s – [in] device pointer or host pointer to scalar sine component of the rotation matrix.
batch_count – [in] [rocblas_int] the number of x and y arrays, i.e. the number of batches.
rocblas_Xrotg + batched, strided_batched¶

rocblas_status
rocblas_srotg
(rocblas_handle handle, float *a, float *b, float *c, float *s)¶

rocblas_status
rocblas_drotg
(rocblas_handle handle, double *a, double *b, double *c, double *s)¶

rocblas_status
rocblas_crotg
(rocblas_handle handle, rocblas_float_complex *a, rocblas_float_complex *b, float *c, rocblas_float_complex *s)¶

rocblas_status
rocblas_zrotg
(rocblas_handle handle, rocblas_double_complex *a, rocblas_double_complex *b, double *c, rocblas_double_complex *s)¶ BLAS Level 1 API.
rotg creates the Givens rotation matrix for the vector (a b). Scalars c and s and arrays a and b may be stored in either host or device memory, location is specified by calling rocblas_set_pointer_mode. If the pointer mode is set to rocblas_pointer_mode_host, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to rocblas_pointer_mode_device, this function returns immediately and synchronization is required to read the results.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
a – [inout] device pointer or host pointer to input vector element, overwritten with r.
b – [inout] device pointer or host pointer to input vector element, overwritten with z.
c – [inout] device pointer or host pointer to cosine element of Givens rotation.
s – [inout] device pointer or host pointer to sine element of Givens rotation.
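To make the outputs r, z, c and s concrete, here is a CPU sketch that follows the classic reference BLAS drotg algorithm for the real case. The name `ref_drotg` is hypothetical, and it is an assumption that rocBLAS reproduces exactly these conventions for r and z.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical CPU sketch following the reference BLAS drotg algorithm (not rocBLAS code).
// On return a holds r and b holds z; with the original inputs a0, b0 the outputs satisfy
//   c*a0 + s*b0 = r   and   -s*a0 + c*b0 = 0.
void ref_drotg(double& a, double& b, double& c, double& s)
{
    const double abs_a = std::fabs(a);
    const double abs_b = std::fabs(b);
    const double roe   = (abs_a > abs_b) ? a : b; // r takes the sign of the larger input
    const double scale = abs_a + abs_b;
    double r = 0.0;
    double z = 0.0;
    if (scale == 0.0) {
        c = 1.0; // both inputs are zero: identity rotation
        s = 0.0;
    } else {
        // Scaling by (|a| + |b|) avoids overflow in the squares.
        r = scale * std::sqrt((a / scale) * (a / scale) + (b / scale) * (b / scale));
        r = std::copysign(r, roe);
        c = a / r;
        s = b / r;
        z = 1.0;
        if (abs_a > abs_b)
            z = s;
        else if (c != 0.0)
            z = 1.0 / c;
    }
    a = r;
    b = z;
}
```

For a = 3 and b = 4 this yields r = 5, c = 0.6 and s = 0.8 up to rounding.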

rocblas_status
rocblas_srotg_batched
(rocblas_handle handle, float *const a[], float *const b[], float *const c[], float *const s[], rocblas_int batch_count)¶

rocblas_status
rocblas_drotg_batched
(rocblas_handle handle, double *const a[], double *const b[], double *const c[], double *const s[], rocblas_int batch_count)¶

rocblas_status
rocblas_crotg_batched
(rocblas_handle handle, rocblas_float_complex *const a[], rocblas_float_complex *const b[], float *const c[], rocblas_float_complex *const s[], rocblas_int batch_count)¶

rocblas_status
rocblas_zrotg_batched
(rocblas_handle handle, rocblas_double_complex *const a[], rocblas_double_complex *const b[], double *const c[], rocblas_double_complex *const s[], rocblas_int batch_count)¶ BLAS Level 1 API.
rotg_batched creates the Givens rotation matrix for the batched vectors (a_i b_i), for i = 1, …, batch_count. a, b, c, and s may be stored in either host or device memory, location is specified by calling rocblas_set_pointer_mode. If the pointer mode is set to rocblas_pointer_mode_host, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to rocblas_pointer_mode_device, this function returns immediately and synchronization is required to read the results.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
a – [inout] device array of device pointers storing each single input vector element a_i, overwritten with r_i.
b – [inout] device array of device pointers storing each single input vector element b_i, overwritten with z_i.
c – [inout] device array of device pointers storing each cosine element of Givens rotation for the batch.
s – [inout] device array of device pointers storing each sine element of Givens rotation for the batch.
batch_count – [in] [rocblas_int] number of batches (length of arrays a, b, c, and s).

rocblas_status
rocblas_srotg_strided_batched
(rocblas_handle handle, float *a, rocblas_stride stride_a, float *b, rocblas_stride stride_b, float *c, rocblas_stride stride_c, float *s, rocblas_stride stride_s, rocblas_int batch_count)¶

rocblas_status
rocblas_drotg_strided_batched
(rocblas_handle handle, double *a, rocblas_stride stride_a, double *b, rocblas_stride stride_b, double *c, rocblas_stride stride_c, double *s, rocblas_stride stride_s, rocblas_int batch_count)¶

rocblas_status
rocblas_crotg_strided_batched
(rocblas_handle handle, rocblas_float_complex *a, rocblas_stride stride_a, rocblas_float_complex *b, rocblas_stride stride_b, float *c, rocblas_stride stride_c, rocblas_float_complex *s, rocblas_stride stride_s, rocblas_int batch_count)¶

rocblas_status
rocblas_zrotg_strided_batched
(rocblas_handle handle, rocblas_double_complex *a, rocblas_stride stride_a, rocblas_double_complex *b, rocblas_stride stride_b, double *c, rocblas_stride stride_c, rocblas_double_complex *s, rocblas_stride stride_s, rocblas_int batch_count)¶ BLAS Level 1 API.
rotg_strided_batched creates the Givens rotation matrix for the strided batched vectors (a_i b_i), for i = 1, …, batch_count. a, b, c, and s may be stored in either host or device memory, location is specified by calling rocblas_set_pointer_mode. If the pointer mode is set to rocblas_pointer_mode_host, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to rocblas_pointer_mode_device, this function returns immediately and synchronization is required to read the results.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
a – [inout] device strided_batched pointer or host strided_batched pointer to first single input vector element a_1, overwritten with r.
stride_a – [in] [rocblas_stride] distance between elements of a in batch (distance between a_i and a_(i + 1))
b – [inout] device strided_batched pointer or host strided_batched pointer to first single input vector element b_1, overwritten with z.
stride_b – [in] [rocblas_stride] distance between elements of b in batch (distance between b_i and b_(i + 1))
c – [inout] device strided_batched pointer or host strided_batched pointer to first cosine element of Givens rotations c_1.
stride_c – [in] [rocblas_stride] distance between elements of c in batch (distance between c_i and c_(i + 1))
s – [inout] device strided_batched pointer or host strided_batched pointer to first sine element of Givens rotations s_1.
stride_s – [in] [rocblas_stride] distance between elements of s in batch (distance between s_i and s_(i + 1))
batch_count – [in] [rocblas_int] number of batches (length of arrays a, b, c, and s).
rocblas_Xrotm + batched, strided_batched¶

rocblas_status
rocblas_srotm
(rocblas_handle handle, rocblas_int n, float *x, rocblas_int incx, float *y, rocblas_int incy, const float *param)¶

rocblas_status
rocblas_drotm
(rocblas_handle handle, rocblas_int n, double *x, rocblas_int incx, double *y, rocblas_int incy, const double *param)¶ BLAS Level 1 API.
rotm applies the modified Givens rotation matrix defined by param to vectors x and y.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] number of elements in the x and y vectors.
x – [inout] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment between elements of x.
y – [inout] device pointer storing vector y.
incy – [in] [rocblas_int] specifies the increment between elements of y.
param – [in] device vector or host vector of 5 elements defining the rotation.
param[0] = flag, param[1] = H11, param[2] = H21, param[3] = H12, param[4] = H22.
The flag parameter defines the form of the 2x2 matrix H (rows separated by semicolons):
flag = -1 => H = ( H11 H12 ; H21 H22 )
flag = 0 => H = ( 1.0 H12 ; H21 1.0 )
flag = 1 => H = ( H11 1.0 ; -1.0 H22 )
flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
param may be stored in either host or device memory; the location is specified by calling rocblas_set_pointer_mode.
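The role of the flag can be sketched with a CPU reference that expands param into the four entries of H and applies H to each pair (x_k, y_k). The name `ref_drotm` is hypothetical, and the sketch assumes the standard BLAS flag values -1, 0, 1 and -2 together with positive increments.

```cpp
#include <cassert>

// Hypothetical CPU reference (not rocBLAS code) for the modified Givens rotation.
// param = {flag, H11, H21, H12, H22}; assumes incx > 0 and incy > 0.
void ref_drotm(int n, double* x, int incx, double* y, int incy, const double* param)
{
    assert(incx > 0 && incy > 0);
    const double flag = param[0];
    if (flag == -2.0)
        return; // H is the identity, nothing to do

    double h11 = param[1], h21 = param[2], h12 = param[3], h22 = param[4];
    if (flag == 0.0) {
        h11 = 1.0; // diagonal is implied, stored values are ignored
        h22 = 1.0;
    } else if (flag == 1.0) {
        h12 = 1.0; // off-diagonal is implied, stored values are ignored
        h21 = -1.0;
    }
    // flag == -1 uses all four stored entries unchanged.

    for (int k = 0; k < n; ++k) {
        const long long ix = static_cast<long long>(k) * incx;
        const long long iy = static_cast<long long>(k) * incy;
        const double xk = x[ix];
        const double yk = y[iy];
        x[ix] = h11 * xk + h12 * yk;
        y[iy] = h21 * xk + h22 * yk;
    }
}
```

Because the implied entries are never read, a caller can leave them uninitialised for flag = 0 and flag = 1, which is why the storage is described as 5 elements regardless of the flag.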

rocblas_status
rocblas_srotm_batched
(rocblas_handle handle, rocblas_int n, float *const x[], rocblas_int incx, float *const y[], rocblas_int incy, const float *const param[], rocblas_int batch_count)¶

rocblas_status
rocblas_drotm_batched
(rocblas_handle handle, rocblas_int n, double *const x[], rocblas_int incx, double *const y[], rocblas_int incy, const double *const param[], rocblas_int batch_count)¶ BLAS Level 1 API.
rotm_batched applies the modified Givens rotation matrix defined by param_i to batched vectors x_i and y_i, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] number of elements in the x and y vectors.
x – [inout] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment between elements of each x_i.
y – [inout] device array of device pointers storing each vector y_i.
incy – [in] [rocblas_int] specifies the increment between elements of each y_i.
param – [in] device array of device vectors of 5 elements defining the rotation.
param[0] = flag, param[1] = H11, param[2] = H21, param[3] = H12, param[4] = H22.
The flag parameter defines the form of the 2x2 matrix H (rows separated by semicolons):
flag = -1 => H = ( H11 H12 ; H21 H22 )
flag = 0 => H = ( 1.0 H12 ; H21 1.0 )
flag = 1 => H = ( H11 1.0 ; -1.0 H22 )
flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
param may ONLY be stored on the device for the batched version of this function.
batch_count – [in] [rocblas_int] the number of x and y arrays, i.e. the number of batches.

rocblas_status
rocblas_srotm_strided_batched
(rocblas_handle handle, rocblas_int n, float *x, rocblas_int incx, rocblas_stride stride_x, float *y, rocblas_int incy, rocblas_stride stride_y, const float *param, rocblas_stride stride_param, rocblas_int batch_count)¶

rocblas_status
rocblas_drotm_strided_batched
(rocblas_handle handle, rocblas_int n, double *x, rocblas_int incx, rocblas_stride stride_x, double *y, rocblas_int incy, rocblas_stride stride_y, const double *param, rocblas_stride stride_param, rocblas_int batch_count)¶ BLAS Level 1 API.
rotm_strided_batched applies the modified Givens rotation matrix defined by param_i to strided batched vectors x_i and y_i, for i = 1, …, batch_count
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] number of elements in the x and y vectors.
x – [inout] device pointer pointing to first strided batched vector x_1.
incx – [in] [rocblas_int] specifies the increment between elements of each x_i.
stride_x – [in] [rocblas_stride] specifies the increment between the beginning of x_i and x_(i + 1)
y – [inout] device pointer pointing to first strided batched vector y_1.
incy – [in] [rocblas_int] specifies the increment between elements of each y_i.
stride_y – [in] [rocblas_stride] specifies the increment between the beginning of y_i and y_(i + 1)
param – [in] device pointer pointing to first array of 5 elements defining the rotation (param_1).
param[0] = flag, param[1] = H11, param[2] = H21, param[3] = H12, param[4] = H22.
The flag parameter defines the form of the 2x2 matrix H (rows separated by semicolons):
flag = -1 => H = ( H11 H12 ; H21 H22 )
flag = 0 => H = ( 1.0 H12 ; H21 1.0 )
flag = 1 => H = ( H11 1.0 ; -1.0 H22 )
flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
param may ONLY be stored on the device for the strided_batched version of this function.
stride_param – [in] [rocblas_stride] specifies the increment between the beginning of param_i and param_(i + 1)
batch_count – [in] [rocblas_int] the number of x and y arrays, i.e. the number of batches.
rocblas_Xrotmg + batched, strided_batched¶

rocblas_status
rocblas_srotmg
(rocblas_handle handle, float *d1, float *d2, float *x1, const float *y1, float *param)¶

rocblas_status
rocblas_drotmg
(rocblas_handle handle, double *d1, double *d2, double *x1, const double *y1, double *param)¶ BLAS Level 1 API.
rotmg creates the modified Givens rotation matrix for the vector (d1 * x1, d2 * y1). Parameters may be stored in either host or device memory, location is specified by calling rocblas_set_pointer_mode. If the pointer mode is set to rocblas_pointer_mode_host, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to rocblas_pointer_mode_device, this function returns immediately and synchronization is required to read the results.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
d1 – [inout] device pointer or host pointer to input scalar that is overwritten.
d2 – [inout] device pointer or host pointer to input scalar that is overwritten.
x1 – [inout] device pointer or host pointer to input scalar that is overwritten.
y1 – [in] device pointer or host pointer to input scalar.
param – [out] device vector or host vector of 5 elements defining the rotation.
param[0] = flag, param[1] = H11, param[2] = H21, param[3] = H12, param[4] = H22.
The flag parameter defines the form of the 2x2 matrix H (rows separated by semicolons):
flag = -1 => H = ( H11 H12 ; H21 H22 )
flag = 0 => H = ( 1.0 H12 ; H21 1.0 )
flag = 1 => H = ( H11 1.0 ; -1.0 H22 )
flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
param may be stored in either host or device memory; the location is specified by calling rocblas_set_pointer_mode.

rocblas_status
rocblas_srotmg_batched
(rocblas_handle handle, float *const d1[], float *const d2[], float *const x1[], const float *const y1[], float *const param[], rocblas_int batch_count)¶

rocblas_status
rocblas_drotmg_batched
(rocblas_handle handle, double *const d1[], double *const d2[], double *const x1[], const double *const y1[], double *const param[], rocblas_int batch_count)¶ BLAS Level 1 API.
rotmg_batched creates the modified Givens rotation matrix for the batched vectors (d1_i * x1_i, d2_i * y1_i), for i = 1, …, batch_count. Parameters may be stored in either host or device memory, location is specified by calling rocblas_set_pointer_mode. If the pointer mode is set to rocblas_pointer_mode_host, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to rocblas_pointer_mode_device, this function returns immediately and synchronization is required to read the results.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
d1 – [inout] device batched array or host batched array of input scalars that is overwritten.
d2 – [inout] device batched array or host batched array of input scalars that is overwritten.
x1 – [inout] device batched array or host batched array of input scalars that is overwritten.
y1 – [in] device batched array or host batched array of input scalars.
param – [out] device batched array or host batched array of vectors of 5 elements defining the rotation.
param[0] = flag, param[1] = H11, param[2] = H21, param[3] = H12, param[4] = H22.
The flag parameter defines the form of the 2x2 matrix H (rows separated by semicolons):
flag = -1 => H = ( H11 H12 ; H21 H22 )
flag = 0 => H = ( 1.0 H12 ; H21 1.0 )
flag = 1 => H = ( H11 1.0 ; -1.0 H22 )
flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
param may be stored in either host or device memory; the location is specified by calling rocblas_set_pointer_mode.
batch_count – [in] [rocblas_int] the number of instances in the batch.

rocblas_status
rocblas_srotmg_strided_batched
(rocblas_handle handle, float *d1, rocblas_stride stride_d1, float *d2, rocblas_stride stride_d2, float *x1, rocblas_stride stride_x1, const float *y1, rocblas_stride stride_y1, float *param, rocblas_stride stride_param, rocblas_int batch_count)¶

rocblas_status
rocblas_drotmg_strided_batched
(rocblas_handle handle, double *d1, rocblas_stride stride_d1, double *d2, rocblas_stride stride_d2, double *x1, rocblas_stride stride_x1, const double *y1, rocblas_stride stride_y1, double *param, rocblas_stride stride_param, rocblas_int batch_count)¶ BLAS Level 1 API.
rotmg_strided_batched creates the modified Givens rotation matrix for the strided batched vectors (d1_i * x1_i, d2_i * y1_i), for i = 1, …, batch_count. Parameters may be stored in either host or device memory, location is specified by calling rocblas_set_pointer_mode. If the pointer mode is set to rocblas_pointer_mode_host, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to rocblas_pointer_mode_device, this function returns immediately and synchronization is required to read the results.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
d1 – [inout] device strided_batched array or host strided_batched array of input scalars that is overwritten.
stride_d1 – [in] [rocblas_stride] specifies the increment between the beginning of d1_i and d1_(i+1)
d2 – [inout] device strided_batched array or host strided_batched array of input scalars that is overwritten.
stride_d2 – [in] [rocblas_stride] specifies the increment between the beginning of d2_i and d2_(i+1)
x1 – [inout] device strided_batched array or host strided_batched array of input scalars that is overwritten.
stride_x1 – [in] [rocblas_stride] specifies the increment between the beginning of x1_i and x1_(i+1)
y1 – [in] device strided_batched array or host strided_batched array of input scalars.
stride_y1 – [in] [rocblas_stride] specifies the increment between the beginning of y1_i and y1_(i+1)
param – [out] device strided_batched array or host strided_batched array of vectors of 5 elements defining the rotation.
param[0] = flag, param[1] = H11, param[2] = H21, param[3] = H12, param[4] = H22.
The flag parameter defines the form of H (rows separated by ";"):
flag = -1 => H = ( H11 H12 ; H21 H22 )
flag = 0 => H = ( 1.0 H12 ; H21 1.0 )
flag = 1 => H = ( H11 1.0 ; -1.0 H22 )
flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
param may be stored in either host or device memory; the location is specified by calling rocblas_set_pointer_mode.
stride_param – [in] [rocblas_stride] specifies the increment between the beginning of param_i and param_(i + 1)
batch_count – [in] [rocblas_int] the number of instances in the batch.
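As a reading aid, the mapping from one 5-element param vector to the 2 x 2 matrix H can be sketched on the host in plain C. The helper name rotmg_param_to_h is hypothetical (it is not a rocBLAS function), and the flag values -1, 0, 1 and -2 are assumed to follow the reference BLAS rotmg convention.

```c
#include <assert.h>

/* Hypothetical helper, not part of rocBLAS: expands one 5-element param
   vector into a dense 2x2 matrix H, stored as H[row][col].
   Layout: param[0] = flag, param[1] = H11, param[2] = H21,
           param[3] = H12, param[4] = H22.
   Returns 0 on success, -1 if the flag is not one of -1, 0, 1, -2. */
int rotmg_param_to_h(const float param[5], float H[2][2])
{
    assert(param && H);
    const int flag = (int)param[0];
    switch (flag) {
    case -1: /* full matrix */
        H[0][0] = param[1]; H[0][1] = param[3];
        H[1][0] = param[2]; H[1][1] = param[4];
        return 0;
    case 0:  /* unit diagonal */
        H[0][0] = 1.0f;     H[0][1] = param[3];
        H[1][0] = param[2]; H[1][1] = 1.0f;
        return 0;
    case 1:  /* fixed off-diagonal (1, -1) */
        H[0][0] = param[1]; H[0][1] = 1.0f;
        H[1][0] = -1.0f;    H[1][1] = param[4];
        return 0;
    case -2: /* identity */
        H[0][0] = 1.0f; H[0][1] = 0.0f;
        H[1][0] = 0.0f; H[1][1] = 1.0f;
        return 0;
    default:
        return -1;
    }
}
```

For a strided batch, instance i would be read starting at param + i * stride_param, after copying the data to the host if it lives in device memory.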
rocblas_Xscal + batched, strided_batched¶

rocblas_status
rocblas_sscal
(rocblas_handle handle, rocblas_int n, const float *alpha, float *x, rocblas_int incx)¶

rocblas_status
rocblas_dscal
(rocblas_handle handle, rocblas_int n, const double *alpha, double *x, rocblas_int incx)¶

rocblas_status
rocblas_cscal
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *alpha, rocblas_float_complex *x, rocblas_int incx)¶

rocblas_status
rocblas_zscal
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *alpha, rocblas_double_complex *x, rocblas_int incx)¶

rocblas_status
rocblas_csscal
(rocblas_handle handle, rocblas_int n, const float *alpha, rocblas_float_complex *x, rocblas_int incx)¶

rocblas_status
rocblas_zdscal
(rocblas_handle handle, rocblas_int n, const double *alpha, rocblas_double_complex *x, rocblas_int incx)¶ BLAS Level 1 API.
scal scales each element of vector x with scalar alpha.
x := alpha * x
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in x.
alpha – [in] device pointer or host pointer for the scalar alpha.
x – [inout] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
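The role of n and incx can be illustrated with a host-side reference loop. This is a minimal sketch of the semantics only, not the rocBLAS kernel; the name ref_sscal is hypothetical and the sketch assumes incx > 0.

```c
#include <assert.h>

/* Host-side reference for x := alpha * x (illustration only).
   Logical element i of the vector is stored at x[i * incx];
   elements between the strided positions are left untouched. */
void ref_sscal(int n, float alpha, float *x, int incx)
{
    assert(incx > 0); /* assumption of this sketch */
    for (int i = 0; i < n; ++i)
        x[(long)i * incx] *= alpha;
}
```

With n = 3 and incx = 2, only the elements at offsets 0, 2 and 4 are scaled.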

rocblas_status
rocblas_sscal_batched
(rocblas_handle handle, rocblas_int n, const float *alpha, float *const x[], rocblas_int incx, rocblas_int batch_count)¶

rocblas_status
rocblas_dscal_batched
(rocblas_handle handle, rocblas_int n, const double *alpha, double *const x[], rocblas_int incx, rocblas_int batch_count)¶

rocblas_status
rocblas_cscal_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *alpha, rocblas_float_complex *const x[], rocblas_int incx, rocblas_int batch_count)¶

rocblas_status
rocblas_zscal_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *alpha, rocblas_double_complex *const x[], rocblas_int incx, rocblas_int batch_count)¶

rocblas_status
rocblas_csscal_batched
(rocblas_handle handle, rocblas_int n, const float *alpha, rocblas_float_complex *const x[], rocblas_int incx, rocblas_int batch_count)¶

rocblas_status
rocblas_zdscal_batched
(rocblas_handle handle, rocblas_int n, const double *alpha, rocblas_double_complex *const x[], rocblas_int incx, rocblas_int batch_count)¶ BLAS Level 1 API.
scal_batched scales each element of vector x_i with scalar alpha, for i = 1, … , batch_count.
x_i := alpha * x_i
where (x_i) is the ith instance of the batch.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in each x_i.
alpha – [in] host pointer or device pointer for the scalar alpha.
x – [inout] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i.
batch_count – [in] [rocblas_int] specifies the number of batches in x.
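The batched layout passes an array of pointers, one per vector x_i. The sketch below shows that layout with host memory only, so it can be inspected without a GPU; in rocBLAS both the pointer array and the vectors reside in device memory. The name ref_sscal_batched is hypothetical and incx > 0 is assumed.

```c
#include <assert.h>

/* Host-side reference for x_i := alpha * x_i over an array of pointers
   (illustration only). x[b] points to the first element of vector b;
   the vectors do not need to be contiguous or equally spaced in memory. */
void ref_sscal_batched(int n, float alpha, float *const x[], int incx,
                       int batch_count)
{
    assert(incx > 0); /* assumption of this sketch */
    for (int b = 0; b < batch_count; ++b)
        for (int i = 0; i < n; ++i)
            x[b][(long)i * incx] *= alpha;
}
```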

rocblas_status
rocblas_sscal_strided_batched
(rocblas_handle handle, rocblas_int n, const float *alpha, float *x, rocblas_int incx, rocblas_stride stride_x, rocblas_int batch_count)¶

rocblas_status
rocblas_dscal_strided_batched
(rocblas_handle handle, rocblas_int n, const double *alpha, double *x, rocblas_int incx, rocblas_stride stride_x, rocblas_int batch_count)¶

rocblas_status
rocblas_cscal_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *alpha, rocblas_float_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_int batch_count)¶

rocblas_status
rocblas_zscal_strided_batched
(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *alpha, rocblas_double_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_int batch_count)¶

rocblas_status
rocblas_csscal_strided_batched
(rocblas_handle handle, rocblas_int n, const float *alpha, rocblas_float_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_int batch_count)¶

rocblas_status
rocblas_zdscal_strided_batched
(rocblas_handle handle, rocblas_int n, const double *alpha, rocblas_double_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_int batch_count)¶ BLAS Level 1 API.
scal_strided_batched scales each element of vector x_i with scalar alpha, for i = 1, … , batch_count.
x_i := alpha * x_i
where (x_i) is the ith instance of the batch.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in each x_i.
alpha – [in] host pointer or device pointer for the scalar alpha.
x – [inout] device pointer to the first vector (x_1) in the batch.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
stride_x – [in] [rocblas_stride] stride from the start of one vector (x_i) to the start of the next one (x_(i+1)). There are no restrictions placed on stride_x; however, the user should take care to ensure that stride_x is of appropriate size. For a typical case this means stride_x >= n * incx.
batch_count – [in] [rocblas_int] specifies the number of batches in x.
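The strided_batched layout uses a single base pointer, and vector x_i starts a fixed number of elements after the previous one. A minimal host-side sketch of that addressing follows; the name ref_sscal_strided_batched is hypothetical, int64_t stands in for rocblas_stride, and incx > 0 is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side reference for the strided batched layout (illustration only).
   Batch instance b (0-based) starts at x + b * stride_x, and its logical
   element i is stored at offset i * incx from that start. */
void ref_sscal_strided_batched(int n, float alpha, float *x, int incx,
                               int64_t stride_x, int batch_count)
{
    assert(incx > 0); /* assumption of this sketch */
    for (int b = 0; b < batch_count; ++b) {
        float *xb = x + (int64_t)b * stride_x;
        for (int i = 0; i < n; ++i)
            xb[(int64_t)i * incx] *= alpha;
    }
}
```

Choosing stride_x larger than n * incx leaves untouched gaps between consecutive vectors, which is what the typical-case guidance on stride_x describes.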
rocblas_Xswap + batched, strided_batched¶

rocblas_status
rocblas_sswap
(rocblas_handle handle, rocblas_int n, float *x, rocblas_int incx, float *y, rocblas_int incy)¶

rocblas_status
rocblas_dswap
(rocblas_handle handle, rocblas_int n, double *x, rocblas_int incx, double *y, rocblas_int incy)¶

rocblas_status
rocblas_cswap
(rocblas_handle handle, rocblas_int n, rocblas_float_complex *x, rocblas_int incx, rocblas_float_complex *y, rocblas_int incy)¶

rocblas_status
rocblas_zswap
(rocblas_handle handle, rocblas_int n, rocblas_double_complex *x, rocblas_int incx, rocblas_double_complex *y, rocblas_int incy)¶ BLAS Level 1 API.
swap interchanges vectors x and y.
y := x; x := y (applied simultaneously, so each vector receives the other's original contents)
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in x and y.
x – [inout] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
y – [inout] device pointer storing vector y.
incy – [in] [rocblas_int] specifies the increment for the elements of y.
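Because the two assignments are applied together, a reference implementation needs a temporary per element. The following host-side sketch shows the intended semantics with independent increments for x and y; the name ref_sswap is hypothetical and positive increments are assumed.

```c
#include <assert.h>

/* Host-side reference for swap (illustration only): exchanges logical
   element i of x (at x[i * incx]) with logical element i of y
   (at y[i * incy]) using a temporary. */
void ref_sswap(int n, float *x, int incx, float *y, int incy)
{
    assert(incx > 0 && incy > 0); /* assumption of this sketch */
    for (int i = 0; i < n; ++i) {
        const float tmp = x[(long)i * incx];
        x[(long)i * incx] = y[(long)i * incy];
        y[(long)i * incy] = tmp;
    }
}
```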

rocblas_status
rocblas_sswap_batched
(rocblas_handle handle, rocblas_int n, float *const x[], rocblas_int incx, float *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_dswap_batched
(rocblas_handle handle, rocblas_int n, double *const x[], rocblas_int incx, double *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_cswap_batched
(rocblas_handle handle, rocblas_int n, rocblas_float_complex *const x[], rocblas_int incx, rocblas_float_complex *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_zswap_batched
(rocblas_handle handle, rocblas_int n, rocblas_double_complex *const x[], rocblas_int incx, rocblas_double_complex *const y[], rocblas_int incy, rocblas_int batch_count)¶ BLAS Level 1 API.
swap_batched interchanges vectors x_i and y_i, for i = 1, …, batch_count.
y_i := x_i; x_i := y_i (applied simultaneously, so each vector receives the other's original contents)
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in each x_i and y_i.
x – [inout] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i.
y – [inout] device array of device pointers storing each vector y_i.
incy – [in] [rocblas_int] specifies the increment for the elements of each y_i.
batch_count – [in] [rocblas_int] number of instances in the batch.

rocblas_status
rocblas_sswap_strided_batched
(rocblas_handle handle, rocblas_int n, float *x, rocblas_int incx, rocblas_stride stridex, float *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_dswap_strided_batched
(rocblas_handle handle, rocblas_int n, double *x, rocblas_int incx, rocblas_stride stridex, double *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_cswap_strided_batched
(rocblas_handle handle, rocblas_int n, rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_float_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_zswap_strided_batched
(rocblas_handle handle, rocblas_int n, rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_double_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶ BLAS Level 1 API.
swap_strided_batched interchanges vectors x_i and y_i, for i = 1, …, batch_count.
y_i := x_i; x_i := y_i (applied simultaneously, so each vector receives the other's original contents)
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
n – [in] [rocblas_int] the number of elements in each x_i and y_i.
x – [inout] device pointer to the first vector x_1.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
stridex – [in] [rocblas_stride] stride from the start of one vector (x_i) to the start of the next one (x_(i+1)). There are no restrictions placed on stridex; however, the user should take care to ensure that stridex is of appropriate size. For a typical case this means stridex >= n * incx.
y – [inout] device pointer to the first vector y_1.
incy – [in] [rocblas_int] specifies the increment for the elements of y.
stridey – [in] [rocblas_stride] stride from the start of one vector (y_i) to the start of the next one (y_(i+1)). There are no restrictions placed on stridey; however, the user should take care to ensure that stridey is of appropriate size. For a typical case this means stridey >= n * incy. stridey should be non-zero.
batch_count – [in] [rocblas_int] number of instances in the batch.
Level 2 BLAS¶
rocblas_Xgbmv + batched, strided_batched¶

rocblas_status
rocblas_sgbmv
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, rocblas_int kl, rocblas_int ku, const float *alpha, const float *A, rocblas_int lda, const float *x, rocblas_int incx, const float *beta, float *y, rocblas_int incy)¶

rocblas_status
rocblas_dgbmv
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, rocblas_int kl, rocblas_int ku, const double *alpha, const double *A, rocblas_int lda, const double *x, rocblas_int incx, const double *beta, double *y, rocblas_int incy)¶

rocblas_status
rocblas_cgbmv
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, rocblas_int kl, rocblas_int ku, const rocblas_float_complex *alpha, const rocblas_float_complex *A, rocblas_int lda, const rocblas_float_complex *x, rocblas_int incx, const rocblas_float_complex *beta, rocblas_float_complex *y, rocblas_int incy)¶

rocblas_status
rocblas_zgbmv
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, rocblas_int kl, rocblas_int ku, const rocblas_double_complex *alpha, const rocblas_double_complex *A, rocblas_int lda, const rocblas_double_complex *x, rocblas_int incx, const rocblas_double_complex *beta, rocblas_double_complex *y, rocblas_int incy)¶ BLAS Level 2 API.
xGBMV performs one of the matrix-vector operations
y := alpha*A*x + beta*y, or y := alpha*A**T*x + beta*y, or y := alpha*A**H*x + beta*y,
where alpha and beta are scalars, x and y are vectors and A is an m by n banded matrix with kl subdiagonals and ku superdiagonals.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
trans – [in] [rocblas_operation] indicates whether matrix A is transposed (conjugated) or not
m – [in] [rocblas_int] number of rows of matrix A
n – [in] [rocblas_int] number of columns of matrix A
kl – [in] [rocblas_int] number of subdiagonals of A
ku – [in] [rocblas_int] number of superdiagonals of A
alpha – [in] device pointer or host pointer to scalar alpha.
A – [in] device pointer storing banded matrix A. The leading (kl + ku + 1) by n part of the matrix contains the coefficients of the banded matrix. The leading diagonal resides in row (ku + 1), with the first superdiagonal above it on the RHS of row ku. The first subdiagonal resides below it on the LHS of row ku + 2. This propagates up and down across sub/superdiagonals.
Ex: (m = n = 7; ku = 2, kl = 2)
    1 2 3 0 0 0 0             0 0 3 3 3 3 3
    4 1 2 3 0 0 0             0 2 2 2 2 2 2
    5 4 1 2 3 0 0    ---->    1 1 1 1 1 1 1
    0 5 4 1 2 3 0             4 4 4 4 4 4 0
    0 0 5 4 1 2 3             5 5 5 5 5 0 0
    0 0 0 5 4 1 2             0 0 0 0 0 0 0
    0 0 0 0 5 4 1             0 0 0 0 0 0 0
Note that the empty elements which don’t correspond to data will not be referenced.
lda – [in] [rocblas_int] specifies the leading dimension of A. Must be >= (kl + ku + 1)
x – [in] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device pointer storing vector y.
incy – [in] [rocblas_int] specifies the increment for the elements of y.
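The band storage described for A can be made concrete with a small host-side sketch. It assumes column-major storage and 0-based indices, so dense element A(i, j) lands in band row ku + i - j of column j. The helper names band_index, pack_band and ref_sgbmv_n are hypothetical; the reference product covers only the non-transposed case with unit increments and is not the rocBLAS implementation.

```c
#include <assert.h>

/* Offset of dense element A(i, j) (0-based) inside column-major band
   storage with leading dimension lda. Valid only for entries inside
   the band, i.e. i - j <= kl and j - i <= ku. */
int band_index(int i, int j, int ku, int lda)
{
    return (ku + i - j) + j * lda;
}

/* Copies the band of a dense column-major m x n matrix (leading
   dimension m) into band storage AB. Entries of AB that do not
   correspond to matrix data are left as they were. */
void pack_band(int m, int n, int kl, int ku, const float *dense,
               float *AB, int lda)
{
    assert(lda >= kl + ku + 1);
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i)
            if (i - j <= kl && j - i <= ku)
                AB[band_index(i, j, ku, lda)] = dense[i + j * m];
}

/* y := alpha * A * x + beta * y for band-stored A, non-transposed,
   with incx = incy = 1. Only entries inside the band are read. */
void ref_sgbmv_n(int m, int n, int kl, int ku, float alpha,
                 const float *AB, int lda, const float *x,
                 float beta, float *y)
{
    for (int i = 0; i < m; ++i) {
        const int j_lo = (i - kl > 0) ? i - kl : 0;
        const int j_hi = (i + ku < n - 1) ? i + ku : n - 1;
        float acc = 0.0f;
        for (int j = j_lo; j <= j_hi; ++j)
            acc += AB[band_index(i, j, ku, lda)] * x[j];
        y[i] = alpha * acc + beta * y[i];
    }
}
```

Packing a 7 x 7 matrix with kl = ku = 2 this way places the main diagonal in band row ku (0-based), which corresponds to row ku + 1 in the 1-based description of A.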

rocblas_status
rocblas_sgbmv_batched
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, rocblas_int kl, rocblas_int ku, const float *alpha, const float *const A[], rocblas_int lda, const float *const x[], rocblas_int incx, const float *beta, float *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_dgbmv_batched
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, rocblas_int kl, rocblas_int ku, const double *alpha, const double *const A[], rocblas_int lda, const double *const x[], rocblas_int incx, const double *beta, double *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_cgbmv_batched
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, rocblas_int kl, rocblas_int ku, const rocblas_float_complex *alpha, const rocblas_float_complex *const A[], rocblas_int lda, const rocblas_float_complex *const x[], rocblas_int incx, const rocblas_float_complex *beta, rocblas_float_complex *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_zgbmv_batched
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, rocblas_int kl, rocblas_int ku, const rocblas_double_complex *alpha, const rocblas_double_complex *const A[], rocblas_int lda, const rocblas_double_complex *const x[], rocblas_int incx, const rocblas_double_complex *beta, rocblas_double_complex *const y[], rocblas_int incy, rocblas_int batch_count)¶ BLAS Level 2 API.
xGBMV_BATCHED performs one of the matrix-vector operations
y_i := alpha*A_i*x_i + beta*y_i, or y_i := alpha*A_i**T*x_i + beta*y_i, or y_i := alpha*A_i**H*x_i + beta*y_i,
where (A_i, x_i, y_i) is the ith instance of the batch. alpha and beta are scalars, x_i and y_i are vectors and A_i is an m by n banded matrix with kl subdiagonals and ku superdiagonals, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
trans – [in] [rocblas_operation] indicates whether matrix A is transposed (conjugated) or not
m – [in] [rocblas_int] number of rows of each matrix A_i
n – [in] [rocblas_int] number of columns of each matrix A_i
kl – [in] [rocblas_int] number of subdiagonals of each A_i
ku – [in] [rocblas_int] number of superdiagonals of each A_i
alpha – [in] device pointer or host pointer to scalar alpha.
A – [in] device array of device pointers storing each banded matrix A_i. The leading (kl + ku + 1) by n part of the matrix contains the coefficients of the banded matrix. The leading diagonal resides in row (ku + 1), with the first superdiagonal above it on the RHS of row ku. The first subdiagonal resides below it on the LHS of row ku + 2. This propagates up and down across sub/superdiagonals.
Ex: (m = n = 7; ku = 2, kl = 2)
    1 2 3 0 0 0 0             0 0 3 3 3 3 3
    4 1 2 3 0 0 0             0 2 2 2 2 2 2
    5 4 1 2 3 0 0    ---->    1 1 1 1 1 1 1
    0 5 4 1 2 3 0             4 4 4 4 4 4 0
    0 0 5 4 1 2 3             5 5 5 5 5 0 0
    0 0 0 5 4 1 2             0 0 0 0 0 0 0
    0 0 0 0 5 4 1             0 0 0 0 0 0 0
Note that the empty elements which don’t correspond to data will not be referenced.
lda – [in] [rocblas_int] specifies the leading dimension of each A_i. Must be >= (kl + ku + 1)
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i.
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device array of device pointers storing each vector y_i.
incy – [in] [rocblas_int] specifies the increment for the elements of each y_i.
batch_count – [in] [rocblas_int] specifies the number of instances in the batch.

rocblas_status
rocblas_sgbmv_strided_batched
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, rocblas_int kl, rocblas_int ku, const float *alpha, const float *A, rocblas_int lda, rocblas_stride stride_A, const float *x, rocblas_int incx, rocblas_stride stride_x, const float *beta, float *y, rocblas_int incy, rocblas_stride stride_y, rocblas_int batch_count)¶

rocblas_status
rocblas_dgbmv_strided_batched
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, rocblas_int kl, rocblas_int ku, const double *alpha, const double *A, rocblas_int lda, rocblas_stride stride_A, const double *x, rocblas_int incx, rocblas_stride stride_x, const double *beta, double *y, rocblas_int incy, rocblas_stride stride_y, rocblas_int batch_count)¶

rocblas_status
rocblas_cgbmv_strided_batched
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, rocblas_int kl, rocblas_int ku, const rocblas_float_complex *alpha, const rocblas_float_complex *A, rocblas_int lda, rocblas_stride stride_A, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stride_x, const rocblas_float_complex *beta, rocblas_float_complex *y, rocblas_int incy, rocblas_stride stride_y, rocblas_int batch_count)¶

rocblas_status
rocblas_zgbmv_strided_batched
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, rocblas_int kl, rocblas_int ku, const rocblas_double_complex *alpha, const rocblas_double_complex *A, rocblas_int lda, rocblas_stride stride_A, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stride_x, const rocblas_double_complex *beta, rocblas_double_complex *y, rocblas_int incy, rocblas_stride stride_y, rocblas_int batch_count)¶ BLAS Level 2 API.
xGBMV_STRIDED_BATCHED performs one of the matrix-vector operations
y_i := alpha*A_i*x_i + beta*y_i, or y_i := alpha*A_i**T*x_i + beta*y_i, or y_i := alpha*A_i**H*x_i + beta*y_i,
where (A_i, x_i, y_i) is the ith instance of the batch. alpha and beta are scalars, x_i and y_i are vectors and A_i is an m by n banded matrix with kl subdiagonals and ku superdiagonals, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
trans – [in] [rocblas_operation] indicates whether matrix A is transposed (conjugated) or not
m – [in] [rocblas_int] number of rows of matrix A
n – [in] [rocblas_int] number of columns of matrix A
kl – [in] [rocblas_int] number of subdiagonals of A
ku – [in] [rocblas_int] number of superdiagonals of A
alpha – [in] device pointer or host pointer to scalar alpha.
A – [in] device pointer to first banded matrix (A_1). The leading (kl + ku + 1) by n part of the matrix contains the coefficients of the banded matrix. The leading diagonal resides in row (ku + 1), with the first superdiagonal above it on the RHS of row ku. The first subdiagonal resides below it on the LHS of row ku + 2. This propagates up and down across sub/superdiagonals.
Ex: (m = n = 7; ku = 2, kl = 2)
    1 2 3 0 0 0 0             0 0 3 3 3 3 3
    4 1 2 3 0 0 0             0 2 2 2 2 2 2
    5 4 1 2 3 0 0    ---->    1 1 1 1 1 1 1
    0 5 4 1 2 3 0             4 4 4 4 4 4 0
    0 0 5 4 1 2 3             5 5 5 5 5 0 0
    0 0 0 5 4 1 2             0 0 0 0 0 0 0
    0 0 0 0 5 4 1             0 0 0 0 0 0 0
Note that the empty elements which don’t correspond to data will not be referenced.
lda – [in] [rocblas_int] specifies the leading dimension of A. Must be >= (kl + ku + 1)
stride_A – [in] [rocblas_stride] stride from the start of one matrix (A_i) to the start of the next one (A_(i+1))
x – [in] device pointer to first vector (x_1).
incx – [in] [rocblas_int] specifies the increment for the elements of x.
stride_x – [in] [rocblas_stride] stride from the start of one vector (x_i) to the start of the next one (x_(i+1))
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device pointer to first vector (y_1).
incy – [in] [rocblas_int] specifies the increment for the elements of y.
stride_y – [in] [rocblas_stride] stride from the start of one vector (y_i) to the start of the next one (y_(i+1))
batch_count – [in] [rocblas_int] specifies the number of instances in the batch.
rocblas_Xgemv + batched, strided_batched¶

rocblas_status
rocblas_sgemv
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, const float *alpha, const float *A, rocblas_int lda, const float *x, rocblas_int incx, const float *beta, float *y, rocblas_int incy)¶

rocblas_status
rocblas_dgemv
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, const double *alpha, const double *A, rocblas_int lda, const double *x, rocblas_int incx, const double *beta, double *y, rocblas_int incy)¶

rocblas_status
rocblas_cgemv
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *A, rocblas_int lda, const rocblas_float_complex *x, rocblas_int incx, const rocblas_float_complex *beta, rocblas_float_complex *y, rocblas_int incy)¶

rocblas_status
rocblas_zgemv
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *A, rocblas_int lda, const rocblas_double_complex *x, rocblas_int incx, const rocblas_double_complex *beta, rocblas_double_complex *y, rocblas_int incy)¶ BLAS Level 2 API.
xGEMV performs one of the matrix-vector operations
y := alpha*A*x + beta*y, or y := alpha*A**T*x + beta*y, or y := alpha*A**H*x + beta*y,
where alpha and beta are scalars, x and y are vectors and A is an m by n matrix.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
trans – [in] [rocblas_operation] indicates whether matrix A is transposed (conjugated) or not
m – [in] [rocblas_int] number of rows of matrix A
n – [in] [rocblas_int] number of columns of matrix A
alpha – [in] device pointer or host pointer to scalar alpha.
A – [in] device pointer storing matrix A.
lda – [in] [rocblas_int] specifies the leading dimension of A.
x – [in] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device pointer storing vector y.
incy – [in] [rocblas_int] specifies the increment for the elements of y.
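The interplay of trans, lda and the increments can be illustrated with a host-side reference for the real single-precision case. This is a sketch of the semantics only: the name ref_sgemv is hypothetical, the int argument trans stands in for rocblas_operation, and the sketch assumes column-major storage with lda >= m (the usual BLAS convention) and positive increments.

```c
#include <assert.h>

/* Host-side reference for real gemv (illustration only).
   trans == 0: y := alpha*A*x + beta*y     (x has n elements, y has m)
   trans != 0: y := alpha*A**T*x + beta*y  (x has m elements, y has n)
   A is column-major: A(i, j) is stored at A[i + j * lda]. */
void ref_sgemv(int trans, int m, int n, float alpha, const float *A,
               int lda, const float *x, int incx, float beta,
               float *y, int incy)
{
    assert(lda >= m && incx > 0 && incy > 0); /* assumptions of this sketch */
    const int len_y = trans ? n : m;
    const int len_x = trans ? m : n;
    for (int r = 0; r < len_y; ++r) {
        float acc = 0.0f;
        for (int c = 0; c < len_x; ++c) {
            /* op(A)(r, c) is A(r, c) or A(c, r) depending on trans */
            const float a = trans ? A[c + r * lda] : A[r + c * lda];
            acc += a * x[c * incx];
        }
        y[r * incy] = alpha * acc + beta * y[r * incy];
    }
}
```

Note that the lengths of x and y swap roles when A is transposed, which is why the vector sizes depend on trans.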

rocblas_status
rocblas_sgemv_batched
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, const float *alpha, const float *const A[], rocblas_int lda, const float *const x[], rocblas_int incx, const float *beta, float *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_dgemv_batched
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, const double *alpha, const double *const A[], rocblas_int lda, const double *const x[], rocblas_int incx, const double *beta, double *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_cgemv_batched
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *const A[], rocblas_int lda, const rocblas_float_complex *const x[], rocblas_int incx, const rocblas_float_complex *beta, rocblas_float_complex *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_zgemv_batched
(rocblas_handle handle, rocblas_operation trans, rocblas_int m, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *const A[], rocblas_int lda, const rocblas_double_complex *const x[], rocblas_int incx, const rocblas_double_complex *beta, rocblas_double_complex *const y[], rocblas_int incy, rocblas_int batch_count)¶ BLAS Level 2 API.
xGEMV_BATCHED performs a batch of matrix-vector operations
y_i := alpha*A_i*x_i + beta*y_i, or y_i := alpha*A_i**T*x_i + beta*y_i, or y_i := alpha*A_i**H*x_i + beta*y_i,
where (A_i, x_i, y_i) is the ith instance of the batch. alpha and beta are scalars, x_i and y_i are vectors and A_i is an m by n matrix, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
trans – [in] [rocblas_operation] indicates whether matrices A_i are transposed (conjugated) or not
m – [in] [rocblas_int] number of rows of each matrix A_i
n – [in] [rocblas_int] number of columns of each matrix A_i
alpha – [in] device pointer or host pointer to scalar alpha.
A – [in] device array of device pointers storing each matrix A_i.
lda – [in] [rocblas_int] specifies the leading dimension of each matrix A_i.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment for the elements of each vector x_i.
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device array of device pointers storing each vector y_i.
incy – [in] [rocblas_int] specifies the increment for the elements of each vector y_i.
batch_count – [in] [rocblas_int] number of instances in the batch

rocblas_status
rocblas_sgemv_strided_batched
(rocblas_handle handle, rocblas_operation transA, rocblas_int m, rocblas_int n, const float *alpha, const float *A, rocblas_int lda, rocblas_stride strideA, const float *x, rocblas_int incx, rocblas_stride stridex, const float *beta, float *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_dgemv_strided_batched
(rocblas_handle handle, rocblas_operation transA, rocblas_int m, rocblas_int n, const double *alpha, const double *A, rocblas_int lda, rocblas_stride strideA, const double *x, rocblas_int incx, rocblas_stride stridex, const double *beta, double *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_cgemv_strided_batched
(rocblas_handle handle, rocblas_operation transA, rocblas_int m, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *A, rocblas_int lda, rocblas_stride strideA, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_float_complex *beta, rocblas_float_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_zgemv_strided_batched
(rocblas_handle handle, rocblas_operation transA, rocblas_int m, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *A, rocblas_int lda, rocblas_stride strideA, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_double_complex *beta, rocblas_double_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶ BLAS Level 2 API.
xGEMV_STRIDED_BATCHED performs a batch of matrix-vector operations
y_i := alpha*A_i*x_i + beta*y_i, or y_i := alpha*A_i**T*x_i + beta*y_i, or y_i := alpha*A_i**H*x_i + beta*y_i,
where (A_i, x_i, y_i) is the ith instance of the batch. alpha and beta are scalars, x_i and y_i are vectors and A_i is an m by n matrix, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
transA – [in] [rocblas_operation] indicates whether matrices A_i are transposed (conjugated) or not
m – [in] [rocblas_int] number of rows of matrices A_i
n – [in] [rocblas_int] number of columns of matrices A_i
alpha – [in] device pointer or host pointer to scalar alpha.
A – [in] device pointer to the first matrix (A_1) in the batch.
lda – [in] [rocblas_int] specifies the leading dimension of matrices A_i.
strideA – [in] [rocblas_stride] stride from the start of one matrix (A_i) to the start of the next one (A_(i+1))
x – [in] device pointer to the first vector (x_1) in the batch.
incx – [in] [rocblas_int] specifies the increment for the elements of vectors x_i.
stridex – [in] [rocblas_stride] stride from the start of one vector (x_i) to the start of the next one (x_(i+1)). There are no restrictions placed on stridex; however, the user should take care to ensure that stridex is of appropriate size. When transA equals rocblas_operation_none this typically means stridex >= n * incx, otherwise stridex >= m * incx.
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device pointer to the first vector (y_1) in the batch.
incy – [in] [rocblas_int] specifies the increment for the elements of vectors y_i.
stridey – [in] [rocblas_stride] stride from the start of one vector (y_i) to the start of the next one (y_(i+1)). There are no restrictions placed on stridey; however, the user should take care to ensure that stridey is of appropriate size. When transA equals rocblas_operation_none this typically means stridey >= m * incy, otherwise stridey >= n * incy. stridey should be non-zero.
batch_count – [in] [rocblas_int] number of instances in the batch
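The typical-case guidance for the two vector strides can be written down as a small helper. This is only a sketch of the rule of thumb stated in the parameter notes; the struct gemv_strides and the function gemv_packed_strides are hypothetical names, int64_t stands in for rocblas_stride, and the int argument trans is non-zero for either transposed operation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of rocBLAS: strides that pack the
   vectors of a batch back to back, following the typical-case rule.
   trans == 0 corresponds to rocblas_operation_none. */
typedef struct {
    int64_t stridex;
    int64_t stridey;
} gemv_strides;

gemv_strides gemv_packed_strides(int trans, int m, int n, int incx, int incy)
{
    assert(incx > 0 && incy > 0); /* assumption of this sketch */
    gemv_strides s;
    s.stridex = (int64_t)(trans ? m : n) * incx; /* x has n (or m) elements */
    s.stridey = (int64_t)(trans ? n : m) * incy; /* y has m (or n) elements */
    return s;
}
```

Larger strides are equally valid; they simply leave unused gaps between consecutive vectors.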
rocblas_Xger + batched, strided_batched¶

rocblas_status
rocblas_sger
(rocblas_handle handle, rocblas_int m, rocblas_int n, const float *alpha, const float *x, rocblas_int incx, const float *y, rocblas_int incy, float *A, rocblas_int lda)¶

rocblas_status
rocblas_dger
(rocblas_handle handle, rocblas_int m, rocblas_int n, const double *alpha, const double *x, rocblas_int incx, const double *y, rocblas_int incy, double *A, rocblas_int lda)¶

rocblas_status
rocblas_cgeru
(rocblas_handle handle, rocblas_int m, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *x, rocblas_int incx, const rocblas_float_complex *y, rocblas_int incy, rocblas_float_complex *A, rocblas_int lda)¶

rocblas_status
rocblas_zgeru
(rocblas_handle handle, rocblas_int m, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *x, rocblas_int incx, const rocblas_double_complex *y, rocblas_int incy, rocblas_double_complex *A, rocblas_int lda)¶

rocblas_status
rocblas_cgerc
(rocblas_handle handle, rocblas_int m, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *x, rocblas_int incx, const rocblas_float_complex *y, rocblas_int incy, rocblas_float_complex *A, rocblas_int lda)¶

rocblas_status
rocblas_zgerc
(rocblas_handle handle, rocblas_int m, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *x, rocblas_int incx, const rocblas_double_complex *y, rocblas_int incy, rocblas_double_complex *A, rocblas_int lda)¶ BLAS Level 2 API.
xGER, xGERU, and xGERC perform the matrix-vector operations
A := A + alpha*x*y**T, OR A := A + alpha*x*y**H for xGERC
where alpha is a scalar, x and y are vectors, and A is an m by n matrix.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
m – [in] [rocblas_int] the number of rows of the matrix A.
n – [in] [rocblas_int] the number of columns of the matrix A.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
y – [in] device pointer storing vector y.
incy – [in] [rocblas_int] specifies the increment for the elements of y.
A – [inout] device pointer storing matrix A.
lda – [in] [rocblas_int] specifies the leading dimension of A.
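The rank-1 update described above can be checked against a small host-side reference. This is only a sketch, not the rocBLAS implementation: ref_sger is a hypothetical helper that assumes column-major storage (element (i, j) of A at A[i + j * lda]), positive increments, and plain host memory instead of device pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side reference for A := A + alpha * x * y^T (real GER).
   ASSUMPTION: column-major A with lda >= m, and incx, incy > 0. */
void ref_sger(int m, int n, float alpha,
              const float *x, int incx,
              const float *y, int incy,
              float *A, int lda)
{
    assert(m >= 0 && n >= 0 && lda >= m && incx > 0 && incy > 0);
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i)
            A[i + (size_t)j * lda] +=
                alpha * x[(size_t)i * incx] * y[(size_t)j * incy];
}
```

Comparing the device result of rocblas_sger with such a reference on small inputs is one way to validate the m, n, and lda arguments.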

rocblas_status
rocblas_sger_batched
(rocblas_handle handle, rocblas_int m, rocblas_int n, const float *alpha, const float *const x[], rocblas_int incx, const float *const y[], rocblas_int incy, float *const A[], rocblas_int lda, rocblas_int batch_count)¶

rocblas_status
rocblas_dger_batched
(rocblas_handle handle, rocblas_int m, rocblas_int n, const double *alpha, const double *const x[], rocblas_int incx, const double *const y[], rocblas_int incy, double *const A[], rocblas_int lda, rocblas_int batch_count)¶

rocblas_status
rocblas_cgeru_batched
(rocblas_handle handle, rocblas_int m, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *const x[], rocblas_int incx, const rocblas_float_complex *const y[], rocblas_int incy, rocblas_float_complex *const A[], rocblas_int lda, rocblas_int batch_count)¶

rocblas_status
rocblas_zgeru_batched
(rocblas_handle handle, rocblas_int m, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *const x[], rocblas_int incx, const rocblas_double_complex *const y[], rocblas_int incy, rocblas_double_complex *const A[], rocblas_int lda, rocblas_int batch_count)¶

rocblas_status
rocblas_cgerc_batched
(rocblas_handle handle, rocblas_int m, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *const x[], rocblas_int incx, const rocblas_float_complex *const y[], rocblas_int incy, rocblas_float_complex *const A[], rocblas_int lda, rocblas_int batch_count)¶

rocblas_status
rocblas_zgerc_batched
(rocblas_handle handle, rocblas_int m, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *const x[], rocblas_int incx, const rocblas_double_complex *const y[], rocblas_int incy, rocblas_double_complex *const A[], rocblas_int lda, rocblas_int batch_count)¶ BLAS Level 2 API.
xGER_BATCHED, xGERU_BATCHED, and xGERC_BATCHED perform a batch of the matrix-vector operations
A_i := A_i + alpha*x_i*y_i**T, OR A_i := A_i + alpha*x_i*y_i**H for xGERC_BATCHED
where (A_i, x_i, y_i) is the ith instance of the batch. alpha is a scalar, x_i and y_i are vectors and A_i is an m by n matrix, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
m – [in] [rocblas_int] the number of rows of each matrix A_i.
n – [in] [rocblas_int] the number of columns of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment for the elements of each vector x_i.
y – [in] device array of device pointers storing each vector y_i.
incy – [in] [rocblas_int] specifies the increment for the elements of each vector y_i.
A – [inout] device array of device pointers storing each matrix A_i.
lda – [in] [rocblas_int] specifies the leading dimension of each A_i.
batch_count – [in] [rocblas_int] number of instances in the batch
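The difference between the y**T (GERU) and y**H (GERC) forms, and the array-of-pointers batch layout, can be illustrated with a host-side sketch. Everything here is hypothetical: cfloat is a stand-in for rocblas_float_complex, ref_cger_batched is not a rocBLAS function, and the arrays live in host memory with positive increments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for rocblas_float_complex. */
typedef struct { float re, im; } cfloat;

static cfloat cmul(cfloat a, cfloat b)
{
    return (cfloat){a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re};
}

/* conj_y == 0: A_i := A_i + alpha * x_i * y_i^T   (GERU form)
   conj_y != 0: A_i := A_i + alpha * x_i * y_i^H   (GERC form)
   x, y, and A each hold batch_count pointers, one per instance. */
void ref_cger_batched(int conj_y, int m, int n, cfloat alpha,
                      const cfloat *const x[], int incx,
                      const cfloat *const y[], int incy,
                      cfloat *const A[], int lda, int batch_count)
{
    assert(m >= 0 && n >= 0 && lda >= m && incx > 0 && incy > 0);
    for (int b = 0; b < batch_count; ++b)
        for (int j = 0; j < n; ++j) {
            cfloat yj = y[b][(size_t)j * incy];
            if (conj_y)
                yj.im = -yj.im; /* conjugate for the y^H form */
            for (int i = 0; i < m; ++i) {
                cfloat t = cmul(cmul(alpha, x[b][(size_t)i * incx]), yj);
                A[b][i + (size_t)j * lda].re += t.re;
                A[b][i + (size_t)j * lda].im += t.im;
            }
        }
}
```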

rocblas_status
rocblas_sger_strided_batched
(rocblas_handle handle, rocblas_int m, rocblas_int n, const float *alpha, const float *x, rocblas_int incx, rocblas_stride stridex, const float *y, rocblas_int incy, rocblas_stride stridey, float *A, rocblas_int lda, rocblas_stride strideA, rocblas_int batch_count)¶

rocblas_status
rocblas_dger_strided_batched
(rocblas_handle handle, rocblas_int m, rocblas_int n, const double *alpha, const double *x, rocblas_int incx, rocblas_stride stridex, const double *y, rocblas_int incy, rocblas_stride stridey, double *A, rocblas_int lda, rocblas_stride strideA, rocblas_int batch_count)¶

rocblas_status
rocblas_cgeru_strided_batched
(rocblas_handle handle, rocblas_int m, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_float_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_float_complex *A, rocblas_int lda, rocblas_stride strideA, rocblas_int batch_count)¶

rocblas_status
rocblas_zgeru_strided_batched
(rocblas_handle handle, rocblas_int m, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_double_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_double_complex *A, rocblas_int lda, rocblas_stride strideA, rocblas_int batch_count)¶

rocblas_status
rocblas_cgerc_strided_batched
(rocblas_handle handle, rocblas_int m, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_float_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_float_complex *A, rocblas_int lda, rocblas_stride strideA, rocblas_int batch_count)¶

rocblas_status
rocblas_zgerc_strided_batched
(rocblas_handle handle, rocblas_int m, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_double_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_double_complex *A, rocblas_int lda, rocblas_stride strideA, rocblas_int batch_count)¶ BLAS Level 2 API.
xGER_STRIDED_BATCHED, xGERU_STRIDED_BATCHED, and xGERC_STRIDED_BATCHED perform a batch of the matrix-vector operations
A_i := A_i + alpha*x_i*y_i**T, OR A_i := A_i + alpha*x_i*y_i**H for xGERC_STRIDED_BATCHED
where (A_i, x_i, y_i) is the ith instance of the batch. alpha is a scalar, x_i and y_i are vectors and A_i is an m by n matrix, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
m – [in] [rocblas_int] the number of rows of each matrix A_i.
n – [in] [rocblas_int] the number of columns of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer to the first vector (x_1) in the batch.
incx – [in] [rocblas_int] specifies the increment for the elements of each vector x_i.
stridex – [in] [rocblas_stride] stride from the start of one vector (x_i) to the next one (x_i+1). There are no restrictions placed on stridex; however, the user should take care to ensure that stridex is of appropriate size. For a typical case this means stridex >= m * incx.
y – [in] device pointer to the first vector (y_1) in the batch.
incy – [in] [rocblas_int] specifies the increment for the elements of each vector y_i.
stridey – [in] [rocblas_stride] stride from the start of one vector (y_i) to the next one (y_i+1). There are no restrictions placed on stridey; however, the user should take care to ensure that stridey is of appropriate size. For a typical case this means stridey >= n * incy.
A – [inout] device pointer to the first matrix (A_1) in the batch.
lda – [in] [rocblas_int] specifies the leading dimension of each A_i.
strideA – [in] [rocblas_stride] stride from the start of one matrix (A_i) and the next one (A_i+1)
batch_count – [in] [rocblas_int] number of instances in the batch
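In the strided_batched form, instance i is located by adding i times the stride to the base pointer. The sketch below shows that addressing on the host. ref_sger_strided_batched is a hypothetical helper, not part of rocBLAS, and it assumes column-major storage, positive increments, and strides large enough that instances do not overlap (for example strideA >= lda * n).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side reference: instance b uses x + b*stridex, y + b*stridey,
   and A + b*strideA as its base pointers. */
void ref_sger_strided_batched(int m, int n, float alpha,
                              const float *x, int incx, int64_t stridex,
                              const float *y, int incy, int64_t stridey,
                              float *A, int lda, int64_t strideA,
                              int batch_count)
{
    assert(m >= 0 && n >= 0 && lda >= m && incx > 0 && incy > 0);
    for (int b = 0; b < batch_count; ++b) {
        const float *xb = x + b * stridex;
        const float *yb = y + b * stridey;
        float *Ab = A + b * strideA;
        for (int j = 0; j < n; ++j)
            for (int i = 0; i < m; ++i)
                Ab[i + (size_t)j * lda] +=
                    alpha * xb[(size_t)i * incx] * yb[(size_t)j * incy];
    }
}
```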
rocblas_Xsbmv + batched, strided_batched¶

rocblas_status
rocblas_ssbmv
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, rocblas_int k, const float *alpha, const float *A, rocblas_int lda, const float *x, rocblas_int incx, const float *beta, float *y, rocblas_int incy)¶

rocblas_status
rocblas_dsbmv
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, rocblas_int k, const double *alpha, const double *A, rocblas_int lda, const double *x, rocblas_int incx, const double *beta, double *y, rocblas_int incy)¶ BLAS Level 2 API.
xSBMV performs the matrix-vector operation:
y := alpha*A*x + beta*y,
where alpha and beta are scalars, x and y are n element vectors and A should contain an upper or lower triangular n by n symmetric banded matrix.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
uplo – [in] [rocblas_fill] specifies whether the upper ‘rocblas_fill_upper’ or lower ‘rocblas_fill_lower’ triangular part of A is referenced. If rocblas_fill_upper, the lower part of A is not referenced. If rocblas_fill_lower, the upper part of A is not referenced.
n – [in] [rocblas_int] the number of rows and columns of matrix A.
k – [in] [rocblas_int] specifies the number of sub- and super-diagonals.
alpha – [in] specifies the scalar alpha
A – [in] pointer storing matrix A on the GPU
lda – [in] [rocblas_int] specifies the leading dimension of matrix A
x – [in] pointer storing vector x on the GPU
incx – [in] [rocblas_int] specifies the increment for the elements of x
beta – [in] specifies the scalar beta
y – [inout] pointer storing vector y on the GPU
incy – [in] [rocblas_int] specifies the increment for the elements of y
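The banded layout of A is not spelled out above. The host-side sketch below assumes the reference BLAS band layout, which should be confirmed against the rocBLAS version in use. ref_ssbmv is a hypothetical helper, not a rocBLAS function, and it assumes host memory and positive increments.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side reference for y := alpha*A*x + beta*y, A symmetric banded.
   ASSUMPTION: reference-BLAS band layout with lda >= k + 1:
     upper != 0: A(i, j), j-k <= i <= j, is at AB[(k + i - j) + j*lda]
     upper == 0: A(i, j), j <= i <= j+k, is at AB[(i - j) + j*lda] */
void ref_ssbmv(int upper, int n, int k, float alpha,
               const float *AB, int lda,
               const float *x, int incx,
               float beta, float *y, int incy)
{
    assert(n >= 0 && k >= 0 && lda >= k + 1 && incx > 0 && incy > 0);
    for (int i = 0; i < n; ++i)
        y[(size_t)i * incy] = (beta == 0.0f) ? 0.0f : beta * y[(size_t)i * incy];
    for (int j = 0; j < n; ++j) {
        int first = (j - k < 0) ? 0 : j - k;
        for (int i = first; i <= j; ++i) {
            /* entry (i, j) with i <= j, read from the stored triangle */
            float a = upper ? AB[(k + i - j) + (size_t)j * lda]
                            : AB[(j - i) + (size_t)i * lda];
            y[(size_t)i * incy] += alpha * a * x[(size_t)j * incx];
            if (i != j) /* mirrored entry (j, i) */
                y[(size_t)j * incy] += alpha * a * x[(size_t)i * incx];
        }
    }
}
```

Only the k + 1 stored diagonals are read, so the cost is proportional to n * (k + 1) rather than n * n.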

rocblas_status
rocblas_ssbmv_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, rocblas_int k, const float *alpha, const float *const A[], rocblas_int lda, const float *const x[], rocblas_int incx, const float *beta, float *const y[], rocblas_int incy, rocblas_int batch_count)¶ BLAS Level 2 API.
xSBMV_batched performs the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i,
where (A_i, x_i, y_i) is the ith instance of the batch. alpha and beta are scalars, x_i and y_i are vectors and A_i is an n by n symmetric banded matrix, for i = 1, …, batch_count. A should contain an upper or lower triangular n by n symmetric banded matrix.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue
uplo – [in] [rocblas_fill] specifies whether the upper ‘rocblas_fill_upper’ or lower ‘rocblas_fill_lower’ triangular part of each A_i is referenced. If rocblas_fill_upper, the lower part of A_i is not referenced. If rocblas_fill_lower, the upper part of A_i is not referenced.
n – [in] [rocblas_int] number of rows and columns of each matrix A_i
k – [in] [rocblas_int] specifies the number of sub- and super-diagonals
alpha – [in] device pointer or host pointer to scalar alpha
A – [in] device array of device pointers storing each matrix A_i
lda – [in] [rocblas_int] specifies the leading dimension of each matrix A_i
x – [in] device array of device pointers storing each vector x_i
incx – [in] [rocblas_int] specifies the increment for the elements of each vector x_i
beta – [in] device pointer or host pointer to scalar beta
y – [inout] device array of device pointers storing each vector y_i
incy – [in] [rocblas_int] specifies the increment for the elements of each vector y_i
batch_count – [in] [rocblas_int] number of instances in the batch

rocblas_status
rocblas_dsbmv_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, rocblas_int k, const double *alpha, const double *const A[], rocblas_int lda, const double *const x[], rocblas_int incx, const double *beta, double *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_ssbmv_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, rocblas_int k, const float *alpha, const float *A, rocblas_int lda, rocblas_stride strideA, const float *x, rocblas_int incx, rocblas_stride stridex, const float *beta, float *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_dsbmv_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, rocblas_int k, const double *alpha, const double *A, rocblas_int lda, rocblas_stride strideA, const double *x, rocblas_int incx, rocblas_stride stridex, const double *beta, double *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶ BLAS Level 2 API.
xSBMV_strided_batched performs the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i,
where (A_i, x_i, y_i) is the ith instance of the batch. alpha and beta are scalars, x_i and y_i are vectors and A_i is an n by n symmetric banded matrix, for i = 1, …, batch_count. A should contain an upper or lower triangular n by n symmetric banded matrix.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue
uplo – [in] [rocblas_fill] specifies whether the upper ‘rocblas_fill_upper’ or lower ‘rocblas_fill_lower’ triangular part of each A_i is referenced. If rocblas_fill_upper, the lower part of A_i is not referenced. If rocblas_fill_lower, the upper part of A_i is not referenced.
n – [in] [rocblas_int] number of rows and columns of each matrix A_i
k – [in] [rocblas_int] specifies the number of sub- and super-diagonals
alpha – [in] device pointer or host pointer to scalar alpha
A – [in] Device pointer to the first matrix A_1 on the GPU
lda – [in] [rocblas_int] specifies the leading dimension of each matrix A_i
strideA – [in] [rocblas_stride] stride from the start of one matrix (A_i) and the next one (A_i+1)
x – [in] Device pointer to the first vector x_1 on the GPU
incx – [in] [rocblas_int] specifies the increment for the elements of each vector x_i
stridex – [in] [rocblas_stride] stride from the start of one vector (x_i) and the next one (x_i+1). There are no restrictions placed on stridex, however the user should take care to ensure that stridex is of appropriate size. This typically means stridex >= n * incx. stridex should be non zero.
beta – [in] device pointer or host pointer to scalar beta
y – [inout] Device pointer to the first vector y_1 on the GPU
incy – [in] [rocblas_int] specifies the increment for the elements of each vector y_i
stridey – [in] [rocblas_stride] stride from the start of one vector (y_i) and the next one (y_i+1). There are no restrictions placed on stridey, however the user should take care to ensure that stridey is of appropriate size. This typically means stridey >= n * incy. stridey should be non zero.
batch_count – [in] [rocblas_int] number of instances in the batch
rocblas_Xspmv + batched, strided_batched¶

rocblas_status
rocblas_sspmv
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *A, const float *x, rocblas_int incx, const float *beta, float *y, rocblas_int incy)¶

rocblas_status
rocblas_dspmv
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *A, const double *x, rocblas_int incx, const double *beta, double *y, rocblas_int incy)¶ BLAS Level 2 API.
xSPMV performs the matrix-vector operation:
y := alpha*A*x + beta*y,
where alpha and beta are scalars, x and y are n element vectors and A should contain an upper or lower triangular n by n packed symmetric matrix.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
uplo – [in] [rocblas_fill] specifies whether the upper ‘rocblas_fill_upper’ or lower ‘rocblas_fill_lower’ triangular part of A is referenced. If rocblas_fill_upper, the lower part of A is not referenced. If rocblas_fill_lower, the upper part of A is not referenced.
n – [in] [rocblas_int] the number of rows and columns of matrix A.
alpha – [in] specifies the scalar alpha
A – [in] pointer storing matrix A on the GPU
x – [in] pointer storing vector x on the GPU
incx – [in] [rocblas_int] specifies the increment for the elements of x
beta – [in] specifies the scalar beta
y – [inout] pointer storing vector y on the GPU
incy – [in] [rocblas_int] specifies the increment for the elements of y
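A host-side sketch of the packed symmetric product may help when checking results. It assumes the column-by-column packed layout used by the reference BLAS (0-based index i + j*(j+1)/2 for the upper triangle, i + j*(2n-j-1)/2 for the lower triangle), host memory, and positive increments. ref_sspmv is a hypothetical helper, not a rocBLAS function.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side reference for y := alpha*A*x + beta*y, A symmetric packed.
   ASSUMPTION: column-by-column packed storage, 0-based:
     upper != 0: A(i, j), i <= j, is at AP[i + j*(j+1)/2]
     upper == 0: A(i, j), i >= j, is at AP[i + j*(2n-j-1)/2] */
void ref_sspmv(int upper, int n, float alpha, const float *AP,
               const float *x, int incx,
               float beta, float *y, int incy)
{
    assert(n >= 0 && incx > 0 && incy > 0);
    for (int i = 0; i < n; ++i)
        y[(size_t)i * incy] = (beta == 0.0f) ? 0.0f : beta * y[(size_t)i * incy];
    for (int j = 0; j < n; ++j)
        for (int i = 0; i <= j; ++i) {
            /* entry (i, j) with i <= j; the lower layout stores it as (j, i) */
            float a = upper ? AP[i + (size_t)j * (j + 1) / 2]
                            : AP[j + (size_t)i * (2 * n - i - 1) / 2];
            y[(size_t)i * incy] += alpha * a * x[(size_t)j * incx];
            if (i != j)
                y[(size_t)j * incy] += alpha * a * x[(size_t)i * incx];
        }
}
```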

rocblas_status
rocblas_sspmv_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *const A[], const float *const x[], rocblas_int incx, const float *beta, float *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_dspmv_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *const A[], const double *const x[], rocblas_int incx, const double *beta, double *const y[], rocblas_int incy, rocblas_int batch_count)¶ BLAS Level 2 API.
xSPMV_batched performs the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i,
where (A_i, x_i, y_i) is the ith instance of the batch. alpha and beta are scalars, x_i and y_i are vectors and A_i is an n by n symmetric matrix, for i = 1, …, batch_count. A should contain an upper or lower triangular n by n packed symmetric matrix.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue
uplo – [in] [rocblas_fill] specifies whether the upper ‘rocblas_fill_upper’ or lower ‘rocblas_fill_lower’ triangular part of each A_i is referenced. If rocblas_fill_upper, the lower part of A_i is not referenced. If rocblas_fill_lower, the upper part of A_i is not referenced.
n – [in] [rocblas_int] number of rows and columns of each matrix A_i
alpha – [in] device pointer or host pointer to scalar alpha
A – [in] device array of device pointers storing each matrix A_i
x – [in] device array of device pointers storing each vector x_i
incx – [in] [rocblas_int] specifies the increment for the elements of each vector x_i
beta – [in] device pointer or host pointer to scalar beta
y – [inout] device array of device pointers storing each vector y_i
incy – [in] [rocblas_int] specifies the increment for the elements of each vector y_i
batch_count – [in] [rocblas_int] number of instances in the batch

rocblas_status
rocblas_sspmv_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *A, rocblas_stride strideA, const float *x, rocblas_int incx, rocblas_stride stridex, const float *beta, float *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_dspmv_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *A, rocblas_stride strideA, const double *x, rocblas_int incx, rocblas_stride stridex, const double *beta, double *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶ BLAS Level 2 API.
xSPMV_strided_batched performs the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i,
where (A_i, x_i, y_i) is the ith instance of the batch. alpha and beta are scalars, x_i and y_i are vectors and A_i is an n by n symmetric matrix, for i = 1, …, batch_count. A should contain an upper or lower triangular n by n packed symmetric matrix.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue
uplo – [in] [rocblas_fill] specifies whether the upper ‘rocblas_fill_upper’ or lower ‘rocblas_fill_lower’ triangular part of each A_i is referenced. If rocblas_fill_upper, the lower part of A_i is not referenced. If rocblas_fill_lower, the upper part of A_i is not referenced.
n – [in] [rocblas_int] number of rows and columns of each matrix A_i
alpha – [in] device pointer or host pointer to scalar alpha
A – [in] Device pointer to the first matrix A_1 on the GPU
strideA – [in] [rocblas_stride] stride from the start of one matrix (A_i) and the next one (A_i+1)
x – [in] Device pointer to the first vector x_1 on the GPU
incx – [in] [rocblas_int] specifies the increment for the elements of each vector x_i
stridex – [in] [rocblas_stride] stride from the start of one vector (x_i) and the next one (x_i+1). There are no restrictions placed on stridex, however the user should take care to ensure that stridex is of appropriate size. This typically means stridex >= n * incx. stridex should be non zero.
beta – [in] device pointer or host pointer to scalar beta
y – [inout] Device pointer to the first vector y_1 on the GPU
incy – [in] [rocblas_int] specifies the increment for the elements of each vector y_i
stridey – [in] [rocblas_stride] stride from the start of one vector (y_i) and the next one (y_i+1). There are no restrictions placed on stridey, however the user should take care to ensure that stridey is of appropriate size. This typically means stridey >= n * incy. stridey should be non zero.
batch_count – [in] [rocblas_int] number of instances in the batch
rocblas_Xspr + batched, strided_batched¶

rocblas_status
rocblas_sspr
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *x, rocblas_int incx, float *AP)¶

rocblas_status
rocblas_dspr
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *x, rocblas_int incx, double *AP)¶

rocblas_status
rocblas_cspr
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *x, rocblas_int incx, rocblas_float_complex *AP)¶

rocblas_status
rocblas_zspr
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *x, rocblas_int incx, rocblas_double_complex *AP)¶ BLAS Level 2 API.
xSPR performs the matrix-vector operation
A := A + alpha*x*x**T
where alpha is a scalar, x is a vector, and A is an n by n symmetric matrix, supplied in packed form.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
uplo – [in] [rocblas_fill] specifies whether the upper ‘rocblas_fill_upper’ or lower ‘rocblas_fill_lower’ triangular part of A is supplied in AP. rocblas_fill_upper: The upper triangular part of A is supplied in AP. rocblas_fill_lower: The lower triangular part of A is supplied in AP.
n – [in] [rocblas_int] the number of rows and columns of matrix A, must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
AP – [inout] device pointer storing the packed version of the specified triangular portion of the symmetric matrix A, of at least size ((n * (n + 1)) / 2).
if uplo == rocblas_fill_upper: The upper triangular portion of the symmetric matrix A is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column, so that: AP(0) = A(0,0), AP(1) = A(0,1), AP(2) = A(1,1), etc.
Ex: (rocblas_fill_upper; n = 4)
    1 2 4 7
    2 3 5 8   –> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
    4 5 6 9
    7 8 9 0
if uplo == rocblas_fill_lower: The lower triangular portion of the symmetric matrix A is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column, so that: AP(0) = A(0,0), AP(1) = A(1,0), AP(2) = A(2,0), etc.
Ex: (rocblas_fill_lower; n = 4)
    1 2 3 4
    2 5 6 7   –> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
    3 6 8 9
    4 7 9 0
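The packing rule in the AP description can be reproduced on the host with a short sketch. pack_triangle and ref_sspr are hypothetical helpers, not rocBLAS functions. They assume a column-major full matrix, host memory, and a positive incx.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the chosen triangle of a column-major n by n matrix into AP,
   column by column. */
void pack_triangle(int upper, int n, const float *A, int lda, float *AP)
{
    assert(n >= 0 && lda >= n);
    size_t p = 0; /* next packed slot */
    for (int j = 0; j < n; ++j) {
        int first = upper ? 0 : j;
        int last = upper ? j : n - 1;
        for (int i = first; i <= last; ++i)
            AP[p++] = A[i + (size_t)j * lda];
    }
}

/* Host-side reference for A := A + alpha * x * x^T on packed storage. */
void ref_sspr(int upper, int n, float alpha,
              const float *x, int incx, float *AP)
{
    assert(n >= 0 && incx > 0);
    size_t p = 0;
    for (int j = 0; j < n; ++j) {
        int first = upper ? 0 : j;
        int last = upper ? j : n - 1;
        for (int i = first; i <= last; ++i)
            AP[p++] += alpha * x[(size_t)i * incx] * x[(size_t)j * incx];
    }
}
```

Applying pack_triangle to the two 4 by 4 example matrices yields [1, 2, 3, 4, 5, 6, 7, 8, 9, 0] in both cases.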

rocblas_status
rocblas_sspr_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *const x[], rocblas_int incx, float *const AP[], rocblas_int batch_count)¶

rocblas_status
rocblas_dspr_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *const x[], rocblas_int incx, double *const AP[], rocblas_int batch_count)¶

rocblas_status
rocblas_cspr_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *const x[], rocblas_int incx, rocblas_float_complex *const AP[], rocblas_int batch_count)¶

rocblas_status
rocblas_zspr_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *const x[], rocblas_int incx, rocblas_double_complex *const AP[], rocblas_int batch_count)¶ BLAS Level 2 API.
xSPR_BATCHED performs the matrix-vector operations
A_i := A_i + alpha*x_i*x_i**T
where alpha is a scalar, x_i is a vector, and A_i is an n by n symmetric matrix, supplied in packed form, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
uplo – [in] [rocblas_fill] specifies whether the upper ‘rocblas_fill_upper’ or lower ‘rocblas_fill_lower’ triangular part of each A_i is supplied in AP. rocblas_fill_upper: The upper triangular part of each A_i is supplied in AP. rocblas_fill_lower: The lower triangular part of each A_i is supplied in AP.
n – [in] [rocblas_int] the number of rows and columns of each matrix A_i, must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i.
AP – [inout] device array of device pointers storing the packed version of the specified triangular portion of each symmetric matrix A_i, each of at least size ((n * (n + 1)) / 2). The array is of at least size batch_count.
if uplo == rocblas_fill_upper: The upper triangular portion of each symmetric matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column, so that: AP(0) = A(0,0), AP(1) = A(0,1), AP(2) = A(1,1), etc.
Ex: (rocblas_fill_upper; n = 4)
    1 2 4 7
    2 3 5 8   –> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
    4 5 6 9
    7 8 9 0
if uplo == rocblas_fill_lower: The lower triangular portion of each symmetric matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column, so that: AP(0) = A(0,0), AP(1) = A(1,0), AP(2) = A(2,0), etc.
Ex: (rocblas_fill_lower; n = 4)
    1 2 3 4
    2 5 6 7   –> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
    3 6 8 9
    4 7 9 0
batch_count – [in] [rocblas_int] number of instances in the batch.

rocblas_status
rocblas_sspr_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *x, rocblas_int incx, rocblas_stride stride_x, float *AP, rocblas_stride stride_A, rocblas_int batch_count)¶

rocblas_status
rocblas_dspr_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *x, rocblas_int incx, rocblas_stride stride_x, double *AP, rocblas_stride stride_A, rocblas_int batch_count)¶

rocblas_status
rocblas_cspr_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_float_complex *AP, rocblas_stride stride_A, rocblas_int batch_count)¶

rocblas_status
rocblas_zspr_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_double_complex *AP, rocblas_stride stride_A, rocblas_int batch_count)¶ BLAS Level 2 API.
xSPR_STRIDED_BATCHED performs the matrix-vector operations
A_i := A_i + alpha*x_i*x_i**T
where alpha is a scalar, x_i is a vector, and A_i is an n by n symmetric matrix, supplied in packed form, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
uplo – [in] [rocblas_fill] specifies whether the upper ‘rocblas_fill_upper’ or lower ‘rocblas_fill_lower’ triangular part of each A_i is supplied in AP. rocblas_fill_upper: The upper triangular part of each A_i is supplied in AP. rocblas_fill_lower: The lower triangular part of each A_i is supplied in AP.
n – [in] [rocblas_int] the number of rows and columns of each matrix A_i, must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer pointing to the first vector (x_1).
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i.
stride_x – [in] [rocblas_stride] stride from the start of one vector (x_i) and the next one (x_i+1).
AP – [inout] device pointer storing the packed version of the specified triangular portion of each symmetric matrix A_i. Points to the first A_1.
if uplo == rocblas_fill_upper: The upper triangular portion of each symmetric matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column, so that: AP(0) = A(0,0), AP(1) = A(0,1), AP(2) = A(1,1), etc.
Ex: (rocblas_fill_upper; n = 4)
    1 2 4 7
    2 3 5 8   –> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
    4 5 6 9
    7 8 9 0
if uplo == rocblas_fill_lower: The lower triangular portion of each symmetric matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column, so that: AP(0) = A(0,0), AP(1) = A(1,0), AP(2) = A(2,0), etc.
Ex: (rocblas_fill_lower; n = 4)
    1 2 3 4
    2 5 6 7   –> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
    3 6 8 9
    4 7 9 0
stride_A – [in] [rocblas_stride] stride from the start of one (A_i) and the next (A_i+1)
batch_count – [in] [rocblas_int] number of instances in the batch.
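For the strided form, each packed triangle occupies n * (n + 1) / 2 elements, so a stride_A of at least that size keeps instances from overlapping. That bound is an inference from the AP size given for the non-batched form, not a documented requirement. The host-side sketch below uses two hypothetical helpers, packed_size and ref_sspr_strided_batched, and assumes host memory and a positive incx.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of elements in one packed n by n triangle. */
int64_t packed_size(int n)
{
    return (int64_t)n * (n + 1) / 2;
}

/* Host-side reference for A_i := A_i + alpha * x_i * x_i^T, with
   instance b at x + b*stride_x and AP + b*stride_A. */
void ref_sspr_strided_batched(int upper, int n, float alpha,
                              const float *x, int incx, int64_t stride_x,
                              float *AP, int64_t stride_A, int batch_count)
{
    assert(n >= 0 && incx > 0);
    /* ASSUMPTION: instances must not overlap */
    assert(batch_count <= 1 || stride_A >= packed_size(n));
    for (int b = 0; b < batch_count; ++b) {
        const float *xb = x + b * stride_x;
        float *APb = AP + b * stride_A;
        size_t p = 0; /* next packed slot of this instance */
        for (int j = 0; j < n; ++j) {
            int first = upper ? 0 : j;
            int last = upper ? j : n - 1;
            for (int i = first; i <= last; ++i)
                APb[p++] += alpha * xb[(size_t)i * incx] * xb[(size_t)j * incx];
        }
    }
}
```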
rocblas_Xspr2 + batched, strided_batched¶

rocblas_status
rocblas_sspr2
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *x, rocblas_int incx, const float *y, rocblas_int incy, float *AP)¶

rocblas_status
rocblas_dspr2
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *x, rocblas_int incx, const double *y, rocblas_int incy, double *AP)¶ BLAS Level 2 API.
xSPR2 performs the matrix-vector operation
A := A + alpha*x*y**T + alpha*y*x**T
where alpha is a scalar, x and y are vectors, and A is an n by n symmetric matrix, supplied in packed form.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
uplo – [in] [rocblas_fill] specifies whether the upper ‘rocblas_fill_upper’ or lower ‘rocblas_fill_lower’ triangular part of A is supplied in AP. rocblas_fill_upper: The upper triangular part of A is supplied in AP. rocblas_fill_lower: The lower triangular part of A is supplied in AP.
n – [in] [rocblas_int] the number of rows and columns of matrix A, must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
y – [in] device pointer storing vector y.
incy – [in] [rocblas_int] specifies the increment for the elements of y.
AP – [inout] device pointer storing the packed version of the specified triangular portion of the symmetric matrix A, of at least size ((n * (n + 1)) / 2).
if uplo == rocblas_fill_upper: The upper triangular portion of the symmetric matrix A is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column, so that: AP(0) = A(0,0), AP(1) = A(0,1), AP(2) = A(1,1), etc.
Ex: (rocblas_fill_upper; n = 4)
    1 2 4 7
    2 3 5 8   –> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
    4 5 6 9
    7 8 9 0
if uplo == rocblas_fill_lower: The lower triangular portion of the symmetric matrix A is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column, so that: AP(0) = A(0,0), AP(1) = A(1,0), AP(n) = A(1,1), etc.
Ex: (rocblas_fill_lower; n = 4)
    1 2 3 4
    2 5 6 7   –> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
    3 6 8 9
    4 7 9 0
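The symmetric rank-2 update touches only the stored triangle, adding alpha * (x_i * y_j + y_i * x_j) to each packed entry (i, j). A host-side sketch, assuming column-by-column packed storage, host memory, and positive increments, is shown below. ref_sspr2 is a hypothetical helper, not a rocBLAS function.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side reference for A := A + alpha*x*y^T + alpha*y*x^T on
   packed storage, walking AP column by column. */
void ref_sspr2(int upper, int n, float alpha,
               const float *x, int incx,
               const float *y, int incy, float *AP)
{
    assert(n >= 0 && incx > 0 && incy > 0);
    size_t p = 0; /* next packed slot */
    for (int j = 0; j < n; ++j) {
        int first = upper ? 0 : j;
        int last = upper ? j : n - 1;
        float xj = x[(size_t)j * incx];
        float yj = y[(size_t)j * incy];
        for (int i = first; i <= last; ++i) {
            float xi = x[(size_t)i * incx];
            float yi = y[(size_t)i * incy];
            AP[p++] += alpha * (xi * yj + yi * xj);
        }
    }
}
```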

rocblas_status
rocblas_sspr2_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *const x[], rocblas_int incx, const float *const y[], rocblas_int incy, float *const AP[], rocblas_int batch_count)¶

rocblas_status
rocblas_dspr2_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *const x[], rocblas_int incx, const double *const y[], rocblas_int incy, double *const AP[], rocblas_int batch_count)¶ BLAS Level 2 API.
xSPR2_BATCHED performs the matrix-vector operation
A_i := A_i + alpha*x_i*y_i**T + alpha*y_i*x_i**T
where alpha is a scalar, x_i and y_i are vectors, and A_i is an n by n symmetric matrix, supplied in packed form, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
uplo – [in] [rocblas_fill] specifies whether the upper ‘rocblas_fill_upper’ or lower ‘rocblas_fill_lower’ triangular part of each A_i is supplied in AP. rocblas_fill_upper: The upper triangular part of each A_i is supplied in AP. rocblas_fill_lower: The lower triangular part of each A_i is supplied in AP.
n – [in] [rocblas_int] the number of rows and columns of each matrix A_i, must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i.
y – [in] device array of device pointers storing each vector y_i.
incy – [in] [rocblas_int] specifies the increment for the elements of each y_i.
AP – [inout] device array of device pointers storing the packed version of the specified triangular portion of each symmetric matrix A_i, each of at least size ((n * (n + 1)) / 2). The array is of at least size batch_count.
if uplo == rocblas_fill_upper: The upper triangular portion of each symmetric matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column, so that: AP(0) = A(0,0), AP(1) = A(0,1), AP(2) = A(1,1), etc.
Ex: (rocblas_fill_upper; n = 4)
    1 2 4 7
    2 3 5 8   –> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
    4 5 6 9
    7 8 9 0
if uplo == rocblas_fill_lower: The lower triangular portion of each symmetric matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column, so that: AP(0) = A(0,0), AP(1) = A(1,0), AP(n) = A(1,1), etc.
Ex: (rocblas_fill_lower; n = 4)
    1 2 3 4
    2 5 6 7   –> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
    3 6 8 9
    4 7 9 0
batch_count – [in] [rocblas_int] number of instances in the batch.
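The packed layout described for AP can be restated as an index mapping. The following host-side sketch is only an illustration: the helper names packed_index_upper and packed_index_lower are hypothetical and are not part of rocBLAS. They return the 0-based position in AP of element A(i, j), following the column-by-column compaction shown in the n = 4 examples.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a rocBLAS function).
   Upper triangle: requires i <= j. Column j starts at j*(j+1)/2,
   because the preceding columns hold 1 + 2 + ... + j elements. */
size_t packed_index_upper(size_t i, size_t j)
{
    assert(i <= j);
    return j * (j + 1) / 2 + i;
}

/* Hypothetical helper (not a rocBLAS function).
   Lower triangle: requires j <= i < n. Column j holds n - j elements,
   so column j starts at j*n - j*(j-1)/2 and A(i, j) sits (i - j)
   elements further on; the sum simplifies to the expression below. */
size_t packed_index_lower(size_t n, size_t i, size_t j)
{
    assert(j <= i && i < n);
    return j * (2 * n - j - 1) / 2 + i;
}
```

For the n = 4 examples, packed_index_upper(1, 2) gives 4 and packed_index_lower(4, 2, 1) gives 5, which are the positions of the values 5 and 6 in the packed arrays.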

rocblas_status
rocblas_sspr2_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *x, rocblas_int incx, rocblas_stride stride_x, const float *y, rocblas_int incy, rocblas_stride stride_y, float *AP, rocblas_stride stride_A, rocblas_int batch_count)¶

rocblas_status
rocblas_dspr2_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *x, rocblas_int incx, rocblas_stride stride_x, const double *y, rocblas_int incy, rocblas_stride stride_y, double *AP, rocblas_stride stride_A, rocblas_int batch_count)¶ BLAS Level 2 API.
xSPR2_STRIDED_BATCHED performs the matrix-vector operation
A_i := A_i + alpha*x_i*y_i**T + alpha*y_i*x_i**T
where alpha is a scalar, x_i and y_i are vectors, and A_i is an n by n symmetric matrix, supplied in packed form, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
uplo – [in] [rocblas_fill] specifies whether the upper (‘rocblas_fill_upper’) or lower (‘rocblas_fill_lower’) triangular part of each A_i is supplied in AP. rocblas_fill_upper: The upper triangular part of each A_i is supplied in AP. rocblas_fill_lower: The lower triangular part of each A_i is supplied in AP.
n – [in] [rocblas_int] the number of rows and columns of each matrix A_i, must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer pointing to the first vector (x_1).
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i.
stride_x – [in] [rocblas_stride] stride from the start of one vector (x_i) to the start of the next one (x_i+1).
y – [in] device pointer pointing to the first vector (y_1).
incy – [in] [rocblas_int] specifies the increment for the elements of each y_i.
stride_y – [in] [rocblas_stride] stride from the start of one vector (y_i) to the start of the next one (y_i+1).
AP – [inout] device pointer storing the packed version of the specified triangular portion of each symmetric matrix A_i. Points to the first A_1.
if uplo == rocblas_fill_upper: The upper triangular portion of each symmetric matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column by column, so that AP(0) = A(0,0), AP(1) = A(0,1), AP(2) = A(1,1), etc.
Ex: (rocblas_fill_upper; n = 4)
    1 2 4 7
    2 3 5 8   –> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
    4 5 6 9
    7 8 9 0
if uplo == rocblas_fill_lower: The lower triangular portion of each symmetric matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column by column, so that AP(0) = A(0,0), AP(1) = A(1,0), AP(n) = A(1,1), etc.
Ex: (rocblas_fill_lower; n = 4)
    1 2 3 4
    2 5 6 7   –> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
    3 6 8 9
    4 7 9 0
stride_A – [in] [rocblas_stride] stride from the start of one matrix (A_i) to the start of the next one (A_i+1).
batch_count – [in] [rocblas_int] number of instances in the batch.
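To make the roles of stride_x, stride_y and stride_A concrete, the sketch below computes the same rank-2 update on the host for the upper-packed case. It is a reference loop under stated assumptions, not the rocBLAS implementation: ref_dspr2_upper_strided_batched is a hypothetical name, alpha is passed by value, incx and incy are assumed positive, and the batch index runs from 0 rather than 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side reference (not a rocBLAS function).
   Instance b starts at x + b*stride_x, y + b*stride_y and AP + b*stride_A.
   Only rocblas_fill_upper packing is handled in this sketch. */
void ref_dspr2_upper_strided_batched(int n, double alpha,
                                     const double *x, int incx, int64_t stride_x,
                                     const double *y, int incy, int64_t stride_y,
                                     double *AP, int64_t stride_A, int batch_count)
{
    assert(n >= 0 && incx > 0 && incy > 0 && batch_count >= 0);
    for (int b = 0; b < batch_count; ++b) {
        const double *xb = x + (int64_t)b * stride_x;
        const double *yb = y + (int64_t)b * stride_y;
        double *APb = AP + (int64_t)b * stride_A;
        size_t k = 0; /* running position in the packed upper triangle */
        for (int j = 0; j < n; ++j) {
            for (int i = 0; i <= j; ++i, ++k) {
                double xi = xb[(size_t)i * incx], xj = xb[(size_t)j * incx];
                double yi = yb[(size_t)i * incy], yj = yb[(size_t)j * incy];
                /* Entry (i, j) of alpha*x*y**T + alpha*y*x**T. */
                APb[k] += alpha * (xi * yj + yi * xj);
            }
        }
    }
}
```

With contiguous storage, a natural choice in this sketch is stride_A = n * (n + 1) / 2, the number of elements in one packed triangle.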
rocblas_Xsymv + batched, strided_batched¶

rocblas_status
rocblas_ssymv
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *A, rocblas_int lda, const float *x, rocblas_int incx, const float *beta, float *y, rocblas_int incy)¶

rocblas_status
rocblas_dsymv
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *A, rocblas_int lda, const double *x, rocblas_int incx, const double *beta, double *y, rocblas_int incy)¶

rocblas_status
rocblas_csymv
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *A, rocblas_int lda, const rocblas_float_complex *x, rocblas_int incx, const rocblas_float_complex *beta, rocblas_float_complex *y, rocblas_int incy)¶

rocblas_status
rocblas_zsymv
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *A, rocblas_int lda, const rocblas_double_complex *x, rocblas_int incx, const rocblas_double_complex *beta, rocblas_double_complex *y, rocblas_int incy)¶ BLAS Level 2 API.
xSYMV performs the matrix-vector operation:
y := alpha*A*x + beta*y,
where alpha and beta are scalars, x and y are n-element vectors, and A is an n by n symmetric matrix of which only the upper or lower triangular part is referenced.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
uplo – [in] [rocblas_fill] specifies whether the upper (‘rocblas_fill_upper’) or lower (‘rocblas_fill_lower’) triangular part of A is referenced. If rocblas_fill_upper, the lower part of A is not referenced; if rocblas_fill_lower, the upper part of A is not referenced.
n – [in] [rocblas_int] the number of rows and columns of matrix A.
alpha – [in] device pointer or host pointer to scalar alpha
A – [in] pointer storing matrix A on the GPU
lda – [in] [rocblas_int] specifies the leading dimension of A
x – [in] pointer storing vector x on the GPU
incx – [in] [rocblas_int] specifies the increment for the elements of x
beta – [in] device pointer or host pointer to scalar beta
y – [inout] pointer storing vector y on the GPU
incy – [in] [rocblas_int] specifies the increment for the elements of y
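The effect of uplo on which entries of A are read can be sketched with a host-side reference. This is an illustration under stated assumptions rather than the library's implementation: ref_dsymv is a hypothetical name, A is column-major with leading dimension lda, alpha and beta are passed by value, and incx and incy are assumed positive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side reference (not a rocBLAS function) for
   y := alpha*A*x + beta*y. When upper is nonzero only entries with
   row <= column are read; otherwise only entries with row >= column. */
void ref_dsymv(int upper, int n, double alpha, const double *A, int lda,
               const double *x, int incx, double beta, double *y, int incy)
{
    assert(n >= 0 && lda >= (n > 1 ? n : 1) && incx > 0 && incy > 0);
    for (int i = 0; i < n; ++i) {
        double sum = 0.0;
        for (int j = 0; j < n; ++j) {
            /* A(i, j) == A(j, i), so redirect to the stored triangle. */
            int r = i, c = j;
            if ((upper && r > c) || (!upper && r < c)) {
                r = j;
                c = i;
            }
            sum += A[(size_t)c * lda + r] * x[(size_t)j * incx];
        }
        y[(size_t)i * incy] = alpha * sum + beta * y[(size_t)i * incy];
    }
}
```

Because the opposite triangle is never read, it may hold arbitrary values without affecting the result.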

rocblas_status
rocblas_ssymv_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *const A[], rocblas_int lda, const float *const x[], rocblas_int incx, const float *beta, float *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_dsymv_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *const A[], rocblas_int lda, const double *const x[], rocblas_int incx, const double *beta, double *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_csymv_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *const A[], rocblas_int lda, const rocblas_float_complex *const x[], rocblas_int incx, const rocblas_float_complex *beta, rocblas_float_complex *const y[], rocblas_int incy, rocblas_int batch_count)¶

rocblas_status
rocblas_zsymv_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *const A[], rocblas_int lda, const rocblas_double_complex *const x[], rocblas_int incx, const rocblas_double_complex *beta, rocblas_double_complex *const y[], rocblas_int incy, rocblas_int batch_count)¶ BLAS Level 2 API.
xSYMV_batched performs the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i,
where (A_i, x_i, y_i) is the ith instance of the batch. alpha and beta are scalars, x_i and y_i are vectors and A_i is an n by n symmetric matrix, for i = 1, …, batch_count. Each A_i should contain the upper or lower triangular part of a symmetric matrix; the opposite triangular part of A_i is not referenced.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue
uplo – [in] [rocblas_fill] specifies whether the upper (‘rocblas_fill_upper’) or lower (‘rocblas_fill_lower’) triangular part of each A_i is referenced. If rocblas_fill_upper, the lower part of A_i is not referenced; if rocblas_fill_lower, the upper part of A_i is not referenced.
n – [in] [rocblas_int] number of rows and columns of each matrix A_i
alpha – [in] device pointer or host pointer to scalar alpha
A – [in] device array of device pointers storing each matrix A_i
lda – [in] [rocblas_int] specifies the leading dimension of each matrix A_i
x – [in] device array of device pointers storing each vector x_i
incx – [in] [rocblas_int] specifies the increment for the elements of each vector x_i
beta – [in] device pointer or host pointer to scalar beta
y – [inout] device array of device pointers storing each vector y_i
incy – [in] [rocblas_int] specifies the increment for the elements of each vector y_i
batch_count – [in] [rocblas_int] number of instances in the batch

rocblas_status
rocblas_ssymv_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *A, rocblas_int lda, rocblas_stride strideA, const float *x, rocblas_int incx, rocblas_stride stridex, const float *beta, float *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_dsymv_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *A, rocblas_int lda, rocblas_stride strideA, const double *x, rocblas_int incx, rocblas_stride stridex, const double *beta, double *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_csymv_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *A, rocblas_int lda, rocblas_stride strideA, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_float_complex *beta, rocblas_float_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶

rocblas_status
rocblas_zsymv_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *A, rocblas_int lda, rocblas_stride strideA, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_double_complex *beta, rocblas_double_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)¶ BLAS Level 2 API.
xSYMV_strided_batched performs the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i,
where (A_i, x_i, y_i) is the ith instance of the batch. alpha and beta are scalars, x_i and y_i are vectors and A_i is an n by n symmetric matrix, for i = 1, …, batch_count. Each A_i should contain the upper or lower triangular part of a symmetric matrix; the opposite triangular part of A_i is not referenced.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue
uplo – [in] [rocblas_fill] specifies whether the upper (‘rocblas_fill_upper’) or lower (‘rocblas_fill_lower’) triangular part of each A_i is referenced. If rocblas_fill_upper, the lower part of A_i is not referenced; if rocblas_fill_lower, the upper part of A_i is not referenced.
n – [in] [rocblas_int] number of rows and columns of each matrix A_i
alpha – [in] device pointer or host pointer to scalar alpha
A – [in] Device pointer to the first matrix A_1 on the GPU
lda – [in] [rocblas_int] specifies the leading dimension of each matrix A_i
strideA – [in] [rocblas_stride] stride from the start of one matrix (A_i) to the start of the next one (A_i+1)
x – [in] Device pointer to the first vector x_1 on the GPU
incx – [in] [rocblas_int] specifies the increment for the elements of each vector x_i
stridex – [in] [rocblas_stride] stride from the start of one vector (x_i) to the start of the next one (x_i+1). There are no restrictions placed on stridex; however, the user should take care to ensure that stridex is of appropriate size. This typically means stridex >= n * incx. stridex should be nonzero.
beta – [in] device pointer or host pointer to scalar beta
y – [inout] Device pointer to the first vector y_1 on the GPU
incy – [in] [rocblas_int] specifies the increment for the elements of each vector y_i
stridey – [in] [rocblas_stride] stride from the start of one vector (y_i) to the start of the next one (y_i+1). There are no restrictions placed on stridey; however, the user should take care to ensure that stridey is of appropriate size. This typically means stridey >= n * incy. stridey should be nonzero.
batch_count – [in] [rocblas_int] number of instances in the batch
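The guidance that stridex and stridey be "of appropriate size" can be checked on the host before launching a call. The helpers below are a minimal sketch with hypothetical names (vector_span, instances_are_disjoint); they are not part of rocBLAS. They compute the number of elements an n-element vector touches for a given increment and test whether a positive stride keeps consecutive instances from overlapping. The bound n * incx quoted above is a slightly larger, simpler choice that also satisfies this check for positive incx.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a rocBLAS function): number of elements
   between the first and last accessed element, inclusive, of an
   n-element vector with increment inc. */
int64_t vector_span(int64_t n, int64_t inc)
{
    assert(n >= 0 && inc != 0);
    int64_t step = inc < 0 ? -inc : inc;
    return n == 0 ? 0 : 1 + (n - 1) * step;
}

/* Hypothetical helper: returns 1 when a positive stride leaves no
   overlap between one instance and the next, 0 otherwise. */
int instances_are_disjoint(int64_t n, int64_t inc, int64_t stride)
{
    return stride >= vector_span(n, inc);
}
```

Overlap matters most for y, since each y_i is written; overlapping output vectors would make the result depend on the order in which instances are processed.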
rocblas_Xsyr + batched, strided_batched¶

rocblas_status
rocblas_ssyr
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *x, rocblas_int incx, float *A, rocblas_int lda)¶

rocblas_status
rocblas_dsyr
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *x, rocblas_int incx, double *A, rocblas_int lda)¶

rocblas_status
rocblas_csyr
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *x, rocblas_int incx, rocblas_float_complex *A, rocblas_int lda)¶

rocblas_status
rocblas_zsyr
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *x, rocblas_int incx, rocblas_double_complex *A, rocblas_int lda)¶ BLAS Level 2 API.
xSYR performs the matrix-vector operation
A := A + alpha*x*x**T
where alpha is a scalar, x is a vector, and A is an n by n symmetric matrix.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
uplo – [in] [rocblas_fill] specifies whether the upper (‘rocblas_fill_upper’) or lower (‘rocblas_fill_lower’) triangular part of A is referenced. If rocblas_fill_upper, the lower part of A is not referenced; if rocblas_fill_lower, the upper part of A is not referenced.
n – [in] [rocblas_int] the number of rows and columns of matrix A.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
A – [inout] device pointer storing matrix A.
lda – [in] [rocblas_int] specifies the leading dimension of A.
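The statement that the opposite triangle of A is not referenced can be seen in a host-side reference of the rank-1 update. This is a sketch under stated assumptions, not the rocBLAS implementation: ref_dsyr is a hypothetical name, A is column-major with leading dimension lda, alpha is passed by value, and incx is assumed positive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side reference (not a rocBLAS function) for
   A := A + alpha*x*x**T, touching only the triangle selected by upper. */
void ref_dsyr(int upper, int n, double alpha, const double *x, int incx,
              double *A, int lda)
{
    assert(n >= 0 && lda >= (n > 1 ? n : 1) && incx > 0);
    for (int j = 0; j < n; ++j) {
        /* Rows 0..j for the upper triangle, rows j..n-1 for the lower. */
        int first = upper ? 0 : j;
        int last = upper ? j : n - 1;
        for (int i = first; i <= last; ++i) {
            A[(size_t)j * lda + i] +=
                alpha * x[(size_t)i * incx] * x[(size_t)j * incx];
        }
    }
}
```

Entries outside the selected triangle keep whatever values they held before the call.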

rocblas_status
rocblas_ssyr_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *const x[], rocblas_int incx, float *const A[], rocblas_int lda, rocblas_int batch_count)¶

rocblas_status
rocblas_dsyr_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *const x[], rocblas_int incx, double *const A[], rocblas_int lda, rocblas_int batch_count)¶

rocblas_status
rocblas_csyr_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *const x[], rocblas_int incx, rocblas_float_complex *const A[], rocblas_int lda, rocblas_int batch_count)¶

rocblas_status
rocblas_zsyr_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *const x[], rocblas_int incx, rocblas_double_complex *const A[], rocblas_int lda, rocblas_int batch_count)¶ BLAS Level 2 API.
xSYR_batched performs a batch of matrix-vector operations
A[i] := A[i] + alpha*x[i]*x[i]**T
where alpha is a scalar, x is an array of vectors, and A is an array of n by n symmetric matrices, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
uplo – [in] [rocblas_fill] specifies whether the upper (‘rocblas_fill_upper’) or lower (‘rocblas_fill_lower’) triangular part of each A_i is referenced. If rocblas_fill_upper, the lower part of A_i is not referenced; if rocblas_fill_lower, the upper part of A_i is not referenced.
n – [in] [rocblas_int] the number of rows and columns of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i.
A – [inout] device array of device pointers storing each matrix A_i.
lda – [in] [rocblas_int] specifies the leading dimension of each A_i.
batch_count – [in] [rocblas_int] number of instances in the batch
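The batched variants take arrays of pointers, one pointer per instance, instead of a single base pointer. The host-side sketch below mirrors that calling convention for the lower-triangle case. It is illustrative only: ref_dsyr_lower_batched is a hypothetical name, the pointer arrays here live in host memory (rocBLAS expects them in device memory), alpha is passed by value, and incx is assumed positive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side reference (not a rocBLAS function) for
   A[b] := A[b] + alpha*x[b]*x[b]**T over the lower triangle.
   x[b] and A[b] are independent allocations; no stride relates them. */
void ref_dsyr_lower_batched(int n, double alpha,
                            const double *const x[], int incx,
                            double *const A[], int lda, int batch_count)
{
    assert(n >= 0 && lda >= (n > 1 ? n : 1) && incx > 0 && batch_count >= 0);
    for (int b = 0; b < batch_count; ++b) {
        const double *xb = x[b];
        double *Ab = A[b];
        for (int j = 0; j < n; ++j) {
            for (int i = j; i < n; ++i) {
                Ab[(size_t)j * lda + i] +=
                    alpha * xb[(size_t)i * incx] * xb[(size_t)j * incx];
            }
        }
    }
}
```

Unlike the strided_batched form, the instances do not need to be evenly spaced or even part of the same allocation.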

rocblas_status
rocblas_ssyr_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *x, rocblas_int incx, rocblas_stride stridex, float *A, rocblas_int lda, rocblas_stride strideA, rocblas_int batch_count)¶

rocblas_status
rocblas_dsyr_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *x, rocblas_int incx, rocblas_stride stridex, double *A, rocblas_int lda, rocblas_stride strideA, rocblas_int batch_count)¶

rocblas_status
rocblas_csyr_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_float_complex *A, rocblas_int lda, rocblas_stride strideA, rocblas_int batch_count)¶

rocblas_status
rocblas_zsyr_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_double_complex *A, rocblas_int lda, rocblas_stride strideA, rocblas_int batch_count)¶ BLAS Level 2 API.
xSYR_strided_batched performs a batch of matrix-vector operations
A[i] := A[i] + alpha*x[i]*x[i]**T
where alpha is a scalar, x is an array of vectors, and A is an array of n by n symmetric matrices, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
uplo – [in] [rocblas_fill] specifies whether the upper (‘rocblas_fill_upper’) or lower (‘rocblas_fill_lower’) triangular part of each A_i is referenced. If rocblas_fill_upper, the lower part of A_i is not referenced; if rocblas_fill_lower, the upper part of A_i is not referenced.
n – [in] [rocblas_int] the number of rows and columns of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer to the first vector x_1.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i.
stridex – [in] [rocblas_stride] specifies the pointer increment between vectors (x_i) and (x_i+1).
A – [inout] device pointer to the first matrix A_1.
lda – [in] [rocblas_int] specifies the leading dimension of each A_i.
strideA – [in] [rocblas_stride] stride from the start of one matrix (A_i) to the start of the next one (A_i+1)
batch_count – [in] [rocblas_int] number of instances in the batch
rocblas_Xsyr2 + batched, strided_batched¶

rocblas_status
rocblas_ssyr2
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *x, rocblas_int incx, const float *y, rocblas_int incy, float *A, rocblas_int lda)¶

rocblas_status
rocblas_dsyr2
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *x, rocblas_int incx, const double *y, rocblas_int incy, double *A, rocblas_int lda)¶

rocblas_status
rocblas_csyr2
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *x, rocblas_int incx, const rocblas_float_complex *y, rocblas_int incy, rocblas_float_complex *A, rocblas_int lda)¶

rocblas_status
rocblas_zsyr2
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *x, rocblas_int incx, const rocblas_double_complex *y, rocblas_int incy, rocblas_double_complex *A, rocblas_int lda)¶ BLAS Level 2 API.
xSYR2 performs the matrix-vector operation
A := A + alpha*x*y**T + alpha*y*x**T
where alpha is a scalar, x and y are vectors, and A is an n by n symmetric matrix.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
uplo – [in] [rocblas_fill] specifies whether the upper (‘rocblas_fill_upper’) or lower (‘rocblas_fill_lower’) triangular part of A is referenced. If rocblas_fill_upper, the lower part of A is not referenced; if rocblas_fill_lower, the upper part of A is not referenced.
n – [in] [rocblas_int] the number of rows and columns of matrix A.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer storing vector x.
incx – [in] [rocblas_int] specifies the increment for the elements of x.
y – [in] device pointer storing vector y.
incy – [in] [rocblas_int] specifies the increment for the elements of y.
A – [inout] device pointer storing matrix A.
lda – [in] [rocblas_int] specifies the leading dimension of A.
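The reason a single triangle is sufficient for the rank-2 update follows from the symmetry of the update term. The short derivation below uses the same symbols as the operation above; the case split on uplo restates the parameter description and is not taken from the library source.

```latex
\begin{aligned}
\left(x y^{T} + y x^{T}\right)^{T} &= y x^{T} + x y^{T}, \\
A_{ij} &\leftarrow A_{ij} + \alpha \left(x_i y_j + y_i x_j\right),
\qquad
\begin{cases}
i \le j & \text{for rocblas\_fill\_upper} \\
i \ge j & \text{for rocblas\_fill\_lower}
\end{cases}
\end{aligned}
```

Since the update term equals its own transpose, a symmetric A stays symmetric, and the entries of the unreferenced triangle are implied by those of the updated one.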

rocblas_status
rocblas_ssyr2_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *const x[], rocblas_int incx, const float *const y[], rocblas_int incy, float *const A[], rocblas_int lda, rocblas_int batch_count)¶

rocblas_status
rocblas_dsyr2_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *const x[], rocblas_int incx, const double *const y[], rocblas_int incy, double *const A[], rocblas_int lda, rocblas_int batch_count)¶

rocblas_status
rocblas_csyr2_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *const x[], rocblas_int incx, const rocblas_float_complex *const y[], rocblas_int incy, rocblas_float_complex *const A[], rocblas_int lda, rocblas_int batch_count)¶

rocblas_status
rocblas_zsyr2_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *const x[], rocblas_int incx, const rocblas_double_complex *const y[], rocblas_int incy, rocblas_double_complex *const A[], rocblas_int lda, rocblas_int batch_count)¶ BLAS Level 2 API.
xSYR2_BATCHED performs a batch of matrix-vector operations
A[i] := A[i] + alpha*x[i]*y[i]**T + alpha*y[i]*x[i]**T
where alpha is a scalar, x[i] and y[i] are vectors, and A[i] is an n by n symmetric matrix, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
uplo – [in] [rocblas_fill] specifies whether the upper (‘rocblas_fill_upper’) or lower (‘rocblas_fill_lower’) triangular part of each A_i is referenced. If rocblas_fill_upper, the lower part of A_i is not referenced; if rocblas_fill_lower, the upper part of A_i is not referenced.
n – [in] [rocblas_int] the number of rows and columns of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i.
y – [in] device array of device pointers storing each vector y_i.
incy – [in] [rocblas_int] specifies the increment for the elements of each y_i.
A – [inout] device array of device pointers storing each matrix A_i.
lda – [in] [rocblas_int] specifies the leading dimension of each A_i.
batch_count – [in] [rocblas_int] number of instances in the batch

rocblas_status
rocblas_ssyr2_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const float *alpha, const float *x, rocblas_int incx, rocblas_stride stridex, const float *y, rocblas_int incy, rocblas_stride stridey, float *A, rocblas_int lda, rocblas_stride strideA, rocblas_int batch_count)¶

rocblas_status
rocblas_dsyr2_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const double *alpha, const double *x, rocblas_int incx, rocblas_stride stridex, const double *y, rocblas_int incy, rocblas_stride stridey, double *A, rocblas_int lda, rocblas_stride strideA, rocblas_int batch_count)¶

rocblas_status
rocblas_csyr2_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_float_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_float_complex *A, rocblas_int lda, rocblas_stride strideA, rocblas_int batch_count)¶

rocblas_status
rocblas_zsyr2_strided_batched
(rocblas_handle handle, rocblas_fill uplo, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_double_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_double_complex *A, rocblas_int lda, rocblas_stride strideA, rocblas_int batch_count)¶ BLAS Level 2 API.
xSYR2_STRIDED_BATCHED performs a batch of matrix-vector operations
A[i] := A[i] + alpha*x[i]*y[i]**T + alpha*y[i]*x[i]**T
where alpha is a scalar, x[i] and y[i] are vectors, and A[i] is an n by n symmetric matrix, for i = 1, …, batch_count.
 Parameters
handle – [in] [rocblas_handle] handle to the rocblas library context queue.
uplo – [in] [rocblas_fill] specifies whether the upper (‘rocblas_fill_upper’) or lower (‘rocblas_fill_lower’) triangular part of each A_i is referenced. If rocblas_fill_upper, the lower part of A_i is not referenced; if rocblas_fill_lower, the upper part of A_i is not referenced.
n – [in] [rocblas_int] the number of rows and columns of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer to the first vector x_1.
incx – [in] [rocblas_int] specifies the increment for the elements of each x_i.
stridex – [in] [rocblas_stride] specifies the pointer increment between vectors (x_i) and (x_i+1).
y – [in] device pointer to the first vector y_1.
incy – [in] [rocblas_int] specifies the increment for the elements of each y_i.
stridey – [in] [rocblas_stride] specifies the pointer increment between vectors (y_i) and (y_i+1).
A – [inout] device pointer to the first matrix A_1.
lda – [in] [rocblas_int] specifies the leading dimension of each A_i.
strideA – [in] [rocblas_stride] stride from the start of one matrix (A_i) to the start of the next one (A_i+1)
batch_count – [in] [rocblas_int] number of instances in the batch
rocblas_Xtbmv + batched, strided_batched¶

rocblas_status
rocblas_stbmv
(rocblas_handle handle, rocblas_fill uplo, rocblas_operation trans,