rocBLAS User Guide
Contents:
- 1. Getting Started Guide
- 2. Installation and Building for Linux
- 3. Installation and Building for Windows
- 4. API Reference Guide
- 5. Using rocBLAS API
- 5.1. rocBLAS Datatypes
- 5.2. rocBLAS Enumeration
- 5.3. rocBLAS Helper functions
- 5.4. rocBLAS Level-1 functions
- 5.4.1. rocblas_iXamax + batched, strided_batched
- 5.4.2. rocblas_iXamin + batched, strided_batched
- 5.4.3. rocblas_Xasum + batched, strided_batched
- 5.4.4. rocblas_Xaxpy + batched, strided_batched
- 5.4.5. rocblas_Xcopy + batched, strided_batched
- 5.4.6. rocblas_Xdot + batched, strided_batched
- 5.4.7. rocblas_Xnrm2 + batched, strided_batched
- 5.4.8. rocblas_Xrot + batched, strided_batched
- 5.4.9. rocblas_Xrotg + batched, strided_batched
- 5.4.10. rocblas_Xrotm + batched, strided_batched
- 5.4.11. rocblas_Xrotmg + batched, strided_batched
- 5.4.12. rocblas_Xscal + batched, strided_batched
- 5.4.13. rocblas_Xswap + batched, strided_batched
- 5.5. rocBLAS Level-2 functions
- 5.5.1. rocblas_Xgbmv + batched, strided_batched
- 5.5.2. rocblas_Xgemv + batched, strided_batched
- 5.5.3. rocblas_Xger + batched, strided_batched
- 5.5.4. rocblas_Xsbmv + batched, strided_batched
- 5.5.5. rocblas_Xspmv + batched, strided_batched
- 5.5.6. rocblas_Xspr + batched, strided_batched
- 5.5.7. rocblas_Xspr2 + batched, strided_batched
- 5.5.8. rocblas_Xsymv + batched, strided_batched
- 5.5.9. rocblas_Xsyr + batched, strided_batched
- 5.5.10. rocblas_Xsyr2 + batched, strided_batched
- 5.5.11. rocblas_Xtbmv + batched, strided_batched
- 5.5.12. rocblas_Xtbsv + batched, strided_batched
- 5.5.13. rocblas_Xtpmv + batched, strided_batched
- 5.5.14. rocblas_Xtpsv + batched, strided_batched
- 5.5.15. rocblas_Xtrmv + batched, strided_batched
- 5.5.16. rocblas_Xtrsv + batched, strided_batched
- 5.5.17. rocblas_Xhemv + batched, strided_batched
- 5.5.18. rocblas_Xhbmv + batched, strided_batched
- 5.5.19. rocblas_Xhpmv + batched, strided_batched
- 5.5.20. rocblas_Xher + batched, strided_batched
- 5.5.21. rocblas_Xher2 + batched, strided_batched
- 5.5.22. rocblas_Xhpr + batched, strided_batched
- 5.5.23. rocblas_Xhpr2 + batched, strided_batched
- 5.6. rocBLAS Level-3 functions
- 5.6.1. rocblas_Xgemm + batched, strided_batched
- 5.6.2. rocblas_Xsymm + batched, strided_batched
- 5.6.3. rocblas_Xsyrk + batched, strided_batched
- 5.6.4. rocblas_Xsyr2k + batched, strided_batched
- 5.6.5. rocblas_Xsyrkx + batched, strided_batched
- 5.6.6. rocblas_Xtrmm + batched, strided_batched
- 5.6.7. rocblas_Xtrsm + batched, strided_batched
- 5.6.8. rocblas_Xhemm + batched, strided_batched
- 5.6.9. rocblas_Xherk + batched, strided_batched
- 5.6.10. rocblas_Xher2k + batched, strided_batched
- 5.6.11. rocblas_Xherkx + batched, strided_batched
- 5.6.12. rocblas_Xtrtri + batched, strided_batched
- 5.7. rocBLAS Extension
- 5.7.1. rocblas_axpy_ex + batched, strided_batched
- 5.7.2. rocblas_dot_ex + batched, strided_batched
- 5.7.3. rocblas_dotc_ex + batched, strided_batched
- 5.7.4. rocblas_nrm2_ex + batched, strided_batched
- 5.7.5. rocblas_rot_ex + batched, strided_batched
- 5.7.6. rocblas_scal_ex + batched, strided_batched
- 5.7.7. rocblas_gemm_ex + batched, strided_batched
- 5.7.8. rocblas_gemm_ext2
- 5.7.9. rocblas_trsm_ex + batched, strided_batched
- 5.7.10. rocblas_Xgeam + batched, strided_batched
- 5.7.11. rocblas_Xdgmm + batched, strided_batched
- 5.8. rocBLAS Beta Features
- 5.9. Graph Support for rocBLAS
- 5.10. Device Memory Allocation in rocBLAS
- 5.10.1. Environment Variable for Preallocating
- 5.10.2. Functions for Manually Setting Memory Size
- 5.10.3. Function for Setting User Owned Workspace
- 5.10.4. Functions for Finding How Much Memory Is Required
- 5.10.5. rocBLAS Function Return Values for Insufficient Device Memory
- 5.10.6. Stream-Ordered Memory Allocation
- 5.11. Logging in rocBLAS
- 6. Programmer’s Guide
- 6.1. Library Source Code Organization
- 6.2. Handle, Stream, and Device Management
- 6.3. Device Memory Allocation
- 6.4. Thread Safe Logging
- 6.5. rocBLAS Numerical Checking
- 6.6. rocBLAS Order of Argument Checking and Logging
- 6.6.1. Legacy BLAS
- 6.6.2. rocBLAS
- 6.6.3. rocBLAS has the Following Differences When Compared To Legacy BLAS
- 6.6.4. To Accommodate the Additions
- 6.6.5. Device Memory Size Queries
- 6.6.6. rocBLAS Control Flow
- 6.6.7. Legacy L1 BLAS “single vector”
- 6.6.8. Legacy L1 BLAS “two vector”
- 6.6.9. Legacy L2 BLAS
- 6.6.10. Legacy L3 BLAS
- 6.7. rocBLAS Benchmarking and Testing
- 7. Contributor’s Guide
- 8. Acknowledgement
- 9. Disclaimer