IRIX Base Documentation 2002 November


Text File | 2002-10-03 | 11.2 KB | 265 lines

INTRO_SCSL(3S)                                              INTRO_SCSL(3S)

NAME
     INTRO_SCSL - Introduction to Scientific Computing Software Library
     (SCSL) routines

IMPLEMENTATION
     See individual man pages for implementation details

DESCRIPTION
     The SGI Scientific Computing Software Library (SCSL) contains the
     following routines:

     * Signal processing routines (see INTRO_FFT(3S) introductory man
       page)
       - Fast Fourier Transform (FFT) routines
       - Convolution routines
       - Correlation routines
     * Direct linear equation solvers for real and complex sparse systems
       with symmetric non-zero structure, and iterative solvers for real
       sparse systems with arbitrary structure (see the INTRO_SOLVERS(3S)
       introductory man page)
     * 64-bit thread-safe parallel random number generators (see the
       SRAND64(3S) man page)
     * Vector-vector linear algebra subprograms (see INTRO_BLAS1(3S)
       introductory man page)
       - Level 1 Basic Linear Algebra Subprograms (Level 1 BLAS)
     * Matrix-vector linear algebra subprograms (see INTRO_BLAS2(3S)
       introductory man page)
       - Level 2 Basic Linear Algebra Subprograms (Level 2 BLAS)
     * Matrix-matrix linear algebra subprograms (see INTRO_BLAS3(3S)
       introductory man page)
       - Level 3 Basic Linear Algebra Subprograms (Level 3 BLAS)
     * LAPACK routines (see the INTRO_LAPACK(3S) introductory man page)

     The SCSL routines can be loaded by using the -lscs option or the
     -lscs_mp option. The -lscs_mp option directs the linker to use the
     multi-processor version of the library.
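     As an illustration of the kind of operation the BLAS levels listed
     above cover, the following C sketch spells out what a Level 2
     (matrix-vector) routine such as DGEMV computes in its no-transpose
     case, y := alpha*A*x + beta*y, with A stored column-major as in
     Fortran. It is a plain reference loop, not SCSL's implementation;
     the name ref_dgemv_n and its argument list are hypothetical and
     chosen only for this example. A real program would call the library
     routine and link with -lscs or -lscs_mp.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Reference (unoptimized, single-threaded) form of the no-transpose
 * matrix-vector update  y := alpha * A * x + beta * y.
 * A is m-by-n, stored column-major with leading dimension lda.
 * ref_dgemv_n is a hypothetical name, not an SCSL entry point.
 */
void ref_dgemv_n(int m, int n, double alpha, const double *a, int lda,
                 const double *x, double beta, double *y)
{
    assert(m >= 0 && n >= 0 && lda >= (m > 0 ? m : 1));

    /* Scale y first; beta == 0 overwrites y without reading it. */
    for (int i = 0; i < m; i++)
        y[i] = (beta == 0.0) ? 0.0 : beta * y[i];

    /* Accumulate one column of A at a time (unit stride in memory). */
    for (int j = 0; j < n; j++) {
        double t = alpha * x[j];
        for (int i = 0; i < m; i++)
            y[i] += t * a[(size_t)j * lda + i];
    }
}
```

     For a 2-by-2 matrix stored as {1, 3, 2, 4} (columns first), with
     x = {1, 1}, y = {10, 20}, alpha = 2 and beta = 0.5, the loop leaves
     y = {11, 24}.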
     The multi-processor version of SCSL, libscs_mp, is a Shared Memory
     (SMP) version that is based on libmp. libmp uses IRIX lightweight
     processes (sproc) to implement parallel execution. POSIX threads
     (pthreads) are incompatible with sproc calls. Pthreads and sproc
     calls have fundamentally different characteristics that prevent
     coexistence, such as process identity, memory, and parent-child
     relationships. Therefore, a program that uses POSIX threads cannot
     use the multi-processor version of SCSL.

     When linking to SCSL with -lscs or -lscs_mp, the default integer size
     is 4 bytes (32 bits). Another version of SCSL is available in which
     integers are 8 bytes (64 bits). This version allows the user access
     to larger memory sizes and helps when porting legacy Cray codes. It
     can be loaded by using the -lscs_i8 option or the -lscs_i8_mp option.
     A program may use only one of the two versions; 4-byte integer and
     8-byte integer library calls cannot be mixed.

NOTES
     Many of the Scientific Library routines are multitasked or
     multithreaded. This means that a program that calls a multitasked
     routine will run in parallel mode and take advantage of multiple
     processors whenever possible, even if the program has not
     specifically requested multitasking. If a program spends a
     significant percentage of its time in such a routine, this feature
     can substantially reduce wall-clock time.

     The following lists show the routines that are multitasked. In many
     cases, a real variable (single-precision) routine is paired with its
     complex variable equivalent.

     LAPACK routines are not listed.
     Most LAPACK routines do not perform multiprocessing, but almost all
     LAPACK routines call Level 2 BLAS and Level 3 BLAS that do
     multiprocessing.

     The following are the multitasked Level 2 BLAS routines:

          SGEMV     DGEMV     CGEMV     ZGEMV
          SGBMV     DGBMV     CGBMV     ZGBMV
          CHEMV     ZHEMV
          CHBMV     ZHBMV
          CHPMV     ZHPMV
          SSPMV     DSPMV
          STRSV     DTRSV     CTRSV     ZTRSV

     The following are the multitasked Level 3 BLAS routines:

          SGEMM     DGEMM     CGEMM     ZGEMM
          CGEMM3M   ZGEMM3M
          STRMM     DTRMM     ZTRMM
          STRSM     DTRSM     CTRSM     ZTRSM
          CHERK     ZHERK

     The following are the GEMM-based Level 3 BLAS:

          SSYMM     DSYMM     CSYMM     ZSYMM
          CHEMM     ZHEMM
          SSYRK     DSYRK
          CHERK     ZHERK
          SSYR2K    DSYR2K    CSYR2K    ZSYR2K
          CHER2K    ZHER2K

     All FFT routines are multithreaded for problem sizes in which
     parallelization provides a performance benefit. Single
     one-dimensional FFTs run in parallel only if the data size exceeds
     the size of the L2 cache. Convolution and correlation routines having
     two-dimensional input sequences are also multithreaded. See
     INTRO_FFT(3S) for a list of all signal processing routines.

     The direct sparse solver routines perform multithreaded
     factorizations and solves of linear systems of equations; the
     iterative sparse solver is also parallelized. All solver routines are
     thread-safe, so they will operate correctly and use only a single
     thread if called from a parallel region of an OpenMP or libmp
     program.

   Multiple-routine Man Pages
     The following data types are used in these routines:

     * Single precision: Fortran "real" data type, C/C++ "float" data
       type, 32-bit floating point; these routine names begin with S.
     * Single precision complex: Fortran "complex" data type, C/C++
       "scsl_complex" data type (defined in <scsl_blas.h>), C++ STL
       "complex<float>" data type (defined in <complex.h>), two 32-bit
       floating point reals; these routine names begin with C.
     * Double precision: Fortran "double precision" data type, C/C++
       "double" data type, 64-bit floating point; these routine names
       begin with D.
     * Double precision complex: Fortran "double complex" data type, C/C++
       "scsl_zomplex" data type (defined in <scsl_blas.h>), C++ STL
       "complex<double>" data type (defined in <complex.h>), two 64-bit
       floating point doubles; these routine names begin with Z.

     Often little or no difference exists between these versions, other
     than the data types of some inputs and outputs. In this case, the
     routines are described on the same man page, and that man page is
     named after the real or complex routine. The man(1) command can find
     a man page online by any of the real, complex, double precision, or
     double complex names.

     The following table describes the naming conventions for these
     routines:

     -------------------------------------------------------------
                  Single       Double       Single       Double
                  Precision    Precision    Precision    Precision
                                            Complex      Complex
     -------------------------------------------------------------
     form:        Sname        Dname        Cname        Zname
     example:     SGEMM        DGEMM        CGEMM        ZGEMM
     -------------------------------------------------------------

NOTES
     SCSL does not currently support reshaped arrays.
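     The naming convention in the table above can be sketched in C as a
     lookup from the leading letter of a routine name to the size of one
     data element. This is only an illustration: the demo_* type and
     function names are hypothetical, and the two structs are stand-ins
     that assume a complex value is laid out as a pair of reals, as the
     data type list describes. The actual scsl_complex and scsl_zomplex
     declarations are in <scsl_blas.h> and are not reproduced here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for the complex layouts: a pair of reals of
 * the matching precision. The demo_* names and the member names are
 * assumptions made for this sketch, not SCSL declarations.
 */
typedef struct { float  re, im; } demo_complex;  /* C: two 32-bit reals   */
typedef struct { double re, im; } demo_zomplex;  /* Z: two 64-bit doubles */

/*
 * Return the size in bytes of one data element for a routine that
 * follows the Sname/Dname/Cname/Zname form, judged only by the first
 * letter of its name. Returns 0 if that letter is not S, D, C, or Z.
 */
size_t demo_element_size(const char *routine)
{
    assert(routine != NULL);

    switch (toupper((unsigned char)routine[0])) {
    case 'S': return sizeof(float);
    case 'D': return sizeof(double);
    case 'C': return sizeof(demo_complex);
    case 'Z': return sizeof(demo_zomplex);
    default:  return 0;
    }
}
```

     The first letter is meaningful only for names that follow the form
     shown in the table; a name outside that form would need to be
     checked against its own man page.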
SEE ALSO
     The introductory man pages for each topic: INTRO_FFT(3S),
     INTRO_SOLVERS(3S), INTRO_BLAS(3S), INTRO_BLAS1(3S), INTRO_BLAS2(3S),
     INTRO_BLAS3(3S), INTRO_CBLAS(3S), INTRO_LAPACK(3S)