The model of the computing environment for ScaLAPACK is represented as a one-dimensional array of processes (for operations on band or tridiagonal matrices) or also a two-dimensional process grid (for operations on dense matrices). To use ScaLAPACK, all global matrices or vectors should be distributed on this array or grid prior to calling the ScaLAPACK routines.
ScaLAPACK uses the two-dimensional block-cyclic data distribution as a layout for dense matrix computations. This distribution provides good work balance between available processors, as well as gives the opportunity to use BLAS Level 3 routines for optimal local computations. Information about the data distribution that is required to establish the mapping between each global array and its corresponding process and memory location is contained in the so called array descriptor associated with each global array. An example of an array descriptor structure is given in Table "Content of the array descriptor for dense matrices".
Array Element # | Name | Definition |
---|---|---|
1 | dtype | Descriptor type ( =1 for dense matrices) |
2 | ctxt | BLACS context handle for the process grid |
3 | m | Number of rows in the global array |
4 | n | Number of columns in the global array |
5 | mb | Row blocking factor |
6 | nb | Column blocking factor |
7 | rsrc | Process row over which the first row of the global array is distributed |
8 | csrc | Process column over which the first column of the global array is distributed |
9 | lld | Leading dimension of the local array |
The number of rows and columns of a global dense matrix that a particular process in a grid receives after data distributing is denoted by LOCr() and LOCc(), respectively. To compute these numbers, you can use the ScaLAPACK tool routine numroc.
After the block-cyclic distribution of global data is done, you may choose to perform an operation on a submatrix of the global matrix A, which is contained in the global subarray sub(A), defined by the following 6 values (for dense matrices):
The number of rows of sub(A)
The number of columns of sub(A)
A pointer to the local array containing the entire global array A
The row index of sub(A) in the global array
The column index of sub(A) in the global array
The array descriptor for the global array
Optimization Notice |
---|
Intel compilers, associated libraries and associated development tools may include or utilize options that optimize for instruction sets that are available in both Intel and non-Intel microprocessors (for example SIMD instruction sets), but do not optimize equally for non-Intel microprocessors. In addition, certain compiler options for Intel compilers, including some that are not specific to Intel micro-architecture, are reserved for Intel microprocessors. For a detailed description of Intel compiler options, including the instruction sets and specific microprocessors they implicate, please refer to the "Intel Compiler User and Reference Guides" under "Compiler Options." Many library routines that are part of Intel compiler products are more highly optimized for Intel microprocessors than for other microprocessors. While the compilers and libraries in Intel compiler products offer optimizations for both Intel and Intel-compatible microprocessors, depending on the options you select, your code and other factors, you likely will get extra performance on Intel microprocessors. Intel compilers, associated libraries and associated development tools may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include Intel® Streaming SIMD Extensions 2 (Intel® SSE2), Intel® Streaming SIMD Extensions 3 (Intel® SSE3), and Supplemental Streaming SIMD Extensions 3 (Intel SSSE3) instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. While Intel believes our compilers and libraries are excellent choices to assist in obtaining the best performance on Intel and non-Intel microprocessors, Intel recommends that you evaluate other compilers and libraries to determine which best meet your requirements. We hope to win your business by striving to offer the best performance of any compiler or library; please let us know if you find we do not. Notice revision #20110307 |
Copyright © 1994 - 2011, Intel Corporation. All rights reserved.