p?gels

Solves overdetermined or underdetermined linear systems involving a matrix of full rank.

Syntax

call psgels(trans, m, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, work, lwork, info)

call pdgels(trans, m, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, work, lwork, info)

call pcgels(trans, m, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, work, lwork, info)

call pzgels(trans, m, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, work, lwork, info)

Include Files

The C interfaces are specified in the mkl_scalapack.h include file.

Description

The p?gels routine solves overdetermined or underdetermined real/complex linear systems involving an m-by-n matrix sub(A) = A(ia:ia+m-1,ja:ja+n-1), or its transpose/conjugate-transpose, using a QR or LQ factorization of sub(A). It is assumed that sub(A) has full rank.

The following options are provided:

If trans = 'N' and m ≥ n: find the least squares solution of an overdetermined system, that is, solve the least squares problem

minimize ||sub(B) - sub(A)*X||

If trans = 'N' and m < n: find the minimum norm solution of an underdetermined system sub(A)*X = sub(B).

If trans = 'T' and m ≥ n: find the minimum norm solution of an underdetermined system sub(A)^T*X = sub(B).

If trans = 'T' and m < n: find the least squares solution of an overdetermined system, that is, solve the least squares problem

minimize ||sub(B) - sub(A)^T*X||,

where sub(B) denotes B(ib:ib+m-1, jb:jb+nrhs-1) when trans = 'N' and B(ib:ib+n-1, jb:jb+nrhs-1) otherwise. Several right-hand side vectors b and solution vectors x can be handled in a single call. The solution vectors overwrite the right-hand side matrix sub(B) and are stored as its columns: when trans = 'N', they form an n-by-nrhs matrix; otherwise they form an m-by-nrhs matrix.
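
The four options above can be summarized in a small lookup. The following is a minimal sketch in Python, not part of ScaLAPACK or Intel MKL; the helper name gels_problem is hypothetical and exists only to illustrate which problem is solved and how many rows sub(B) holds on entry and on exit.

```python
def gels_problem(trans, m, n):
    """Classify the problem p?gels solves for the given trans, m, n.

    Returns (kind, rhs_rows, solution_rows), where rhs_rows is the number
    of rows of sub(B) on entry and solution_rows is the number of rows
    holding the solution on exit.
    Hypothetical illustration helper; not a ScaLAPACK or MKL function.
    """
    if trans not in ("N", "T"):
        raise ValueError("trans must be 'N' or 'T'")
    if m < 0 or n < 0:
        raise ValueError("m and n must be non-negative")
    tall = m >= n  # QR is used when m >= n, LQ otherwise
    if trans == "N":
        kind = "least squares" if tall else "minimum norm"
        return kind, m, n
    # trans == 'T': the system involves sub(A)^T, which is n-by-m
    kind = "minimum norm" if tall else "least squares"
    return kind, n, m
```

For example, gels_problem('T', 4, 6) reports a least squares problem whose right-hand side has 6 rows and whose solution has 4 rows.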

Input Parameters

trans

(global) CHARACTER. Must be 'N' or 'T'.

If trans = 'N', the linear system involves matrix sub(A);

If trans = 'T', the linear system involves the transposed matrix sub(A)^T (for real flavors only).

m

(global) INTEGER. The number of rows in the distributed submatrix sub(A) (m ≥ 0).

n

(global) INTEGER. The number of columns in the distributed submatrix sub(A) (n ≥ 0).

nrhs

(global) INTEGER. The number of right-hand sides; the number of columns in the distributed submatrices sub(B) and X (nrhs ≥ 0).

a

(local)

REAL for psgels

DOUBLE PRECISION for pdgels

COMPLEX for pcgels

DOUBLE COMPLEX for pzgels.

Pointer into the local memory to an array of local dimension (lld_a, LOCc(ja+n-1)). On entry, this array contains the local pieces of the m-by-n distributed matrix sub(A).

ia, ja

(global) INTEGER. The row and column indices in the global array a indicating the first row and the first column of the submatrix sub(A), respectively.

desca

(global and local) INTEGER array, dimension (dlen_). The array descriptor for the distributed matrix A.

b

(local)

REAL for psgels

DOUBLE PRECISION for pdgels

COMPLEX for pcgels

DOUBLE COMPLEX for pzgels.

Pointer into the local memory to an array of local dimension (lld_b, LOCc(jb+nrhs-1)). On entry, this array contains the local pieces of the distributed matrix B of right-hand side vectors, stored columnwise; sub(B) is m-by-nrhs if trans='N', and n-by-nrhs otherwise.

ib, jb

(global) INTEGER. The row and column indices in the global array b indicating the first row and the first column of the submatrix sub(B), respectively.

descb

(global and local) INTEGER array, dimension (dlen_). The array descriptor for the distributed matrix B.

work

(local)

REAL for psgels

DOUBLE PRECISION for pdgels

COMPLEX for pcgels

DOUBLE COMPLEX for pzgels.

Workspace array with dimension lwork.

lwork

(local or global) INTEGER.

The dimension of the array work. lwork is local input and must be at least lwork ≥ ltau + max(lwf, lws), where if m ≥ n, then

ltau = numroc(ja+min(m,n)-1, nb_a, MYCOL, csrc_a, NPCOL),

lwf = nb_a*(mpa0 + nqa0 + nb_a)

lws = max((nb_a*(nb_a-1))/2, (nrhsqb0 + mpb0)*nb_a) + nb_a*nb_a

else

ltau = numroc(ia+min(m,n)-1, mb_a, MYROW, rsrc_a, NPROW),

lwf = mb_a * (mpa0 + nqa0 + mb_a)

lws = max((mb_a*(mb_a-1))/2, (npb0 + max(nqa0 + numroc(numroc(n+iroffb, mb_a, 0, 0, NPROW), mb_a, 0, 0, lcmp), nrhsqb0))*mb_a) + mb_a*mb_a

end if,

where lcmp = lcm/NPROW with lcm = ilcm(NPROW, NPCOL),

iroffa = mod(ia-1, mb_a),

icoffa = mod(ja-1, nb_a),

iarow = indxg2p(ia, mb_a, MYROW, rsrc_a, NPROW),

iacol = indxg2p(ja, nb_a, MYCOL, csrc_a, NPCOL),

mpa0 = numroc(m+iroffa, mb_a, MYROW, iarow, NPROW),

nqa0 = numroc(n+icoffa, nb_a, MYCOL, iacol, NPCOL),

iroffb = mod(ib-1, mb_b),

icoffb = mod(jb-1, nb_b),

ibrow = indxg2p(ib, mb_b, MYROW, rsrc_b, NPROW),

ibcol = indxg2p(jb, nb_b, MYCOL, csrc_b, NPCOL),

mpb0 = numroc(m+iroffb, mb_b, MYROW, ibrow, NPROW),

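nqb0 = numroc(n+icoffb, nb_b, MYCOL, ibcol, NPCOL),

npb0 = numroc(n+iroffb, mb_b, MYROW, ibrow, NPROW),

nrhsqb0 = numroc(nrhs+icoffb, nb_b, MYCOL, ibcol, NPCOL),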

ilcm, indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL, NPROW, and NPCOL can be determined by calling the subroutine blacs_gridinfo.
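
The bound on lwork can be evaluated by hand for a given process. The following Python sketch mirrors the formulas above; it is an illustration under stated assumptions, not library code. The functions numroc, indxg2p, and ilcm are re-implemented here in plain Python following the behavior of the ScaLAPACK tool functions of the same names, the helper gels_min_lwork is hypothetical, and the definitions of npb0 and nrhsqb0 follow the comments in the reference ScaLAPACK source and should be checked against your library. The first branch is taken when m ≥ n, matching the QR/LQ split used elsewhere in this description.

```python
from math import gcd


def numroc(n, nb, iproc, isrcproc, nprocs):
    """Rows or columns of an n-long block-cyclic dimension owned by iproc."""
    mydist = (nprocs + iproc - isrcproc) % nprocs
    nblocks = n // nb
    count = (nblocks // nprocs) * nb
    extra = nblocks % nprocs
    if mydist < extra:
        count += nb
    elif mydist == extra:
        count += n % nb
    return count


def indxg2p(indxglob, nb, iproc, isrcproc, nprocs):
    """Process coordinate owning 1-based global index indxglob (iproc unused)."""
    return (isrcproc + (indxglob - 1) // nb) % nprocs


def ilcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def gels_min_lwork(m, n, nrhs, ia, ja, ib, jb,
                   mb_a, nb_a, rsrc_a, csrc_a,
                   mb_b, nb_b, rsrc_b, csrc_b,
                   myrow, mycol, nprow, npcol):
    """Evaluate ltau + max(lwf, lws) on one process (hypothetical helper)."""
    iroffa, icoffa = (ia - 1) % mb_a, (ja - 1) % nb_a
    iarow = indxg2p(ia, mb_a, myrow, rsrc_a, nprow)
    iacol = indxg2p(ja, nb_a, mycol, csrc_a, npcol)
    mpa0 = numroc(m + iroffa, mb_a, myrow, iarow, nprow)
    nqa0 = numroc(n + icoffa, nb_a, mycol, iacol, npcol)

    iroffb, icoffb = (ib - 1) % mb_b, (jb - 1) % nb_b
    ibrow = indxg2p(ib, mb_b, myrow, rsrc_b, nprow)
    ibcol = indxg2p(jb, nb_b, mycol, csrc_b, npcol)
    mpb0 = numroc(m + iroffb, mb_b, myrow, ibrow, nprow)
    # Assumed definitions, taken from the reference ScaLAPACK comments:
    npb0 = numroc(n + iroffb, mb_b, myrow, ibrow, nprow)
    nrhsqb0 = numroc(nrhs + icoffb, nb_b, mycol, ibcol, npcol)

    if m >= n:  # QR path
        ltau = numroc(ja + min(m, n) - 1, nb_a, mycol, csrc_a, npcol)
        lwf = nb_a * (mpa0 + nqa0 + nb_a)
        lws = max((nb_a * (nb_a - 1)) // 2,
                  (nrhsqb0 + mpb0) * nb_a) + nb_a * nb_a
    else:  # LQ path
        lcmp = ilcm(nprow, npcol) // nprow
        ltau = numroc(ia + min(m, n) - 1, mb_a, myrow, rsrc_a, nprow)
        lwf = mb_a * (mpa0 + nqa0 + mb_a)
        inner = numroc(numroc(n + iroffb, mb_a, 0, 0, nprow),
                       mb_a, 0, 0, lcmp)
        lws = max((mb_a * (mb_a - 1)) // 2,
                  (npb0 + max(nqa0 + inner, nrhsqb0)) * mb_a) + mb_a * mb_a
    return ltau + max(lwf, lws)
```

In practice, a workspace query with lwork = -1 is usually simpler and less error-prone than evaluating these formulas by hand; the sketch is mainly useful for checking a value or for understanding how the bound scales with the block sizes.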

If lwork = -1, then lwork is global input and a workspace query is assumed; the routine only calculates the minimum and optimal size for all work arrays. Each of these values is returned in the first entry of the corresponding work array, and no error message is issued by pxerbla.

Output Parameters

a

On exit, if m ≥ n, sub(A) is overwritten by the details of its QR factorization as returned by p?geqrf; if m < n, sub(A) is overwritten by the details of its LQ factorization as returned by p?gelqf.

b

On exit, sub(B) is overwritten by the solution vectors, stored columnwise:

If trans = 'N' and m ≥ n, rows 1 to n of sub(B) contain the least squares solution vectors; the residual sum of squares for the solution in each column is given by the sum of squares of elements n+1 to m in that column.

If trans = 'N' and m < n, rows 1 to n of sub(B) contain the minimum norm solution vectors.

If trans = 'T' and m ≥ n, rows 1 to m of sub(B) contain the minimum norm solution vectors.

If trans = 'T' and m < n, rows 1 to m of sub(B) contain the least squares solution vectors; the residual sum of squares for the solution in each column is given by the sum of squares of elements m+1 to n in that column.
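
For the two least squares cases, the solution and the residual sum of squares can be read off one column of the overwritten sub(B). The following Python sketch assumes that the column has already been gathered from the process grid into an ordinary local sequence; the helper name split_ls_column is hypothetical and is not a ScaLAPACK or MKL routine.

```python
def split_ls_column(column, n_unknowns):
    """Split one overwritten column of sub(B) for a least squares case.

    column      -- the gathered column of sub(B) after the call
    n_unknowns  -- n when trans = 'N' and m >= n; m when trans = 'T' and m < n
    Returns (solution, rss), where rss is the residual sum of squares.
    """
    solution = list(column[:n_unknowns])
    # abs() makes the same expression valid for real and complex flavors
    rss = sum(abs(x) ** 2 for x in column[n_unknowns:])
    return solution, rss
```

The residual norm for that right-hand side is the square root of rss. For the minimum norm cases there are no residual rows to read, so this split does not apply.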

work(1)

On exit, work(1) contains the minimum value of lwork required for optimum performance.

info

(global) INTEGER.

= 0: the execution is successful.

< 0: if the i-th argument is an array and the j-th entry had an illegal value, then info = -(i*100+j); if the i-th argument is a scalar and had an illegal value, then info = -i.
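
The encoding of negative info values can be unpacked as follows. This is a minimal Python sketch, not part of the library; the helper name decode_info and the tuple GELS_ARGS are hypothetical, with the argument order copied from the call syntax above. Because the routine has fewer than 100 arguments, a magnitude of 100 or more can only come from the array form -(i*100+j).

```python
# Argument order as listed in the call syntax (1-based positions).
GELS_ARGS = ("trans", "m", "n", "nrhs", "a", "ia", "ja", "desca",
             "b", "ib", "jb", "descb", "work", "lwork", "info")


def decode_info(info):
    """Map an info value to (argument_name, entry_index).

    Returns None when info == 0 (success). entry_index is None for a
    scalar argument and the 1-based entry j for an array argument.
    """
    if info == 0:
        return None
    if info > 0:
        # Positive values are not described in this section.
        raise ValueError("positive info is not covered here: %d" % info)
    code = -info
    if code >= 100:
        arg, entry = divmod(code, 100)  # info = -(i*100 + j)
    else:
        arg, entry = code, None         # info = -i
    if not 1 <= arg <= len(GELS_ARGS):
        raise ValueError("argument position out of range: %d" % arg)
    return GELS_ARGS[arg - 1], entry
```

For example, info = -802 decodes to entry 2 of desca, and info = -2 decodes to the scalar argument m.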