Skip to Main Content

Low-Rank Approximation for Multiscale PDEs

Ke Chen
Shi Chen
Qin Li
Jianfeng Lu
Stephen J. Wright

Communicated by Notices Associate Editor Reza Malek-Madani

Article cover

1. Introduction

Multiscale phenomena are ubiquitous, with applications in many physical sciences and engineering fields: aerospace, material sciences, geological structure analysis, and many others. The different scales often have different physics, which entangle to produce complicated nonlinearities. Partial differential equations (PDEs) are often used to model these problems, with different scales captured in the coefficients and functions that define the PDE. These PDE models are challenging to compute directly, so analysis and algorithms specifically targeted to multiscale problems have been developed and investigated. Following convention, we focus in this review on problems with two distinct scales, with a small positive parameter capturing the ratio between the small and large scales.

Though modern multiscale analysis dates back to asymptotic PDE analysis that was seen already in Hilbert and Poincaré expansions early last century (see review in PS08), the impetus for computations involving multiscale PDEs came largely from the US Department of Energy (DOE) National Labs within the ASCI (Advanced Strategic Computing Initiative) Hor09. Since that time, analysis and computation in multiscale PDEs have taken different paths. Analysis has tended to follow a single “universal” strategy, passed down from tradition. The equation is decomposed into several levels according to asymptotic expansions involving the scale parameters, with the subequation at each level representing physics at a single scale, and subequations at the finer level feeding information to those at the coarser level. This analytical machinery has been used to treat multiscale PDEs arising from such varied backgrounds as kinetic theory, semi-classical quantum systems, and homogenization of composite materials, among others PS08E11.

On the computational side, strategies for handling multiscale PDEs are more varied. Problems are usually handled by specifically designed solvers. One class of solvers called asymptotic-preserving schemes HJL17 are designed to preserve asymptotic limits of kinetic equations. These schemes usually contain some component of macro-solvers and micro-solvers, integrated in a clever way to reveal different structures in different regimes. Another class of solvers called numerical homogenization methods E11OS19EE03 usually target elliptic and parabolic equations in which the coefficients that represent media have oscillatory elements. These methods usually consist of offline and online stages, with either the homogenized media or the representative basis functions being prepared in the offline stage.

Why are most numerical schemes for multiscale PDEs equation-specific despite the analytical tools being largely unified? This intriguing question has motivated our investigations into devising a universal numerical strategy for solving multiscale PDEs. While the approach is yet to be developed fully, we believe that our progress on this issue is of wide interest, and this article surveys our progress to date. Crucially, our approach exploits the low-rank structure present in discretizations of multiscale PDEs.

To demonstrate the fundamental idea, we consider the following problem:

where is a linear partial differential operator that depends explicitly on the small parameter , while represents the boundary condition or the source term, which is assumed to have no dependence on . Multiscale problems that can be formulated in this way include elliptic equations with highly oscillating media and the neutron transport equation with small Knudsen number. Due to the -dependence of the operator, the solution inherits structures at both fine and coarse scales.

An asymptotic limit is revealed by multiscale analysis using asymptotic expansions as . In this limit, the oscillation at the fine scale is fast and the detailed oscillation pattern no longer matters — only macroscopic quantities are relevant. Formally, writing the homogenization limit as

we have

The norm of the approximation error depends heavily on the particular equation at hand.

The numerical challenge in solving 1 is that many degrees of freedom may be needed. Naïve finite element or finite difference methods would require mesh size to resolve fine-scale structure of the solution at the level. For a problem on , the discretized system would therefore have degrees of freedom, leading to prohibitive computational and memory cost for small . From an application perspective, it often suffices to characterize the solutions on the macroscopic level, where oscillations at the scale are largely absent. This property raises the question of whether we can obtain an approximate solution of this type using only degrees of freedom. If we know how to derive 2, we can simply solve for , which has the required macroscopic properties, and typically requires a discretization with degrees of freedom. Often, though, the limiting equation 2 and its solution are difficult to find explicitly, even when it is possible to establish their existence. These difficulties have led researchers to propose problem-specific solutions.

We believe that a universal approach for finding the large-scale solution can be devised, and that exploitation of the low-rank structure of the solution space is the key to developing such an approach. As suggested above, the Green’s matrix (the discretized Green’s function on fine grids) for the multiscale system 1 requires dimension to represent the underlying Green’s function accurately. However, if a limiting system such as 2 exists, this limiting system can be well-represented numerically by , a Green’s matrix with dimension only . This phenomenon suggests the system can largely be “compressed” and hence is of low rank; see illustration in Figure 1. In the language of numerical linear algebra, this transition amounts to performing a truncated singular value decomposition (SVD) of to obtain .

Figure 1.

PDEs with small parameters have homogenized limits, meaning the solutions to the original PDEs can be well-approximated by the solutions to the limiting equations. While analytically the two solution spaces are “close,” the original equation requires many more degrees of freedom to solve numerically than its limiting counterpart. The numerical Green’s matrix is intrinsically low rank.

Graphic for Figure 1.  without alt text

If we obtain the truncated SVD of the matrix by starting with a full SVD, the resulting algorithms would be impractical because of the large dimension of the matrix and the expense of preparing and storing the full matrix and computing its SVD. Several new linear algebra solvers take a quite different approach. Instead of accessing the full matrix, these new solvers merely require computation of matrix-vector products, involving the target matrix and several randomly selected vectors (typically vectors with Gaussian i.i.d. entries). Translated to the PDE solver setting, these matrix-vector multiplications amount to computing numerical solutions to PDEs with some random source terms, a task that may be practical if the number of such operations required is modest. The randomized SVD (rSVD) approach is one method of this type. It is equipped with a thorough analysis and achieves optimality in terms of computational efficiency. We make use of this method in the techniques described in the remainder of this article.

The main theme of our article, then, is the use of randomized SVD solvers to exploit the low-rank features of multiscale PDEs. We will describe two strategies both of which are divided into “offline” and “online” stages. The offline stage sees the preparation of either the solution space or the boundary-to-boundary map used in the domain decomposition, while the online stage singles out the specific solution for the given source . The two strategies are described in Section 4.1 and 4.2, respectively. In Section 5, we present the nonlinear extension utilizing manifold learning algorithms for reconstructing the low-rank features of the solution manifold. Prior to these discussions, we describe in Section 2 two algorithm classes — asymptotic preserving and numerical homogenization — for identifying the asymptotic limits of multiscale problems. As examples, we use the multiscale radiative transfer equation (RTE) and the elliptic equation with rough media. Section 3 explores the two main elements of our approaches: the numerical low-rank feature of multiscale PDEs and the randomized SVD solver for efficient reconstruction of low-rank operator/spaces. We conclude with a discussion of future work in Section 6.

2. Examples

Kinetic equations and elliptic equations with oscillating media are two examples of multiscale PDEs, for which computational schemes were developed separately. The specific features of these problems were incorporated into the design of asymptotic preserving schemes and numerical homogenization methods, respectively. We review these techniques and highlight the shared low-rank property of these two problems.

2.1. Kinetic equations and asymptotic preserving methods

Kinetic equations, which originate from statistical mechanics, describe the evolution of probability density for identical particles in phase space. A model equation, the radiative transfer equation (RTE), characterizes the evolution of photon density. In the steady state, this equation is

where is the light source, and the linear collision operator describes the interaction of photons with the optical media. The small parameter is encoded in this operator.

The operator defines several distinct regimes. In the optically thick regime, it is defined by

In this case, is the scattering coefficient that describes the possibility of a photon located at changing its velocity from to , and the parameter is called the Knudsen number, standing for the ratio of the mean free path to the typical domain length. When the medium is optically thick, the mean free path is small, with . This means the photon particles are scattered fairly often, and the system statistically achieves the equilibrium state, which can itself be characterized mathematically. One example is to observe light in atmosphere, where the average mean free path is about m, and the observation is conducted at the scale of km, leading to . By performing asymptotic expansion in terms of , the inhomogeneity in the velocity domain vanishes, and one can show that asymptotically approximates , a function without dependence on that solves a diffusion equation. We have the following result from BLP11.

Theorem 1.

Suppose that solves 4 with collision term being isotropic, that is, for some function . Let be bounded with smooth boundary, and . Assume that the boundary condition is

Then

where solves

with the boundary condition

where solves a proper boundary layer equation and can be obtained from .

This result indicates that the homogenized operator as is . Similar results, when fails to have the form of in the anisotropic optical media, are still available, but the explicit form of is no longer available.

A second regime of interest for is one in which the media is highly heterogeneous DG00:

In this case, the photons go through the media that oscillates at a small scale: For example, sunlight passing through a heavy cloud with a large number of small droplets or a laser beam passing through crystals. The amplitude of determines the photon scattering frequency. Since oscillates rapidly, photons also change rapidly between the high- and low-scattering regimes. On a large scale, the photons can be viewed approximately as scattering with an averaged frequency. A mathematical result is as follows DG00.

Theorem 2.

Let the conditions from Theorem 1 hold, and suppose that the collision term is defined in 9. Then

where solves

where for some . Furthermore, if is periodic in with period , then .

In special cases, such as under periodic or random ergodic conditions, the function can be computed explicitly. (There are also works that investigate the asymptotic limit of the RTE when the system is both highly oscillatory and in diffusion regime; see GM01.)

In both limiting regimes, the limiting equations 8 and 11 can be solved much more efficiently than the original equation 4. The discretization of 4 is constrained strongly by , due either to stability (as in 5) or accuracy (as in 9). By contrast, the solution varies smoothly, containing no -scale effects, so it can be obtained accurately by applying a discretization with mesh width to the asymptotic limiting equation. If the latter equation is available, computation of by this means is the recommended methodology.

Methods for kinetic equations are termed “asymptotic preserving” (AP) if they can relax the requirement that the discretization width satisfies yet still capture the asymptotic limits. Many different AP approaches have been proposed. For linear equations, existing AP methods rely on even-odd or micro-macro decomposition. For nonlinear equations, knowledge of the specific forms of the limits is usually required, and this knowledge is built into the solvers HJL17. As mentioned above, these specific forms are often not available, so many AP methods cannot be applied to a large set of multiscale kinetic equations. This observation begs the question: Knowing the existence of the limit, but not its particular form, can we still devise efficient methods for solving kinetic equations?

2.2. Elliptic equations and numerical homogenization

Another class of multiscale equations that has been investigated deeply is elliptic equations with highly oscillatory coefficient. These problems have the form

where is the scale on which the media oscillates. (The source term has no small-scale contribution.)

This equation is a model problem from petroleum engineering where it is crucial to precompute the underground flow before expensive construction of infrastructure takes place BL11. The problem is typically solved on kilometer-scale domains, but the heterogeneities in the media can scale at centimeters. Certain forms of this equation can be approximated effectively by an equation that can be solved efficiently. Suppose the media coefficient has the form , that is, it varies on two scales ( and ), and moreover is periodic with respect to the fast variable (the second argument in ). Then in the limiting regime as , the solution converges to that of a homogenized equation, with the media “smoothed-out,” as described in the following result All92.

Theorem 3.

Let solve 12 in the domain with zero boundary condition. Suppose is periodic with respect to the second argument. Then

where solves the following effective equation with zero boundary condition:

where , the effective media, can be computed from a cell problem (See Definition 2.1 in All92).

As in the previous section, when a limiting equation can be derived explicitly, the best course for obtaining a useful solution it to solve this equation directly, as the mesh width in the discretization scheme can be much larger than . See EH09 for a discussion of a reduced number of basis functions and E11 for computation of the effective media.

However, the validity and the specific form of the effective limit are known only in special cases like the one described in Theorem 3. In other cases, we seek a solver that relies on as little analytical knowledge as possible. An approach known as numerical homogenization has been investigated extensively. This approach is founded on two principles: a discretization scheme independent of , and a numerical solution scheme that captures the true limiting behavior of the solution on the discrete level. Variants of numerical homogenization include application of the -matrix, a purely algebraic technique Hac15; and a Bayesian approach that views the source , and hence the solution , as Gaussian fields Owh15, which further translates to game theory OS19. All these methods are successful, but they all implicitly rely on properties of the underlying elliptic equation. Can we devise an approach that applies to general problems with oscillatory media that exploits the low-rank property in the solution space, without using analytical structure explicitly?

3. A Unified Framework for Multiscale PDEs Based on Random Sampling

We have given several examples of multiscale models that arise in applications, and mentioned several algorithmic approaches that make use of the limiting equations, when available. We describe next the foundations of a unified scheme that captures asymptotic limiting behavior automatically, even when the asymptotic limits are unavailable. Our method exploits low-rank structure and uses random sampling to discover this structure. We describe the low-rank property in Section 3.1 and the randomized SVD method for revealing this structure in Section 3.2.

3.1. Numerical rank

We consider a bounded linear operator , which maps to a space , that is

In the PDE setting, is the solution operator that maps the boundary conditions and/or source term to the solution . The numerical rank of such an operator is defined as follows.

Definition 1 (Numerical rank).

The numerical -rank of is the rank of the lowest-rank operator within the -neighborhood of , that is,

In other words, is this smallest dimension of the range among all the operators within distance of .

When is the PDE solution map, then with low rank is also a linear map with a finite dimensionality. It can be viewed as the discrete version (or a matrix of dimension ) that approximates within accuracy. The definition suggests that if can be found, it is optimal in the sense of numerical efficiency. The concept is rather similar to the Kolmogorov -width, defined as follows.

Definition 2 (Kolmogorov -width).

Given the linear operator , the Kolmogorov -width is the shortest distance from its range to all -dimensional space, that is,

Indeed, the Kolmogorov -width and numerical rank are related by the following result CLLW20a.

Proposition 1.

For any linear operator , we have the following.

(a)

If the numerical -rank is , then .

(b)

If , then the numerical -rank is .

For the three examples presented in Section 2, the numerical ranks can be calculated from their limiting equations. For one-dimensional RTE in the diffusion regime, if we denote by and the solution operators of 4 and 8, respectively, then noting that can be approximated using grid points to achieve accuracy, when , the numerical rank is naturally . Without employing the knowledge of the existence of the limit, however, a brute-force discretization naturally requires degrees of freedom: for the upwind discretization in and for the discretization in , where depends on the particular numerical integral accuracy. Translating into Green’s-matrix language, this observation means that is represented by degrees of freedom but its range can be captured by a compressed Green’s matrix with just column vectors.

The same argument applies to the elliptic equation on a two-dimension domain with high oscillations. When second-order linear finite elements are used, with no knowledge of the limiting system, degrees of freedom are required, dropping to when the the limiting system is known. In other words, the full Green’s matrix requiring degrees of freedom can be well-represented using just column vectors.

In all these cases, the degrees of freedom for a given numerical method are substantially larger than the numerical rank of the problem. Thus, much of the information in these full-blown representations is redundant and compressible. A low-rank representation exists and yields a much more economical representation.

3.2. Random sampling in numerical linear algebra

Knowing the existence of the low-rank structure and finding such a structure are very different goals. The Kolmogorov -width is a concept developed in numerical PDEs, but it has made little impact in numerical PDEs for a simple reason: Traditional PDE solvers require a predetermined set of basis functions, while the Kolmogorov -width looks for “optimal” basis functions. How can an optimal basis be found without first forming the full basis? Translated to linear algebra, this question is about finding the dominant singular vectors in a matrix without forming the whole matrix. Specifically, if is known to be approximately low rank, meaning that there exists , a matrix with orthonormal columns with and

can we find without forming the full matrix ?

In linear algebra, it is well-known that is simply the collection of the first singular vectors of . Writing

where and contain the left/right singular vectors and contains the singular values, then is the first columns in .

The standard method for computing the SVD requires to be stored and computed with. But the celebrated randomized SVD (rSVD) method HMT11 captures the range of a given matrix by means of random sampling of its column space, which requires only computation of matrix-vector products involving and random vectors — operations that can be performed without full storage or knowledge of . Implementation of rSVD is easy and its performance is robust.

The idea behind the algorithm is simple. If matrix has approximate low rank , the matrix maps an -dimensional sphere to an -dimensional ellipsoid that is “thin:” of its axes are significantly larger than the rest. With high probability, vectors that are randomly sampled on the -dimensional sphere are mapped to vectors that lie mostly in a -dimensional subspace of — the range of . An approximation to can be obtained by projecting onto this subspace.

A precise statement of the performance of randomized SVD is as follows HMT11.

Theorem 4.

Let be an matrix. Define

where is a matrix of size with its entries randomly drawn from an i.i.d. normal distribution, where is an oversampling parameter. If , then the projection of onto the space spanned by , defined by

yields that with high probability, and

The result reconstructs the range of in a nearly optimal way. It is optimal in efficiency because to capture a rank- matrix, only matrix-vector products involving are required for the calculation of , and the oversampling parameter is typically quite modest. ( is a typical value.) The result is nearly optimal in accuracy as well. The error bound relies only on , which is expected to be smaller than . The decay profile of singular values do not affect the approximation accuracy.

If a low-rank approximation to the matrix is required, and not just an approximation of its range, another step involving multiplications with its transpose is needed. The full method is shown in Algorithm 1.

Algorithm 1.

Randomized SVD.

1:Given an matrix , target rank and oversampling parameter ;2:Set ;3:Stage A:4:Generate an Gaussian test matrix ;5:Form ;6:Perform the QR-decomposition of : 7:Stage B:8:Form ;9:Compute the SVD of the matrix ;10:Set ;11:Return: .

4. Random Sampling for Multiscale Computation

Here we describe how rSVD can be incorporated into multiscale PDE solvers to exploit the low-rank structure of these equations. Our procedure is composed of both offline and online stages. Low-rank structure is learned in the offline stage, while in the online stage, the solution for the given source / boundary term in 1 is extracted.

We consider in particular the following boundary value problem:

where is the boundary condition operator, the boundary associated with domain , and we now denote the source term (the boundary data) by . Our fundamental goal is to construct the low-rank approximation to the Green’s operator for 18. With this operator in hand, the solution can be computed for any value of the boundary conditions at a relatively small incremental cost.

If we apply rSVD to approximate directly, we need to compute products of this operator with random vectors. For problem 18, this operation corresponds to solving the problem with replaced by random boundary conditions. Solving even one such problem efficiently is a computationally challenging task. We use the domain decomposition framework.

We start by partitioning the domain into subdomains as follows:

where the patches overlap, in general. We denote by the boundary associated with . Furthermore, we identify the subregions that intersect with as follows:

and define the interior of the patch to be

For this particular partition of the domain, we define the partition-of-unity functions , , , , to have the following properties:

We choose a discretization that resolves the small scales in the solution, defining a mesh width . (The number of subdomains is independent of .) A typical decomposition is illustrated in Figure 2.

Figure 2.

Illustration of domain decomposition of a rectangular domain in 2D with overlap.

Graphic for Figure 2.  without alt text

How do we design the offline stage to “learn” the low-rank approximation? We propose two approaches that lead to two different kinds of algorithms. In the first approach we learn the optimal basis functions within each subdomain, while the second algorithm employs Schwarz iteration, preparing the boundary-to-boundary map in the offline stage. Other PDE solvers that utilize randomness can also be found in CEGL16BS18Mar20. In particular in CEGL16 the authors studied, specifically for elliptic type equations, the generalized eigenvalue problem of the stiffness and mass matrices, and give an error bound using the largest eigenvalue obtained offline.

4.1. Learning basis functions

In standard domain decomposition, the local discretized Green’s matrix is assembled from a “full” collection of basis functions in the patch . The global solution to 18, confined to each , is a linear combination of the columns . The coefficients of these combinations are chosen so that that the continuity conditions across patches and the exterior boundary condition are all satisfied. The complete process can be outlined as follows.

(1)

Offline stage: For , , …, , find

where each local function is a solution to 18 restricted to the subdomain , with fine grid and delta-function boundary conditions. That is,

where is the Kronecker delta that singles out the -th grid point on the boundary .

(2)

Online stage: The global solution is

with the support of each confined to , where is a vector of coefficients determined by the boundary conditions and continuity conditions across the patches.

The complete basis represented by has a low-rank structure that can be revealed using randomized SVD. Instead of using delta functions as the boundary conditions, we propose to obtain basis functions by setting random values on , as follows: