2023 Spring

The seminar of this semester is organized by Yuejia Zhang and Shengxiang Deng, and co-organized by the graduate student union in the School of Mathematical Sciences at Fudan. This section is partially sponsored by Xudong Li.

Past Presentations

2023-02-23 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower [poster]

Abstract: Click to expand Consider coupled models for particulate flows, where the disperse phase is made of particles with distinct sizes. We are thus led to a system coupling the incompressible Navier–Stokes equations to the Vlasov–Fokker–Planck equations. For the model with random initial data near the global equilibrium, a uniform regularity is established in some suitable Sobolev spaces by using energy estimates and we prove that the energy decays exponentially in time by hypocoercivity arguments. We also prove that the generalized polynomial chaos stochastic Galerkin (gPC-sG) method has spectral accuracy, uniformly in time and the Knudsen number, and the error decays exponentially in time. A numerical scheme is designed to simulate the behavior of the multi-phase system. This scheme is asymptotic-preserving, which allows the efficient computation in both kinetic and hydrodynamic regimes. Numerical examples illustrate the accuracy and asymptotic behavior of the scheme, with several interesting applications.

2023-03-02 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower [poster]

Abstract: Click to expand Introduced by the Optimal Transport theory, Wasserstein distance defined on the measure space has found applications for many fields such as machine learning, signal processing, seismic inversion, etc. As a linear programming problem, the algorithmic complexity of computation prevents their direct use on large scale datasets. Using the entropy regularization, the original problem is reduced to a matrix scaling one, which can be solved by the Sinkhorn iteration with $O(N^2)$ complexity. However, for a large dataset, big $N$ still leads to a computational bottleneck. By exploring the structure of matrix in the computation of Wasserstein-1 metric on mesh, we reduce the complexity of matrix-vector multiplication to $O(N)$ in the process of algorithm, and developed a new algorithm call Fast Sinkhorn, which achieves fast and stable computation of Wasserstein-1 metric.

2023-03-09 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower [poster]

Abstract: Click to expand Strong variational sufficiency is a newly proposed property, which turns out to be of great use in the convergence analysis of multiplier methods. However, what this property implies for non-polyhedral problems remains a puzzle. In this talk, we will introduce the equivalence between the strong variational sufficiency and the strong second order sufficient condition (SOSC) for nonlinear semidefinite programming (NLSDP), without requiring the uniqueness of multiplier or any other constraint qualifications. Based on this characterization, the local convergence property of the augmented Lagrangian method (ALM) for NLSDP can be established under strong SOSC in the absence of constraint qualifications. Moreover, under the strong SOSC, we can apply the semi-smooth Newton method to solve the ALM subproblems of NLSDP as the positive definiteness of the generalized Hessian of augmented Lagrangian function is satisfied.

2023-03-16 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower [poster]

Abstract: Click to expand Wasserstein-Fisher-Rao (WFR) distance is a family of metrics to gauge the discrepancy of two measures. Compared to the case for Wasserstein distance, the understanding of geodesics under spherical WFR is less clear . We develop a deep learning framework to compute geodesics under spherical WFR metric, and the learned geodesics can be adopted to generate weighted samples. Our framework can be beneficial for applications with given weighted samples, especially in Bayesian inference.

2023-03-23 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower [poster]

Abstract: Click to expand In this talk, we discuss a unified primal-dual algorithm framework based on the augmented Lagrangian function for composite convex problems with conic inequality constraints. The new framework is highly versatile. First, it not only covers many existing algorithms such as PDHG, Chambolle-Pock (CP), GDA, OGDA and linearized ALM, but also guides us to design a new efficient algorithm called Simi-OGDA (SOGDA). Second, it enables us to study the role of the augmented penalty term in the convergence analysis. Interestingly, a properly selected penalty not only improves the numerical performance of the above methods, but also theoretically enables the convergence of algorithms like PDHG and SOGDA. Under properly designed step sizes and penalty term, our unified framework preserves the $O(1/N)$ ergodic convergence while not requiring any prior knowledge about the magnitude of the optimal Lagrangian multiplier. Linear convergence rate for affine equality constrained problem is also obtained given appropriate conditions. Finally, numerical experiments on linear programming, $\ell_1$ minimization problem, and multi-block basis pursuit problem demonstrate the efficiency of our methods.

2023-03-30 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower [poster]

Abstract: Click to expand A wide range of applications in signal processing, machine learning, operations research, economics and mechanics consist of solving convex optimization problems in which the objective function is the sum of several terms of different nature. We consider structured convex optimization problems consisting the sum of three terms --- two nonsmooth terms, one of which is composed with a linear operator and both nonsmooth functions are proximal-friendly, and a smooth term with Lipschitzian gradient. To solve such problems, we propose and analyze a modified primal-dual splitting algorithm, denoted by MPD3O. MPD3O handles the smooth function by gradient evaluation, the nonsmooth functions by their proximity mappings and alleviate the computational burden in updating the primal variables by adding some extra costs to the updates of dual variables. Hence, MPD3O allows a much greater range of parameters to ensure convergence. Moreover, it does not need to calculate the spectral norm of the linear operator to determine the stepsizes. We establish global iterate convergence by following the famous Krasnoselskii-Mann theorem, as well as ergodic and nonergodic $\mathcal{O}(1/k)$ convergence rate results, where $k$ denotes the iteration counter. A connection between MPD3O and the Davis-Yin splitting method is also disclosed by using a technique proposed by Bertsekas, O'Connor and Vandenberghe. Numerical experiments on the constrained LASSO problem are conducted to show the efficiency of the proposed algorithm. In addition, a brief introduction will be provided on another work where the algorithm is applied in decentralized convex consensus optimization.

2023-04-06 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower [poster]

Abstract: Click to expand It is imperative to accelerate the training of neural network force field such as Deep Potential, which usually requires thousands of images based on first-principles calculation and a couple of days to generate an accurate potential energy surface. To this end, we propose a novel optimizer named reorganized layer extended Kalman filtering (RLEKF), an optimized version of global extended Kalman filtering (GEKF) with a strategy of splitting big and gathering small layers to overcome the $\mathcal{O}(N^2)$ computational cost of GEKF. This strategy provides an approximation of the dense weights error covariance matrix with a sparse diagonal block matrix for GEKF. We implement both RLEKF and the baseline Adam in our $\alpha$Dynamics package and numerical experiments are performed on $13$ unbiased datasets. Overall, RLEKF converges faster with slightly better accuracy. For example, a test on a typical system, bulk copper, shows that RLEKF converges faster by both the number of training epochs ($\times 11.67$) and wall-clock time ($\times 1.19$). Besides, we theoretically prove that the updates of weights converge and thus are against the gradient exploding problem. Experimental results verify that RLEKF is not sensitive to the initialization of weights. The RLEKF sheds light on other AI-for-science applications where training a large neural network (with tons of thousands parameters) is a bottleneck.

2023-04-13 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower [poster]

Abstract: Click to expand This talk reports on a joint work with Vahagn Nersesyan (NYU Shanghai). We discuss the global approximate controllability of the two-dimensional incompressible Navier--Stokes system driven by a physically localized and degenerate force. In other words, the fluid motion is regulated via four scalar control parameters that depend only on time and appear as coefficients in an effectively constructed driving force supported in a given subdomain. The task at hand is motivated by a well-known open problem due to A. A. Agrachev and a partial answer to his question shall be presented. Our idea consists of squeezing low mode controls into a small region, essentially by tracking their actions along the characteristic curves of a linearized vorticity equation. In this way, through explicit constructions and by connecting J.-M. Coron's return method with recent concepts from geometric control, the original controllability problem for the nonlinear Navier--Stokes system is reduced to one for a linear transport equation steered by a finite-dimensional force acting in the whole domain.

2023-04-20 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower [poster]

Abstract: Click to expand The goal of quantum dynamics simulation is to accurately calculate the time-evolution operator of a given Hamiltonian. Our proposed method is based on the Cartan decomposition of the Lie algebra generated by the Hamiltonian. The Lie algebra is a mathematical structure that captures the algebraic properties of the Hamiltonian. The Cartan decomposition splits the Lie algebra into two parts: a maximal abelian subalgebra (MASA) and a complementary subspace. The MASA is a set of commuting observables that can be diagonalized simultaneously, while the complementary subspace contains non-commuting observables. The key idea of our method is to dynamically represent the time-evolution operator in terms of the MASA and the complementary subspace, such that the killing form, which is a measure of the exact decomposition, remains satisfied. This allows us to decompose the Hamiltonian evolution into smaller steps, which can be implemented using a series of quantum gates. Compared to other simulation methods, such as product formulas, our algorithm drastically improves simulation precision while using fewer quantum gates. This is because our method takes advantage of the underlying algebraic structure of the Hamiltonian, which allows us to optimize the simulation by using a smaller number of gates.In addition to providing exact circuits for a broad set of spin and fermionic models, our algorithm also provides broad analytic and numerical insight into optimal Hamiltonian simulations. This can help researchers design better simulation algorithms for a wide range of quantum systems.

2023-04-27 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower [poster]

Abstract: Click to expand This talk reports on the performance of MLMC for deterministic and uncertain hyperbolic systems, where randomness is introduced either in the modeling parameters or in the approximation algorithms. MLMC is a well known variance reduction method widely used to accelerate Monte Carlo (MC) sampling. However, we demonstrate that for hyperbolic systems, whether MLMC can achieve a real boost turns out to be delicate. The computational costs of MLMC and MC depend on the interplay among the accuracy (bias) and the computational cost of the numerical method for a single sample, as well as the variances of the sampled MLMC corrections or MC solutions. We characterize three regimes for the MLMC and MC performances using those parameters, and show that MLMC may not accelerate MC and can even have a higher cost when the variances of MC solutions and MLMC corrections are of the same order.

2023-05-04 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower [poster]

Abstract: Click to expand In this paper, we introduce a Homogeneous Second-Order Descent Method (HSODM) using the homogenized quadratic approximation to the original function. The merit of homogenization is that only the leftmost eigenvector of a gradient-Hessian integrated matrix is computed at each iteration. Therefore, the algorithm is a single-loop method that does not need to switch to other sophisticated algorithms, and is easy to be implemented. We show that HSODM has a global convergence rate of $O(\epsilon^{-3/2})$ to find an approximate second-order stationary point, and has a local quadratic convergence rate under the standard assumptions. The numerical results demonstrate the advantage of the proposed method over other second-order methods.

2023-05-11 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower [poster]

Abstract: Click to expand A locally optimal preconditioned Newton--Schur method is proposed for solving symmetric elliptic eigenvalue problems. Firstly, the Steklov--Poincar\'e operator is used to project the eigenvalue problem on the domain $\Omega$ onto the nonlinear eigenvalue subproblem on $\Gamma$, which is the union of subdomain boundaries. Then, the direction of correction is obtained via applying a non-overlapping domain decomposition method on $\Gamma$. Four different strategies are proposed to build the hierarchical subspace $U_{k+1}$ over the boundaries, which are based on the combination of the coarse-subspace with the directions of correction. Finally, the approximation of eigenpair is updated by solving a local optimization problem on the subspace $U_{k+1}$. The convergence rate of the locally optimal preconditioned Newton--Schur method is proved to be $\gamma=1-c_{0}T_{h,H}^{-1}$, where $c_{0}$ is a constant independent of the fine mesh size $h$, the coarse mesh size $H$ and jumps of the coefficients; whereas $T_{h,H}$ is the constant depending on stability of the decomposition. Numerical results confirm our theoretical analysis.

2023-05-18 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower [poster]

Abstract: Click to expand One-bit precoding is a promising way to achieve hardware-efficiency in massive MIMO systems and has gained growing research interests in recent years. However, the one-bit nature of the transmit signal poses great challenge to precoding design as well as performance analysis. In this talk, we will present some recent results on one-bit precoding. We will focus on both non-linear and linear-quantized precoding schemes. In particular, for non-linear precoding, we introduce a new negative $\ell 1$ penalty approach, which is based on an exact penalty model that penalizes the one-bit constraint into the objective with a negative $\ell 1$-norm term. The negative $\ell 1$ penalty approach achieves a better trade-off in complexity and symbol error rate (SER) performance than existing approaches. For linear-quantized precoding, we give an aysmptotic performance analysis for a wide class of precoders and derive the optimal precoder within the considered class. Different from existing Bussgang-decomposition-based analyzes, our analytical framework is based on random matrix theory (RMT), which is more rigorous and can be extended to more general cases.

2023-05-25 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower [poster]

Abstract: Click to expand In this talk, based on a domain decomposition (DD) method, we propose an efficient two-level preconditioned Helmholtz--Jacobi--Davidson (PHJD) method for solving the algebraic eigenvalue problem resulting from the edge element approximation of the Maxwell eigenvalue problem. In order to eliminate the components in orthogonal complement space of the eigenvalue, we shall solve a parallel preconditioned system and a Helmholtz projection system together in fine space. After one coarse space correction in each iteration and minimizing the Rayleigh quotient in a small dimensional Davidson space, we finally get the error reduction of this two-level PHJD method as $c(H)\left(1-C\frac{\delta^2}{H^2}\right)$, where $C$ is a constant independent of the mesh size $h$ and the diameter of subdomains $H$, $\delta$ is the overlapping size among the subdomains, and $c(H)$ decreasing monotonically to $1$ as $H\rightarrow 0$, which means the greater the number of subdomains, the better the convergence rate. Numerical results supporting our theory are given.

2023-06-01 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower [poster]

Abstract: Click to expand In this talk, our focus is on the spectral theories and the algorithms for the graph cut problems similar to the maxcut. We begin by investigating the dual Cheeger problem, denoted as $h^+$, and the modified dual Cheeger problem, denoted as $\widehat{h}^+$, which are regarded as the ``three-cut'' versions of the maxcut. Meanwhile, we introduce and develop the nonlinear spectral theory of the signless $1$-Laplacian on graphs and its combination with the graph $1$-Laplacian, which are closely related to the two problems. Then we present a local analysis of an inverse power method for determining $h^+$ and $\widehat{h}^+$, which enables an efficient implementation of the recursive spectral cut algorithm for the maxcut, yielding improved numerical results. Next, we explore the anti-Cheeger cut as a judicious correspondence to the classical maxcut, and develop a continuous iterative algorithm for it through an equivalent continuous formulation. It does not need rounding at all and has advantages that all subproblems have explicit analytic solutions. The objective function values are monotonically updated and the iteration points converge to a local optima in finite steps via an appropriate subgradient selection. Numerical experiments on Gset demonstrate the performance.

2023-06-08 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower [poster]

Abstract: Click to expand The multimarginal optimal transport problem with Coulomb cost arises in quantum physics and is vital in understanding strongly correlated quantum systems. Leveraging a Monge-like ansatz, we surmount its intrinsic curse of dimensionality and obtain a nonconvex optimization problem after employing discretization and $\ell_1$ penalty. In globally solving the nonconvex problem, its block structure suggests using splitting methods as local solvers, while the existing ones can get seriously afflicted with the poor scalabilities induced by the associated sparse-dense matrix multiplications. In this work, borrowing the tools from optimal transport, we develop novel splitting methods that favor highly scalable schemes for subproblems and are completely free of the full matrix multiplications after introducing entrywise sampling. Convergence and asymptotic properties are built on the theory of random matrices. We further combine the merits of the methods with and without sampling under a grid refinements-based framework, which is observed to produce high-quality solutions within short CPU time for large-scale contexts. The numerical results on several typical physical systems corroborate the effectiveness and better scalability of our approach, which also allows the first visualization for the approximate optimal transport maps between electrons in three-dimensional contexts.

2023-07-06 16:10:00 - 17:00:00 @ Rm 1601, Guanghua East Tower [poster]

Abstract: Click to expand In this project, we study the use of a machine learning-based estimator as a control variate for mitigating the variance of Monte Carlo sampling. We examine a prototype estimation problem that involves simulating the moments of a Sobolev function based on observations obtained from random quadrature nodes. Firstly, we establish an information-theoretic lower bound for the problem. We then study a specific quadrature rule that employs a nonparametric regression-adjusted control variate to reduce the variance of the Monte Carlo simulation. We demonstrate that this kind of quadrature rule can improve the Monte Carlo rate and achieve the minimax optimal rate under a sufficient smoothness assumption. Due to the Sobolev Embedding Theorem, the sufficient smoothness assumption eliminates the existence of rare and extreme events. Finally, we show that, in the presence of rare and extreme events, a truncated version of the Monte Carlo algorithm can achieve the minimax optimal rate while the control variate cannot improve the convergence rate.