2025 Spring
The seminar this semester is organized by Qiang Wu and Ming Li, and co-organized by the graduate student union of the School of Mathematical Sciences at Fudan University. The seminar is partially sponsored by the Shanghai Key Laboratory for Contemporary Applied Mathematics.
2025-06-05 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
On a Shrink-And-Expand Technique for Symmetric Block Eigensolvers
- Speaker: Yuqi Liu (Fudan University)
- Advisor: Weiguo Gao and Meiyue Shao (Fudan University)
Abstract:
In symmetric block eigenvalue algorithms, such as the subspace iteration algorithm and the locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm, a large block size is often employed to achieve robustness and rapid convergence. However, a large block size also increases the computational cost. Traditionally, the block size is reduced only after some eigenpairs have converged, a practice known as deflation. In this work, we propose a more aggressive, non-deflation-based technique in which the block size is adjusted dynamically during the algorithm. This technique can be applied to a wide range of block eigensolvers, reducing computational cost without compromising convergence speed. We present three adaptive strategies for adjusting the block size and apply them to four well-known eigensolvers as examples. Detailed theoretical analysis and numerical experiments illustrate the efficiency of the proposed technique. In practice, an overall acceleration of 20% to 30% is observed.
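The shrink idea can be illustrated on a plain subspace iteration. The sketch below is our own NumPy toy, not the speakers' algorithm; the shrink trigger and guard size are arbitrary choices. It starts with a generous block and drops trailing guard vectors once the wanted residuals become moderately small:

```python
import numpy as np

def subspace_iteration_adaptive(A, nev, block0, tol=1e-8, maxit=500):
    """Subspace iteration for the `nev` largest eigenpairs of symmetric A,
    shrinking the block from `block0` once the wanted residuals are small."""
    n = A.shape[0]
    m = block0
    rng = np.random.default_rng(0)
    X = np.linalg.qr(rng.standard_normal((n, m)))[0]
    for _ in range(maxit):
        X = np.linalg.qr(A @ X)[0]               # power step + re-orthonormalization
        w, V = np.linalg.eigh(X.T @ A @ X)       # Rayleigh-Ritz projection
        order = np.argsort(-w)                   # largest Ritz values first
        w, X = w[order], X @ V[:, order]
        R = A @ X[:, :nev] - X[:, :nev] * w[:nev]
        res = np.linalg.norm(R, axis=0)          # residuals of the wanted pairs
        if res.max() < tol:
            break
        if m > nev + 2 and res.max() < 1e-4:     # shrink: drop trailing guard vectors
            m = nev + 2
            X = X[:, :m]
    return w[:nev], X[:, :nev]
```

The later iterations then run with a block of size `nev + 2` instead of `block0`, which is where the cost saving comes from.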
2025-06-12 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Eliminating Ratio Bias for Gradient-based Simulated Parameter Estimation
- Speaker: Zehao Li (Peking University)
- Advisor: Yijie Peng (Peking University)
Abstract:
This article addresses the challenge of parameter calibration in stochastic models where the likelihood function is not analytically available. We propose a gradient-based simulated parameter estimation framework, leveraging a multi-time scale algorithm that tackles the issue of ratio bias in both maximum likelihood estimation and posterior density estimation problems. A nested simulation optimization structure is introduced, accompanied by comprehensive theoretical analyses, including strong convergence, asymptotic normality, convergence rates, and budget allocation strategies. These theoretical results provide crucial insights for algorithm design and hyperparameter selection. The framework is further extended to neural network training, offering a novel perspective on stochastic approximation in machine learning. Numerical experiments show that our algorithm improves estimation accuracy while saving computational cost, making it effective for parameter estimation in stochastic systems.
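The ratio-bias phenomenon is easy to see in a scalar toy problem: the plug-in estimator $\bar{Y}/\bar{X}$ of $\mathbb{E}[Y]/\mathbb{E}[X]$ is biased at every finite sample size, whereas a Robbins-Monro recursion driven by an unbiased estimate of the residual $\mathbb{E}[X]z - \mathbb{E}[Y]$ converges to the exact ratio. The following single-timescale sketch is our own illustration of that core idea, not the authors' multi-time scale algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: z* = E[Y] / E[X] with X = 1 + U(0,1) and Y = 2X + noise, so z* = 2.
z = 0.0
for n in range(200_000):
    x = 1.0 + rng.uniform()
    y = 2.0 * x + 0.5 * rng.standard_normal()
    beta = 1.0 / (n + 1) ** 0.7          # Robbins-Monro step sizes
    z -= beta * (x * z - y)              # unbiased estimate of E[X]z - E[Y]
```

Because the driving term `x * z - y` is an unbiased estimate of the residual at the current iterate, no ratio of sample averages is ever formed, which is what removes the ratio bias.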
Past Presentations
2025-02-20 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Sharp Asymptotic Stability of Blasius Profile in the Steady Prandtl Equation
- Speaker: Cheng Yuan (Fudan University)
- Advisor: Zhen Lei (Fudan University)
Abstract:
In this talk, I present an asymptotic stability result concerning
the self-similar Blasius profiles $[\bar{u}, \bar{v}]$ of the
stationary Prandtl boundary layer equation. Serrin (1967, Proc.\ R.\ Soc.\ Lond.)
first showed, using the von Mises transform and maximum principle techniques,
that the profiles $[\bar{u}, \bar{v}]$ act as a self-similar attractor of
solutions $[u, v]$ to the Prandtl equation: as $x \to \infty$,
$\|u - \bar{u}\|_{L^{\infty}_{y}} \to 0$. Iyer (2020, ARMA) employed
refined energy methods to derive an explicit convergence rate for initial
data close to Blasius. Wang and Zhang (2023, Math.\ Ann.) utilized barrier
function methods, removing the smallness assumption but imposing stronger asymptotic
conditions on the initial data. It was suggested that the optimal convergence rate
should be $\|u-\bar{u}\|_{L^{\infty}_{y}}\lesssim (x+1)^{-\frac{1}{2}}$,
treating the stationary Prandtl equation as a 1-D parabolic equation
in the whole space.
In our work, we establish that $\|u - \bar{u}\|_{L^{\infty}_{y}} \lesssim (x+1)^{-1}$.
Our proof relies on discovering nearly conserved low-frequency quantities and inherently
degenerate structures at the boundary, which enhance the convergence rate through iteration
techniques. Notably, the rate we obtain is optimal: there exist special solutions of the
Prandtl equation for which the convergence to the Blasius profile is exactly of order
$(x+1)^{-1}$. This is joint work with Prof. Hao Jia and Prof. Zhen Lei.
2025-02-27 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Parallel Multi-Coordinate Descent Methods for Full Configuration Interaction
- Speaker: Yuejia Zhang (Fudan University)
- Advisor: Weiguo Gao and Yingzhou Li (Fudan University)
Abstract:
Solving the time-independent Schrödinger equation gives us full access to the chemical
properties of molecules. Among ab-initio methods, full configuration interaction (FCI)
provides the numerically exact solution under a predefined basis set. However, the FCI
problem scales exponentially with the number of basis functions and electrons, and thus suffers
from the curse of dimensionality. We develop a multi-threaded parallel coordinate descent
full configuration interaction algorithm (CDFCI) for electronic structure ground-state calculations
in the configuration interaction framework. The algorithm solves an unconstrained nonconvex
optimization problem via a modified block coordinate descent method with a deterministic
compression strategy. CDFCI captures and updates appreciable determinants with different
frequencies proportional to their importance. We demonstrate the efficiency of the algorithm
on practical systems.
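To convey the coordinate-descent flavor (one coefficient updated at a time, greedily chosen by gradient magnitude), here is a toy greedy coordinate descent for the smallest eigenpair of a small symmetric matrix. It is our own illustration and omits the compression strategy and parallelism that the talk is about:

```python
import numpy as np

def cd_ground_state(A, maxit=20000, tol=1e-10):
    """Greedy coordinate descent for the smallest eigenpair of symmetric A.
    Each step picks the coordinate with the largest Rayleigh-quotient
    gradient and minimizes exactly over span{x, e_i} via a 2x2
    Rayleigh-Ritz problem -- a toy analogue of updating one
    determinant coefficient at a time."""
    n = A.shape[0]
    x = np.ones(n) / np.sqrt(n)
    for _ in range(maxit):
        Ax = A @ x
        rho = x @ Ax                          # Rayleigh quotient (||x|| = 1)
        g = 2.0 * (Ax - rho * x)              # its gradient
        if np.linalg.norm(g) < tol:
            break
        i = int(np.argmax(np.abs(g)))         # greedy coordinate choice
        S = np.zeros((n, 2))
        S[:, 0] = x
        S[i, 1] = 1.0
        B = S.T @ S                           # 2x2 generalized eigenproblem
        H = S.T @ (A @ S)
        Li = np.linalg.inv(np.linalg.cholesky(B))
        w, U = np.linalg.eigh(Li @ H @ Li.T)
        x = S @ (Li.T @ U[:, 0])              # minimizer in the 2D subspace
        x /= np.linalg.norm(x)
    return x @ A @ x, x
```

Each update costs only one matrix-vector product plus $O(n)$ work, which is the property the FCI setting exploits at vastly larger scale.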
2025-03-06 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
A BDF-Spectral Method for Nonlocal PDEs With Long Time Delay
- Speaker: Shuxun Shi (Fudan University)
- Advisor: Wenbin Chen (Fudan University)
Abstract:
In this talk, a numerical method for a class of nonlocal PDEs with long time delay is designed.
The system involves a variable on $\Omega\times\mathbb{R}\times\mathbb{R}^{+}$, so that for
$\Omega\subset\mathbb{R}^{d}$ a $(d+2)$-dimensional problem has to be solved numerically, which is
challenging, especially for $d=2$ or $d=3$. We propose an effective numerical method:
BDF schemes and a Fourier spectral method are applied for time and space discretization, respectively,
and the long-time-delay term is treated by a Laguerre spectral method. The unique solvability of the
numerical schemes is proved, and an energy upper bound for the numerical solution over long times
is obtained via energy estimates. By applying the generalized Laguerre orthogonal projection, we
obtain an error estimate up to a finite final time for the full discretization. We present
numerical experiments to verify the energy bound and the convergence order, and give examples
showing how solutions evolve and approach the global attractor.
2025-03-13 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Applications of Large Language Models in Formal Reasoning
- Speaker: Guoxiong Gao (Peking University)
- Advisor: Bin Dong (Peking University)
Abstract:
Interactive Theorem Provers (ITPs), often referred to as formal languages, offer a reliable method
to eliminate errors in mathematical reasoning. Meanwhile, Large Language Models (LLMs) have shown
great potential to accelerate—and even automate—the formalization process. In this talk, we will
explore how LLMs are applied in key areas such as premise selection, tactic suggestion, auto-formalization,
and automated theorem proving. Additionally, we will discuss how training datasets for these tasks
are constructed, highlighting the impact of structural information on improving LLMs' performance
in Lean-related tasks, particularly in LeanSearch and our statement formalizer.
2025-03-20 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Computation of First Passage of Markov Additive Processes
- Speaker: Junxin Zhang (Fudan University)
- Advisor: Jungong Xue (Fudan University)
Abstract:
In this talk, the computation of the matrix pair describing the first passage time of a Markov additive
process is considered. This pair of matrices is characterized as a solution to an integral matrix equation,
for which we develop an iterative method. Each step requires computing the extremal solution of
a mixed linear-quadratic matrix equation, which is accomplished by a quadratically convergent algorithm.
When all the jumps are of phase-type distribution, the integral matrix equation can be transformed into
a single mixed linear-quadratic matrix equation, and thus the pair of matrices can be computed with
quadratic convergence.
2025-03-27 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
AI for Physics: Learning Hamiltonian Systems and Conservation Laws
- Speaker: Jingdong Zhang (Fudan University)
- Advisor: Wei Lin (Fudan University)
Abstract:
Accurately identifying and predicting dynamics from observational data with noise perturbations or
missing data is a significant challenge in the field of dynamical systems. In my talk, I will introduce the
Hamiltonian Neural Koopman Operator (HNKO), a novel approach that combines principles from Hamiltonian
mechanics with learning of the Koopman operator. This framework not only sustains but automatically
discovers conservation laws, leveraging foundational knowledge from mathematical physics.
The effectiveness of the HNKO and its extensions is demonstrated across various representative physical
systems, even those with hundreds or thousands of degrees of freedom. The findings indicate that
incorporating prior knowledge of the underlying system and relevant mathematical theories into the learning
framework significantly enhances the ability of machine learning to address complex physical problems.
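The mechanism of building a conservation law into a learned operator can be seen in a linear toy (our own example, not the HNKO architecture): fit a one-step map by least squares, then project it onto the orthogonal group, which makes the learned dynamics exactly norm-preserving:

```python
import numpy as np

theta = 0.1                                   # rotation = discrete harmonic oscillator map
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
rng = np.random.default_rng(1)
X = rng.standard_normal((2, 200))             # snapshots x_k
Y = R @ X + 1e-3 * rng.standard_normal((2, 200))   # noisy snapshots x_{k+1}

K = Y @ np.linalg.pinv(X)                     # plain least-squares (DMD-style) fit
U, _, Vt = np.linalg.svd(K)
K_orth = U @ Vt                               # nearest orthogonal matrix to K
```

The plain fit `K` only approximately preserves $\|x\|^2$ (the oscillator's energy), while `K_orth` preserves it exactly, by construction, for every input.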
2025-04-03 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
A Network Based Approach for Unbalanced Optimal Transport on Surfaces
- Speaker: Jiangong Pan (Tsinghua University)
- Advisor: Zuoqiang Shi (Tsinghua University)
Abstract:
In this talk, we present a neural network approach to the dynamic unbalanced optimal transport
problem on surfaces represented by point clouds. For such surfaces, traditional mesh-based methods
are difficult to apply because mesh generation is hard, whereas neural networks are easy to implement
even for complicated geometries. Moreover, instead of solving the original dynamic formulation, we consider the
Hamiltonian flow approach, i.e., the Karush-Kuhn-Tucker system. Based on this approach, we can exploit the
mathematical structure of the optimal transport problem to construct the neural network, and the loss
function can be simplified. Extensive numerical experiments are conducted on surfaces with different
geometries. We also test the method on noisy point clouds, which demonstrates its stability. The method
also generalizes readily to a diverse range of problems.
2025-04-10 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Numerical Analysis for Nonlinear Schrödinger Equations with Low Regularity or Singularity
- Speaker: Chushan Wang (National University of Singapore)
- Advisor: Weizhu Bao (National University of Singapore)
Abstract:
The nonlinear Schrödinger equation (NLSE) arises from various applications in quantum physics and chemistry,
nonlinear optics, plasma physics, Bose--Einstein condensates, etc. In these applications, it is necessary to
incorporate low-regularity or singular potential and nonlinearity into the NLSE. Typical examples of such potential
and nonlinearity include the discontinuous square-well potential, the singular Coulomb potential, the non-integer
power nonlinearity, and the logarithmic nonlinearity. Such low regularity and singularity pose significant
challenges in the analysis of standard numerical methods and the development of novel accurate, efficient,
and structure-preserving schemes.
In this talk, I will introduce several new analysis techniques to establish optimal error bounds for some widely
used numerical methods under optimally weak regularity assumptions. Based on the analysis, we also propose novel
temporal and spatial discretizations to handle the low regularity and singularity more effectively.
2025-04-17 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
$\phi$-Update: A Class of Policy Update Methods with Policy Convergence Guarantee
- Speaker: Wenye Li (Fudan University)
- Advisor: Ke Wei (Fudan University)
Abstract:
Policy optimization refers to a family of effective algorithms which search in the policy space based on policy
parameterization to solve reinforcement learning problems. Inspired by the similar update pattern of softmax
natural policy gradient and Hadamard policy gradient, we propose to study a general policy update rule called
$\phi$-update, where $\phi$ refers to a scaling function applied to advantage functions. Under very mild conditions on $\phi$,
we first establish the global asymptotic convergence of state values under $\phi$-update. We then show that the policy
produced by $\phi$-update indeed converges, even when there are multiple optimal policies. This is in stark contrast
to existing results where explicit regularizations are required to guarantee the convergence of the policy. The exact
asymptotic convergence rate of state values is further established based on the policy convergence. Lastly, we establish
the global linear convergence of $\phi$-update.
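In tabular form, a $\phi$-update multiplies each action probability by a scaling of its advantage and renormalizes; $\phi(x) = \exp(\eta x)$ recovers softmax natural policy gradient. The following sketch on a hypothetical two-state MDP is our own example, not taken from the talk:

```python
import numpy as np

gamma = 0.9
# Two states, two actions: a0 = stay, a1 = switch.  Rewards r[s, a].
r = np.array([[1.0, 0.0],
              [2.0, 0.0]])
P = np.zeros((2, 2, 2))                       # transition kernel P[s, a, s']
P[0, 0, 0] = P[1, 0, 1] = 1.0                 # a0: stay in the current state
P[0, 1, 1] = P[1, 1, 0] = 1.0                 # a1: switch states

def evaluate(pi):
    """Exact policy evaluation: V = (I - gamma * P_pi)^{-1} r_pi."""
    P_pi = np.einsum('sa,sat->st', pi, P)
    r_pi = np.einsum('sa,sa->s', pi, r)
    return np.linalg.solve(np.eye(2) - gamma * P_pi, r_pi)

pi = np.full((2, 2), 0.5)                     # uniform initial policy
eta = 1.0
for _ in range(3000):
    V = evaluate(pi)
    Q = r + gamma * np.einsum('sat,t->sa', P, V)
    A = Q - V[:, None]                        # advantage function
    pi = pi * np.exp(eta * A)                 # phi-update with phi(x) = exp(eta * x)
    pi /= pi.sum(axis=1, keepdims=True)
```

For this MDP the optimal values are $V^*(s_1) = 2/(1-\gamma) = 20$ and $V^*(s_0) = \gamma V^*(s_1) = 18$, and the iteration drives the policy toward the unique optimal deterministic policy.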
2025-04-24 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Near-Optimal Algorithms for Convex Simple Bilevel Optimization under Weak Assumptions
- Speaker: Xu Shi (Fudan University)
- Advisor: Rujun Jiang (Fudan University)
Abstract:
This work considers the simple bilevel optimization problem, which involves minimizing a composite convex function over
the optimal solution set of another composite convex minimization problem. By reformulating this bilevel problem as finding
the left-most root of a nonlinear equation and introducing a novel dual approach for the subproblems, we efficiently obtain
an $(\epsilon, \epsilon)$-optimal solution. The proposed methods achieve near-optimal complexity of $\tilde{\mathcal{O}}(1/\sqrt{\epsilon})$
for both the upper- and lower-level objectives under mild assumptions, aligning with the optimal complexity bounds of
first-order methods in unconstrained smooth or composite convex optimization when ignoring logarithmic terms.
2025-05-08 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Uniform-in-Time Estimates on the Size of Chaos for Interacting Particle Systems
- Speaker: Pengzhi Xie (Fudan University)
- Advisor: Shanjian Tang (Fudan University)
Abstract:
For any weakly interacting particle system with bounded kernel, we give uniform-in-time estimates of the $L^2$ norm
of correlation functions, provided that the diffusion coefficient is large enough. Under a more restrictive condition
on the kernels, we can remove the dependence of the lower bound for the diffusion coefficient on the initial data
and estimate the size of chaos in a weaker sense. Based on these estimates, we can further study fluctuations around the
mean-field limit.
2025-05-15 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
A Fast Direct Solver for Nonuniform Discrete Fourier Transform of Type 3
- Speaker: Jingyu Liu (Fudan University)
- Advisor: Weiguo Gao and Yingzhou Li (Fudan University)
Abstract:
Nonuniform discrete Fourier transform (NUDFT) and its inverse are widely used in various fields of scientific computing.
In this talk, we introduce a novel fast direct inversion method for type 3 NUDFT. The proposed method approximates the
type 3 NUDFT matrix as a product of a type 2 NUDFT matrix and an HSS matrix, where the type 2 NUDFT matrix is further
decomposed as the product of an HSS matrix and a uniform DFT matrix. Based on this decomposition, both
forward application and backward inversion of the matrix can be accomplished in quasi-linear complexity. Our fast backward
inversion can serve as a fast direct solver or as an efficient preconditioner. Additionally, we provide an error bound
for the approximation under specific sample distributions. Numerical results are presented to verify the relevant
theoretical properties and demonstrate the efficiency of the proposed methods.
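For reference, the transform in question is, up to normalization conventions (assumed here), $f_k = \sum_j c_j e^{-\mathrm{i} s_k t_j}$ with nonuniform points $t_j$ and nonuniform frequencies $s_k$. A naive dense evaluation makes explicit the $O(MN)$ cost that the talk's structured factorization avoids:

```python
import numpy as np

def nudft3(c, t, s):
    """Naive type 3 NUDFT: f[k] = sum_j c[j] * exp(-1j * s[k] * t[j]).
    Dense O(M*N) evaluation; the talk's method instead applies (and
    inverts) a structured approximation of this matrix in
    quasi-linear time."""
    return np.exp(-1j * np.outer(s, t)) @ c
```

Both `t` and `s` are arbitrary real arrays, which is what distinguishes type 3 from types 1 and 2 (where one of the two grids is uniform).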
2025-05-22 15:00:00 - 16:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Optimal Approximation and Generalization Analysis for Deep Neural Networks for Solving Partial Differential Equations
- Speaker: Yahong Yang (Pennsylvania State University)
- Mentor: Wenrui Hao (Pennsylvania State University)
Abstract:
Neural networks have become powerful tools for solving Partial Differential Equations (PDEs), with wide-ranging applications in engineering, physics, and biology. In this talk, we explore the performance of deep neural networks in solving PDEs, focusing on two primary sources of error: approximation error and generalization error. The approximation error captures the gap between the exact PDE solution and the neural network's hypothesis space. The generalization error arises from the challenges of learning from finite samples. We begin by analyzing the approximation capabilities of deep neural networks, particularly under Sobolev norms, and discuss strategies to overcome the curse of dimensionality. We then present generalization error bounds, offering insight into when and why deep networks can outperform shallow ones in solving PDEs.
2025-05-22 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Generalization Error Estimate of a Machine Learning Method for Solving High Dimensional Schrödinger Eigenvalue Problems
- Speaker: Hao Yu (Chinese Academy of Sciences)
- Advisor: Pingbing Ming (Chinese Academy of Sciences)
Abstract:
The high-dimensional Schrödinger eigenvalue problem plays a crucial role in various fields, such as computational chemistry,
condensed matter physics and quantum computing. Though classical numerical methods have achieved great success in solving
low-dimensional PDEs and eigenvalue problems, a major challenge persists: the curse of dimensionality. Recently, significant
progress has been made in applying deep neural networks to solve PDEs and Schrödinger eigenvalue problems. In this talk, we
introduce a machine learning method for computing eigenvalues and eigenfunctions of the Schrödinger operator with
Dirichlet boundary conditions, where the eigenvalues of interest lie deep in the spectrum. A cut-off function
technique is employed to construct trial functions that exactly satisfy the Dirichlet boundary conditions.
This approach outperforms the standard boundary penalty method,
as demonstrated by the numerical tests. Under the assumption that the eigenfunctions belong to a spectral Barron space, we derive
a dimension-free convergence rate of the generalization error bound of the method, and all constants in the error bounds grow at
most polynomially. This assumption is verified by proving a new regularity result for the eigenfunctions when the potential lies
in an appropriate spectral Barron space. Moreover, we prove a sharp accumulation rate of the generalization error and extend the
generalization bound to the normalized penalty method, which is widely used in practice.
2025-05-29 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
An Analysis for Reasoning Bias of Language Models with Small Initialization
- Speaker: Junjie Yao (Shanghai Jiao Tong University)
- Advisor: Zhiqin Xu (Shanghai Jiao Tong University)
Abstract:
Transformer-based Large Language Models (LLMs) have revolutionized Natural Language Processing by demonstrating exceptional performance
across diverse tasks. This study investigates the impact of the parameter initialization scale on the training behavior and task preferences
of LLMs. We discover that smaller initialization scales encourage models to favor reasoning tasks, whereas larger initialization scales
lead to a preference for memorization tasks. We validate this reasoning bias via real datasets and meticulously designed anchor functions.
Further analysis of initial training dynamics suggests that specific model components, particularly the embedding space and self-attention
mechanisms, play pivotal roles in shaping these learning biases. We provide a theoretical framework from the perspective of model training
dynamics to explain these phenomena. Additionally, experiments on real-world language tasks corroborate our theoretical insights. This work
enhances our understanding of how initialization strategies influence LLM performance on reasoning tasks and offers valuable guidelines for
training models.