2025 Fall
This semester's seminar is organized by Shibo Zeng and Yongle Xie, and co-organized by the graduate student union in the School of Mathematical Sciences at Fudan. The seminar is partially sponsored by the Shanghai Key Laboratory for Contemporary Applied Mathematics.
2025-12-25 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
An efficient numerical algorithm for backward stochastic PDEs based on the splitting-up method
- Speaker: Juzhou Li (Fudan University)
- Advisor: Shanjian Tang (Fudan University)
Abstract:
We present an efficient numerical method for solving backward
stochastic partial differential equations (BSPDEs), a class of
problems that is notoriously challenging because spatial
discretization suffers from the curse of dimensionality. Our
approach leverages a splitting technique to decompose the original
BSPDE into simpler subproblems, effectively alleviating the curse
of dimensionality. The optimal first-order strong convergence rate
is derived when the method is applied to a general class of
non-linear BSPDEs. This convergence rate holds for the solution
$u$ itself, its spatial gradient $\nabla u$, and the process $q$.
A robustness result is also given, and numerical experiments
validate the theoretical results.
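As a rough illustration of the splitting idea, the following
schematic (my own generic form, not necessarily the scheme
analyzed in the talk) separates the second-order operator from the
stochastic part in one backward time step:

```latex
% Model BSPDE on [0,T] (schematic): find the pair (u, q) with terminal
% datum u(T, x) = g(x) such that
%   -du(t,x) = [\mathcal{L}u(t,x) + f(t,x,u,\nabla u,q)] dt - q(t,x) dW_t .
% One splitting-up step from t_{n+1} down to t_n (illustrative form):
\begin{aligned}
&\text{(i) solve } -d\tilde{u} = f(t,x,\tilde{u},\nabla\tilde{u},\tilde{q})\,dt
  - \tilde{q}\,dW_t \text{ on } [t_n, t_{n+1}],\quad \tilde{u}(t_{n+1}) = u_{n+1};\\
&\text{(ii) solve } -\partial_t v = \mathcal{L}v \text{ on } [t_n, t_{n+1}],\quad
  v(t_{n+1}) = \tilde{u}(t_n), \text{ and set } u_n = v(t_n),\ q_n = \tilde{q}(t_n).
\end{aligned}
```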
Past Presentations
2025-09-18 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Sharp Estimates for Optimal Multistage Group Partition Testing
- Speaker: Guojiang Shao (Fudan University)
- Advisor: Qi Zhang (Fudan University)
Abstract:
In multistage group testing, the tests within the same stage are
considered nonadaptive, while those conducted across different
stages are adaptive. In particular, when the pools within the same
stage are disjoint, meaning that the entire set is divided into
several disjoint subgroups, it is referred to as a multistage
group partition testing problem, denoted as the $(n, d, s)$
problem, where $n$, $d$, and $s$ represent the total number of
items, defectives, and stages respectively. This paper presents
exact solutions for the $(n,1,s)$ and $(n,d,2)$ problems for the
first time. Furthermore, we develop a general dynamic programming
framework for the $(n,d,s)$ problem, which allows us to derive
sharp upper and lower bounds.
2025-09-25 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Reinforcement Learning for Heterogeneous DAG Scheduling with
Weighted Cross-Attention
- Speaker: Ruisong Zhou (Peking University)
- Advisor: Zaiwen Wen (Peking University)
Abstract:
Efficient scheduling of directed acyclic graphs (DAGs) in
heterogeneous environments is challenging due to diverse resource
capacities and intricate dependencies. In practice, the need for
scalability across environments with varying resource pools, task
types, and other settings, together with the demand for rapid
schedule generation, further complicates the problem. We propose
WeCAN, an end-to-end reinforcement learning framework for
heterogeneous DAG scheduling that accounts for task-resource
compatibility. WeCAN rapidly generates
schedules through single-pass network inference. Leveraging the
weighted cross-attention layer, WeCAN utilizes all available
environment information while preserving scalability across
diverse heterogeneous environments. Moreover, we introduce a
criterion to analyze the optimality gap inherent in
list-scheduling-based methods, revealing barriers preventing these
methods from consistently finding optimal solutions. The skip
action introduced in our framework addresses this gap. Our
approach delivers robust performance and scalability,
outperforming state-of-the-art methods across diverse datasets.
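The abstract does not spell out the layer itself, so the following
is only a generic sketch of what weighted cross-attention between
task and resource embeddings can look like; the `compat` matrix
(task-resource compatibility weights) and all dimensions are
assumptions for illustration:

```python
import numpy as np

def weighted_cross_attention(tasks, resources, compat, d_k=32, seed=0):
    """Generic weighted cross-attention sketch (not WeCAN's exact
    layer): task embeddings attend to resource embeddings, and a
    task-resource compatibility matrix reweights the attention logits."""
    rng = np.random.default_rng(seed)
    d_t, d_r = tasks.shape[1], resources.shape[1]
    Wq = rng.normal(scale=d_t ** -0.5, size=(d_t, d_k))
    Wk = rng.normal(scale=d_r ** -0.5, size=(d_r, d_k))
    Wv = rng.normal(scale=d_r ** -0.5, size=(d_r, d_k))
    q, k, v = tasks @ Wq, resources @ Wk, resources @ Wv
    logits = q @ k.T / np.sqrt(d_k) + np.log(compat + 1e-9)
    attn = np.exp(logits - logits.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)   # row-wise softmax
    return attn @ v                           # one fused embedding per task

tasks = np.random.default_rng(1).normal(size=(5, 16))     # 5 DAG tasks
resources = np.random.default_rng(2).normal(size=(3, 8))  # 3 resources
compat = np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1],
                   [1, 1, 1], [0, 0, 1]], dtype=float)    # 0 = incompatible
print(weighted_cross_attention(tasks, resources, compat).shape)  # (5, 32)
```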
2025-10-09 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Utilizing Causal Network Markers to Identify Tipping Points ahead
of Critical Transition
- Speaker: Shirui Bian (Fudan University)
- Advisor: Wei Lin (Fudan University)
Abstract:
Carefully designed early-warning signals are widely used to
predict critical transitions in complex systems, making it
possible to steer a system away from a catastrophic state by
introducing timely interventions. Traditional signals, including
the dynamical network biomarker (DNB), are based on statistical
properties such as the variance and autocorrelation of nodal
dynamics; they overlook directional interactions and thus have
difficulty both capturing underlying mechanisms and remaining
robust against noise perturbations. This paper therefore
introduces a framework of causal network markers (CNMs) that
incorporates causality indicators reflecting the directional
influence between variables. To detect and identify tipping
points ahead of critical transitions, two markers are designed:
CNM-GC for linear causality and CNM-TE for non-linear causality,
together with a functional representation of different causality
indicators and a clustering technique to verify the system's
dominant group. Through demonstrations on benchmark models and
real-world epileptic-seizure datasets, the CNM framework shows
higher predictive power and accuracy than the traditional DNB
indicator. Owing to their versatility and scalability, the CNMs
are suitable for comprehensively evaluating complex systems; a
most promising application is the identification of tipping
points in clinical diseases.
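As a concrete building block, a pairwise linear Granger-causality
indicator can be computed as below (an illustrative stand-in:
CNM-GC aggregates such indicators over the network, and CNM-TE
replaces them with transfer entropy; the exact marker definitions
are in the paper):

```python
import numpy as np

def granger_strength(x, y, lag=2):
    """Linear Granger-causality indicator for x -> y. Positive output
    means past x improves the prediction of y beyond y's own past."""
    T = len(y)
    target = y[lag:]
    Yl = np.array([y[t - lag:t] for t in range(lag, T)])  # lagged y
    Xl = np.array([x[t - lag:t] for t in range(lag, T)])  # lagged x
    def rss(design):
        design = np.column_stack([design, np.ones(len(design))])
        beta, *_ = np.linalg.lstsq(design, target, rcond=None)
        r = target - design @ beta
        return r @ r
    # log-ratio of residuals: restricted (y only) vs. full (y and x)
    return np.log(rss(Yl) / rss(np.column_stack([Yl, Xl])))

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = np.zeros(2000)
for t in range(1, 2000):                 # x drives y with a one-step lag
    y[t] = 0.5 * y[t - 1] + 0.4 * x[t - 1] + 0.1 * rng.normal()
print(granger_strength(x, y), granger_strength(y, x))  # first >> second
```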
2025-10-16 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Phenomenon-Driven Deep Learning Theory: From Implicit
Regularization in Matrix Factorization to Loss Spike Mechanisms in
Adam
- Speaker: Zhiwei Bai (Shanghai Jiao Tong University)
- Advisor: Yaoyu Zhang, Zhiqin Xu (Shanghai Jiao Tong University)
Abstract:
Deep neural networks, as highly nonlinear complex systems, present
formidable theoretical challenges. Phenomenon-driven
research—grounded in meticulous observation and carefully designed
experiments to discover intrinsic system patterns—offers a crucial
gateway to understanding these complex systems. This talk presents
our recent advances in deep learning generalization and
optimization theory through a phenomenon-driven approach. One of
the most counterintuitive phenomena in modern machine learning is
that neural networks maintain excellent generalization despite
overparameterization. Understanding implicit regularization
mechanisms in overparameterized models has become essential to
deep learning theory. Matrix factorization models, as an important
subclass, provide an ideal testbed for studying implicit
regularization. This talk first reviews the generalization puzzle,
and introduces our discovery of a fundamental structural property
of loss landscapes: the Embedding Principle, which reveals an
elegant inheritance relationship between critical points across
networks of different scales. Building on this, we analyze matrix
factorization training dynamics from a model-data decoupling
perspective, elucidating when, how, and why different implicit
regularization effects (low rank, low nuclear norm) emerge,
providing a unified understanding of this system. This talk also
presents another phenomenon-driven study: loss spikes, sudden and
sharp surges in the loss function that subsequently subside. These
spikes are observed across a wide range of network architectures
and datasets, yet their underlying mechanisms remain elusive.
While previous studies attributed loss spikes to complex loss
landscape geometry, we find they originate from Adam's adaptive
preconditioning mechanism. Specifically, when gradients in certain
layers gradually diminish during training, the adaptive mechanism
persistently pushes the maximum eigenvalue of the preconditioned
Hessian above the stability threshold, triggering sustained
instability. This result provides a novel theoretical perspective
for understanding and controlling loss spike behavior.
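A toy version of this mechanism can be reproduced in a few lines
(my own 1-D setup, not the speaker's experiments): run Adam on a
quadratic starting near the minimum, so the gradient, and hence
the second-moment estimate, is already tiny:

```python
import numpy as np

# Toy sketch of the mechanism. f(x) = h*x^2/2, and Adam starts near the
# minimum so the gradient -- hence the second-moment estimate v -- is
# tiny, mimicking a layer whose gradients have diminished. The
# preconditioned curvature h/(sqrt(v)+eps) then far exceeds the
# stability threshold ~2/lr, the loss spikes, and it subsides once v
# re-inflates.
h, lr, b1, b2, eps = 50.0, 1e-2, 0.9, 0.999, 1e-8
x, m, v = 1e-6, 0.0, 0.0
losses = [0.5 * h * x * x]
for t in range(1, 2001):
    g = h * x
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    mhat, vhat = m / (1 - b1 ** t), v / (1 - b2 ** t)
    if t == 1:  # preconditioned curvature vs. stability threshold
        print(f"h/sqrt(v) = {h / (np.sqrt(vhat) + eps):.1e}, "
              f"2/lr = {2 / lr:.0e}")
    x -= lr * mhat / (np.sqrt(vhat) + eps)
    losses.append(0.5 * h * x * x)
print(f"loss: start {losses[0]:.1e}, peak {max(losses):.1e}, "
      f"final {losses[-1]:.1e}")
```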
2025-10-23 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Adaptive-Growth Randomized Neural Networks for PDEs: Algorithms
and Numerical Analysis
- Speaker: Haoning Dang (Xi'an Jiaotong University)
- Advisor: Fei Wang (Xi'an Jiaotong University)
Abstract:
Randomized neural network (RaNN) methods have been proposed for
solving various partial differential equations (PDEs),
demonstrating high accuracy and efficiency. However, initializing
the fixed parameters remains a challenging issue. Additionally,
RaNNs often struggle to solve PDEs with sharp or discontinuous
solutions. In this talk, we propose a novel approach called
Adaptive-Growth Randomized Neural Network (AG-RaNN) to address
these challenges. We introduce growth strategies that expand the
neural network, making it wider and deeper to improve the accuracy
of the numerical solution. A key feature of AG-RaNN is its
adaptive strategy for determining the weights and biases of newly
added neurons, enabling the network to expand in both width and
depth without requiring additional training. Instead, all weights
and biases are generated constructively, significantly enhancing
the network's approximation capabilities compared to conventional
randomized neural network methods. In addition, a domain splitting
strategy is introduced to handle the case of discontinuous
solutions. A comprehensive theoretical analysis of RaNN methods is
also presented, covering approximation, statistical, and
optimization errors.
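For readers unfamiliar with the base method, here is a minimal
RaNN sketch: fixed random hidden parameters and least-squares
output weights, applied to a 1-D Poisson problem. The
adaptive-growth strategy of AG-RaNN is not reproduced here, and
all parameter choices below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 200, 400                        # neurons, collocation points
w = rng.uniform(-10, 10, M)            # fixed random hidden weights
b = rng.uniform(-10, 10, M)            # fixed random hidden biases
x = np.linspace(0.0, 1.0, N)[:, None]

t = np.tanh(w * x + b)                 # phi_j(x_i), shape (N, M)
t2 = -2.0 * w**2 * t * (1.0 - t**2)    # phi_j''(x_i)

# Least-squares collocation for -u'' = pi^2 sin(pi x), u(0) = u(1) = 0,
# whose exact solution is u(x) = sin(pi x).
f = np.pi**2 * np.sin(np.pi * x).ravel()
A = np.vstack([-t2,
               np.tanh(w * 0.0 + b)[None, :],   # boundary row at x = 0
               np.tanh(w * 1.0 + b)[None, :]])  # boundary row at x = 1
rhs = np.concatenate([f, [0.0, 0.0]])
c, *_ = np.linalg.lstsq(A, rhs, rcond=None)     # train output weights only

u = t @ c
print("max error:", np.abs(u - np.sin(np.pi * x).ravel()).max())
```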
2025-10-30 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Convection-Diffusion Equation: A Theoretically Certified Framework
for Neural Networks
- Speaker: Tangjun Wang (Tsinghua University)
- Advisor: Zuoqiang Shi (Tsinghua University)
Abstract:
Differential equations have demonstrated intrinsic connections to
network structures, linking discrete network layers through
continuous equations. Most existing approaches focus on the
interaction between ordinary differential equations (ODEs) and
feature transformations, primarily working on input signals. In
this paper, we study the partial differential equation (PDE) model
of neural networks, viewing the neural network as a functional
operating on a base model provided by the last layer of the
classifier. Inspired by scale-space theory, we theoretically prove
that this mapping can be formulated by a convection-diffusion
equation, under interpretable and intuitive assumptions from both
neural network and PDE perspectives. This theoretically certified
framework covers various existing network structures and training
techniques, offering a mathematical foundation and new insights
into neural networks. Moreover, based on the convection-diffusion
equation model, we design a new network structure that
incorporates a diffusion mechanism into the network architecture
from a PDE perspective. Extensive experiments confirm the
effectiveness of the proposed model.
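For reference, the model class named in the abstract has the
generic form below (the paper's exact coefficients and assumptions
are not reproduced here):

```latex
% u: the prediction function propagated through network depth t;
% v: a convection (transport) field induced by the layers;
% sigma: a diffusion coefficient tied to the smoothing mechanism.
\frac{\partial u}{\partial t}(x,t) + v(x,t)\cdot\nabla u(x,t)
  = \sigma\,\Delta u(x,t) .
```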
2025-11-06 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
An adaptive Hermite spectral method for the Boltzmann equation
- Speaker: Jie Wu (Peking University)
- Advisor: Sihong Shao (Peking University)
Abstract:
In this talk, we will propose an adaptive Hermite spectral method
for the three-dimensional velocity space of the Boltzmann equation
guided by a newly developed frequency indicator. For the
homogeneous problem, the indicator is defined by the contribution
of high-order coefficients in the spectral expansion. For the
non-homogeneous problem, a Fourier-Hermite scheme is employed,
with the corresponding frequency indicator formulated based on
distributions across the entire spatial domain. The adaptive
Hermite method includes scaling and $p$-adaptive techniques to
dynamically adjust the scaling factor and expansion order
according to the indicator. Numerical experiments cover both
homogeneous and non-homogeneous problems in up to three spatial
dimensions. Results demonstrate that the adaptive method
substantially reduces $L^2$ errors at negligible computational
cost, and the $p$-adaptive method achieves time savings of up to
74%.
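The talk will give the precise definition of the frequency
indicator; one common form of such an indicator, shown here only
as an illustrative guess, measures the energy fraction carried by
the highest-order modes:

```latex
% For a truncated expansion f(v) = \sum_{|k| \le N} c_k H_k(v), the
% indicator is the energy fraction of the highest-order modes:
\mathcal{F}(f) = \left( \frac{\sum_{N - m < |k| \le N} c_k^2}
                             {\sum_{|k| \le N} c_k^2} \right)^{1/2} .
% The scaling factor or the expansion order N is then adjusted
% whenever \mathcal{F}(f) crosses a prescribed tolerance.
```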
2025-11-13 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Explicit Global Convergence Rates of BFGS without Line Search
- Speaker: Jianjiang Yu (Fudan University)
- Advisor: Weiguo Gao, Luo Luo (Fudan University)
Abstract:
We present new theoretical results for the BFGS method with an
adaptive step size [Gao and Goldfarb, Optimization Methods and
Software, 34(1):194-217, 2019], showing explicit two-phase global
convergence: a linear phase at rate $\mathcal{O}((1 -
1/\varkappa)^{k})$ and a superlinear phase at
$\mathcal{O}((\varkappa/k)^{k})$, where $k$ is the iteration
counter and $\varkappa$ is the condition number. In contrast,
classical analyses establish asymptotic convergence only, and
recent non-asymptotic results mainly address local convergence
under the unit step size or global guarantees with line search. We
further propose a smoothness-aided variant that takes a larger
adaptive step by leveraging the gradient Lipschitz continuity,
thereby accelerating early convergence. These results provide the
first explicit non-asymptotic global characterization of BFGS
without line search.
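For orientation, the sketch below shows the ingredients the
analysis works with; note that the step-size rule here is a crude
stand-in (unit steps with $H_0 = I/L$), not the adaptive step of
Gao and Goldfarb that the talk actually analyzes:

```python
import numpy as np

def bfgs_no_line_search(grad, x0, L, iters=100):
    """BFGS without line search (illustrative: unit steps and
    H0 = I/L stand in for the adaptive step size of the talk;
    the inverse-Hessian update below is the standard one)."""
    x = x0.astype(float)
    n = len(x)
    H = np.eye(n) / L               # initial inverse-Hessian approximation
    I = np.eye(n)
    for _ in range(iters):
        g = grad(x)
        s = -H @ g                  # quasi-Newton step, no line search
        if np.linalg.norm(s) < 1e-12:
            break                   # converged; avoid a degenerate pair
        y = grad(x + s) - g
        x = x + s
        rho = 1.0 / (y @ s)         # positive for strongly convex f
        H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
            + rho * np.outer(s, s)  # standard BFGS inverse update
    return x

A = np.diag([1.0, 10.0, 100.0])     # f(x) = x^T A x / 2, kappa = 100
x = bfgs_no_line_search(lambda z: A @ z, np.ones(3), L=100.0)
print(np.linalg.norm(A @ x))        # final gradient norm
```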
2025-11-20 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Deep Learning Approaches to Bayesian Phylogenetic Inference
- Speaker: Tianyu Xie (Peking University)
- Advisor: Cheng Zhang (Peking University)
Abstract:
Reconstructing the evolutionary relationships among species, i.e.,
phylogenetic inference, has been one of the central problems in
computational biology. With a phylogenetic prior and evolutionary
substitution likelihood model, this problem is formulated as
Bayesian phylogenetic inference of the posterior distribution over
phylogenetic trees. Previous approaches often leverage
Monte Carlo-type methods, e.g., MCMC, which can suffer from
slow convergence and local mode trapping in practice. In this
talk, we discuss how to integrate variational inference with deep
learning as a powerful solution to Bayesian phylogenetic
inference. Specifically, we develop an autoregressive
probabilistic model called ARTree (and an accelerated version of
it) to model the tree topologies, and a semi-implicit hierarchical
construction for the branch lengths. We also introduce
representation learning for phylogenetic trees to provide
high-resolution representations that are ready-to-use for
downstream tasks. These deep learning approaches to Bayesian
phylogenetic inference achieve state-of-the-art inference
accuracies and inspire broader follow-up innovations.
2025-12-04 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
High Precision Computation of Floquet Multipliers and Subspaces
using Periodic Polynomial Eigenvalue Problem
- Speaker: Yehao Zhang (Fudan University)
- Advisor: Yangfeng Su (Fudan University)
Abstract:
This report introduces a machine learning method that combines
Tensor Neural Networks (TNN) with homogenization theory for
solving elliptic multiscale equations. The core advantage of TNN
lies in its unique tensor structure, which allows the computation
of high-dimensional neural network function integrals to be
reduced to one-dimensional integrals. This enables the design of
highly accurate high-dimensional integration methods, whose
computational complexity scales only polynomially with the number
of dimensions. Leveraging this feature, we design a high-precision
solver for multiscale problems. Specifically, the original problem
is first transformed via homogenization into a series of cell
problems and a homogenized equation. These are then solved
separately using TNN-based methods. Unlike conventional machine
learning methods that rely on Monte Carlo sampling, our approach
employs deterministic numerical integration, achieving high
computational accuracy. In particular, for cases where the
multiscale coefficients depend on both fast and slow variables,
the corresponding cell problems are defined on high-dimensional
domains; the TNN-based approach enables efficient and accurate
computation for such cases compared to traditional methods,
thereby extending the applicability of homogenization techniques.
We also generalize this approach to elliptic multiscale eigenvalue
problems.
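The claimed reduction of high-dimensional integrals rests on an
elementary identity for separable (TNN-type) functions, sketched
here:

```latex
% For a sum of rank-one separable terms (the TNN ansatz),
%   u(x_1, ..., x_d) = \sum_{r=1}^{R} \prod_{i=1}^{d} \phi_{r,i}(x_i),
% the d-dimensional integral factorizes into 1-D integrals:
\int_{\Omega_1 \times \cdots \times \Omega_d} u\,dx
  = \sum_{r=1}^{R} \prod_{i=1}^{d} \int_{\Omega_i} \phi_{r,i}(x_i)\,dx_i ,
% so accurate quadrature costs R * d one-dimensional rules instead of
% a full tensor-product grid.
```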
2025-12-11 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
A simple method to overcome oversmoothing
- Speaker: Weiqi Guan (Fudan University)
- Advisor: Bobo Hua (Fudan University)
Abstract:
In this lecture, we empirically investigate the distinct long-term
behaviors of prenorm and postnorm attention-based graph neural
networks. We observe that prenorm models, while free of
oversmoothing, are prone to the curse of depth. In contrast,
postnorm models exhibit the opposite behavior. To mitigate
oversmoothing, we propose a simple and efficient approach that
incorporates Laplacian energy multiplication prior to the
diffusion step. Both theoretical analysis and empirical results
demonstrate that our method effectively alleviates oversmoothing.
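My reading of the proposed step, shown as a sketch (the authors'
exact operator may differ): multiply the features by
(I + alpha*L), a high-pass "Laplacian energy" boost, before each
diffusion step:

```python
import numpy as np

def run_layers(X, A_hat, L, alpha, steps):
    """Diffusion layers with a Laplacian-energy boost before each
    diffusion step (illustrative reading of the abstract, not the
    authors' exact operator). alpha = 0 recovers plain diffusion."""
    n = A_hat.shape[0]
    for _ in range(steps):
        X = (np.eye(n) + alpha * L) @ X   # high-pass boost
        X = A_hat @ X                     # smoothing / diffusion
    return X

rng = np.random.default_rng(0)
n = 30
A = (rng.random((n, n)) < 0.2).astype(float)
A = np.triu(A, 1)
A = A + A.T + np.eye(n)                   # symmetric, with self-loops
d = A.sum(axis=1)
A_hat = A / np.sqrt(np.outer(d, d))       # normalized adjacency
L = np.eye(n) - A_hat                     # normalized Laplacian
X = rng.normal(size=(n, 4))

for alpha in (0.0, 3.0):                  # deliberately large boost
    Y = run_layers(X, A_hat, L, alpha, steps=30)
    Y = Y / np.linalg.norm(Y)             # compare feature directions only
    print(alpha, float(np.trace(Y.T @ L @ Y)))  # Dirichlet energy
```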
2025-12-18 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
High-Precision Seismology: Seismic-petrophysical Probabilistic
Inversion
- Speaker: Jin Wen (Tsinghua University)
- Advisor: Dinghui Yang (Tsinghua University)
Abstract:
The primary objective of seismic-petrophysical inversion is to
predict reservoir rock and fluid properties from observed
data—most notably elastic parameters—which are fundamental to
exploration geophysics. This prediction task reformulates the
inverse problem as a posterior probability model conditioned on
the observed data. Within the Bayesian framework,
physical constraints inform the construction of the likelihood
function, while data constraints guide the development of the
prior model. A central challenge lies in building high-fidelity
mathematical models grounded in both physics and data. An
additional challenge is the need for algorithmic innovation to
solve these complex models efficiently. For instance, although
common probabilistic inversion algorithms offer global convergence
capabilities, their computational cost is often orders of
magnitude higher—typically thousands of times—than that of
lower-precision, locally convergent seismic inversion methods.
This substantial computational burden has severely limited the
industrial deployment of high-precision seismology. In this talk,
I will present our latest research on both modeling strategies and
algorithmic developments for seismic-petrophysical probabilistic
inversion, and demonstrate the effectiveness of these approaches
in real industrial applications. Moreover, the rapid rise of deep
learning has, to some extent, reshaped the research paradigm of
traditional seismology. Accordingly, neural network models and
deep learning-based inversion algorithms will also be featured in
this report.
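For reference, the Bayesian decomposition behind this formulation
is the standard one, with the two modeling challenges mapping onto
the two factors:

```latex
% m: reservoir rock/fluid properties; d: observed data (e.g., elastic
% parameters derived from seismic records). Bayes' rule gives
p(m \mid d) \;\propto\;
  \underbrace{p(d \mid m)}_{\text{likelihood: physics-based}} \;\cdot\;
  \underbrace{p(m)}_{\text{prior: data-driven}} ,
% and probabilistic inversion samples or optimizes this posterior,
% which is where the computational cost discussed above concentrates.
```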