2021 Fall
This semester's seminar is organized by Hanxiang Shen and Sihan Mao and co-organized by the graduate student union in the School of Mathematical Sciences at Fudan. This section is partially sponsored by
Ke Wei.
Past Presentations
2021-09-13 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Geometric Deformation on Objects: Unsupervised Image
Manipulation via Conjugation
- Speaker: Changqing Fu (PSL Research University)
- Advisor: Laurent D. Cohen (PSL Research University)
Abstract:
This presentation addresses the problem of image manipulation
and generation under user-specific geometric constraints.
Key challenges for solving this problem include (i) a limited
training database, which causes low generalization ability;
(ii) hand-crafted sketch input, which produces irregular structure;
and (iii) handling noise to balance image quality against
data privacy. To address these problems, a novel two-stage
approach is proposed. Interactive image deformation is
performed by editing contours. This is performed in
the latent sparse edge space with both color and gradient
information. The output of editing is then fed into a
multi-scale representation of the image to recover quality
output. The model is flexible in terms of transferability
and training efficiency. Joint work with Prof. L. D. Cohen.
2021-09-27 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
A Distributed Optimization Approach to Dominant Singular Value
Decompositions
- Speaker: Lei Wang (ICMSEC)
- Advisor: Xin Liu (ICMSEC)
Abstract:
In this talk, we propose and study a distributed algorithm for
computing dominant (or truncated) singular value decompositions
(SVD) of large and distributed data matrices by solving
an optimization problem with orthogonality constraints. We
consider a centralized network in which each node privately
holds a subset of columns and only exchanges “safe”
information with a center server, directly or indirectly,
in a collaborative effort to calculate a dominant SVD for
the whole matrix. In the framework of alternating direction
methods of multipliers (ADMM), we propose a novel formulation
for building consensus by equalizing subspaces spanned by
splitting variables instead of equalizing the variables
themselves. This technique greatly relaxes feasibility
restrictions and accelerates convergence significantly, while
at the same time yielding simple subproblems. We design several
algorithmic features, including a low-rank multiplier formula
and mechanisms for controlling subproblem solution accuracies,
to increase the algorithm’s computational efficiency and
reduce its communication overhead. More importantly, the
possibility appears remote, if possible at all, for a malicious
node to be able to uncover the data stored in another node
through publicly shared quantities in the algorithm, which
is not the case in many existing distributed or parallelized
algorithms. We present convergence analysis results, including
a worst-case complexity estimate, for our specialized nonconvex
ADMM algorithm, and extensive experimental results indicating
that the proposed algorithm, while safely guarding data
privacy, has a strong potential to deliver a cutting-edge
performance, especially when communication costs are high.
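The data layout in this abstract (each node holds a block of columns, a center aggregates) can be sketched with plain orthogonal iteration. This is not the ADMM formulation from the talk and makes no privacy claim; the function name and parameters are hypothetical, chosen only to show how per-node products can be summed to reach the dominant left singular subspace of the whole matrix.

```python
import numpy as np

def distributed_dominant_subspace(blocks, k, iters=200, seed=0):
    """Orthogonal iteration for the top-k left singular subspace.

    blocks: list of column blocks A_i of one matrix A = [A_1, ..., A_d]
            (all with the same number of rows). Illustrative only.
    """
    rng = np.random.default_rng(seed)
    m = blocks[0].shape[0]
    # Random orthonormal starting basis held by the center.
    X, _ = np.linalg.qr(rng.standard_normal((m, k)))
    for _ in range(iters):
        # Each node computes A_i (A_i^T X) locally; the center only sums,
        # which equals (A A^T) X for the full matrix.
        Y = sum(A_i @ (A_i.T @ X) for A_i in blocks)
        # Re-orthonormalize at the center.
        X, _ = np.linalg.qr(Y)
    return X
```

The columns of the returned matrix span an approximation of the dominant left singular subspace when there is a gap between the k-th and (k+1)-th singular values.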
2021-10-11 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Rank-One Network: An Effective Framework for Image Restoration
- Speaker: Shangqi Gao (Fudan University)
- Advisor: Xiahai Zhuang (Fudan University)
Abstract:
The principal rank-one (RO) components of an image represent
the self-similarity of the image, which is an important
property for image restoration. However, the RO components
of a corrupted image could be decimated by the procedure
of image denoising. We suggest that the RO property should
be utilized and the decimation should be avoided in image
restoration. To achieve this, we propose a new framework
comprised of two modules, i.e., the RO decomposition and RO
reconstruction. The RO decomposition is developed to decompose
a corrupted image into the RO components and residual. This
is achieved by successively applying RO projections to the
image or its residuals to extract the RO components. The RO
projections, based on neural networks, extract the closest
RO component of an image. The RO reconstruction aims to
reconstruct the important information from the RO components
and the residual, respectively, and to restore the image
from this reconstructed information. Experimental results
on four tasks, i.e., noise-free image super-resolution (SR),
realistic image SR, gray-scale image denoising, and color image
denoising, show that the method is effective and efficient
for image restoration, and it delivers superior performance
for realistic image SR and color image denoising.
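The successive extraction of RO components described above can be illustrated with a classical stand-in. In this sketch the neural RO projection from the talk is replaced by the best rank-one approximation from an SVD, which is an assumption made purely for illustration; the function names are hypothetical.

```python
import numpy as np

def rank_one_projection(M):
    """Closest rank-one matrix to M in Frobenius norm, via SVD.

    Stand-in for the learned RO projection mentioned in the abstract.
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return s[0] * np.outer(U[:, 0], Vt[0])

def ro_decompose(image, n_components):
    """Split an image into rank-one components plus a residual."""
    residual = np.asarray(image, dtype=float).copy()
    components = []
    for _ in range(n_components):
        # Project the current residual, then peel the component off.
        comp = rank_one_projection(residual)
        components.append(comp)
        residual = residual - comp
    return components, residual
```

By construction, the components and the residual sum back to the input image, so no information is lost by the decomposition itself.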
2021-10-18 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
On the Curse of Memory in Recurrent Neural Networks: Approximation
and Optimization Analysis
- Speaker: Zhong Li (Peking University)
- Advisor: Weinan E (Beijing Institute of Big Data Research)
Abstract:
Recurrent neural networks (RNNs) are among the most frequently
employed methods to build machine learning models on temporal
data. Despite their ubiquitous applications, many fundamental
theoretical questions remain to be answered. We study the
approximation properties and optimization dynamics of RNNs
when applied to learn input-output relationships in temporal
data. We consider the simple but representative setting of
using continuous-time linear RNNs to learn from data generated
by linear relationships. Mathematically, the latter can be
understood as a sequence of linear functionals. We prove a
universal approximation theorem of such linear functionals and
characterize the approximation rate. Moreover, a fine-grained
dynamical analysis of training linear RNNs by gradient methods
is performed. A unifying theme uncovered is the non-trivial
effect of memory, a notion that can be made precise in our
framework, on both approximation and optimization. When there
is long-term memory in the target, it takes a large number of
neurons to approximate it. Moreover, the training process will
suffer from severe slowdowns. In particular, both of these
effects become exponentially more pronounced with increasing
memory - a phenomenon we call the “curse of memory”. These
analyses represent a basic step towards a concrete mathematical
understanding of new phenomena that may arise in learning
temporal relationships using recurrent architectures.
2021-10-25 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
An Adaptive High Order Method for Finding Third-Order Critical Points
of Nonconvex Optimization
- Speaker: Xihua Zhu (SUFE)
- Advisor: Bo Jiang (SUFE)
Abstract:
Optimization methods for computing higher-order critical points
of nonconvex problems have recently attracted growing research interest
[1, 13–15], as they are able to exclude the so-called degenerate saddle
points and reach a solution with better quality. Despite theoretical
developments in [1, 13–15], the corresponding numerical experiments
are missing. This paper proposes an implementable higher-order method,
named adaptive high order method (AHOM), to find the third-order
critical points. AHOM is achieved by solving an “easier”
subproblem and incorporating the adaptive strategy of parameter tuning
in each iteration of the algorithm. The iteration complexity of the
proposed method is established. Some preliminary numerical results
are provided to show that AHOM can escape from the degenerate saddle
points, where the second-order method could possibly get stuck.
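A small worked example may help show what a degenerate saddle point looks like. The function f(x, y) = x^3 + y^2 below is a textbook illustration chosen here as an assumption, not an example taken from the talk: at the origin the gradient vanishes and the Hessian is positive semidefinite, so second-order tests pass, yet the third derivative along a Hessian null direction is nonzero and the function decreases.

```python
import numpy as np

def f(p):
    x, y = p
    return x**3 + y**2

def grad(p):
    x, y = p
    return np.array([3 * x**2, 2 * y])

def hess(p):
    x, _ = p
    return np.array([[6 * x, 0.0], [0.0, 2.0]])

def third_directional_derivative(d):
    # d^3/dt^3 of f(t * d) at t = 0; only the x^3 term contributes.
    return 6 * d[0] ** 3

origin = np.zeros(2)
null_direction = np.array([-1.0, 0.0])  # Hessian at the origin vanishes along x

print("gradient at origin:", grad(origin))
print("Hessian eigenvalues:", np.linalg.eigvalsh(hess(origin)))
print("third derivative along null direction:",
      third_directional_derivative(null_direction))
```

Because the third directional derivative is negative along a direction where the Hessian gives no information, the origin is not a third-order critical point, which is the kind of point a third-order method is meant to leave.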
2021-11-01 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
The progress of modern dialogue system: generative, retrieval,
task-oriented
- Speaker: Wei Chen (Fudan University)
- Advisor: Zhongyu Wei (Fudan University)
Abstract:
In this talk, we will introduce our research on generative, retrieval-based,
and task-oriented dialogue systems. For generative
dialogue systems, we propose a new dialog pre-training framework
called DialogVED, which introduces continuous latent variables into
the encoder-decoder pre-training framework to increase the relevance
and diversity of responses. We pre-train DialogVED on a large-scale
dialogue corpus, achieving SOTA results on multiple downstream tasks.
For retrieval-based dialogue systems, we propose a fine-to-coarse
distillation model based on contextual matching for coarse-grained
response selection in open-domain conversations. Extensive experimental
results on two self-constructed datasets show that the proposed
methods achieve a significant improvement over all evaluation metrics
compared with traditional baseline methods. For task-oriented dialogue
systems, we propose a diagnosis-oriented dialogue system framework,
and introduce a large-scale medical dialogue corpus DialoIMC with
multi-level fine-grained annotations. We establish three evaluation
tracks on DialoIMC and report a set of benchmark results for each
track, which shows the usability of the dataset and sets a baseline
for future studies.
2021-11-08 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Random batch particle methods for the homogeneous Landau equation
- Speaker: Yijia Tang (Shanghai Jiao Tong University)
- Advisor: Shi Jin (Shanghai Jiao Tong University)
Abstract:
The Landau equation is a fundamental integro-differential equation
describing the evolution of the distribution for charged particles in
plasma physics. In this talk, I will introduce random batch particle
methods for efficiently solving the homogeneous Landau equation. The
methods are stochastic variations of the particle methods proposed
by Carrillo et al. using the random batch strategy. The collisions
only take place inside the small but randomly selected batches so
that the computational cost is reduced from $O(N^2)$ to $O(N)$ per
time step. Meanwhile, these methods conserve mass, momentum,
and energy and preserve the decay of entropy.
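The random batch idea, where collisions are restricted to small randomly chosen batches, can be sketched on a toy system. The code below is not a Landau solver: it uses batches of size two, an explicit Euler step, and a hypothetical odd interaction kernel, all assumptions made to show why the cost per step is $O(N)$ and why momentum is conserved when the kernel is odd.

```python
import numpy as np

def random_batch_step(v, dt, kernel, rng):
    """One explicit Euler step with random batches of size two.

    v:      (N, d) array of particle velocities, N even.
    kernel: odd function K with K(-z) = -K(z), applied row-wise.
    Only N/2 pair interactions are evaluated, so the cost is O(N).
    """
    n = v.shape[0]
    if n % 2 != 0:
        raise ValueError("this sketch assumes an even number of particles")
    perm = rng.permutation(n)
    i, j = perm[0::2], perm[1::2]       # random disjoint pairs (i_k, j_k)
    force = kernel(v[i] - v[j])         # action of j on i
    v_new = v.copy()
    v_new[i] += dt * force
    v_new[j] -= dt * force              # K odd, so the reaction is -force
    return v_new

def relaxation_kernel(z):
    # Hypothetical toy kernel: pulls paired velocities together.
    return -z
```

Since every pair receives equal and opposite updates, the total momentum is unchanged at each step; energy conservation and entropy decay depend on the actual collision operator and are not reproduced by this toy kernel.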
2021-11-15 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
A fast offline/online forward solver for stationary inverse transport
equation with large datasets
- Speaker: Jingyi Fu (Shanghai Jiao Tong University)
- Advisor: Min Tang (Shanghai Jiao Tong University)
Abstract:
It is of great interest to solve the inverse stationary radiative
transport equation (RTE) with very large data sets. The standard way
is to formulate the inverse problem as an optimization problem,
but the bottleneck is that one has to solve the forward problem over
and over again, which is time-consuming. In this paper, we propose
an offline/online solver for RTE based on the Tailored Finite Point
Method (TFPM) proposed in \cite{ShiA} and \cite{HanTwo}. TFPM for RTE
is uniformly convergent with respect to the mean free path and valid
up to the boundary and interface layers. Two cases are considered:
in the first, the RTE is solved with fixed scattering and absorption
cross sections while the boundary conditions vary; in the second, the
cross sections vary in a small domain and the boundary conditions
change many times. In these two cases, the solver can be
decomposed into offline and online stages. The cost of the offline stage
is comparable to that of classical methods, while the cost of the online
stage is much lower. One only needs to run the offline stage once and
rerun the online stage when the parameters vary. Our proposed
solver is much cheaper when one needs to solve the RTE with multiple
right-hand sides or when the cross sections vary in a small domain,
and can thus accelerate the solution of inverse RTE problems.
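The offline/online split for multiple right-hand sides can be illustrated with a generic linear system. This sketch is not TFPM: a dense matrix stands in for the discretized forward operator, and it assumes that changing boundary data only changes the right-hand side. The class name is hypothetical; `lu_factor` and `lu_solve` are SciPy routines.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

class OfflineOnlineSolver:
    """Factor once (offline), then solve cheaply for each new right-hand side."""

    def __init__(self, A):
        # Offline stage: LU factorization, O(n^3) for a dense n x n matrix.
        self._lu_piv = lu_factor(A)

    def solve(self, b):
        # Online stage: two triangular solves, O(n^2) per right-hand side.
        return lu_solve(self._lu_piv, b)

# Usage: one factorization reused for several right-hand sides.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50)) + 50 * np.eye(50)  # well-conditioned toy matrix
solver = OfflineOnlineSolver(A)
solutions = [solver.solve(rng.standard_normal(50)) for _ in range(5)]
```

The saving grows with the number of right-hand sides, since the cubic-cost factorization is paid only once.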
2021-11-22 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Optimal Gradient Tracking for Decentralized Optimization
- Speaker: Zhuoqing Song (Fudan University)
- Advisor: Lei Shi (Fudan University)
Abstract:
In this talk, we focus on minimizing the sum of smooth and strongly
convex local objective functions stored in a distributed manner across
the nodes of an undirected network. We propose an Optimal Gradient
Tracking (OGT) method which is the first single-loop gradient-type
method optimal in both gradient computation and communication
complexities. The development of OGT has two steps. First, we propose a
new accelerated gradient tracking method termed “Snapshot” Gradient
Tracking (SS-GT). Inspired by the variance reduction methods Katyusha
and L-Katyusha, SS-GT combines “snapshot” points and “negative
momentum” with the classical gradient tracking and outperforms
previous single-loop accelerated gradient tracking methods. SS-GT is
of independent interest and can be extended to more general settings
such as time-varying graphs and directed graphs. Second, we develop
a technique termed Loopless Chebyshev Acceleration (LCA), which
can be implemented “looplessly” and achieves effects similar to
those of inner loops of Chebyshev acceleration. The LCA technique can
accelerate many other gradient tracking based methods with respect
to the graph condition number.
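For readers unfamiliar with the classical gradient tracking that SS-GT and OGT build on, a minimal sketch follows. It implements only the basic, unaccelerated scheme on a ring network with scalar quadratic objectives; the mixing matrix, step size, and objectives are assumptions chosen for illustration and are not taken from the talk.

```python
import numpy as np

def ring_mixing_matrix(n):
    """Symmetric doubly stochastic mixing matrix for a ring of n nodes."""
    shift = np.roll(np.eye(n), 1, axis=1)
    return 0.5 * np.eye(n) + 0.25 * (shift + shift.T)

def gradient_tracking(W, grads, x0, alpha, iters):
    """Classical gradient tracking.

    W:     mixing matrix of the network.
    grads: maps the stacked local iterates x to the stacked local gradients.
    Each node keeps x_i (its iterate) and y_i (its estimate of the
    network-average gradient).
    """
    x = x0.copy()
    g = grads(x)
    y = g.copy()                      # tracker starts at the local gradients
    for _ in range(iters):
        x_new = W @ x - alpha * y     # mix with neighbors, step along tracker
        g_new = grads(x_new)
        y = W @ y + g_new - g         # update the average-gradient estimate
        x, g = x_new, g_new
    return x

# Toy problem: f_i(x) = 0.5 * (x - b_i)^2, so the sum is minimized at mean(b).
b = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
W = ring_mixing_matrix(len(b))
x_final = gradient_tracking(W, lambda x: x - b, np.zeros(len(b)), alpha=0.1, iters=500)
```

Every node communicates only with its two ring neighbors, yet all local iterates approach the minimizer of the global sum rather than their own local minimizers.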
2021-12-06 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Modeling and Decoupling Systemic Risk into Endopathic and
Exopathic Competing Risks
- Speaker: Jingyu Ji (Fudan University)
- Advisor: Zhengjun Zhang (UW–Madison)
Abstract:
In this talk, we propose a new nonlinear time series model,
the autoregressive conditional accelerated Fréchet (AcAF) model,
and introduce two new competing risk measures, endopathic and
exopathic, for better learning risk patterns, decoupling
systemic risk, and improving risk management. We establish
the probabilistic properties of stationarity and ergodicity
of the AcAF model. Statistical inference is developed
through conditional maximum likelihood estimation. The
consistency and asymptotic normality of the estimators are
derived. Simulation demonstrates the efficiency of the proposed
estimators and the AcAF model's flexibility in modeling
heterogeneous data. Empirical studies on stock returns
and cryptocurrency trading show the superior performance
of the AcAF model in terms of the identified risk patterns,
enhancing the understanding of the systemic risks of a market
and their causes, and making better risk management possible.
2021-12-13 16:10:00 - 17:00:00 @ Rm 1801, Guanghua East Tower
[poster]
- Title:
Stochastic adaptive control of stabilization and synchronization
- Speaker: Xiaoxiao Peng (Fudan University)
- Advisor: Wei Lin (Fudan University)
Abstract:
This work was completed by Dr. Zhou Shijie, affiliated with
Dr. Lin Wei. Here, we design a scenario for stochastic adaptive
control. First, we theoretically prove the feasibility of
stabilization, synchronization, and parameter identification under
multi-dimensional Brownian motion. A criterion for specific
noise is also provided. Finally, we verify the feasibility with
some typical examples of dynamical systems.