Traffic forecasting is one of the most popular spatio-temporal
tasks in machine learning. A prevalent approach in the field
is to combine graph convolutional networks and recurrent
neural networks for spatio-temporal processing. Competition
in this area has been fierce, and many novel methods have
been proposed. In this paper, we present the spatio-temporal
graph neural controlled differential equation (STG-NCDE). Neural controlled differential equations
(NCDEs) are a breakthrough concept for processing sequential data. We extend the concept and design two NCDEs: one
for temporal processing and the other for spatial processing. After that, we combine them into a single framework.
We conduct experiments with 6 benchmark datasets and 20
baselines. STG-NCDE shows the best accuracy in all cases,
outperforming all 20 baselines by non-trivial margins.
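To make the NCDE building block concrete, below is a minimal sketch of a single NCDE in PyTorch, following the standard usage pattern of the torchcde library. The layer sizes and the forecasting setup are illustrative assumptions; STG-NCDE itself chains two such equations, a temporal one and a spatial one whose vector field contains graph convolutions, which this sketch does not reproduce.

```python
import torch
import torchcde


class CDEFunc(torch.nn.Module):
    """Vector field f(z): maps the hidden state to a matrix that is
    matrix-multiplied with dX/dt inside the CDE integral."""

    def __init__(self, input_channels, hidden_channels):
        super().__init__()
        self.input_channels = input_channels
        self.hidden_channels = hidden_channels
        self.net = torch.nn.Sequential(
            torch.nn.Linear(hidden_channels, 128),
            torch.nn.Tanh(),
            torch.nn.Linear(128, hidden_channels * input_channels),
        )

    def forward(self, t, z):
        return self.net(z).view(-1, self.hidden_channels, self.input_channels)


class NeuralCDE(torch.nn.Module):
    def __init__(self, input_channels, hidden_channels, output_channels):
        super().__init__()
        self.initial = torch.nn.Linear(input_channels, hidden_channels)
        self.func = CDEFunc(input_channels, hidden_channels)
        self.readout = torch.nn.Linear(hidden_channels, output_channels)

    def forward(self, coeffs):
        X = torchcde.CubicSpline(coeffs)           # continuous path from discrete observations
        z0 = self.initial(X.evaluate(X.interval[0]))
        z = torchcde.cdeint(X=X, func=self.func, z0=z0, t=X.interval)
        return self.readout(z[:, -1])              # hidden state at the final time


# Usage: forecast one value from 24 past readings of 3 traffic features.
x = torch.randn(32, 24, 3)                         # (batch, time, channels)
coeffs = torchcde.hermite_cubic_coefficients_with_backward_differences(x)
model = NeuralCDE(input_channels=3, hidden_channels=16, output_channels=1)
pred = model(coeffs)                               # shape (32, 1)
```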
Owing to the remarkable development of deep
learning technology, there has been a series of efforts to build
deep learning-based climate models. Whereas most of them utilize
recurrent neural networks and/or graph neural networks, we
design a novel climate model based on two concepts: the
neural ordinary differential equation (NODE) and the diffusion
equation. Many physical processes involving the Brownian motion
of particles can be described by the diffusion equation, and as a
result it is widely used for climate modeling. Neural ordinary
differential equations (NODEs), on the other hand, learn a
latent governing ODE from data. In our presented
method, we combine them into a single framework and propose
a concept called the neural diffusion equation (NDE). Our NDE,
equipped with the diffusion equation and one additional
neural network to model inherent uncertainty, can learn an
appropriate latent governing equation that best describes a given
climate dataset. In our experiments with two real-world datasets, one
synthetic dataset, and eleven baselines, our method consistently
outperforms the baselines by non-trivial margins.
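As a rough illustration of the idea, the sketch below evolves node features with a graph heat-diffusion term plus a small learned residual network, integrated with the torchdiffeq library. The exact form of the diffusion operator, the uncertainty network, and all sizes here are assumptions for illustration, not the paper's precise formulation.

```python
import torch
from torchdiffeq import odeint


class NeuralDiffusionFunc(torch.nn.Module):
    """dh/dt = -alpha * L h + g(h): graph diffusion plus a learned
    residual term (a hypothetical rendering of the NDE idea)."""

    def __init__(self, laplacian, hidden_dim):
        super().__init__()
        self.register_buffer("L", laplacian)                # graph Laplacian (n x n)
        self.alpha = torch.nn.Parameter(torch.tensor(0.5))  # learnable diffusivity
        self.g = torch.nn.Sequential(                       # models residual/uncertain dynamics
            torch.nn.Linear(hidden_dim, 64),
            torch.nn.Tanh(),
            torch.nn.Linear(64, hidden_dim),
        )

    def forward(self, t, h):
        return -self.alpha * (self.L @ h) + self.g(h)


# Usage: evolve climate-station features h0 over continuous time.
n, d = 10, 8
A = (torch.rand(n, n) > 0.7).float()
A = ((A + A.T) > 0).float()                                 # symmetrize adjacency
A.fill_diagonal_(0.0)
L = torch.diag(A.sum(dim=1)) - A                            # combinatorial Laplacian
func = NeuralDiffusionFunc(L, d)
h0 = torch.randn(n, d)
t = torch.linspace(0.0, 1.0, 5)
h_t = odeint(func, h0, t)                                   # (5, n, d) trajectory
```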
Preprint, 2021
PDE-Regularized Neural Networks for Image Classification
Differential equations can be used to design neural networks. For instance, neural
ordinary differential equations (neural ODEs) can be considered a continuous
generalization of residual networks. In this work, we present a novel partial
differential equation (PDE)-based approach for image classification, where we
learn both a PDE's governing equation for image classification and its solution
approximated by our neural network. In other words, the knowledge contained
in the learned governing equation can be injected into the neural network that
approximates the PDE solution function. Owing to recent advances in
learning PDEs, the presented novel concept, called PR-Net, can be implemented.
Our method shows comparable (or better) accuracy and robustness on various
datasets and tasks in comparison with neural ODEs and Isometric MobileNet
V3. Thanks to its efficient nature, PR-Net is suitable for deployment in
resource-scarce environments, e.g., in place of MobileNet.
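The core mechanism, learning a governing equation and a solution network jointly, can be sketched in a physics-informed style: a solution network u and an equation network f are tied together by a PDE residual that is added to the task loss. Everything below (the 1-D setting, network sizes, the assumed form u_t = f(u, u_x, x)) is a simplifying assumption; the actual PR-Net construction for image classification is more involved.

```python
import torch

# u approximates the PDE solution u(x, t); f approximates the unknown
# governing equation u_t = f(u, u_x, x). Both are learned jointly.
u = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
f = torch.nn.Sequential(torch.nn.Linear(3, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))


def pde_residual(xt):
    """Residual of the learned PDE at collocation points xt = (x, t)."""
    xt = xt.requires_grad_(True)
    out = u(xt)
    grads = torch.autograd.grad(out.sum(), xt, create_graph=True)[0]
    u_x, u_t = grads[:, :1], grads[:, 1:]
    # The residual should be ~0 if u satisfies the learned equation.
    return u_t - f(torch.cat([out, u_x, xt[:, :1]], dim=1))


xt = torch.rand(256, 2)                      # collocation points (x, t)
loss_pde = pde_residual(xt).pow(2).mean()    # regularizer added to the task loss
```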
arXiv, 2021
Toward Compact Deep Neural Networks via Energy-Aware Pruning
Despite their remarkable performance, modern deep
neural networks inevitably incur a significant amount of computational cost for training and deployment, which may be incompatible with their use on
edge devices. Recent efforts to reduce these overheads involve pruning and decomposing the parameters of various layers without performance deterioration. Inspired by
several decomposition studies, in this paper we propose a
novel energy-aware pruning method that quantifies the importance of each filter in the network using the nuclear norm
(NN). The proposed energy-aware pruning leads to state-of-the-art
performance in Top-1 accuracy, FLOP reduction, and parameter reduction across a wide range of scenarios with multiple network architectures on CIFAR-10 and ImageNet, as well as on fine-grained classification tasks. In a toy experiment, even without fine-tuning, we can visually observe that NN not
only causes little change in decision boundaries across classes
but also clearly outperforms previously popular criteria. We
achieve competitive results of 40.4%/49.8% FLOP reduction and
45.9%/52.9% parameter reduction at 94.13%/94.61%
Top-1 accuracy with ResNet-56/110 on CIFAR-10, respectively. In addition, our observations are consistent across
a variety of pruning settings in terms of data size
as well as data quality, which is reflected in the stability of the acceleration and compression with negligible
accuracy loss.
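A minimal sketch of a nuclear-norm importance score follows: each output channel's activations over a batch are flattened into a matrix whose singular-value sum serves as that filter's "energy". The function name and the exact matrix construction are assumptions for illustration; the paper's precise criterion may differ in details.

```python
import torch


def filter_importance_nuclear_norm(feature_maps):
    """Score each output channel by the nuclear norm (sum of singular
    values) of its stacked activation maps. Hypothetical sketch of a
    nuclear-norm pruning criterion.

    feature_maps: (batch, channels, H, W) activations from one layer.
    """
    b, c, h, w = feature_maps.shape
    maps = feature_maps.permute(1, 0, 2, 3).reshape(c, b, h * w)
    # Nuclear norm of each channel's (batch x pixels) matrix.
    scores = torch.linalg.svdvals(maps).sum(dim=-1)
    return scores  # prune the channels with the smallest scores


# Usage: rank the filters of a conv layer from a batch of activations.
acts = torch.randn(32, 16, 8, 8)
scores = filter_importance_nuclear_norm(acts)
keep = scores.argsort(descending=True)[:8]   # keep the 8 most "energetic" filters
```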