Video-based high-density crowd analysis and prediction has been a long-standing topic in computer vision. It is notoriously difficult for reasons that include the lack of high-quality data and the complexity of crowd dynamics. Consequently, it has been relatively understudied. In this paper, we propose a new approach that aims to learn from in-the-wild videos, which are often of low quality, making it difficult to track individuals or count heads. The key novelty is a new physics prior to model crowd dynamics. We model high-density crowds as active matter, a continuum with active particles subject to stochastic forces, which we name 'crowd material'. Our physics model is combined with neural networks, resulting in a neural stochastic differential equation system which can mimic the complex crowd dynamics. Due to the lack of similar research, we adapt a range of existing methods which are close to ours for comparison. Through exhaustive evaluation, we show our model outperforms existing methods in analyzing and forecasting extremely high-density crowds. Furthermore, since our model is a continuous-time physics model, it can be used for simulation and analysis, providing strong interpretability. This is categorically different from most deep learning methods, which are discrete-time models and black boxes.
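To illustrate what simulating a stochastic differential equation involves, the following is a minimal sketch of the standard Euler-Maruyama scheme for dx = f(x) dt + sigma dW. It is not the model from the paper: the function names, the constant noise scale `sigma`, and the toy drift (a hypothetical stand-in for a learned drift network) are all assumptions made for illustration only.

```python
import numpy as np

def euler_maruyama(x0, drift, sigma, dt, n_steps, rng):
    """Integrate dx = drift(x) dt + sigma dW with the Euler-Maruyama scheme."""
    x = np.asarray(x0, dtype=float)
    path = [x.copy()]
    for _ in range(n_steps):
        # Brownian increment: zero mean, standard deviation sqrt(dt).
        noise = rng.normal(0.0, np.sqrt(dt), size=x.shape)
        x = x + drift(x) * dt + sigma * noise
        path.append(x.copy())
    return np.stack(path)

def toy_drift(x):
    # Hypothetical stand-in for a learned drift network (assumption):
    # simply relaxes every particle toward the origin.
    return -x

rng = np.random.default_rng(0)
# Four illustrative particles in 2D, simulated for 100 steps of size 0.01.
path = euler_maruyama(np.ones((4, 2)), toy_drift, sigma=0.1,
                      dt=0.01, n_steps=100, rng=rng)
```

In a neural SDE, the hand-written drift above would be replaced by a trainable network; because the step size `dt` is a free parameter of the integrator, the same model can in principle be rolled out at different temporal resolutions.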
Abstract
Resources
Learning Extremely High Density Crowds as Active Matters.
 
The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
2025
Conference
Paper  Code  BibTex
@inproceedings{he2025learning,
  author = {Feixiang He and Jiangbei Yue and Jialin Zhu and Armin Seyfried and Dan Casas and Julien Pettre and He Wang},
  title = {Learning Extremely High Density Crowds as Active Matters},
  booktitle = {The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2025}
}
Feixiang He, Jiangbei Yue, Jialin Zhu, Armin Seyfried, Dan Casas, Julien Pettre, He Wang.
Acknowledgement
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 899739 CrowdDNA.