Abstract | ||
---|---|---|
With the growth of urban populations, crowd analysis has become an important task in computer vision. Crowd counting, a subfield of crowd analysis, aims to count the number of people in an image or in a region of an image. Owing to problems such as heavy occlusion and variations in perspective and illumination, crowd counting remains extremely challenging. Recent state-of-the-art approaches mainly use convolutional neural networks to generate density maps. In this work, the Multi-Dilation Network (MDNet) is proposed to address crowd counting in congested scenes. MDNet consists of two parts: a VGG-16-based front end for feature extraction and a back end containing multi-dilation blocks that generate density maps. In particular, each multi-dilation block has four branches that collect features of different sizes. By using dilated convolutions, the multi-dilation block can capture varied features while the maximum kernel size remains 3 x 3. Experiments on two challenging crowd counting datasets, UCF_CC_50 and ShanghaiTech, show that the proposed MDNet outperforms other state-of-the-art methods, with lower mean absolute error and mean squared error. Compared with a network whose multi-scale blocks adopt larger kernels to extract features, MDNet still achieves competitive performance with fewer model parameters.
|
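The abstract's claim that dilated 3 x 3 kernels gather features of different sizes with fewer parameters than larger kernels can be illustrated with a little arithmetic. The sketch below is not the authors' code: the four dilation rates and the channel width are hypothetical values chosen only for illustration, since the abstract does not state them. It relies on the standard relation that a kernel of size k with dilation d spans k + (k - 1)(d - 1) pixels per side, while its weight count does not depend on d.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Side length covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_weight_count(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a 2-D convolution (bias excluded); independent of dilation."""
    return kernel_size * kernel_size * in_channels * out_channels


# Hypothetical dilation rates for the four branches (not given in the abstract).
ASSUMED_DILATIONS = (1, 2, 3, 4)
# Hypothetical channel width, used only to make the comparison concrete.
ASSUMED_CHANNELS = 64

for d in ASSUMED_DILATIONS:
    extent = effective_kernel_size(3, d)
    dilated = conv_weight_count(3, ASSUMED_CHANNELS, ASSUMED_CHANNELS)
    dense = conv_weight_count(extent, ASSUMED_CHANNELS, ASSUMED_CHANNELS)
    print(f"dilation={d}: spans {extent}x{extent}, "
          f"{dilated} weights vs {dense} for a dense kernel of that size")
```

Under these assumed values, every branch keeps the same 3 x 3 weight count while its spatial extent grows with the dilation rate, which is the trade-off the abstract describes when comparing against multi-scale blocks with larger kernels.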
Year | DOI | Venue |
---|---|---|
2019 | 10.1145/3338533.3366687 | MMAsia '19: ACM Multimedia Asia, Beijing, China, December 2019 |
Field | DocType | ISBN |
Computer vision,Dilation (morphology),Computer science,Artificial intelligence,Crowd counting | Conference | 978-1-4503-6841-4 |
Citations | PageRank | References |
0 | 0.34 | 0 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Shuheng Wang | 1 | 0 | 1.01 |
Hanli Wang | 2 | 865 | 69.10 |
Qinyu Li | 3 | 9 | 5.27 |