Adversarial Attacks and Defenses

Enhancing Adversarial Examples Via Self-Augmentation

Lifeng Huang, Wenzi Zhuang, Chengying Gao, Ning Liu

Intro: Adversarial attacks pose a challenge to the security of deep neural networks, which has motivated researchers to develop various defense methods. But are current defenses secure enough? To answer this question, we propose a self-augmentation (SA) method for crafting transferable adversarial examples that circumvent defenses. Concretely, self-augmentation includes two strategies: (1) self-ensemble, which applies additional convolution layers to an existing model to build diverse virtual models that are fused to achieve an ensemble-model effect and prevent overfitting; and (2) deviation-augmentation, which is based on the observation that, in defense models, the input data is surrounded by highly curved loss surfaces, and therefore applies deviation vectors to the input data to escape from its vicinity. Notably, self-augmentation can be naturally combined with existing methods to build more transferable adversarial attacks. Extensive experiments on four vanilla models and ten defenses suggest the superiority of our method over state-of-the-art transferable attacks.

International Conference on Multimedia & Expo (ICME), 2021 (*oral)
[Paper] [Code]
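
The deviation-augmentation idea in the abstract above can be illustrated with a rough sketch. The abstract does not say how deviation vectors are chosen or combined, so everything here is an assumption made for illustration: the uniform sampling, the gradient averaging, the iterative sign-gradient update, the toy loss, and all names (`deviation_augmented_step`, `toy_grad`, `radius`, `n_dev`). It is not the authors' implementation.

```python
import math
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def deviation_augmented_step(x_adv, x_clean, grad_fn, eps, alpha,
                             n_dev=8, radius=0.1, rng=random):
    """One sign-gradient step whose gradient is averaged over deviated inputs."""
    avg_grad = [0.0] * len(x_adv)
    for _ in range(n_dev):
        # Assumed choice: a uniform deviation vector in [-radius, radius]^d.
        deviated = [xi + rng.uniform(-radius, radius) for xi in x_adv]
        g = grad_fn(deviated)
        avg_grad = [a + gi / n_dev for a, gi in zip(avg_grad, g)]
    # Ascend the loss along the sign of the averaged gradient.
    stepped = [xi + alpha * sign(gi) for xi, gi in zip(x_adv, avg_grad)]
    # Project back into the L-infinity ball of radius eps around the clean input.
    return [min(max(si, ci - eps), ci + eps)
            for si, ci in zip(stepped, x_clean)]


def toy_grad(x):
    """Gradient of the toy loss sum(sin(3 * x_i)); a stand-in for a network."""
    return [3.0 * math.cos(3.0 * xi) for xi in x]


x_clean = [0.2, -0.4, 0.9]
x_adv = list(x_clean)
rng = random.Random(0)
for _ in range(10):
    x_adv = deviation_augmented_step(x_adv, x_clean, toy_grad,
                                     eps=0.05, alpha=0.01, rng=rng)
```

In a real attack, `grad_fn` would be the gradient of a surrogate model's loss with respect to the image, and the perturbation budget `eps` would follow the evaluation protocol.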

Universal Physical Camouflage Attacks on Object Detectors

Lifeng Huang, Chengying Gao, Yuyin Zhou, Changqing Zou, Cihang Xie, Alan Yuille, Ning Liu

Intro: In this paper, we study physical adversarial attacks on object detectors in the wild. Previous work on this problem mostly crafts instance-dependent perturbations only for rigid and planar objects. To address this limitation, we propose to learn an adversarial pattern that effectively attacks all instances belonging to the same object category (e.g., person, car), referred to as the Universal Physical Camouflage Attack (UPC). Concretely, UPC crafts camouflage by jointly fooling the region proposal network and misleading the classifier and the regressor into outputting errors. To make UPC effective for articulated non-rigid or non-planar objects, we introduce a set of transformations for the generated camouflage patterns to mimic their deformable properties. We additionally impose an optimization constraint to make the generated patterns look natural to human observers. To fairly evaluate the effectiveness of different physical-world attacks on object detectors, we present the first standardized virtual database, AttackScenes, which simulates the real 3D world in a controllable and reproducible environment. Extensive experiments suggest the superiority of our proposed UPC over existing physical adversarial attackers, not only in virtual environments (AttackScenes) but also in real-world physical environments.

Computer Vision and Pattern Recognition (CVPR), 2020
[Paper] [Project Page]

G-UAP: Generic Universal Adversarial Perturbation that Fools RPN-based Detectors

Xing Wu, Lifeng Huang, Chengying Gao*

Intro: This paper proposes G-UAP, the first work to craft universal adversarial perturbations that fool RPN-based detectors. G-UAP misleads the RPN's foreground predictions into background so that the detector detects nothing.

Asian Conference on Machine Learning (ACML), 2019
[Paper]
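
The goal of pushing RPN foreground predictions toward background, as described in the abstract above, can be written as a loss over per-anchor scores. The sketch below is only an illustration under stated assumptions: the `(background_logit, foreground_logit)` pair format, the cross-entropy form, and the name `background_loss` are hypothetical and are not taken from the paper.

```python
import math


def background_loss(anchor_logits):
    """Mean cross-entropy toward the background class over all anchors.

    anchor_logits: list of (background_logit, foreground_logit) pairs,
    a hypothetical stand-in for an RPN's objectness output.
    """
    total = 0.0
    for bg, fg in anchor_logits:
        # Numerically stable log-sum-exp over the two classes.
        m = max(bg, fg)
        log_z = m + math.log(math.exp(bg - m) + math.exp(fg - m))
        # Negative log-probability of the background class.
        total += -(bg - log_z)
    return total / len(anchor_logits)


# Anchors that confidently predict foreground give a high loss ...
confident_foreground = [(-2.0, 3.0), (-1.0, 2.5)]
# ... and the loss drops once the scores are pushed toward background.
pushed_to_background = [(3.0, -2.0), (2.5, -1.0)]

loss_before = background_loss(confident_foreground)
loss_after = background_loss(pushed_to_background)
```

A universal perturbation would then be one shared image-sized offset optimized to lower such a loss across many training images, rather than a separate perturbation per image.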


Crowd Counting

Scale-aware Progressive Optimization Network

Ying Chen, Lifeng Huang, Chengying Gao, Ning Liu

Intro: Crowd counting has attracted increasing attention due to its wide application prospects. One of the most essential challenges in this domain is large scale variation, which impacts the accuracy of density estimation. To this end, we propose a scale-aware progressive optimization network (SPO-Net) for crowd counting, which trains a scale-adaptive network to achieve high-quality density map estimation and overcome the scale-variation dilemma in highly congested scenes. Concretely, the first phase of SPO-Net, the band-pass stage, concentrates on preprocessing the input image and fusing high-level semantic information with low-level spatial information from separate multi-layer features. The second phase, the rolling guidance stage, aims to learn a scale-adapted network from multi-scale features in a rolling training manner. To better learn the local correlation of multi-size regions and reduce redundant computation, we introduce different supervisions with analogous objectives in each rolling, referred to as the progressive optimization strategy. Extensive experiments on three challenging crowd counting datasets (ShanghaiTech, UCF_CC_50 and UCF-QNRF) not only demonstrate the efficacy of each part of SPO-Net, but also suggest the superiority of our proposed method over state-of-the-art approaches.

ACM MultiMedia (ACM MM), 2020
[Paper]
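
For readers unfamiliar with density map estimation, the count is recovered by summing the map. The sketch below builds a ground-truth-style map by placing one unit-mass Gaussian per annotated head. The head positions, the fixed kernel width `sigma`, and the function name `density_map` are assumptions for illustration, not details from the paper.

```python
import math


def density_map(points, height, width, sigma=2.0):
    """Place one unit-mass Gaussian per (row, col) head annotation."""
    dmap = [[0.0] * width for _ in range(height)]
    for py, px in points:
        # Unnormalized Gaussian evaluated on the whole grid.
        kernel = [
            [math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        # Normalize on the grid so each head contributes exactly mass 1.
        mass = sum(map(sum, kernel))
        for y in range(height):
            for x in range(width):
                dmap[y][x] += kernel[y][x] / mass
    return dmap


heads = [(5, 5), (10, 20), (25, 12)]  # hypothetical head annotations
dmap = density_map(heads, height=32, width=32)
estimated_count = sum(map(sum, dmap))  # integrates to the number of heads
```

Scale variation matters here because heads near the camera cover many more pixels than distant ones, so a single fixed `sigma` fits only part of the scene.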

Self-Bootstrapping Pedestrian Detection in Downward-Viewing Fisheye Cameras Using Pseudo-Labeling

Kaishi Gao, Qun Niu, Haoquan You, Chengying Gao

Intro: Downward-viewing fisheye cameras have attracted much attention in surveillance systems due to their wide coverage and reduced occlusion. However, pedestrian detection in downward-viewing fisheye cameras remains an open problem because large-scale labeled datasets are lacking: existing datasets are usually based on oblique-viewing perspective cameras. Furthermore, manually labeling a downward-viewing fisheye dataset is time-consuming. To address this, we propose a self-bootstrapping pedestrian detection method, which automatically pseudo-labels downward-viewing fisheye images by making full use of the spatial and temporal consistency of pedestrians in the cameras to improve detection accuracy. We segment the downward-viewing fisheye images into two regions and propose pseudo-labeling methods for them progressively: a cyclically fine-tuned detector for the oblique region and a visual tracking method for the vertical region. Combining the pseudo-labels from the two regions, we fine-tune the detection network for better accuracy. Experimental results show that the proposed approach reduces time consumption by about 95% compared with labor-intensive manual labeling while still reaching competitive Average Precision (AP).

International Conference on Multimedia & Expo (ICME), 2020
[Paper]

Scale-Aware Rolling Fusion Network for Crowd Counting

Ying Chen, Chengying Gao, Zhuo Su, Xiangjian He, Ning Liu

Intro: Crowd counting is receiving increasing attention due to its wide application prospects and challenges such as large scale variation, mutual occlusion among people in a crowd, and background noise. In this paper, we propose a scale-aware rolling fusion network (SRF-Net) for crowd counting, which focuses on dealing with scale variation in highly congested, noisy scenes. SRF-Net is a two-stage architecture that consists of a band-pass stage and a rolling guidance stage. Compared with existing methods, SRF-Net achieves better results in retaining appropriate multi-level features and capturing multi-scale features, thus improving the quality of density estimation maps in crowded scenarios with large scale variation. We evaluate our method on three popular crowd counting datasets (ShanghaiTech, UCF_CC_50 and UCF-QNRF), and extensive experiments show that it outperforms state-of-the-art approaches.

International Conference on Multimedia & Expo (ICME), 2020
[Paper]

ADCrowdNet: An Attention-Injective Deformable Convolutional Network for Crowd Understanding

Ning Liu, Yongchao Long, Changqing Zou, Qun Niu, Li Pan, and Hefeng Wu

Computer Vision and Pattern Recognition (CVPR), 2019
[Paper]

Weak-structure-aware visual object tracking with bottom-up and top-down context exploration

Liu Ning, Liu Chang, Wu Hefeng*, and Zhu Hengzheng

Signal Processing: Image Communication (SPIC), 2018
[Paper]

Hierarchical Ensemble of Background Models for PTZ-based Video Surveillance



IEEE Transactions on Cybernetics (TCYB), 2015
[Paper]
