Dual-Branch Network Fused With Two-Level Attention Mechanism for Clothes-Changing Person Re-Identification

Yong Lu, Ming Zhe Jin
Copyright: © 2023 |Pages: 14
DOI: 10.4018/IJWSR.322021

Abstract

Clothes-changing person re-identification is an active research topic. Most current methods assume that a person's clothes do not change within a short period of time, so they are not applicable when people change clothes. To address this, this paper proposes a dual-branch network for clothes-changing person re-identification that integrates a two-level attention mechanism. The two-level attention mechanism captures and aggregates fine-grained person semantic information across channels and spatial locations, and a clothing classification branch is trained to suppress the sensitivity of the network to clothing features. The method does not use auxiliary means such as human skeletons, and the complexity of the model is greatly reduced compared with most methods. Experiments are conducted on the popular clothes-changing person re-identification dataset PRCC and on LaST, a very large-scale cross-spatial-temporal dataset. The experimental results show that the proposed method outperforms existing methods.

Introduction

Person re-identification, a key technology within intelligent surveillance systems, is regarded as an image retrieval problem. It is essential for intelligent surveillance systems in public places, for instance when locating criminals. It can also be applied to intelligent security, epidemiological investigations, and intelligent transportation. Through all-weather monitoring, the technology can help prevent crimes like theft and robbery, locate lost persons, and assist intelligent transportation systems in completing the automatic dispatching of people, vehicles, and roads.

When monitoring large amounts of data, traditional manual processing methods are inefficient and costly. Person re-identification can alleviate these problems by quickly locating and tracking the target. This saves labor costs, improves the accuracy of detection, and has a high application value in intelligent monitoring systems.

Person re-identification aims to search for a target person across surveillance videos captured at different locations and times. Due to factors like the limitations of technology, most current research on person re-identification assumes that the target’s clothes are unchanged (Huang et al., 2018; Jin et al., 2022; Li et al., 2018). Thus, it uses the color, texture, and other features of the clothes as discriminative cues. However, clothing changes are unavoidable when re-identifying a person over an extended time, and they also occur in some short-term scenarios. For example, suspects may change clothes to avoid identification and tracking. Methods built on the unchanged-clothes assumption are no longer applicable in the clothes-changing scenario, because people may be wrongly matched if they wear similar clothes. To address the issue, this article studies clothes-changing person re-identification.

To avoid the interference of clothes, some clothes-changing re-identification methods feed additional modalities alongside the input image (Chao et al., 2019; Chen et al., 2021; Qian et al., 2020; Shu et al., 2021; Yang et al., 2019). These include three-dimensional (3D) shapes, skeletons, and contours (Chao et al., 2019; Chen et al., 2021; Qian et al., 2020). However, these methods often require additional models to capture the multimodal information, which in turn increases the complexity of the overall model. In fact, original images contain rich clothing-independent information, which is largely underutilized.

This article aims to better mine clothing-independent information from the image. To this end, it adds a two-level attention module to the model, which acts on the features extracted by the backbone network along the spatial and channel dimensions, respectively, and produces a multi-scale fine-grained attention map. The module captures the semantic information of persons in channel and space more effectively, and it suppresses the influence of irrelevant background by focusing on features related to the individual. To handle the influence of clothing features, this article sets up a clothes classification branch and suppresses the sensitivity of the model to clothes features by training this branch. Experiments on popular datasets show that the proposed method is competitive (Shu et al., 2021; Yang et al., 2019).
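
The idea of attending over channels and over spatial locations can be illustrated with a minimal sketch. It is not the authors' module: the preview does not specify the design, so the gating functions here (a sigmoid of the average activation, with no learned weights), the sequential channel-then-spatial ordering, and all function names are assumptions chosen for illustration. A feature map is represented as a nested list indexed as [channel][row][column].

```python
import math


def sigmoid(x):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Scale each channel by a gate derived from its global average (assumed form)."""
    gated = []
    for channel in feat:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        gated.append([[v * gate for v in row] for row in channel])
    return gated


def spatial_attention(feat):
    """Scale each location by a gate derived from its cross-channel mean (assumed form)."""
    n_channels = len(feat)
    height, width = len(feat[0]), len(feat[0][0])
    gate = [
        [sigmoid(sum(feat[c][i][j] for c in range(n_channels)) / n_channels)
         for j in range(width)]
        for i in range(height)
    ]
    return [
        [[feat[c][i][j] * gate[i][j] for j in range(width)] for i in range(height)]
        for c in range(n_channels)
    ]


def two_level_attention(feat):
    """Apply channel attention, then spatial attention (the ordering is an assumption)."""
    return spatial_attention(channel_attention(feat))


# Toy backbone output: 2 channels, 2x2 spatial grid, one strongly activated location.
features = [
    [[4.0, 0.0], [0.0, 0.0]],
    [[2.0, 0.0], [0.0, 0.0]],
]
refined = two_level_attention(features)
```

The output keeps the shape of the input. Each gate lies strictly between 0 and 1, so every activation is scaled down, and channels and locations with higher average activation are scaled down the least. In a trained network the gates would be produced by learned layers rather than by a fixed average.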
