update gaitedge and use data_in_use
@@ -4,11 +4,12 @@ This [paper](https://arxiv.org/abs/2203.03972) has been accepted by ECCV 2022.
## Abstract
Gait is one of the most promising biometrics for identifying individuals at a long distance. Although most previous methods have focused on recognizing silhouettes, several end-to-end methods that extract gait features directly from RGB images perform better. However, we demonstrate that these end-to-end methods inevitably suffer from gait-irrelevant noise, i.e., low-level texture and color information. Experimentally, we design a **cross-domain** evaluation to support this view. In this work, we propose a novel end-to-end framework named **GaitEdge** which can effectively block gait-irrelevant information and release the end-to-end training potential. Specifically, GaitEdge synthesizes the output of the pedestrian segmentation network and then feeds it to the subsequent recognition network, where the synthetic silhouettes consist of trainable edges of bodies and fixed interiors to limit the information that the recognition network receives. Besides, **GaitAlign**, a module for aligning silhouettes, is embedded into GaitEdge without losing differentiability. Experimental results on CASIA-B and our newly built TTG-200 indicate that GaitEdge significantly outperforms the previous methods and provides a more practical end-to-end paradigm.
## CASIA-B*
Since the silhouettes of CASIA-B were obtained by an outdated background subtraction method, there exists much noise caused by the background and the clothes of subjects. Hence, we re-annotate the
- silhouettes of CASIA-B and denote it as CASIA-B*. Refer to [here](../../datasets/CASIA-B*/README.md) for more details.
+ silhouettes of CASIA-B and denote it as CASIA-B*. You can visit [this link](http://www.cbsr.ia.ac.cn/english/Gait%20Databases.asp) to apply for CASIA-B*. More details about CASIA-B* can be found in [this link](../../datasets/CASIA-B*/README.md).
## Performance
| Model | NM | BG | CL | TTG-200 (cross-domain) | Configuration |
|:----------:|:----:|:----:|:----:|:----------------------:|:----------------------------------------------:|
@@ -17,3 +18,24 @@ silhouettes of CASIA-B and denote it as CASIA-B*. Refer to [here](../../datasets
| GaitEdge | 98.0 | 96.3 | 88.0 | 53.9 | [phase2_gaitedge.yaml](./phase2_gaitedge.yaml) |
***The results here are higher than those in the paper because we use a different optimization strategy, but this does not affect the conclusions of the paper.***
## Citation
```bibtex
@inproceedings{yu2006framework,
  title={A framework for evaluating the effect of view angle, clothing and carrying condition on gait recognition},
  author={Yu, Shiqi and Tan, Daoliang and Tan, Tieniu},
  booktitle={18th International Conference on Pattern Recognition (ICPR'06)},
  volume={4},
  pages={441--444},
  year={2006},
  organization={IEEE}
}

@article{liang2022gaitedge,
  title={GaitEdge: Beyond Plain End-to-end Gait Recognition for Better Practicality},
  author={Liang, Junhao and Fan, Chao and Hou, Saihui and Shen, Chuanfu and Huang, Yongzhen and Yu, Shiqi},
  journal={arXiv preprint arXiv:2203.03972},
  year={2022}
}
```
@@ -2,6 +2,7 @@
data_cfg:
  dataset_name: CASIA-B*
  dataset_root: your_path
+ data_in_use: [true, false, false, false]
  dataset_partition: ./datasets/CASIA-B*/CASIA-B*.json
  num_workers: 1
  remove_no_gallery: false
@@ -2,6 +2,7 @@
data_cfg:
  dataset_name: CASIA-B*
  dataset_root: your_path
+ data_in_use: [false, false, true, true]
  dataset_partition: ./datasets/CASIA-B*/CASIA-B*.json
  num_workers: 1
  remove_no_gallery: false
@@ -19,7 +20,6 @@ evaluator_cfg:
    type: InferenceSampler
    frames_all_limit: 720
  transform:
    - type: NoOperation
    - type: BaseRgbTransform
    - type: BaseSilTransform
@@ -27,7 +27,6 @@ loss_cfg:
  - loss_term_weight: 1.0
    type: BinaryCrossEntropyLoss
    log_prefix: bce
    kld: false

model_cfg:
  model: Segmentation
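The `BinaryCrossEntropyLoss` above supervises the segmentation network per pixel. As a minimal sketch (not the actual OpenGait loss class, which also carries the `kld` option and loss-term weighting), the per-pixel term for a predicted foreground probability `p` and a binary label `y` is:

```python
import math

def bce(p, y):
    # Per-pixel binary cross-entropy: penalizes confident wrong predictions
    # heavily, approaches 0 as p matches the binary label y.
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

print(bce(0.5, 1))  # log(2) ≈ 0.693: maximal uncertainty on a foreground pixel
```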
@@ -45,6 +44,7 @@ scheduler_cfg:
  gamma: 0.1
  milestones: # the learning rate is reduced at each milestone
    - 10000
    - 15000
    - 20000
  scheduler: MultiStepLR
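With `MultiStepLR`, the learning rate is multiplied by `gamma` each time training passes a milestone. A quick sketch of the resulting schedule (the base learning rate of 0.1 is an assumed value; the real one lives in the optimizer config):

```python
# Sketch of the MultiStepLR schedule configured above: gamma=0.1 applied
# at iterations 10000, 15000, and 20000.
base_lr, gamma = 0.1, 0.1
milestones = [10000, 15000, 20000]

def lr_at(iteration):
    # Count how many milestones this iteration has passed and decay accordingly.
    passed = sum(iteration >= m for m in milestones)
    return base_lr * gamma ** passed

print(lr_at(0), lr_at(12000), lr_at(25000))  # 0.1, 0.01, 0.0001
```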
@@ -54,9 +54,9 @@ trainer_cfg:
  log_iter: 100
  restore_ckpt_strict: true
  restore_hint: 0
- save_iter: 10000
+ save_iter: 5000
  save_name: Segmentation
- total_iter: 30000
+ total_iter: 25000
  sampler:
    batch_shuffle: true
    batch_size:
@@ -66,6 +66,5 @@ trainer_cfg:
    sample_type: fixed_unordered
    type: TripletSampler
  transform:
    - type: NoOperation
    - type: BaseRgbTransform
    - type: BaseSilTransform
@@ -1,6 +1,7 @@
data_cfg:
  dataset_name: CASIA-B*
  dataset_root: your_path
+ data_in_use: [false, true, true, true]
  dataset_partition: ./datasets/CASIA-B*/CASIA-B*.json
  num_workers: 1
  remove_no_gallery: false # Remove probe if no gallery for it
@@ -1,6 +1,7 @@
data_cfg:
  dataset_name: CASIA-B*
  dataset_root: your_path
+ data_in_use: [false, true, true, true]
  dataset_partition: ./datasets/CASIA-B*/CASIA-B*.json
  num_workers: 1
  remove_no_gallery: false # Remove probe if no gallery for it
@@ -1,2 +1,21 @@
# CASIA-B\*
- CASIA-B\* is a re-segmented version of CASIA-B processed by Liang et al. The extra import of CASIA-B* owes to the background subtraction algorithm that CASIA-B uses for generating the silhouette data tends to produce much noise and is outdated for real-world applications nowadays. We use the up-to-date pretreatment strategy to re-segment the raw videos, i.e., the deep pedestrian track and segmentation algorithms. As a result, CASIA-B\* consists of the cropped RGB images, binary silhouettes, and the height-width ratio of the obtained bounding boxes. Please refer to [GaitEdge](../../configs/gaitedge/README.md) for more details. If you need this sub-set, please apply with the instruction mentioned in [http://www.cbsr.ia.ac.cn/english/Gait%20Databases.asp]. In the Email Subject, please mark the specific dataset you need, i.e., Dataset B*.

## Introduction
+ CASIA-B\* is a re-segmented version of CASIA-B processed by Liang et al. It was introduced because the background subtraction algorithm that CASIA-B uses to generate the silhouette data tends to produce much noise and is outdated for real-world applications. We use an up-to-date preprocessing strategy, i.e., deep pedestrian tracking and segmentation algorithms, to re-segment the raw videos. As a result, CASIA-B\* consists of the cropped RGB images, binary silhouettes, the height-width ratios of the obtained bounding boxes, and the aligned silhouettes. Please refer to [GaitEdge](../../configs/gaitedge/README.md) for more details. If you need this subset, please follow the application instructions at [http://www.cbsr.ia.ac.cn/english/Gait%20Databases.asp](http://www.cbsr.ia.ac.cn/english/Gait%20Databases.asp). In the email subject, please mark the specific dataset you need, i.e., Dataset B*.
## Data structure
```
casiab-128-end2end/
    001 (subject)
        bg-01 (type)
            000 (view)
                000-aligned-sils.pkl (aligned sils, nx64x44)
                000-ratios.pkl (aspect ratio of bounding boxes, n)
                000-rgbs.pkl (cropped RGB images, nx3x128x128)
                000-sils.pkl (binary silhouettes, nx128x128)
            ......
        ......
    ......
```
## How to use
By default, all the data files of each sequence are loaded, like for other datasets, before training starts. If you only need some of them, e.g., `aligned-sils`, you can set the `data_in_use` parameter in `data_cfg`; its booleans correspond to the sequence's files in lexicographic order, *i.e.*, `data_in_use: [true, false, false, false]` loads only the aligned silhouettes.
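The lexicographic mapping can be sketched in a few lines (illustrative only, not the actual OpenGait loader):

```python
# The i-th boolean of data_in_use enables the i-th file of the sequence
# once the file names are sorted lexicographically.
files = ["000-sils.pkl", "000-rgbs.pkl", "000-ratios.pkl", "000-aligned-sils.pkl"]
data_in_use = [True, False, False, False]

kept = [f for f, use in zip(sorted(files), data_in_use) if use]
print(kept)  # ['000-aligned-sils.pkl']
```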
@@ -15,9 +15,8 @@ class Segmentation(BaseModel):
    def forward(self, inputs):
        ipts, labs, typs, vies, seqL = inputs
        del seqL
        # ratios = ipts[0]
-       rgbs = ipts[1]
-       sils = ipts[2]
+       rgbs = ipts[0]
+       sils = ipts[1]
        # del ipts
        n, s, c, h, w = rgbs.size()
        rgbs = rgbs.view(n*s, c, h, w)
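The indices shift because `ratios` is no longer loaded once `data_in_use` excludes it, so the RGB frames and silhouettes move to positions 0 and 1. The final `.view` call then flattens the batch (`n`) and sequence (`s`) dimensions so every frame is fed to the segmentation backbone independently; a numpy analogue (sizes are made up):

```python
import numpy as np

# n sequences of s frames, each a c x h x w RGB image.
n, s, c, h, w = 2, 30, 3, 128, 128
rgbs = np.zeros((n, s, c, h, w))

# Equivalent of the torch .view call: merge batch and sequence dims.
frames = rgbs.reshape(n * s, c, h, w)
print(frames.shape)  # (60, 3, 128, 128)
```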