update gaitedge and use data_in_use
@@ -4,11 +4,12 @@ This [paper](https://arxiv.org/abs/2203.03972) has been accepted by ECCV 2022.
## Abstract
Gait is one of the most promising biometrics for identifying individuals at a long distance. Although most previous methods have focused on recognizing silhouettes, several end-to-end methods that extract gait features directly from RGB images perform better. However, we demonstrate that these end-to-end methods inevitably suffer from gait-irrelevant noise, i.e., low-level texture and color information. Experimentally, we design a **cross-domain** evaluation to support this view. In this work, we propose a novel end-to-end framework named **GaitEdge** which can effectively block gait-irrelevant information and release the end-to-end training potential. Specifically, GaitEdge synthesizes the output of the pedestrian segmentation network and then feeds it to the subsequent recognition network, where the synthetic silhouettes consist of trainable edges of bodies and fixed interiors to limit the information that the recognition network receives. Besides, **GaitAlign**, a module for aligning silhouettes, is embedded into GaitEdge without losing differentiability. Experimental results on CASIA-B and our newly built TTG-200 indicate that GaitEdge significantly outperforms the previous methods and provides a more practical end-to-end paradigm.
## CASIA-B*
Since the silhouettes of CASIA-B were obtained by an outdated background subtraction method, there exists much noise caused by the background and the clothes of subjects. Hence, we re-annotate the
- silhouettes of CASIA-B and denote it as CASIA-B*. Refer to [here](../../datasets/CASIA-B*/README.md) for more details.
+ silhouettes of CASIA-B and denote it as CASIA-B*. You can visit [this link](http://www.cbsr.ia.ac.cn/english/Gait%20Databases.asp) to apply for CASIA-B*. More details about CASIA-B* can be found in [this link](../../datasets/CASIA-B*/README.md).
## Performance
| Model | NM | BG | CL | TTG-200 (cross-domain) | Configuration |
|:----------:|:----:|:----:|:----:|:----------------------:|:----------------------------------------------:|
@@ -17,3 +18,24 @@ silhouettes of CASIA-B and denote it as CASIA-B*. Refer to [here](../../datasets
| GaitEdge | 98.0 | 96.3 | 88.0 | 53.9 | [phase2_gaitedge.yaml](./phase2_gaitedge.yaml) |
***The results here are higher than those in the paper because we use a different optimization strategy, but this does not affect the conclusions of the paper.***
## Citation
```bibtex
@inproceedings{yu2006framework,
  title={A framework for evaluating the effect of view angle, clothing and carrying condition on gait recognition},
  author={Yu, Shiqi and Tan, Daoliang and Tan, Tieniu},
  booktitle={18th International Conference on Pattern Recognition (ICPR'06)},
  volume={4},
  pages={441--444},
  year={2006},
  organization={IEEE}
}

@article{liang2022gaitedge,
  title={GaitEdge: Beyond Plain End-to-end Gait Recognition for Better Practicality},
  author={Liang, Junhao and Fan, Chao and Hou, Saihui and Shen, Chuanfu and Huang, Yongzhen and Yu, Shiqi},
  journal={arXiv preprint arXiv:2203.03972},
  year={2022}
}
```
@@ -2,6 +2,7 @@
data_cfg:
  dataset_name: CASIA-B*
  dataset_root: your_path
+ data_in_use: [true, false, false, false]
  dataset_partition: ./datasets/CASIA-B*/CASIA-B*.json
  num_workers: 1
  remove_no_gallery: false
@@ -2,6 +2,7 @@
data_cfg:
  dataset_name: CASIA-B*
  dataset_root: your_path
+ data_in_use: [false, false, true, true]
  dataset_partition: ./datasets/CASIA-B*/CASIA-B*.json
  num_workers: 1
  remove_no_gallery: false
@@ -19,7 +20,6 @@ evaluator_cfg:
    type: InferenceSampler
    frames_all_limit: 720
  transform:
    - type: NoOperation
    - type: BaseRgbTransform
    - type: BaseSilTransform
@@ -27,7 +27,6 @@ loss_cfg:
  - loss_term_weight: 1.0
    type: BinaryCrossEntropyLoss
    log_prefix: bce
    kld: false

model_cfg:
  model: Segmentation
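The `BinaryCrossEntropyLoss` above supervises the segmentation network per pixel. As a minimal sketch (not the actual OpenGait loss class, which also carries the `kld` option and loss-term weighting), the per-pixel term for a predicted foreground probability `p` and a binary label `y` is:

```python
import math

def bce(p, y):
    # Per-pixel binary cross-entropy: penalizes confident wrong predictions
    # heavily, approaches 0 as p matches the binary label y.
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

print(bce(0.5, 1))  # log(2) ≈ 0.693: maximal uncertainty on a foreground pixel
```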
@@ -45,6 +44,7 @@ scheduler_cfg:
  gamma: 0.1
  milestones: # the learning rate is reduced at each milestone
    - 10000
    - 15000
    - 20000
  scheduler: MultiStepLR
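With `MultiStepLR`, the learning rate is multiplied by `gamma` each time training passes a milestone. A quick sketch of the resulting schedule (the base learning rate of 0.1 is an assumed value; the real one lives in the optimizer config):

```python
# Sketch of the MultiStepLR schedule configured above: gamma=0.1 applied
# at iterations 10000, 15000, and 20000.
base_lr, gamma = 0.1, 0.1
milestones = [10000, 15000, 20000]

def lr_at(iteration):
    # Count how many milestones this iteration has passed and decay accordingly.
    passed = sum(iteration >= m for m in milestones)
    return base_lr * gamma ** passed

print(lr_at(0), lr_at(12000), lr_at(25000))  # 0.1, 0.01, 0.0001
```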
@@ -54,9 +54,9 @@ trainer_cfg:
  log_iter: 100
  restore_ckpt_strict: true
  restore_hint: 0
- save_iter: 10000
+ save_iter: 5000
  save_name: Segmentation
- total_iter: 30000
+ total_iter: 25000
  sampler:
    batch_shuffle: true
    batch_size:
@@ -66,6 +66,5 @@ trainer_cfg:
    sample_type: fixed_unordered
    type: TripletSampler
  transform:
    - type: NoOperation
    - type: BaseRgbTransform
    - type: BaseSilTransform
@@ -1,6 +1,7 @@
data_cfg:
  dataset_name: CASIA-B*
  dataset_root: your_path
+ data_in_use: [false, true, true, true]
  dataset_partition: ./datasets/CASIA-B*/CASIA-B*.json
  num_workers: 1
  remove_no_gallery: false # Remove probe if no gallery for it
@@ -1,6 +1,7 @@
data_cfg:
  dataset_name: CASIA-B*
  dataset_root: your_path
+ data_in_use: [false, true, true, true]
  dataset_partition: ./datasets/CASIA-B*/CASIA-B*.json
  num_workers: 1
  remove_no_gallery: false # Remove probe if no gallery for it
@@ -1,2 +1,21 @@
# CASIA-B\*
- CASIA-B\* is a re-segmented version of CASIA-B processed by Liang et al. The extra import of CASIA-B* owes to the background subtraction algorithm that CASIA-B uses for generating the silhouette data tends to produce much noise and is outdated for real-world applications nowadays. We use the up-to-date pretreatment strategy to re-segment the raw videos, i.e., the deep pedestrian track and segmentation algorithms. As a result, CASIA-B\* consists of the cropped RGB images, binary silhouettes, and the height-width ratio of the obtained bounding boxes. Please refer to [GaitEdge](../../configs/gaitedge/README.md) for more details. If you need this sub-set, please apply with the instruction mentioned in [http://www.cbsr.ia.ac.cn/english/Gait%20Databases.asp]. In the Email Subject, please mark the specific dataset you need, i.e., Dataset B*.

## Introduction
+ CASIA-B\* is a re-segmented version of CASIA-B processed by Liang et al. It was introduced because the background subtraction algorithm that CASIA-B uses to generate the silhouette data tends to produce much noise and is outdated for real-world applications. We use an up-to-date preprocessing strategy, i.e., deep pedestrian tracking and segmentation algorithms, to re-segment the raw videos. As a result, CASIA-B\* consists of the cropped RGB images, binary silhouettes, the height-width ratios of the obtained bounding boxes, and the aligned silhouettes. Please refer to [GaitEdge](../../configs/gaitedge/README.md) for more details. If you need this subset, please follow the application instructions at [http://www.cbsr.ia.ac.cn/english/Gait%20Databases.asp](http://www.cbsr.ia.ac.cn/english/Gait%20Databases.asp). In the email subject, please mark the specific dataset you need, i.e., Dataset B*.
## Data structure
```
casiab-128-end2end/
    001 (subject)
        bg-01 (type)
            000 (view)
                000-aligned-sils.pkl (aligned sils, nx64x44)
                000-ratios.pkl (aspect ratio of bounding boxes, n)
                000-rgbs.pkl (cropped RGB images, nx3x128x128)
                000-sils.pkl (binary silhouettes, nx128x128)
            ......
        ......
    ......
```
## How to use
By default, all the data files of each sequence are loaded, like for other datasets, before training starts. If you only need some of them, e.g., `aligned-sils`, you can set the `data_in_use` parameter in `data_cfg`; its booleans correspond to the sequence's files in lexicographic order, *i.e.*, `data_in_use: [true, false, false, false]` loads only the aligned silhouettes.
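The lexicographic mapping can be sketched in a few lines (illustrative only, not the actual OpenGait loader):

```python
# The i-th boolean of data_in_use enables the i-th file of the sequence
# once the file names are sorted lexicographically.
files = ["000-sils.pkl", "000-rgbs.pkl", "000-ratios.pkl", "000-aligned-sils.pkl"]
data_in_use = [True, False, False, False]

kept = [f for f, use in zip(sorted(files), data_in_use) if use]
print(kept)  # ['000-aligned-sils.pkl']
```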
@@ -15,9 +15,8 @@ class Segmentation(BaseModel):
    def forward(self, inputs):
        ipts, labs, typs, vies, seqL = inputs
        del seqL
        # ratios = ipts[0]
-       rgbs = ipts[1]
-       sils = ipts[2]
+       rgbs = ipts[0]
+       sils = ipts[1]
        # del ipts
        n, s, c, h, w = rgbs.size()
        rgbs = rgbs.view(n*s, c, h, w)
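The indices shift because `ratios` is no longer loaded once `data_in_use` excludes it, so the RGB frames and silhouettes move to positions 0 and 1. The final `.view` call then flattens the batch (`n`) and sequence (`s`) dimensions so every frame is fed to the segmentation backbone independently; a numpy analogue (sizes are made up):

```python
import numpy as np

# n sequences of s frames, each a c x h x w RGB image.
n, s, c, h, w = 2, 30, 3, 128, 128
rgbs = np.zeros((n, s, c, h, w))

# Equivalent of the torch .view call: merge batch and sequence dims.
frames = rgbs.reshape(n * s, c, h, w)
print(frames.shape)  # (60, 3, 128, 128)
```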