Title

Integrating near and long-range evidence for visual detection

Abstract

This thesis presents HoughNet, a one-stage, anchor-free, voting-based, bottom-up object detection method. Inspired by the Generalized Hough Transform, HoughNet determines the presence of an object at a given location from the sum of the votes cast on that location. Votes are collected from both near and long-distance locations based on a log-polar vote field. Thanks to this voting mechanism, HoughNet integrates both near and long-range, class-conditional evidence for visual recognition, thereby generalizing and enhancing current object detection methodology, which typically relies only on local evidence. On the COCO dataset, HoughNet’s best model achieves $46.4$ $AP$ (and $65.1$ $AP_{50}$), performing on par with the state of the art in bottom-up object detection and outperforming most major one-stage and two-stage methods. We further validate the effectiveness of our proposal on other visual detection tasks, namely video object detection, instance segmentation, 3D object detection, keypoint detection for human pose estimation and whole-body human pose estimation, and face detection, as well as on an additional "labels to photo" image generation task. Integrating our voting module improves performance in all of these cases.
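The voting idea described above can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the function names, the ring radii, the number of angular sectors, and the treatment of the center as part of the innermost ring are all assumptions made for illustration. Each location casts one vote value per log-polar region, and the evidence at a location is the sum of the votes it receives.

```python
import numpy as np

def log_polar_regions(radii=(2, 8, 32), n_angles=4):
    """Label every offset (dy, dx) in a square window with a region id.

    Hypothetical layout: rings bounded by `radii`, each split into
    `n_angles` equal sectors. Offsets beyond the last radius get -1.
    """
    r_max = radii[-1]
    dy, dx = np.mgrid[-r_max:r_max + 1, -r_max:r_max + 1]
    dist = np.hypot(dy, dx)
    angle = np.arctan2(dy, dx) % (2 * np.pi)
    # ring i covers radii[i-1] < dist <= radii[i]; len(radii) means outside
    ring = np.searchsorted(radii, dist, side="left")
    sector = np.minimum((angle / (2 * np.pi) * n_angles).astype(int),
                        n_angles - 1)
    region = ring * n_angles + sector
    region[ring == len(radii)] = -1
    return region

def accumulate_votes(votes, region):
    """Sum the votes received at each location.

    votes:  (R, H, W) array, vote strength per region at each voter.
    region: output of log_polar_regions, with R distinct region ids.
    Returns an (H, W) evidence map for one class.
    """
    _, h, w = votes.shape
    r_max = region.shape[0] // 2
    evidence = np.zeros((h, w))
    for oy in range(region.shape[0]):
        for ox in range(region.shape[1]):
            r = region[oy, ox]
            if r < 0:
                continue
            dy, dx = oy - r_max, ox - r_max
            # voters (y, x) whose target (y + dy, x + dx) lies inside the map
            y0, y1 = max(0, -dy), min(h, h - dy)
            x0, x1 = max(0, -dx), min(w, w - dx)
            if y0 >= y1 or x0 >= x1:
                continue
            evidence[y0 + dy:y1 + dy, x0 + dx:x1 + dx] += votes[r, y0:y1, x0:x1]
    return evidence

# Usage: one voter at the center of an 11x11 map votes in every region.
region = log_polar_regions(radii=(1, 3), n_angles=4)
votes = np.zeros((region.max() + 1, 11, 11))
votes[:, 5, 5] = 1.0
evidence = accumulate_votes(votes, region)
```

Because the outer rings cover more area than the inner ones, a single vote value in an outer region spreads over many distant locations, which is how a log-polar field can mix near and long-range evidence at modest cost.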

To show the effectiveness of our proposal on the whole-body human pose estimation task, we developed a bottom-up, one-stage method called HPRNet. In HPRNet, we build a hierarchical regression mechanism in which each whole-body keypoint is defined by its relative location (i.e., offset) to a specific point on the person box.

In the context of this thesis, we also propose a one-stage, anchor-free object detector, PPDet, which integrates short-range interactions through voting. PPDet sum-pools predictions stemming from individual features into a single prediction, which allows the model to reduce the contributions of non-discriminatory features during training.
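The sum-pooling step mentioned for PPDet can be sketched as follows. This is an illustrative assumption rather than the detector's actual code: the function name `sum_pool_scores`, the array layout, and the idea of grouping features by a hypothetical `object_ids` array are all introduced here only to show what pooling per-feature predictions into a single prediction might look like.

```python
import numpy as np

def sum_pool_scores(scores, object_ids):
    """Sum per-feature class scores into one score vector per object.

    scores:     (N, C) class scores predicted by N individual features.
    object_ids: (N,) integer id of the object each feature is assigned to
                (hypothetical assignment, ids in 0..K-1).
    Returns a (K, C) array of pooled scores.
    """
    n_objects = int(object_ids.max()) + 1
    pooled = np.zeros((n_objects, scores.shape[1]))
    # unbuffered add, so repeated ids accumulate instead of overwriting
    np.add.at(pooled, object_ids, scores)
    return pooled

# Usage: three features, two classes; the first two features share object 0.
scores = np.array([[0.5, 0.0],
                   [0.25, 0.125],
                   [0.0, 0.75]])
object_ids = np.array([0, 0, 1])
pooled = sum_pool_scores(scores, object_ids)
```

In this sketch, a training loss applied to the pooled row rather than to each feature row would let a feature with a near-zero score contribute little to the object's prediction without being penalized on its own, which is one way to read the abstract's point about non-discriminatory features.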

Zoom Link:
https://zoom.us/j/93980464135?pwd=ZUh0ZURBYU1Dd1Z3SG9yTG9rdGViQT09

Supervisor(s)

NERMIN SAMET

Date and Location

2021-09-08 09:00:00

Category

PhD_Thesis