Title

RANKING BASED LOSSES FOR HUMAN POSE ESTIMATION

Abstract

Human pose estimation is a fundamental task in computer vision that involves localizing human joints (keypoints) in images. A widely adopted approach formulates it as a regression or classification problem in which the model predicts heatmaps representing the location of each keypoint. Heatmap-based methods have become the dominant paradigm, achieving state-of-the-art performance across multiple benchmark datasets. However, these methods face three challenges. First, the commonly used Mean Squared Error (MSE) loss penalizes all deviations in the heatmap equally, so reducing it does not necessarily improve joint localization: it places no particular emphasis on sharpening the peak or on placing it precisely at the joint. Second, heatmaps exhibit class-wise imbalance, which leads to biased gradient signals during training. Third, the training objective is misaligned with the evaluation metric (i.e., mean Average Precision, mAP), which can hinder performance. To address these limitations, this thesis introduces ranking-based loss functions tailored for heatmap-based human pose estimation. The proposed losses, Spatial-RS Loss and Instance-Sort Loss, explicitly aim to rank the ground-truth joint locations higher than all other candidates and to align confidence scores with localization quality. These losses are shown, both theoretically and empirically, to outperform standard alternatives such as MSE and KL-Divergence. The proposed approach is evaluated on both one-dimensional and two-dimensional heatmap representations across three widely used human pose estimation benchmarks: COCO, CrowdPose, and MPII. To the best of our knowledge, our work is the first to introduce loss functions that explicitly optimize for mAP in pose estimation. Our methods achieve state-of-the-art results, including 79.9 mAP on COCO-val using ViTPose-H, a vision transformer model, and an improvement of 1.5 AP over the baseline with SimCC ResNet-50, reaching 73.6 AP on the same dataset.
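
The first two challenges named in the abstract can be illustrated with a small NumPy sketch. This is not the thesis's Spatial-RS or Instance-Sort Loss, whose definitions are not given here; it only compares MSE against the rank of the ground-truth pixel on a toy Gaussian heatmap. The heatmap size (64 x 48), sigma = 2, the 0.5 foreground threshold, and the helper names `gaussian_heatmap`, `mse`, and `peak_rank` are all assumptions made for illustration.

```python
import numpy as np

def gaussian_heatmap(height, width, cx, cy, sigma=2.0):
    """Render a 2D Gaussian with peak value 1 at column cx, row cy."""
    ys, xs = np.mgrid[0:height, 0:width]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

def mse(pred, target):
    """Mean squared error over all heatmap pixels."""
    return float(np.mean((pred - target) ** 2))

def peak_rank(pred, cx, cy):
    """Rank of the ground-truth pixel among all pixels (1 = highest score)."""
    return int(np.sum(pred > pred[cy, cx])) + 1

# Hypothetical toy setup: one keypoint at (cx, cy) on a 64 x 48 heatmap.
H, W = 64, 48
cx, cy = 20, 30
target = gaussian_heatmap(H, W, cx, cy)

# Imbalance: only a tiny fraction of pixels carries a strong target value.
foreground_fraction = float(np.mean(target > 0.5))

# Prediction 1: full-height peak, but shifted one pixel to the right.
pred_shifted = gaussian_heatmap(H, W, cx + 1, cy)
# Prediction 2: peak at the correct pixel, but at half the target height.
pred_flat = 0.5 * target

mse_shifted = mse(pred_shifted, target)
mse_flat = mse(pred_flat, target)
rank_shifted = peak_rank(pred_shifted, cx, cy)
rank_flat = peak_rank(pred_flat, cx, cy)

print(f"foreground fraction: {foreground_fraction:.4f}")
print(f"shifted peak: MSE={mse_shifted:.6f}, rank of true pixel={rank_shifted}")
print(f"flat peak:    MSE={mse_flat:.6f}, rank of true pixel={rank_flat}")
```

In this toy setting the shifted prediction gets the lower MSE even though its highest-scoring pixel is not the true keypoint, while the half-height prediction ranks the true pixel first yet is penalized more. A ranking-based objective, in the general sense described in the abstract, would prefer the second prediction.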

Supervisor(s)

MUHAMMED CAN KELES

Date and Location

2025-08-27 11:30:00

Category

MSc_Thesis