Visual Analysis of Human Pose Estimation under Frame Degradation Using MediaPipe and ViTPose

Authors

Keywords

Human Pose Estimation, MediaPipe, ViTPose, Rotation, Low Resolution, Model Performance

Abstract

Human Pose Estimation (HPE) is an important task in many computer vision applications. However, its accuracy tends to drop when the input is visually degraded, for example when video frames are low-resolution or rotated. This paper compares two widely used pose estimation models, MediaPipe (MP) and ViTPose, on carefully chosen frames extracted from three of our everyday videos. To emulate non-optimal conditions, we applied three kinds of visual degradation to the videos: lossy video compression (to approximately 70% of the original size), clockwise 90-degree rotation, and 180-degree rotation. We then compared the original frames with their degraded counterparts using visual overlays of the predicted landmarks. Our results shed light on how each model reacts to these changes, providing a visual representation that can help explain performance anomalies under different conditions. These observations help identify the weaknesses of HPE systems in unpredictable environments and point to opportunities for improving pose estimation models for wider, real-life use.
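
To make the rotation conditions concrete, the following is a minimal Python sketch of how a frame could be rotated and how landmarks predicted on a rotated frame could be mapped back onto the original frame for an overlay comparison. It is an illustration under stated assumptions, not the procedure used in the paper: the function names (`rotate_cw90`, `rotate_180`, `to_original`, `mean_displacement`) are hypothetical, frames are represented as plain nested lists, and landmarks are assumed to be normalized (x, y) pairs in [0, 1] with x pointing right and y pointing down. The displacement score is an added convenience; the paper itself relies on visual comparison.

```python
import math


def rotate_cw90(frame):
    """Rotate a frame (list of pixel rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*frame[::-1])]


def rotate_180(frame):
    """Rotate a frame (list of pixel rows) 180 degrees."""
    return [row[::-1] for row in frame[::-1]]


def to_original(landmark, rotation):
    """Map a normalized (x, y) landmark predicted on a rotated frame
    back to the coordinate system of the unrotated frame.

    Assumption: `rotation` is the clockwise angle (0, 90, or 180)
    that was applied to the original frame before prediction.
    """
    x, y = landmark
    if rotation == 0:
        return (x, y)
    if rotation == 90:
        # Forward mapping is (x, y) -> (1 - y, x); this is its inverse.
        return (y, 1.0 - x)
    if rotation == 180:
        return (1.0 - x, 1.0 - y)
    raise ValueError("rotation must be 0, 90, or 180")


def mean_displacement(reference, candidate):
    """Mean Euclidean distance between matching landmarks
    (both lists in the same normalized coordinate system)."""
    if len(reference) != len(candidate) or not reference:
        raise ValueError("landmark lists must be non-empty and equal in length")
    total = sum(
        math.hypot(rx - cx, ry - cy)
        for (rx, ry), (cx, cy) in zip(reference, candidate)
    )
    return total / len(reference)
```

A typical use would be to run a model on the original frame and on its rotated copy, pass the second set of landmarks through `to_original`, and then either draw both sets on the original frame or summarize the difference with `mean_displacement`.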

Published

2025-12-11

How to Cite

Elshami, N. E., Salah, A., Abdellatif, A., & Mohsen, H. (2025). Visual Analysis of Human Pose Estimation under Frame Degradation Using MediaPipe and ViTPose. International Journal of Computers and Informatics (Zagazig University), 9, 41–54. Retrieved from https://www.ijci.zu.edu.eg/index.php/ijci/article/view/132