Sign Language Recognition (SLR) models rely heavily on advances in Human Action Recognition (HAR). One of the simplest and most dimensionally reduced modalities is the skeleton, in which joints are represented as key-point landmarks and limbs as the edges connecting them. These skeletons can be obtained through pose estimation, depth maps, or motion capture. HAR models usually require less granular pose estimation than SLR, where accurate landmark estimation is critical not only for the body pose but also for facial gestures, hands, and fingers. In this work, we compare three whole-body pose estimation libraries/models that are gaining traction in the SLR task. We first relate them by identifying common keypoints across their landmark structures and analyzing their quality. We then complement this analysis by comparing their annotations on three sign language datasets with videos that differ in quality, background, and region (Peru and USA). Finally, we test a sign language recognition model to compare the quality of the annotations provided by these libraries/models.