Zsolt Varga
This thesis proposes deep similarity learning, specifically distance metric learning with a Siamese neural network architecture, to embed human poses into a lower-dimensional space for similarity comparison. The goal is to learn a mapping from the original input to the embedding such that, in the embedding space, the Euclidean distance is small between similar data points and large between dissimilar ones. The approach proves effective at producing a semantic, similarity-based human pose embedding that outperforms traditional approaches. The results show that using these embeddings leads to better classification performance and faster convergence during training. The approach is relevant to systems that require non-trivial similarity measures, such as invariance to sidedness and to the position of body parts, and the embeddings can serve as input to further models. Overall, this thesis contributes to the development of more advanced techniques for human pose understanding, with potential applications in healthcare, education, fitness, and other fields.
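To make the distance objective concrete, the following is a minimal sketch of a pairwise loss that a Siamese setup could minimize. It assumes the classic contrastive loss with a margin; the abstract does not say which loss the thesis uses, so this is an illustration only. The linear `embed` function stands in for the shared network branch, and the names `embed`, `contrastive_loss`, `weights`, and `margin`, as well as the toy 4-D poses, are hypothetical.

```python
import math


def embed(pose, weights):
    """Project a pose vector with a shared linear map.

    Assumption: a single linear layer stands in for the Siamese branch;
    both inputs of a pair always pass through the same weights.
    """
    return [sum(w * x for w, x in zip(row, pose)) for row in weights]


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(pose_a, pose_b, similar, weights, margin=1.0):
    """Contrastive loss for one pair (assumed loss, not from the thesis).

    Similar pairs are penalized by their squared distance; dissimilar
    pairs are penalized only while they are closer than the margin.
    """
    d = euclidean(embed(pose_a, weights), embed(pose_b, weights))
    if similar:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical 4-D "poses" projected to a 2-D embedding.
weights = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0]]
pose_a = [0.0, 0.0, 5.0, 1.0]
pose_b = [0.0, 0.0, -3.0, 2.0]  # differs from pose_a only in ignored dimensions
pose_c = [3.0, 4.0, 5.0, 1.0]   # far from pose_a in the embedding

loss_similar = contrastive_loss(pose_a, pose_b, True, weights)
loss_dissimilar = contrastive_loss(pose_a, pose_c, False, weights)
```

In this toy setup, `pose_a` and `pose_b` collapse to the same embedding, so the similar pair incurs no loss, while `pose_a` and `pose_c` are already farther apart than the margin, so the dissimilar pair incurs none either. Training would adjust `weights` to drive labeled pairs toward that state.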