Unifying Part Detection and Association for Recurrent Multi-Person Pose Estimation

Published in , 2019

We propose a joint model of human joint detection and association for 2D multi-person pose estimation (MPPE). The approach unifies the training of joint detection and association, so no further processing or sophisticated heuristics are needed to assign joints to individual people. It consists of two stages. The first stage extracts joint detection heatmaps and association features. The second stage takes these features as input and uses a recurrent neural network (RNN) that predicts the joint heatmaps of a single person in each iteration. In addition, the network learns a stopping criterion so that it halts once it has identified all individuals in the image. This design eliminates several heuristic assumptions and parameters that association otherwise requires and that do not necessarily hold true. Because the approach is end-to-end, the final objective is known and optimized directly during training. We evaluated our model on the challenging MSCOCO and OCHuman datasets and obtained a significant improvement over the baseline, particularly in challenging scenes with severe occlusions or outliers.
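The per-person decoding loop with a learned stopping criterion can be illustrated roughly as follows. This is only a minimal sketch of the control flow, not the paper's network: `decode_people`, `step`, `max_people`, `stop_threshold`, and `toy_step` are hypothetical names, and checking the stop signal before keeping the heatmaps is an assumption made for the example.

```python
def decode_people(features, step, max_people=20, stop_threshold=0.5):
    """Collect one set of joint heatmaps per iteration until the model signals stop.

    `step` is a hypothetical stand-in for the recurrent stage: it takes the
    first-stage features and the previous recurrent state, and returns
    (heatmaps_for_one_person, stop_probability, new_state).
    """
    state = None
    people = []
    for _ in range(max_people):  # hard cap as a safety net (assumption)
        heatmaps, stop_probability, state = step(features, state)
        if stop_probability > stop_threshold:
            break  # the model indicates that no further person remains
        people.append(heatmaps)
    return people


def toy_step(features, state):
    """Toy recurrent step: 'finds' features["num_people"] people, then signals stop."""
    count = 0 if state is None else state
    stop_probability = 0.9 if count >= features["num_people"] else 0.1
    heatmaps = {"person_index": count}  # placeholder for real heatmaps
    return heatmaps, stop_probability, count + 1


if __name__ == "__main__":
    print(len(decode_people({"num_people": 3}, toy_step)))
```

In a trained model the stop probability would come from the network itself; the fixed threshold and the iteration cap here are illustrative choices only.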

Authors: R. Briq, A. Doering, J. Gall

Download here

Convolutional Simplex Projection Network for Weakly Supervised Semantic Segmentation (CSPN)

Published in British Machine Vision Conference (BMVC), 2018

Weakly supervised semantic segmentation has been a subject of increased interest due to the scarcity of fully annotated images. We introduce a new optimization approach for solving weakly supervised semantic segmentation with deep Convolutional Neural Networks (CNNs). The method introduces a novel layer that applies a simplex projection to the output of a neural network using area constraints of class objects. The proposed method is general and can be seamlessly integrated into any CNN architecture. Moreover, the projection layer allows strongly supervised models to be adapted to the weakly supervised setting effortlessly by substituting the ground-truth labels. Our experiments show that applying such an operation to the output of a CNN substantially improves the accuracy of the baseline architecture and allows for faster convergence.
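For readers unfamiliar with simplex projection, the standard sort-based Euclidean projection of a vector onto the simplex can be sketched as below. This is a generic textbook routine, not the paper's layer: the function name and the `radius` parameter are assumptions, and how the area constraints of class objects enter the projection is not described in the abstract.

```python
def project_to_simplex(values, radius=1.0):
    """Euclidean projection of `values` onto {x : x_i >= 0, sum(x) == radius}.

    Uses the classic sort-and-threshold algorithm: find the shift `theta`
    such that clipping (values - theta) at zero yields a vector summing
    to `radius`.
    """
    sorted_desc = sorted(values, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(sorted_desc, start=1):
        cumulative += u_k
        candidate = (cumulative - radius) / k
        # The condition holds for a prefix of the sorted values, so the last
        # candidate that satisfies it is the correct threshold.
        if u_k - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in values]


if __name__ == "__main__":
    print(project_to_simplex([2.0, 0.0]))
    print(project_to_simplex([0.2, 0.2]))
```

The result is always non-negative and sums to `radius`, which is what makes such a projection usable as a constraint-enforcing step on network outputs.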

Authors: R. Briq, M. Moeller, J. Gall

Download here

Online Robust Learning Using the Radon Point

MSc thesis, University of Bonn, 2017

This thesis analyzes a novel approach for model synchronization in distributed online learning from noisy data streams. The proposed approach combines weak hypotheses that have been computed locally by replacing them with their Radon point (akin to the center point, or the median in one dimension). These hypotheses may be learned by a wide range of online learning algorithms, which makes the approach black-box. The work covers both the theoretical and the empirical aspects of the method. The theoretical analysis focuses on proving probabilistic guarantees on the error bound. We show that on noise-free streams, the approach satisfies strong probabilistic error guarantees within the framework of PAC (Probably Approximately Correct) learning. Additionally, under strict assumptions, it provides a method for converting regret bounds of standard online learning algorithms into PAC error bounds. The empirical part evaluates the approach in practice and shows that it outperforms state-of-the-art approaches on noisy data streams.
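As an illustration of the Radon point itself, the sketch below computes a Radon point of d + 2 points in d dimensions by solving the usual affine-dependence system. It is a minimal pure-Python example, not the thesis's implementation: the helper names are hypothetical, and the input points stand in for locally learned hypothesis (parameter) vectors.

```python
def null_vector(rows, eps=1e-12):
    """Return one nonzero v with rows * v = 0 (rows has more columns than rows)."""
    m = [[float(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivots = []  # pivot column of each reduced row
    r = 0
    for c in range(n_cols):
        if r == n_rows:
            break
        p = max(range(r, n_rows), key=lambda i: abs(m[i][c]))
        if abs(m[p][c]) < eps:
            continue  # no pivot in this column, so it is a free variable
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        for i in range(n_rows):
            if i != r and m[i][c] != 0.0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    free = next(c for c in range(n_cols) if c not in pivots)
    v = [0.0] * n_cols
    v[free] = 1.0
    for row_index, c in enumerate(pivots):
        v[c] = -m[row_index][free]
    return v


def radon_point(points):
    """Radon point of d + 2 points in R^d, given as sequences of length d."""
    d = len(points[0])
    if len(points) != d + 2:
        raise ValueError("expected exactly d + 2 points in d dimensions")
    # Find lam != 0 with sum(lam_i * x_i) = 0 and sum(lam_i) = 0.
    rows = [[p[k] for p in points] for k in range(d)] + [[1.0] * len(points)]
    lam = null_vector(rows)
    positive = [i for i, l in enumerate(lam) if l > 0]
    total = sum(lam[i] for i in positive)
    # The convex combination of the positively weighted points lies in the
    # convex hulls of both parts of the Radon partition.
    return [sum(lam[i] * points[i][k] for i in positive) / total for k in range(d)]


if __name__ == "__main__":
    print(radon_point([[1.0], [2.0], [3.0]]))
```

In one dimension with three points, the Radon point coincides with the median, which matches the analogy drawn in the abstract.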

Download here