Understanding semantic meaning from hand gestures is a challenging but essential task in human-robot interaction scenarios. In this paper we present a baseline evaluation of the Innsbruck Multi-View Hand Gesture (IMHG) dataset  recorded with two RGB-D cameras (Kinect). As a baseline, we adopt a probabilistic appearance-based framework  to detect a hand gesture and estimate its pose using two cameras. The dataset consists of two types of deictic gestures with the ground truth location of the target, two symbolic gestures, two manipulative gestures, and two interactional gestures. We discuss the effect of parallax due to the offset between head and hand while performing deictic gestures. Furthermore, we evaluate the proposed framework to estimate the potential referents on the Innsbruck Pointing at Objects (IPO) dataset .