Simultaneous detection and tracking of users is a core problem in human–robot interaction (HRI). Several methods exist and give good results; many apply image processing techniques to camera images. The growing availability of range-imaging cameras on mobile robots (such as structured-light devices like the Microsoft Kinect) makes it possible to apply image processing directly to depth maps. This article presents a fast and lightweight algorithm for detecting and tracking 3D clusters using classic 2D techniques, such as edge detection and connected-component labeling, applied to the depth maps. Clusters are recognized by their 2D shape. A compression algorithm for depth maps has been developed specifically for this purpose, allowing the processing to be distributed among several computers. The algorithm is then applied to a mobile robot that chases an object selected by the user, and it is coupled with laser-based tracking to compensate for the narrow field of view of the range-imaging camera. The computational load of the method is light enough for use on processors with limited capabilities. Extensive experimental results are given to verify the usefulness of the proposed method.
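
As an illustration of the general idea of combining edge detection with connected components on a depth map, the following minimal Python sketch may help. It is not the implementation described in this article: the function names `depth_edges` and `label_clusters`, the `jump` threshold, the choice to mark only the farther pixel of a depth discontinuity as an edge, and the convention that a depth of 0 means "no reading" are all assumptions made for the example.

```python
from collections import deque

NEIGHBOURS_4 = ((0, 1), (1, 0), (0, -1), (-1, 0))


def depth_edges(depth, jump):
    """Return a boolean mask of depth discontinuities.

    Assumption: when two adjacent valid pixels differ in depth by more
    than `jump`, only the farther one is marked as an edge, so that
    foreground clusters keep their full 2D shape.
    """
    h, w = len(depth), len(depth[0])
    edges = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0)):  # right and bottom neighbours
                ny, nx = y + dy, x + dx
                if ny >= h or nx >= w:
                    continue
                a, b = depth[y][x], depth[ny][nx]
                if a <= 0 or b <= 0:  # assumed: 0 means no depth reading
                    continue
                if abs(a - b) > jump:
                    if a > b:
                        edges[y][x] = True
                    else:
                        edges[ny][nx] = True
    return edges


def label_clusters(depth, jump=0.3):
    """Label 4-connected regions of valid, non-edge pixels.

    Returns (labels, count), where labels[y][x] is 0 for edge or invalid
    pixels and 1..count for pixels belonging to a cluster.
    """
    h, w = len(depth), len(depth[0])
    edges = depth_edges(depth, jump)
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if labels[y][x] or edges[y][x] or depth[y][x] <= 0:
                continue
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:  # breadth-first flood fill of one cluster
                cy, cx = queue.popleft()
                for dy, dx in NEIGHBOURS_4:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and not labels[ny][nx]
                            and not edges[ny][nx]
                            and depth[ny][nx] > 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count
```

For a small map in which a near surface at 1.0 m occupies the left three columns and a far surface at 3.0 m occupies the rest, this sketch yields two clusters separated by a one-pixel column of edge pixels. The resulting label image is a plain 2D array, so 2D shape descriptors can be computed on each cluster directly.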