This paper presents a study on natural language to sign language translation for human-robot interaction applications. Using the proposed methodology, the humanoid robot TEO is expected to perform Spanish sign language automatically, converting text into movements by means of neural networks. Natural language to sign language translation poses several challenges for developers, such as the mismatch between the lengths of the input and output sequences and the use of non-manual markers. For this reason, neural networks, and in particular sequence-to-sequence models, are selected as a data-driven approach, avoiding traditional expert-system designs and fixed temporal-dependency constraints that lead to restricted or overly complex translation systems. To achieve these objectives, human skeleton acquisition is required in order to collect the signing input data. OpenPose and skeletonRetriever are proposed for this purpose, and a 3D sensor specification study is carried out to select the most suitable acquisition hardware.
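To illustrate why a sequence-to-sequence model handles the length mismatch between input and output, the sketch below shows a minimal encoder-decoder mapping a text sequence of one length to a pose sequence of another length. The GRU choice, layer sizes, and the 25-keypoint (75-value) pose representation are illustrative assumptions, not the architecture used in this work.

```python
# Minimal seq2seq sketch (PyTorch): a text encoder summarizes a sentence of
# arbitrary length; a decoder unrolls it into an arbitrary number of pose frames.
# All dimensions below are assumptions for illustration only.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, tokens):                 # tokens: (batch, src_len)
        _, hidden = self.rnn(self.embed(tokens))
        return hidden                          # sentence summary: (1, batch, hid_dim)

class Decoder(nn.Module):
    def __init__(self, pose_dim=75, hid_dim=128):
        super().__init__()
        self.rnn = nn.GRU(pose_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, pose_dim)

    def forward(self, prev_poses, hidden):     # prev_poses: (batch, tgt_len, pose_dim)
        output, hidden = self.rnn(prev_poses, hidden)
        return self.out(output), hidden        # predicted keypoint frames

# Usage: a 6-token sentence decoded into 40 frames of 25 3D keypoints (75 values).
enc, dec = Encoder(vocab_size=1000), Decoder()
sentence = torch.randint(0, 1000, (1, 6))
target_poses = torch.zeros(1, 40, 75)          # teacher-forcing input at train time
pred, _ = dec(target_poses, enc(sentence))
print(pred.shape)                              # torch.Size([1, 40, 75])
```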