Continuous Goal-Directed Actions: Advances in Robot Learning [Online]
Grade: Sobresaliente “Cum Laude” (highest distinction)
Robot Programming by Demonstration (PbD) has several limitations. This thesis addresses these shortcomings by drawing inspiration from goal-directed imitation and applying it to robots. A framework for goal imitation, called Continuous Goal-Directed Actions (CGDA), has been designed and developed. This framework provides a mechanism to encode actions as changes in the environment: CGDA learns the objective of the action, beyond the movements made to perform it. With CGDA, an action such as “painting a wall” can be learned as “the wall changed its color by 50% from blue to red”. Traditional robot imitation paradigms such as PbD would learn the same action as “move joint i 30 degrees, then joint j 43 degrees...”.

The main contribution of this thesis is a framework able to measure and generalize the effects of actions, together with novel metrics to compare and reproduce goal-directed actions. Encoding actions in terms of goals makes their reproduction independent of the robot configuration, which circumvents the correspondence problem (adapting kinematic parameters from humans to robots).

CGDA can complement current kinematics-focused paradigms in robot imitation, such as PbD. CGDA encodes an action through the changes it produces in the features of the objects altered during the action. Features can be any measurable characteristic of an object, such as color, area, or shape. By tracking object features during human demonstrations, a high-dimensional feature trajectory is created. This trajectory represents a fine-grained sequence of the objects’ temporal states during the action, and it is the main resource for the generalization, recognition, and execution of actions in CGDA.

Around this framework, several components have been added to facilitate and improve imitation. Naïve implementations of robot learning frameworks usually assume that all data from the user demonstrations has been correctly sensed and is relevant to the task. This assumption proves wrong in most human-demonstrated learning scenarios. This thesis presents an automatic demonstration and feature selection process to solve this issue: a machine learning pipeline called Dissimilarity Mapping Filtering (DMF), which can filter out both irrelevant demonstrations and irrelevant features.
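
The three DMF stages (dissimilarity, mapping, filtering) can be sketched as follows; the Euclidean dissimilarity, sum mapping, and standard-deviation threshold below are illustrative choices, not necessarily those used in the thesis:

```python
import math

def dmf_filter(demos, threshold=1.5):
    """Dissimilarity Mapping Filtering sketch (assumed parameters):
    1. Dissimilarity: pairwise Euclidean distance between demonstrations.
    2. Mapping: reduce each demo's distances to a single score (their sum).
    3. Filtering: discard demos whose score exceeds mean + threshold*std.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    scores = [sum(dist(d, e) for e in demos) for d in demos]
    mean = sum(scores) / len(scores)
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    return [d for d, s in zip(demos, scores) if s <= mean + threshold * std]

# Three consistent demonstrations and one outlier (toy feature vectors).
demos = [[0.0, 0.5, 1.0], [0.1, 0.5, 0.9], [0.0, 0.4, 1.0], [5.0, 5.0, 5.0]]
kept = dmf_filter(demos)
```

The same scheme can be applied across feature dimensions instead of across demonstrations to discard irrelevant features.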

Once an action has been generalized from a series of correct human demonstrations, the robot must be provided with a method to reproduce it. Robot joint trajectories are computed in simulation using evolutionary computation, through several proposed strategies. This computation can be improved through human-robot interaction: specifically, a system has been developed for robot discovery of motor primitives from random human-guided movements. These Guided Motor Primitives (GMP) are combined to reproduce goal-directed actions.
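
The evolutionary step can be sketched minimally as follows. This is a generic mutate-and-select loop against a toy feature simulator, written as an assumption for illustration; the thesis’ actual strategies and simulator differ:

```python
import random

def evolve_trajectory(goal_features, simulate, n_joints=2, steps=3,
                      generations=200, pop=20, sigma=0.1, seed=0):
    """Minimal evolutionary sketch: mutate candidate joint trajectories
    and keep the one whose simulated feature trajectory is closest
    (squared error) to the goal feature trajectory."""
    rng = random.Random(seed)

    def cost(cand):
        feats = simulate(cand)
        return sum((f - g) ** 2 for f, g in zip(feats, goal_features))

    best = [[0.0] * n_joints for _ in range(steps)]
    best_cost = cost(best)
    for _ in range(generations):
        for _ in range(pop):
            cand = [[q + rng.gauss(0, sigma) for q in step] for step in best]
            c = cost(cand)
            if c < best_cost:
                best, best_cost = cand, c
    return best, best_cost

# Toy simulator (assumption): the tracked feature at each step is the
# cumulative sum of joint displacements, standing in for a physics engine.
def simulate(traj):
    feats, acc = [], 0.0
    for step in traj:
        acc += sum(step)
        feats.append(acc)
    return feats

goal = [0.2, 0.5, 0.9]  # desired feature trajectory (e.g. painted ratio)
traj, err = evolve_trajectory(goal, simulate)
```

Because the cost is measured purely in feature space, the same goal trajectory can be reproduced by robots with different kinematics, which is the point of goal-level encoding.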

To test all these developments, experiments have been performed with a humanoid robot in a simulated environment and with the real full-sized humanoid robot TEO. A brief analysis of the cybersecurity of current robots is additionally presented in the final appendices of this thesis.

Keywords: robot learning, humanoid robots, goal-directed actions, motor primitives, feature selection, demonstration selection, cryptobotics.

Universidad Carlos III de Madrid