A robust perception system is crucial for natural human–robot interaction. An essential capability of such a system is to provide a rich representation of the robot’s environment, typically by combining multiple sensory sources. This information, in turn, allows the robot to react to both external stimuli and user responses. The novel contribution of this paper is a perception architecture that is based on the bio-inspired concept of endogenous attention and integrated into a real social robot. The architecture is defined at a theoretical level, to provide insights into the underlying bio-inspired mechanisms, and at a practical level, to integrate and test it within the complete architecture of a robot. We also define mechanisms to establish the most salient stimulus for the detection or task in question. Furthermore, the attention-based architecture uses information from the robot’s decision-making system to produce user responses and robot decisions. Finally, this paper presents preliminary test results from the integration of this architecture into a real social robot.