User-guided framework for scene generation using diffusion models
2024 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC)
Paredes de Coura, Portugal

In recent decades, one branch of robotics has focused on emulating specific human capabilities, such as manipulation, perception, and navigation, to accomplish daily tasks. However, the way humans solve these tasks extends beyond these conventional capabilities and includes more complex faculties such as imagination, which allows, among other things, the prediction of scenarios. This research introduces a novel architecture designed to implement imaginative capacities in robots through the use of Large Diffusion Models. The proposed framework comprises a perception stage, followed by a solution generation stage and a final selection algorithm. With this architecture, the robot can reason about and imagine new situations with or without context, depending on the input information given. To achieve the best trade-off between performance and execution time, different diffusion models were systematically evaluated in a zero-shot configuration. We also conducted tests to assess the accuracy of the method and how useful users find the application. These tests covered various scenes and different applications, such as rearranging objects, setting the table for dinner, or tidying a messy desktop.
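
The three stages named in the abstract (perception, solution generation, selection) can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`build_prompt`, `generate_candidates`, `select_best`, `toy_sampler`, `toy_scorer`) is hypothetical, the diffusion model is replaced by a toy text sampler, and the scoring rule is an assumed placeholder for whatever selection criterion the paper uses.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Candidate:
    seed: int
    content: str        # stand-in for a generated image
    score: float = 0.0


def build_prompt(task: str, detected_objects: Sequence[str]) -> str:
    """Perception stage (assumed): merge the task with any detected context."""
    if detected_objects:
        return f"{task} with " + ", ".join(detected_objects)
    return task  # no context available, the task alone is the prompt


def generate_candidates(
    prompt: str, n: int, sampler: Callable[[str, int], str]
) -> List[Candidate]:
    """Generation stage (assumed): draw n candidates, one per seed."""
    return [Candidate(seed=s, content=sampler(prompt, s)) for s in range(n)]


def select_best(
    candidates: List[Candidate], scorer: Callable[[str], float]
) -> Candidate:
    """Selection stage (assumed): score every candidate, keep the highest."""
    if not candidates:
        raise ValueError("no candidates to select from")
    for c in candidates:
        c.score = scorer(c.content)
    return max(candidates, key=lambda c: c.score)


# Toy stand-ins so the sketch runs without any diffusion model.
OBJECTS = ["plate", "fork", "glass"]


def toy_sampler(prompt: str, seed: int) -> str:
    # Placeholder for a diffusion model call: it ignores the prompt and
    # simulates imperfect generation by dropping the object at index `seed`.
    kept = [o for i, o in enumerate(OBJECTS) if i != seed]
    return "a table with " + ", ".join(kept)


def toy_scorer(content: str) -> float:
    # Placeholder criterion: count how many requested objects appear.
    return float(sum(obj in content for obj in OBJECTS))


prompt = build_prompt("set the table for dinner", OBJECTS)
candidates = generate_candidates(prompt, 4, toy_sampler)
best = select_best(candidates, toy_scorer)
print(best.seed, best.score)
```

In this toy run, only the candidate that keeps all three objects reaches the top score, so the selection step returns it. In a real system the sampler would wrap a diffusion model and the scorer would encode the task's own success criterion.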

First page: 22
Last page: 27