The rearrangement of objects is an essential task in daily human life. Humans subconsciously break such tasks into three components: perception, reasoning, and execution, resolving them almost automatically. This process poses a significant challenge for robots, which must apply complex logic to process all of this information and successfully execute the task. In this research, we propose a solution that performs rearrangement tasks in a human-like manner. To this end, we developed a modular framework capable of observing and understanding the scene, imagining the best solution, and executing it, following human-like reasoning. This is achieved by combining a zero-shot deep learning model for perception, a zero-shot large diffusion model that generates an ordered and realistic final scene, and a Learning from Demonstration algorithm for execution. To evaluate performance, we conducted several experiments on 2D rearrangement tasks: we tested the feasibility of the generated final scene, the ability to generate trajectories from human demonstrations, and, finally, the full pipeline with two different robots in simulated and real environments. The results demonstrate the adaptability of our framework to different environments, objects, and robots. Moreover, the success rate of the proposed solutions and the low position and orientation errors show a significant advance in the accuracy and effectiveness of solving rearrangement tasks.
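
For concreteness, the following minimal Python sketch illustrates how the three modules described above could be chained into a perception, reasoning, and execution pipeline. All names, signatures, and placeholder bodies here are our illustrative assumptions for exposition, not the framework's actual implementation.

```python
# Hypothetical sketch of the three-stage pipeline: perception ->
# goal-scene generation -> Learning-from-Demonstration execution.
# Every function body below is a stand-in; a real system would call
# the corresponding learned model instead.
from dataclasses import dataclass
from typing import List


@dataclass
class ObjectPose:
    """2D pose of one detected object in the workspace."""
    name: str
    x: float      # position (metres)
    y: float
    theta: float  # orientation (radians)


def perceive(image) -> List[ObjectPose]:
    # Placeholder for the zero-shot perception model: detect each
    # object in the observed scene and estimate its 2D pose.
    return [ObjectPose("cup", 0.30, 0.10, 0.0),
            ObjectPose("book", 0.05, 0.25, 1.2)]


def imagine_goal(objects: List[ObjectPose]) -> List[ObjectPose]:
    # Placeholder for the zero-shot diffusion model: propose an
    # ordered, realistic final arrangement of the detected objects.
    return [ObjectPose(o.name, 0.10 * i, 0.0, 0.0)
            for i, o in enumerate(objects)]


def execute(start: ObjectPose, goal: ObjectPose) -> None:
    # Placeholder for the LfD policy: adapt a demonstrated
    # pick-and-place trajectory to the new start and goal poses.
    print(f"move {start.name}: ({start.x:.2f}, {start.y:.2f}) "
          f"-> ({goal.x:.2f}, {goal.y:.2f})")


def rearrange(image) -> None:
    current = perceive(image)        # 1. observe and understand the scene
    target = imagine_goal(current)   # 2. imagine the final scene
    for start, goal in zip(current, target):
        execute(start, goal)         # 3. execute object by object


if __name__ == "__main__":
    rearrange(image=None)  # 'image' would be a camera frame in practice
```

The modular structure is the point of the sketch: because each stage communicates only through object poses, any of the three models can be swapped without changing the others.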