A key component of any embodied AGI system is the ability to understand its immediate surroundings from a stream of raw data supplied by real or simulated sensors. Data sets for evaluating how well a system recovers spatial structure from such raw data can be found at the following sites:
Radish (the Robotics Data Set Repository) contains data obtained mainly from laser and sonar sensors.
Rawseeds provides multi-sensor data sets, including ground truth and benchmark problems.
The advantage of using such data sets is that they permit a fair evaluation of the performance of different systems, and do not require the researcher to possess either an elaborate simulation environment or a physical robot.
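As a rough illustration of what such an evaluation might look like, the sketch below compares an estimated 2D trajectory against ground truth using root-mean-square position error. This is only a minimal example under stated assumptions: the function name `trajectory_rmse` is hypothetical, the two trajectories are assumed to be already time-aligned and expressed in the same coordinate frame, and the metric is a generic one, not necessarily the one defined by the Radish or Rawseeds benchmarks. The sample coordinates are made up for illustration.

```python
import math


def trajectory_rmse(estimated, ground_truth):
    """Root-mean-square position error between two 2D trajectories.

    Both arguments are sequences of (x, y) tuples. They are assumed to be
    time-aligned (pose i in one corresponds to pose i in the other) and
    expressed in the same coordinate frame.
    """
    if len(estimated) != len(ground_truth):
        raise ValueError("trajectories must have the same number of poses")
    if not estimated:
        raise ValueError("trajectories must not be empty")

    # Squared Euclidean distance between each pair of corresponding poses.
    squared_errors = [
        (ex - gx) ** 2 + (ey - gy) ** 2
        for (ex, ey), (gx, gy) in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up illustrative values, not taken from any real data set.
ground_truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
estimated = [(0.0, 0.0), (1.0, 0.3), (2.0, 0.4)]

error = trajectory_rmse(estimated, ground_truth)
print(f"RMSE position error: {error:.3f}")
```

Because every system is scored against the same recorded sensor data and the same ground truth, a number such as this one can be compared directly across different approaches.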
Any AGI system claiming to have animal-like or human-like perception capabilities should be expected to perform well on these problems.