An iCub Evolving Reaching and Grasping Skills
Tomassino Ferrauto and Stefano Nolfi
1. Introduction
In this document we describe how a simulated iCub robot provided with a neural network controller can evolve integrated reaching and grasping capabilities that enable it to reach a ball located in varying positions over a table (see Fig. 1), grasp it, and hold and lift it. Besides the difficulty of controlling an articulated arm with many DOFs, this is a rather challenging task because it requires interacting with physical objects (including a sphere that can easily roll away from the robot's peripersonal space) and integrating three interdependent behaviours (reaching, grasping and lifting). We also explain how this experiment can be replicated by using FARSA.
Fig. 1. The robot and the environment. The yellow squares on the table show the areas where the object can be located.
2. Method
The sensory system (Fig. 2, bottom layer) includes two neurons that encode the offset between the sphere and the hand over the visual plane (dx, dy), four neurons that encode the current angular position of the pitch and yaw DOFs of the neck (n0, n1) and of the torso (t0, t1), and nine neurons that encode in binary form whether the 5 tactile sensors located on the fingertips (Rf1, Rf2, Rf3, Rf4, Rf5) and the 4 tactile sensors located on the palm (Rp1, Rp2, Rp3, Rp4) are stimulated.
The motor system (Fig. 2, top layer) includes two motor neurons that control the desired angular position of the pitch and yaw DOFs of the torso (T0 and T1), seven motor neurons that control the desired angular position of the 7 corresponding DOFs of the right arm and wrist (RA0, RA1, RA2, RA3, RA4, RA5 and RA6), and a right-hand motor neuron (RF0) that controls the desired angular position of all joints of the hand (except finger abduction/adduction). This means that all fingers extend and flex together.
Fig. 2. The architecture of the robot's neural controller. The connection weights and biases of the neural circuit shown on the left are manually set and fixed. All other connection weights and biases are adapted.
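The sensor and motor layout described above can be summarised in a few lines of Python. This is only an illustrative sketch, not the FARSA plugin code: the class `SensorReadings`, the function `build_input_vector` and the ordering of the inputs are assumptions made for this example, and the values are assumed to be already scaled by the caller, since the normalisation used in the experiment is not described here.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SensorReadings:
    """Hypothetical container for one time step of sensory data."""
    dx: float                         # first component of the hand-sphere offset
    dy: float                         # second component of the hand-sphere offset
    neck: Sequence[float]             # n0, n1: neck pitch and yaw positions
    torso: Sequence[float]            # t0, t1: torso pitch and yaw positions
    fingertip_touch: Sequence[bool]   # Rf1..Rf5: fingertip tactile sensors
    palm_touch: Sequence[bool]        # Rp1..Rp4: palm tactile sensors


def build_input_vector(r: SensorReadings) -> List[float]:
    """Concatenate the 2 + 4 + 9 = 15 sensory values into one list."""
    if len(r.neck) != 2 or len(r.torso) != 2:
        raise ValueError("neck and torso must each provide 2 values")
    if len(r.fingertip_touch) != 5 or len(r.palm_touch) != 4:
        raise ValueError("expected 5 fingertip and 4 palm tactile values")
    # Tactile sensors are encoded in binary form: 1.0 if stimulated, else 0.0.
    tactile = [1.0 if t else 0.0 for t in (*r.fingertip_touch, *r.palm_touch)]
    return [r.dx, r.dy, *r.neck, *r.torso, *tactile]


# The ten motor neurons named in the text: torso, right arm and wrist, hand.
MOTOR_NAMES = ["T0", "T1"] + [f"RA{i}" for i in range(7)] + ["RF0"]

readings = SensorReadings(0.1, -0.2, [0.0, 0.0], [0.0, 0.0],
                          [False] * 5, [True, False, False, False])
inputs = build_input_vector(readings)
print(len(inputs), len(MOTOR_NAMES))
```

Under these assumptions the controller has 15 sensory inputs and 10 motor outputs.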
At the beginning of each trial the sphere is placed in a random position inside one of four square areas with a side of 4 cm (see Fig. 1). The first two of these areas are located in front of the iCub at a distance of 25 cm and 35 cm; the other two are located 10 cm to the left and to the right, at a distance of 30 cm. Each trial lasts 300 time steps plus 10 additional time steps during which the plane is removed to verify whether or not the ball is held by the robot.
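The placement of the sphere can be sketched as follows. The coordinate frame (x as the distance in front of the robot, y as the lateral offset, both in metres), the assumption that the stated distances refer to the centre of each area, and the names `AREA_CENTRES` and `sample_sphere_position` are choices made for this example only. How the area is selected for a given trial is not specified in the text, so the area index is left to the caller.

```python
import random
from typing import Tuple

SIDE = 0.04  # side of each square area, in metres (4 cm)

# Assumed centres (x = distance in front of the robot, y = lateral offset).
AREA_CENTRES = [
    (0.25, 0.00),   # in front, 25 cm
    (0.35, 0.00),   # in front, 35 cm
    (0.30, 0.10),   # 10 cm to one side, 30 cm away
    (0.30, -0.10),  # 10 cm to the other side, 30 cm away
]

TRIAL_STEPS = 300       # regular time steps per trial
HOLD_CHECK_STEPS = 10   # extra steps with the plane removed


def sample_sphere_position(area_index: int, rng: random.Random) -> Tuple[float, float]:
    """Draw a uniformly distributed position inside the chosen square area."""
    cx, cy = AREA_CENTRES[area_index]
    half = SIDE / 2.0
    return (cx + rng.uniform(-half, half), cy + rng.uniform(-half, half))


rng = random.Random(0)
for area in range(len(AREA_CENTRES)):
    x, y = sample_sphere_position(area, rng)
    print(f"area {area}: x={x:.3f} m, y={y:.3f} m")
```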
The fitness function includes 6 weighted components that reward the robots for bringing their hand near the object, touching the object with the palm, opening the fingers far from the object, closing the fingers near the object, closing the fingers around the object, and holding and elevating the object. These fitness components have been introduced to increase the robots' evolvability (i.e. the probability that random variations might lead to performance improvements) and to channel the adaptive process toward the acquisition of abilities that constitute a prerequisite for the development of a capacity to master the adaptive task.
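A weighted sum is one simple way to combine such components, and the sketch below assumes that form. The component names, the unit weights and the example scores are placeholders invented for this illustration. The actual weights and the way each component is computed are defined in the source code of the experiment plugin.

```python
from typing import Dict

# Hypothetical names for the six components listed in the text.
COMPONENTS = (
    "hand_near_object",
    "palm_touches_object",
    "fingers_open_far_from_object",
    "fingers_closed_near_object",
    "fingers_closed_around_object",
    "object_held_and_lifted",
)


def weighted_fitness(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted sum of the six component scores."""
    missing = [c for c in COMPONENTS if c not in scores or c not in weights]
    if missing:
        raise KeyError(f"missing components: {missing}")
    return sum(weights[c] * scores[c] for c in COMPONENTS)


# Placeholder weights and scores, for illustration only.
example_weights = {c: 1.0 for c in COMPONENTS}
example_scores = {
    "hand_near_object": 0.5,
    "palm_touches_object": 1.0,
    "fingers_open_far_from_object": 0.25,
    "fingers_closed_near_object": 0.25,
    "fingers_closed_around_object": 0.0,
    "object_held_and_lifted": 0.0,
}
print(weighted_fitness(example_scores, example_weights))
```

Rewarding the intermediate steps gives the evolutionary process a gradient to follow even when no robot in the population is yet able to lift the object.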
To support the evolution of robust behaviours while minimizing the simulation costs, the number of trials is initially set to 4 and is then increased to 8, 12, 16, 20, 24 and 28 as soon as an evolving robot successfully grasps and holds the object in 50%, 60%, 70%, 80%, 90% and 100% of the trials, respectively. Five replications of the experiment, each lasting 2000 generations, were run.
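The incremental schedule can be written as a small lookup. The sketch below reads the text as pairing each stage with one threshold (4 trials with 50%, 8 with 60%, and so on up to 24 with 100%). The function name `next_num_trials` and the use of integer arithmetic for the comparison are choices made for this example, not details taken from the plugin.

```python
# Success threshold (in percent) required to leave each stage, as read from
# the text: 4 trials -> 50%, 8 -> 60%, ..., 24 -> 100%.
THRESHOLD_PERCENT = {4: 50, 8: 60, 12: 70, 16: 80, 20: 90, 24: 100}
TRIALS_INCREMENT = 4


def next_num_trials(current_trials: int, successes: int) -> int:
    """Return the number of trials to use after an evaluation.

    `successes` is the number of trials in which the object was grasped
    and held. Stages without a threshold (e.g. the final 28-trial stage)
    are left unchanged.
    """
    threshold = THRESHOLD_PERCENT.get(current_trials)
    if threshold is None:
        return current_trials
    # Integer comparison avoids floating-point rounding of the percentage.
    if successes * 100 >= threshold * current_trials:
        return current_trials + TRIALS_INCREMENT
    return current_trials


print(next_num_trials(4, 2))    # 2 of 4 trials is 50%: move to 8 trials
print(next_num_trials(8, 4))    # 4 of 8 trials is 50%: stay at 8 trials
print(next_num_trials(24, 24))  # 24 of 24 trials is 100%: move to 28 trials
```

Starting with few trials keeps early generations cheap to simulate, while the later stages demand success over a larger sample of object positions.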
3. Results
By analysing the obtained results we observed that in all replications of the experiment the evolved robots display an ability to reach, grasp and hold spherical objects located in varying positions. In the best replication of the experiment, the best robot displays a rather robust capability that allows it to successfully carry out the task in 77% of the trials. This is a remarkable result given the rigidity of the robot body and the difficulty of physically interacting with spherical objects that can easily roll away from the peripersonal space of the robot.
Video 1. The behaviour displayed by one of the best evolved robots.
Visual inspection of the behavioural solutions displayed by these robots (see Video 1) shows the importance of the integration between the required elementary behaviours (i.e. reaching, grasping, and lifting) and of the way in which they are combined over time. Indeed, the best evolved robots reach the object by bending the torso toward the table and by carefully pressing the ball against the table so as to block it while the fingers are wrapped around the object. This clearly demonstrates the importance of the fact that the reaching and grasping abilities have been co-evolved so as to serve a common function.
4. Replicating the experiment with FARSA
This study is included among the example experiments of FARSA, a free open-source software tool for Autonomous Robotics Simulation and Analysis.
To replicate and vary it you need to install the FARSA tool, run the Total99 graphic interface, load the .ini project configuration file contained in the "GraspExperiment/conf" directory, and finally run the experiment by clicking on the "Create/configure experiment" icon. If the program does not find the .dll plugin file of the experiment, load it manually from the directory in which it is located with the file->load_plugin command, or include the right directory in the Plugin Path graphic window so that the program can find it automatically.
The example includes the results of five replications of the evolutionary process. The first thing you might want to do is open the statistic viewer widget (from the View Menu) and then press the "Load All Stat" button to see how the fitness varied across generations in the five replications of the experiment. You can then choose an individual to test by opening the "Individual to test" widget, selecting a B0SX.gen file (where X is the seed of the replication) and then selecting the best individual of a specific generation (from 0 to 2000) by clicking on the generation number. The following table shows, for each replication of the experiment, the success percentage of the best individual and its generation. The experiment replications are identified by the seed number.
Seed number | 13 | 14 | 15 | 16 | 17 |
---|---|---|---|---|---|
Success percentage | 77% | 74% | 69% | 66% | 50% |
Generation | 1526 | 1529 | 1923 | 1920 | 1644 |
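As a convenience, the values in the table can be turned into the names of the files to select and the generations to click on. The snippet below is just a sketch: the dictionary `BEST` copies the table, and `gen_file_name` applies the B0SX.gen naming pattern mentioned above. Both names are hypothetical and exist only in this example.

```python
# seed -> (success percentage of the best individual, generation), from the table.
BEST = {
    13: (77, 1526),
    14: (74, 1529),
    15: (69, 1923),
    16: (66, 1920),
    17: (50, 1644),
}


def gen_file_name(seed: int) -> str:
    """Build the B0SX.gen file name, where X is the seed of the replication."""
    return f"B0S{seed}.gen"


for seed, (success, generation) in BEST.items():
    print(f"{gen_file_name(seed)}: generation {generation}, {success}% success")

# Average success of the best individuals over the five replications.
mean_success = sum(success for success, _ in BEST.values()) / len(BEST)
print(f"mean success: {mean_success:.1f}%")
```

Averaged over the five replications, the best individuals succeed in 67.2% of the trials.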
Please note that the performance reported in the table might vary slightly across operating systems and installations because of small differences between versions of the Newton library. You can then open the RenderWorld graphic widget and observe the behaviour of the selected robot with the Test->Selected individual command. You can also inspect the architecture and the parameters of the network by opening the Evonet->Nervous_System graphic widget, and the activation state of the neurons over time by opening the Evonet->Neurons_Monitor graphic widget.
You can inspect the parameters in the "Parameters" panel of the Total99 graphic interface. If you want to run a new evolutionary experiment, you can change the seed parameter and any other parameter you like after loading the .ini file but before configuring the experiment. Once the parameters are set, you can configure the experiment and run the evolutionary process through the graphic interface (with the command action->evolve) or, better, run the evolutionary process in batch mode with the "total99 --batch --file=configuration.ini --action=evolve" command.
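If you want to launch batch runs from a script, the command quoted above can be assembled programmatically. The sketch below is a hypothetical wrapper: the functions `build_batch_command` and `run_batch` are not part of FARSA, only the command-line options shown in the text are used, and the `total99` executable is assumed to be on the system path.

```python
import shlex
import subprocess
from typing import List


def build_batch_command(ini_file: str, executable: str = "total99") -> List[str]:
    """Assemble the batch-mode command quoted in the text as an argument list."""
    return [executable, "--batch", f"--file={ini_file}", "--action=evolve"]


def run_batch(ini_file: str) -> int:
    """Run the evolutionary process in batch mode and return the exit code.

    Not called here, because it requires a working FARSA installation.
    """
    completed = subprocess.run(build_batch_command(ini_file), check=False)
    return completed.returncode


# Show the command that would be executed.
print(shlex.join(build_batch_command("configuration.ini")))
```

Passing the arguments as a list, rather than as a single string, avoids quoting problems when the path of the configuration file contains spaces.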
The source code of the experiment plugin is included in the directory of the experiment described above, so you can browse and modify it as well. For more detailed information on FARSA, see the FARSA documentation.
References
Massera G., Ferrauto T., Gigliotta O., Nolfi S. (2013) Designing adaptive humanoid robots through the FARSA open-source framework. Submitted.