This article was submitted to Bio-Inspired Robotics, a section of the journal Frontiers in Robotics and AI
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
From a very early age, we learn to walk well by using natural supports, such as the hands of our parents, chairs, and training wheels, and we bootstrap new knowledge from what we have recently acquired. The idea behind bootstrapping is to use knowledge previously acquired from simpler tasks to accelerate the learning of more complicated ones. In this paper, we propose a scaffolded learning method from an evolutionary perspective, in which a biped creature achieves stable and independent bipedal walking while exploiting the natural scaffold of its changing morphology to create a third limb. The novelty of this work is that it speeds up the learning process with artificially recreated scaffolded learning. We compare three conditions of scaffolded learning (free, time-constrained, and performance-based) for reaching bipedalism, and we show that a performance-based scaffold, which is set by the walking velocity obtained, is the most conducive to bootstrapping the learning of bipedal walking. The scope of this work is not to study bipedal locomotion but to investigate how scaffolded learning contributes to a faster learning process. Beyond a pedagogical experiment, this work presents a powerful tool to accelerate the learning of complex tasks in the field of Robotics.
Scaffolding is a learner-centered teaching method based on the constructivist learning theory, aiming at cultivating the problem-solving ability and autonomous learning ability of the students. Pedagogy explains it as providing small-step clues or hints (scaffolds) for students to learn step by step to discover and solve problems gradually. This method leads to students mastering the knowledge to be learned, improving their problem-solving ability, and eventually growing into independent learners. Vygotsky, a famous psychologist in the former Soviet Union, derived this teaching idea from the “zone of proximal development” theory (
In this article, we present a scaffolded learning method for bipedalism to bootstrap its ontogenetic development with gradual morphological and control changes (
The general idea of scaffolded learning can be exemplified with a child on training wheels
Genetic Algorithms (GA) (
As shown in the pseudo-code of Algorithm 1, we choose the population size
Virtual Model Control (VMC), developed by
The virtual model in our simulation. We attach linear springs and dampers to the hip position of the individual as the granny walker mechanism, to maintain a constant height, and the dog-track bunny mechanism applies a virtual force in the forward horizontal direction to obtain the desired velocity. In addition, it has a torsional spring and a rotary damper acting on the hip joint to keep the upper body straight in the standing phase, while in the step phase, the hip joint only has a torsional spring with
Virtual Model Parameters.
| Parameter | Description |
|---|---|
| | Half size of the body in |
| | Half size of the body in |
| | Half size of the body in |
| | Half size of the foot in |
| | Half size of the foot in |
| | Half size of the foot in |
| | Femur length |
| | Tibia length |
| | Spring stiffness in |
| | Spring stiffness in |
| | Spring stiffness in |
| | Spring stiffness in the step phase |
| | Damping coefficient in |
| | Damping coefficient in |
| | Damping coefficient in |
| | Offset of the natural length of the virtual spring in the stand phase |
| | Offset of the natural length of the virtual spring in the step phase |
| | Velocity threshold of the transition from the double support to the single support |
In the standing phase, both feet are on the floor. We use the forward kinematics from the foot coordinate frame
Then we can calculate the Jacobian by first-order partial derivatives of the pose with respect to each variable.
Due to the spring-damper components, we can obtain these forces by the following control laws.
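The control laws themselves are not reproduced in this text. As a minimal sketch of the general technique, assuming a generic linear spring-damper law and the Jacobian-transpose mapping commonly used in virtual model control, the virtual force and the resulting joint torques could be computed as below. The function names, gains, and Jacobian entries are hypothetical and are not taken from the paper.

```python
def spring_damper_force(stiffness, damping, rest_position, position, velocity):
    """Virtual force of a linear spring-damper pulling towards rest_position."""
    return stiffness * (rest_position - position) - damping * velocity


def joint_torques(jacobian, virtual_force):
    """Map a task-space force to joint torques: tau = J^T * F."""
    n_rows = len(jacobian)
    n_joints = len(jacobian[0])
    return [
        sum(jacobian[row][joint] * virtual_force[row] for row in range(n_rows))
        for joint in range(n_joints)
    ]


# Hypothetical state: the hip is 0.05 m below its target height and sinking at 0.1 m/s.
f_z = spring_damper_force(stiffness=1000.0, damping=50.0,
                          rest_position=0.90, position=0.85, velocity=-0.1)

# Hypothetical 2x3 Jacobian (rows: x, z; columns: ankle, knee, hip).
J = [[0.8, 0.4, 0.0],
     [0.1, 0.3, 0.0]]

# Only the vertical virtual force is applied in this example.
tau = joint_torques(J, [0.0, f_z])
```

Sign conventions for the force and the Jacobian depend on how the coordinate frames are defined, so the signs here should be read as one possible choice.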
In the step phase, one leg needs to swing to take a step forward, so the joint torques of two legs are different. For the one whose foot is on the floor, the joint torques of the ankle, the knee, and the hip are the same as the standing phase, while for the swing leg, the joint torques of the ankle and the knee become zero, and the joint torque of the hip
We choose a finite state machine (
The finite state machine in the bipedal walking algorithm. There are three states during the walking cycle to allow transitions between different virtual components. The first state is double support, in which both feet are in contact with the floor. The second state, left support, has only the left foot in contact with the floor. Similarly, the third state, right support, has only the right foot on the floor. The arrows indicate the conditions that need to be met for each transition. “Delay” means the delay time that allows the swing leg to fall to the ground, so that the single support phase can transition to the double support phase. “Stable” means the creature has the possibility to walk forward, while in our work, we set a velocity threshold parameter
Transitions of Walking State Machine.
State | Trigger Event | Virtual Component (a) |
---|---|---|
Double Support | Delay after left or right support | VC1 & VC2 |
Left Support | Move right foot forwards | VC1 & VC3 |
Right Support | Move left foot forwards | VC1 & VC4 |
(a) Granny walker (VC1), dog-track bunny (VC2), swing the right leg (VC3), swing the left leg (VC4).
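The state machine described above can be sketched as follows. This is an illustrative sketch only: the class and attribute names are hypothetical, the "Stable" condition is assumed to be a forward velocity at or above the velocity threshold, and the choice of which leg swings first (with strict alternation afterwards) is also an assumption.

```python
from enum import Enum


class State(Enum):
    DOUBLE_SUPPORT = "double support"
    LEFT_SUPPORT = "left support"
    RIGHT_SUPPORT = "right support"


# Virtual components active in each state, as listed in the transition table.
ACTIVE_COMPONENTS = {
    State.DOUBLE_SUPPORT: ("VC1", "VC2"),
    State.LEFT_SUPPORT: ("VC1", "VC3"),
    State.RIGHT_SUPPORT: ("VC1", "VC4"),
}


class WalkingStateMachine:
    def __init__(self, velocity_threshold, delay):
        self.velocity_threshold = velocity_threshold  # double -> single support
        self.delay = delay                            # single -> double support
        self.state = State.DOUBLE_SUPPORT
        self.time_in_state = 0.0
        # Assumption: the first step is taken with the right leg (left support).
        self.next_support = State.LEFT_SUPPORT

    def _enter(self, state):
        self.state = state
        self.time_in_state = 0.0

    def step(self, forward_velocity, dt):
        """Advance the state machine by dt seconds and return the new state."""
        self.time_in_state += dt
        if self.state is State.DOUBLE_SUPPORT:
            # "Stable": assumed to mean the velocity threshold has been reached.
            if forward_velocity >= self.velocity_threshold:
                chosen = self.next_support
                self._enter(chosen)
                # Alternate the support leg for the following step.
                self.next_support = (State.RIGHT_SUPPORT
                                     if chosen is State.LEFT_SUPPORT
                                     else State.LEFT_SUPPORT)
        elif self.time_in_state >= self.delay:
            # "Delay": give the swing leg time to reach the ground.
            self._enter(State.DOUBLE_SUPPORT)
        return self.state


fsm = WalkingStateMachine(velocity_threshold=0.2, delay=0.3)
state = fsm.step(forward_velocity=0.5, dt=0.2)
components = ACTIVE_COMPONENTS[state]
```

Keeping the active virtual components in a lookup table separates the transition logic from the controller that applies each component.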
Simbody is a high-performance, open-source C++ library providing sophisticated treatment of articulated multibody systems with particular attention to the needs of biomedical simulations. It is useful for predictive dynamic simulations of diverse biological systems such as neuromuscular biomechanical models and coarse-grained biomolecular modelling. It is also well suited to related simulation domains such as robotics, avatar simulations, and controls, and provides real-time capabilities that make it useful for interactive scientific simulations and virtual worlds (
The simulation was conducted on a DELL OptiPlex 7060 series desktop running Ubuntu 18.04, with an i7-8700 processor (12 threads) and 32 GB of internal memory. We terminate the learning process of an individual if it falls, that is, if the upper body drops below one half of the total body height, or if it exceeds a time limit of 15 seconds. The foot collision and ground reaction forces are realized by the Simbody simulator using
Our ultimate goal in this work is to evolve independent bipedal walking in 15 seconds, where the fitness function aims to maximize the forward walking speed (the forward walking distance traveled divided by the total leg length). Here, we refer to the walking Froude number (
It is important to note that in a simulation environment creatures can have drastically different sizes. It is equally easy to make a 1 m creature and a 1 km creature, and it would be very unfair if one step from the bigger creature counted for more than 100 steps from the smaller one. The use of the Froude number keeps a level playing field between creatures with different body structures.
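To make the size argument concrete, the sketch below uses the standard definition of the Froude number for walking, Fr = v² / (g L), with forward speed v, gravitational acceleration g, and leg length L. The exact fitness expression used in the paper is not reproduced in this text, so this is an assumption for illustration; the function name and the example speeds are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def froude_number(speed, leg_length, gravity=G):
    """Dimensionless walking speed: Fr = v^2 / (g * L)."""
    if leg_length <= 0:
        raise ValueError("leg_length must be positive")
    return speed ** 2 / (gravity * leg_length)


# A creature with 1 m legs walking at 1 m/s.
small = froude_number(speed=1.0, leg_length=1.0)

# A creature with 1 km legs walking at 10 m/s: ten times faster in absolute
# terms, yet slower relative to its own size.
large = froude_number(speed=10.0, leg_length=1000.0)
```

Under this definition the small creature obtains the higher score even though the large one covers more ground per second, which is the size-independence the normalization is meant to provide.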
Since we start the simulation from supported tripod walking, there will be tripod walking individuals, bipedal walking individuals, and even individuals with alternating gaits. We use this behaviour as a metric to measure the performance of bipedal walkers. We also observe the growth rate of the fitness and the degree of body length decay as the other two performance metrics. We run the following three cases, each with a different body length constraint, for 4,000 generations. To verify the repeatability and reliability of our results, we run three replicates of each case.
Guided by the current fitness, the genetic algorithm selects suitable combinations of body parameters and control parameters. We let the body length evolve freely without additional restrictions. The admissible body length is set as long as possible so that the body can serve as a scaffold for the biped creature: the upper bound of the body length is 1.8 m, based on the leg length of the initial individual, while the lower bound is 0.05 m, equal to the diameter of the leg.
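As a minimal sketch of how a body length gene could evolve freely within these bounds, the snippet below applies a Gaussian mutation and clamps the result to the 0.05 m to 1.8 m range given in the text. The mutation operator, its step size `sigma`, and the function name are assumptions made for illustration; the text does not specify the operator used by the authors.

```python
import random

BODY_LENGTH_MIN = 0.05  # m, equal to the leg diameter (from the text)
BODY_LENGTH_MAX = 1.80  # m, upper bound in the free case (from the text)


def mutate_body_length(body_length, upper_bound=BODY_LENGTH_MAX,
                       sigma=0.05, rng=random):
    """Gaussian mutation clamped to [BODY_LENGTH_MIN, upper_bound]."""
    candidate = body_length + rng.gauss(0.0, sigma)
    return min(max(candidate, BODY_LENGTH_MIN), upper_bound)


# Example: mutate the initial, longest body a few times with a fixed seed.
rng = random.Random(42)
lengths = [mutate_body_length(1.8, rng=rng) for _ in range(5)]
```

Passing `upper_bound` as an argument lets the same operator be reused when the scaffold shrinks the admissible range.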
To mimic the gradual reduction of external stability support for the body during the development of bipedal walking, we keep the lower bound at 0.05 m and restrict the body length by shortening its upper bound proportionally as the generations increase. The formula of the upper bound is as follows:
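The formula is not reproduced in this text. As an illustrative sketch only, assuming a linear decay from the initial 1.8 m upper bound down to the 0.05 m lower bound over the 4,000 generations, a time-constrained schedule could look like this. The linear form, the end point, and the function name are assumptions and may differ from the authors' formula.

```python
def time_constrained_upper_bound(generation, total_generations=4000,
                                 initial_upper=1.8, lower=0.05):
    """Upper bound of the body length, shrinking linearly with the generation."""
    if total_generations <= 0:
        raise ValueError("total_generations must be positive")
    # Fraction of the run completed, clipped to [0, 1].
    progress = min(max(generation / total_generations, 0.0), 1.0)
    return initial_upper - (initial_upper - lower) * progress


# Upper bound at the start, the midpoint, and the end of the run.
schedule = [time_constrained_upper_bound(g) for g in (0, 2000, 4000)]
```

The schedule depends only on the generation count, so the support is withdrawn at the same pace whether or not the population has learned to walk.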
With a limited number of generations, the search may fail to find good bipedal walkers. Therefore, we balance exploration and exploitation by designing a scaffold that limits the upper bound of the body length according to the current performance. The lower bound of the body length remains 0.05 m. This scaffold allows the algorithm to explore more widely at the beginning and to focus on exploitation later to maximize the performance. To calculate the upper bound, we take the maximum of the performance-based scaffold and 0.05 (see Equation 8), so the upper bound is never less than the lower bound.
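The maximum operation can be sketched as follows. The mapping from performance to scaffold length is not given in this text, so the linear mapping below, the `target_fitness` parameter, and the function name are assumptions; only the 1.8 m initial bound, the 0.05 m floor, and the use of a maximum come from the text.

```python
def performance_based_upper_bound(best_fitness, target_fitness,
                                  initial_upper=1.8, lower=0.05):
    """Upper bound of the body length, shrinking as the best fitness improves."""
    if target_fitness <= 0:
        raise ValueError("target_fitness must be positive")
    # Assumed scaffold: the bound shrinks linearly with the achieved fitness.
    scaffold = initial_upper * (1.0 - best_fitness / target_fitness)
    # The maximum keeps the upper bound from dropping below the lower bound.
    return max(scaffold, lower)


# Upper bound for no progress, half of the target, and the target reached.
bounds = [performance_based_upper_bound(f, target_fitness=1.0)
          for f in (0.0, 0.5, 1.0)]
```

Because the bound follows the best fitness rather than the generation count, the support is withdrawn only once the population has shown that it can do without it.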
After finishing all three cases, we started with a comparison between the fitness of the best biped and tripod from our simulations, as shown in figure 4
The fitness and snapshots of the best biped and best tripod creatures.
Individuals of bipeds and tripods with different leg lengths, body sizes and foot sizes evolved in the simulation. At the beginning of the generation, the length of the foot is very large, so that the foot has enough contact area with the ground, which benefits the individual to maintain stability. When the individual reaches the optimal bipedal gait
We obtained the gait information from the relative Center of Gravity (CoG) position, the relative horizontal velocity, and the relative vertical velocity for both creatures, shown in
Gait analysis of best biped and best tripod. We pre-processed all data and divided it by the leg length to eliminate the inherent advantage of the biped creature with longer leg length than the tripod creature. The values have oscillations due to characteristics of the virtual model and the data captured at the center of the body. About the unit,
Parameters for the body length and the leg length over generations of the fast bipedal winner in the performance-based scaffolded case. The green curve and the orange curve are the tibia length and the femur length parameters, respectively. Their ranges are from 0.05 to 0.9 m, and they are set to mutate freely. The light-blue curve is the body length parameter which is forced to decrease based on the current performance observed, and the original range is from 0.05 to 1.8 m. Here, we squeeze the display range to 0.9 and use a smoothness function to draw the data clearly, which is conducive to investigate the changes of body parameters. The red line labels the place where the tibia length starts gradually increasing with the support of the body length.
The results described above were based on the best creatures, and they motivated us to create three cases (described in Section 2.5) to understand the mechanisms leading to that difference. Initially, we wanted to identify the biped and tripod creatures in our simulation, and we plot
The scatter map of all three cases in three different trials, showing the body length at every bump of the current fitness. Case 1 leaves the body length unconstrained over the generations, Case 2 constrains the body length with a bound that decreases over the generations, and Case 3 constrains the body length based on the best fitness obtained over the generations. Red stands for tripods, green for hybrid bipedal-tripods, and blue for bipeds. Here, a bipedal-tripod is the transition morphology between bipeds and tripods, in which a creature occasionally walks supported by its long body. Since the simulations started from a tripod individual with the longest body length, the scatter map is red at the beginning. After growing under the different body length constraint mechanisms, most individuals become bipeds (blue) at the end.
Results for all three cases within 4,000 generations. The brown dashed, purple dotted, and black dash-dotted lines show the median values of the free scaffolded learning case (Case 1), the time-constrained scaffolded learning case (Case 2), and the performance-based scaffolded learning case (Case 3), respectively. The shaded regions represent the standard deviation of each case. The variances of Case 2 and Case 3 are greater than that of Case 1 because we restrict the range of the body length in Cases 2 and 3 based on the time and the current fitness, respectively, while we let the body length of Case 1 evolve freely.
In a comparison between Case 1 and Case 3 we can notice, from
During the learning process of the best creature observed, the tripodal gait phase started with a long body and a short leg length, as shown in
In the first stage, the long body, the short legs, and the tripodal gait kept the system stable enough to form a simple tripod controller. Naturally, with ever-decreasing support, the short-legged creature transitions to a bipedal gait with a robust controller, and this triggers an increase in leg length to reach higher fitness values with an upright posture. This gait analysis allowed us to hypothesize on the internal mechanism of a scaffolded learning approach and strongly agreed with the work from
We proposed three cases of the scaffolded learning method in this paper. From the results shown in
Human ontogenetic development, in which a child learns to perform a cognitive task, offers an example of each scaffolded learning case. In the first, parents deliberately do not interfere with the learning of their children, as in free scaffolded learning (Case 1). In the second, parents slowly reduce their assistance based on the age of the child, as in time-constrained scaffolded learning (Case 2). In the third, parents adjust their support based on the performance they perceive as the child carries out the task, as in performance-based scaffolded learning (Case 3). Broadening to pedagogical applications, Al Mamun et al. (2020) provide a positive example of how to implement inquiry-based learning in an online environment, considering the lack of direct teacher or peer support. However, they mention that recent research has paid more attention to the increasing challenges of adopting a free scaffold in a self-regulated learning environment without direct support from teachers. Therefore, only by choosing a suitable method can we effectively accelerate the learning process, which is in agreement with our work on physical robots (
In this paper, we introduced a scaffolded learning method for a creature capable of adapting its body and controller, thereby bootstrapping a bipedal controller from a stable tripodal gait. Our results show that scaffolded learning with well-chosen parameters is more productive than leaving a system free to learn independently. This holds only when the scaffold provides the appropriate incentives: a performance-based scaffold effectively shortens the learning process, while a time-constrained scaffold performs worse than the free learning case. Although bipedal walking can be reached through robust control methods, the study that we present here does not focus on the walking itself but on the capacity to use what is already known to bootstrap the unknown. We introduce a scaffolded learning method that accelerates the learning process and can be combined with any learning method to improve the learning rate. We believe that the findings of this study are meaningful for machine learning in general, as our methods are not bound to genetic algorithms or to one experiment, and could be adapted to different learning methods and different systems.
We would like to emphasize that this is the first time that such a scaffolded learning method is used artificially, although pedagogues and cognitive scientists have observed animals and babies using scaffolds to support their learning processes, such as bike riders using training wheels or babies learning to stand while supporting themselves with chairs and sofas. In addition, this paper is not about locomotion, genetic algorithms, virtual model control, or finite state machines, but about scaffolded learning being used to speed up a learning process, which can be used in any process and with any kind of learning algorithm. We use bipedal locomotion and tripod locomotion as a proof of concept for scaffolded learning. It could have been manipulation, jumping, standing, or any other behavior that can have its initial steps supported by something. We propose the use of a structure combined with the software part, leaving the readers free to use a scaffolding method of their choice. As the field of Robotics suffers from the curse of dimensionality and the Reality Gap, our proposed method should be used on robots for faster deployment of learning algorithms and a bottom-up construction of this knowledge base. The same concept explained herein could be transposed to a simulation-scaffolded reality, with the eventual removal of the training wheels to reproduce a reality-compatible behavior. As is the case with humans, robots should also be capable of using their previously acquired knowledge to aid their learning of complex tasks. After all, if Newton could see further, it was by standing on the shoulders of giants.
The original contributions presented in the study are included in the article/
Written informed consent was obtained from the minor(s) legal guardian/next of kin for the publication of any potentially identifiable images or data included in this article.
JZ conceived the study, designed the study, carried out the simulations, and wrote the manuscript. CR participated in the design of the study and helped build the simulator. FI contributed to the idea and the final manuscript. AR directed the project, contributed to the idea, helped draft the manuscript, and critically revised the manuscript. All authors gave final approval for publication and agreed to be held accountable for the work performed therein.
This work is supported by the National Natural Science Foundation of China (project number 61850410527), and the Shanghai Young Oriental Scholars (project number 0830000081).
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
The Supplementary Material for this article can be found online at: