Leonardo F. Urbano
The resulting controller is compared to a modern approach to solving the problem of taskability. The results of each of the two robot controllers over 20 trials on three different obstacle-array configurations were recorded and compiled. Path diagrams were generated to show the difference in success of each robot controller in each task scenario.
Obstacle negotiation and task-achieving behavior under the taskable adaptive robot controller showed a dramatic increase in performance compared to the taskable non-adaptive controller. Some circumstances were completely impossible for the non-adaptive robot controller to negotiate. The experimental results found the controller developed in this project to be superior to the alternative method.
The contribution of this project is a taskable simple onboard adaptive robot controller agent that is a simple solution to many complex control problems without the aid of a powerful host computer. It places the problem of complex control back into the environment: complex behavior is the result of a complex environment, not a complex controller.
Almost all approaches to robot learning involve processor-hungry algorithms that exploit expensive and powerful host computers serving as the primary control computer (G. Castellano ’97, S. Nolfi ’97) for a tethered or simulated robot (C. Lin, L. Wang ’97). Most learning approaches involve countless simulations in a computer-generated environment, often requiring many thousands of trials and errors before a proper robot controller is filtered out and finally downloaded for use on a robot (T. Fujii ’96). On most systems this is where adaptation and learning stop. In all of these approaches, adaptation rarely takes place onboard small robots.
This paper describes a new architecture for controlling real mobile robots. The controller is a strongly modified version of the subsumption controller developed by Professor Rodney A. Brooks at MIT (R. Brooks ’85, ’86, ’89, ’90): a flexible architecture, modeled after the brain functions of simple biological neural networks, for researching the nature of cognition. The project aims to develop a taskable, simple, onboard adaptive robot controller suitable for controlling a small robot in real environments, one whose onboard active control agent adapts to environmental conditioning signals over small time frames in real-world situations.
Rodney Brooks and the MIT MOBOT Lab provided a way of combining real-time control from fused, non-related sensor data to replicate intelligence. Instead of making judgements about sensor validity, their method deals with sensors only implicitly, in that they initiate behaviors. This method employs an architecture that responds directly to environmental circumstances without planning. It does not react to a computer-generated world model; it reacts to the world. The best model of reality is reality itself. This new approach is called the subsumption architecture. Brooks demonstrated the importance of relying on real-world embodiments of intelligence and helped move this project away from computer-simulated experiments.
This project is the continuation of a project submitted last year, which aimed at developing the Simple Onboard Adaptive Robot (SOAR) Controller and proving its efficiency. Last year’s work succeeded in solving the major problem and engineering goals and also uncovered new problems, which served as a springboard for this year’s work.
The problem, hypothesis, and engineering goals of the project submitted last year, together with the experimental results obtained, showed that the SOAR controller is an effective alternative to robot learning on small-processor, low-powered, tiny real robots.
This phase of the project consists of making the SOAR controller taskable and goal-seeking. As it previously existed, it was a blind control system with no immediate purpose. This paper documents tests of a home-made robot’s hardware and software functioning together under a SOAR controller and shows that the combination successfully produces an emergent behavior that can be characterized as performing a task. The finished product is a taskable SOAR controller.
The goal of this research was to develop a taskable, simple onboard adaptive robot controller (SOAR) algorithm suitable for operation on a small portable robot that learns over very small time intervals in real-world situations. A relatively simple algorithm avoids retrogressing into the traditional approach to artificial intelligence, where plan/action schemes dominated all processor functions, seriously inhibiting performance in circumstances that demanded swift adaptation in small time frames. The taskable SOAR controller was designed under these engineering constraints:
Complex taskable behavior should emerge from the resulting combination of several small simple behaviors. Complex behavior is not the result of a complex controller. Rather, it is the result of a complex environment.
3.0 Methodology, Materials and Experiment Procedure
The project called for a real physical robot that would remain unchanged throughout experiments. The only variable in this project is the controller. A small autonomous mobile robot was constructed at home from a small budget of less than $150. Schematics and wiring diagrams have been included in Appendix A.
Several obstacle arrays were constructed that took advantage of the sensor emplacements on the robot. Three different scenarios were built for the robot to interact with. A subsumption controller was chosen to function as the control for the experiment since it met the engineering goals better than any other style of control. Two programs were written: a subsumption controller and a simple onboard adaptive robot controller, both developed to perform a specific task. Each controller was timed with a digital clock as it negotiated each array of obstacles and completed the task.
The experiments were conducted in the following manner: both controllers separately engaged the three obstacle arrays with the computer being reset (RAM discharged) before each trial. As soon as the robot finished the task, the robot was stopped, reset, and placed back at the start position so the robot could engage the obstacle array and complete the task again. Both controllers engaged the array 20 times each to establish the reliability of the data.
The robot is designed with two bump sensors located on the left and right periphery of its square body and three forward-facing light sensors that deliver light-level readings to the microcontroller. Each time the robot bumped into the obstacle array, it backed up and turned away from the stimulus; the fewer the bumps, the less time wasted. Whichever controller negotiated the obstacle arrays and completed the task in the shortest time was deemed the most effective.
3.1 The Subsumption Controller
The subsumption architecture is built in layers. Each layer gives the system a set of pre-wired behaviors. The higher levels build upon the lower levels to create more complex behaviors. The behavior of the system as a whole is the result of many interacting simple behaviors. The layers operate asynchronously.
In this project, four simple behaviors work in unison to produce the overall emergent intelligent behavior: avoid( ), follow_light( ), cruise( ), and grab( ) are arranged in a subsumption architecture. Each finite-state machine responds to input signals from specific sensors. The presence of an input signal on a data line excites one behavior and inhibits, or subsumes, others. Suppose the robot is following a light and strikes an obstacle, triggering a bump sensor: the avoid( ) behavior inhibits the signals of the follow_light( ) behavior by exciting the corresponding suppression node, completely blocking all output traffic on that line.
The result is a high-speed, decentralized delegation of processing power that lends itself to controlling a robot that wanders around searching and following light while avoiding obstacles with the task of finding a small object to return to a user-defined location.
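The layered arbitration described above can be sketched as a fixed-priority scheme in which each behavior either emits a command or abstains, and a higher layer suppresses everything below it. This is a minimal illustrative sketch; the behavior names follow the text, but the sensor keys and command strings are assumptions, not the project's actual code.

```python
# Minimal sketch of fixed-priority subsumption arbitration.
# Sensor keys and command strings are illustrative assumptions.

def avoid(sensors):
    # Highest priority: back away from whichever bumper fired.
    if sensors.get("left_bump"):
        return "back_up_turn_right"
    if sensors.get("right_bump"):
        return "back_up_turn_left"
    return None  # no opinion; let lower layers run

def grab(sensors):
    if sensors.get("payload_whisker"):
        return "close_gripper"
    return None

def follow_light(sensors):
    left, right = sensors.get("left_light", 0), sensors.get("right_light", 0)
    if left > right:
        return "turn_left"
    if right > left:
        return "turn_right"
    return None

def cruise(sensors):
    # Lowest priority: default forward motion.
    return "forward"

# Higher layers subsume (suppress) lower ones when they emit a command.
LAYERS = [avoid, grab, follow_light, cruise]

def arbitrate(sensors):
    for behavior in LAYERS:
        command = behavior(sensors)
        if command is not None:
            return command
```

Note how a bump overrides light following even when both stimuli are present, which is exactly the suppression-node behavior described above.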
3.2 The Simple Onboard Adaptive Robot (SOAR) Controller
The robot interacts with the real world. Following simple rules like “if left_obstacle then turn_right” and “if right_obstacle then turn_left” the robot exhibits an obstacle avoiding behavior. In the presence of uneven lighting, the robot follows rules “if left_light is greater than right_light then turn_left” and “if right_light is greater than left_light then turn_right” and exhibits a light following behavior.
This alone is insufficient for the robot to maximize the productivity it experiences when interacting with complex environments. It needs to learn. To exemplify the crux of the problem: imagine two walls forming a 90-degree angle that the robot approaches head-on. Following the rules prescribed above, the robot gets stuck in an infinite loop, bouncing back and forth between the two walls. Similar problems occur when the biological nature of sensory perception is not incorporated into robot control.
The early psychologist William James explains that when two elementary brain processes have been active together or in immediate succession, one of them, on reoccurring, tends to propagate its excitement to the other. This philosophy forms the main premise for the theory this project demonstrates: break down the robot’s obstacle-avoiding behavior into functional processes, so that the acquisition of sensor data (obstacle impact) excites or inhibits the state of the motor driver circuit – a direct mapping of sensing to action. It is biologically illogical for the relationship between sensors and actuators to remain fixed regardless of how many times it is excited. The subsumption architecture addresses this by regressing into a centralized governing learning agent that rewards positive and punishes negative feedback from actions. It is ironic that an architecture which gained its inspiration from the poor performance of centralized control systems reverts back to centralized control systems to solve the problems it cannot.
Human and animal nervous systems are mainly centralized in physical appearance (i.e., the brain branching into neurons), though they are far from centralized in function. Neural activity operates in parallel. The idea in this project is that there is no governing learning agent that regulates all activities. All learning depends on the amount and frequency of sensor data in each sensor-actuator FSM and is independent of the other sensor-actuator FSMs. When we configure each sensor-actuator pair to excite or inhibit responses, we see learning that is unambiguous. The SOAR controller in this project is developed with the appropriate behaviors for the task, which are programmed directly into the design of the overall agent.
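The decentralized adaptation idea can be sketched as follows: each sensor-actuator pairing keeps its own excitation level, which grows with repeated stimulation (widening the avoidance turn) and relaxes when the stimulus is absent. There is no shared learning agent. The class name, constants, and degree values are illustrative assumptions, not the project's actual code.

```python
# Sketch of per-FSM excitation/inhibition with no central learning agent.
# All names and constants are illustrative assumptions.

BASE_TURN = 20  # degrees turned on the first impact
STEP      = 10  # extra degrees of excitation per repeated stimulation
DECAY     = 5   # degrees of excitation lost per quiet control cycle

class SensorActuatorFSM:
    """One bump-sensor-to-motor mapping that self-modifies with use."""
    def __init__(self):
        self.excitation = 0

    def on_stimulus(self):
        # Repeated impacts excite the FSM, widening the avoidance turn.
        self.excitation += STEP
        return BASE_TURN + self.excitation

    def on_quiet(self):
        # Without stimulation the excitation relaxes back toward baseline.
        self.excitation = max(0, self.excitation - DECAY)

left_bump = SensorActuatorFSM()
# Three impacts in quick succession produce progressively wider turns:
turns = [left_bump.on_stimulus() for _ in range(3)]  # 30, 40, 50 degrees
```

Because each FSM adapts only from its own stimulus history, learning here is local and immediate, in contrast to the centralized reward/punishment loop criticized above.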
4.0 Experiment Results and Data
In order to prove the taskable nature of the SOAR controller, the robot must perform a task successfully. A simple task of finding a payload, grabbing the payload, and bringing the payload to a location was used. The robot is fitted with two backward-facing gripping fingers, three forward-facing light sensors, two forward/side-facing bump whiskers, and payload-detecting whiskers. The payload was a small medicine box sitting on top of a larger tin box. A bright lamp was situated on top of the tin box. This would attract the robot toward the tin box, eventually causing the robot’s payload detectors to touch the tin box, resulting in acquisition of the medicine box. Once the robot grabbed the medicine box, the lamp would turn off and another lamp situated at the beginning of the obstacle array would turn on, attracting the robot to the goal location. The robot must negotiate Plexiglas plates that block its path while remaining transparent, so the robot can maintain a fix on the light source.
A dark red marker was cut in half and fixed directly in the center of the underside of the robot. The robot was placed on a large piece of brown paper so that as the robot moved, its path would be drawn on the paper. Since the controller is the variable, and the controller drives the robot, the robot’s path can be used to directly rate the performance of the controller. Data is represented as a path superimposed on a bird’s-eye view of the obstacle/task array. Red paths show the robot searching for the payload; blue paths show the robot returning with the acquired payload.
4.1 Experiment Results – Subsumption Controller VS Taskable SOAR Controller – SCENARIO 1
In the first scenario, the subsumption controller (Figure 1) struck the obstacle array approximately 15 times and completed the task, successfully retrieving the payload and bringing it back to the start location in 145 seconds. The SOAR controller struck the obstacle array approximately 11 times and completed the task, successfully retrieving the payload and bringing it back to the start location in 133 seconds.
4.2 Experiment Results – Subsumption Controller VS Taskable SOAR Controller – SCENARIO 2
In the second scenario, the subsumption controller struck the obstacle array approximately 116 times and failed to complete the task, successfully retrieving the payload but failing to bring it back to the start location within 300 seconds. The SOAR controller struck the obstacle array approximately 19 times and completed the task, successfully retrieving the payload and bringing it back to the start location in 107 seconds.
4.3 Experiment Results – Subsumption Controller VS Taskable SOAR Controller – SCENARIO 3
In the third scenario, the subsumption controller struck the obstacle array approximately 22 times and completed the task, successfully retrieving the payload and bringing it back to the start location after 117 seconds. The SOAR controller struck the obstacle array approximately 15 times and completed the task, successfully retrieving the payload and bringing it back to the start location in 87 seconds.
5.0 Discussion and Theory
If early psychology has taught us anything from the eras when phrenology, structuralism, and functionalism ruled, it is that subjective introspection is not the path to understanding mental processes. For almost any complex system there ought to exist models with which to tinker. Granted that humans serve as excellent models of human cognition, it is a high hurdle to jump when scientists want to adjust major values and experimental conditions without killing the living test model. Robots have served as excellent modern intelligence models, employing almost all dimensions of sensation and motorization, and crawl closer and closer to defining the absolute components of high- and low-level cognition. This project contributed a new model that appears to reflect the highly adaptive and rapid learning of simple excitatory/inhibitory neural networks found in lower life forms.
There is still much to learn. Any work that contributes to the overall relevant understanding of a topic is certainly significant to the scientific community. This work is important because it contributes a new perspective, a new approach, and a new idea about the way brains and biological neural networks function. This project illustrates a new, elegant, and economical approach to modeling taskable biological mental processes with the use of a robot.
Neural networks, artificial evolution, genetic algorithms, and reinforcement learning were not used because they are, by design, time-consuming processes unsuitable for adaptation in small time frames. Neural networks are often used for pattern replication or filtering, and they demand significant power, both processor- and battery-wise.
Artificial evolution takes thousands of generations to produce a controller tuned to a specific situation. Once the controller has been filtered and evolved to maximum performance, it cannot be changed after being downloaded to the machine, and is thus restricted to performing only within the limits defined by the final algorithmic product. Moreover, with evolutionary or genetic algorithms, maximum performance means maximum performance only in a particular circumstance. If the environment changes dramatically, the evolved controller, which cannot adapt quickly (and most of the time cannot adapt at all), will fail.
Reinforcement learning algorithms get much praise in simulated environments. One major problem is the inaccuracy of sensor data resulting in very noisy conditioning signals. Reinforcement learning agents tend to be extremely centralized and are usually only capable of rewarding or punishing one particular action at a time.
The most serious disadvantage of reinforcement learning is that the policy, which judges actions, is not adaptive at all. For example, consider the six-legged walker that learns to walk by counting the number of checkerboard squares it passes as it issues random leg movements. It is a given that over time the robot will filter out detrimental leg movements from beneficial leg movements, so long as beneficial leg movements result in passing more squares, indicating forward motion. If we take that robot off its pre-designed laboratory floor and put it in a forest where there are no checkerboard squares to guide the learning process, the robot falls, or never learns to maximize its productivity from the data it receives. This is both unrealistic and impractical for modeling human and animal cognition: sensors should not be rigged to function only when “walls are white” but rather because “something is in the way.” In other words, the more general and abundant sensory stimuli should excite the learning agent, not the more specific and less filterable ones.
5.1 Data Analysis
In Scenario 1, both the subsumption controller and the SOAR controller engage all obstacles and maintain a fix on the light source while homing in on the payload. Both controllers are able to find the tin box and detect the payload. Both controllers were able to retrieve the payload and deliver it back to the start location. However, across all 20 trials, the SOAR controller struck the array about four fewer times than the subsumption controller did. The constant stimulation of the obstacle-avoidance behavior FSM results in wider turn angles, which helps the robot avoid the large obstacles.
In Scenario 2, both the subsumption controller and the SOAR controller engage all obstacles and maintain a fix on the light source while homing in on the payload. Both controllers are able to find the tin box and detect the payload. However, only the SOAR controller successfully delivered the payload back to the start location. The subsumption controller became trapped in a 90° corner formed by two obstacles, causing the robot to bounce endlessly back and forth from left to right trying to avoid the obstacles it detected on each side. The subsumption controller never escaped from the robot trap. The SOAR controller expanded its angle of incidence with each impact because of behavior excitement, and so escaped. This evidence suggests that a taskable SOAR controller is superior to a taskable subsumption controller when configured in this fashion.
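The corner-trap dynamic can be illustrated with a toy one-dimensional model: the robot's heading is a single angle, each impact bounces it to the opposite wall, and it escapes once a turn carries it past the 45-degree face of either wall. Both controllers and all constants here are illustrative assumptions, not a reconstruction of the project's code or geometry.

```python
# Toy model of the 90-degree corner trap: fixed turn angle vs. an
# adaptive turn angle that widens with each impact (excitation).
# All constants are illustrative assumptions.

def run(adaptive, max_impacts=50):
    heading, turn, side = 0.0, 20.0, 1
    for impact in range(1, max_impacts + 1):
        heading = side * turn      # bounce: turn away from the struck wall
        if abs(heading) > 45:
            return impact          # escaped past the corner's 45-degree face
        side = -side               # next impact is on the opposite wall
        if adaptive:
            turn += 10             # excitation widens each successive turn
    return None                    # still trapped after max_impacts

fixed_result    = run(adaptive=False)  # bounces left/right forever
adaptive_result = run(adaptive=True)   # escapes after a few impacts
```

The fixed-angle controller alternates between the same two headings indefinitely, while the widening turn eventually exceeds the corner's span, mirroring the trapped subsumption run and the escaping SOAR run described above.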
In Scenario 3, both the subsumption controller and the SOAR controller negotiate all obstacles and navigate toward the light source while homing in on the payload. Both controllers were able to find the tin box, detect the payload, and return it to the start location. However, once again, the SOAR controller struck the array fewer times than the subsumption controller did. Just as in Scenario 1, the SOAR controller’s excitation distributor produced wider-angle motions with every recurrent and corresponding behavioral stimulus.
By analyzing the data obtained from performing the experiment documented in this paper and making a comparative statement, it is safe to say that the taskable SOAR controller outperformed the traditional modern-day approach to robust, layered and decentralized autonomous robot control. It is a viable alternative to taskable robot learning on small, autonomous, low-power, small-processor systems.
Robots have long dominated industrial settings, from automobile assembly lines to nuclear power plant service. To fuel more research, it is increasingly important that robots become economically affordable and reach the average consumer by the turn of the century.
Recently robots have made good headway into the toy and entertainment market. But the abilities of the simple low-level logic systems currently on the market make robots seem “dumb” and almost useless. The concept of robot learning is nearly non-existent in average consumer robotics, the primary reasons being the engineering constraints prescribed earlier.
The SOAR controller robot and its successors could function as exceptionally safe, cute, and fun children’s toys. A robot toy that learns is a very plausible initial application that will fuel these systems’ advance into more industrial and automated settings, with the ultimate goal being the full integration of robots into the more social aspects of society.
The problem, hypothesis, and engineering goals of this project, together with the experimental results, show that the taskable SOAR controller is an effective alternative to taskable robot learning on small-processor, low-powered, tiny real robots.
The taskable SOAR controller avoids retrogressing into the traditional approach to artificial intelligence, where planning precedes action. It negotiates circumstances that demand swift adaptation with as little computation as possible. In the experiments conducted, learning and adapting was not a time-consuming, processor-intensive task but rather part of a sensory-motor stimulus-response network that modified itself every time sensors triggered behaviors. No central governing learning agent was used; instead, the nature of the relationship between the robot and its environment causes it to adapt naturally.

Since most complex learning algorithms derive their data from simulated environments, real-world performance is often inconsistent with simulated performance. In this project all experiments were done in the real world, not in computer-simulated environments. Instances with no sensor input (i.e., when only actuator/motor control signals were issued) had negligible impact on adaptation. Complex taskable learning behavior emerged from the combination of several smaller and simpler behaviors. This supports the claim that complex behavior is not the result of a complex controller. Rather, it is the result of a complex environment.
APPENDIX A – Basic Wiring, Component Functions
Photosensor – Three photosensors are mounted on top of the robot’s body. Each consists of a single cadmium-sulfide cell, one 0.1 µF capacitor, one 220 Ω resistor, and three wires: one to +5 V logic, one to ground, and one as a signal line to the STAMP II processor. The Stamp sends voltage through the resistor to charge the capacitor, then counts how long the capacitor takes to discharge, a time determined by the light level. In this fashion, the Stamp can measure light levels.
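The timing principle can be sketched numerically: the capacitor is charged to the logic rail and then discharges through the cadmium-sulfide cell, whose resistance drops as light increases, so brighter light means a shorter discharge time. The component values follow the text; the logic threshold and resistance figures are assumptions for illustration only.

```python
# Sketch of the RC-discharge timing principle used to read light levels.
# VTH and the resistance values are illustrative assumptions.

import math

C   = 0.1e-6   # 0.1 uF capacitor, per the text
V0  = 5.0      # charged to the +5 V logic rail
VTH = 1.4      # assumed logic-low threshold detected by the processor

def discharge_time(photo_resistance_ohms):
    # RC discharge: v(t) = V0 * exp(-t / (R * C)); solve for v(t) = VTH.
    return photo_resistance_ohms * C * math.log(V0 / VTH)

bright = discharge_time(10_000)    # low CdS resistance in bright light
dark   = discharge_time(200_000)   # high CdS resistance in the dark
# dark > bright: a longer discharge count means less light.
```

Counting elapsed ticks until the signal line falls below threshold thus gives the processor an inverse measure of light intensity.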
Motor Driver – L293D Chip – The robot uses a single L293D motor driver chip to drive the motors from signals supplied by the Stamp. Four wires connect the L293D’s inputs to four I/O pins on the STAMP II. The four output pins on the L293D are hooked up to the motors, two leads each. Driving input 1 high and input 2 low spins a motor connected to outputs 1 and 2; reversing the two inputs reverses the motor’s spin. The same applies to inputs 3 and 4 and outputs 3 and 4. The motor driver’s supply is a +6 V alkaline battery source, which is isolated from the computer’s +9 V (regulated to +5 V) logic supply.
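The input pattern described above amounts to a small truth table per motor channel: one input high and the other low spins the motor, swapping them reverses it, and equal levels stop it. This sketch maps commands to logic levels; the function and command names are illustrative assumptions.

```python
# Sketch of the two-inputs-per-motor drive pattern described above.
# Function and command names are illustrative assumptions.

def motor_pins(command):
    """Return (input_a, input_b) logic levels for one motor channel."""
    table = {
        "forward": (1, 0),  # one input high, the other low: motor spins
        "reverse": (0, 1),  # swap the two inputs: motor spins the other way
        "stop":    (0, 0),  # equal levels: motor stops
    }
    return table[command]

def drive(left_command, right_command):
    # Inputs 1/2 control one motor, inputs 3/4 the other.
    return motor_pins(left_command) + motor_pins(right_command)

# Spin one motor forward and the other in reverse to turn in place:
pins = drive("forward", "reverse")  # (1, 0, 0, 1)
```

In the real circuit these four values would be written to the four Stamp I/O pins wired to the driver's inputs.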
Bump Switches – The robot has three bump switches mounted behind a “floating” bumper ring. Each switch is wired to +5 V and to an I/O pin on the Stamp. The Stamp registers an impact with an obstacle as a +5 V spike on the specified I/O pin.
BIBLIOGRAPHY / REFERENCES
Brooks, Rodney A. (1985) "A Robust Layered Control System For a Mobile Robot" Artificial Intelligence Memo 864, September 1985
Brooks, Rodney A. (1990) "The Role Of Learning in Autonomous Robots" MIT Artificial Intelligence Laboratory Press
Brooks, Rodney A. (1990) “Elephants Don't Play Chess" MIT Artificial Intelligence Laboratory, Robotics and Autonomous Systems 6, pages 3-15
Brooks, Rodney A., Breazeal, C., Irie, R., Kemp, C. C., Marjanovic, M., Scassellati, B., Williamson, M. M., (1998) “Alternative Essences of Intelligence,” MIT Artificial Intelligence Lab, American Association for Artificial Intelligence
Lin, C., Wang, L., (1997) "Intelligent Collision Avoidance by Fuzzy Logic Control,” Robotics and Autonomous Systems 20 pages 61-83
Brooks, Rodney A., Stein, L. A., (1994) "Building Brains for Bodies,” Autonomous Robots, 1, pages 7-25
Castellano, G., Attolico, G., Distante, A., (1997) “Automatic Generation of Fuzzy Rules for Reactive Robot Controllers,” Robotics and Autonomous Systems 22 pages 133-149
Fujii, T., Asama, H., Numers, T., Fujita, T., Kaetsu, H., Endo, I.,(1996) “Co-evolution of a multiple autonomous robot system and its working environment via intelligent local information storage,” Robotics and Autonomous Systems 19, pages 1-13
Salomon, R., (1997) "The evolution of different neuronal control structures for autonomous agents," Robotics and Autonomous Systems 22, pages 199-213
Nolfi, S., (1997) "Evolving non-trivial behaviors on real robots: a garbage collecting robot," Robotics and Autonomous Systems 22, pages 187-190
This work was done at home under the supervision and with the much-appreciated funds and infinite support of my parents. I am very grateful for the support, time, guidance and inspiration of Mrs. Andrea Negri, Mrs. Kelly Mackey, Mr. Andrew Scotti, Mrs. Carol Hynes and especially Mr. Tim Stoddard for his electrical engineering expertise.
I would also like to thank my dog, Copper, for keeping me company on those late nights while working over a hot soldering iron during the construction phase of this project.