Carnegie Mellon Robotics Institute
This sensor generates heading information required to steer a robotic vehicle by "watching" the road. The processing performed on chip is ALVINN (Autonomous Land Vehicle In a Neural Network), a neural network trained to drive without human intervention on public highways. Circuitry for neural computations is integrated with a photosensor array using VLSI in order to directly sense road-image information.
Image-based control of a vehicle at high speeds is a demanding real-time task. While an image sensor generates vast amounts of data, only a small fraction of the information is relevant. Human drivers use their experience to extract needed information from what they see. The ALVINN neural network provides a similar capability, extracting information required to stay on the road from converted intensity images. Through a training process, the network learns to filter out image details not relevant to driving. However, current implementations of ALVINN rely on conventional sense-then-process vision methods that must needlessly digitize, transfer and process full video frames.
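The data reduction described above can be pictured with a small feedforward network that maps a coarse intensity image to a single steering choice. This is only a minimal sketch: the image size, layer sizes, tanh nonlinearity, random weights, and the names `make_weights`, `layer` and `steering_index` are all assumptions made for illustration, not details given in this description. A trained network would use learned weights in place of the random ones.

```python
import math
import random

# All sizes are illustrative assumptions, not taken from the project text.
ROWS, COLS = 30, 32   # coarse intensity image
HIDDEN = 4            # hidden units
OUTPUTS = 30          # output units, one per candidate steering direction


def make_weights(n_out, n_in, rng):
    """Small random weights; a trained network would load learned values."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def layer(weights, inputs):
    """One fully connected layer followed by a tanh nonlinearity."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def steering_index(image, w_hidden, w_out):
    """Map a flattened intensity image to the index of the strongest output."""
    hidden = layer(w_hidden, image)
    outputs = layer(w_out, hidden)
    return max(range(len(outputs)), key=outputs.__getitem__)


rng = random.Random(0)
w_hidden = make_weights(HIDDEN, ROWS * COLS, rng)
w_out = make_weights(OUTPUTS, HIDDEN, rng)
image = [rng.random() for _ in range(ROWS * COLS)]  # stand-in for a sensed frame
direction = steering_index(image, w_hidden, w_out)
print(len(image), "pixel values reduced to steering index", direction)
```

In this sketch, several hundred pixel values collapse to one index, which is the only quantity the vehicle controller would need from each frame.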
VLSI technology provides the opportunity to integrate the imaging and computation required by the ALVINN task. The resulting computational sensor extracts relevant information from raw image input at the point of sensing. The bottleneck between image input and computer, present in traditional system implementations, is eliminated. Local processing of image information reduces system latency and increases data throughput, meeting the fundamental requirements of real-time robotic-vision tasks. In addition, computational sensors are compact, rugged and cost-effective because they are implemented on a monolithic silicon substrate.
Prior to ALVINN-on-a-chip, significant bandwidth and computation were wasted transferring and processing image data from video cameras. As a result, system throughput was limited to only 10 frames per second. Much higher frame rates are required to obtain further gains in the speed and performance of the driving task. Latency is another serious problem alleviated by a VLSI implementation. Applications like ALVINN are sensitive to the real-time nature of the images, and excessive latency limits system stability. When video cameras and frame stores are used, the image data available to update vehicle heading was taken by the camera several frames earlier. While pipelining can improve system throughput, the latency in an imaging system built around a frame store cannot be eliminated.
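The distinction between throughput and latency drawn above can be made concrete with a little arithmetic. The sketch below assumes a three-stage sense-then-process chain (digitize, transfer, process). The stage times are hypothetical values, chosen only so that the unpipelined case works out to 10 frames per second; they are not measurements from the project. The function names `sequential` and `pipelined` are likewise illustrative.

```python
# Hypothetical stage times in milliseconds: digitize, transfer, process.
# These are made-up values for illustration, not measured figures.
STAGE_MS = [40, 20, 40]


def sequential(stage_ms):
    """One frame at a time: each frame passes through every stage before
    the next frame starts. Returns (frames per second, latency in ms)."""
    total = sum(stage_ms)
    return 1000.0 / total, total


def pipelined(stage_ms):
    """Stages overlap on successive frames: the slowest stage sets the frame
    rate, but each frame still passes through every stage, so its latency is
    at least the sum of the stage times. Returns (frames per second, ms)."""
    return 1000.0 / max(stage_ms), sum(stage_ms)


seq_fps, seq_latency = sequential(STAGE_MS)
pipe_fps, pipe_latency = pipelined(STAGE_MS)
print(f"sequential: {seq_fps:.1f} frames/s, {seq_latency} ms latency")
print(f"pipelined:  {pipe_fps:.1f} frames/s, {pipe_latency} ms latency")
```

Under these assumed numbers, pipelining raises the frame rate from 10 to 25 frames per second, yet the heading update still rests on an image that is 100 ms old, which is 2.5 frame periods at the pipelined rate.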
VLSI integration of the ALVINN system provides a practical yet challenging application that combines and builds on our expertise in computational sensors, real-time connectionist image processing and autonomous mobile systems. The result will be an intelligent, rapidly programmable sensor for neural-network-based imaging that is fast, cost-effective and compact. Our strategy is to advance the technology of neural-network-based imaging while we further investigate the potential of VLSI-based computational sensors.
The Robotics Institute is part of the School of Computer Science, Carnegie Mellon University.