Researchers at the Institute for Robotics and Intelligent Machines (IRIM) of the Georgia Institute of Technology have recently proposed a new framework for aggressive driving using only a monocular camera, IMU sensors and wheel speed sensors. Their approach, presented in a paper pre-published on arXiv, combines deep learning-based road detection, particle filters and model predictive control (MPC).
"Understanding the edge cases of autonomous driving is becoming very important," Paul Drews, one of the researchers who carried out the study, told TechXplore. "We chose aggressive driving, as this is a good proxy for collision avoidance or mitigation required by autonomous vehicles."
The term 'aggressive driving' refers to situations in which a ground vehicle operates near the limits of handling, often with high sideslip angles, as is required in rally racing. In their previous work, the researchers investigated aggressive driving using high-quality GPS for global position estimation. That approach has several limitations: for instance, it requires expensive sensors and cannot be used in GPS-denied areas.
The researchers had previously achieved promising results with a vision-based (non-GPS) driving solution that regresses a local cost map from monocular camera images and feeds this information to an MPC-based controller. However, treating each input frame separately created significant learning challenges: the limited field of view and low vantage point of a camera mounted on a ground vehicle made it difficult to generate cost maps that remained effective at high speed.
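To give a rough sense of what "regressing a local cost map from monocular camera images" can look like in code, the sketch below builds a small convolutional encoder-decoder that maps a single camera frame to a top-down cost map. It is only an illustration of the general idea, not the authors' network: the framework choice (PyTorch), the 128x160 input resolution and every layer size are assumptions made for this sketch.

# Minimal sketch (not the authors' network): a per-frame convolutional
# encoder-decoder that regresses a top-down cost map from one monocular image.
# All layer sizes and the 128x160 resolution are illustrative assumptions.
import torch
import torch.nn as nn

class FrameCostMapNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: compress the camera image into a feature volume.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder: expand the features into a single-channel cost map
        # expressed in a top-down (bird's-eye) frame.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
            nn.Sigmoid(),  # cost in [0, 1]: low = drivable track, high = off-track
        )

    def forward(self, image):
        return self.decoder(self.encoder(image))

# Example: one 128x160 RGB frame in, one 128x160 cost map out.
net = FrameCostMapNet()
cost_map = net(torch.rand(1, 3, 128, 160))  # shape (1, 1, 128, 160)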
"Our main objective for this work is to understand how vision can be used as the primary sensor for aggressive driving," Drews said. "This affords interesting challenges because the visual processing must meet stringent time requirements. This allows us to explore algorithms that are tightly coupled between perception and control."
In the new study, the researchers addressed the limitations of their previous work by introducing an alternative approach to autonomous high-speed driving in which the local cost map generator, a video-based deep neural network built around an LSTM, serves as the measurement model for a particle filter state estimator.
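As a rough sketch of how a video-based network with an LSTM can accumulate information across frames, the code below runs a per-frame encoder, passes the resulting features through an LSTM, and decodes the final hidden state into a coarse cost map. Again, this is only an illustration under assumed resolutions and layer sizes; it is not the architecture described in the paper.

# Minimal sketch (illustrative, not the paper's architecture): the same idea as
# above, but with an LSTM carrying state across frames so that track seen in
# earlier images can still shape the current cost map prediction.
import torch
import torch.nn as nn

class VideoCostMapNet(nn.Module):
    def __init__(self, feat_dim=64 * 16 * 20, hidden_dim=512):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        # Decode the recurrent state into a coarse 32x40 cost map
        # (the output resolution is an assumption for this sketch).
        self.decoder = nn.Linear(hidden_dim, 32 * 40)

    def forward(self, frames):                 # frames: (batch, time, 3, 128, 160)
        b, t = frames.shape[:2]
        feats = self.encoder(frames.flatten(0, 1)).view(b, t, -1)
        out, _ = self.lstm(feats)              # hidden state summarizes the clip
        cost = torch.sigmoid(self.decoder(out[:, -1]))
        return cost.view(b, 1, 32, 40)

# Example: a clip of 8 frames produces one cost map for the latest frame.
net = VideoCostMapNet()
cost_map = net(torch.rand(1, 8, 3, 128, 160))  # shape (1, 1, 32, 40)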
Essentially, the particle filter uses this dynamic observation model to localize in a schematic map, while MPC drives the vehicle aggressively based on the resulting state estimate. This part of the framework allowed the researchers to obtain a global position estimate against the schematic map without using GPS, while also improving the accuracy of the cost map predictions.
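The sketch below illustrates what such a measurement update might look like, under the assumptions that each particle is a 2D pose (x, y, heading) and that the schematic map and the network's predicted cost map are simple arrays; the variable names, the crop logic and the noise scale are all hypothetical and are not taken from the paper. Each particle is reweighted by how well the predicted cost map agrees with the schematic map around that particle's pose.

# Minimal sketch of a particle filter measurement update in which the learned
# cost map acts as the observation.  All names and constants are assumptions.
import numpy as np

def crop_schematic(schematic, x, y, heading, size=32):
    """Render the schematic track map in a local frame at pose (x, y, heading)."""
    # For brevity this sketch takes an axis-aligned crop around (x, y);
    # a real implementation would also rotate the crop by the heading.
    r, c = int(y), int(x)
    return schematic[r:r + size, c:c + size]

def measurement_update(particles, weights, predicted_cost, schematic):
    """Reweight particles by agreement between predicted and schematic maps."""
    for i, (x, y, heading) in enumerate(particles):
        local = crop_schematic(schematic, x, y, heading, size=predicted_cost.shape[0])
        err = np.mean((local - predicted_cost) ** 2)  # map disagreement
        weights[i] *= np.exp(-err / 0.1)              # 0.1: assumed noise scale
    weights /= weights.sum()
    return weights

# Example with random placeholders for the maps and 100 particles.
schematic = np.random.rand(200, 200)   # global schematic track map
predicted = np.random.rand(32, 32)     # cost map regressed from the camera
particles = np.column_stack([np.random.uniform(0, 160, 100),
                             np.random.uniform(0, 160, 100),
                             np.random.uniform(-np.pi, np.pi, 100)])
weights = measurement_update(particles, np.ones(100) / 100, predicted, schematic)

In the full framework, an MPC controller then plans over the cost map using the filtered state estimate; that step is not shown here.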
"We take a direct approach to autonomous racing by learning the intermediate cost map directly from monocular images," Drews explained. "This intermediate representation can then be directly used by model predictive control, or can be used by a particle filter to approach GPS state based aggressive performance."
Drews and his colleagues evaluated their framework using a 1:5-scale test vehicle on AutoRally, an open-source platform for aggressive autonomous driving. With their approach, the vehicle operated reliably at the friction limits on a complex dirt track, reaching speeds above 27 mph (12 m/s).
"I think we have shown two things in this study," Drews said. "First, that by directly regressing a cost map from images, we can both use it directly and use it for localization to enable aggressive driving at the limits of handling. Second, that temporal information is very important in a difficult driving scenario such as this."
The study carried out by Drews and his colleagues demonstrates the advantages of combining MPC with state estimation and learned perception. In the future, their framework could pave the way for more robust and cost-effective aggressive autonomous driving on complex tracks.
"We would now like to further enhance this method with learned attention and extend it to obstacles and unknown environments," Drews said.