Large-scale Deep Learning for Autonomous Driving
Episode title: Advancing Autonomous Vehicle Development Using Distributed Deep Learning with Adrien Gaidon - TWiML Talk #269
Source: TWiML Talk. Listen to the podcast.
Guest: Adrien Gaidon (@adnothing), Machine Learning Lead at Toyota Research Institute
Technical level: 🚵♂️ technical
How do you go very large-scale with deep learning? How do you resolve its main pain points when applied to autonomous vehicles? Adrien Gaidon, machine learning expert and friend of Beautifeye, walks you through these challenges and explains how Toyota is reinventing itself.
The 3 main points
Toyota's phenomenal challenge: from car manufacturer to software and robot maker. Toyota sees itself as a company selling robots powered by the smartest software in the world. While full autonomy is still far ahead, the robotization of cars has already happened, and it will only intensify.
With distributed training on the cloud you can cut training time from weeks to a couple of hours. By carefully profiling and debugging your deep learning framework of choice, and with a few tweaks to your data-provisioning pipeline, you can dramatically reduce both time and cost!
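A quick illustration of why synchronous data-parallel training (the usual distributed setup) speeds things up without changing the result: averaging the gradients each worker computes on its equal-size shard is mathematically the gradient over the full batch. This is my own toy sketch with a 1-D least-squares loss, not code from the episode:

```python
def grad(w, xs, ys):
    """Gradient dL/dw of the mean squared error L(w) = mean((w*x - y)^2)."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def data_parallel_grad(w, xs, ys, workers):
    """Average the per-worker gradients over equal-size shards.

    This is what an all-reduce computes in synchronous data-parallel SGD.
    """
    shard = len(xs) // workers
    grads = [
        grad(w, xs[i * shard:(i + 1) * shard], ys[i * shard:(i + 1) * shard])
        for i in range(workers)
    ]
    return sum(grads) / workers

# The averaged shard gradients match the full-batch gradient, so adding
# workers shortens wall-clock time while the SGD update stays the same.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]
full = grad(0.5, xs, ys)
dist = data_parallel_grad(0.5, xs, ys, workers=2)
assert abs(full - dist) < 1e-12
```

In a real framework this averaging happens inside the distributed backend (e.g. PyTorch's DistributedDataParallel); the sketch only shows why the math lets you scale out.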
The name of the game for deep learning applied to autonomous driving: small networks, high resolution. Autonomous driving can only afford a very low computational budget, yet it needs great prediction capabilities: the car has to see far ahead and decide what to do on limited hardware. High-resolution images and "economic" DL models are the solution.
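To see why "small networks, high resolution" is a real constraint, here is a back-of-the-envelope sketch (my own illustration, not from the episode) of how a convolution's cost scales with input resolution: doubling the resolution quadruples the multiply-accumulates, so staying on a fixed compute budget forces a roughly 4x cheaper network elsewhere, e.g. in channel width.

```python
def conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates for a stride-1, 'same'-padded k x k convolution."""
    return h * w * c_in * c_out * k * k

base = conv_macs(256, 256, 3, 32, 3)    # low-res input
hires = conv_macs(512, 512, 3, 32, 3)   # double the resolution
assert hires == 4 * base                # cost grows with the pixel count

# To stay on budget at 512x512, the layer must get ~4x cheaper,
# e.g. 8 output channels instead of 32:
slim = conv_macs(512, 512, 3, 8, 3)
assert slim == base
```

The arithmetic is why high-resolution perception on in-car hardware pushes you toward deliberately "economic" architectures rather than simply bigger ones.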
The 4 main pain points when you go very large-scale with deep learning:
You run out of disk space: move from RAM, to EBS, to a fully distributed file system (Adrien uses BeeGFS) so you download the data only once and provision self-managed machines.
You must embrace DevOps: move from Jupyter notebooks to Docker and containerized, pre-configured environments to cut setup time and enable fast experimentation.
You must rewrite code in deep learning frameworks such as PyTorch: out of the box, they simply don't scale up. Tip from Adrien: look for bottlenecks at the beginning of your data pipeline (data loading and provisioning).
You hit algorithmic issues (e.g., large-batch SGD causes generalization problems even if you apply warmup and linear scaling).
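Adrien's tip about bottlenecks at the start of the data pipeline can be put into practice with a simple per-step timing check: measure loading and compute separately, and if loading dominates, the fix belongs in the input pipeline (prefetching, more loader workers), not the model. This is a hedged sketch; the function name and the 50% threshold are my own illustrative choices:

```python
def diagnose(load_ms, compute_ms):
    """Classify a training step from measured load/compute times (ms).

    Returns ("input-bound", f) when data loading takes the majority of
    the step, else ("compute-bound", f), where f is the load fraction.
    """
    total = load_ms + compute_ms
    load_frac = load_ms / total
    if load_frac > 0.5:
        return "input-bound", load_frac
    return "compute-bound", load_frac

# e.g. 300 ms reading/decoding images vs 100 ms of forward/backward:
label, frac = diagnose(300.0, 100.0)
assert label == "input-bound" and frac == 0.75
```

An input-bound result is the signal to tune the loader (e.g. `num_workers` and `pin_memory` in PyTorch's `DataLoader`) before touching the model.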