Speeding up Deep Neural Networks on the Jetson TX1
In recent years, Deep Learning (DL) has set new top performances in almost all computer vision tasks that are important for automotive and robotic applications. In these applications, both space and power are limited resources. Therefore, there is a need to run DL approaches on a small and power-efficient device, such as the NVIDIA Jetson TX1 with its powerful onboard GPU. In this paper, we analyze the Jetson's suitability by benchmarking the run-time of DL operations in comparison to a high-performance GPU. As an example, we port a top-performing DL-based person detector to this platform. We explain the steps necessary to significantly speed up this approach on the device.