Under the Hood of a Self-Driving Taxi

A look at compute and other core self-driving car systems

Oliver Cameron
Voyage

--

Sense, Plan, Act

A self-driving car traditionally follows the paradigm of Sense, Plan, Act. The car senses the environment around it, utilizing sensors like LIDAR, radar and cameras. The car plans the path from point A to point B, using sensor information and other contextual information. The car then acts, executing the path that was planned by controlling its steering and speed.

To give a car the ability to Sense, Plan & Act (SPA) requires a complex system of hardware and software, all of which must work together in (hopeful) harmony to form a self-driving car. You might be familiar with some of the surface-level components in isolation (things like cameras, LIDAR, etc.), but equally important is all the plumbing necessary to make them sing together. It’s that plumbing, or the core, that we’ll cover in this post.

The Voyage Architecture

This is what a Voyage self-driving taxi looks like beneath the surface: a fairly standard architecture for a multi-sensor self-driving car. Each component plays a huge role. Today we’re going to walk through our compute, power, and drive-by-wire kit.

The system overview for our Voyage self-driving taxi

Compute

Compute very much serves the Plan component of SPA. The brain behind Homer, the first self-driving taxi at Voyage, is a Gigabyte AORUS motherboard with an Intel Core i7-7700K Kaby Lake quad-core 4.2GHz processor and an NVIDIA Titan Xp GPU. To make sure the sensors have an ample data pipe, the machine has 64GB of RAM and 3TB of mass storage distributed across three solid-state drives for redundancy.

Wired in

This powerful computing box runs an Ubuntu distribution of Linux, uses Docker containers to manage system environments, and runs the Robot Operating System (ROS) for quick prototyping of perception, motion planning, and controls nodes. ROS is an incredibly versatile robotics middleware that abstracts away the complexities of message passing, timing, data structures (for things like point clouds, camera frames, and obstacles), threading, and data recording. While Ubuntu (and therefore ROS) is not the real-time operating system (RTOS) required for production self-driving cars, it is an incredible tool for prototyping algorithms and testing them in real-world conditions as fast as possible. It’s critical to minimize the time between ideas on the whiteboard and cars on the road here at Voyage, and these tools help us do just that.

Want to try ROS on real point cloud data from an actual self-driving car? Check out this open repository on GitHub, put together by one of our engineers.

ROS nodes are essentially mini-programs that run independently from each other, but generally have many interconnections. One node may be responsible for reading raw data from a Velodyne LIDAR over an ethernet interface and turning it into a PointCloud2 message. This message, which consists of an array of three-dimensional points and their metadata, can then be ‘published’ over the ROS network and consumed by any number of other nodes. One of these consumers, or ‘subscribers’, could be responsible for fitting the live incoming point cloud to an existing map for localization, while another node may be running clustering algorithms to detect and track objects. These nodes then publish their own output to the network, which could get consumed by motion planning algorithms even further down the line. At a high level, this is exactly how the Voyage cars operate: data is consumed from raw sensors (LIDAR, radar, RTK GPS, cameras, CAN bus messages, etc.), processed by a huge collection of smaller nodes that all communicate with each other, and finally turned into control signals sent to the throttle, brake, and steering wheel through our drive-by-wire units.
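To make the publish/subscribe pattern concrete, here is a minimal sketch of such a node in Python with rospy. The topic names and the pass-through ‘detector’ are hypothetical stand-ins, not our production code:

    #!/usr/bin/env python
    # Minimal sketch of a ROS node that subscribes to LIDAR point clouds
    # and republishes its output for downstream consumers. Topic names
    # and the pass-through 'detection' are hypothetical placeholders.
    import rospy
    from sensor_msgs.msg import PointCloud2
    import sensor_msgs.point_cloud2 as pc2

    class ObstacleDetector(object):
        def __init__(self):
            # Publish our output so planners further down the line can subscribe.
            self.pub = rospy.Publisher('/perception/obstacles', PointCloud2, queue_size=1)
            # Subscribe to the raw cloud published by the Velodyne driver node.
            rospy.Subscriber('/velodyne_points', PointCloud2, self.on_cloud, queue_size=1)

        def on_cloud(self, cloud):
            # Iterate over the (x, y, z) points; a real node would run
            # clustering here rather than simply counting the points.
            points = list(pc2.read_points(cloud, field_names=('x', 'y', 'z'), skip_nans=True))
            rospy.loginfo('received %d points', len(points))
            self.pub.publish(cloud)  # placeholder: pass the cloud straight through

    if __name__ == '__main__':
        rospy.init_node('obstacle_detector')
        ObstacleDetector()
        rospy.spin()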

Data collection in action

One of the most valuable aspects of using ROS while in development is data recording and playback. When nodes communicate, they use specific channels, or ‘topics’, and these topics can be automatically recorded to disk for future analysis. In fact, whenever a Voyage car is out on the road, we are recording every single second of data. Topics act as namespaces that remove the possibility of data collisions, but the recordings also serve as a super detailed log of exactly what every part of the vehicle was thinking at any given time. That lets us recreate any condition on our laptops in the office, instead of driving around outside hoping for the same circumstances to happen again, and test new algorithms at our desks rather than in the passenger seat, saving our engineers a ton of time.
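For a taste of that workflow, here is a minimal sketch that reads messages back out of a recorded bag with the standard rosbag Python API (the bag filename and topic below are placeholders):

    # Minimal sketch: replaying recorded topics from a bag file for
    # offline analysis. The filename and topic name are placeholders.
    import rosbag

    with rosbag.Bag('drive_log.bag') as bag:
        # Every message recorded on the chosen topic, with its timestamp.
        for topic, msg, t in bag.read_messages(topics=['/velodyne_points']):
            print('%s: %d x %d point cloud at t=%.3f' %
                  (topic, msg.height, msg.width, t.to_sec()))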

Generally, most of our computation tasks onboard Homer and the rest of the Voyage family are the responsibility of the CPU. Most tasks are linear, and require one solution to be found (e.g. what obstacle do these new LIDAR points belong to? Should I make this left turn before the next right turn?) before moving on to the next problem. While multi-threading is an option, and is utilized extensively to process large amounts of data, several problems are better solved with human-like intelligence approaches (think massively parallel). Enter the Titan Xp GPU.

If you’re reading this blog post, you likely already know that we love machine learning at Voyage. If a problem can be tackled by a neural network, odds are that we are working on it. We’ve used deep learning to detect the state of traffic lights with great accuracy, recognize obstacles in a sea of clustered LIDAR points, separate buildings from the road with scene classification, and generate steering wheel angles directly from imagery in end-to-end networks. While throwing a GPU at a problem isn’t always the right solution, it’s a fantastic tool to really leverage the power and ingenuity of our engineers.
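As an illustration of the traffic light problem, here is a tiny PyTorch sketch of a classifier that maps a cropped camera image to a light state. The architecture and class labels are illustrative only, not our production model:

    # Toy traffic-light state classifier in PyTorch. The architecture
    # and class labels are illustrative, not Voyage's actual network.
    import torch
    import torch.nn as nn

    class TrafficLightNet(nn.Module):
        def __init__(self, num_classes=4):  # e.g. red, yellow, green, unknown
            super(TrafficLightNet, self).__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),  # pool down to a 32-channel vector
            )
            self.classifier = nn.Linear(32, num_classes)

        def forward(self, x):
            return self.classifier(self.features(x).flatten(1))

    net = TrafficLightNet()
    crop = torch.rand(1, 3, 64, 64)   # stand-in for a cropped light image
    state = net(crop).argmax(dim=1)   # index of the predicted light state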

David, one of our Robotics experts

Power

The entire system is powered by the Ford Fusion’s original 12V vehicle battery. The battery feeds power to a Power Distribution Unit (PDU), which is a really smart relay switcher with 9 different 12V connectors. It comes with its own scripting language, and one can programmatically turn individual switches on or off. The Linux box gets its power through a 110V inverter, itself connected to one of the PDU’s switches.
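To illustrate the idea of programmatic switching, here is a hypothetical sketch using pyserial; the port, baud rate, and command syntax are all invented, since the real PDU speaks its own scripting language:

    # Hypothetical sketch of toggling a PDU relay channel over a serial
    # link with pyserial. The port, baud rate, and command strings are
    # invented; the actual PDU has its own scripting language.
    import serial

    with serial.Serial('/dev/ttyUSB0', 9600, timeout=1) as pdu:
        pdu.write(b'SET RELAY 3 ON\r\n')  # e.g. power up the inverter channel
        print(pdu.readline())             # read back the PDU's acknowledgement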

In order for the PDU to stay on, the car has to be running. In the case of the Ford Fusion Hybrid, if the vehicle detects no driver in the seat, the Body Control Module will shut down the ignition relay after 30 minutes of non-driving. This would in turn shut down the PDU and the Linux box, creating many a disgruntled engineer.

To fix the auto-shutdown feature, we simply changed the vehicle’s factory configuration so the Ford Fusion Hybrid would stay running indefinitely. Using an open source tool called FORScan and an FTDI-based OBD-II adapter, we sent specific CAN messages to the Body Control Module and modified the original factory settings so the vehicle would never auto-shutdown again. With tool in hand and a “might as well” look on our faces, we also disabled the annoying “double honk” punishment for daring to shut the door with the key fob still in the car. Now the door closes quietly, key inside, with no complaints.
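For flavor, this is roughly what putting a raw frame on a CAN bus looks like with the python-can library over a SocketCAN interface. The arbitration ID and payload below are dummies; the real configuration frames are what FORScan manages for you:

    # Sketch of sending a raw CAN frame with python-can over SocketCAN.
    # The arbitration ID and payload are dummy values, not the actual
    # Body Control Module configuration messages.
    import can

    bus = can.interface.Bus(channel='can0', bustype='socketcan')
    msg = can.Message(arbitration_id=0x726,      # dummy diagnostic ID
                      data=[0x02, 0x10, 0x03],   # dummy payload
                      is_extended_id=False)
    bus.send(msg)
    bus.shutdown()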

Disabling the annoying double honk “feature”

Drive-by-Wire

Now that we have compute and power, how can we actually control the vehicle programmatically? In our Sense, Plan, Act paradigm, how do we Act? The answer is a drive-by-wire kit. At the simplest level, you can consider a drive-by-wire kit to be the interface between our sensors/compute and actuators. We’ve covered drive-by-wire and the CAN Bus in a previous post that’s worth checking out.

The kit enables our compute, after being fed data by the sensors, to issue commands (ultimately CAN messages) and have those commands (steering, braking, etc.) actuated on the car. The actuators are the accelerator pedal, brake pedal, and steering rack, all interfaced through Dataspeed’s drive-by-wire kit. In modern cars, most of these actuators are already completely decoupled from the driver’s input. For example, when you press the accelerator pedal, you are simply moving two potentiometers that send a raw 0 to 5V signal to the Engine Control Module (ECM). Within the ECM, this pedal position information is converted to a desired engine torque, which in turn determines how much to advance the spark timing or increase airflow by opening the throttle.
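As a toy example of that first conversion step, here is how a raw pedal voltage might map to pedal travel; a 0.5-4.5V usable range is a common sensor convention, but the exact calibration here is invented:

    # Toy mapping from a pedal potentiometer voltage to pedal travel.
    # The 0.5-4.5V usable range is a common convention; the calibration
    # is invented for illustration.
    def pedal_percent(voltage, v_min=0.5, v_max=4.5):
        """Clamp the raw 0-5V signal and scale it to 0-100% travel."""
        v = min(max(voltage, v_min), v_max)
        return 100.0 * (v - v_min) / (v_max - v_min)

    print(pedal_percent(2.5))  # -> 50.0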

The drive-by-wire kit that actuates the accelerator pedal is connected between the pedal assembly and the ECM. When the system is disabled, the pedal’s original potentiometers send their 0–5V signal to the ECM. However, when the drive-by-wire kit is enabled, a new signal is generated digitally based on commands from our Ubuntu/ROS computer.
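On the ROS side, issuing one of those commands might look something like the sketch below, written against our recollection of Dataspeed’s open-source dbw_mkz driver; treat the topic and message details as assumptions rather than a reference:

    # Sketch of commanding the throttle through Dataspeed's open-source
    # dbw_mkz ROS driver. Topic and message names follow that package as
    # we recall it; treat the details as assumptions.
    import rospy
    from dbw_mkz_msgs.msg import ThrottleCmd

    rospy.init_node('throttle_demo')
    pub = rospy.Publisher('/vehicle/throttle_cmd', ThrottleCmd, queue_size=1)

    cmd = ThrottleCmd()
    cmd.enable = True
    cmd.pedal_cmd_type = ThrottleCmd.CMD_PERCENT  # interpret pedal_cmd as a fraction of travel
    cmd.pedal_cmd = 0.15                          # gentle 15% throttle

    rate = rospy.Rate(50)  # the kit expects a steady stream of commands
    while not rospy.is_shutdown():
        pub.publish(cmd)
        rate.sleep()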

A self-driving car is a series of complex, interconnected systems, but I hope this post revealed just that little bit more about what it’s like to work on self-driving cars. At Voyage we believe that by being transparent about our work we can help drive momentum and bring self-driving cars to the world faster.

If you found this information useful, and you’re thinking of either starting a career in this industry or transitioning to a company that loves to ship, please consider applying to one of our job listings. Our team would be excited to speak with you!
