<img alt="self driving cars and CNN technology" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/self-driving-cars-and-CNN-technology.jpg/w=800" data- decoding="async" height="420" src="data:image/svg xml,” width=”800″>

As a kid, I loved watching cartoon films with cars that seemed to drive themselves. It made me wonder if such cars could be real and if there were tiny robots inside driving them magically.

Now that we’re older, self-driving cars are becoming real, and I’m genuinely fascinated by them. Do they understand when to stop at stop signs and red lights? Can they see animals and people walking around on the road? And what about driving when it’s dark outside or when it’s rainy or snowy?

<img alt="Autonomous smart car automatic wireless sensor driving on road a" data-id="245266" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/thanit2022mar_66.jpg/w=945" data- decoding="async" height="630" src="data:image/svg xml,” width=”945″>

Let’s talk about self-driving cars! These are cars that can drive themselves without a human driver. Companies like Tesla and Waymo use smart computer techniques, like deep learning, to make these cars super smart. Deep learning helps the cars do cool things, like understanding road signs and driving safely even when the weather is terrible. It’s all about using advanced technology to shape how we’ll get around in the future!

History

The history of self-driving automobiles reads like a long, exciting adventure. Back in the 1920s, autonomous vehicles were still only a dream. One inventive mind, Francis Houdina, stood out by demonstrating a radio-controlled car that drove through city streets with no one at the wheel, though it still needed human operators in a following vehicle to send it commands.

<img alt="Screenshot-2023-08-10-125812" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/Screenshot-2023-08-10-125812.png/w=1164" data- decoding="async" height="653" src="data:image/svg xml,” width=”1164″>
Source: theatlantic.com

Moving forward to the 1980s and 1990s, the brilliant minds at Carnegie Mellon University were onto something big. They developed cars that could “see” using cameras, helping them navigate busy city streets. These cars were like learning explorers, discovering how to drive by looking around.

Then a significant moment arrived in 2004 with the DARPA Grand Challenge, a race for driverless vehicles across the desert. Self-driving cars attempted the tough course, and none of them finished, but it was a start. Think of it as their training ground to become better drivers.

However, the real breakthrough came in the 2000s and 2010s, when major companies like Tesla, Uber, and Google (whose self-driving project later became Waymo) entered the field. Google began testing self-driving cars in 2009. Fast forward to 2015, when Tesla introduced Autopilot, a feature that let its cars partly drive themselves on certain roads, handling steering and lane keeping without constant human control.

As more companies joined the race, the competition to create completely self-driving cars heated up. Picture teams of inventors racing to make cars that could drive without needing humans to steer.

And the story isn’t over. Fully self-driving cars are still a work in progress, and as they keep improving, they promise to change how we travel, with safer and easier trips for everyone.

How Do Self-Driving Cars Work?

Self-driving cars are like super-smart decision-makers! They use cameras, LiDAR, RADAR, GPS, and inertial sensors to gather data about their surroundings. Deep learning algorithms then process this data to understand what’s happening around the car, and based on that understanding, the car makes the decisions it needs to drive safely and smoothly.

<img alt="working-1" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/working-1.png/w=945" data- decoding="async" height="288" src="data:image/svg xml,” width=”945″>
Source: arxiv.org

If we want to figure out how self-driving cars really work, let’s take a closer look at these four parts shown in the diagram above. It’s like solving a puzzle – understanding each piece will help us see the bigger picture of how these amazing cars operate:

  • Perception
  • Localization
  • Prediction
  • Decision Making
    • High-Level Path Planning
    • Behavior Arbitration
    • Motion Controller

Perception

#1. Camera

Cameras are like the eyes of a self-driving car, and they’re super important: they help the car know what’s happening around it. The cameras support several jobs, such as recognizing what objects are (classification and detection), separating the different regions of a scene (segmentation), and helping work out where the car is (localization).

To ensure the car doesn’t miss anything, it has cameras placed all over – front, back, left, and right. These cameras work together to make a big picture of everything around the car. It’s like the car’s own special 360-degree view!

<img alt="sensors around a self driving car" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/self-driving-cars-sensors-1.jpeg/w=1124,h=630" data- decoding="async" height="630" src="data:image/svg xml,” width=”1124″>

These cameras aren’t just for show. They’re smart. Some look far away, up to 200 meters, so the car knows what’s coming up ahead. Others focus on stuff nearby so the car can pay close attention to details. This camera team helps the car see and understand everything, like a friend guiding it, so it can drive safely and make good choices.

Cameras are especially helpful during maneuvers like parking, where their wide view of the immediate surroundings helps the car move carefully.

But relying on cameras alone has problems, especially in tough conditions like fog, heavy rain, or darkness. In those situations, camera images become noisy and unreliable, which can be really unsafe.

To handle these situations, the car needs sensors that work in complete darkness and can measure distance without relying on visible light. Adding such sensors to the perception system makes the car better at driving in bad weather and poor visibility, and therefore safer for everyone on the road.

#2. LiDAR

LiDAR, which stands for Light Detection and Ranging, uses lasers to measure how far away things are. It sends out laser pulses and measures how long they take to bounce back from objects; from that round-trip time, it computes the distance.
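As a rough illustration of that round-trip calculation, here is a minimal Python sketch. The 500-nanosecond echo time is an invented example value, not a number from any particular sensor.

```python
# Minimal time-of-flight sketch: turning a LiDAR echo delay into a distance.
SPEED_OF_LIGHT = 299_792_458  # metres per second

def lidar_distance(round_trip_seconds: float) -> float:
    """One-way distance to an object, given the laser pulse's round-trip time."""
    # The pulse travels to the object and back, so halve the total path length.
    return SPEED_OF_LIGHT * round_trip_seconds / 2

# Example: an echo that returns after 500 nanoseconds corresponds to about 75 m.
print(f"{lidar_distance(500e-9):.1f} m")
```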

<img alt="Autonomous car self driving on city street, Smart vehicle techno" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/lidar.jpeg/w=1120,h=630" data- decoding="async" height="473" src="data:image/svg xml,” width=”840″>

LiDAR builds a 3D map (a point cloud) of everything around the car, and when that map is combined with the camera images, the car understands its surroundings far more clearly. Deep learning models can then analyze this information to anticipate what other vehicles might do, which is especially useful on tricky roads like busy intersections, where the car has to watch other cars and drive safely.

However, LiDAR has limitations that can be problematic. While it works well at night and in dark environments, it can struggle in conditions with interference from rain or fog, potentially leading to inaccuracies in perception. To solve these problems, we use both LiDAR and RADAR sensors at the same time. These sensors provide extra information that helps the car understand things more clearly. The car can drive on its own in a safer and better way.

#3. RADAR

RADAR, which stands for Radio Detection and Ranging, has been used for a long time in everyday things and also by the military. Originally used by the military to detect objects, RADAR calculates distances using radio wave signals. Nowadays, RADAR is vital in lots of cars, especially self-driving ones.

RADAR is great because it works in any kind of weather and lighting. Instead of lasers, it uses radio waves, which makes it flexible and very useful. However, RADAR is a relatively noisy sensor: it can report detections even where the camera sees no obstacle at all.

<img alt="self-driving-car" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/self-driving-car.jpg/w=800" data- decoding="async" height="450" src="data:image/svg xml,” width=”800″>

The self-driving car’s brain can get confused by all the extra signals from RADAR, which we call “noise.” To fix this, the car needs to clean up the RADAR info so it can make good choices.

Cleaning the data means filtering it so that strong, meaningful returns stand out from the weak ones, separating the important reflections from the clutter. A standard tool for this is the Fast Fourier Transform (FFT), which converts the raw radar signal into its frequency components, where real targets are much easier to spot.
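As a hedged illustration of that idea, the sketch below builds a synthetic, noisy radar-style signal in NumPy and uses an FFT to find its dominant frequency. The sampling rate, target frequency, and noise level are made-up values chosen only to make the peak easy to see; they do not come from any real radar.

```python
# Sketch: use an FFT to pull the dominant tone out of a noisy simulated radar return.
import numpy as np

fs = 10_000                      # sampling rate in Hz (hypothetical)
t = np.arange(0, 0.1, 1 / fs)    # 100 ms of samples
beat_hz = 1_200                  # frequency a strong reflector would produce (hypothetical)

echo = np.sin(2 * np.pi * beat_hz * t)       # the "strong" signal from a target
noise = 0.8 * np.random.randn(t.size)        # weak returns and sensor noise
spectrum = np.abs(np.fft.rfft(echo + noise))
freqs = np.fft.rfftfreq(t.size, d=1 / fs)

# The tallest spectral peak corresponds to the strongest reflector.
print(f"dominant frequency: {freqs[spectrum.argmax()]:.0f} Hz")
```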

RADAR and LiDAR return individual detection points, like dots on paper. To make sense of these dots, the car groups them, much like putting similar things together, using clustering methods such as Euclidean clustering or K-means clustering to combine nearby points into objects it can reason about. This helps the car drive smarter and safer.
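Here is a minimal sketch of that grouping step using scikit-learn's K-means. The 2D points below are synthetic stand-ins generated for illustration; a real car would feed in actual LiDAR or RADAR returns, typically in 3D.

```python
# Sketch: grouping raw detection points into objects with K-means clustering.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two synthetic blobs of detections: something ahead of the car and something to the side.
object_ahead = rng.normal(loc=[10.0, 0.0], scale=0.3, size=(30, 2))
object_side = rng.normal(loc=[6.0, 3.5], scale=0.2, size=(20, 2))
points = np.vstack([object_ahead, object_side])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
for cluster_id in range(2):
    centre = points[labels == cluster_id].mean(axis=0)
    print(f"object {cluster_id}: roughly centred at {centre.round(1)}")
```

In practice the number of objects is not known in advance, which is one reason distance-threshold (Euclidean) clustering methods are also popular.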

Localization

In self-driving cars, localization algorithms play a crucial role in determining the vehicle’s position and orientation as it navigates. One common technique is Visual Odometry (VO), which works by identifying and matching key points across consecutive video frames.

These matched key points act like landmarks on a map. The car can also run SLAM (Simultaneous Localization and Mapping) to build a map of its surroundings while keeping track of its own position inside that map, so it knows where things like roads and people are relative to it.
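Here is a hedged sketch of the key-point matching step using OpenCV's ORB features. The file names frame1.png and frame2.png are placeholders for two consecutive camera frames; a full visual-odometry pipeline would go on to estimate the camera's motion from the matched points.

```python
# Sketch of the feature-matching step in visual odometry, using OpenCV's ORB.
import cv2

frame1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)  # placeholder frame at time t
frame2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)  # placeholder frame at time t+1

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(frame1, None)
kp2, des2 = orb.detectAndCompute(frame2, None)

# Brute-force Hamming matching pairs up key points seen in both frames; how those
# pairs shift between frames is what the pose (position + orientation) estimate uses.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
print(f"{len(matches)} matched key points between consecutive frames")
```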

<img alt="Self driving electric car without driver on a city street. Auton" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/localization.jpeg/w=945,h=630" data- decoding="async" height="630" src="data:image/svg xml,” width=”945″>

And to do this even better, the car layers deep learning on top of these classical techniques.

Neural networks like PoseNet and VLocNet leverage this data to estimate 3D position and orientation, and those estimates can in turn be used to derive scene semantics. By combining geometry with these learned models, the car knows where it is and what is around it, which helps it drive safely and smoothly on its own.

Prediction

Understanding human drivers is indeed a complex task, as it involves emotions and reactions rather than straightforward logic. Because we don’t know what other drivers will do, it’s crucial for self-driving cars to make good guesses about their actions. This helps ensure safety on the road.

Imagine self-driving cars having eyes all around, like a 360-degree view. This lets them see everything happening. They use this info with deep learning. The car uses clever techniques to predict what other drivers might do. It’s similar to playing a game where you plan ahead to do well.

<img alt="Prediction" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/Prediction.jpg/w=838" data- decoding="async" height="419" src="data:image/svg xml,” width=”838″>
Prediction using Deep Learning

The sensors in a self-driving car act as its eyes. They feed the models that classify what objects are, detect where those objects sit around the car, localize the car itself, and segment where one thing ends and another begins. Together, this tells the car what is nearby so it can make smart choices.

During training, deep learning algorithms model complex information from camera images and from the point-cloud data produced by LiDAR and RADAR. During actual driving (inference), the same models help the car prepare for possible moves, including braking, halting, slowing down, changing lanes, and more.

Deep learning acts as a smart helper for the car: it lets the car reason about things it is unsure of, refine its estimate of where it is, and drive better as a result, keeping the ride safe and smooth.

The tricky part, though, is deciding on the best action out of several candidates. Choosing the right move requires careful reasoning so the car can drive well and stay safe.

Decision Making

Self-driving cars have to make important choices in tricky situations, but it’s not easy. This is because sensors might not always be correct, and people on the road can do unexpected things. The car must guess what others will do and move to avoid crashes.

To make choices, the car needs lots of info. The car gathers this information using sensors and then uses deep learning algorithms to understand where things are and predict what might happen. Localization helps the car know its initial position, while prediction generates multiple possible actions based on the environment.

However, the question remains: how does the car choose the best action among the many predicted ones?

<img alt="Decision-Making" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/Decision-Making.png/w=840" data- decoding="async" height="475" src="data:image/svg xml,” width=”840″>
Source: semanticscholar.org

Deep Reinforcement Learning (DRL) is one technique for making these decisions, and it builds on a mathematical framework called the Markov Decision Process (MDP). MDPs help model how road users might act in the future. The more moving agents there are, the more complicated this becomes, because the self-driving car has to consider many more possible actions.

To address the challenge of finding the best move for the car, the deep learning model is optimized using Bayesian optimization. In some cases, a framework combining a Hidden Markov Model and Bayesian Optimization is employed for decision-making, enabling the self-driving car to navigate effectively and safely in various complex scenarios.

<img alt="112-new" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/112-new.png/w=595" data- decoding="async" height="833" src="data:image/svg xml,” width=”595″>
Source: arxiv.org

Decision-making in self-driving cars follows a hierarchical process with four key components:

Path or Route Planning: At the beginning of the journey, the car determines the best route from its current position to the desired destination. The goal is to find an optimal solution among various possible routes.

Behavior Arbitration: After planning the route, the car must steer through it. It knows about static features like roads and intersections, but it cannot foresee the exact actions of other drivers. To handle this uncertainty, smart methods such as Markov Decision Processes (MDPs) are used for planning.

<img alt="111-new" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/111-new.png/w=1431" data- decoding="async" height="926" src="data:image/svg xml,” width=”1431″>
Scenario decision of top state machine

Motion Planning: With the route planned and the behavior layer determining how to navigate it, the motion planning system coordinates the car’s movements. This means making sure the car moves in a way that’s both safe and comfortable for the people inside. It thinks about things like how fast it goes, changing lanes, and what’s around it.

Vehicle Control: The final step is vehicle control, which executes the reference path generated by the motion planning system, ensuring the car follows the intended trajectory smoothly and safely.

By breaking down decision-making into these different parts, self-driving cars can drive well and safely in complicated places. This makes sure passengers have a smooth and comfortable ride.

Convolutional Neural Networks

Convolutional neural networks (CNNs) are widely used in self-driving cars due to their ability to model spatial information, particularly images. CNNs excel at extracting features from images, making them helpful for figuring out lots of different things.

In a CNN, as the network’s depth increases, different layers capture varying patterns. Early layers detect simple features like edges, while deeper layers recognize more complex ones, such as object shapes (like leaves on trees or tires on vehicles). This adaptability is why CNNs are a central algorithm in self-driving cars.

The core component of a CNN is the convolutional layer, which utilizes a convolutional kernel (filter matrix) to process local regions of the input image.

<img alt="filter-matrix" data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/filter-matrix.png/w=350" decoding="async" height="122" src="data:image/svg xml,” width=”350″>

The filter matrix is updated during training to learn meaningful weights. A fundamental property of CNNs is weight sharing: the same filter weights are applied across every location of the image, which saves parameters and memory while still allowing diverse feature representations to be learned.

The output of the convolutional layer is usually passed through a nonlinear activation function such as Sigmoid, Tanh, or ReLU; ReLU is often preferred because it converges faster than the others. The result then typically goes through a max-pooling layer, which downsamples the feature map while keeping the strongest responses, so the most important details in the image are preserved.
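To make the convolution, ReLU, and max-pooling pattern concrete, here is a minimal PyTorch sketch. The layer sizes, input resolution, and number of classes are arbitrary illustrative choices, not values from any production self-driving model.

```python
# Minimal PyTorch sketch of the conv -> ReLU -> max-pool pattern described above.
import torch
import torch.nn as nn

class TinyPerceptionCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # learned filter matrices
            nn.ReLU(),                                    # nonlinear activation
            nn.MaxPool2d(2),                              # keep the strongest responses
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # deeper layer: more complex patterns
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)             # early layers find edges, deeper layers find shapes
        return self.classifier(x.flatten(1))

# A batch of four 64x64 RGB crops (e.g. candidate traffic-sign regions).
logits = TinyPerceptionCNN()(torch.randn(4, 3, 64, 64))
print(logits.shape)  # torch.Size([4, 10])
```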

<img alt="activation-function" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/activation-function.png/w=470" data- decoding="async" height="270" src="data:image/svg xml,” width=”470″>

Three essential properties of CNNs make them versatile and fundamental in self-driving cars:

  • Local Receptive Fields
  • Shared Weights
  • Spatial Subsampling

These properties reduce overfitting and let the network capture the critical representations and features needed for image classification, segmentation, localization, and more.

Here are two CNN networks used by companies pioneering self-driving cars:

  • HydraNet by Tesla
  • ChauffeurNet by Google Waymo

Learn more about Convolutional Neural Networks.

#1. HydraNet by Tesla

HydraNet is a dynamic architecture introduced by Mullapudi et al. in 2018. Its key objective is to improve computational efficiency during inference, and the idea has since been adapted for perception tasks such as semantic segmentation in self-driving cars.

The concept of HydraNet involves having different CNN networks, called branches, assigned to specific tasks. Each branch receives various inputs, and the network can selectively choose which branches to run during inference, ultimately aggregating the outputs from different branches to make a final decision.

<img alt="Screenshot-2023-08-12-135312" data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/Screenshot-2023-08-12-135312.png/w=840" decoding="async" height="429" src="data:image/svg xml,” width=”840″>
Tesla’s Hydranet

In the context of self-driving cars, inputs can represent different aspects of the environment, such as static objects (trees and road railings), roads and lanes, traffic lights, etc. These inputs are trained in separate branches. During inference, the gate mechanism decides which branches to activate, and the combiner collects their outputs to make the final decision.

<img alt="what-tesla-see" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/what-tesla-see.jpg/w=1500,h=739" data- decoding="async" height="739" src="data:image/svg xml,” width=”1500″>
Speed, Lane, and Movement Detection

Tesla has adapted the HydraNet architecture, incorporating a shared backbone to address challenges in segregating data for individual tasks during inference. The shared backbone, usually modified ResNet-50 blocks, allows the network to be trained on all objects’ data. Task-specific heads based on semantic segmentation architecture, like the U-Net, enable the model to predict outputs specific to each task.
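As a rough sketch of this shared-backbone-plus-heads pattern, here is a toy PyTorch model. It only illustrates the structure described above; the backbone, head sizes, and task names are invented for the example and are not Tesla’s actual HydraNet code.

```python
# Toy sketch of a shared backbone feeding several task-specific heads.
import torch
import torch.nn as nn

class SharedBackboneNet(nn.Module):
    def __init__(self):
        super().__init__()
        # One backbone is trained on data from all tasks (stand-in for ResNet-style blocks).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Each head predicts outputs for one task from the shared features.
        self.heads = nn.ModuleDict({
            "lanes": nn.Linear(64, 4),           # e.g. lane-boundary parameters
            "traffic_lights": nn.Linear(64, 3),  # e.g. red / yellow / green scores
            "objects": nn.Linear(64, 10),        # e.g. object-class scores
        })

    def forward(self, x: torch.Tensor):
        shared = self.backbone(x)
        return {task: head(shared) for task, head in self.heads.items()}

outputs = SharedBackboneNet()(torch.randn(2, 3, 128, 128))
print({task: tuple(out.shape) for task, out in outputs.items()})
```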

<img alt="YouTube video" data-pin-nopin="true" data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/maxresdefault.jpg64eca65ddd088.jpg" height="720" nopin="nopin" src="data:image/svg xml,” width=”1280″>

Tesla’s HydraNet stands out for its ability to project a bird’s-eye view, creating a 3D representation of the environment from any angle. This enhanced dimensionality aids the car in navigation. Remarkably, Tesla achieves this without LiDAR sensors: it relies on cameras and radar alone, processing the feeds from eight cameras to generate depth perception without any additional LiDAR hardware.

#2. ChauffeurNet by Google Waymo

ChauffeurNet is an RNN-based neural network used by Google Waymo for training self-driving cars using imitation learning. While it primarily relies on an RNN for generating driving trajectories, it also incorporates a CNN component known as FeatureNet.

This convolutional feature network extracts contextual feature representations from the output of the perception system, and those representations are shared with the other networks in the model.

<img alt="Training-the-driving-model-a-The-core-ChauffeurNet-model-with-a-FeatureNet-and-an" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/Training-the-driving-model-a-The-core-ChauffeurNet-model-with-a-FeatureNet-and-an.png/w=850" data- decoding="async" height="461" src="data:image/svg xml,” width=”850″>
Source: Researchgate

The concept behind ChauffeurNet is to train the self-driving car by imitating expert drivers using imitation learning. To overcome the limitation of insufficient real-world training data, the authors of the paper “ChauffeurNet: Learning to Drive by Imitating the Best and Synthesizing the Worst” introduced synthetic data.

This synthetic data introduces various deviations, such as perturbing the trajectory path, adding obstacles, and creating unnatural scenes. Training the car with synthetic data was found to be more efficient than using only real data.

In ChauffeurNet, the perception system is not part of the end-to-end process but acts as a mid-level system. This allows the network to have various input variations from the perception system. The network observes a mid-level representation of the scene from the sensors, and using this input along with synthetic data, it imitates expert driving behavior.

By factoring out the perception task and creating a high-level bird’s eye view of the environment, ChauffeurNet facilitates easier transfer learning, enabling the network to make better decisions based on both real and simulated data. The network generates driving trajectories by iteratively predicting successive points in the driving path based on the mid-level representations. This approach has shown promise in training self-driving cars more effectively, providing a path toward safer and more reliable autonomous driving systems.
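The iterative rollout described above can be sketched with a toy recurrent model: at each step, the network takes a feature vector describing the scene plus the last predicted waypoint and emits the next waypoint. This only illustrates the rollout loop, with made-up sizes; it is not Waymo’s ChauffeurNet code.

```python
# Toy sketch of iterative trajectory generation with a recurrent cell.
import torch
import torch.nn as nn

class WaypointPredictor(nn.Module):
    def __init__(self, scene_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.cell = nn.GRUCell(input_size=scene_dim + 2, hidden_size=hidden_dim)
        self.to_xy = nn.Linear(hidden_dim, 2)  # next (x, y) waypoint

    def rollout(self, scene_features: torch.Tensor, start_xy: torch.Tensor, steps: int = 10):
        hidden = torch.zeros(scene_features.size(0), self.hidden_dim)
        xy, path = start_xy, []
        for _ in range(steps):
            hidden = self.cell(torch.cat([scene_features, xy], dim=1), hidden)
            xy = self.to_xy(hidden)          # each prediction is fed back in at the next step
            path.append(xy)
        return torch.stack(path, dim=1)      # shape: (batch, steps, 2)

trajectory = WaypointPredictor().rollout(torch.randn(1, 64), torch.zeros(1, 2))
print(trajectory.shape)  # torch.Size([1, 10, 2])
```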

#3. Partially Observable Markov Decision Process used for self-driving cars

Partially Observable Markov Decision Process (POMDP) is a mathematical framework used in the context of self-driving cars to make decisions under uncertainty. In real-world scenarios, self-driving cars often have limited information about their environment due to sensor noise, occlusions, or imperfect perception systems. POMDP is designed to handle such partial observability and make optimal decisions by considering both uncertainty and available observations.

In a POMDP, the decision-making agent operates in an environment with partially observable states. The agent takes action, and the environment transitions to new states probabilistically. However, the agent only receives partial observations or noisy information about the true state of the environment. The objective is to find a policy that maximizes the expected cumulative reward over time while considering the uncertainty in the environment and the agent’s observations.

<img alt="Graphical-representation-of-POMDPs" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/Graphical-representation-of-POMDPs.png/w=745" data- decoding="async" height="401" src="data:image/svg xml,” width=”745″>
Source: Researchgate 

In the context of self-driving cars, POMDP is particularly useful for tasks such as motion planning, trajectory prediction, and interaction with other road users. The self-driving car can use POMDP to make decisions about lane changes, speed adjustments, and interactions with pedestrians and other vehicles, considering the uncertainty in the surrounding environment.

A POMDP has six components and can be written as the tuple

M := (I, S, A, R, P, γ)

where:

  • I: set of observations
  • S: finite set of states
  • A: finite set of actions
  • R: reward function
  • P: transition probability function
  • γ: discount factor for future rewards

POMDPs can be computationally challenging due to the need to consider multiple possible states and observations. However, advanced algorithms, such as belief space planning and Monte Carlo methods, are often employed to efficiently approximate the optimal policy and enable real-time decision-making in self-driving cars.
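At the heart of these planners is the belief: a probability distribution over the states S that gets updated after every action and observation, using the transition probabilities P and an observation model. The sketch below shows that update for an invented two-state example (lane clear vs. lane blocked); all the probabilities are illustrative only.

```python
# Sketch of a discrete POMDP belief update after one action and one observation.
import numpy as np

states = ["clear", "blocked"]
belief = np.array([0.5, 0.5])          # initial uncertainty over the hidden state

# P[s, s']: transition probabilities for one chosen action (e.g. "keep lane").
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
# O[s', o]: probability of each observation given the new state
# (columns: the sensor reports "clear", the sensor reports "blocked").
O = np.array([[0.8, 0.2],
              [0.3, 0.7]])

def update_belief(belief, observation_index):
    predicted = belief @ P                          # where the world has probably moved
    weighted = predicted * O[:, observation_index]  # weight by how likely the observation is
    return weighted / weighted.sum()                # renormalize to a distribution

belief = update_belief(belief, observation_index=1)  # the sensor reports "blocked"
print(dict(zip(states, belief.round(2))))            # e.g. {'clear': 0.26, 'blocked': 0.74}
```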

By incorporating POMDP into their decision-making algorithms, self-driving cars can navigate complex and uncertain environments more effectively and safely, considering the uncertainty in sensor readings and making informed decisions to achieve their intended goals.

#4. Deep Reinforcement Learning (DRL) in self-driving cars

In reinforcement learning (RL), a type of machine learning, the self-driving car acts as an agent that learns by interacting with its environment. Three variables sit at the core of Deep Reinforcement Learning (DRL): state, action, and reward.

State: Describes the current situation of the self-driving car at a given time, such as its position on the road.

Action: Represents all the possible moves that the car can make, including decisions like lane changes or speed adjustments.

Reward: Provides feedback to the car whenever it takes a particular action. The reward can be positive or negative, and the goal of DRL is to maximize the cumulative rewards.

Unlike supervised learning, where the algorithm is explicitly given the correct actions, DRL learns by exploring the environment and receiving rewards based on its actions. The self-driving car’s neural network is trained on perception data, which includes features extracted by convolutional neural networks (CNNs).

DRL algorithms are then trained on these representations, which are lower-dimensional transformations of the input, resulting in more efficient decision-making during inference.
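To show what learning from rewards means in code, here is a minimal tabular Q-learning sketch. The tiny "approach an intersection" environment, its states, and its rewards are all invented for illustration and are far simpler than anything a real self-driving stack (which would use deep networks rather than a table) would train on.

```python
# Minimal tabular Q-learning sketch illustrating state, action, and reward.
import numpy as np

n_states, n_actions = 5, 2             # positions approaching an intersection; 0 = brake, 1 = keep speed
Q = np.zeros((n_states, n_actions))    # learned value of each action in each state
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount factor, exploration rate
rng = np.random.default_rng(0)

def step(state, action):
    """Toy dynamics: braking right before the intersection is rewarded, not braking is penalised."""
    next_state = min(state + 1, n_states - 1)
    if next_state == n_states - 1:                       # arriving at the intersection
        return next_state, (1.0 if action == 0 else -1.0), True
    return next_state, 0.0, False

for _ in range(2000):                                    # training episodes
    state, done = 0, False
    while not done:
        explore = rng.random() < epsilon
        action = int(rng.integers(n_actions)) if explore else int(Q[state].argmax())
        next_state, reward, done = step(state, action)
        # Q-learning update: nudge Q towards reward + discounted best future value.
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

best = "brake" if Q[n_states - 2].argmax() == 0 else "keep speed"
print("learned action just before the intersection:", best)
```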

Training self-driving cars in real-world scenarios is dangerous and impractical. Instead, they are trained in simulators, where there is no risk to human safety.

<img alt="375077-PBZP7X-450" data- data-src="https://kirelos.com/wp-content/uploads/2023/08/echo/375077-PBZP7X-450-edited.jpg/w=600" data- decoding="async" height="600" src="data:image/svg xml,” width=”600″>
Simulator

Some widely used open-source simulators include CARLA and AirSim.

By combining perception data with reinforcement learning, self-driving cars can learn to navigate complex environments, make safe and optimal decisions, and become more adept at handling real-world driving scenarios.

FAQs

What are autonomous vehicles?

Autonomous vehicles, commonly referred to as self-driving cars, are automobiles with cutting-edge sensors and artificial intelligence that can navigate and drive on their own. These vehicles assess their environment and make driving judgments using cameras, LiDAR, RADAR, and sophisticated algorithms.

Are autonomous vehicles safe?

In the development of self-driving automobiles, safety comes first. These vehicles undergo thorough testing and simulation to ensure they meet strict safety standards. Although there have been incidents during testing, the ultimate objective is to make self-driving cars safer than human-driven ones.

Can autonomous vehicles be used in any weather?

Extreme weather, including heavy rain or snow, could present problems for self-driving automobiles. Unfavorable weather might reduce the accuracy of sensors and impair driving ability. Engineers are constantly trying to make the technology function better in adverse weather conditions.

Are autonomous vehicles legal?

The legality of autonomous vehicles varies by nation and location. To accommodate autonomous vehicles, many jurisdictions are revising their laws and regulations. Self-driving car testing and limited deployment are already permitted in some areas.

Do autonomous vehicles need human intervention?

Most currently available self-driving cars are at level 2 or level 3 automation, where they might need human assistance sometimes. The industry, however, strives to reach higher levels of automation, such as level 4 or level 5, where human intervention becomes minimal or unnecessary.

Conclusion

In conclusion, self-driving cars have the ability to transform the auto industry by enhancing road efficiency and safety. We looked into all the essential components that support these autonomous cars, including LiDAR, RADAR, cameras, and advanced algorithms.

While progress has been promising, there are still important challenges to tackle. Presently, most self-driving cars operate at Level 2 of the five levels of driving automation and still require human intervention in certain scenarios. However, through continued dedication and innovation, we are inching closer to full autonomy.

Key Takeaways

Advancing Algorithms: Further optimization of algorithms is crucial to enhance road perception, especially in challenging conditions where road markings and signs may be lacking.

Refining Sensing Modality: Improving the accuracy and efficiency of sensing modalities for mapping and localization will be instrumental in achieving higher levels of autonomy.

Vehicle-to-Vehicle Communication: Pursuing vehicle-to-vehicle communication will make a connected, intelligent road ecosystem possible.

Human-Machine Interaction: Encouraging public acceptability of self-driving technologies will require examining and addressing issues related to human-machine interaction.

Future Prospects: Despite the difficulties, the achievements so far are remarkable, and with ongoing cooperation and research, self-driving cars offer the potential to provide a safer and more effective transportation environment for everybody.

We are all on the same journey towards completely autonomous self-driving cars. As we solve challenges and spur innovation, we get closer to a time when vehicles smoothly manage our roadways, improving safety, the environment, and convenience for everyone.

You may now learn about spatial computing and its application in developing self-driving cars.