Is Tesla Full Self-Driving for real?

In this imaginary world:

  • Marge shows off her self-driving Model Y

  • George Jetson treks in his flying car

  • A flying car, really?

What is all the Hype?

Electric cars that drive themselves? Humanoid robots for the masses? Solar-powered grids? It’s 2023, not 2123. What’s all this crazy talk? These things should take another fifty to a hundred years, right?

So why all the hyperbole now?

The sustainable industries in energy, transportation, and labor have reached critical mass. Tesla is at the forefront of the technology boom across all these sector disruptions. Elon Musk’s predictions about these sectors are all coming true. The process has been slow and methodical, so most folks paid little attention. Now these sectors are all being disrupted at the same time, at an unprecedented pace of growth.

1. Energy Sector: Megapack energy-grid support. This is big business, bigger than the automotive sector.

2. Transportation Sector: Full Self-Driving Technology. FSD is the hardest problem to solve, apparently even harder than reusable rockets. It took many years, but the FSD team has cracked it.

3. Labor Sector: Teslabot. A task-oriented working robot. It doesn’t talk yet, but it looks around, figures out the area, and does what it’s told. Nice.

Nobody saw these emerging sector booms coming. Mastering artificial intelligence, electric cars, robotic workers, and energy storage all at the same time is unprecedented. Even more impressive is advancing despite post-pandemic inflation and interest-rate hikes.

Advanced computers and software have become available to automate incredibly fast, and all this new tech has come to fruition at the same time. What was science fiction is now science fact. Artificial intelligence has bridged the gap between imagination and reality.

Full self-driving has the Tesla fan base spooled up. The future value of FSD is staggering. The maturation of this business model is on the visible horizon, and those in the know can foresee the physical and economic results. There will be a revenue stream from a self-driving fleet, and the capability can be licensed to other vehicle makers as another revenue stream. FSD may also kick off the first real-world artificial general intelligence. These are very exciting times for Tesla fans.

Are self-driving cars even possible?

The acid test of artificial intelligence is this: can the software emulate what a human can do? We hope so. But humans make a lot of mistakes that make driving on the road dangerous. Version 12 of the software is expected to be at least ten times better than a human, hopefully a hundred times better. When this goes to scale, probably in 2024, car transportation will be safer and cleaner than ever before.

Vision Perception Software Makes Self-Driving Possible

When we drive, our eyes visually perceive the world. Our brain signals our nervous system to perform operations like steering, acceleration, and braking in response to perceived conditions. We notice and remember road conditions, route information, signage, and opposing traffic or other objects along the way.

The way the mind works is amazing. The first time on a route, we may be stressed. After we know the route, however, we may drive without thinking about it at all. We remember points of interest. Our eyes perceive a lot of data, so the brain has to act as a filter to capture what is important.

To simulate the driving experience, Tesla records the videos and feeds them into the driving simulator. The neural-net algorithms pick up on the edge cases to teach the driving software how to drive better.

The Onboard Computer Looks Ahead

As robotics scientists know, the simplest tasks for us can be very hard for a machine. For example, one thing we do is anticipate: we have the ability to see in our mind’s eye what we expect to happen. Call it instinct. Anticipating what is coming up is one task required for full self-driving. The Tesla software development team has trained the system to look ahead, predicting a future state several seconds out. That’s what we do without realizing it.
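As a toy illustration of that kind of look-ahead, here is a constant-velocity extrapolation; the real planner is of course far more sophisticated, and the numbers below are invented:

```python
def predict_position(x, y, vx, vy, seconds):
    """Extrapolate an object's position a few seconds out,
    assuming constant velocity (a deliberately simple model)."""
    return (x + vx * seconds, y + vy * seconds)

# A vehicle 10 m ahead, drifting sideways at 5 m/s: where is it in 3 s?
future = predict_position(10.0, 0.0, 0.0, 5.0, 3.0)
print(future)  # (10.0, 15.0)
```

Real prediction models also account for acceleration, road geometry, and driver intent, but the principle of projecting a future state forward in time is the same.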

Road Object Perception

Another thing we do is recognize physical objects. We immediately differentiate moving cars from stationary trees, buildings, and other things that don’t move.

We perceive weather conditions and debris floating on the highway. The software uses a diffusion process to bring objects into focus. Then the objects are categorized and tracked for automated driving responses.

A car is a fairly big object. It has wheels, windows, and a consistent shape and color. It moves in a predictable way, so the software can predict many future states of any vehicle based on its familiar characteristics. Cars behave predictably: they have inertia, meaning their mass keeps them moving in a predictable direction and at a predictable speed. It looks like a car and acts like one, so it’s a car.

The software needs to know how to react to objects obstructing the driving path. It needs to predict whether a car has run a stop sign and is destined to intercept your path. In that case, it applies the brakes to avoid an accident, then proceeds when the route is clear.
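A minimal sketch of that intercept check, assuming constant velocities and entirely made-up thresholds and coordinates:

```python
def paths_intersect(ego_pos, ego_vel, other_pos, other_vel,
                    horizon=3.0, threshold=2.0, dt=0.1):
    """Step both trajectories forward in time and flag a conflict if
    they come within `threshold` meters inside the look-ahead horizon.
    Constant-velocity assumption; illustrative only."""
    ex, ey = ego_pos
    ox, oy = other_pos
    t = 0.0
    while t <= horizon:
        dx = (ex + ego_vel[0] * t) - (ox + other_vel[0] * t)
        dy = (ey + ego_vel[1] * t) - (oy + other_vel[1] * t)
        if (dx * dx + dy * dy) ** 0.5 < threshold:
            return True  # predicted conflict: apply the brakes
        t += dt
    return False

# A car running a stop sign from the right, crossing our lane:
print(paths_intersect((0, 0), (0, 10), (20, 30), (-7, 0)))  # True
```

When the function returns True, a planner would brake and re-evaluate; when the predicted paths stay clear, the car proceeds.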

When on the road, there is a lot of traffic and many things moving around. Each object must be analyzed for trajectory and physical threat. The objects are classified by physical category, speed, and motion. So if there are loose papers fluttering about, object detection knows they are non-threatening.
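As a toy illustration of that kind of triage (the category names and the speed threshold here are invented for the example, not Tesla's actual taxonomy):

```python
# Hypothetical object classes that warrant a driving response vs. not.
THREATENING = {"car", "truck", "pedestrian", "cyclist", "heavy_debris"}
IGNORABLE = {"paper", "leaves", "plastic_bag", "smoke"}

def requires_response(category, speed_mps):
    """Crude triage: ignorable light debris never triggers a response;
    known threats always do; fast unknown objects get caution."""
    if category in IGNORABLE:
        return False
    if category in THREATENING:
        return True
    return speed_mps > 1.0  # unknown but moving: be cautious

print(requires_response("plastic_bag", 4.0))  # False
print(requires_response("car", 0.0))          # True
```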

The software learns that certain structures have characteristic features, so it has learned to categorize objects, not unlike how we do. We already talked about how to identify a car. The large driving model remembers the physical characteristics of things in order to predict what to expect. It knows “there was a building there” and reconstructs it in a millisecond. It categorizes a building as stationary, non-threatening background and displays it using 3D software.

Recalling objects of interest along a route is important for predicting how to operate the car. In a condition such as heavy rain or smoke from a forest fire, we drivers sort of know the road ahead but drive with caution. A self-driving car has to do the same thing: if visibility is poor, the FSD car must recall the road ahead and adjust its driving to match the conditions. There is a vast number of these edge situations that the neural net learns to handle.

Vision Intelligence

The vision intelligence works this way:

  • The 2-dimensional video frames are converted into 3-dimensional objects.
  • Each frame of the camera’s video is a snapshot of the surroundings. The snapshots are compared to get a sense of the speed and direction of the self-driving car and other things on the road.
  • A 2-dimensional picture on the computer has tiny dots called pixels. The dots are so small you can’t make them out.
  • 3D objects are represented by voxels. You can tell the cube in this picture has dimensions by its shading. Even the tiny cubes behave the same as the large cube.
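The frame-to-frame comparison in the second bullet can be sketched very simply; the coordinates and frame rate below are illustrative:

```python
FPS = 30  # camera frame rate, frames per second

def apparent_velocity(pos_prev, pos_curr, fps=FPS):
    """Compare an object's pixel position in two successive frames to
    estimate how fast it moves across the image, in pixels per second."""
    dx = pos_curr[0] - pos_prev[0]
    dy = pos_curr[1] - pos_prev[1]
    return (dx * fps, dy * fps)

# An object that shifted 3 pixels to the right between frames:
print(apparent_velocity((100, 50), (103, 50)))  # (90, 0) px/s
```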

A voxel is a cube with a 3-dimensional perspective, meaning you can view it from any number of angles. In the gif above, a shoebox-size voxel is broken up into centimeter-size voxels, an easy job for a computer.
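Subdividing a box into centimeter cells really is trivial for a computer; a sketch of the counting:

```python
def subdivide(size_m, cell_m=0.01):
    """Split a cube of side `size_m` meters into cells of side `cell_m`
    meters and return how many small voxels result."""
    per_edge = round(size_m / cell_m)  # round to dodge float error
    return per_edge ** 3

# A 30 cm "shoebox" voxel at 1 cm resolution:
print(subdivide(0.30))  # 27000 small voxels
```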

So, for instance, take a car made up of voxels. These voxels make up a structure that behaves in a predictable way. Unlike the cube above that breaks apart, objects stay intact, and the entire structure keeps its perspective as seen from various angles and distances.

Three Dimension Modeling

Think of a voxel as a large molecule. Anything can be made to appear like a real thing when it is represented by tiny color-coded boxes, just as your computer screen is made up of tiny pixels. The difference is that three-dimensional objects can be sized by their distance and shaded by the lighting. So the computer can recall a scene by remembering where objects were and then pasting their likenesses with the proper size and shading.
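The "sized by distance" idea is basic pinhole-camera projection. A sketch, where the focal length is an arbitrary made-up constant:

```python
def apparent_size(real_size_m, distance_m, focal_px=1000):
    """Pinhole projection: on-screen size shrinks in proportion to
    distance. The focal length here is illustrative, not a real camera's."""
    return focal_px * real_size_m / distance_m

# A 1.5 m-tall car at 10 m versus 40 m:
print(apparent_size(1.5, 10))  # 150.0 px
print(apparent_size(1.5, 40))  # 37.5 px: four times farther, four times smaller
```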

Look at something, then close your eyes: your mind’s eye can reimagine it. The cameras on a Tesla capture impressions of objects so the system can recall the trip, not unlike our memories. We remember points of interest and only what matters for controlling the driving experience.

In 3D animation, you can spin and rotate objects, with shading and lighting coming from any perspective. You can move an object further into the background if you want, make it smaller, and occlude it by passing things in front of it.

What the Tesla team did was use neural nets, software designed to learn how to convert 2D images into 3D structures. These generated structures are objects that can be sized up and viewed from various angles, not unlike when we are driving and see objects from different angles.

At the same time, things in the foreground pass by quicker than the objects behind them; that is how our minds calculate distance. Objects further in the distance appear smaller than things closer. This is what Leonardo da Vinci demonstrated in the Mona Lisa.
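That depth cue, motion parallax, can be sketched in a few lines; the camera constants and speeds below are made up for the example:

```python
def depth_from_parallax(flow_px_per_s, camera_speed_mps, focal_px=1000):
    """Motion parallax: for a sideways-moving camera, an object's apparent
    drift across the image is inversely proportional to its distance.
    All constants here are illustrative."""
    return focal_px * camera_speed_mps / flow_px_per_s

# At 10 m/s, a nearby fence post streaks at 1000 px/s, a distant barn at 100 px/s:
print(depth_from_parallax(1000, 10))  # 10.0 m (close)
print(depth_from_parallax(100, 10))   # 100.0 m (far)
```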

So it is not necessarily Lisa’s smile that is so important: if you pay attention to the background, lighting, and shading, you get a sense of three dimensions. By the shading on her neck, the sun appears to be at eleven o’clock. This 3D effect was advanced for its time.

FSD software is doing the same thing. The software keeps objects in front moving quicker than the objects behind them, and the objects behind are occluded, or covered, by the objects in front. 3D software has been around for a long time, but this software is able to reimagine the trip to predict what driving behavior to expect. We do the same thing: after we have driven a route enough times, we instinctively know how to operate the car to get where we are going. We call it “autopilot.”

A Hybrid Software Development Approach

Driving lanes are tricky for self-driving software, so the team had to come up with something besides object recognition. What they came up with is mapped lane lines: the lines are stored in a permanent map, so the approach is a combination of route-map planning and object recognition. Route planning and lane-change software is nothing new, but combining it with object recognition is a next-level automated pilot.
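A minimal sketch of how a stored lane map and live perception might be combined; the map contents, segment IDs, and lane offsets here are invented for illustration:

```python
# Hypothetical stored lane map: lane-center x-offsets (meters) per road segment.
LANE_MAP = {"segment_42": [-3.5, 0.0, 3.5]}

def open_lanes(segment_id, blocked_lanes):
    """Hybrid approach: look up the mapped lanes for this segment, then
    let live object recognition veto any lane perception says is blocked."""
    lanes = LANE_MAP.get(segment_id, [])
    return [x for x in lanes if x not in blocked_lanes]

# Perception reports a stalled car in the center lane:
print(open_lanes("segment_42", {0.0}))  # [-3.5, 3.5]
```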

With integrated radar to see through haze and past other cars, the self-driving software provides superhuman driving capability, allowing the car to drive better than humans.

Putting Dojo Supercomputer to Work

In self-driving software, there are many driving situations that need to be analyzed and solved. The neural-net self-learning software is busy going over millions of hours of real-world driving video at 30 frames per second. A huge amount of compute is needed to absorb the task.

Dojo uses GPU-style chips designed especially for processing video data. This supercomputer is destined to be the largest supercomputer in the world. This level of compute is needed to process the billions of miles of data logged over the years.
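Back-of-the-envelope arithmetic shows why so much compute is needed; the one-million-hour figure is just an assumption chosen for scale:

```python
# Rough scale of the training data, assuming (for illustration only)
# one million hours of driving video at 30 frames per second:
hours = 1_000_000
fps = 30
frames = hours * 3600 * fps  # seconds per hour * frames per second
print(f"{frames:,} frames")  # 108,000,000,000 frames
```

Over a hundred billion frames to analyze, each one a full image, is the kind of workload that justifies a purpose-built supercomputer.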

So when the self-driving car takes off, not only does it follow the route map, but a three-dimensional algorithm also interpolates what is ahead. The car is trained on what to do with the route information and how to react to driving obstacles. The really cool part is that the neural-net algorithms are self-training: video input to self-driving output. Incredible . . .

In conclusion

The “intelligence” part of the software is the ability to plan ahead and anticipate, just like people do. It recognizes what objects are and how they physically behave, just like we do. The difference between Tesla’s software and other self-driving software is that it sees and comprehends visually, as we do. FSD truly has driving visual intelligence: no parlor tricks.
