Emilio Frazzoli, CTO, nuTonomy - MIT Self-Driving Cars | Lex Fridman Transcript

Polished transcript · Lex Fridman · 9 Mar 2018 · 1h 7m · @martymcfly

Emilio Frazzoli, CTO of nuTonomy, lectures at MIT on autonomous vehicles and the future of urban mobility

Emilio Frazzoli delivers a lecture at MIT on the technology, business case, and philosophical challenges of self-driving vehicles.

Summary

Emilio Frazzoli, CTO of nuTonomy and former MIT faculty member, delivers a lecture arguing that the true transformative value of autonomous vehicles lies not in safety improvements alone, but in enabling genuinely convenient, affordable car-sharing as a service — a model he estimates could unlock trillions of dollars in societal value. He contends that the industry's sequential automation levels (Levels 0–5) are a dangerous framework, because Levels 2 and 3 — which require human supervision of automation — violate human nature and introduce new failure modes, as the airline industry learned painfully. Frazzoli argues that full Level 4/5 automation deployed as a fleet service, not a consumer product, is the only path that captures the real value of the technology. He presents nuTonomy's approach to autonomous driving policy using formal methods and hierarchical rule compliance rather than pure machine learning or hand-coded logic, and closes by arguing that the deepest unsolved challenge in the field is that humanity does not yet have a rigorous, mathematically precise theory of how vehicles — including human-driven ones — ought to behave.

Key Takeaways

  • The biggest value of autonomous vehicles is not safety but car-sharing. Frazzoli's back-of-the-envelope calculation shows that the value of time returned to drivers (~$1.2 trillion/year) and the economic benefits of functional car-sharing (~$2,000/person/year) together dwarf the value of safety improvements, reframing the entire case for the technology.
  • Levels 2 and 3 automation are dangerous by design. Requiring humans to supervise automation and intervene on demand contradicts human nature and mirrors the failure modes introduced by autopilots in aviation — an industry that had to painfully relearn how to manage automation with highly trained professionals, let alone everyday drivers.
  • Autonomous vehicles as a service, not a consumer product, is the viable near-term path. As a service, the cost of sensors, HD maps, and maintenance is compared against a $100,000/year human driver cost rather than a $20,000 consumer price ceiling — completely changing what technology investments are justifiable.
  • HD maps are a temporary logistical problem, not a fundamental barrier. A fleet of vehicles continuously driving a city generates enough data to build and maintain its own HD maps, making the mapping problem self-solving at scale.
  • Pure machine learning for driving policy has critical unresolved problems. End-to-end learned systems can learn the wrong behaviors (e.g., accelerating on yellow lights), are difficult to explain or debug, and require training data covering every possible combination of rules and traffic scenarios — a practically intractable requirement.
  • nuTonomy's approach uses formal methods and hierarchical rule compliance. Rather than scripting if-then-else logic or relying purely on learning, the system generates large numbers of candidate trajectories and checks each against formally encoded rules, solving a shortest-path problem in a combined physical and logical space — making the system verifiable and debuggable.
  • The rules of the road are not mathematically rigorous. Key concepts like "right of way" have no precise mathematical definition, rules are incomplete and inconsistent, and the "fundamental norm" (don't endanger others who follow the rules) provides no guidance for what to do when others violate the rules — leaving autonomous vehicles without a sound theoretical foundation to build on.
  • The mobility labor paradox makes job displacement fears overstated. If a service like Uber were to serve the entire world's mobility needs, one in seven people on Earth would need to be a driver — which is impossible. The real constraint on mobility services today is a shortage of willing drivers, not an excess of them.
  • The deepest unsolved problem is a theory of vehicle behavior. Frazzoli argues that until society develops a precise, rigorous, and complete theory of how vehicles should behave in every situation — including edge cases and probabilistic uncertainty — designing safe autonomous vehicles will remain fundamentally incomplete.

Full Transcript

    Introduction and Background

    Lex Fridman: Today we have Emilio Frazzoli. He's the CTO of nuTonomy, one of the most successful autonomous vehicle companies in the world. He's the inventor of the RRT* algorithm, formerly a professor at MIT directing a research group that put the first autonomous vehicles on the road in Singapore, and now he returns to MIT to talk with us. Give him a warm welcome.

    Emilio Frazzoli: Thank you, Lex. It's a great opportunity and a great pleasure to be back here. I spent 15 years of my life here at MIT, first as a graduate student and then as a faculty member, and this is where nuTonomy — the company — was essentially born. We did a lot of the research that led us to start this company and eventually develop all this technology.

    What I will talk about today is a little bit about our vision on autonomous vehicles: why we want to have autonomous vehicles, some of the guidelines on the technology development, and why we are doing things in a certain way.

    Why Autonomous Vehicles — The Origin Story

    I really would like to tell you a number of stories about why I started doing this and why I think this is an important technology, and why we ended up starting this company. I've been a faculty member here for 10 years. I was happily working with my UAVs — I was in Aero/Astro — and at some point around 2005 there were these DARPA Grand Challenges that sounded cool, so I started working on cars as well. But the work that I was doing was mostly on airplanes and cars, making them fly and drive by themselves, because it was cool. No hands — it drives and it controls itself. As a roboticist, that's all I needed.

    But then in 2009 there was a new project starting — a team getting together to write a proposal for a project on future urban mobility in Singapore. I got interested in that project just because I wanted to go to Singapore. I called the person who was putting together the team and she said, "Thank you for your interest, but what do you think you bring to the table?" We had just done the DARPA Urban Challenge, so I said, "Well, I know how to make autonomous cars." And she said, "This is a project on future urban mobility — what do cars have to do with urban mobility?"

    That was the five-minute phone call that changed my life. The person asking me this question was Cindy Barnhart, who is now the chancellor, and I had to come up with an excuse. So I said, "Well, imagine you have a smartphone and a smartphone app, and you use this app to call a car. The car comes to you, you get in the car, it drives you wherever you want to go, you step out, and the car goes to pick up somebody else or goes to park." This was 2009, when Uber was Travis Kalanick and a couple of guys with black cars in San Francisco. She bought it, so I joined the team and we started this activity.

    The important thing is that I started thinking about this excuse I had made up in those five minutes, and it actually sounded like a good idea. I started thinking more about why we want to have self-driving vehicles.

    The Economic Case for Autonomous Vehicles

    The number one reason you typically hear is that we want self-driving vehicles to make roads safer. A very large number of people die in road accidents every year — and what people often don't realize is that most of those people are fairly young, in their 20s and 30s. The usual argument, which Sebastian Thrun made in his TED talks, is that most road accidents are due to human error: remove the human, remove the error, save lives. That is typically the number one reason people mention.

    The second reason is convenience — if the car is driving itself, you can do other things: sleep, read, text legally, check your emails.

    Third is improved access to mobility for people who cannot drive, whether due to physical impairment, age, or intoxication.

    Fourth is increased efficiency and throughput in a city, as cars can communicate beyond visual range.

    Fifth is reduced environmental impact.

    Now, these are all fantastic reasons, but the problem is that if you think about them, they are all ways of taking the status quo — how cars are used today — and making it a little bit better, maybe a lot better, but not fundamentally different. And what I was mostly interested in was: can we use this technology to change the way we think of mobility?

    Here is a quick back-of-the-envelope calculation. The economic cost of road accidents in the United States is evaluated at about $300 billion a year. The societal harm — all the pain and suffering — is evaluated at another $600 billion a year. So we're getting to almost $1 trillion. That's a big number.

    But let's look at the other effects. The cost of congestion is estimated at $100 billion a year. The health cost of the extra pollution from congestion is another $50 billion a year. So those are relatively small changes.

    The next effect is actually important: what is the value of the time that everyone in society gets back from not having to drive? A simple calculation — I multiplied one-half the median wage of workers in the United States by the number of hours that Americans spend behind the wheel — and what you get is about $1.2 trillion a year. You may notice that the value to society of getting that time back is actually more than the value of increased safety. Of course it's a little cynical, so take it with a grain of salt, but you start seeing how these things compare.

    And what you may notice is that there is still about half of that pie chart missing. What is the other half? The other half is the value you provide to society by making car-sharing finally something that is convenient, affordable, and reliable.
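    The figures quoted so far can be tallied in a few lines. This is only a sketch of the arithmetic: the dollar amounts are the ones stated in the talk, while the grouping into dictionaries and the variable names are illustrative assumptions.

```python
# Annual US figures quoted in the talk, in billions of dollars.
safety = {"economic cost of accidents": 300, "societal harm": 600}
congestion = {"congestion": 100, "health cost of extra pollution": 50}
time_returned = 1200  # half the median wage times hours spent behind the wheel

safety_total = sum(safety.values())
congestion_total = sum(congestion.values())

print(f"safety:        ${safety_total}B per year")
print(f"congestion:    ${congestion_total}B per year")
print(f"time returned: ${time_returned}B per year")
print("time returned exceeds safety:", time_returned > safety_total)
```

    The comparison in the last line is the point made above: on these numbers, the time given back to drivers is worth more than the safety improvement.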

    The Car-Sharing Opportunity

    Car-sharing is a concept that everybody loves but nobody uses — or not as many people as we would like. When I was here at MIT I really liked using Hubway, the bicycle sharing system, but you have to be very careful: if you wait too long in the afternoon, there are no more bikes on campus. Or maybe you cannot find a parking spot for your bike, so you have to park somewhere else and walk — which defeats the purpose of using the bike.

    The same thing with cars. With car-sharing systems, you either have a two-way system — essentially hourly rental — or you have a one-way system, but in a one-way system the distribution of cars tends to get skewed. Unless the company repositions cars in some clever way, you're not guaranteed to get a car where you need it, and you're not guaranteed to find a parking spot when you don't need the car anymore. These are both friction points for using vehicle sharing, and they are both friction points that are addressed if the car can drive itself.

    If you bring in all the economic benefits of a car-sharing system that actually works, we estimate that to be about $2,000 a year per person. That is a big chunk of the pie chart. And that is using an estimate of what we call the sharing factor of four — meaning that one shared vehicle can essentially substitute for four privately owned vehicles. There are some studies that get the sharing factor up to ten, in which case the benefits are even more. Every time I see a round number like that I get suspicious — ten is a little too convenient to be true — but that is something you can find in the literature.

    This is really where I think the major impact of autonomous driving will come from.

    The Levels of Automation — and Why Levels 2 and 3 Are a Bad Idea

    There is a lot of confusion in the community about what a self-driving car means. I'll list the Society of Automotive Engineers levels of automation. Level zero is no automation — your great-grandfather's car. Level one is driver assistance: cruise control or some simple single-channel automation. Level two is partial automation — something like lane-keeping combined with cruise control — but you still require the driver to pay attention and intervene. Level three is conditional automation: a driver is not required to pay attention all the time but needs to be able to intervene given some notice — and that "sufficient notice" is an ill-defined concept. Level four is high automation: no driver needed under some conditions. Level five is full automation: no driver needed under all conditions.
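    As a compact restatement of the levels just listed, here is a minimal sketch that encodes them as an enumeration and records what each level asks of the human. The class and function names are hypothetical, and the role strings paraphrase the descriptions above rather than quoting the SAE text.

```python
from enum import IntEnum

class SAELevel(IntEnum):
    NO_AUTOMATION = 0
    DRIVER_ASSISTANCE = 1
    PARTIAL_AUTOMATION = 2
    CONDITIONAL_AUTOMATION = 3
    HIGH_AUTOMATION = 4   # no driver needed under some conditions
    FULL_AUTOMATION = 5   # no driver needed under all conditions

def human_role(level):
    """What the human is expected to do at a given level (paraphrased)."""
    if level <= SAELevel.DRIVER_ASSISTANCE:
        return "drives"
    if level <= SAELevel.CONDITIONAL_AUTOMATION:
        return "supervises and must be ready to intervene"
    return "not needed"

for level in SAELevel:
    print(int(level), level.name, "->", human_role(level))
```

    Only Levels 2 and 3 put the human in the supervising role; at Level 4 the "not needed" answer holds only inside the conditions the system is designed for.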

    My first reaction when I started seeing these levels — and there is a similar version from NHTSA — is that they represent a horrible idea. The horrible idea is that because they are given numeric levels, you are led to believe that these are sequential: you do level zero, then level one, then level two, three, four, five. I think this is an enormously bad idea, because level two and level three — anything where you require the human to pay attention, supervise the automation, and be ready to intervene with no notice or with some ambiguously defined notice — go against human nature.

    This is especially painful for me as a former Aeronautics and Astronautics professor, because we saw in the airline industry that as soon as autopilots were being introduced, and everybody thought that accidents would go down, there were actually more accidents. You have new failure modes induced by autopilots. Pilots lose situational awareness and lose the ability to react in an emergency. The airline industry had to essentially educate itself on how to deal with automation in a good way. And pilots are highly trained professionals — which is not the same thing you can say about your everyday driver. The last time most drivers sat with an instructor in a car was when they were 16. How do you train people to use automation technology safely?

    On the other hand, I think that full automation — where the car is essentially able to drive itself and does not rely on a human to take over — is in a sense easier, and is essential to capture the value of the technology.

    Why Full Automation Is the Only Path That Captures Value

    How do you realize the value of self-driving vehicles? On safety: I think it is true that eventually, asymptotically, self-driving cars will be safer than their human-driven counterparts. However, at what point can we be confident that is the case? Are we there yet? Not sure. How do you demonstrate the reliability of these self-driving cars? We know they've driven for three million miles with a relatively small number of accidents — if I remember correctly, only one was their fault. But humans drive many times that without accidents. How do you really make sure that even though the number sounds impressive, it has sufficient statistical significance? And every time you make an update to your software, you have to validate again. Making the case for safety is a very challenging issue, and we may not be positive that self-driving cars are actually safer than human counterparts until a really long time from now.
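    To make the statistical-significance question concrete, one standard approach (not named in the talk, and swapped in here purely as an illustration) is to treat at-fault accidents as a Poisson process and ask how high the true accident rate could be while still being consistent with the observed count. The mileage and the single at-fault accident are the figures recalled above; the Poisson assumption, the 95% confidence level, and the function names are assumptions of this sketch.

```python
import math

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu)."""
    return sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(k + 1))

def upper_bound_events(k, confidence=0.95):
    """Largest expected event count still consistent with seeing only k events."""
    lo, hi = 0.0, 100.0
    for _ in range(100):  # bisection; poisson_cdf decreases as mu grows
        mid = (lo + hi) / 2
        if poisson_cdf(k, mid) > 1 - confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

miles = 3_000_000   # figure recalled in the talk
at_fault = 1        # figure recalled in the talk

bound = upper_bound_events(at_fault)
rate_per_million_miles = bound / (miles / 1_000_000)
print(f"95% upper bound: {bound:.2f} expected accidents over {miles:,} miles")
print(f"that is up to {rate_per_million_miles:.2f} at-fault accidents per million miles")
```

    One observed accident is compatible with a true rate almost five times higher. Claiming the cars are safer than humans would require this upper bound to fall below the human rate, and every software update restarts the count.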

    On getting back the time value of driving: if I have to constantly pay attention to what the car is doing, I'd rather drive myself. The better the car drives, the harder it is for me to keep paying attention — that's the paradox. I would very easily fall asleep or get distracted. If I want to get that time back, the car must be able to drive itself without requiring me to pay attention.

    On car-sharing: in order to make car-sharing really convenient and reliable, you need the car to come to you with nobody inside. For that you need Level 4 or Level 5. Anything else just doesn't cut it. Everything else is a nice gadget you show off to your friends — it's not that useful.

    So Level 4 or Level 5 automation is really essential to capture the value of this technology. And in fact, the one game-changing feature of these cars is that they can move around with nobody inside. That is really the game-changing feature.

    Two Paths in the Industry

    There are two different paths that the industry is taking. What I call the OEM path is the automakers' path. They are used to thinking of production in the order of many millions of cars, and essentially what they do is make a lot of cars and add features — advanced driver assistance systems and so on — following levels 0, 1, 2, 3, 4, 5. Today you can buy cars which, even though they claim a "fully autonomous package" for $5,000 plus another $40,000 or something in the fine print, are actually Level 2 or Level 3. Tesla, the new Audi A8, Cadillac — they're coming out with these features. The problem is that you have to cross this red band where you're actually requiring human supervision of your automation system.

    The other path is what we are doing, and what all the indications suggest Waymo is doing, and similarly Uber. Essentially, you work on cars that are fully automated from the beginning, start with a small, maybe geofenced application, and then scale up operations — but always remaining at the full, high automation level.

    Service vs. Consumer Product — A Critical Distinction

    Another thing people often confuse is the difference between autonomous vehicles as a consumer product and as a service. If you ask me when you will be able to walk into a car dealership and walk out with the keys to a car where you just push a button and it takes you home, that's not happening for another 20 years at least. On the other hand, if you ask me when you will be able to go to some new city and summon one of these vehicles that picks you up and takes you to your destination, that is happening within a couple of years.

    What is the difference? There is a big difference between a self-driving car as a consumer product versus a service.

    On scope: if it's a product and you pay for it, you want it to work everywhere — take you home, drive through a little alley, drive you through the countryside. On the other hand, if I'm a service provider, I can decide where I offer the service and under what weather and traffic conditions. The problem becomes much easier.

    On financials: if I have to sell you an autonomy package, how much can it cost? The net present value of a driver's time over the next 10 years is about $20,000. So a rational buyer will not pay more than that for an autonomy package. If you want to make a profit, your autonomy package cannot cost more than a few thousand dollars. On the other hand, if you're thinking of this as a service, you are comparing the cost of automation to the cost of providing the same service using a human behind the wheel. To provide 24/7 service you need at least three drivers per car, at a cost of the order of $100,000 a year. Now I'm comparing the cost of my automation package to something that costs $100,000 a year over the life of the car. The cost of a fancy sensor or a LIDAR doesn't matter that much — I have much more freedom in buying the sensors I need.
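    The two budgets can be put side by side. In this sketch the $20,000 ceiling and the $100,000 per year driver cost come from the talk; the vehicle service life is not stated there, so `service_years` is a labeled assumption chosen only for illustration.

```python
consumer_ceiling = 20_000        # value of a driver's time over 10 years (from the talk)
driver_cost_per_year = 100_000   # roughly three drivers per car for 24/7 service (from the talk)

def service_budget(service_years):
    """Driver cost avoided over the life of one service vehicle."""
    return driver_cost_per_year * service_years

service_years = 5  # ASSUMPTION: service life of the car, not given in the talk
budget = service_budget(service_years)
print(f"consumer product ceiling: ${consumer_ceiling:,}")
print(f"service budget over {service_years} years: ${budget:,}")
print(f"ratio: {budget / consumer_ceiling:.0f}x")
```

    Even a much shorter assumed service life leaves the service budget several times the consumer ceiling, which is why the price of a single sensor matters less in the service model.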

    On HD maps: if I want to sell as a product, I need maps of the whole continent. If I'm providing a service, I only need to map the area where I want to provide the service. And the complexity and cost of generating maps scales with the square root of my customer base, meaning the mapping cost per customer becomes negligible as I serve more people. HD maps are a pain to collect and maintain, but much less of a pain than the logistics of operating a fleet serving a city's population.
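    The square-root claim can be written out as a short worked equation. Here N is the number of customers and c is a hypothetical constant of proportionality; neither symbol appears in the talk.

```latex
C_{\text{map}}(N) = c\,\sqrt{N}
\quad\Longrightarrow\quad
\frac{C_{\text{map}}(N)}{N} = \frac{c}{\sqrt{N}}
\;\longrightarrow\; 0
\quad\text{as } N \to \infty
```

    The total mapping cost still grows with the customer base, but the share carried by each customer shrinks: quadrupling the number of customers halves the mapping cost per customer.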

    On servicing and maintenance: as a consumer, you don't want to calibrate sensors every time you go out or upload new driver software. In the service model, I have a maintenance crew that can take care of it professionally.

    So the cost of the autonomy package is not really the main issue — the cheaper the better, but it's not the main driver. And HD maps: my expectation is that within a few years, HD maps will be a dime a dozen. Imagine I have a fleet of 1,000 cars with sensors on board, driving around the city all the time, generating a gigantic amount of data that I can use to make and maintain my HD maps. As soon as you start offering this service, you will be able to collect all the data you need.

    The Economics of Removing the Driver

    Most of the cost of taxi services nowadays is the driver — about half of total cost. Remove the driver from the picture, and even though automation costs a little more and servicing costs a little more, you still get a really significant increase in margin, meaning you can pass some of those savings to customers and make a very strong business case.

    However, this is also misleading. The typical reaction is: "Oh my goodness, now all taxi drivers and truck drivers will be out of a job." In fact, one day I was summoned by the Singapore Ministry of Manpower and I was terrified — I thought they were going to shut me down because they were afraid of putting all their taxi drivers out of work. It turned out to be the opposite.

    What most people do not realize is that mobility services worldwide are actually manpower-limited. In Singapore, they would like to run more buses but don't have enough people who are able and willing to drive them. The same is true for trucking and taxis.

    Here is another back-of-the-envelope calculation. If everybody in the world used Uber for their mobility, how many people would need to be drivers for Uber? Do the calculation and you see that one person out of seven would need to drive for Uber. Do you see that happening? No way. People still need to be teachers, doctors, policemen, firemen — some people need to be kids. This simply cannot happen. What is happening today is that we are all doubling up as drivers for ourselves, and we spend about one-seventh to one-eighth of our productive day behind the wheel.
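    One way to see where "one in seven" comes from is to note that if every trip needs a dedicated driver, the share of people who must drive full time equals the share of the day each person spends traveling. This is a sketch of that reasoning and may not be the exact calculation behind the talk's figure; the function name and the `riders_per_trip` parameter are hypothetical, and empty repositioning trips are ignored.

```python
from fractions import Fraction

def driver_share(driving_share_of_day, riders_per_trip=1):
    """Fraction of the population that would have to work full time as drivers."""
    return driving_share_of_day / riders_per_trip

for share in (Fraction(1, 7), Fraction(1, 8)):
    people_per_driver = 1 / driver_share(share)
    print(f"driving {share} of the day -> 1 driver per {people_per_driver} people")
```

    Pooling two riders per trip would halve the requirement, which is still far more drivers than any labor market could supply.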

    The big change will be more on the supply of mobility rather than on job loss. Of course, if you increase the supply of mobility, the cost of mobility will probably go down and wages for drivers will go down — that is an issue — but maybe balanced by added value in service or other things.

    Another thing about truck drivers: something I recently learned is that 25 percent of all job-related deaths in the US are truck drivers. It is the single most dangerous industry you can be in. Maybe if you can take some of those people out of those trucks and have them supervise or remotely control a truck from an office instead of sitting in the truck, that may actually be a benefit to them.

    The State of the Art in Autonomous Technology

    What is the state of the art for autonomous technology today? You see a lot of demos from a number of companies, but a lot of what you see is not too different from what Ernst Dickmanns did in the late 1990s in Germany — no fancy GPUs, just cameras and some basic computer vision algorithms, driving for hundreds of miles on German highways. If you're not showing something that goes beyond that, you have not made any progress over the past 20 years. You may be using fancy deep learning and GPUs nowadays, but you're doing what people were doing 20 years ago.

    What I find more exciting is footage from our daily drives in Singapore. We are driving on public roads in normal traffic — construction zones, intersections, traffic from both sides. In Singapore they drive on the left, so making the right turn is what is hard because you have to cross traffic. The car is making the right decisions in all of these situations without any human intervention. If you're not showing the capability of driving in traffic in an urban situation like that, you're not really showing any advance over what people were able to do 20 years ago.

    We are doing this every day in Singapore, and we are also driving autonomously here in Boston — we are allowed by the city of Boston to drive our cars autonomously in the Seaport area.

    The Technical Challenges — Sensing, Mapping, and Driving Policy

    The big challenges are sensing and perception, mapping, and what I call decision-making — or what others call driving policy. Sensing and perception is a challenge we are aware of and making rapid progress on. HD maps are a logistical challenge but, as I said, will be a dime a dozen in a few years. The big remaining problem is driving policy.

    Here is a typical example of what we encounter in urban driving. We are at a traffic light, the light turns green, we make the turn, there is a pedestrian crossing the street — we yield to the pedestrian — and then we see a truck parked in the middle of our lane, so we need to go to the other lane, which is in the opposite direction, with a motorcycle coming. How do you write software so that your car is able to deal with this kind of complicated situation by itself?

    My claim — I have not proved it mathematically yet — is that the rules of the road were introduced exactly to avoid the need for negotiation when you drive. When you're walking down a hallway and a person is coming the other direction, there's always that awkward moment where you both try to dodge the same way. With cars, you don't do that. Everybody drives on the right — or in other places, on the left — period. You don't negotiate. You get to an intersection, the light is red, you stop. You don't say, "I'm really in a rush, do you mind if I go?" The rules of the road were invented by humans to minimize the amount of negotiation.

    The Industry's Two Approaches to Driving Policy

    The industry standard approach — what we did at the time of the DARPA Urban Challenge — was a lot of if-then-else statements, finite state machines, or logic encoded by hand. The problem with that is it's very hard to come up with this logic, and it's essentially impossible to debug and verify. I spent many miserable months sitting at the naval airbase in Weymouth, in a rental car, just playing interference with our autonomous car trying to adjust all this logic and parameters. I vowed that I would never do it again.

    A good example of what can go wrong: the Caltech team at the DARPA Urban Challenge — a team of very smart, capable, dedicated people — tried to go to an intersection, decided to go, then for some reason decided not to, and backed up out of the intersection. The director of DARPA, Tony Tether, was there and immediately disqualified them. There was essentially a bug in the logic that they had worked on for months and never caught. It's very easy to make mistakes and very hard to find those bugs.

    As a reaction to that, what you hear people saying now is: "There are too many rules of the road, it's impossible to code all of them correctly, so let's not do that — just feed the car a lot of data and let the car learn by itself how to behave." You see a number of efforts trying to use deep learning or other learning approaches to get to end-to-end driving of cars.

    I don't want to sound too negative, but I will try to be honest about what I think. One of our developers — a very bright person from Caltech — wrote the first version of the code for dealing with traffic lights, and the reaction to a yellow light was essentially: if you see a yellow light, speed up. That's what my brother does. There is always the danger that you learn the wrong behavior. Of course there are some situations in which accelerating on a yellow light is actually the right response, but it is not always the case — there are other features of the situation you need to examine.

    The other thing is explainability. You want to be able to explain why the car did something, and more than that, you want that information to be actionable. You want to know: this happened because of this reason, and this is how I fix it. That is something that is hard to do with purely learning-based algorithms.

    The Formal Methods Approach

    The reality is that it is simply not true that there are too many rules of the road. Any 16-year-old in the United States can go to the DMV, get the booklet, study it, do a written test, and be given a learner's permit. We require every single licensed driver to demonstrate that they understand the rules. We don't say just drive with your parents for a few thousand miles and we'll give you a license — we ask them to show that they studied and understand the rules.

    How many rules of the road are there actually? I went through the exercise of counting and clustering them. You have rules on who can drive, when and where; what can be driven, at what speed and in what direction; who yields to whom; how you use your signals; how you interpret signals you see on the road; and where you can park or stop. That's essentially it — twelve categories. Not that many.

    What is true is that the number of possible combinations of rules and instantiations of those rules, given the context of the scenario — where other actors are, where pedestrians are, where other cars are — is a humongous number. You don't want to generate a model that gives you the right response to all possible combinations. That is completely intractable. But the point is that not only is it hard to code the good behavior for every one of these situations — I claim it is also hard to learn the good behavior, because you would need enough training data for every possible combination of rules and instantiations. Good luck with that.

    On the other hand, it is very easy to assess what is good behavior. This is the insight behind the class NP: a problem is in NP if, when a non-deterministic system generates a candidate solution, it is very easy to check whether that candidate is actually a solution, meaning you can do that in polynomial time. What I claim is that if you have an engine able to generate a very large number of candidates, and all you do is check whether each one of those candidates is good with respect to the rules, that's all you need.

    The algorithms I worked on during my academic career — RRT and RRT* — are exactly that: they work by generating a very large graph exploring all potential trajectories a robot or system can take, and then you check them for whether they satisfy the rules. That is very different from trying to generate something that satisfies all constraints simultaneously. Generating candidates given all constraints is a combinatorial problem; checking a single candidate for compliance with a number of rules is a linear operation in the number of rules.
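    The generate-then-check idea can be sketched in a few lines. This is not RRT or RRT*: plain random sampling stands in for the candidate generator, and the three rules, their limits, and all names are hypothetical. What the sketch does show is that checking one candidate takes a single pass over the rule list.

```python
import random

# A candidate trajectory is a list of (lateral_offset_m, speed_mps) samples.
# Each rule is a predicate over a whole candidate. Limits are made up.
def within_speed_limit(traj, limit=14.0):
    return all(speed <= limit for _, speed in traj)

def stays_in_lane(traj, half_width=1.5):
    return all(abs(offset) <= half_width for offset, _ in traj)

def never_reverses(traj):
    return all(speed >= 0.0 for _, speed in traj)

RULES = [within_speed_limit, stays_in_lane, never_reverses]

def random_candidate(rng, steps=10):
    """Stand-in generator: independent random samples, not a real planner."""
    return [(rng.uniform(-2.0, 2.0), rng.uniform(-1.0, 16.0)) for _ in range(steps)]

def complies(traj, rules=RULES):
    # One pass over the rules: the work is linear in len(rules).
    return all(rule(traj) for rule in rules)

rng = random.Random(0)
candidates = [random_candidate(rng) for _ in range(10_000)]
compliant = [traj for traj in candidates if complies(traj)]
print(f"{len(compliant)} of {len(candidates)} candidates satisfy all rules")
```

    Adding a rule means appending one predicate to `RULES`; nothing about the generator has to change.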

    In our cars today we use these formal methods. We write down all the rules in a formal language — very precise syntax — and then we can automatically verify whether our trajectories satisfy all these rules. The computer translates the formal rules into something like a finite state machine automatically, not by hand. We generate trajectories that evolve not only in physical space and time but also in a logical space, telling us whether and to what extent we are satisfying the rules. Then we solve a shortest-path problem on this graph — which is exactly what you do in robot motion planning.
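    To illustrate what "evolving in a logical space" can mean, here is a minimal sketch of one rule tracked as a finite state machine alongside a trajectory. In the system described above the automaton is produced automatically from a formal rule; here it is written by hand, and the rule, the state names, and the labels are all hypothetical.

```python
# Hand-written monitor for one hypothetical rule:
# "come to a stop at the stop line before entering the intersection".
TRANSITIONS = {
    ("approaching", "stopped_at_line"): "cleared",
    ("approaching", "in_intersection"): "violated",
}

def step(state, label):
    # Any (state, label) pair not listed leaves the logical state unchanged.
    return TRANSITIONS.get((state, label), state)

def lift(trajectory, initial="approaching"):
    """Pair each physical waypoint with the rule's logical state."""
    state, product = initial, []
    for position, label in trajectory:
        state = step(state, label)
        product.append((position, state))
    return product

good = [(0, "driving"), (10, "stopped_at_line"), (15, "in_intersection")]
bad = [(0, "driving"), (10, "driving"), (15, "in_intersection")]
print(lift(good)[-1])  # (15, 'cleared')
print(lift(bad)[-1])   # (15, 'violated')
```

    Both trajectories end at the same physical position, but they end in different logical states. A planner searching over (position, logical state) pairs can therefore tell them apart.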

    We have a hierarchy of rules. My claim is that all bodies of rules generated by humans are organized hierarchically. A typical example is Asimov's Three Laws of Robotics: a robot will not harm a human; a robot will obey orders from a human unless they violate the first law; a robot will try to preserve itself unless it violates the first two laws. The same applies when you drive. Some rules are more important than others: do not hit people, do not hit other cars — then lower priority is staying in your lane, then lower priority is maintaining speed. When you have a violation of an important rule, even by a tiny amount, that is much worse than violating a less important rule by a large amount. That gives a total ordering structure, and then we solve a shortest-path problem on the resulting graph.
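    A rule hierarchy of this kind maps naturally onto lexicographic comparison. The sketch below runs Dijkstra's algorithm with tuple-valued costs on a toy graph modeled on the parked-truck example. The graph, the cost numbers, and the function names are hypothetical and are not nuTonomy's implementation.

```python
import heapq

# Edge cost = (collisions, lane_departure_m, travel_time_s), most important first.
# Python compares tuples lexicographically, which matches the rule hierarchy.
GRAPH = {
    "start": [("through_truck", (1, 0, 2)), ("opposite_lane", (0, 5, 4))],
    "through_truck": [("goal", (0, 0, 2))],
    "opposite_lane": [("goal", (0, 0, 4))],
    "goal": [],
}

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def min_violation_path(graph, source, target):
    """Dijkstra with lexicographically ordered, non-negative tuple costs."""
    zero = (0, 0, 0)
    best = {source: zero}
    queue = [(zero, source, [source])]
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if cost > best.get(node, cost):
            continue  # stale queue entry
        for neighbor, edge_cost in graph[node]:
            new_cost = add(cost, edge_cost)
            if neighbor not in best or new_cost < best[neighbor]:
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor, path + [neighbor]))
    return None

cost, path = min_violation_path(GRAPH, "start", "goal")
print(cost, path)  # (0, 5, 8) ['start', 'opposite_lane', 'goal']
```

    Judged on travel time alone, driving through the truck would win (4 s against 8 s). Because the collision term is compared first, the planner instead accepts a 5 m lane departure, which is the behavior the hierarchy is meant to produce.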

    Dealing with Uncertainty and the Limits of the Rules

    Assuming that everybody is running this minimum-violation planning, everything will be okay. The problem is that humans introduce a lot of uncertainty into the whole thing.

    When I was younger (that is, two years ago), I thought: take all the rules of the road, convert them to a formal language, put them in your software, and you're done. But then you go and look at these rules of the road and you see that they are a mess. They are not a sound theory: they are not complete (they don't cover every possible case), and they are not consistent.

    My favorite rule is what is actually called the fundamental norm in the Swiss rules of the road: "All road users must behave in such a way as not to pose an obstacle or danger to other road users who behave according to the rules." Do you see the problem? That doesn't mean that if I see somebody violating the rules I can just hit them. You can imagine a fleet of vigilante autonomous cars that go around and if you run a red light, they kill you. Technically the autonomous car would be right — the other guy would be to blame. But do we really want that? Probably not. The Swiss rules add that special care must be exerted in cases where you have evidence that other people are not following the rules — but it still doesn't tell you what you're supposed to do when somebody else is violating the rules.

    The Trolley Problem — A Meaningful Version

    You hear about trolley problems endlessly, and most of them are truly stupid in the sense that it is extremely unlikely you will ever be given the choice of killing either Mother Teresa or Hitler. Anything remotely similar will never happen to you on the road. On the other hand, there are versions of the trolley problem which are actually meaningful.

    Here is one that my collaborator came up with. You're driving down the road and you see a pedestrian jaywalking in front of you. If you stay on your current course, you will kill the pedestrian — but it's not your fault, it's their fault for stepping into the road when they shouldn't have. On the other hand, you could swerve, but with some probability P you may kill another person who had nothing to do with this — they were just walking around peacefully.

    The reason I like this is because it has clear solutions in the two extreme cases. If P equals one — meaning swerving will definitely kill somebody else — then clearly you don't swerve. If P equals zero — meaning you're sure you won't kill anybody if you swerve — then clearly you swerve. By some continuity argument, there must be some value of P at which the solution changes. What is that value? Nobody knows. How do you evaluate that P? Nobody knows. But these are the kinds of questions we actually need to answer.

    A more sophisticated version is what happens every day in our cars. When our computer vision system tells me there is a pedestrian in front of us, it's not telling me there is definitely a pedestrian — it's telling me it thinks there is a pedestrian and it's, say, 80 percent confident. Now you have a combination of the probability that the pedestrian is actually there and the probability of killing somebody else if you swerve. Because if you swerve and kill somebody because of a false positive — there was nobody there — you'll be in serious trouble. So you have this two-dimensional domain with a boundary, and somebody needs to decide where that boundary is.
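
    The two-dimensional domain can be written down as a function of the two probabilities. The sketch below is purely illustrative: the linear boundary and the `weight` parameter are assumptions introduced here, not something the talk proposes, and the talk's point is precisely that nobody has agreed where the boundary lies. It only shows that, once a boundary is chosen, the decision reduces to a comparison, and that the choice matches the two clear extreme cases.

```python
def should_swerve(p_present: float, p_harm: float, weight: float) -> bool:
    """Decide whether to swerve, given a hypothetical linear boundary.

    p_present: perception confidence that the pedestrian is really there.
    p_harm:    probability that swerving kills an uninvolved person.
    weight:    hypothetical trade-off factor; the source leaves this open.
    """
    for name, p in (("p_present", p_present), ("p_harm", p_harm)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be a probability in [0, 1]")
    if weight < 0.0:
        raise ValueError("weight must be non-negative")
    # Swerve only when the expected harm of staying on course exceeds
    # the weighted expected harm of swerving.
    return p_present > weight * p_harm
```

    With `weight=1.0`, `should_swerve(1.0, 0.0, 1.0)` is true and `should_swerve(1.0, 1.0, 1.0)` is false, which reproduces the two extreme cases. Everything in between depends on a value of `weight` that, as the lecture argues, somebody still has to decide.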

    I don't think it should be me. I can come up with an answer when I write my code, but I actually think it should be a community effort in which the community agrees on how the car should behave in these kinds of situations.

    The Biggest Challenge — We Don't Know How Vehicles Should Behave

    When people ask me what I think is the biggest challenge in autonomous vehicles, something I've come to realize only recently is this: the biggest challenge in the development of autonomous vehicle technology is that we do not understand, in a precise and rigorous way, how we want vehicles — including human-driven vehicles — to behave.

    A lot of these rules of the road are just a giant pile of imprecise, non-rigorous language. For example, a lot of the rules are predicated on the concept of right of way. I looked everywhere and there is not a single definition of what right of way means in mathematical terms. I know it has something to do with distance, something to do with relative speed, maybe with absolute speed — but what are the values? What are the numbers? If I had to write a function — if you see this car approaching and this car is farther away than this distance and the relative speed is more than this, then stop, otherwise go — nobody is telling me what that relationship should be.
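
    The function described above might look like the following Python sketch. The structure (a distance threshold and a closing-speed threshold) is an assumption loosely based on the quantities mentioned in the talk, and the names are hypothetical. The thresholds deliberately have no default values, since the point of the passage is that no rulebook supplies them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RightOfWayThresholds:
    """Hypothetical parameters that a precise rulebook would have to define."""
    min_distance_m: float         # how far away the other car must be
    max_closing_speed_mps: float  # how fast it may be approaching


def may_proceed(
    distance_m: float,
    closing_speed_mps: float,
    thresholds: RightOfWayThresholds,
) -> bool:
    """Return True to go, False to stop and yield to the approaching car."""
    far_enough = distance_m >= thresholds.min_distance_m
    slow_enough = closing_speed_mps <= thresholds.max_closing_speed_mps
    return far_enough and slow_enough
```

    Any concrete call, such as `may_proceed(60.0, 8.0, RightOfWayThresholds(50.0, 10.0))`, needs numbers that are placeholders today. Changing either threshold changes who is considered to have been cut off.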

    What we need is to develop a sound theory for the rules of the road that covers precisely any kind of situation and tells me, in any situation, what is the right behavior, what is the wrong behavior, and maybe — if you have two behaviors — which one is better. I need to be able to make that comparison.

    We can use formal methods, and there is also a lot of room for statistical or learning-based methods — for example, looking at what people actually do: at what point do people feel cut off versus feel that they had plenty of room? We need to develop this sound theory and assess behaviors on realized space-and-time trajectories. If you say, "Well, if I didn't see the pedestrian, it's not my fault that I hit them," then people will start removing sensors — if you don't see anything, you can hit anything you want and you're not to blame. The compliance with the rules, once we have precise and rigorous rules, will actually drive a lot of requirements for the sensing and perception system and for the planning and control system.

    The main message today is: the biggest challenge is that we don't know precisely how we want human-driven vehicles to behave. Once we answer that question, I think that designing automated vehicles will be much, much easier.

    Thank you.


    Polished transcript of Lex Fridman. All views are those of the original speakers.
    Published by @martymcfly