Scaling AI for the Future of Autonomous Driving: The Role of End-to-End Multimodal Models


BSPK
2024.10.31


[Summary]

  1. Waymo implemented autonomous driving with the multimodal LLM Gemini Nano, and the result outperformed Wayformer, Waymo's existing autonomous driving model.

    Figure: EMMA model visualization
  2. Vehicle path planning is implemented with Chain of Thought, a training/prompting technique used to improve the reasoning performance of LLMs.

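  3. The scaling law previously observed in the language performance of Transformer-based LLMs was confirmed to hold for autonomous driving as well, which makes a competition over scale likely.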

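To make the Chain of Thought idea in the summary more concrete, here is a minimal sketch of how a planning request could be phrased as step-by-step reasoning, with the driving path read back from the model's text reply. Everything in it is an assumption for illustration: the reasoning steps, the `WAYPOINTS:` output format, and the names `build_prompt` and `parse_waypoints` are hypothetical and are not taken from Waymo's paper. No model is called; the reply is a hand-written stand-in.

```python
import re
from dataclasses import dataclass


@dataclass
class Waypoint:
    x: float  # metres ahead of the vehicle (assumed convention)
    y: float  # metres to the left of the vehicle (assumed convention)


# Hypothetical reasoning steps; the wording is illustrative only.
COT_STEPS = [
    "Describe the scene (weather, road type, traffic).",
    "List the objects that matter for driving and where they are.",
    "State the high-level driving decision (e.g. keep lane, yield, stop).",
    "Output the future waypoints as: WAYPOINTS: (x1, y1), (x2, y2), ...",
]


def build_prompt(instruction: str, speed_mps: float) -> str:
    """Assemble a text prompt that asks for reasoning before the plan."""
    steps = "\n".join(f"{i}. {s}" for i, s in enumerate(COT_STEPS, start=1))
    return (
        f"Navigation instruction: {instruction}\n"
        f"Current speed: {speed_mps:.1f} m/s\n"
        "Reason step by step before planning:\n"
        f"{steps}\n"
    )


# Matches "(number, number)" pairs, allowing negatives and decimals.
_PAIR = re.compile(r"\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)")


def parse_waypoints(model_output: str) -> list[Waypoint]:
    """Extract waypoints from the text after the last 'WAYPOINTS:' marker."""
    _, marker, tail = model_output.rpartition("WAYPOINTS:")
    if not marker:  # the reply contained no plan
        return []
    return [Waypoint(float(x), float(y)) for x, y in _PAIR.findall(tail)]


# Hand-written stand-in for a model reply (not real model output).
reply = (
    "1. Clear day, two-lane urban road.\n"
    "2. A cyclist is ahead on the right.\n"
    "3. Keep lane and slow down.\n"
    "WAYPOINTS: (2.0, 0.0), (4.0, 0.1), (6.0, 0.3)"
)
waypoints = parse_waypoints(reply)
```

The point of the sketch is the ordering: the intermediate reasoning lines come first and the path comes last, so the plan is conditioned on the stated scene description and decision rather than produced in one step.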

End-to-End Multimodal Models: The Future of Autonomous Driving?

The rapid evolution of AI has been transformative, and its impact on industries like autonomous driving is becoming increasingly evident. A striking example is the shift towards End-to-End multimodal models for autonomous driving, a trend that has gained momentum thanks to companies like Tesla and Waymo.

The ChatGPT Revolution and its Influence

To understand how we arrived at this point, we need to revisit the rise of transformer-based models like OpenAI's ChatGPT. ChatGPT demonstrated the potential of large language models (LLMs) to process and generate human-like text by learning from vast amounts of data. That success sparked broader interest in applying similar architectures beyond natural language processing (NLP), including in fields like computer vision and robotics.


Tesla took note of this paradigm shift. Inspired by the success of transformer-based models in NLP, Tesla restructured its Full Self-Driving (FSD) system into an End-to-End model. However, instead of processing text, Tesla's model processes images from the cameras mounted on the vehicle. The goal? To generate a driving path as output, replacing traditional rule-based systems with a more holistic approach that directly maps raw sensor data to driving actions.

The Rise of End-to-End Multimodal Models

While the concept of End-to-End (E2E) models was gaining traction by late 2023, there was still limited clarity on the exact architectures being used by industry leaders. However, toward the end of 2024, Waymo made a significant contribution to this space by publishing a detailed paper on its End-to-End Multimodal Model for Autonomous Driving, known as EMMA.


EMMA represents a major leap forward in autonomous driving technology. Built on a multimodal ...

BSPK
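Holds a PhD in electrical and electronic engineering and, after working as an AI researcher, now works in strategic planning. Aims to spot early, and share, the changes that technological progress will bring to the world.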