2019-04-13 | Motion Planning

Reading Notes on Motion Planning Papers | Learning-Based

1. Path Planning

Neural Path Planning: Fixed Time, Near-Optimal Path Generation via Oracle Imitation

Info: University of California, published on arXiv in 2019.

To overcome the poor scaling of graph/grid search with dimensionality and the sensitivity of sampling-based search to environment complexity, this work uses an RNN to imitate an oracle algorithm and then smooths the generated curve with a smoother (a rollout sketch follows the advantages list below).

Our approach leverages the Recurrent Neural Network (RNN) in order to mimic the stepwise output of an oracle planner in a predefined environment, moving from the start to the end location in a relatively smooth manner.

Advantages of OracleNet:

  • Generates near-optimal paths quickly online
  • Generates a valid path if one exists (probabilistic completeness)
  • Remains consistent no matter how complex the state space is
  • Scales linearly with dimensionality
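
The stepwise imitation described above can be pictured with a minimal PyTorch sketch (my own reconstruction, not the authors' code): an LSTM consumes the current state together with the goal and emits the next waypoint, rolled out until the goal is reached.

```python
import torch
import torch.nn as nn

class OracleNetSketch(nn.Module):
    """LSTM that maps (current state, goal) to the next waypoint."""
    def __init__(self, state_dim, hidden_dim=256):
        super().__init__()
        self.lstm = nn.LSTM(2 * state_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, state_dim)

    def forward(self, state, goal, hidden=None):
        # state, goal: (B, state_dim); here B = 1 for a single rollout
        x = torch.cat([state, goal], dim=-1).unsqueeze(1)  # (B, 1, 2*state_dim)
        out, hidden = self.lstm(x, hidden)
        return self.head(out.squeeze(1)), hidden

def rollout(net, start, goal, tol=0.05, max_steps=200):
    """Mimic the oracle's stepwise output: emit waypoints until near the goal."""
    path, state, hidden = [start], start, None
    for _ in range(max_steps):
        state, hidden = net(state, goal, hidden)
        path.append(state)
        if torch.norm(state - goal) < tol:
            break
    return path
```

Training fits each step's output to the oracle's next waypoint (e.g., with an MSE loss), and the generated path is then post-processed by the smoother.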

2. Imitation Learning

2.1 Notes

Concept

Imitation learning means learning from demonstrations provided by a teacher. Typically one is given human-expert decision data $\{\tau_1,\tau_2,\ldots,\tau_m\}$, where each trajectory is a sequence of states and actions $\tau_i=\langle s_1^i,a_1^i,s_2^i,a_2^i,\ldots,s_n^i\rangle$. All [state-action] pairs are extracted to construct a new set $D=\{(s_1,a_1),(s_2,a_2),(s_3,a_3),\ldots\}$.

The state is then treated as the feature and the action as the label, and learning proceeds as classification (for discrete actions) or regression (for continuous actions) to obtain an optimal policy model. The training objective is to make the distribution of state-action trajectories generated by the model match the distribution of the input trajectories.
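
As a minimal behavioral-cloning sketch of this recipe (shapes and data below are placeholders, not from any paper): flatten the expert trajectories into state-action pairs, then regress actions on states.

```python
import torch
import torch.nn as nn

# Placeholder expert trajectories: lists of (state, action) pairs.
trajectories = [[(torch.randn(8), torch.randn(2)) for _ in range(50)]
                for _ in range(20)]

# Build D = {(s, a)} by flattening all trajectories.
states = torch.stack([s for traj in trajectories for s, _ in traj])
actions = torch.stack([a for traj in trajectories for _, a in traj])

# Continuous actions -> regression with an MSE loss;
# discrete actions would use cross-entropy classification instead.
policy = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for epoch in range(100):
    loss = nn.functional.mse_loss(policy(states), actions)
    opt.zero_grad()
    loss.backward()
    opt.step()
```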

2.2 Related Papers

Driving Policy Transfer via Modularity and Abstraction

Müller, M., Dosovitskiy, A., Ghanem, B., & Koltun, V. (2018). Driving policy transfer via modularity and abstraction. arXiv preprint arXiv:1804.09364.

Published at CoRL (Conference on Robot Learning) 2018, the second edition of the conference.

1. Key Idea

The key point of this paper is that the end-to-end policy does not operate directly on raw images, but on semantically segmented images.

The architecture consists of three parts:

  1. Perception, including semantic segmentation: the input is the raw image, the output is the segmented scene
  2. End-to-end policy learning: the input is the segmented scene from step 1, the output is a locally planned path defining the car's drivable area (not throttle and steering angle)
  3. A low-level motion controller (PID) that tracks the drivable path; a toy PID sketch follows this list
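
The paper does not spell out the controller's internals, so the sketch below is just a generic textbook PID loop of the kind such a low-level module could use; the gains are illustrative, not the paper's tuning.

```python
class PID:
    """Plain PID controller driven by a scalar tracking error."""
    def __init__(self, kp, ki, kd, dt=0.05):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral, self.prev_err = 0.0, 0.0

    def step(self, err):
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

# e.g. steering from the heading error toward the near waypoint
steer_pid = PID(kp=0.8, ki=0.0, kd=0.1)
```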

2. Methods

  • The semantic segmentation network is trained on an open dataset [9]
  • The driving policy is trained directly in the simulator
  1. Perception

    The segmentation network is trained directly on a public dataset (the real-world Cityscapes dataset) rather than on a simulated driving dataset, yet it adapts well to both the simulator and real scenes.

    The segmentation network is the ERFNet architecture [39].

  2. Driving policy

    The driving policy outputs two waypoints per frame: one used to control steering, and one used for longer-horizon throttle planning (e.g., easing off the throttle before a turn).

    The paper fixes the distances $r$ of $w_1$ and $w_2$ and predicts only the angles $\varphi$ (see the sketch after this list).

    The driving policy is trained by imitation learning in the simulator using the CIL network [8]. The dataset is a set of observation-command-action triples, where the observation is the camera image, the command is one of three high-level navigation commands (turn left, go straight, turn right), and the action is the vehicle control (steering, throttle).

  3. Data augmentation

    Domain randomization approach [42]
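
To make the fixed-distance waypoint encoding concrete, here is a small sketch of turning a predicted angle into a waypoint; the distances and the vehicle-frame convention (x forward, y left) are my assumptions, not taken from the paper.

```python
import math

def waypoint_from_angle(phi, r):
    """Place a waypoint at fixed distance r along predicted heading phi (rad)."""
    return (r * math.cos(phi), r * math.sin(phi))

# Two waypoints at fixed distances; only the angles are predicted.
w1 = waypoint_from_angle(0.05, r=5.0)   # near waypoint -> steering
w2 = waypoint_from_angle(0.20, r=20.0)  # far waypoint -> throttle planning
```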

3. Evaluation

  • Evaluation uses a monocular camera
  • Different simulated environments (weather conditions)
  • Real-world environments

4. Conclusions

  • In simulation, the modular method has a large advantage over traditional monolithic end-to-end counterparts
  • In the real world, it transfers across different road surfaces (clear, snowy, wet) and weather/lighting conditions (sunshine, overcast, dusk)

5. Related Work

The work in [7] is similar to this paper, but it requires an AR setup to detect and push objects.

Reference [37] was the earliest to propose transferring driving policies, though only for lane following [3]. Reference [31] achieves obstacle avoidance using a depth representation, but it is unsuited to complex urban driving. Other related work covers off-road driving [25, 44] and navigation in simple urban environments [8].

6. References

Dataset

[9] M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele. The Cityscapes dataset for semantic urban scene understanding. In CVPR, 2016.

Imitation learning method (CIL)

[8] F. Codevilla, M. Müller, A. Dosovitskiy, A. López, and V. Koltun. End-to-end driving via conditional imitation learning. In ICRA, 2018.

Segmentation network

[39] E. Romera, J. M. Álvarez, L. M. Bergasa, and R. Arroyo. Efficient ConvNet for real-time semantic segmentation. In Intelligent Vehicles Symposium (IV), 2017.

Data augmentation method: domain randomization

[42] F. Sadeghi and S. Levine. CAD2RL: Real single-image flight without a single real image. In RSS, 2017.

End-to-end Driving via Conditional Imitation Learning

Codevilla, F., Müller, M., López, A., Koltun, V., & Dosovitskiy, A. (2018, May). End-to-end driving via conditional imitation learning. In 2018 IEEE International Conference on Robotics and Automation (ICRA) (pp. 1-9). IEEE.

1. Basic Concepts

The concept of Conditional Imitation Learning:

At training time, the model is given not only the perceptual input and the control signal, but also a representation of the expert’s intention. At test time, the network can be given corresponding commands, which resolve the ambiguity in the perceptuomotor mapping and allow the trained model to be controlled by a passenger or a topological planner, just as mapping applications and passengers provide turn-by-turn directions to human drivers.
Ablation study: removing certain modules from the model and observing whether performance changes.

English definition: "An ablation study typically refers to removing some 'feature' of the model or algorithm, and seeing how that affects performance."

2. Related Work

  • Work similar to this paper: [27], [22], [4], [36], [6]

  • Robot behavior guided by humans [5, 14, 24, 34, 35]; most of this work appears to involve natural language processing.

    • This paper instead uses predefined commands, e.g., turn right at the next intersection, turn left at the next intersection, go straight.
    • Moreover, since this paper is also a vision-based end-to-end deep network, its ability to accept language commands is limited.

3. Conditional Imitation Learning & Methodology

Setting: the controller interacts with the environment at discrete time steps; this is framed as a supervised learning problem.

  • Network architecture
    • Input: image $i$, a measurement vector $m$, and a command $c$
    • Output: action $a$; the action space is continuous and two-dimensional: steering angle and acceleration
    • Two ways of feeding the command $c$ into the network (see the branched sketch after this list)
  • Network details
    • Loss function on the actions
    • Solver settings: Adam
    • ….
  • Training data distribution
    • Cannot be obtained solely from expert demonstrations; see [29], [4]
    • To further augment the training dataset, noise is injected into the expert's control signal, similar to [21]
  • Data augmentation
    • Transformations include contrast and brightness changes, various kinds of noise, blotches, and region dropout
    • But no geometric transformations such as rotation and translation, since those would also require changing the command
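
Of the two ways to inject the command, the branched variant is the better known: one action head per command, with the command selecting the active branch. Below is a simplified PyTorch sketch (the backbone is a stand-in, and the measurement input $m$ is omitted for brevity).

```python
import torch
import torch.nn as nn

class BranchedCILSketch(nn.Module):
    """One action branch per command; the command only selects the branch."""
    def __init__(self, n_commands=4, feat_dim=512, action_dim=2):
        super().__init__()
        self.backbone = nn.Sequential(   # stand-in for the paper's conv net
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        self.branches = nn.ModuleList(
            nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(),
                          nn.Linear(256, action_dim))
            for _ in range(n_commands)
        )

    def forward(self, image, command):
        # image: (B, 3, H, W); command: (B,) long tensor of command ids
        feats = self.backbone(image)  # (B, feat_dim)
        outs = torch.stack([b(feats) for b in self.branches], dim=1)  # (B, C, A)
        idx = command.view(-1, 1, 1).expand(-1, 1, outs.size(-1))
        return outs.gather(1, idx).squeeze(1)  # per-sample branch selection
```

The alternative is to feed an embedding of the command into a single shared head alongside the image features.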

4. Experimental Setup

  • Simulated environment
    • Commands are given via buttons on the steering wheel (four commands: continue, left, straight, right)
    • While driving, obstacles are avoided, but traffic lights and stop lines are ignored
  • Physical system
    • Only three commands (left, straight, right)
    • Training uses images from three cameras; testing uses only the central camera

5. Experiments

  • Simulation experiments
    • An episode is successful if the agent reaches the goal within the allotted time
    • Success is also measured by the average distance driven without infractions (open question: how is distance converted into a percentage?)
    • The training set is 2 hours of driving, 10% of which contains injected noise (how the noise is injected is unclear). However, a paper read by Prof. Zhao's group found about 10 hours of training data to be best; both more and less hurt performance
  • Physical experiments
    • At intersections, the vehicle must not be off the road for more than 5 seconds
    • The model is tested in overcast weather, while the data was collected in sunny weather
  • Results
    • Several model variants are compared
      • Four in total: two command-input positions × with/without noise
    • Demonstrates the importance of noise injection and data augmentation
    • Tested in two weather conditions (overcast, sunny) and three environments

6. Discussion

Natural language processing

Natural language communication between humans and robots has been studied in the literature [5, 14, 24, 34, 35]. The paper also states that unstructured natural language communication with autonomous vehicles will be a key future research direction.

7. References

Dataset acquisition

[4] M. Bojarski, D. D. Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, X. Zhang, J. Zhao, and K. Zieba. End to end learning for self-driving cars. arXiv:1604.07316, 2016.

[29] S. Ross, G. J. Gordon, and J. A. Bagnell. A reduction of imitation learning and structured prediction to no-regret online learning. In AISTATS, 2011.

Human-guided robot behavior

[5] A. Broad, J. Arkin, N. Ratliff, T. Howard, and B. Argall. Realtime natural language corrections for assistive robotic manipulators. International Journal of Robotics Research, 2017.

[14] S. Hemachandra, F. Duvallet, T. M. Howard, N. Roy, A. Stentz, and M. R. Walter. Learning models for following natural language directions in unknown environments. In ICRA, 2015.

[24] C. Matuszek, L. Bo, L. Zettlemoyer, and D. Fox. Learning from unscripted deictic gesture and language for human-robot interactions. In AAAI, 2014.

[34] S. Tellex, T. Kollar, S. Dickerson, M. R. Walter, A. G. Banerjee, S. J. Teller, and N. Roy. Understanding natural language commands for robotic navigation and mobile manipulation. In AAAI, 2011.

[35] M. R. Walter, S. Hemachandra, B. Homberg, S. Tellex, and S. J. Teller. Learning semantic maps from natural language descriptions. In RSS, 2013.

Work similar to this paper

[27] D. Pomerleau. ALVINN: An autonomous land vehicle in a neural network. In NIPS, 1988.

[22] Y. LeCun, U. Muller, J. Ben, E. Cosatto, and B. Flepp. Off-road obstacle avoidance through end-to-end learning. In NIPS, 2005.

[6] C. Chen, A. Seff, A. L. Kornhauser, and J. Xiao. DeepDriving: Learning affordance for direct perception in autonomous driving. In ICCV, 2015.

[36] J. Zhang and K. Cho. Query-efficient imitation learning for end-to-end simulated driving. In AAAI, 2017

[4] M. Bojarski, D. D. Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, X. Zhang, J. Zhao, and K. Zieba. End to end learning for self-driving cars. arXiv:1604.07316, 2016.

Reference [6]: DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving

Chen, C., Seff, A., Kornhauser, A., & Xiao, J. (2015). Deepdriving: Learning affordance for direct perception in autonomous driving. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2722-2730).

Proposes three paradigms for vision-based autonomous driving:

  1. Mediated Perception parses every scene into structured data and derives decisions from it
  2. Behavior Reflex maps directly from the input image to driving control using deep learning
  3. Direct Perception maps the input to a small number of perception indicators and makes a driving decision from them

    This paper follows the third paradigm. The distance constraints described in the paper are worth borrowing, as they keep the car driving in the lane center.

Reference [22]: Off-road Obstacle Avoidance through End-to-End Learning

Brief overview
  • An end-to-end network maps raw input images to steering angles.

  • Data is collected for training across a variety of terrains, weather, lighting, and obstacle types.

    • Data is collected at 15 frames per second
  • The vehicle carries two forward-facing wireless color cameras connected to a remote computer that processes the video.

  • The network is a 6-layer convolutional network
    • CNNs suit this problem because their local, sparse connectivity allows them to process high-resolution images
    • They can learn relevant local features even with limited training data
  • Vehicle speed: 2 m/s

Reference [36]: Query-Efficient Imitation Learning for End-to-End Simulated Driving

Zhang, J., & Cho, K. (2017, February). Query-efficient imitation learning for end-to-end simulated driving. In Thirty-First AAAI Conference on Artificial Intelligence.

A human driver (the reference policy) cannot cover all situations in the data. This paper introduces imitation learning for autonomous driving in which a CNN learns a primary policy and iterates together with the reference policy to generate more data. The approach is based on DAgger (see the sketch below). A safety policy, estimated by an additional FCN, predicts whether it is safe to let the network drive. Evaluated on TORCS only.

  • Architecture: 6-layer CNN, 2-layer FCN
  • Input: 160×72 image (simulated in TORCS), Conv5
  • Output: steering angle, safe/unsafe
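
Since the method builds on DAgger, a minimal DAgger-style loop may help; `env`, `expert`, and `train` below are hypothetical stand-ins, and the paper's variant additionally uses the learned safety policy to decide when control falls back to the reference policy.

```python
def dagger(env, expert, policy, train, n_iters=10, horizon=1000):
    """Sketch of DAgger: run the learner, relabel visited states with the expert."""
    dataset = []
    for _ in range(n_iters):
        obs = env.reset()
        for _ in range(horizon):
            dataset.append((obs, expert(obs)))  # expert relabels the visited state
            obs, done = env.step(policy(obs))   # but the learner's action is executed
            if done:
                break
        policy = train(policy, dataset)  # supervised fit on the aggregated data
    return policy
```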

Virtual to Real Reinforcement Learning for Autonomous Driving

Pan, X., You, Y., Wang, Z., & Lu, C. (2017). Virtual to real reinforcement learning for autonomous driving.

Because training an autonomous driving vehicle with reinforcement learning in the real environment involves unaffordable trial and error, the VISRI framework converts virtual image input into synthetic realistic images. Given realistic frames as input, a driving policy trained by reinforcement learning can then adapt well to real-world driving.

  • Scene Parsing

    • Reference [16] uses a deep convolutional neural network / fully convolutional network
    • This paper uses SegNet (architecture in [2]), which consists of two parts:
      • an encoder, consisting of convolutional, batch normalization, ReLU, and max-pooling layers
      • a decoder, which replaces the pooling layers with upsampling layers
    • The network is trained on the Cityscapes dataset [7]
      • 11 classes, trained for 30,000 iterations
    • The real-world dataset used comes from [5]
      • all 45k images in it were segmented

    Summary: the segmentation network is first trained on the Cityscapes dataset; part of the real-world dataset is then used to train the supervised learning model, and the rest is used for testing. A tiny encoder-decoder sketch follows.
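
A tiny encoder-decoder in the SegNet spirit (drastically smaller than the real network in [2], which also reuses max-pooling indices for upsampling; this sketch uses plain upsampling instead):

```python
import torch.nn as nn

class MiniSegNet(nn.Module):
    """Encoder: Conv + BatchNorm + ReLU + MaxPool; decoder: upsampling + Conv."""
    def __init__(self, n_classes=11):  # 11 classes, as in the notes above
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.BatchNorm2d(128), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2),
            nn.Conv2d(128, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(64, n_classes, 3, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))
```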

The paper mentions a common problem in imitation learning, the covariate shift problem [20].

  • Results
    • Better than the baseline, but not as good as the supervised method, which benefits from training on large amounts of labeled data

References

[2] Vijay Badrinarayanan, Alex Kendall, and Roberto Cipolla. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. arXiv preprint arXiv:1511.00561, 2015.

[5] Sully Chen. Autopilot-tensorflow, 2016. URL https://github.com/SullyChen/Autopilot-TensorFlow.

[7] Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, and Bernt Schiele. The cityscapes dataset for semantic urban scene understanding. CoRR, abs/1604.01685, 2016. URL http://arxiv.org/abs/1604.01685.

[16] Jonathan Long, Evan Shelhamer, and Trevor Darrell. Fully convolutional networks for semantic segmentation. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.

Exploring the Limitations of Behavior Cloning for Autonomous Driving

Multi-frame input.

Natural Language Processing References

Real-Time Natural Language Corrections for Assistive Robotic Manipulators

Broad, A., Arkin, J., Ratliff, N., Howard, T., & Argall, B. (2017). Real-time natural language corrections for assistive robotic manipulators. The International Journal of Robotics Research, 36(5-7), 684-698.

Key takeaways
  • Data collection via Amazon Mechanical Turk
  • Open-source speech-to-text software combined with a Kinova Robotics MICO robotic arm yields an end-to-end system
  • Simplifies the problem by restricting the language and actions the robot needs to understand
  • The DCG (Distributed Correspondence Graph) model

Learning Models for Following Natural Language Directions in Unknown Environments

Hemachandra, S., Duvallet, F., Howard, T. M., Roy, N., Stentz, A., & Walter, M. R. (2015, May). Learning models for following natural language directions in unknown environments. In 2015 IEEE International Conference on Robotics and Automation (ICRA) (pp. 5608-5615). IEEE.

This paper proposes a new framework that enables a robot to follow natural language directions in unknown environments. Three algorithms contribute:

  • A learned language understanding model that infers environment annotations and the desired behavior from the command
  • An estimation-theoretic algorithm that learns a distribution over hypothesized world models by treating the inferred annotations as observations of the environment and fusing them with observations from the robot's sensor stream
  • A belief-space policy learned from human demonstrations that reasons directly over the world-model distribution to choose suitable navigation actions
Prior work
  • This paper is a further improvement on the authors' earlier work (see [10]).

  • For the imitation learning formulation, there is prior work in [12].

  • Reference [23] trains the policy with a data aggregation (DAgger) algorithm

References

[10] F. Duvallet, M. R. Walter, T. Howard, S. Hemachandra, J. Oh, S. Teller, N. Roy, and A. Stentz, "Inferring maps and behaviors from natural language instructions," in Proc. Int'l. Symp. on Experimental Robotics (ISER), 2014.

[12] F. Duvallet, T. Kollar, and A. Stentz, “Imitation learning for natural language direction following through unknown environments,” in Proc. IEEE Int’l Conf. on Robotics and Automation (ICRA), 2013.

[23] S. Ross, G. J. Gordon, and J. A. Bagnell, “A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning,” in International Conference on Artificial Intelligence and Statistics, 2011.

Imitation Learning for Natural Language Direction Following through Unknown Environments

Duvallet, F., Kollar, T., & Stentz, A. (2013, May). Imitation learning for natural language direction following through unknown environments. In 2013 IEEE International Conference on Robotics and Automation (pp. 1047-1053). IEEE.

The main problem addressed: enabling users without special training to control a robot through natural language.

DAgger, an imitation learning framework.

Attention Mechanisms

Gaze Training by Modulated Dropout Improves Imitation Learning

Chen, Y., Liu, C., Tai, L., Liu, M., & Shi, B. E. (2019). Gaze Training by Modulated Dropout Improves Imitation Learning. arXiv preprint arXiv:1904.08377.

  • Reinforcement learning requires a reward function to be defined in advance; imitation learning does not.

  • The most typical form of imitation learning is behavioral cloning through supervised learning.

    • Behavioral cloning is a teacher-student paradigm
Related research
  • References [7, 8] study the effect of experts' gaze on novice human learners, so it is a promising question whether deep driving networks trained by behavioral cloning can benefit from expert gaze patterns.

  • For combining gaze information with deep neural networks, see the recent work in [9, 10] (Pix2Pix).

  • A conditional adversarial network is trained to estimate the distribution of human gaze attention during driving; the paper then uses this estimated attention distribution to modulate the dropout probability at different spatial locations (see the sketch below)
  • For evaluating the generalization of such models, see [12, 13]
  • Reference [14] notes that conditional imitation learning (a deep multi-branch imitation network) does not generalize to unseen environments
  • Gaze in autonomous driving
    • Reference [18] also explores how human gaze affects autonomous driving
    • Reference [19] proposes a multi-branch deep neural network to predict eye gaze in urban driving scenes

Dropout: during training of a deep network, a fraction of the units are temporarily dropped from the network with some probability, which amounts to sampling a thinner subnetwork from the original one.
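
Gaze-modulated dropout can then be sketched as spatially varying dropout whose keep probability follows the estimated gaze map; the exact parameterization below (including `p_min`/`p_max`) is my assumption, not the paper's code.

```python
import torch

def gaze_modulated_dropout(feats, gaze_map, p_min=0.2, p_max=0.8):
    """feats: (B, C, H, W); gaze_map: (B, 1, H, W) in [0, 1].
    Units are kept more often where the estimated gaze is high."""
    drop_p = p_max - (p_max - p_min) * gaze_map   # high gaze -> low drop prob
    keep_p = 1.0 - drop_p
    mask = torch.bernoulli(keep_p.expand_as(feats))
    return feats * mask / keep_p                  # inverted-dropout rescaling
```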

Experimental design
  • Data collection: gaze data is captured with a remote eye tracker (Tobii Pro X60); there are 5 tracks in total, each repeated for 4 trials
  • Training and testing
    • For the gaze network, Track1 and Track2 are used for training (3,500 images)
    • For the imitation network, three trials each of Track1 and Track2 are used for training (40k images); the remaining trials plus the other tracks are used for testing
    • For robustness, 10% of the training data consists of recovery-from-drift samples; data augmentation is also applied, e.g., random online changes in contrast, brightness, and gamma for each image
Uncertainty in Deep Learning

Researchers studying uncertainty broadly agree on dividing it into two categories: aleatory uncertainty, which is inherent to the phenomenon itself, and epistemic uncertainty, which stems from incomplete knowledge of the phenomenon. On this basis, different mathematical models are used to quantify the two kinds. For the former, probability and statistics are widely accepted; for the latter, the models are many and varied, the most common being subjective probability (Bayesian statistics), fuzzy sets (possibility theory), random sets (imprecise probability), convex sets (info-gap), evidence theory, and interval theory. An MC-dropout sketch of epistemic-uncertainty estimation follows.
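
For epistemic uncertainty in deep networks specifically, MC dropout (the approximation behind Bayesian SegNet [13]) is a common Bayesian technique; a minimal sketch:

```python
import torch

def mc_dropout_predict(model, x, n_samples=20):
    """Keep dropout active at test time and sample several stochastic passes."""
    model.train()  # leaves dropout layers stochastic
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_samples)])
    return preds.mean(0), preds.var(0)  # predictive mean, epistemic spread
```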

References

[7] S. J. Vine, R. S. Masters, J. S. McGrath, E. Bright, and M. R. Wilson, “Cheating experience: Guiding novices to adopt the gaze strategies of experts expedites the learning of technical laparoscopic skills,” Surgery, vol. 152, no. 1, pp. 32–40, 2012.
[8] Y. Yamani, P. Bıçaksız, D. B. Palmer, J. M. Cronauer, and S. Samuel, "Following expert's eyes: Evaluation of the effectiveness of a gaze-based training intervention on young drivers' latent hazard anticipation skills," in 9th International Driving Symposium on Human Factors in Driver Assessment, Training, and Vehicle Design, 2017.

如何将gaze information与deep neural networks结合起来

[9] R. Zhang, Z. Liu, L. Zhang, J. A. Whritner, K. S. Muller, M. M. Hayhoe, and D. H. Ballard, "AGIL: Learning attention from human for visuomotor tasks," arXiv preprint arXiv:1806.03960, 2018.
[10] C. Liu, Y. Chen, L. Tai, H. Ye, M. Liu, and B. E. Shi, “A gaze model improves autonomous driving,” in Proceedings of the 2019 ACM Symposium on Eye Tracking Research & Applications. ACM, 2019, to appear

[12] R. McAllister, Y. Gal, A. Kendall, M. van der Wilk, A. Shah, R. Cipolla, and A. Weller, "Concrete problems for autonomous vehicle safety: Advantages of Bayesian deep learning," in Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-17, 2017, pp. 4745–4753. [Online]. Available: https://doi.org/10.24963/ijcai.2017/661
[13] A. Kendall, V. Badrinarayanan, and R. Cipolla, “Bayesian segnet: Model uncertainty in deep convolutional encoder-decoder architectures for scene understanding,” arXiv preprint arXiv:1511.02680, 2015.

[14] X. Liang, T. Wang, L. Yang, and E. Xing, "CIRL: Controllable imitative reinforcement learning for vision-based self-driving," in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 584–599.

gaze in autonomous driving

[18] S. Alletto, A. Palazzi, F. Solera, S. Calderara, and R. Cucchiara, “Dr (eye) ve: a dataset for attention-based tasks with applications to autonomous and assisted driving,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2016, pp. 54–60.
[19] A. Palazzi, D. Abati, S. Calderara, F. Solera, and R. Cucchiara, “Predicting the driver’s focus of attention: the dr (eye) ve project,” arXiv preprint arXiv:1705.03854, 2017.

Predicting the Driver’s Focus of Attention: the DR(eye)VE Project

Palazzi, A., Abati, D., Calderara, S., Solera, F., & Cucchiara, R. (2018). Predicting the Driver’s Focus of Attention: the DR (eye) VE Project. IEEE transactions on pattern analysis and machine intelligence.

The goal of this paper is to assess which parts of a driver's field of view matter most for the driving task.

  • Proposes a multi-branch deep architecture with three components: raw video, motion, and scene semantics.

  • Also introduces the DR(eye)VE dataset, the largest eye-tracking-annotated driving dataset available to date (500k registered frames)

    • in different traffic and weather conditions
    • approximately 6 hours
    • recorded by an accurate eye-tracking device and a roof-mounted camera
    • The DR(eye)VE data richness enables training an end-to-end deep network that predicts salient regions in car-centric driving videos.
  • Applicable to human-vehicle interaction, driver attention analysis, and more
Introduction

The paper arrives at this from two directions:

  1. Studying data-driven driver gaze locations under different conditions and scenes, finding that scene semantics, speed, and bottom-up features all influence the driver's gaze
  2. Advocating a universal gaze pattern that holds across different drivers

The network consists of three branches (a simplified sketch follows this list):

  1. Visual information of the scene
  2. Motion information (optical flow)
  3. Semantic segmentation
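
A much-simplified stand-in for the three-branch design (channel counts for the flow and semantic inputs are assumptions, and the real model works on video clips rather than single frames):

```python
import torch
import torch.nn as nn

class ThreeBranchSaliencySketch(nn.Module):
    """Each branch maps its input stream to a saliency map; maps are fused by summation."""
    def __init__(self):
        super().__init__()
        def branch(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 1, 3, padding=1),
            )
        self.rgb = branch(3)    # raw video frame
        self.flow = branch(2)   # optical flow (dx, dy)
        self.seg = branch(19)   # semantic segmentation scores

    def forward(self, rgb, flow, seg):
        return torch.sigmoid(self.rgb(rgb) + self.flow(flow) + self.seg(seg))
```
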
On visual attention

Reference [20] is a survey of visual attention mechanisms.

Two computational models are used for FoA (focus of attention) prediction:

  • top-down
    • aims at highlighting objects and cues that could be meaningful in the context of a given task
    • such methods are known as task-driven [9, 49, 50]
    • some integrate semantic contextual information into the attention prediction process [64]
  • bottom-up
    • captures salient objects or events that naturally pop out in the image

Exploring the Limitations of Behavior Cloning for Autonomous Driving

References to read: [27], [29], [44]

  • For building the validation datasets, see [9]

Appendix

  • No lane changes or overtaking during data collection

Deep Imitation Learning for Autonomous Driving in Generic Urban Scenarios with Enhanced Safety

From University of California, Berkeley.

This paper proposes a new idea: imitation learning on bird's-eye-view inputs, similar to the approaches of Waymo and Uber. It works well, performs strongly in complex scenarios (signalized intersections and roundabouts), and generalizes reasonably.

It is likewise split into three modules: Perception, Deep Imitation Learning, and Safety & Tracking Controller.

However, it requires an HD map and keeps a record of historical states.

This starts to feel like behavior learning.

A few open questions:

  • How is the bird's-eye view obtained?
  • Experiments were run only in simulation, not in the real world, because an HD map is required
  • The paper does not explain how the Perception module is obtained
  • The Waymo and Uber papers are worth reading


3. Direct Perception

Conditional Affordance Learning for Driving in Urban Environments (2018, ETH)

DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving (2015 ICCV, Princeton University)

2. Learning affordance for driving perception

  • training phase
    • collect screenshots → train a model to estimate affordances in a supervised learning manner
  • test phase
    • the trained model takes a driving-scene image and estimates the affordance indicators for driving
    • a driving controller processes the indicators and computes the steering and acceleration/brake commands
    • ground-truth labels → evaluate the system's performance

2.1 Mapping from an image to affordance

Focuses on highway driving with multiple lanes.

Highway driving actions:

  • following the lane center line
  • changing lanes or slowing down to avoid collisions with the preceding cars

Three types of indicators:

  • heading angle
  • the distance to the nearby lane markings
  • the distance to the preceding cars

In total there are 13 affordance indicators; a toy controller built on such indicators is sketched below.
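
To illustrate how a controller can act on such indicators, here is a toy lane-centering and speed rule; the gains and the headway rule are illustrative assumptions, not the paper's exact control law.

```python
def steering_command(heading_angle, dist_to_center, lane_width, k=1.0):
    """Steer against the heading error and the offset from the lane center."""
    return k * (heading_angle - dist_to_center / lane_width)

def desired_speed(dist_to_preceding, v_max=20.0, time_headway=2.0):
    """Cap the speed by the time gap to the preceding car."""
    return min(v_max, dist_to_preceding / time_headway)
```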

Affordance Learning In Direct Perception for Autonomous Driving

From the cognition lab at the University of Waterloo. The affordances proposed here are fairly abstract, e.g., vehicle heading angle, the number of lanes, and whether a lane is a bike lane. No driving experiments were run (not even in simulation); the method only makes predictions on an existing dataset and compares them against human annotation.

It mainly builds on reference [10], Learning from Maps: Visual Common Sense for Autonomous Driving, written at Princeton; the advisor, Jianxiong Xiao, founded the company AutoX, whose technical route is vision-only. Arguably less aggressive than Wayve's reinforcement-learning approach.
