1. CMU robotic institute论文

Motion Planning in Urban Environments: Part I (2008)

Ferguson, D., Howard, T. M., & Likhachev, M. (2008, September). Motion planning in urban environments: Part ii. In 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems (pp. 1070-1076). IEEE.

基于Boss车上采用方法，选用两个planner，并利用一个模型预测算法对周围动态物体进行预测。

1554377517789

关于两个planner

结构化道路时(nominal on-road driving)，直接跟踪车道线，高速规划
非结构化道路时(unstructured driving)，利用一个4D lattice planner

关于lattice planner

参考论文——Differentially constrained mobile robot motion planning in state lattices

参考链接https://blog.csdn.net/yuxuan20062007/article/details/82330165

Frenetic坐标系 基于车道线横向和纵向

纵向偏移量用S表示，横向偏移量用L表示

Motion Planning in Urban Environments: Part II (2008)

主要是关于unstructured environment下的运动规划

sophisticated motion planning

Pivtoraiko, M.——state lattice

Pivtoraiko, M. (2009, July). Adaptive Anytime Motion Planning For Robust Robot Navigation In Natural Environments. In 2009 Advanced Technologies for Enhanced Quality of Life(pp. 123-129). IEEE.

作者就是提出state lattice那人

采用方法

在状态栅格中利用无约束的启发式搜索可以保证满足任意的移动约束。

1554384420867

Differentially Constrained Mobile Robot Motion Planning in State Lattices（2009）

Pivtoraiko, M., Knepper, R. A., & Kelly, A. (2009). Differentially constrained mobile robot motion planning in state lattices. Journal of Field Robotics, 26(3), 308-333.

状态格子是一种对状态空间离散化的手段.状态格子由结点(表示状态)和从该结点出发到达相邻结点的运动基元组成，一个状态结点可以通过其运动基元变换到另一个状态结点.这样，状态格子就将原来连续的状态空间转化为一个搜索图，运动规划问题就变成了在图中搜索出一系列将初始状态变换到目标状态的运动基元.

Model-Predictive Motion Planning: Several Key Developments for Autonomous Mobile Robots（2014）

Howard, T., Pivtoraiko, M., Knepper, R. A., & Kelly, A. (2014). Model-predictive motion planning: Several key developments for autonomous mobile robots. IEEE Robotics & Automation Magazine, 21(1), 64-73.

这是Pivtoraiko参与的一篇综述性文章，里面好多图都在以前论文里面出现过

2. 综述性文章

Planning and Decision-Making for Autonomous Vehicles(2018)

Schwarting, W., Alonso-Mora, J., & Rus, D. (2018). Planning and decision-making for autonomous vehicles. Annual Review of Control, Robotics, and Autonomous Systems, 1, 187-210.

Abstract

这篇文章主要介绍了自动驾驶技术三个方面的技术

approaches for integrated perception and planning and for behavior-aware planning, rely on machine learning
the question of verification and safety
the state of the art and remaining challenges for managing fleets of autonomous vehicles

1) traditional methods for planning and control

compute safe trajectories based on three lines of thought.(29, 30)

input space discretization with collision checking
- lattice planners(31,32)
- road-aligned primitives(33)—-advantage is their simplicity and effectiveness, especially in highway scenarios
randomized planning
- rapidly exploring random trees (RRT) (34,35)—-advantage is the probabilistic exploration of large state spaces, albeit at a high computational cost
constrained optimization and receding-horizon control (19,36)—-path following but now can also compute collision-free trajectories to avoid other traffic participants (28, formulated a nonlinear model predictive controller and safely navigate, 能实现避障主要是由于nonlinear constrained optimization最近几年的进展)

constrained optimization的优势主要是smoothness of trajectories and direct encoding of the vehicle model in the trajectory planning，前提条件是为凸函数

以上规划算法的前提是周围交通参与者的未来轨迹是已经预测可知的。

handling complex clutter and modeling the interactions with other road users remains an unsolved problem for autonomous driving

2) end-to-end learning

传统的自动驾驶框架（13，67mediated perception）

a method for integrated perception and planning, which generate a control input for the vehicle directly from sensory information and typically rely on machine learning.

文献68创造path通过lidar point clouds + GPS-IMU + Google navigation information, based on a fully CNN, 文献69直接利用的camera image space。

端到端最早可追溯到1989年ALVINN (Autonomous Land Vehicle in a Neural Network)利用一个神经网络从图像输出转角。文献67以此为基础作为行为反射法，到2006年文献71已经可以躲避越野道路的障碍物。随着GPU计算能力的增长，NVIDIA(72)训练的一个深度神经网络已经可以在gravel road, passing through roadwork, and driving during the night in poorly lit environments.

文献73，74，67对各种道路特征进行了学习，研究输入图像哪部分对网络贡献最大

文献75利用一个大尺度驾驶视频对一个端到端全卷积长短期记忆网络进行了训练。

文献78将端到端应用到了机器人领域，利用2-D激光雷达，文献79提出一种对称神经网络结构学习社会意识行为（感觉跟Wayve公司最新的研究声明很相似，只利用摄像头和基本的导航，学习方式与人类驾驶员相同：通过经验、错误、反馈和模仿）。由于现阶段的ensemble、bootstrap、Monte Carlo方法对陌生环境效果不好，文献80提出一种自动编码器对新环境的数据进行识别，然后将端到端的学习方法转换到安全的非基于学习的方法。

既然对新环境不友好，但是对于那种封闭园区或者固定线路的是否可以呢（比如清扫车，固定路线的公交车）

simulation learning

类似于强化学习，在稀有情况下不能奏效。文献81利用深度Q-Network学习了车辆的转角在一个仿真环境，文献82利用分割网络来缩小虚拟数据跟真实数据之间的差距，文献83提出actor-critic和model-free算法来解决连续动作空间的操作问题，该算法基于确定的梯度下降并且依赖强化学习，能实现车道保持在虚拟环境中。

3) behavior-aware planning

这一小节都在将如何估计周车行为，不通过车车通信。

定义：decision-making and planning are integrated into interactive planning

Handling complex clutter and modeling interactions with other road users are necessary to provide safety.

“freezing robot” problem: once the environment surpasses a certain level of complexity, the planner decides that all forward paths are unsafe, and the robot freezes in place (or performs unnecessary maneuvers) to avoid collisions.

“机器人冻结”问题：一旦环境超过某种复杂度，规划器会认为所有向前的路径是不安全的，机器人“冻结“（plus:或卡住）在原地（或执行不必要的动作（plus:原地瞎晃悠））以避免碰撞。

Learning-Based Approaches

将决策与规划解耦的典型方法，文献105训练一个支持向量机实现换道决策，文献106提出的高斯混合模型（参数由神经网络得到）具有自车、周围车辆状态、过去的行为及规范化道路的特征，用来预测一组车辆在公路上的运动。

文献107提出非参数预测结构（nonparametric prediction architecture），此处没看懂

文献87通过对周车建模，利用逆强化学习（inverse reinforcement learning）得到奖励函数

文献111展示了具有未知奖励函数的马尔可夫决策过程在模仿不同的驾驶风格方面的潜力（在一个公路模拟器里面），文献112中利用更少的学习样本实现在停车场的类人轨迹生成。

文献113利用最大熵IRL来避免过拟合，最大熵分布对数据依赖小。最大熵IRL模型已经成为机器人和自动驾驶汽车学习代价函数的热门方法。

文献114、88、115描述学习社会顺从运动规划和驾驶行为（socially compliant motion planning and human behaviour）

文献116提出优先适应导航，机器人选择根据任务的社会接受度选择导航模型。相似的变体，文献117、118、119利用最大收益规划（maximum margin planning）实现机器人在复杂非结构化地形进行导航。

文献120利用连续逆最优控制得到局部最优例子，用来解决连续状态和动作，并具有学习复杂侵略驾驶风格的能力。类似的，文献87也展示了一个人类驾驶模型（学习人类驾驶员的奖励）。

文献121说明了一个风险敏感的IRL框架，能够将专家经验中的风险敏感系数考虑进去。

文献122提出最大熵深度IRL框架探索驾驶员行为的代价模型，文献123利用端到端的映射从25,000个实例（120km/h）原始输入中得到代价地图。

文献124展示了模仿学习（文献125）的有效性。

两种学习方法：一种是先利用IRL对专家经验提取得到代价函数，然后利用强化学习利用该代价函数得到相关策略，另一种是用生成对抗模仿学习，直接在专家轨迹中学到策略，绕过IRL的中间过程。

4) verification of the methods for autonomous driving

5) manage fleets of autonomous vehicles

编队管理

without pooling requests，文献136关注流动估计(fluid approximations)，文献137为基于队列的方程（queuing-based formulations），文献138提出基于命令的自动移动系统（autonomous mobility-on-demand systems），文献139-141主要是针对拼车问题（ride-pooling）相关的车辆路径问题和动态取送问题。

拼车问题，文献143（TUD Alonso的）提出一种高低容量车辆请求匹配和动态车辆路径选择的实时优化方法，主要分为三步：1.修剪可能的出行组合；2.分配出行车辆；3.编队再管理。解耦的方法

文献144对AI在编队管理中的作用做了相关描述，文献145交通大数据，文献146 关于随机路径的综述，文献147 （TUD Alonso的）对随机路径进行了补充，计算未来需求的历史概率分布包括车辆动态路径和乘客任务的随机采样，在共享驾驶的背景下。

现在最重要的挑战是如何将现有的技术扩展到城市规模的问题上，涉及每天数百万次的出行 ，大型编队的管理、隐私和人类监督也是未来的研究方向，AI技术将扮演重要的角色。

6) future research

3. TUDelft Cognitive Robotics

Parallel autonomy in automated vehicles: Safe motion generation with minimal intervention

Schwarting, W., Alonso-Mora, J., Pauli, L., Karaman, S., & Rus, D. (2017, May). Parallel autonomy in automated vehicles: Safe motion generation with minimal intervention. In 2017 IEEE International Conference on Robotics and Automation (ICRA) (pp. 1928-1935). IEEE.

1）解决的问题

Parallel Autonomy平行驾驶（人机共驾）中对人能力的补偿

2）提出的方法

开发一个作为非线性模型预测控制（NMPC）的滚动优化规划器

3）实现步骤

1555420865701

4）自己的评价

本文主要是利用一个滚动优化规划器来最小化人类输入的偏差（在平行驾驶框架下确保安全的条件），在T形路口，换道，左转上面表现良好。主要目的是让机器自己决策，减少人的干预。

未来将考虑轮胎滑移载荷变化等情况，并在更加复杂的环境如：很多人、自行车，各种交通灯、停止信号的环境。

Sample Efficient Learning of Path Following and Obstacle Avoidance Behavior for Quadrotors

Stevšić, S., Nägeli, T., Alonso-Mora, J., & Hilliges, O. (2018). Sample efficient learning of path following and obstacle avoidance behavior for quadrotors. IEEE Robotics and Automation Letters, 3(4), 3852-3859.

1）解决的问题

无人机的避障

2）提出的方法

提出一个算法用于神经网络控制策略的训练（用于无人机），利用模仿学习产生策略（专家经验由一个无时间模型预测路径跟踪的控制器生成），学习的demo为几个实例，保证模型学习算法的鲁棒性。

The method uses a collision-free exploration strategy, which bounds divergence from a collision-free region during learning.
The learned policy reduces computational cost compared to the supervision algorithm.
The learned policy receives coarse guidance(大概的指导) from a global planner

3）实现步骤

4）自己的评价

Collision avoidance for aerial vehicles in multi-agent scenarios

Alonso-Mora, J., Naegeli, T., Siegwart, R., & Beardsley, P. (2015). Collision avoidance for aerial vehicles in multi-agent scenarios. Autonomous Robots, 39(1), 101-121.

2）提出的方法

local motion planning

3）实现步骤

进行了两个实验

四个无人机近距离的飞行
两个无人机对一个人进行避障

Joint Multi-Policy Behavior Estimation and Receding-Horizon Trajectory Planning for Automated Urban Driving

Zhou, B., Schwarting, W., Rus, D., & Alonso-Mora, J. (2018, May). Joint multi-policy behavior estimation and receding-horizon trajectory planning for automated urban driving. In 2018 IEEE International Conference on Robotics and Automation (ICRA) (pp. 2388-2394). IEEE.

多个物体的行为预测+轨迹规划

周车：部分马尔可夫决策过程估计行为，Partially Observable Markov Decision Processes to estimated the behaviour of other traffic participants

自车：POMDP给一个规划轨迹（planned trajectory），滚动优化控制产生安全的轨迹（safe trajectories）

滚动优化规划器通过chance constraints over multiple motion polices，这些约束主要是由周车的不确定行为产生，参考文献为4
POMDP最新的求解器为DESPOT，参考文献为14

1）方法的结构框架（schema of the proposed method ）

1555675469605

$P$ 为每个障碍车运动目的的权重，$\Pi$ 为每辆障碍车的估计轨迹，${\pi}^0$ 为自车的规划轨迹。

2）方法

4. Fleet management

Alonso-Mora, J., Wallar, A., & Rus, D. (2017, September). Predictive routing for autonomous mobility-on-demand systems with ride-sharing. In 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 3583-3590). IEEE.

提出一个适用于城市车辆的实时自动驾驶拼车系统，通过历史数据获得当前车辆的动态轨迹，并将车辆轨迹和乘客任务分配进行解耦。

本方法主要分为三步：

对出行方案可行性进行分析
分配车辆出行
闲置车辆的再平衡

Fleet operational policies for automated mobility: A simulation assessment for Zurich

Hörl, S., Ruch, C., Becker, F., Frazzoli, E., & Axhausen, K. W. (2019). Fleet operational policies for automated mobility: A simulation assessment for Zurich. Transportation Research Part C: Emerging Technologies, 102, 20-31.

文章里面算法是用的已经开源的一个 AMoD https://www.amodeus.science/，并且在论文“AMoDeus, a Simulation-Based Testbed for Autonomous Mobility-on-Demand Systems”种介绍很详细。

文章使用了两个模拟器MATSim和AMoD，其中MATSim 可参考论文“The Multi-Agent Transport Simulation MATSim. Ubiquity ”

5. Path Planning

Practical Search Techniques in Path Planning for Autonomous Driving

解决的问题

提出Hybrid Astar算法
- 用一种启发搜索的方法获得全局满足车辆运动学的最优轨迹，解的结果在全局最优附近
- 利用conjugate gradient descent得到局部最优，同时满足全局最优
克服势场法对于狭窄通道实际不能通行的问题
- 基于工作空间的几何尺寸进行势场调整

基本描述

算法满足前进和倒车的要求，对倒车和改变运动方向有惩罚函数
两种启发函数
- non-holonomic-without-obstacles
  - use a max of non-holonomic-without-obstacles cost and 2D Euclidean distance as heuristic
  - 这种启发函数不依靠实时的传感器信息，可以先离线计算好。速度接近一个数量级的提高
- holonomic-with-obstacles
  - 在2D空间发现所有的U形障碍物和死胡同，在3D搜索时远离这些区域

国外运动规划论文阅读笔记（1）

1. CMU robotic institute论文

Motion Planning in Urban Environments: Part I (2008)

关于两个planner

关于lattice planner

Motion Planning in Urban Environments: Part II (2008)

Pivtoraiko, M.——state lattice

Adaptive Anytime Motion Planning For Robust Robot Navigation In Natural Environments (2009)

采用方法

Differentially Constrained Mobile Robot Motion Planning in State Lattices（2009）

Model-Predictive Motion Planning: Several Key Developments for Autonomous Mobile Robots（2014）

2. 综述性文章

Planning and Decision-Making for Autonomous Vehicles(2018)

Abstract

1) traditional methods for planning and control

2) end-to-end learning

3) behavior-aware planning

Learning-Based Approaches

4) verification of the methods for autonomous driving

5) manage fleets of autonomous vehicles

6) future research

3. TUDelft Cognitive Robotics

Parallel autonomy in automated vehicles: Safe motion generation with minimal intervention

1）解决的问题

2）提出的方法

3）实现步骤

4）自己的评价

Sample Efficient Learning of Path Following and Obstacle Avoidance Behavior for Quadrotors

1）解决的问题

2）提出的方法

3）实现步骤

4）自己的评价

Collision avoidance for aerial vehicles in multi-agent scenarios

2）提出的方法

3）实现步骤

Joint Multi-Policy Behavior Estimation and Receding-Horizon Trajectory Planning for Automated Urban Driving

1）方法的结构框架（schema of the proposed method ）

2）方法

4. Fleet management

Predictive Routing for Autonomous Mobility-on-Demand Systemswith Ride-Sharing

Fleet operational policies for automated mobility: A simulation assessment for Zurich

5. Path Planning

Practical Search Techniques in Path Planning for Autonomous Driving