PlatformHover#

A cooperative control task where a group of k UAVs are connected together by a rigid frame to form an overactuated platform. Each individual UAV, attached by a 2-DoF passive gimbal joint, acts as a thrust generator. The goal for the agents is to make the platform hover at a reference pose (position and attitude).

Observation#

The observation is a CompositeSpec containing the following items:

obs_self (1, *): The state of each UAV observed by itself, containing its kinematic information with the position being relative to the frame center, and an one-hot identity indicating the UAV’s index.
obs_others (k-1, *): The observed states of other agents.
obs_frame:
state_frame: (1, *): The state of the frame.
rpos (3): The relative position of the platform to the reference positions.
time_encoding (optional): The time encoding, which is a 4-dimensional vector encoding the current progress of the episode.

Reward#

reward_pose: The reward for the pose error between the platform and the reference (position and orientation).
reward_up: The reward for the alignment of the platform’s up vector and the reference up vector.
reward_spin: Reward computed from the spin of the drone to discourage spinning.
reward_effort: Reward computed from the effort of the drone to optimize the energy consumption.

The total reward is computed as follows:

\[r = r_\text{pose} + r_\text{pose} * (r_\text{up} + r_\text{spin}) + r_\text{effort}\]

Config#

Parameter	Type	Default
`drone_model`	str	“hummingbird”
`num_drones`	int	4
`arm_length`	float	0.85
`reward_distance_scale`	float	1.2
`time_encoding`	bool	True

PlatformHover

Contents

PlatformHover#

Observation#

Reward#

Config#