Forest#
This is a single-agent task where the agent is required to navigate a randomly generated cluttered environment. The agent needs to fly at a commanded speed along the positive direction while avoiding collisions with obstacles.
The agent utilizes Lidar sensors to perceive its surroundings. The Lidar has a horizontal field of view (FOV) of 360 degrees and a the vertical view can be specified.
Observation#
The observation is given by a CompositeSpec containing the following values:
"state"(16 +num_rotors): The basic information of the drone (except its position), containing its rotation (in quaternion), velocities (linear and angular), heading and up vectors, and the current throttle."lidar"(1, w, h) : The lidar scan of the drone. The size is decided by the field of view and resolution.
Reward#
vel: Reward computed from the position error to the target position.up: Reward computed from the uprightness of the drone to discourage large tilting.survive: Reward of a constant value to encourage collision avoidance.effort: Reward computed from the effort of the drone to optimize the energy consumption.action_smoothness: Reward that encourages smoother drone actions, computed based on the throttle difference of the drone.
The total reward is computed as follows:
Episode End#
The episode ends when the drone misbehaves, e.g., when the drone collides with the ground or obstacles, or when the drone flies out of the boundary:
or when the episode reaches the maximum length.
Config#
Parameter |
Type |
Default |
Description |
|---|---|---|---|
|
str |
“firefly” |
Specifies the model of the drone being used in the environment. |
|
float |
4.0 |
Specifies the maximum range of the lidar. |
|
float |
[-10, 20] |
Specifies the vertical field of view of the lidar. |
|
bool |
True |
Indicates whether to include time encoding in the observation space. If set to True, a 4-dimensional vector encoding the current progress of the episode is included in the observation. If set to False, this feature is not included. |