Class Agent
An agent is an actor that can observe its environment, decide on the best course of action using those observations, and execute those actions within the environment.
Namespace: Unity.MLAgents
Assembly: Unity.MLAgents.dll
Syntax
[HelpURL("https://github.com/Unity-Technologies/ml-agents/blob/release_21_docs/docs/Learning-Environment-Design-Agents.md")]
[Serializable]
[RequireComponent(typeof(BehaviorParameters))]
[DefaultExecutionOrder(-50)]
public class Agent : MonoBehaviour, ISerializationCallbackReceiver, IActionReceiver, IHeuristicProvider
Remarks
Use the Agent class as the base class for implementing your own agents. Add your Agent implementation to a GameObject in the Unity scene that serves as the agent's environment.
Agents in an environment operate in steps. At each step, an agent collects observations, passes them to its decision-making policy, and receives an action vector in response.
Agents make observations using ISensor implementations. The ML-Agents API provides implementations for visual observations (CameraSensor), raycast observations (RayPerceptionSensor), and arbitrary data observations (VectorSensor). You can add the CameraSensorComponent and RayPerceptionSensorComponent2D or RayPerceptionSensorComponent3D components to an agent's GameObject to use those sensor types. You can implement the CollectObservations(VectorSensor) function in your Agent subclass to use a vector observation. The Agent class calls this function before it uses the observation vector to make a decision. (If you only use visual or raycast observations, you do not need to implement CollectObservations(VectorSensor).)
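For example, a minimal CollectObservations(VectorSensor) override might add the agent's position and velocity to the observation vector. A sketch, assuming a Rigidbody on the same GameObject (the class and field names are illustrative, not part of the API):

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Sensors;
using UnityEngine;

public class RollerAgent : Agent // hypothetical example agent
{
    Rigidbody m_Rigidbody; // assumed to be on the same GameObject

    public override void Initialize()
    {
        m_Rigidbody = GetComponent<Rigidbody>();
    }

    public override void CollectObservations(VectorSensor sensor)
    {
        // Each AddObservation call appends values to the vector observation.
        sensor.AddObservation(transform.localPosition); // 3 floats
        sensor.AddObservation(m_Rigidbody.velocity.x);  // 1 float
        sensor.AddObservation(m_Rigidbody.velocity.z);  // 1 float
    }
}
```

The total number of values added here (5) must match the vector observation Space Size configured in the agent's BehaviorParameters component.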
Assign a decision-making policy to an agent using a BehaviorParameters component attached to the agent's GameObject. The BehaviorType setting determines how decisions are made (a runtime-switching sketch follows this list):
- Default: decisions are made by the external process when one is connected. Otherwise, decisions are made using inference; if no inference model is specified in the BehaviorParameters component, heuristic decision making is used.
- InferenceOnly: decisions are always made using the trained model specified in the BehaviorParameters component.
- HeuristicOnly: when a decision is needed, the agent's Heuristic(in ActionBuffers) function is called. Your implementation is responsible for providing the appropriate action.
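BehaviorType is normally set in the Inspector, but it can also be changed from code through the BehaviorParameters component. A minimal sketch, assuming the component sits on the agent's GameObject (the helper class is hypothetical):

```csharp
using Unity.MLAgents.Policies;
using UnityEngine;

public static class BehaviorSwitcher // hypothetical helper
{
    // Forces an agent to use its Heuristic(in ActionBuffers) implementation,
    // e.g. for manual play-testing in the Editor.
    public static void UseHeuristic(GameObject agentObject)
    {
        var behavior = agentObject.GetComponent<BehaviorParameters>();
        behavior.BehaviorType = BehaviorType.HeuristicOnly;
    }
}
```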
To trigger an agent decision automatically, you can attach a DecisionRequester component to the Agent game object. You can also call the agent's RequestDecision() function manually. You only need to call RequestDecision() when the agent is in a position to act upon the decision. In many cases, this will be every FixedUpdate callback, but could be less frequent. For example, an agent that hops around its environment can only take an action when it touches the ground, so several frames might elapse between one decision and the need for the next.
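As a sketch of the manual approach, the hopping agent described above might request a decision only when it lands (the tag and collision check are illustrative):

```csharp
using Unity.MLAgents;
using UnityEngine;

public class HopperAgent : Agent // hypothetical example agent
{
    void OnCollisionEnter(Collision collision)
    {
        // Only ask the policy for a new action when the agent can act on it.
        if (collision.gameObject.CompareTag("Ground")) // assumed tag
        {
            RequestDecision();
        }
    }
}
```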
Use the OnActionReceived(ActionBuffers) function to implement the actions your agent can take, such as moving to reach a goal or interacting with its environment.
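A minimal OnActionReceived(ActionBuffers) override, assuming two continuous actions that drive a Rigidbody (the force scale is an illustrative tuning value):

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using UnityEngine;

public class MoverAgent : Agent // hypothetical example agent
{
    public float forceScale = 10f; // illustrative tuning parameter
    Rigidbody m_Rigidbody;

    public override void Initialize()
    {
        m_Rigidbody = GetComponent<Rigidbody>();
    }

    public override void OnActionReceived(ActionBuffers actions)
    {
        // Continuous actions arrive in [-1, 1]; map them to a force.
        var move = new Vector3(actions.ContinuousActions[0], 0f,
                               actions.ContinuousActions[1]);
        m_Rigidbody.AddForce(move * forceScale);
    }
}
```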
When you call EndEpisode() on an agent or the agent reaches its MaxStep count, its current episode ends. You can reset the agent -- or remove it from the environment -- by implementing the OnEpisodeBegin() function. An agent also becomes done when the Academy resets the environment, which only happens when the Academy receives a reset signal from an external process via the Communicator.
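A typical OnEpisodeBegin() implementation resets the agent to a known state, while the task logic calls EndEpisode() when a terminal condition is met. A sketch, assuming a goal-reaching task (the goal field and distance threshold are illustrative):

```csharp
using Unity.MLAgents;
using UnityEngine;

public class GoalAgent : Agent // hypothetical example agent
{
    public Transform goal; // assumed to be assigned in the Inspector

    public override void OnEpisodeBegin()
    {
        // Reset the agent to a known state at the start of each episode.
        transform.localPosition = Vector3.zero;
    }

    void FixedUpdate()
    {
        if (Vector3.Distance(transform.localPosition, goal.localPosition) < 1f)
        {
            SetReward(1f); // terminal reward for reaching the goal
            EndEpisode();  // marks the agent done; OnEpisodeBegin() runs next
        }
    }
}
```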
The Agent class extends the Unity MonoBehaviour class. You can implement the standard MonoBehaviour functions as needed for your agent. Since an agent's observations and actions typically take place during the FixedUpdate phase, you should only use the MonoBehaviour.Update function for cosmetic purposes. If you override the MonoBehaviour methods OnEnable() or OnDisable(), always call the base Agent class implementations.
You can implement the Heuristic(in ActionBuffers) function to specify agent actions using your own heuristic algorithm. Implementing a heuristic function can be useful for debugging. For example, you can use keyboard input to select agent actions in order to manually control an agent's behavior.
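For instance, a Heuristic(in ActionBuffers) override can map the standard Unity input axes to two continuous actions (the axis names are Unity's defaults; the mapping itself is illustrative):

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using UnityEngine;

public class KeyboardAgent : Agent // hypothetical example agent
{
    public override void Heuristic(in ActionBuffers actionsOut)
    {
        var continuousActions = actionsOut.ContinuousActions;
        continuousActions[0] = Input.GetAxis("Horizontal"); // A/D or arrow keys
        continuousActions[1] = Input.GetAxis("Vertical");   // W/S or arrow keys
    }
}
```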
Note that you can change the inference model assigned to an agent at any step by calling SetModel(string, ModelAsset, InferenceDevice).
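A sketch of swapping models at runtime, assuming a second ModelAsset has been assigned in the Inspector and that the behavior name matches the one in the BehaviorParameters component (the field and method names are illustrative):

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Policies;
using Unity.Sentis;
using UnityEngine;

public class CurriculumAgent : Agent // hypothetical example agent
{
    public ModelAsset hardModeModel; // assumed to be assigned in the Inspector

    // Hypothetical hook called by game logic when the difficulty changes.
    public void SwitchToHardMode()
    {
        // "MyBehavior" must match the Behavior Name in BehaviorParameters.
        SetModel("MyBehavior", hardModeModel, InferenceDevice.Default);
    }
}
```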
See Agents and Reinforcement Learning in Unity in the Unity ML-Agents Toolkit manual for more information on creating and training agents.
For sample implementations of agent behavior, see the examples available in the Unity ML-Agents Toolkit on Github.
Fields
| Name | Description |
|---|---|
| MaxStep | The maximum number of steps the agent takes before being done. |
Properties
| Name | Description |
|---|---|
| CompletedEpisodes | Returns the number of episodes that the Agent has completed (either EndEpisode() was called, or MaxStep was reached). |
| StepCount | Returns the current step counter (within the current episode). |
Methods
| Name | Description |
|---|---|
| AddReward(float) | Increments the step and episode rewards by the provided value. |
| Awake() | Called when the Agent is being loaded (before OnEnable()). |
| CollectObservations(VectorSensor) | Implement CollectObservations(VectorSensor) to collect the vector observations of the agent for the step. |
| EndEpisode() | Sets the done flag to true and resets the agent. |
| EpisodeInterrupted() | Indicate that the episode is over but not due to the "fault" of the Agent. This has the same end result as calling EndEpisode(), but has a slightly different effect on training. |
| GetCumulativeReward() | Retrieves the episode reward for the Agent. |
| GetObservations() | Returns a read-only view of the observations that were generated in CollectObservations(VectorSensor). This is mainly useful inside of a Heuristic(in ActionBuffers) method to avoid recomputing the observations. |
| GetStackedObservations() | Returns a read-only view of the stacked observations that were generated in CollectObservations(VectorSensor). This is mainly useful inside of a Heuristic(in ActionBuffers) method to avoid recomputing the observations. |
| GetStoredActionBuffers() | Gets the most recent ActionBuffer for this agent. |
| Heuristic(in ActionBuffers) | Implement Heuristic(in ActionBuffers) to choose an action for this agent using a custom heuristic. |
| Initialize() | Implement Initialize() to perform one-time initialization or set up of the Agent instance. |
| LazyInitialize() | Initializes the agent. Can be safely called multiple times. |
| OnActionReceived(ActionBuffers) | Implement OnActionReceived(ActionBuffers) to specify agent behavior at every step, based on the provided action. |
| OnAfterDeserialize() | Called by Unity immediately after deserializing this object. |
| OnBeforeSerialize() | Called by Unity immediately before serializing this object. |
| OnDisable() | Called when the attached [GameObject](https://docs.unity.cn/Manual/GameObjects.html) becomes disabled and inactive. |
| OnEnable() | Called when the attached [GameObject](https://docs.unity.cn/Manual/GameObjects.html) becomes enabled and active. |
| OnEpisodeBegin() | Implement OnEpisodeBegin() to set up an Agent instance at the beginning of an episode. |
| RequestAction() | Requests an action for this agent. |
| RequestDecision() | Requests a new decision for this agent. |
| ScaleAction(float, float, float) | Scales continuous action from [-1, 1] to arbitrary range. |
| SetModel(string, ModelAsset, InferenceDevice) | Updates the Model assigned to this Agent instance. |
| SetReward(float) | Overrides the current step reward of the agent and updates the episode reward accordingly. |
| WriteDiscreteActionMask(IDiscreteActionMask) | Implement WriteDiscreteActionMask(IDiscreteActionMask) to collect the masks for discrete actions. When using discrete actions, the agent will not perform the masked action. |