Safety Gym was announced by OpenAI, an organization that researches artificial intelligence. OpenAI pointed out that AI trained with existing reinforcement learning can make unexpected errors by taking dangerous actions, and introduced Safety Gym as a tool for training reinforcement learning agents while respecting safety constraints.
Reinforcement learning trains an AI agent by using rewards and punishments to keep it motivated toward a goal, and Safety Gym is a set of environments and tools for such agents. In Safety Gym, OpenAI adopts constrained reinforcement learning, in which the AI automatically takes cost into account as it runs its simulations.
In constrained reinforcement learning, a cost target is set at the start of training, and the agent then learns from rewards and punishments while keeping its cost within that target. In other words, an AI trained with constrained reinforcement learning is required to anticipate risks in advance.
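One common way to turn a cost target into a training signal is a Lagrangian-style penalty, sketched below in Python. This is an illustrative assumption, not necessarily the method the announcement describes: the names `update_penalty`, `penalized_return`, `cost_limit`, and `penalty_lr` are hypothetical, and the per-episode costs are made-up numbers.

```python
def update_penalty(penalty, episode_cost, cost_limit, penalty_lr=0.05):
    """Raise the penalty weight when cost exceeds the limit, lower it otherwise."""
    new_penalty = penalty + penalty_lr * (episode_cost - cost_limit)
    return max(0.0, new_penalty)  # the weight is never allowed to go negative


def penalized_return(episode_reward, episode_cost, penalty):
    """Quantity the agent would maximize: reward minus weighted cost."""
    return episode_reward - penalty * episode_cost


cost_limit = 25.0  # hypothetical cost target, fixed before training starts
penalty = 0.0
for episode_cost in [60.0, 45.0, 30.0, 20.0]:  # made-up per-episode costs
    penalty = update_penalty(penalty, episode_cost, cost_limit)
    print(f"cost={episode_cost:5.1f}  penalty weight={penalty:.2f}")
```

While the agent is over the target the penalty weight grows, so unsafe behavior becomes more expensive; once the cost drops below the target the weight relaxes again.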
Safety Gym uses three agents (Point, Car, and Doggo) that explore a cluttered environment to reach a goal. Three tasks are provided: Goal, in which the agent moves to a designated area; Button, in which it presses a series of goal buttons; and Push, in which it pushes an object to a designated position. Each task has two levels of difficulty, and a warning light flashes around the agent whenever it takes an unsafe action.
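The agent, task, and difficulty combinations can be enumerated in a few lines of Python. The `Safexp-<Robot><Task><Level>-v0` naming pattern and the level numbers 1 and 2 are assumptions based on how the Safety Gym release names its environments; check the project's README before relying on them. The snippet only builds strings and does not import the library.

```python
from itertools import product

robots = ["Point", "Car", "Doggo"]
tasks = ["Goal", "Button", "Push"]
levels = [1, 2]  # assumed numbering for the two difficulty levels

# Assumed naming pattern: Safexp-<Robot><Task><Level>-v0
env_names = [
    f"Safexp-{robot}{task}{level}-v0"
    for robot, task, level in product(robots, tasks, levels)
]

for name in env_names:
    print(name)
```

Three agents, three tasks, and two levels give 18 combinations in total.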
Point is a robot that moves on a 2D plane, with one actuator for turning and another for moving forward and backward. Car is a robot with two independently driven front wheels and one free-rolling rear wheel. To turn or to move, the Car robot has to coordinate both front wheels at the same time. Doggo is a simulated four-legged robot with bilateral symmetry. Each leg has controls for its azimuth and elevation relative to the torso, plus a joint that adjusts the leg's angle, and the robot is designed so that it does not easily fall over.
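To make the Point robot's two controls concrete, here is a minimal kinematic sketch in Python. It is not Safety Gym's actual physics; the function `step_point`, its parameters, and the time step `dt` are hypothetical and simply model one turning command and one forward/backward command on a 2D plane.

```python
import math


def step_point(x, y, heading, forward, turn, dt=0.1):
    """Advance a point robot by one time step on the 2D plane.

    forward: signed speed command (negative values move backward)
    turn:    signed rotation rate command, in radians per second
    """
    heading = heading + turn * dt
    x = x + forward * math.cos(heading) * dt
    y = y + forward * math.sin(heading) * dt
    return x, y, heading


# Drive straight ahead for ten steps, starting at the origin facing along +x.
state = (0.0, 0.0, 0.0)
for _ in range(10):
    state = step_point(*state, forward=1.0, turn=0.0)
print(state)
```

A Car-style robot would instead derive both the forward speed and the turning rate from its two wheel commands, which is why it has to coordinate both wheels for every maneuver.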
OpenAI says that because Safety Gym is still at an early stage of development, a lot of work is still needed, both on combining safety techniques and on other problems. The remaining tasks it lists are improving the performance of constrained reinforcement learning, investigating safe transfer learning and distribution shift problems, and combining constrained reinforcement learning with human preferences. With a system like Safety Gym, AI developers are expected to work on a shared platform, which should make it easier to collaborate on safety across the whole AI field. Related information can be found here.