PPO: Easy Concepts and Implementation

The Goals

Quick Intro

PPO (Proximal Policy Optimization) works by producing a probability distribution over possible actions based on the current state of the environment. Because actions are sampled from that distribution, the agent keeps exploring, which lets it adapt and improve in new, uncertain scenarios.

  • Proximal: Each update keeps the new policy close to the previous one
  • Policy: A probability distribution over actions, given the current state
  • Optimization: The process of finding a better solution
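
To make the "Proximal" idea concrete, here is a minimal sketch of the clipped surrogate objective commonly used in PPO. It is an illustration under stated assumptions, not the code from the notebook: the function names are hypothetical, and `clip_eps=0.2` is just a commonly used default. The probability ratio between the new and old policy is clipped to `[1 - clip_eps, 1 + clip_eps]`, so a single update gains nothing from moving the policy far away from the old one.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantage, clip_eps=0.2):
    """Clipped PPO objective for one (state, action) sample.

    new_logp / old_logp: log-probability of the taken action under the
    new and old policy. advantage: how much better the action was than
    expected. All names here are illustrative assumptions.
    """
    # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
    ratio = math.exp(new_logp - old_logp)
    # Keep the ratio inside [1 - eps, 1 + eps].
    clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    # Take the more pessimistic of the two estimates.
    return min(ratio * advantage, clipped_ratio * advantage)


def mean_clipped_surrogate(new_logps, old_logps, advantages, clip_eps=0.2):
    """Average the clipped objective over a batch; training maximizes this."""
    terms = [
        clipped_surrogate(n, o, a, clip_eps)
        for n, o, a in zip(new_logps, old_logps, advantages)
    ]
    return sum(terms) / len(terms)
```

With a positive advantage, raising the ratio beyond `1 + clip_eps` earns no extra objective value; with a negative advantage, lowering it below `1 - clip_eps` earns none either. That is what keeps each policy update "proximal".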

In a nutshell, PPO is a method for finding the best action distribution for a given environment. A better policy distribution translates to a better chance of choosing the right action.
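
As a rough illustration of what "a distribution of actions" means in practice, the sketch below turns raw action scores into probabilities with a softmax and samples an action from them. The scores are made-up numbers for CartPole's two actions (push left, push right); in a real agent they would come from a neural network, which is omitted here.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(probs, rng):
    """Draw one action index according to the policy's probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical scores for one state: index 0 = push left, 1 = push right.
logits = [0.2, 1.5]
probs = softmax(logits)

rng = random.Random(0)  # seeded so the run is repeatable
actions = [sample_action(probs, rng) for _ in range(1000)]
```

The action with the higher score is chosen more often, but the other action is still tried occasionally. Training shifts probability mass toward actions that turned out well, which is the sense in which a better distribution gives a better chance of picking the right action.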

I have created a flowchart for visualization purposes. It shows how the method is used when training an agent.

Figure: PPO Data & Method Workflow (flowchart)

PPO Demonstration with the OpenAI Gym CartPole Environment

Run It Yourself! (with CartPole demonstration)

👉 Google Colab