PPO: Easy Concepts and Implementation

The Goals
- Simplifying the PPO method with an easy-to-follow flowchart.
- Providing code for building PPO from scratch (check here) for you to follow, complete with full math notation.
Quick Intro
PPO (Proximal Policy Optimization) works by producing a probability distribution over potential actions based on the current state of the environment. Because actions are sampled from this distribution rather than fixed, the method can explore, adapt, and improve in new, uncertain scenarios.
- Proximal: Each update keeps the new policy close to the previous one
- Policy: A distribution of actions
- Optimization: The process of finding a better solution
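To make the "Policy" idea concrete, here is a minimal sketch of a policy as a distribution over actions. It assumes a discrete action space and uses a softmax to turn raw scores into probabilities; the `logits` values and the helper names `softmax` and `sample_action` are made up for illustration and are not taken from the linked code.

```python
import math
import random

def softmax(logits):
    """Turn raw action scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_action(probs, rng):
    """Draw one action index according to the policy's probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical scores a policy network might output for 3 actions in one state.
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)

rng = random.Random(0)  # seeded so the sketch is repeatable
action = sample_action(probs, rng)
```

The action with the highest score is the most likely to be picked, but the others still have a chance, which is what lets the agent explore.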
In a nutshell, PPO is a method for finding the best action distribution for a given environment. A better policy distribution translates to a better chance of choosing the right action.
I have created a flow chart for visualization purposes. It explains how the method is used when training an agent.
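How PPO improves the distribution while staying "proximal" can be sketched with its clipped surrogate objective for a single action. This is a simplified illustration, not the full training loop: the function name `clipped_surrogate` is hypothetical, and `clip_eps=0.2` is a commonly used value assumed here rather than one stated in this post.

```python
import math

def clipped_surrogate(logp_new, logp_old, advantage, clip_eps=0.2):
    """PPO-style clipped objective for one (state, action) sample.

    logp_new / logp_old: log-probability of the action under the new / old policy.
    advantage: how much better the action was than expected.
    """
    ratio = math.exp(logp_new - logp_old)  # pi_new(a|s) / pi_old(a|s)
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Take the more pessimistic of the two estimates, so there is no
    # extra reward for moving the policy far from the old one.
    return min(ratio * advantage, clipped_ratio * advantage)

# The new policy made a good action 3x more likely; the gain is capped at 1.2x.
capped = clipped_surrogate(math.log(0.9), math.log(0.3), advantage=1.0)

# Identical policies give a ratio of 1, so the objective equals the advantage.
unchanged = clipped_surrogate(math.log(0.5), math.log(0.5), advantage=2.0)
```

In training, this value would be averaged over many sampled actions and maximized with gradient ascent.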
