Advanced Policy Gradients
In the policy gradient algorithms covered so far, we searched for an optimal stochastic policy $\pi_\theta$ in settings with continuous state and action spaces. In some scenarios, however, we want a deterministic policy in the same continuous setting.
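To make the distinction concrete, here is a minimal sketch contrasting the two kinds of policy. It assumes a linear mean function and a Gaussian stochastic policy with fixed standard deviation; the names `policy_mean`, `stochastic_policy`, `deterministic_policy`, and the parameter `sigma` are hypothetical and chosen only for illustration, not taken from any particular algorithm or library.

```python
import numpy as np

def policy_mean(theta, state):
    # Assumed linear parameterization (illustrative only): mean action = theta @ state.
    return theta @ state

def stochastic_policy(theta, state, sigma, rng):
    # Stochastic policy: sample an action from a Gaussian centred on the mean,
    # with an assumed fixed standard deviation sigma per action dimension.
    mean = policy_mean(theta, state)
    return mean + sigma * rng.standard_normal(mean.shape)

def deterministic_policy(theta, state):
    # Deterministic policy: the same state always maps to the same action.
    return policy_mean(theta, state)

rng = np.random.default_rng(0)
theta = np.array([[0.5, -1.0],
                  [2.0,  0.0]])   # hypothetical parameters
state = np.array([1.0, 2.0])      # a point in a continuous state space

# Two draws from the stochastic policy in the same state generally differ.
a1 = stochastic_policy(theta, state, 0.1, rng)
a2 = stochastic_policy(theta, state, 0.1, rng)

# Two evaluations of the deterministic policy in the same state are identical.
d1 = deterministic_policy(theta, state)
d2 = deterministic_policy(theta, state)

print("stochastic:   ", a1, a2)
print("deterministic:", d1, d2)
```

In this sketch the stochastic policy explores through its own sampling noise, while the deterministic policy returns a single action per state, so any exploration would have to come from somewhere else.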
