The algorithm uses two neural networks, an Actor and a Critic, which approximate the policy and the action-value (Q) function of a reinforcement learning problem, respectively. The name DDPG, or Deep Deterministic Policy ...
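
To make the two-network structure concrete, here is a minimal sketch in plain Python (standard library only). It shows only the forward pass: the Actor maps a state to a single deterministic action, and the Critic scores a state-action pair. The class names, layer sizes, the `tanh` output scaling, and the `max_action` parameter are illustrative assumptions, not taken from the text above, and training (replay buffer, target networks, gradient updates) is left out.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x):
    """Apply a dense layer: one dot product plus bias per output unit."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(v):
    return [max(0.0, a) for a in v]


class Actor:
    """Deterministic policy (assumed shape): state -> action in [-max_action, max_action]."""

    def __init__(self, state_dim, action_dim, hidden=16, max_action=1.0, seed=0):
        rng = random.Random(seed)
        self.l1 = init_layer(state_dim, hidden, rng)
        self.l2 = init_layer(hidden, action_dim, rng)
        self.max_action = max_action

    def __call__(self, state):
        h = relu(dense(self.l1, state))
        # tanh bounds each output to (-1, 1); scaling maps it to the action range.
        return [self.max_action * math.tanh(a) for a in dense(self.l2, h)]


class Critic:
    """Action-value estimate (assumed shape): (state, action) -> scalar Q."""

    def __init__(self, state_dim, action_dim, hidden=16, seed=1):
        rng = random.Random(seed)
        self.l1 = init_layer(state_dim + action_dim, hidden, rng)
        self.l2 = init_layer(hidden, 1, rng)

    def __call__(self, state, action):
        # The critic sees the state and the action concatenated into one input.
        h = relu(dense(self.l1, list(state) + list(action)))
        return dense(self.l2, h)[0]


# Hypothetical 3-dimensional state and 1-dimensional continuous action.
actor = Actor(state_dim=3, action_dim=1, max_action=2.0)
critic = Critic(state_dim=3, action_dim=1)

state = [0.1, -0.4, 0.7]
action = actor(state)            # the same state always yields the same action
q_value = critic(state, action)  # the critic's estimate for that pair
```

Because the Actor returns one action rather than a probability distribution, calling it twice on the same state gives the same result; in practice, exploration noise is usually added to the action during training.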