Overview of Deep Reinforcement Learning Methods - presented by Prof. Steven L. Brunton

Slide at 21:03: Advantage actor-critic network. Actor: deep policy network; Critic: deep dueling Q network, Q(s_k, a_k), which provides the update signal to the actor.
Summary (AI generated)

In this segment, the speaker discusses policy iteration and policy gradient methods. They note that policy gradient optimization converges faster than traditional model-free techniques, but it requires the policy to be parameterized by θ so that derivatives with respect to θ can be taken. The speaker then introduces the actor-critic method, in which a critic Q network learns the quality function and an actor policy network is updated using the policy gradient. This approach combines value-based and policy-based optimization, in contrast to Q-learning, where the Q function is updated from reward information alone and the policy is then extracted from it separately. The speaker finds this combination a particularly elegant way to optimize policies.
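
To make the actor-critic idea concrete, the following is a minimal sketch (not from the lecture) of one advantage actor-critic update in Python with PyTorch. The toy state/action dimensions, network sizes, Adam optimizers, and the a2c_update helper are illustrative assumptions, not Prof. Brunton's implementation; the point is only to show the critic learning Q while the actor takes a policy-gradient step weighted by the critic's advantage estimate.

# Minimal advantage actor-critic (A2C) update sketch; assumed discrete action space.
import torch
import torch.nn as nn

n_states, n_actions = 4, 2  # assumed toy dimensions

# Actor: deep policy network producing logits for pi_theta(a | s)
actor = nn.Sequential(nn.Linear(n_states, 64), nn.ReLU(), nn.Linear(64, n_actions))
# Critic: deep Q network Q_w(s, a), one output per discrete action
critic = nn.Sequential(nn.Linear(n_states, 64), nn.ReLU(), nn.Linear(64, n_actions))

actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
gamma = 0.99  # discount factor

def a2c_update(s, a, r, s_next, done):
    """One actor-critic update from a single transition (s, a, r, s')."""
    s = torch.as_tensor(s, dtype=torch.float32)
    s_next = torch.as_tensor(s_next, dtype=torch.float32)

    # Critic: temporal-difference update of Q_w toward the bootstrapped target.
    q_sa = critic(s)[a]
    with torch.no_grad():
        target = r + gamma * critic(s_next).max() * (1.0 - float(done))
    critic_loss = (q_sa - target) ** 2
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor: policy-gradient step, weighted by the critic's advantage estimate.
    with torch.no_grad():
        q = critic(s)
        advantage = q[a] - q.mean()      # simple advantage baseline
    log_prob = torch.log_softmax(actor(s), dim=-1)[a]
    actor_loss = -log_prob * advantage   # gradient ascent on expected return
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

In this sketch the critic is trained purely from value (Q) information, as in Q-learning, while the actor is trained with the policy gradient, which is what the speaker means by combining value-based and policy-based optimization in a single loop.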