Introduction to Reinforcement Learning
Ajit Kumar
19 modules
English
Lifetime access
Reinforcement Learning is a major subfield of AI. This course covers its basics.
Overview
Modules
Udemy Section 4: Monte Carlo Method
4 attachments • 2 mins
On-Policy Monte Carlo Method
Constant Alpha - On-Policy MC Method
Coding: Off-Policy MC Method
Correction: Modify the MC code according to the Sutton and Barto algorithm
Task: Due date - Sept 16th
Submission - Sept 18th
1 attachment
Did you code the MC algorithm?
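For orientation, here is a minimal sketch of first-visit Monte Carlo estimation of Q-values, in the spirit of the exercises above; the function name and episode format are illustrative assumptions, not the course's actual code.

```python
from collections import defaultdict

def mc_q_estimate(episodes, gamma=0.99):
    """First-visit Monte Carlo estimate of Q(s, a) from complete episodes.

    `episodes` is a list of episodes, each a list of (state, action, reward)
    tuples -- a hypothetical format chosen for this sketch.
    """
    Q = defaultdict(float)
    N = defaultdict(int)
    for episode in episodes:
        G = 0.0
        for t in reversed(range(len(episode))):
            s, a, r = episode[t]
            G = r + gamma * G  # return following time t
            # First-visit check: update only at the earliest (s, a) occurrence.
            if all((s, a) != (s2, a2) for s2, a2, _ in episode[:t]):
                N[(s, a)] += 1
                # Incremental mean; the constant-alpha variant from the module
                # above replaces 1/N with a fixed step size alpha.
                Q[(s, a)] += (G - Q[(s, a)]) / N[(s, a)]
    return Q
```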
Udemy Section 5: Temporal Difference Methods
9 attachments • 4 mins
Video 52 Notes
Video 53 Notes
Visualize the updates to the Q-table in temporal difference methods.
SARSA
Coding Exercise: Code SARSA for the Maze Environment
Q-Learning
Coding Exercise: Code Q-learning for the Maze example
Coding Task: Comparison between SARSA & Q-learning
Question: What are some of the advantages of Temporal Difference Methods over Monte Carlo and dynamic programming?
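The core difference between the two coding exercises above is a single line in the update rule. A minimal sketch, assuming a tabular Q stored as a 2-D NumPy array (all names are placeholders):

```python
import numpy as np

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    # SARSA (on-policy): bootstrap from the action actually taken next.
    Q[s, a] += alpha * (r + gamma * Q[s_next, a_next] - Q[s, a])

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    # Q-learning (off-policy): bootstrap from the greedy next action,
    # regardless of which action the behavior policy actually takes.
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
```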
Review Exercises
6 attachments
Create a new Environment: Maze2
Maze2: Dynamic Programming
Maze2: Monte Carlo coding exercise
Maze2: SARSA coding exercise
Maze2: Q-learning coding exercise
Comparison analysis
Udemy Section 6: N Step Bootstrapping
4 attachments • 1 mins
What is the algorithm for N-step bootstrapping methods?
How is the Monte Carlo method similar to the N-step bootstrapping method?
How is N-step bootstrapping similar to the SARSA method?
In video 66, there was a discussion of the increase in variance with an increase in N. Explain this in the context of the Maze example.
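A compact way to see how N-step bootstrapping sits between SARSA and Monte Carlo is to write out the n-step target. A sketch under assumed names (a list of rewards and a tabular Q):

```python
def n_step_return(rewards, Q, s_n, a_n, gamma=0.99):
    """n-step target: n real rewards, then bootstrap from Q(s_n, a_n).

    With len(rewards) == 1 this is exactly the SARSA target; letting the
    reward list run to the end of the episode (and dropping the bootstrap
    term) recovers the Monte Carlo return.
    """
    G = 0.0
    for r in reversed(rewards):
        G = r + gamma * G                              # discounted sum of n rewards
    return G + gamma ** len(rewards) * Q[s_n, a_n]     # bootstrap term
```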
Task Due Date: Sept 23
1 attachment • 1 mins
Submit code for SARSA & Q-learning
Coding Exercise: Develop a Tic-Tac-Toe player
2 attachments
Develop an environment that plays Tic-Tac-Toe against you
Use any of the methods (MC/SARSA/N-step) so that it learns to play intelligently.
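As a starting point for the exercise above, a minimal environment skeleton with a Gym-like reset/step interface might look like the following; the class name, reward scheme, and the random placeholder opponent are all assumptions to be filled in.

```python
import numpy as np

class TicTacToeEnv:
    """Skeleton Tic-Tac-Toe environment (illustrative; names are assumptions).

    Board: 3x3 array with 0 = empty, 1 = agent, -1 = opponent.
    """

    def reset(self):
        self.board = np.zeros((3, 3), dtype=int)
        return self.board.copy()

    def step(self, action):
        row, col = divmod(action, 3)                    # actions are cells 0..8
        if self.board[row, col] != 0:
            return self.board.copy(), -1.0, True, {}    # illegal move: lose
        self.board[row, col] = 1
        done, reward = self._game_over()
        if not done:
            # Opponent policy goes here; a random legal move is the simplest
            # placeholder before making it play intelligently against you.
            empty = np.argwhere(self.board == 0)
            r, c = empty[np.random.randint(len(empty))]
            self.board[r, c] = -1
            done, reward = self._game_over()
        return self.board.copy(), reward, done, {}

    def _game_over(self):
        for p in (1, -1):
            b = self.board == p
            if b.all(0).any() or b.all(1).any() or b.diagonal().all() \
               or np.fliplr(b).diagonal().all():
                return True, float(p)                   # +1 agent win, -1 loss
        return bool((self.board != 0).all()), 0.0       # draw when board is full
```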
Section 7: Continuous State Spaces
2 attachments • 2 mins
Coding Exercise: Run the CartPole, Acrobot, MountainCar, and Pendulum examples
Basics of Gym Environment
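The Gym exercises above all follow the same interaction loop. A minimal sketch, assuming the gymnasium fork's current API (reset returns (obs, info) and step returns a 5-tuple; the classic gym package differs slightly):

```python
import gymnasium as gym  # assumption: the gymnasium fork is installed

env = gym.make("CartPole-v1")   # swap in "Acrobot-v1", "MountainCar-v0", "Pendulum-v1"
obs, info = env.reset(seed=0)

for _ in range(200):
    action = env.action_space.sample()          # random policy for now
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:                 # episode ended: restart
        obs, info = env.reset()

env.close()
```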
Udemy Section 7: Continuous State
3 attachments • 3 mins
Question: Give an example of a continuous state MDP.
🔎 Understanding the Mountain Car environment
Question: Can we apply the SARSA and Monte Carlo methods to continuous problems?
Deep Learning - Basics
6 attachments • 4 mins
What is the objective of deep learning?
What is the mathematical meaning of a neural network?
What is the meaning of neural network "training"?
PyTorch Coding Example
Coding: Manual hyperparameter tuning
Coding: Hyperparameter Tuning with Optuna
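Manual tuning and Optuna differ mainly in who proposes the next hyperparameters. A minimal Optuna sketch; the search space and the stand-in objective are illustrative assumptions, to be replaced by a real training/validation loop:

```python
import optuna

def objective(trial):
    # Hypothetical search space; plug in your own model training here.
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    hidden = trial.suggest_int("hidden_units", 16, 256)
    # Stand-in for a real validation loss computed after training.
    return (lr - 1e-3) ** 2 + 1.0 / hidden

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)   # best hyperparameters found
```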
Section 9 & 10: Deep RL
6 attachments • 6 mins
TASK: Memorize the Deep SARSA algorithm
CODING: Run the Deep SARSA code for the MountainCar env
TASK: Memorize the Deep Q algorithm
CODING: Run the Deep Q code for the MountainCar env
CODING: Run the Deep Q code for the CartPole env
TASK: Spot the theoretical difference between the Deep SARSA & Deep Q algorithms
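The theoretical difference asked for in the last task shows up concretely in how the two algorithms form their TD targets. A PyTorch sketch, assuming a q_net that maps a batch of states to per-action Q-values (all names are placeholders):

```python
import torch

def td_targets(q_net, rewards, next_states, next_actions, dones, gamma=0.99):
    """Illustrative targets; next_actions is a LongTensor of action indices."""
    with torch.no_grad():
        q_next = q_net(next_states)                  # shape: [batch, n_actions]
        # Deep SARSA (on-policy): value of the action actually taken next.
        sarsa_next = q_next.gather(1, next_actions.unsqueeze(1)).squeeze(1)
        # Deep Q (off-policy): value of the greedy next action.
        dqn_next = q_next.max(dim=1).values
    sarsa_target = rewards + gamma * sarsa_next * (1 - dones)
    dqn_target = rewards + gamma * dqn_next * (1 - dones)
    return sarsa_target, dqn_target
```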
Using an RL Library
5 attachments • 1 mins
Run simple Gym Environments
Train an RLlib agent on CartPole
Animate the trained agent
Observe the training history on TensorBoard
Hyperparameter tuning with Ray Tune
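The RLlib workflow above can be sketched in a few lines. This assumes Ray/RLlib 2.x (older releases used ray.rllib.agents instead), and the exact metric keys in the result dict vary by version:

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Build a PPO trainer for CartPole using the RLlib 2.x config API.
config = PPOConfig().environment("CartPole-v1")
algo = config.build()

for i in range(5):
    result = algo.train()
    # Metric keys differ across RLlib versions; inspect `result` to be sure.
    print(i, result.get("episode_reward_mean"))

# Training logs land under ~/ray_results by default, so
# `tensorboard --logdir ~/ray_results` shows the training history.
```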
CAPSTONE DEEP REINFORCEMENT LEARNING
1 attachment • 1 mins
CODE an agent that can learn to navigate a non-constant Maze
Section 11: REINFORCE
13 attachments • 11 mins
What are Policy Gradient Methods?
Question: What is a stochastic policy?
SOFTMAX function
Coding Exercise: Write a Python function to evaluate the softmax activation (see the sketch after this section).
How to compare policy performance in REINFORCE?
Why parallel learning?
Using Entropy to Incentivise Exploration
TASK: Memorize REINFORCE algorithm
Coding: Run the REINFORCE code for the CartPole example
Coding: Run the REINFORCE code for the Double Pendulum example
Coding: Run the REINFORCE code for the MountainCar example
Coding: Create a new simple environment & run the REINFORCE algorithm on it.
Note: Identify the number of cores in your machine, and use them all for parallelization.
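For the softmax exercise flagged above, the standard trick is to subtract the maximum logit before exponentiating so that exp never overflows. A minimal sketch:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    z = np.asarray(logits, dtype=float)
    e = np.exp(z - z.max())      # guards against overflow in exp
    return e / e.sum()

probs = softmax([2.0, 1.0, 0.1])
print(probs, probs.sum())        # probabilities summing to 1
```

In REINFORCE, this function turns the policy network's action preferences into the stochastic policy that the questions in this section refer to.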
Section 10: A2C: Advantage Actor-Critic
3 attachments • 1 mins
Task: Memorize the A2C algorithm
Code: Run the A2C code for CartPole, MountainCar, and Double Pendulum
Task: Create your own simple environment and implement A2C on that environment
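The actor and critic losses that distinguish A2C from plain REINFORCE fit in a few lines. A PyTorch sketch with placeholder names, assuming the bootstrapped returns r + gamma * V(s') are computed elsewhere:

```python
import torch

def a2c_losses(logits, values, actions, returns):
    """Illustrative A2C losses for one batch.

    logits:  policy network output, shape [batch, n_actions]
    values:  critic output V(s), shape [batch]
    actions: LongTensor of chosen action indices, shape [batch]
    returns: bootstrapped targets r + gamma * V(s'), shape [batch]
    """
    advantage = returns - values                          # A(s,a) = target - V(s)
    log_probs = torch.log_softmax(logits, dim=1)
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    policy_loss = -(chosen * advantage.detach()).mean()   # actor: follow the advantage
    value_loss = advantage.pow(2).mean()                  # critic: regress to the targets
    return policy_loss, value_loss
```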
Do a Comparison Study between REINFORCE & A2C
1 attachment
Coding Task: On the cartpole example, which method between A2C and REINFORCE is better?
Capstone: Apply an RL algorithm to a simple stock market trading problem
TensorTrade
4 attachments • 2 mins
Understand Streams
Simplest Stream + DataFeed code
Create a simple environment using two years of RELIANCE data
Run the debugger and understand the action and reward mechanism
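A guess at what the "Simplest Stream + DataFeed" attachment covers, based on the feed API in the TensorTrade documentation; treat the module path and method names as assumptions to verify against your installed version:

```python
from tensortrade.feed.core import Stream, DataFeed

# Two named source streams driven by plain Python lists (stand-ins for
# real price/volume columns from the RELIANCE data).
price = Stream.source([10.0, 10.5, 11.0, 10.8], dtype="float").rename("price")
volume = Stream.source([100, 120, 90, 110], dtype="float").rename("volume")

feed = DataFeed([price, volume])
feed.compile()

while feed.has_next():
    print(feed.next())   # e.g. {'price': 10.0, 'volume': 100.0}
```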
FAQs
How can I enrol in a course?
Enrolling in a course is simple! Just browse through our website, select the course you're interested in, and click on the "Enrol Now" button. Follow the prompts to complete the enrolment process, and you'll gain immediate access to the course materials.
Can I access the course materials on any device?
Yes, our platform is designed to be accessible on various devices, including computers, laptops, tablets, and smartphones. You can access the course materials anytime, anywhere, as long as you have an internet connection.
How can I access the course materials?
Once you enrol in a course, you will gain access to a dedicated online learning platform. All course materials, including video lessons, lecture notes, and supplementary resources, can be accessed conveniently through the platform at any time.
Can I interact with the instructor during the course?
Absolutely! We are committed to providing an engaging and interactive learning experience. You will have opportunities to interact with the instructor through our community. Take full advantage of this to enhance your understanding and gain insights directly from the expert.
About the creator
Ajit Kumar
Assistant Professor
Dept of Mathematics
Shiv Nadar University