
Reinforcement Learning Course - Full Machine Learning Tutorial
Date: 2022-03-14
Comments and reviews: 10
Barbell
The length of the flattened output layer can actually be calculated by tracing the data through the network, starting from the first conv layer. Just use the formula:
floor((dimension length - kernel size for the dimension + 2*padding) / stride) + 1 = output length for the dimension
Do this for each dimension for each conv layer, then multiply by the number of output channels at the end to find the length of the flat dimension, as such:
1st conv layer: floor((185 - 8 + 2*1) / 4) + 1 = 45 (the division gives 44.75, but you always round down, since there are no 0.75 pixels)
floor((95 - 8 + 2*1) / 4) + 1 = 23 (division rounded down from 22.25)
2nd conv: floor((45 - 4 + 2*0) / 2) + 1 = 21
floor((23 - 4 + 2*0) / 2) + 1 = 10
3rd conv: floor((21 - 3 + 2*0) / 1) + 1 = 19
floor((10 - 3 + 2*0) / 1) + 1 = 8
This means the 3rd layer outputs 128 frames, each with dimensions 19*8, so if you flatten them into one you get a single dimension of length 128*19*8 = 19456.
Just a neat little trick for those who want it.
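For anyone who wants to check the arithmetic, here is a minimal sketch of the same calculation in plain Python. The helper name `conv_output_length` is made up for illustration, and it assumes square kernels and a dilation of 1; the kernel sizes, strides, padding, input size, and 128 output channels are the ones quoted in the comment. Note that flooring the division before adding 1 gives 45 x 23 after the first layer; the final 19 x 8 comes out the same.

```python
def conv_output_length(length, kernel, stride=1, padding=0):
    """Output length along one dimension: floor((n - k + 2p) / s) + 1."""
    return (length - kernel + 2 * padding) // stride + 1


# (kernel, stride, padding) for each conv layer, as quoted in the comment
layers = [(8, 4, 1), (4, 2, 0), (3, 1, 0)]

height, width = 185, 95   # input frame size quoted in the comment
out_channels = 128        # channels produced by the last conv layer

for kernel, stride, padding in layers:
    height = conv_output_length(height, kernel, stride, padding)
    width = conv_output_length(width, kernel, stride, padding)
    print(height, width)  # 45 23, then 21 10, then 19 8

# Length of the flattened vector fed to the first fully connected layer
flat_length = out_channels * height * width
print(flat_length)        # 19456
```

If the network changes, only the `layers` list and the input size need updating.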
Machine
Here are some timestamps, folks!
Intro 00:00:00
Intro to Deep Q Learning 00:01:30
How to Code Deep Q Learning in TensorFlow 00:08:56
Deep Q Learning with PyTorch Part 1: The Q Network 00:52:03
Deep Q Learning with PyTorch Part 2: Coding the Agent 01:06:21
Deep Q Learning with PyTorch Part 3: Coding the Main Loop 01:28:54
Intro to Policy Gradients 01:46:39
How to Beat Lunar Lander with Policy Gradients 01:55:01
How to Beat Space Invaders with Policy Gradients 02:21:32
How to Create Your Own Reinforcement Learning Environment Part 1 02:34:41
How to Create Your Own Reinforcement Learning Environment Part 2 02:55:39
Fundamentals of Reinforcement Learning 03:08:20
Markov Decision Processes 03:17:09
The Explore Exploit Dilemma 03:23:02
Reinforcement Learning in the OpenAI Gym: SARSA 03:29:19
Reinforcement Learning in the OpenAI Gym: Double Q Learning 03:39:56
Conclusion 03:54:07
The
Just need to get this over.
After I find her,
That's it.
I will share the technique.
She must code it.
Then it's done.
I will go to the desert.
If you're thinking nuclear reactor?
In the middle of a congested city in the world, with a rascal scientist like me.
It will result in catastrophe.
Aarya
Anyone interested in learning the terminology he is using should go check out the video lectures Stanford did on MDPs (Markov decision processes) and RL. They're each about an hour long and go in depth into the math behind a lot of this stuff. Cheers!!!
Haneul
Amazing course, thanks a lot Phil! One question: you were comparing policy gradient methods with reinforcement learning, but after a few searches it seems like the policy gradient method is an algorithm within RL. Could you clarify?
David
Are you preaching? Are you lazy? Your face and wavy hands aren't helping me learn. Code is often blurry at 360p. Are you catering to 4K viewers with unshared 100-megabit connections? I've seen better presentations at 360p.
Joel
You call this a course, "Full Machine Learning Tutorial - Reinforcement Learning", yet there is no proper order to it. I don't recommend watching this thing. There are tons of materials that are a lot simpler than this.
freecodecamp
This is a great video if you already understand the topic, understand the code, and just want a guy saying out loud what he's typing, kinda explaining bits and pieces here and there.
Say
This is the one issue that I often see in "basic tutorial" videos: there's no explanation of the terminology during the intros.
Finarwa
I really appreciate the work you are doing. Could you mention which development tool you are using for the whole series?