Coursera - Supervised Machine Learning: Regression and Classification - Week 1 - Section 6 - Train the model with gradient descent
January 26, 2025




Week 1: Introduction to Machine Learning
Section 6: Train the model with gradient descent
1. Video: Gradient descent


2. Video: Implementing gradient descent


Gradient descent is an algorithm for finding values of parameters w and b that minimize the cost function J. What does the update statement \( w=w-\alpha \frac{\partial J(w, b)}{\partial w} \) do? (Assume α is small.)
- Checks whether w is equal to \( w-\alpha \frac{\partial J(w, b)}{\partial w} \)
- Updates parameter w by a small amount
Correct: this updates the parameter w by a small amount, in order to reduce the cost J. (A small numeric sketch of one such update step is shown below.)
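
As a rough illustration (not from the course materials), here is a minimal Python sketch of a single update step. The one-parameter cost \( J(w)=(w-3)^2 \), the starting value w = 0, and α = 0.1 are made-up assumptions chosen only to show the cost dropping after one step.

```python
# One gradient descent update step on a made-up one-parameter cost.
# J(w) = (w - 3)**2 and the starting point are illustrative assumptions,
# not the squared-error cost used in the course.

def J(w):
    return (w - 3) ** 2        # illustrative cost function

def dJ_dw(w):
    return 2 * (w - 3)         # its derivative with respect to w

alpha = 0.1                    # small positive learning rate
w = 0.0                        # arbitrary starting value

w_new = w - alpha * dJ_dw(w)   # the update statement from the quiz
print(J(w), J(w_new))          # cost goes down: 9.0 -> 5.76
```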
3. Video: Gradient descent intuition

-


Gradient descent is an algorithm for finding values of parameters w and b that minimize the cost function J.
repeat until convergence:{
\( w=w-\alpha \frac{\partial J(w, b)}{\partial w} \)
\( b=b-\alpha \frac{\partial J(w, b)}{\partial b} \)
}
Assume the learning rate α is a small positive number. When \( \frac{\partial J(w, b)}{\partial w} \) is a positive number (greater than zero) -- as in the example in the upper part of the slide shown above -- what happens to w after one update step?
- It is not possible to tell if w will increase or decrease
- w stays the same
- w decreases
- w increases
Correct: w decreases. The learning rate α is always a positive number, and here the derivative \( \frac{\partial J(w, b)}{\partial w} \) is also positive, so \( \alpha \frac{\partial J(w, b)}{\partial w} \) is positive; subtracting it from w gives a new value of w that is smaller. (A Python sketch of the full update loop is given below.)
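
Below is a minimal Python sketch of the repeat-until-convergence loop above, applied to linear regression with the squared-error cost \( J(w,b)=\frac{1}{2m}\sum_{i=1}^{m}(wx^{(i)}+b-y^{(i)})^2 \). The toy dataset, learning rate, and fixed iteration count are illustrative assumptions, not course code.

```python
import numpy as np

# Minimal sketch of batch gradient descent for linear regression f(x) = w*x + b
# with the squared-error cost J(w, b) = (1 / (2*m)) * sum((w*x[i] + b - y[i])**2).
# The toy data, learning rate, and iteration count are illustrative assumptions.

x = np.array([1.0, 2.0, 3.0, 4.0])   # input features
y = np.array([2.0, 4.0, 6.0, 8.0])   # targets (y = 2x, so expect w -> 2, b -> 0)
m = len(x)

w, b = 0.0, 0.0                      # initial parameter values
alpha = 0.1                          # small positive learning rate

for _ in range(1000):                # "repeat until convergence" (fixed count here)
    err = w * x + b - y              # prediction error for each example
    dJ_dw = (err * x).sum() / m      # partial derivative of J with respect to w
    dJ_db = err.sum() / m            # partial derivative of J with respect to b
    # Simultaneous update: both gradients are computed before w or b changes.
    w = w - alpha * dJ_dw
    b = b - alpha * dJ_db

print(w, b)                          # approximately 2.0 and 0.0
```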
4. Video: Learning rate
--
5. Video: Gradient descent for linear regression
--
6. Video: Running gradient descent
--
7. Lab: Optional lab: Gradient descent
--
-