Coursera - Supervised Machine Learning: Regression and Classification - Week 1 - Section 6 - Train the model with gradient descent
January 26, 2025




Week 1: Introduction to Machine Learning
Section 6: Train the model with gradient descent
1. Video: Gradient descent


2. Video: Implementing gradient descent


Gradient descent is an algorithm for finding values of parameters w and b that minimize the cost function J. What does this update statement do? (Assume α is small.)
- Checks whether w is equal to \( w-\alpha \frac{\partial J(w, b)}{\partial w} \)
- Updates parameter w by a small amount
This updates the parameter w by a small amount, in order to reduce the cost J (see the sketch below).
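A minimal sketch of what this update might look like in code, assuming the squared-error cost \( J(w, b)=\frac{1}{2m}\sum_{i=1}^{m}(wx^{(i)}+b-y^{(i)})^2 \) used in the course; the function names `compute_gradients` and `gradient_descent_step` are illustrative, not taken from the course labs. Note that both gradients are computed before either parameter is changed, so w and b are updated simultaneously:

```python
import numpy as np

def compute_gradients(x, y, w, b):
    """Partial derivatives of the squared-error cost J(w, b) with respect to w and b."""
    err = (w * x + b) - y        # prediction error f(x) - y for each example
    dj_dw = np.mean(err * x)     # dJ/dw = (1/m) * sum(err * x)
    dj_db = np.mean(err)         # dJ/db = (1/m) * sum(err)
    return dj_dw, dj_db

def gradient_descent_step(x, y, w, b, alpha):
    """One simultaneous update: compute both gradients first, then update w and b."""
    dj_dw, dj_db = compute_gradients(x, y, w, b)
    w_new = w - alpha * dj_dw    # w := w - α * dJ/dw  (a small step, since α is small)
    b_new = b - alpha * dj_db    # b := b - α * dJ/db
    return w_new, b_new
```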
3. Video: Gradient descent intuition

-


Gradient descent is an algorithm for finding values of parameters w and b that minimize the cost function J.
repeat until convergence:{
\( w=w-\alpha \frac{\partial J(w, b)}{\partial w} \)
\( b=b-\alpha \frac{\partial J(w, b)}{\partial b} \)
}
Assume the learning rate α is a small positive number. When \( \frac{\partial J(w, b)}{\partial w} \) is a positive number (greater than zero), what happens to w after one update step?
- It is not possible to tell if w will increase or decrease
- w stays the same
- w decreases
- w increases
The learning rate α is always a positive number, and here the derivative \( \frac{\partial J(w, b)}{\partial w} \) is also positive, so \( \alpha \frac{\partial J(w, b)}{\partial w} \) is positive. Subtracting a positive number from w gives a new value of w that is smaller, so w decreases.
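The quiz point can also be checked numerically. Below is a minimal, self-contained sketch of the "repeat until convergence" loop; the toy data and the fixed iteration count are illustrative assumptions, not from the course. Starting from w = 5.0 on data whose best fit is roughly w = 2, b = 0, the derivative \( \frac{\partial J(w, b)}{\partial w} \) is positive, so each step decreases w:

```python
import numpy as np

def gradient_descent(x, y, w, b, alpha, num_iters):
    """Repeat the simultaneous update of w and b for a fixed number of iterations
    (a simple stand-in for 'repeat until convergence')."""
    for _ in range(num_iters):
        err = (w * x + b) - y       # prediction errors f(x) - y
        dj_dw = np.mean(err * x)    # ∂J/∂w for the squared-error cost
        dj_db = np.mean(err)        # ∂J/∂b
        w = w - alpha * dj_dw       # when ∂J/∂w > 0, this step decreases w
        b = b - alpha * dj_db
    return w, b

# Toy data (illustrative only): y = 2x, so the cost is minimized near w = 2, b = 0.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
w, b = gradient_descent(x, y, w=5.0, b=0.0, alpha=0.1, num_iters=1000)
print(w, b)  # w moves down from 5.0 toward 2.0; b settles near 0.0
```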
4. Video: Learning rate
--
5. Video: Gradient descent for linear regression
--
6. Video: Running gradient descent
--
7. Lab: Optional lab: Gradient descent
--
-