Reference no: EM133896982
Assignment:
Implementation of Gradient Descent with MNIST data
For Task 2, we will use the same data as in Task 1, but we will find the optimal coefficients using the "Gradient Descent" algorithm and then compare the result with the solution found in Task 1.
The procedure for Task 2 is almost the same as for Task 1, except that you need to implement the "Gradient Descent" algorithm instead of the closed-form least-squares solution ((X'X)^(-1)X'y).
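For reference, the Task 1 closed-form solution (X'X)^(-1)X'y can be sketched as follows. This is a minimal illustration using a small random matrix as a stand-in for the MNIST design matrix; the variable names are hypothetical, not part of the assignment.

```python
import numpy as np

# Hypothetical small design matrix and labels standing in for the MNIST data.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = rng.normal(size=50)

# b_opt = (X'X)^(-1) X'y: solve the normal equations directly.
# np.linalg.solve is preferred over forming the explicit inverse.
b_opt = np.linalg.solve(X.T @ X, X.T @ y)

# Equivalent least-squares route, usually more numerically stable:
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Both routes give the same coefficients; `lstsq` avoids explicitly forming X'X.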
For the Gradient Descent algorithm, please follow the procedure:
1. Set the initial coefficients to zeros (any random values would also work). Think about what the dimension of the coefficient vector should be.
2. Determine the hyper-parameters, such as the learning rate (α) and the number of iterations (k).
3. Run the "gradient descent" algorithm with the chosen hyper-parameters and check the "Learning Curve" as shown: * The learning curve shows whether the algorithm converges: the x-axis shows the number of iterations, and the y-axis shows the cost (J). * The learning curve must show convergence; otherwise the solution may not be good.
4. Display the optimal coefficients (denoted by b_est)
5. Classify the test data (MNIST_test.csv) with a threshold of 0.5 as described below: - y_pred = X_test * b_est - if y_pred > 0.5, predict class 1; otherwise class 0
6. Display the accuracy
7. Display the aggregate difference between b_opt and b_est, defined as sum(abs(b_opt - b_est)).
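The steps above can be sketched as follows. This is a minimal sketch, not the required implementation: it uses a small synthetic data set in place of the MNIST CSV files, and the function and variable names are assumptions. The cost is the usual least-squares cost J = (1/2m)||Xb - y||², whose gradient is (1/m)X'(Xb - y).

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, k=2000):
    """Steps 1-3: linear-regression gradient descent, returning the
    coefficients and the cost history used for the learning curve."""
    m, n = X.shape
    b = np.zeros(n)                      # step 1: initial coefficients = zeros
    J_history = []
    for _ in range(k):                   # steps 2-3: k iterations at rate alpha
        residual = X @ b - y
        J_history.append((residual @ residual) / (2 * m))   # cost J
        b -= alpha * (X.T @ residual) / m                   # gradient step
    return b, J_history

# Hypothetical synthetic stand-in for the MNIST training data.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 5))
b_true = np.array([0.5, -0.2, 0.1, 0.3, -0.4])
y_train = (X_train @ b_true > 0).astype(float)   # binary labels 0/1

b_est, J = gradient_descent(X_train, y_train)    # step 4: b_est

# Steps 5-6: classify with a 0.5 threshold and compute accuracy
# (reusing the training data here only for illustration).
y_pred = (X_train @ b_est > 0.5).astype(float)
accuracy = (y_pred == y_train).mean()

# Step 7: aggregate difference from the Task 1 closed-form solution b_opt.
b_opt = np.linalg.solve(X_train.T @ X_train, X_train.T @ y_train)
diff = np.abs(b_opt - b_est).sum()
```

Plotting `J_history` against the iteration index gives the learning curve; a monotonically decreasing curve that flattens out indicates convergence, after which `diff` should be very small.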