Implement the gradient descent

Assignment Help Computer Engineering
Reference no: EM132995790

Question 1. Rewrite the objective in the form

f(x) = 1/2n ||y - Ax||2

where A is the data matrix and n is the number of data points.

Question 2. Implement the gradient descent (GD) method which consists of the iter- ations:

xk+1 = xk - αk∇f(xk)

where

∇f(x) = -1/n AT(y - Ax).

where we choose αk = a > 0 for some constant a. Tune your algorithm to different values of a. You can stop your algorithm when ||∇(x)|| ≤ ε with ε = 10-2. For which values of a, do you see divergence or convergence in your algorithm?

How close is the solution you compute with GD is to the actual solution
x = [x1, x2, x3 ] where x = (AT A)-1AT y?

Question 3. Plot the level sets of the objective f(x1, x2, x3) when x1 = x∗1. In other words, plot the level sets of f(x1∗ , x2, x3) versus x2 and x3. Mark the iterates of gradient descent on the level set plots and verify that gradient descent moves perpendicular to the level sets. Visualize the function f(x1∗ , x2, x3) in 3D as a function of x2 and x3. Does it look like a "cereal bowl"?

Question 4. Repeat the previous question with different choise of the stepsize αk in your implementation

(a) αk = a > 0 is a constant. How does the performance change when we vary a? For which values of a, do you see divergence or convergence? How large a should be for you to observe divergence of the GD?
(b) Repeat part (a) with αk = a/k varying the values of a.
(c) Repeat part (a) with αk = a/√k. Based on these experiments, what is the best stepsize rule in your opinion to get the best performance?

Question 5. Note that

f(x) = X f where f = 1 (y - aT x)2

is the loss of the data point i where aT is the i-th row of the data matrix

A. Note that each row comes from a single data point and the gradient of fi is

∇fi(x) = -ai (yi - ai x).

At step k, choose a data point index ik uniformly randomly from the set of all indices {1, 2, . . . , n} and consider the iterations

xk+1 = xk - αk∇fik (xk)

where the replace the gradient in the GD method with a random estimate of it; i.e. ∇fik . This is called the stochastic gradient method (SGD).

Repeat questions 3) and 4) for stochastic gradient descent. Can SGD be faster than GD for different values of target accuracy ε = 10-1, 10-2, 10-3, . . . etc? Please explain.

Attachment:- Question.rar

Reference no: EM132995790

Questions Cloud

What is the value of the firm under each plan : -If the shareholders require a 15% return before personal taxes, what is the value of the firm under each plan? (Do not ignore personal taxes)
What is the internal revenue service key customer data : What is the Internal Revenue Service Key Customer Data, in terms of Product/service, market segment, buying from the competition
What would be the incremental cash flow : What would be the incremental cash flow in Year 4 from leasing instead of purchasing if the purchased asset had a pretax salvage value of $900
Compute the net income under Matthew proposal : Matthew quoted an old marketing research report that said that sales volume would increase by 60%. Compute the net income under Matthew proposal
Implement the gradient descent : Implement the gradient descent - How close is the solution you compute with GD is to the actual solution
What is Gagah weighted average cost of capital : Gagah Motors Enterprise., a producer of turbine generators, is in this situation: EBIT = RM4 million, What is Gagah weighted average cost of capital
What could be the net realizable value of an inventory : Assume the damaged merchandise that had cost of BR 3500 can be sold for only 3000 direct costs, what could be the net realizable value of an inventory
Is the market for Gold - Copper efficient : The next trade of the share takes place at 10.45 am at $2 per share. Is the market for Gold & Copper efficient? Why or why not
Determine the optimal order quantity : The chemical is purchased 10 kilogram canisters for $95 each. The firm 2 uses 4 800 canisters per year. Determine the optimal order quantity

Reviews

len2995790

9/22/2021 9:31:39 PM

gradient descent and stochastic gradient descent methods to solve the regression problems.

Write a Review

Computer Engineering Questions & Answers

  What type of light rays does each type provide

Describe the elements of a digital rendering scene. List the five types of lights that SolidWorks uses. What type of light rays does each type provide?

  List the patients who do not have a prescription

List the patients who do not have a prescription. The list should show the patient's name, gender, age, admission date, room, and illness description.

  Create a program that keeps track of different types of cars

Choose the abstract classes and concrete classes that you would like to use within the program in question.

  Compare control-flow dataflow and reduction computers

Compare control-flow dataflow and reduction computers in terms of the program flow mechanism used.

  Risk assessment of ict system

Risk Assessment of ICT System and Developing Computer Incident Response System on WAMP platform

  Why would an improvement of only ten percent occur

Why would an improvement of only 10% occur? Could it be that no improvement at all would occur? Explain.

  Write an application that includes two additional methods

Write an application that includes two additional methods in addition to the Main( ) method. One method should return a string consisting of four or five lines.

  Determine the best uses of 3g and 4g technology

Compare the pros and cons of 3G and 4G technology to determine the best uses of 3G and 4G technology in today's applications

  What are people doing to achieve security objectives

What are people currently doing to achieve security objectives? Where do those security objectives originate? Who are the people who are engaged in security.

  Create a program plan and then convert

Create a program plan and then convert it into C++ statements. Practice debugging, declaring variables, file I/O, functions, arrays, sorting and searching array

  Research the recording industry association of america

Research the Recording Industry Association of America (RIAA) and the Motion Picture Association of America (MPA).

  Describe in precise terms

Describe in PRECISE terms (in terms of the contents of the packets that are sent) how the server (point B) is able to detect that the client is fake.

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd