## Monday, January 9, 2017

### Unizor - Derivatives - Newton's Method of Finding Zeros of a Function

Notes to a video lecture on http://www.unizor.com

Derivatives - Newton's Method
to Find Zeros of Function

The purpose of this lecture is to address a specific problem of finding an approximate solution of equations f(x)=0 using some methodology suggested by Sir Isaac Newton a few centuries ago.
Obviously, it makes sense to apply this methodology when there is no analytical solution of a given equation, or this analytical solution requires too much efforts.

A word of warning should be given up front. This method might sometimes fail to find an approximate solution. After describing this method it will be obvious that under some circumstances the process of finding a solution is not converging to any number.

Let's start with an impractical but illustrative example of a linear function f(x)=2x−4. Obviously, linear equation f(x)=0 looks like 2x−4=0 and can be immediately solved:
2x=4
x=2

But let's approach it differently using the following method.
The graph of f(x)=2x−4 looks like this (blue line): To solve the equation 2x−4=0 means to find the X-coordinate of point C, where the graph intersects the X-axis.

Let's choose any point A with X-coordinate x0 and draw a perpendicular to X-axis from this point to an intersection with the graph of our function f(x)=2x−4 at point B with coordinates {x0, f(x0)} (black line).
We notice that segment AC is a cathetus of triangle ΔABC and, therefore, its length equals to
AC = AB/tan(∠ACB)

We also know that tan(∠ACB)is a derivative of a function f(x)=2x−4 at point C, which is the same as a derivative at any other point, like A, since our function is linear. This derivative is equal to f I(x0)=2.

Therefore, knowing this derivative, X-coordinate of point A, which is equal to x0, and the value of our function at point A, which is equal to f(x0), that is the length of segment AB, we can easily find the length of segment AC and then the X-coordinate of point C:
AC = AB / tan(∠ACB) =
f(x0) / f I(x0)

Now, X-coordinate of point C, let's call it x1, equals to X-coordinate of point A minus the length of segment AC:
x1 = x0 − f(x0) / f I(x0)

Point A can be taken arbitrarily. Let's say, x0=4.
Then AB=f(x0)=2·4−4=4.
Since tan(∠ACB)=2,AC=4/2=2.
Finally, knowing AC and coordinate of point A (x0=4), we can determine the coordinate of point C:
x1 = x0 − AC = 4−2=2
This is a solution to an original equation 2x−4=0.

Next step towards Newton's Method is to consider a more complex function and how analogous approach to the one above might lead to a solution.

Our next example is a quadratic polynomial
f(x)=(2/9)x²+(2/9)x-(4/9)
presented on the next graph (red line). We have specifically considered such a function that the linear function described above (blue line) is a tangential line to this new function exactly at a point where we chose to start our calculation x0=4.
Incidentally, it's easy to see that the zero point of this quadratic polynomial is x=1. However, we pretend that we don't know this and attempt to approach this value in a process similar to described above for a linear function.

First of all, note that the behavior of parabola is very smooth on a graph. It is smooth not only in terms of differentiability and continuity of a derivative, but also in terms of its derivative being a monotonic function. In particular, as seen from the graph, direction of parabola and direction of its tangential line at x0=4 (blue line) are very close to each other. From the point of tangency x0=4 both parabola and its tangential line go to zero towards the same direction - to the left from point x0.

The main implication of this is that a zero point C of a tangential line with X-coordinate x1=2, as we calculated above, is closer to a zero point of parabola than original point A at X-coordinate x0=4, where we started the process. Therefore, jumping from point A to point C we got closer to a zero point of a parabola.

An obvious continuation of this process is to recursively repeat the step above, starting at point C with X-coordinate x1=2.
We draw a perpendicular to X-axis at point C to intersection with our parabola at point with coordinates {x1, f(x1)} and a tangential line to parabola at point D (green line) that intersects X-axis at point E with X-coordinate x2.

Let's determine x2 from triangle ΔEDC similarly to the above procedure:
CE = CD/tan(∠CED)
CD = f(x1) = (2/9)·2²+(2/9)·2−(4/9) = 8/9
tan(∠CED) = f I(x1) =
= (2/9)·2x1+(2/9) = 10/9

CE = (8/9) / (10/9) = 4/5
x2 = x1 − CE =
= x1 − f(x1) / f I(x1) =
= 2−(4/5) = 6/5

Got closer to a zero point of a parabola that we know is x=1, but pretend we don't know it.

Let's repeat this process once more, starting at x2=6/5.
x3 = x2 − f(x2) / f I(x2) ≅
≅ 1.2 − 0.142222 / 0.755556 ≅
≅ 1.01176

Got very close to x=1!

The recursive process described above is the Newton's method of finding arguments where function equals to zero.
Let's formulate it more rigorously.

Assume, we need to find zeros of some smooth function f(x).

Step 0
Start with some approximation of a value of an argument x0that is relatively close to the one where our function equals to zero.
There is no universal recipe for this approximation, but the better your approximation is - the sooner you will approach the real zero of f(x).
And, to be noted, if your starting value is too far from the real zero, the Newton's method might never converge to real zero of our function.

Step 1
From the chosen value xcalculate the next value of a sequence:
x1 = x0 − f(x0) / f I(x0)
This and subsequent steps are straightforward calculations.

Step 2, 3 etc.
Check two things:
(a) the value of f(xn) - is it close to zero within your chosen precision?
(b) the difference between xand xn−1 - is it smaller than your chosen precision?
If both checks are satisfactory, stop the process. The last xn is your final result.
If your precision requirements are not satisfied yet, continue.
Repeat the previous step according to this recursive formula:
xn+1 = xn − f(xn) / f I(xn)

As we noted, Newton's method is not universal. A lot depends on properties of function, whose zeros we are trying to find, and on a choice of the first approximation x0.
Let's have an example where Newton's method does not converge to a zero point of a function.

An obvious example is when we choose the first approximation point where a tangential line is parallel to the X-axis.
Thus, in case of a parabola f(x)=(2/9)x²+(2/9)x-(4/9), that we considered above, such point would be x=−0.5.

Another obvious example is an attempt to find a zero point if the function has none. Consider function f(x)=1/x with any starting point. The Newton's method will lead you to infinity.

It is very important that the first approximation point should be relatively close to a real zero point of a function. Imagine a function having a zero point and a "hump" near it. If we choose a starting point "behind a hump", the process will never get to a zero point. Here is an example.
f(x)=x·e−x This function has a maximum at point x=1. At this point the tangential line is parallel to X-axis. If our first approximation x0 is greater or equal to 1, we will never converge to a function's zero point at x=0.