Wednesday, November 30, 2016

Unizor - Derivatives - Intermediate Value Theorem

Notes to a video lecture on

Intermediate Value Theorem

The Intermediate Value Theorem probably grew out of attempts to find roots of equations for which no direct formula is available.
For example, suppose you have to solve the equation
2^x + x³ = 0
There is no formula for the solutions of this equation, and without modern computers it is not easy to solve. We can only hope to find the solutions approximately, so it would be useful to determine an interval where a solution is located. The narrower the interval, the better we know the solution.

Notice that the left side of our equation is negative at x=−1:
2^(−1) + (−1)³ = 0.5−1 = −0.5
At the same time, it is positive for x=1:
2^1 + (1)³ = 2 + 1 = 3
Since the left side of the equation is a continuous function that is negative at x=−1 and positive at x=1, it is intuitively obvious that somewhere in the interval [−1,1] the function must take the value 0. Therefore, a solution to our equation must lie inside the interval [−1,1].

If we want to evaluate the solution more precisely, let's take the midpoint of this interval, point x=0, and check the sign of the function at this point:
2^0 + (0)³ = 1
It is positive. So, the solution to the equation must be within the narrower interval [−1,0], because the continuous function on the left side of our equation takes the negative value −0.5 at the left end (x=−1) of this interval and the positive value 1 at the right end (x=0) of this interval.

This process of halving the interval can be continued to get closer and closer to the solution. It is known as the bisection method, and in practice many algorithms for solving complicated equations work exactly this way.
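
The halving process described above can be sketched in Python. This is only an illustrative sketch, not code from the lecture: the names `f` and `bisect` and the `tolerance` parameter are assumptions introduced here, and the sketch relies on the function being continuous with opposite signs at the ends of the starting interval.

```python
def f(x):
    """Left side of the equation 2^x + x^3 = 0 from the text."""
    return 2.0 ** x + x ** 3


def bisect(func, left, right, tolerance=1e-10):
    """Halve [left, right] until it is shorter than `tolerance`.

    Assumes func is continuous on [left, right] and that
    func(left) and func(right) have opposite signs (or one is zero).
    """
    f_left, f_right = func(left), func(right)
    if f_left == 0.0:
        return left
    if f_right == 0.0:
        return right
    if f_left * f_right > 0:
        raise ValueError("func must have opposite signs at the ends")
    while right - left > tolerance:
        middle = (left + right) / 2.0
        f_middle = func(middle)
        if f_middle == 0.0:
            return middle
        # Keep the half whose ends still have opposite signs.
        if f_left * f_middle < 0:
            right = middle
        else:
            left, f_left = middle, f_middle
    return (left + right) / 2.0


root = bisect(f, -1.0, 1.0)
print(root)
```

Starting from [−1,1], the first step of this sketch keeps [−1,0], exactly as in the text, and later steps narrow the interval further.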

While we are interested in the methodology of this approach, we are also interested in its theoretical foundation.
That foundation is the Intermediate Value Theorem.

This theorem states:
If a continuous function f(x) is defined on a segment [a,b] and takes different values at the ends of this segment, f(a) ≠ f(b), then for any intermediate value C between f(a) and f(b) there is a point h between a and b where the function f(x) takes this intermediate value C.

Symbolically, for f(a) smaller than f(b), it looks like this:
∀C ∈ [f(a),f(b)] ∃ h∈[a,b]: f(h) = C


The proof of this intuitively obvious theorem is not exactly straightforward and needs a certain axiomatic foundation.
The main idea of the proof is based on the Completeness Axiom, which says that the set of real numbers is complete in the following sense.
Assume there is a non-empty set of real numbers bounded from above.
Then the Completeness Axiom states that there exists exactly one real least upper bound (called the supremum): a real number that is an upper bound and, at the same time, is less than or equal to any other upper bound.

To understand why this axiom is called the Completeness Axiom, here is a simple example illustrating that rational numbers are not a complete set in the above sense.
Indeed, consider the set of rational numbers x that satisfy the condition x² ≤ 2. This set is bounded from above, but its supremum is not a rational number (it is the square root of 2). Therefore, rational numbers do not represent a complete set in the above sense.
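
One way to see this concretely is to show that any rational upper bound of this set can be replaced by a smaller rational upper bound, so no rational number can be the least one. The sketch below is an illustration added here, not part of the lecture. It uses the averaging step u → (u + 2/u)/2, which is an assumption chosen for convenience: for a positive rational u with u² > 2 the result is again rational, smaller than u, and still has a square greater than 2. The name `smaller_upper_bound` is hypothetical.

```python
from fractions import Fraction


def smaller_upper_bound(u):
    """Given a positive rational u with u*u > 2, return a smaller rational
    that is still an upper bound of the set {q rational : q*q <= 2}."""
    return (u + 2 / u) / 2


bound = Fraction(2)          # 2 is a rational upper bound, since 2*2 > 2
bounds = [bound]
for _ in range(4):
    bound = smaller_upper_bound(bound)
    bounds.append(bound)

for b in bounds:
    # Exact rational arithmetic: every b*b stays strictly above 2.
    print(b, b * b > 2)
```

Every value printed is a rational upper bound, and each is smaller than the previous one, which is what the absence of a rational supremum looks like in practice.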

The Intermediate Value Theorem deals with real numbers, and we assume that the Completeness Axiom holds.
Also assume, for definiteness, that f(a) is smaller than f(b).

Let's choose any number C between f(a) and f(b):
f(a) ≤ C ≤ f(b).
If C equals f(a) or f(b), the theorem is trivial: we just take h=a or h=b, respectively. So, we can assume that C is strictly greater than f(a) and strictly less than f(b).

Consider now the set S of all real numbers x within the segment [a,b] for which the value f(x) is not greater than C: S = {x∈[a,b]: f(x) ≤ C}.
This set S is not empty because, at least, point a belongs to it since f(a) ≤ C.

Set S is, obviously, bounded from above by the real number b. Therefore, according to the Completeness Axiom, set S has a supremum (the least upper bound), which we denote h.
Number h belongs to the segment [a,b]. Indeed, h ≥ a because a∈S and h is an upper bound of S, and h ≤ b because b is an upper bound of S and h is the least upper bound.

Consider now the value f(h). We will prove that f(h)=C, so h is the number whose existence we have to prove.

If f(h) is smaller than C, the continuous function f(x) will also be smaller than C in the immediate neighborhood of point h. Note that h≠b in this case because f(b) is greater than C, so h is smaller than b. Using the definition of continuity, we can choose ε=[C−f(h)]/2 and find δ such that |x−h|≤δ ⇒ |f(x)−f(h)|≤ε. Making δ smaller if necessary, we can also assume that h+δ ≤ b.
Then we set x = h+δ and conclude:
f(h+δ) ≤ f(h)+ε.
Since
f(h)+ε = f(h)+[C−f(h)]/2 =
= C − [C−f(h)]/2 < C,
it follows that
f(h+δ) < C.
Therefore, (h+δ)∈S, and h cannot be an upper bound of set S since h is smaller than h+δ.
Hence, f(h) cannot be smaller than C.

Similar logic can be applied in the case when f(h) is greater than C.
The continuous function f(x) will also be greater than C in the immediate neighborhood of point h. Using the definition of continuity, we can choose ε=[f(h)−C]/2 and find δ such that |x−h|≤δ ⇒ |f(x)−f(h)|≤ε.
Then for all points x of segment [a,b] within the δ-neighborhood of point h
f(x) ≥ f(h)−ε = C + [f(h)−C]/2 > C,
so this neighborhood contains no points of set S.
Therefore, all points of this neighborhood to the left of point h are also upper bounds for set S, which is impossible since h is the least upper bound of set S and no other upper bound can lie to the left of it.
Hence, f(h) cannot be greater than C.

The only possibility left is f(h)=C.
End of proof.
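
The construction used in the proof, h = sup S, can be checked numerically on the example from the beginning of these notes. The sketch below is an illustration added here under stated assumptions, not part of the proof: it replaces the segment [a,b] by a finite grid, so it only approximates the supremum, and the names `approximate_sup` and `steps` are hypothetical.

```python
def f(x):
    """Left side of the equation 2^x + x^3 = 0 from the text."""
    return 2.0 ** x + x ** 3


def approximate_sup(func, a, b, c, steps=100_000):
    """Largest grid point x in [a, b] with func(x) <= c.

    Approximates sup S for S = {x in [a, b] : func(x) <= c}.
    Assumes func(a) <= c, so that S is not empty.
    """
    best = a
    for i in range(steps + 1):
        x = a + (b - a) * i / steps
        if func(x) <= c:
            best = x
    return best


h = approximate_sup(f, -1.0, 1.0, 0.0)
print(h, f(h))
```

With a = −1, b = 1 and C = 0, the value f(h) comes out close to 0, in agreement with the conclusion f(h)=C; a finer grid brings it closer.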
