## Wednesday, November 30, 2016

### Unizor - Derivatives - Intermediate Value Theorem

Notes to a video lecture on http://www.unizor.com

Intermediate Value Theorem

The intermediate value theorem probably grew out of attempts to find roots of equations for which no direct formula for the roots is available.
For example, suppose you have to solve the equation
2^x + x³ = 0
There is no formula for the solutions of this equation, and without modern computers it is not easy to solve. We can only hope to find the solutions approximately, so it would be useful to determine an interval where a solution is located. The narrower the interval, the better we know the solution.

Notice that the left side of our equation is negative at x=−1:
2^(−1) + (−1)³ = 0.5−1 = −0.5
At the same time, it is positive for x=1:
2^1 + (1)³ = 2 + 1 = 3
Considering the left side of the equation is a continuous function and taking into account that it is negative at x=−1 and positive at x=1, it is intuitively obvious that somewhere in interval [−1,1] our function must cross the value of 0. Therefore, the solution to our equation must lie inside the interval [−1,1].

If we want to evaluate the solution more precisely, let's take a midpoint of this interval, point x=0, and check the sign of a function at this point:
2^0 + (0)³ = 1
It is positive. So, the solution to an equation must be within a narrower interval [−1,0] because a continuous function on the left side of our equation takes negative value −0.5 at the left end (x=−1) of this interval and positive value 1 at the right end (x=0) of this interval.

This process of dividing an interval in halves can be continued to get to a solution closer and closer. In practice, many algorithms of finding solutions to complicated equations are exactly as described here.
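
The halving process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `bisect` and the `tolerance` parameter are chosen here for the example and are not part of the lecture.

```python
def f(x):
    """Left side of the equation 2^x + x^3 = 0."""
    return 2.0 ** x + x ** 3


def bisect(func, left, right, tolerance=1e-10):
    """Halve [left, right] until it is shorter than `tolerance`.

    Assumes func is continuous and takes values of opposite signs
    at the two ends of the starting interval.
    """
    f_left, f_right = func(left), func(right)
    if f_left * f_right > 0:
        raise ValueError("func must have opposite signs at the ends")
    while right - left > tolerance:
        middle = (left + right) / 2
        f_middle = func(middle)
        if f_middle == 0:
            return middle
        # Keep the half on which the sign change survives.
        if f_left * f_middle < 0:
            right = middle
        else:
            left, f_left = middle, f_middle
    return (left + right) / 2


root = bisect(f, -1.0, 1.0)
print(root, f(root))
```

The first pass of the loop reproduces the step made in the text: the midpoint is x=0, the function is positive there, and the interval shrinks to [−1,0].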

While we are interested in methodology of this approach, we are also interested in theoretical foundation of it.
Its base - Intermediate Value Theorem.

This theorem states:
If continuous function f(x) is defined on segment [a,b] and takes different values at the ends of this segment, f(a) ≠ f(b), then for any intermediate value C between f(a) and f(b) there is a point h between a and b, where function f(x) takes this intermediate value C.

Symbolically, for f(a) smaller than f(b), it looks like this:
∀C ∈ [f(a),f(b)] ∃ h∈[a,b]: f(h) = C

Proof

The proof of this intuitively obvious theorem is not exactly straightforward and needs a certain axiomatic foundation.
The main idea of this proof is based on the Completeness Axiom - the axiom that says that a set of real numbers is complete in the following sense.
Assume, there is a non-empty set of real numbers bounded from above.
Then Completeness Axiom states that there exists exactly one real least upper bound (called supremum) - a real number that is an upper bound and, at the same time, is less or equal to any other upper bound.

To understand why this axiom is called Completeness Axiom, here is a simple example illustrating that rational numbers are not a complete set in the above sense.
Indeed, consider a set of rational numbers that are bounded from above by condition X² ≤ 2. The supremum of this set is not a rational number (it is square root of 2). Therefore, rational numbers do not represent a complete set in the above sense.

The Intermediate Value Theorem deals with real numbers, and we assume that Completeness Axiom takes place.
Also assume for definiteness that f(a) is smaller than f(b).

Let's choose any number C between f(a) and f(b):
f(a) ≤ C ≤ f(b).
If C equals to f(a) or f(b), the theorem is trivial - we just take h=a or h=b, correspondingly. So, we can assume that C is strictly greater than f(a) and strictly less than f(b).

Consider now the set S of all real numbers x within segment [a,b] for which the value f(x) is not greater than C: x∈S ⇔ x∈[a,b] and f(x) ≤ C.
This set S is not empty because, at least, point a belongs to it since f(a) ≤ C.

Set S is, obviously, bounded from above by real number b. Therefore, according to Completeness Axiom, set S has supremum - the least upper bound - number h.
Number h belongs to segment [a,b]: it is not less than a because a∈S, and it is not greater than b because otherwise number b would be a smaller upper bound, which contradicts the fact that h is the least upper bound.

Consider now value f(h). We will prove that f(h)=C and, therefore, h is the number, whose existence we have to prove.

If f(h) is smaller than C, continuous function f(x) will also be smaller than C in the immediate neighborhood of point h. Indeed, using the definition of continuity, we can choose ε=[C−f(h)]/2 and find δ>0 such that |x−h|≤δ ⇒ |f(x)−f(h)|≤ε.
Since f(h) < C < f(b), point h is different from b, so h < b and we can take δ small enough that h+δ ≤ b. Then we set x = h+δ and conclude:
f(h+δ) ≤ f(h)+ε.
Since
f(h)+ε = f(h)+[C−f(h)]/2 =
= C − [C−f(h)]/2 ≤ C,
it follows that
f(h+δ) ≤ C.
Therefore, h is no longer an upper bound of set S since h is smaller than h+δ and (h+δ)∈S.
Hence, f(h) cannot be smaller than C.

Somewhat similar logic can be applied in case f(h) is greater than C.
Continuous function f(x) will also be greater than C in the immediate neighborhood of point h. Indeed, using the definition of continuity, we can choose ε=[f(h)−C]/2 and find δ>0 such that |x−h|≤δ ⇒ |f(x)−f(h)|≤ε, so that f(x) ≥ f(h)−ε = C+[f(h)−C]/2 > C.
So, the value of function f(x) at all points of segment [a,b] within the δ-neighborhood around point x=h is greater than C, and this neighborhood contains no points of set S.
Therefore, all points of this δ-neighborhood to the left of point h (they exist because f(a) < C < f(h) implies h ≠ a) are also upper bounds for set S, which is impossible since h is the least upper bound of set S, and no other upper bound can lie to the left of it.
Hence, f(h) cannot be greater than C.

The only possibility left is f(h)=C.
End of proof.

## Monday, November 28, 2016

### Unizor - Derivatives - Minimum or Maximum, or Inflection

Notes to a video lecture on http://www.unizor.com

Derivatives - MIN or MAX

We know that if a smooth function f(x), defined on interval (a,b) (finite or infinite), has a local maximum or minimum at some point x0, its derivative at this point equals zero.

Consider the converse statement: if the derivative of a smooth function f(x), defined on interval (a,b), equals zero at some point x0, the function attains a local minimum or maximum at this point. Is it a correct statement?
The answer is NO.
Here is a simple example. Function f(x)=x³ is defined for all real arguments and is monotonically increasing, so it has no minimum and no maximum. Its derivative is f'(x) = 3x². It is non-negative, as the derivative of a monotonically increasing function should be, but, in particular, it is equal to zero at x=0.
So, a point where the derivative equals zero is not necessarily a point of minimum or maximum.

All points where a derivative equals zero are called stationary points of a function. Some of them are points of local minimum, some are points of local maximum, and the others (neither local minimums nor maximums) are called inflection points.

An example of a local minimum is function f(x) = x² at point x=0 with derivative f'(x) = 2x, which equals zero at x=0.

An example of a local maximum is function f(x) = −x² at point x=0 with derivative f'(x) = −2x, which equals zero at x=0.

An example of a point of inflection is function f(x) = x³ at point x=0 with derivative f'(x) = 3x², which equals zero at x=0.

Our task now is to distinguish different kinds of stationary points of a smooth function, which ones are local minimums, which are maximums and which are inflection points.
To accomplish this, we need to analyze the behavior of both the first and the second derivatives of a function.

Examine the point of local minimum of a smooth function first. The fact that point x0 is a local minimum means that in the immediate neighborhood of this point the behavior of a function, as we increase the argument from some value on the left of x0 to some value on the right of x0, is to monotonically decrease its value, reaching minimum value at this point x0 and then to monotonically increase afterwards.
As we know, monotonically decreasing smooth functions have a non-positive derivative, while monotonically increasing ones have a non-negative derivative. Assuming that in the immediate neighborhood of point x0 the only point where the derivative is equal to zero is our point of local minimum x0, we conclude that the derivative to the left of x0 is strictly negative and to the right of it strictly positive. So, the derivative changes sign from minus to plus going through a point of local minimum.
Therefore, our first tool to distinguish among different types of stationary points (those where the derivative equals zero) is to analyze the sign of the derivative to the immediate left and to the immediate right of a stationary point. If it changes from minus to plus, it's a point of local minimum.

In our first example above, function f(x) = x² at stationary point x=0 has derivative f'(x) = 2x that changes sign from minus to plus as we move from negative arguments to positive ones over point x=0. That identifies stationary point x=0 as a local minimum.

Analogously, if a derivative changes sign from plus to minus in the immediate neighborhood of a stationary point, it's a point of local maximum.

In our second example above, function f(x) = −x² at stationary point x=0 has derivative f'(x) = −2x that changes sign from plus to minus as we move from negative arguments to positive ones over point x=0. That identifies stationary point x=0 as a local maximum.

Finally, if a derivative does not change the sign, but, being negative on the left of stationary point, is increasing to zero at a stationary point and then goes negative again, it's an inflection point. Similarly, if a derivative does not change the sign, but, being positive on the left of stationary point, is decreasing to zero at a stationary point and then goes positive again, it's an inflection point as well.

In our third example above, function f(x) = x³ at stationary point x=0 has derivative f'(x) = 3x² that does not change sign as we move from negative arguments to positive ones over point x=0. It's positive on both sides of the stationary point. That identifies stationary point x=0 as an inflection point.
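
The sign test on the first derivative can be tried out with a short Python sketch. The function name `classify_by_sign_change` and the offset `step` are assumptions made for this illustration; the test is only meaningful when `step` is small enough that x0 is the only stationary point within that distance.

```python
def classify_by_sign_change(derivative, x0, step=1e-3):
    """Classify a stationary point x0 by the sign of the derivative near it.

    `derivative` is the first derivative given as a Python function;
    `step` is an assumed small offset to the left and right of x0.
    """
    left = derivative(x0 - step)
    right = derivative(x0 + step)
    if left < 0 < right:
        return "local minimum"       # minus to plus
    if left > 0 > right:
        return "local maximum"       # plus to minus
    if left * right > 0:
        return "inflection point"    # same sign on both sides
    return "undetermined"


print(classify_by_sign_change(lambda x: 2 * x, 0.0))       # f(x) = x^2
print(classify_by_sign_change(lambda x: -2 * x, 0.0))      # f(x) = -x^2
print(classify_by_sign_change(lambda x: 3 * x ** 2, 0.0))  # f(x) = x^3
```

The three calls correspond to the three examples in the text.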

Another approach to identify stationary points of a smooth function as local minimum, maximum or inflection points is to analyze the second derivative.

Assuming x0 is a stationary point of function f(x), that is f'(x0) = 0, let's check the function's second derivative at this point, f''(x0). It can be positive, negative or zero.

If it's positive, it means that the first derivative (whose own derivative is the second derivative of the original function) is monotonically increasing near x0. Since the first derivative equals zero at point x0, it must be negative to the left of this stationary point and positive to the right, that is, it changes sign from minus to plus and the stationary point is a local minimum.
In our first example of function f(x)=x² the second derivative is f''(x) = 2 (constant), which at the stationary point x=0 equals 2. It is positive, therefore we deal with a local minimum.

If the second derivative at the stationary point is negative, it means that the first derivative (whose own derivative is the second derivative of the original function) is monotonically decreasing near x0. Since the first derivative equals zero at point x0, it must be positive to the left of this stationary point and negative to the right, that is, it changes sign from plus to minus and the stationary point is a local maximum.
In our second example of function f(x)=−x² the second derivative is f''(x) = −2 (constant), which at the stationary point x=0 equals −2. It is negative, therefore we deal with a local maximum.

Finally, if the second derivative is equal to zero at the stationary point, we cannot make any judgment by looking at the value of the second derivative at the stationary point.
Consider function f(x)=x⁴. Its first derivative is f'(x) = 4x³, which is equal to zero at x=0, so this is a stationary point. The second derivative is f''(x) = 12x², which at the stationary point x=0 equals zero. Since it's zero, we cannot identify this stationary point as a local minimum, maximum or inflection point, though visually it's a clear minimum. This illustrates that the method based on the value of the second derivative at the stationary point does not always work.
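
As a small illustration, here is the second derivative test written as a Python function. The name `classify_by_second_derivative` is hypothetical, and the second derivatives are typed in by hand rather than computed.

```python
def classify_by_second_derivative(second_derivative, x0):
    """Apply the second derivative test at a stationary point x0.

    `second_derivative` is f'' given as a Python function; the caller is
    assumed to have checked already that f'(x0) = 0.
    """
    value = second_derivative(x0)
    if value > 0:
        return "local minimum"
    if value < 0:
        return "local maximum"
    return "inconclusive"


print(classify_by_second_derivative(lambda x: 2.0, 0.0))          # f(x) = x^2
print(classify_by_second_derivative(lambda x: -2.0, 0.0))         # f(x) = -x^2
print(classify_by_second_derivative(lambda x: 12 * x ** 2, 0.0))  # f(x) = x^4
print(classify_by_second_derivative(lambda x: 6 * x, 0.0))        # f(x) = x^3
```

The last two calls both report an inconclusive result even though x⁴ has a minimum at 0 and x³ has an inflection point there, which is the limitation the text points out.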

### Unizor - Derivatives - Taylor Series

Notes to a video lecture on http://www.unizor.com

Derivatives - Taylor Series

Functions can be simple, like
f(x)=2x+3 or f(x)=x²−1
or more complex, like
f(x)=[x+ln(x)]^(1−sin(x))·e^(tan(x+5))

Obviously, it is always easier to deal with simple functions. Unfortunately, sometimes real functions describing some processes are too complex to analyze at each value of its argument, and mathematicians recommend to approximate a complex function with another, much simpler to deal with.
And the favorite simplification is approximation of a function with a polynomial.

There is a very simple reason for this. Computer processors can perform calculations very fast, but at their core they rely on the basic arithmetic operations. That makes it relatively easy to calculate the values of polynomials, but not of such functions as sin(x) or ln(x), which frequently occur in real-life problems.
Yet, we all know that computers do calculations involved with these functions. The way they do it is using the approximation of these and many other functions with polynomials.

Our task is to approximate any sufficiently smooth (in a sense of differentiability) function with a polynomial.
In particular, we will come up with a power series that converges to our function.
So, cutting this series on any member would produce an approximation with a polynomial, and the approximation would be better and better if we cut the series further and further from the beginning, increasing the number of elements participating in polynomial approximation.

First of all, we mentioned power series. Here we mean an infinite series, the nth member of which is Cn·x^n (where n=0, 1, 2...), which we can express as
P(x) = Σn≥0[Cn·x^n].
Any finite series of this type is a polynomial itself and does not need any other simplification. So, we are talking about infinite series that has a value in a sense of the limit, when the number of members infinitely grows.

Obviously, not any power series of this type is convergent, but for sufficiently smooth functions defined on finite segment [a,b] there exists such a power series that converges to our function at each point of this segment, and we can achieve any quality of approximation by allowing sufficient number of members of a power series to participate in the approximation, that is cutting the tail of a series sufficiently far from the beginning.
Let's assign symbol PN(x) to the partial sum of the members of our series up to the Nth power:
PN(x) = Σn∈[0,N][Cn·x^n].
Using this notation,
P(x) = lim(N→∞) PN(x)

Let's analyze the representation of a sufficiently smooth on segment [a,b] function f(x) with a power series P(x).
In particular, let's assume that we want to find coefficients Cn of such a power series that
(1) for some specific value of argument x = x0, called a center of expansion, any partial sum PN(x) has the same value as function f(x) regardless of how many members N participate in a sum, that is
∀N ≥ 0: f(x0)=PN(x0);
(2) this power series converges to our function for every argument x∈[a,b], that is f(x)=P(x).
The first requirement assures that, at least at one point x = x0 our approximation of a function with a partial power series will be exact, regardless of how long the series is.

Based on the first requirement, that any partial sum of the power series at point x = x0, that is PN(x0), be equal to the value of the original function f(x0) at this point, it is convenient to represent our power series as
P(x) = Σn≥0[Cn·(x−x0)^n]
with C0=f(x0).
Now, no matter how close to the beginning we cut P(x) to PN(x), we see that
f(x0) = PN(x0) for all N ≥ 0.
The first requirement is, therefore, satisfied in this form of our power series.

Let's now concentrate on the second requirement for P(x) to converge to f(x) for any point x of a segment [a,b].
We will do it in two steps.
Step 1 would assume that P(x) does converge to f(x) at any point. Based on this assumption, we will determine all the coefficients Cn. In a way, these specific values of coefficients are a necessary condition for equality between P(x) and f(x).
On the step 2, knowing that coefficients Cn of a power series P(x) must have certain values derived in step 1, we will discuss the issue of convergence.

So, assume the following is true:
f(x) = Σn≥0[Cn·(x−x0)^n].
As we have already determined, C0=f(x0).
Let's differentiate both sides of the equality above.
The member C0·(x−x0)^0 will disappear during differentiation since it's a constant, and any member of type K·(x−x0)^k will become k·K·(x−x0)^(k−1).
So, the resulting equality will look like this:
f'(x) = Σn≥1[n·Cn·(x−x0)^(n−1)].
This is an equality that is supposed to be true for any argument x. Substituting x=x0, all members of the infinite series except the first one will be zero. The first one is equal to
1·C1·(x−x0)^0 and, since the exponent is 0, we have the following equality:
f'(x0) = 1·C1
Now we know the value of the next coefficient in our infinite series:
C1=f'(x0) / 1

The next procedure repeats the previous one. Let's take another derivative.
The member 1·C1·(x−x0)^0 will disappear during differentiation since it's a constant, and any member of type K·(x−x0)^k will become k·K·(x−x0)^(k−1).
So, the resulting equality will look like this:
f''(x) = Σn≥2[n·(n−1)·Cn·(x−x0)^(n−2)].
This is an equality that is supposed to be true for any argument x. Substituting x=x0, all members of the infinite series except the first one will be zero. The first one is equal to
2·1·C2·(x−x0)^0 and, since the exponent is 0, we have the following equality:
f''(x0) = 1·2·C2
Now we know the value of the next coefficient in our infinite series:
C2=f''(x0) / (1·2)

It can easily be seen that the repetition of the same procedure leads to the following values of coefficients Cn of our series:
C3=f'''(x0) / (1·2·3)
C4=f^(4)(x0) / (1·2·3·4)
and, in general,
Cn=f^(n)(x0) / (n!)
where f^(n)(x0) signifies the nth derivative at point x0 (with the 0th derivative being the original function) and n! is "n factorial" - the product of all integer numbers from 1 to n, with 0! being equal to 1 by definition.

We came up with the following form of representation of a function f(x) as a power series:
f(x)=Σn≥0[f^(n)(x0)·(x−x0)^n/(n!)]
This representation is called Taylor series.
Sometimes, in case of x0=0, it is referred to as Maclaurin series.
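
To see the partial sums at work, here is a minimal Python sketch. It assumes the derivatives at the center of expansion are supplied by hand as a list; the helper names `taylor_partial_sum` and `sin_derivatives_at_zero` are invented for this example.

```python
import math


def taylor_partial_sum(derivatives_at_x0, x0, x):
    """Sum of f^(n)(x0) * (x - x0)^n / n! over the supplied derivatives.

    `derivatives_at_x0` lists f(x0), f'(x0), f''(x0), ... in this order.
    """
    return sum(d * (x - x0) ** n / math.factorial(n)
               for n, d in enumerate(derivatives_at_x0))


def sin_derivatives_at_zero(count):
    """Derivatives of sin(x) at 0 cycle through 0, 1, 0, -1."""
    cycle = [0.0, 1.0, 0.0, -1.0]
    return [cycle[n % 4] for n in range(count)]


x = 1.0
for terms in (2, 4, 6, 8, 10, 12):
    approximation = taylor_partial_sum(sin_derivatives_at_zero(terms), 0.0, x)
    print(terms, approximation, abs(approximation - math.sin(x)))
```

The printed error column shows how the approximation of sin(1) improves as more members of the Maclaurin series are included.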

This form satisfies the first requirement we set in the beginning: for a center of expansion x = x0 any partial sum of this series has the same value as function f(x) regardless of how many members N participate in a sum.

We can also say that, if there is a power series converging to our function for every argument x∈[a,b], it must have a form above with coefficients as derived.

Our next task is to examine conditions under which the power series above exists and converges.

Obvious first requirement is infinite differentiability of the function f(x) since the coefficients of our series contain derivatives to any level.

As for convergence, it depends on the values of derivatives at point x0. A reasonable assumption might be that derivative f^(n)(x0) of any order n is bounded by some maximum value M:
|f^(n)(x0)| ≤ M
Let's prove that in this case the series converges.

Assuming the above bound for derivatives of any order at point x0, each member of the series is bounded in absolute value by M·c^n/n!, where c=|x−x0|. So the problem of convergence is reduced to proving that the following series converges for any c:
S(c) = Σn≥0 c^n/n!

Theorem
A sequence c^n/n!, where c is any positive constant and n is an infinitely increasing index number, is bounded by a geometric progression with a positive quotient smaller than 1, starting at some index m.

Proof
Choose an integer m greater than c and start analyzing the members of this sequence with index numbers n greater than index m.
The following inequalities are true then:
c^n/n! = c^(n−m)·c^m/[n·(n−1)·...·(m+1)·m!] ≤
≤ c^(n−m)·c^m/[m^(n−m)·m!] =
= (c/m)^(n−m)·Q
where constant Q equals
Q = c^m/m!
The last expression represents a geometric progression with the first member Q·c/m and quotient c/m. Since m was chosen as an integer greater than c, the quotient of this geometric progression is less than 1.
End of proof.
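
The inequality from this proof can be spot-checked numerically. The sketch below is only a sanity check for a few values of c, not a substitute for the proof; the function name `check_geometric_bound` and the number of checked terms are assumptions of this example.

```python
import math


def check_geometric_bound(c, count=30):
    """Check c^n/n! <= Q*(c/m)^(n-m) for `count` indices n after m."""
    m = math.floor(c) + 1            # smallest integer strictly greater than c
    q = c ** m / math.factorial(m)   # the constant Q from the proof
    for n in range(m + 1, m + 1 + count):
        term = c ** n / math.factorial(n)
        bound = q * (c / m) ** (n - m)
        # Small relative slack guards against floating point rounding.
        if term > bound * (1 + 1e-12):
            return False
    return True


for c in (0.5, 3.0, 5.5):
    print(c, check_geometric_bound(c))
```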

Now we see that the members of the power series we considered above are bounded in absolute value by members of a geometric progression with positive quotient smaller than 1. For a geometric progression that is a sufficient condition for its sum to converge. Therefore, the power series is convergent.
This convergence, as was mentioned above, is true under the assumption that all derivatives of the original function f(x) at point x0 are bounded by some constant M.

This condition on derivatives can be weakened in different ways, which we will not mention here. The question of the precision of the approximation with partial sums of the power series, including whether the sum of the series actually equals f(x), also remains open. There are different approaches to evaluating the quality of this approximation, which we leave for self-study.

### Unizor - Derivatives - One-Sided Function Limits

Notes to a video lecture on http://www.unizor.com

One-sided Function Limit

Definition 1
Real number L is the limit of function f(x) from the right (or is the right limit) as argument x approaches real number a if for any sequence {xn} that approaches a while each element of this sequence is greater than a, the sequence {f(xn)} converges to L.
Symbolically, it looks like this: lim(x→a+) f(x)=L
An equivalent definition using ε-δ formulation is as follows:
∀ε>0 ∃δ>0:
x∈(a,a+δ) ⇒ |f(x)−L| ≤ ε

Similar definition exists for the limit from the left.
Definition 2
Real number L is the limit of function f(x) from the left (or is the left limit) as argument x approaches real number a if for any sequence {xn} that approaches a while each element of this sequence is less than a, the sequence {f(xn)} converges to L.
Symbolically, it looks like this: lim(x→a−) f(x)=L
An equivalent definition using ε-δ formulation is as follows:
∀ε>0 ∃δ>0:
x∈(a−δ,a) ⇒ |f(x)−L| ≤ ε

Theorem
If function f(x) converges to L as x→a, then this function converges to the same L as x→a+ or x→a−.

Proof
Both one-sided limits are supposed to be the same as a general limit. This follows from the fact that if f(xn)→L for any sequence of arguments {xn} approaching a, the same limit would be if arguments approach a only from the right or only from the left.

The converse statement is not, generally speaking, true.
For example, consider a function that is equal to 0 for all negative arguments and is equal to 1 for positive or zero arguments. At point 0 this function has limit from the left equal to 0 and limit from the right equal to 1, so it has no general limit there.
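
The step function from this example can be probed numerically. The sketch below evaluates it along one particular sequence on each side of 0; a single sequence only suggests the value of a one-sided limit and does not prove it. The names `step` and `one_sided_values` are chosen for this illustration.

```python
def step(x):
    """0 for negative arguments, 1 for zero or positive arguments."""
    return 0.0 if x < 0 else 1.0


def one_sided_values(func, a, side, count=8):
    """Values of func along a sequence approaching a from one side.

    side = +1 uses x_n = a + 10^(-n) (approach from the right),
    side = -1 uses x_n = a - 10^(-n) (approach from the left).
    """
    return [func(a + side * 10.0 ** (-n)) for n in range(1, count + 1)]


print("from the left: ", one_sided_values(step, 0.0, -1))
print("from the right:", one_sided_values(step, 0.0, +1))
```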

However, if both one-sided limits exist and are equal to each other, the general limit also exists and is equal to these one-sided limits.

Theorem
Assume the following:
lim(x→a−) f(x) = lim(x→a+) f(x) = L
Prove that
lim(x→a) f(x) = L

Proof
Choose any positive constant ε.
Then we know that
∃δ1>0: x∈(a−δ1,a) ⇒ |f(x)−L| ≤ ε
and
∃δ2>0: x∈(a,a+δ2) ⇒ |f(x)−L| ≤ ε
Let δ=MIN(δ1,δ2).
Then both above conditions are met for this δ and we can state that
∃δ>0: x∈(a−δ,a+δ), x≠a ⇒ |f(x)−L| ≤ ε
which is the definition of a general limit at point x=a.

## Monday, November 21, 2016

### Unizor Derivatives - Constant Functions

Notes to a video lecture on http://www.unizor.com

Derivatives - Constant Function

The following statement is obvious.
If a smooth function f(x), defined on segment [a,b] (including endpoints), is constant, that is if
∀ x∈[a,b]: f(x)=f(a)=f(b),
then its derivative at any inner point of this segment equals zero:
∀ x∈(a,b): f'(x) = 0

What is more interesting is the converse theorem.

Theorem 1

If a smooth function f(x), defined on segment [a,b] (including endpoints), has a derivative equal to zero at any inner point, that is if
∀ x∈(a,b): f'(x) = 0,
then this function is constant on this segment, that is
∀ x∈[a,b]: f(x)=f(a)=f(b).

Proof

Let's choose any point x inside interval (a,b).
Now use Lagrange Theorem for our function f(x) on a segment [a, x] that starts at point a and ends at point x.
This theorem states that there exists a point x0∈(a, x) such that
f'(x0) = [f(x)−f(a)] / (x−a)
Since we know that the derivative of function f(x) on an interval (a,b) is zero at any point, we conclude that
0 = [f(x)−f(a)] / (x−a)
from which follows
f(x)=f(a).
Recall that point x was chosen as any point in interval (a,b). It means that f(x)=f(a) is true for any inner point of this interval.
Since function f(x) is smooth (which, in particular, implies continuity), the values at the end of this interval are also the same.
Hence, our function is constant on segment [a,b].

End of proof.

Theorem 2

If two smooth functions f(x) and g(x), defined on segment [a,b] (including endpoints), have equal derivatives, that is if
∀ x∈(a,b): f'(x) = g'(x)
then these functions are different only by a constant on this segment, that is
∃c:∀ x∈[a,b]: f(x)=g(x)+c.

Proof

Consider a new function
h(x)=f(x)−g(x).
Since derivatives of f(x) and g(x) are equal to each other, the derivative of h(x) equals zero:
h'(x) = f'(x)−g'(x)=0.
Now use the theorem above that states that if a smooth function h(x) has a derivative equal to zero at any inner point of a segment [a,b], then it is constant on this segment, that is
h(x) = c, where c=h(a)=h(b)
Therefore,
f(x)−g(x)=c for any x∈[a,b].

End of proof.

Important corollary
If we are given a derivative of some function and we have guessed an original function, from which this derivative was taken, we can say that any other function with the same derivative differs from the one we have guessed by a constant, and there are no other functions with this derivative.
For example, we can guess that, if the derivative is f'(x)=x², then the original function might be f(x)=x³/3. Now, based on the above theorem, we can state that the expression f(x)=x³/3+c, where c is any real number, describes all the functions that have derivative f'(x)=x², and no other function with this derivative exists.
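
One half of this corollary, that every function x³/3+c has derivative x², can be checked numerically with a difference quotient. This is an approximate check under assumptions: the step `h`, the sample points and the helper names are picked for the example only.

```python
def central_difference(func, x, h=1e-4):
    """Approximate func'(x) by the symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)


def make_antiderivative(c):
    """One member of the family x^3/3 + c."""
    return lambda x: x ** 3 / 3 + c


for c in (-5.0, 0.0, 2.5):
    f = make_antiderivative(c)
    slopes = [central_difference(f, x) for x in (-2.0, 0.5, 3.0)]
    print(c, slopes)  # each slope should be close to x^2, whatever c is
```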

## Friday, November 18, 2016

### Unizor Derivatives - Cauchy Theorem

Notes to a video lecture on http://www.unizor.com

Derivatives -
Cauchy Mean Value Theorem

Cauchy Mean Value Theorem

For two smooth functions f(x) and g(x), defined on segment [a,b] (including endpoints), there exists a point x0∈[a,b] such that the following is true:
f'(x0)/g'(x0) = [f(b)−f(a)]/[g(b)−g(a)]
(with obvious restrictions on denominators not equal to zero).

Proof

Proof of this theorem is based on Rolle's Theorem.
Consider a new function h(x) defined as:
h(x) = f(x) − g(x)·[f(b)−f(a)]/[g(b)−g(a)]

This function satisfies the conditions of Rolle's Theorem:
h(a) = f(a) − g(a)·[f(b)−f(a)]/[g(b)−g(a)] =
= [f(a)·g(b)−f(b)·g(a)]/[g(b)−g(a)]
h(b) = f(b) − g(b)·[f(b)−f(a)]/[g(b)−g(a)] =
= [f(a)·g(b)−f(b)·g(a)]/[g(b)−g(a)]

So, h(a) = h(b)

According to Rolle's Theorem, there exists point x0∈[a,b] such that
h'(x0) = 0

Let's find the derivative of function h(x) in terms of derivatives of the original functions f(x) and g(x):
h'(x) = f'(x) − g'(x)·[f(b)−f(a)]/[g(b)−g(a)]
Now the equality to 0 of the derivative of function h(x) at point x0 in terms of original functions f(x) and g(x) looks like this:
0 = f'(x0) − g'(x0)·[f(b)−f(a)]/[g(b)−g(a)]
from which follows
f'(x0)/g'(x0) = [f(b)−f(a)]/[g(b)−g(a)]

End of proof.
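
For a concrete pair of functions the point x0 can be located numerically. The sketch below assumes f(x)=x² and g(x)=x³ on segment [1,2] (an example chosen here, not taken from the lecture) and finds where the derivative of the auxiliary function h(x) changes sign, using a simple bisection helper named `find_sign_change`.

```python
def find_sign_change(func, left, right, tolerance=1e-12):
    """Bisection; assumes func is continuous with opposite signs at the ends."""
    f_left = func(left)
    while right - left > tolerance:
        middle = (left + right) / 2
        f_middle = func(middle)
        if f_left * f_middle <= 0:
            right = middle
        else:
            left, f_left = middle, f_middle
    return (left + right) / 2


a, b = 1.0, 2.0
f = lambda x: x ** 2
g = lambda x: x ** 3
f_prime = lambda x: 2 * x
g_prime = lambda x: 3 * x ** 2

ratio = (f(b) - f(a)) / (g(b) - g(a))                # right side of the theorem
h_prime = lambda x: f_prime(x) - g_prime(x) * ratio  # derivative of h(x)

x0 = find_sign_change(h_prime, a, b)
print(x0, f_prime(x0) / g_prime(x0), ratio)
```

For this pair the equation can also be solved by hand: 2/(3·x0) = 3/7 gives x0 = 14/9, which lies inside the segment [1,2].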

### Unizor - Derivatives - Lagrange Theorem

Notes to a video lecture on http://www.unizor.com

Derivatives -
Lagrange Mean Value Theorem

Lagrange Mean Value Theorem

For a smooth function f(x), defined on segment [a,b] (including endpoints), there exists a point x0∈[a,b] such that
f'(x0) = [f(b)−f(a)]/(b−a)

Geometrically, this theorem states that there is a point x0 inside segment [a,b], where a tangential line is parallel to a chord connecting endpoints of function f(x) on this segment.

Proof

Proof of this theorem is based on Rolle's Theorem.
Consider a new function g(x) that is equal to a difference between f(x) and a chord connecting two endpoints of a function on segment [a,b]:
g(x) = f(x) − {(x−a)·[f(b)−f(a)]/(b−a)+f(a)}

This function satisfies the conditions of Rolle's Theorem:
g(a) = f(a) − {(a−a)·[f(b)−f(a)]/(b−a)+f(a)} = f(a) − f(a) = 0
g(b) = f(b) − {(b−a)·[f(b)−f(a)]/(b−a)+f(a)} = f(b) − [f(b)−f(a)+f(a)] = 0
So, g(a) = g(b) = 0

According to Rolle's Theorem, there exists point x0∈[a,b] such that
g'(x0) = 0

Let's find the derivative of function g(x) in terms of derivative of the original function f(x):
g'(x) = f'(x) − [f(b)−f(a)]/(b−a)
Now the equality to 0 of the derivative of function g(x) at point x0 in terms of original function f(x) looks like this:
f'(x0) − [f(b)−f(a)]/(b−a) = 0 from which follows
f'(x0) = [f(b)−f(a)]/(b−a)

End of proof.
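
Here is a small worked instance in Python, assuming f(x)=x³ on segment [0,2] (an example picked for this note). For this function the equation f'(x0) = chord slope can be solved in closed form, so the code simply verifies the result.

```python
import math

a, b = 0.0, 2.0
f = lambda x: x ** 3
f_prime = lambda x: 3 * x ** 2

chord_slope = (f(b) - f(a)) / (b - a)  # slope of the chord, equals 4 here
x0 = math.sqrt(chord_slope / 3)        # positive solution of 3*x^2 = chord_slope

print(x0, f_prime(x0), chord_slope)
```

The point x0 = 2/√3 lies between a and b, and the tangent line at x0 has the same slope as the chord connecting the endpoints.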

### Unizor - Derivatives - Rolle Theorem

Notes to a video lecture on http://www.unizor.com

Derivatives - Rolle Theorem

Rolle Theorem

If a smooth function f(x), defined on segment [a,b] (including endpoints), has equal values at both endpoints, that is f(a)=f(b), then there exists such a point x0 ∈ [a,b] that its derivative at this point, f'(x0), equals zero:
f'(x0) = 0

Proof

Without pretending to be absolutely rigorous, the logical steps to prove this theorem might be as follows.

The function f(x) cannot be monotonically increasing because in this case its value at point x=b would be greater than that at point x=a.

Analogously, the function f(x) cannot be monotonically decreasing because in this case its value at point x=b would be less than that at point x=a.

Therefore, the function either is constant, in which case its derivative at any internal point in the interval (a,b) equals zero, or the function changes from increasing to decreasing or from decreasing to increasing behavior somewhere inside this interval.

Any point of change from increasing to decreasing is a local maximum, any point of change from decreasing to increasing is a local minimum. In both cases, according to Fermat's Theorem, the derivative must be equal to zero at a point of change.

End of proof.

### Unizor - Derivatives - Fermat Theorem

Notes to a video lecture on http://www.unizor.com

Derivatives - Fermat Theorem
(internal local extremums)

First of all, let's talk about terminology.

Internal local extremum is a term used to characterize the behavior of a function defined on some, maybe infinite, interval (a,b) without endpoints (to enable approach to any point of this interval from both sides without restrictions). That's why we use the word internal.

Next word that requires some clarification is local. This word is used to demonstrate that certain characteristics of a function can be observed at some point where it is defined and in the immediate neighborhood of this point. Thus, local maximum (minimum) is a point, where the value of a function is greater (less) than in any other point in some neighborhood of this point, even a very small one.

Finally, extremum is a word that means maximum or minimum. A point where local extremum is attained is called stationary point of a function.

Another important note is that we will consider only differentiable functions, those that have derivatives at each point. Moreover, we assume that these derivatives are continuous functions and, in some cases, differentiable themselves to obtain derivatives of the higher order.
Most functions considered in this course are of this type - polynomial, exponential, logarithmic, trigonometric functions and their combinations.

So, we will talk about local maximum or minimum of sufficiently smooth (in terms of differentiability) functions. This property of smoothness will be assumed by default, even if not explicitly specified.

Fermat Theorem

If a smooth function f(x), defined on interval (a,b), has a local extremum at point x0, then its derivative at this point, f'(x0), equals zero:
f'(x0) = 0

Geometrically, since a derivative equals the slope (the tangent of the angle of inclination) of the line tangent to the graph of a function, its equality to zero means that the tangent line is horizontal at a point of local maximum or minimum. The following picture illustrates this.

Proof

Let's consider local maximum first.
Intuitively, local maximum of function f(x) at point x0 means that within some narrow neighborhood of x0, on the left of x0, function f(x) monotonically increases and on the right of it - monotonically decreases.
As was demonstrated before, monotonically increasing functions have non-negative derivative and monotonically decreasing functions have non-positive derivative. That necessitates that at x0 the derivative, a continuous function, as we mentioned above, must be equal to zero.
Here is an illustration:

A little more rigorously, assume that the derivative f I(x0) is not equal to zero. Then it is either positive or negative.
As was demonstrated earlier, if a derivative is positive at some point, the function at this point and in the immediate neighborhood of it must be monotonically increasing, thus it cannot have a local maximum at this point (values on the left of this point are less than those on the right).
Similarly, if a derivative is negative at some point, the function at this point and in the immediate neighborhood of it must be monotonically decreasing, thus it cannot have a local maximum at this point (values on the left of this point are greater than those on the right).
Therefore, a derivative at this point must be equal to zero.

The proof for local minimum is absolutely similar to this.
It would be a nice exercise to write down a proof of it without looking at the proof for maximum offered above.

End of proof.
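As a quick numeric sanity check of the theorem, here is a minimal Python sketch. The function x³−3x, the helper name `derivative` and the step size `h` are assumptions chosen for illustration, not part of the lecture. The sketch estimates the derivative with a central difference at the local maximum x=−1 and the local minimum x=1 of this function, where the estimate should be close to zero.

```python
def derivative(f, x, h=1e-6):
    """Central-difference estimate of the derivative of f at x.

    The step h is an assumed value, small enough for this illustration.
    """
    return (f(x + h) - f(x - h)) / (2 * h)


def f(x):
    # Illustrative smooth function (an assumption): f(x) = x^3 - 3x
    return x**3 - 3 * x


# x = -1 is a local maximum and x = 1 is a local minimum of f
slope_at_max = derivative(f, -1.0)
slope_at_min = derivative(f, 1.0)

# x = 0 is not an extremum, so the derivative there need not vanish
slope_elsewhere = derivative(f, 0.0)

print(slope_at_max, slope_at_min, slope_elsewhere)
```

At x=0, which is not an extremum, the same estimate is close to −3 rather than zero.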

## Thursday, November 10, 2016

### Unizor - Derivatives - Easy Problems

Notes to a video lecture on http://www.unizor.com

Derivatives - Easy Problems

We recommend going through these easy problems before watching the lecture with their solutions.
We offer a simple proof of the first theorem as a sample.

Theorem 1

Assume function f(x) is differentiable (that is, has a derivative) at some point x0.
Symbolically, the following limit exists and equals some constant K:
lim_{x→x0} [f(x)−f(x0)]/(x−x0) = K
or, in a short form, setting
f(x)−f(x0) = Δf(x) and
x−x0 = Δx,
this looks as follows:
lim_{Δx→0} [Δf(x)/Δx] = K
Prove that f(x) is continuous at this point.

Proof

Given:
Δf(x)/Δx → K as Δx → 0.
Therefore,
β(x) = Δf(x)/Δx − K
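is an infinitesimal variable
as Δx → 0.
From this we derive that
Δf(x) = [K+β(x)]·Δx
is also an infinitesimal as Δx → 0, being a product of a bounded variable K+β(x) and an infinitesimal Δx.
In other words, f(x) → f(x0) as x → x0, which is the definition of continuity of function f(x) at point x0.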
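The argument can be traced numerically. The sketch below is only an illustration under assumed values: the function x², the point x0=3 and, consequently, K=6 are not from the lecture. It computes Δf(x) and β(x) for a shrinking Δx and shows both tending to zero.

```python
def f(x):
    # Illustrative function (an assumption for this sketch): f(x) = x^2
    return x * x


x0 = 3.0
K = 6.0  # derivative of x^2 at x0 = 3

rows = []
for n in range(1, 7):
    dx = 10.0 ** (-n)          # increment of the argument, Δx
    df = f(x0 + dx) - f(x0)    # increment of the function, Δf(x)
    beta = df / dx - K         # β(x) = Δf(x)/Δx − K
    rows.append((dx, beta, df))
    print(f"dx={dx:.0e}  beta={beta:.3e}  df={df:.3e}")
```

For this particular function β(x) equals Δx exactly (up to rounding), so both columns shrink together with Δx.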

Theorem 2

Prove that a differentiable function that monotonically increases in some interval has a non-negative derivative in this interval.

Theorem 3

Prove that a differentiable function that monotonically decreases in some interval has a non-positive derivative in this interval.

Theorem 4

Prove that, if a derivative of a differentiable function is positive in some interval, the function is monotonically increasing in this interval.

Theorem 5

Prove that, if a derivative of a differentiable function is negative in some interval, the function is monotonically decreasing in this interval.

## Tuesday, November 1, 2016

### Unizor - Derivatives - Compound Functions

Notes to a video lecture on http://www.unizor.com

Derivative Properties -
Compound Functions

Our purpose is to express the derivative of a compound function (a function of a function) in terms of derivatives of its components.

Assume that the real function f(x) is defined and differentiable (that is, its derivative exists) on a certain interval, with a derivative f I(x).
Assume further that the real function g(x) is defined and differentiable on a certain interval, with a derivative g I(x), and that its values fall within the domain of function f(x).
Then we can talk about a compound function h(x)=f(g(x)).

In other words, this compound function can be represented as
h(x)=f(y), where y=g(x).

Let's determine the derivative of this compound function.
The increment of function h(x)=f(g(x)) is
Δh(x) = h(x+Δx)−h(x) =
= f(g(x+Δx)) − f(g(x))

Using the definition of the increment of a function
g(x+Δx) = g(x) + Δg(x),
it would look like
Δh(x) = h(x+Δx)−h(x) =
= f(g(x)+Δg(x)) − f(g(x))

Recalling the representation y=g(x), we can write this as
Δh(x) = f(y+Δy) − f(y)
where y=g(x).

The next operations to find a derivative are dividing the function increment Δh(x) by the increment of the argument Δx and going to a limit as Δx→0.

Δh(x)/Δx =
= [f(y+Δy) − f(y)]/Δx =
= Δf(y)/Δx
where y=g(x).

To transform this into an expression that separates the functions combined in this compound function, multiply and divide it by Δy (assuming Δy≠0).
Δh(x)/Δx =
= [Δf(y)/Δy]·[Δy/Δx]

Notice that if Δx→0 then Δy→0, since function y=g(x) is differentiable and, therefore, continuous. In turn, since Δy→0, Δf(y)→0, because function f(y) is differentiable.

Therefore, if Δx→0,
[Δf(y)/Δy] → df(y)/dy
where y=g(x) and
[Δy/Δx] → dy/dx = dg(x)/dx

Finally, we came to an expression for a derivative of a compound function.
Δh(x)/Δx → [df(y)/dy]·[dy/dx]
In other words,
dh(x)/dx = [df(y)/dy]·[dy/dx]
where y=g(x).

In terms of original functions,
df(g(x))/dx =
= [df(y)/dy]·[dg(x)/dx]
where y=g(x).
This is a major formula of differentiation of a compound function.
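The formula can be checked numerically. The following is a minimal sketch, not a proof: the helper names `derivative`, `chain_rule` and `direct`, the step size `h`, and the sample functions f(y)=e^y and g(x)=x³+x are all assumptions made for illustration. It compares the product of the two component derivatives with a derivative of the compound function estimated directly.

```python
import math


def derivative(func, x, h=1e-6):
    """Central-difference estimate of the derivative (h is an assumed step)."""
    return (func(x + h) - func(x - h)) / (2 * h)


def chain_rule(f, g, x):
    """[df(y)/dy]·[dg(x)/dx], with the first factor taken at y = g(x)."""
    y = g(x)
    return derivative(f, y) * derivative(g, x)


def direct(f, g, x):
    """Derivative of the compound function h(x) = f(g(x)) estimated directly."""
    return derivative(lambda t: f(g(t)), x)


def f(y):
    # Illustrative outer function (an assumption): e^y
    return math.exp(y)


def g(x):
    # Illustrative inner function (an assumption): x^3 + x
    return x**3 + x


x = 0.7
print(chain_rule(f, g, x), direct(f, g, x))
```

The two printed numbers should agree to within the accuracy of the finite-difference estimate.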

Example 1
f(x)=cos(x); g(x)=x²
f(g(x))=cos(x²)
dcos(x²)/dx =
[dcos(y)/dy]·[d(x²)/dx] =
= [−sin(y)]·[2x]
where y=x².
Therefore,
dcos(x²)/dx = −2·x·sin(x²)

Example 2
f(x)=1/x; g(x)=cos(x)
f(g(x))=1/cos(x)=sec(x)
[sec(x)]I =
=[1/cos(x)]I =
=[1/y]I·[cos(x)]I =
=[−1/y²]·[−sin(x)]
where y=cos(x).
Therefore,
[sec(x)]I = sin(x)/cos²(x)

Example 3
f(x)=e^x; g(x)=sin(x)
f(g(x))=e^sin(x)
[e^sin(x)]I =
=[e^y]I·[sin(x)]I =
=[e^y]·[cos(x)]
where y=sin(x).
Therefore,
[e^sin(x)]I = e^sin(x)·cos(x)

Example 4
f(x)=x²; g(x)=sin(x)
f(g(x))=(sin(x))²=sin²(x)
[sin²(x)]I =
=[y²]I·[sin(x)]I =
=[2y]·[cos(x)]
where y=sin(x).
Therefore,
[sin²(x)]I = 2·sin(x)·cos(x)
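The four results above can be compared with numeric estimates. This is a rough sketch under assumptions: the helper `derivative`, the step size `h`, the dictionary `EXAMPLES` and the sample points are illustrative choices, not part of the lecture.

```python
import math


def derivative(func, x, h=1e-6):
    """Central-difference estimate of the derivative (h is an assumed step)."""
    return (func(x + h) - func(x - h)) / (2 * h)


# Each entry: name -> (compound function, derivative obtained in the example)
EXAMPLES = {
    "cos(x^2)": (lambda x: math.cos(x**2),
                 lambda x: -2 * x * math.sin(x**2)),
    "sec(x)": (lambda x: 1 / math.cos(x),
               lambda x: math.sin(x) / math.cos(x) ** 2),
    "e^sin(x)": (lambda x: math.exp(math.sin(x)),
                 lambda x: math.exp(math.sin(x)) * math.cos(x)),
    "sin^2(x)": (lambda x: math.sin(x) ** 2,
                 lambda x: 2 * math.sin(x) * math.cos(x)),
}


def max_errors(points=(0.3, 0.8, 1.2)):
    """Largest gap between formula and numeric estimate for each example."""
    errors = {}
    for name, (func, formula) in EXAMPLES.items():
        errors[name] = max(abs(derivative(func, x) - formula(x))
                           for x in points)
    return errors


for name, err in max_errors().items():
    print(f"{name:10s} max error = {err:.2e}")
```

The sample points are kept away from x=π/2, where sec(x) is not defined.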

### Unizor - Derivatives - Examples - Logarithmic Functions

Notes to a video lecture on http://www.unizor.com

Derivative Examples -
Logarithmic Functions

1. f(x) = ln(x)
(ln(x) is a natural logarithm with base e - a fundamental constant in Calculus, approximately equal to 2.718)

f I(x) = 1/x

Proof
The function increment is
ln(x+Δx)−ln(x) =
ln[(x+Δx)/x] =
ln(1+Δx/x)

Now we can use an amazing limit
(1+x)^(1/x) → e as x→0
where e is the same fundamental constant as above.
Based on this,
ln[(1+x)^(1/x)] → ln(e) as x→0
Using the properties of logarithms, we can transform it into
[ln(1+x)]/x → 1 as x→0
(that is, x is an infinitesimal variable)
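Both limits can be observed numerically. The sketch below is only an illustration: the function names `power_form` and `log_form` and the sample values of x are assumptions, not part of the lecture.

```python
import math


def power_form(x):
    """(1 + x) raised to the power 1/x, expected to approach e as x -> 0."""
    return (1 + x) ** (1 / x)


def log_form(x):
    """ln(1 + x) divided by x, expected to approach 1 as x -> 0."""
    return math.log(1 + x) / x


for n in range(1, 7):
    x = 10.0 ** (-n)
    print(f"x={x:.0e}  (1+x)^(1/x)={power_form(x):.8f}  "
          f"ln(1+x)/x={log_form(x):.8f}")

print("e =", math.e)
```

The smaller x gets, the closer the first column comes to e and the second to 1.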

Let's use this property in calculation of our derivative.
f I(x) =
= lim_{Δx→0} [ln(1+Δx/x)]/Δx =
(substitute δ=Δx/x)
= lim_{δ→0} [ln(1+δ)]/(x·δ) =
= {lim_{δ→0} [ln(1+δ)]/δ}/x

As we noted above,
[ln(1+x)]/x → 1 as x→0
In our case the role of infinitesimal x→0 is played by variable δ.
Therefore,
lim_{δ→0} [ln(1+δ)]/δ = 1
from which we conclude
f I(x) = 1/x

2. f(x) = log_b(x)

f I(x) = 1/[x·ln(b)]

Proof

We will use the following property of logarithms that allows us to change the base:
log_b(x) = log_c(x)/log_c(b)

Using this, we first convert log_b(x) into a natural logarithm with base e:
log_b(x) = ln(x)/ln(b)

Now we see that function log_b(x) differs from function ln(x) only by a constant factor 1/ln(b).
Therefore, considering the expression for a derivative of ln(x),
f I(x) = 1/[x·ln(b)]
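Both logarithmic derivatives can be checked against a numeric estimate. This is a minimal sketch: the helper names, the step size `h` and the sample values of x and b are assumptions made for illustration. The natural logarithm is covered as the special case b=e.

```python
import math


def derivative(func, x, h=1e-6):
    """Central-difference estimate of the derivative (h is an assumed step)."""
    return (func(x + h) - func(x - h)) / (2 * h)


def log_derivative(x, b):
    """Formula from the text: 1/[x·ln(b)]."""
    return 1 / (x * math.log(b))


def numeric_log_derivative(x, b):
    """Numeric estimate of the derivative of the logarithm of x with base b."""
    return derivative(lambda t: math.log(t, b), x)


for x, b in [(2.5, 10), (0.5, 2), (3.0, math.e)]:
    print(f"x={x}, b={b:.4f}: formula={log_derivative(x, b):.8f}  "
          f"numeric={numeric_log_derivative(x, b):.8f}")
```

With b=e the factor 1/ln(b) equals 1, so the formula reduces to 1/x.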