## Monday, September 16, 2024

### Matrices+ 02 - Eigenvalues: UNIZOR.COM - Math+ & Problems - Matrices

Notes to a video lecture on http://www.unizor.com

Matrices+ 02
Matrix Eigenvalues

The concepts addressed in this lecture for two-dimensional real case are as well applicable to N-dimensional spaces and even to real or complex abstract vector spaces with linear transformations defined there.
Presentation in a two-dimensional real space is chosen for its relative simplicity and easy exemplification.

Let's consider a 2⨯2 matrix A as a linear operator in the two-dimensional Euclidean vector space. In other words, multiplication of any vector v on a coordinate plane by this 2⨯2 matrix A linearly transforms it into another vector on the plane w=A·v.
Assume matrix A is
A = ||5 6; 6 10||
(a matrix is written here row by row inside double bars, with rows separated by a semicolon).

Let's see how this linear operator works, if applied to different vectors.

We will use a row-vector notation for compactness; in the transformation examples below the vectors should be understood as column-vectors.
We will enclose the coordinates of our vectors in double bars, like matrices, because a row-vector is a matrix with only one row, and a column-vector is a matrix with only one column.

Our first example of a vector to apply this linear transformation is v=||1,1||.
||5 6; 6 10|| · ||1,1|| = ||11,16||
Obviously, the resulting vector w=||11,16|| and the original one v=||1,1|| are not collinear.

Applied to a different vector v=||3,−2||, we obtain somewhat unexpected result
||5 6; 6 10|| · ||3,−2|| = ||3,−2||
Interestingly, the resulting vector w=||3,−2|| and the original one are the same. So, this operator leaves this particular vector in place. In other words, it retains the direction of this vector and multiplies its magnitude by a factor of 1.

Finally, let's apply our operator to a vector v=||2,3||.
||5 6; 6 10|| · ||2,3|| = ||28,42||
Notice, the resulting vector w=||28,42|| is the original one v=||2,3|| multiplied by 14. So, this operator transforms this particular vector to a collinear one, just longer in magnitude by a factor of 14.

As we see, for this particular matrix we found two vectors that, if transformed by this matrix as by a linear operator, retain their direction, while changing their magnitude by some factor.
These vectors are called eigenvectors. For each eigenvector there is a factor that characterizes the change in its magnitude if this matrix acts on it as an operator. This factor is called eigenvalue. This eigenvalue in the example above was 1 for v=||3,−2||, and 14 for v=||2,3||.
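These eigen-relations are easy to check numerically. Below is a short Python sketch (the helper `mat_vec` is ours, introduced only for this illustration, not part of the lecture):

```python
# Multiply a 2x2 matrix (list of rows) by a vector [v1, v2].
# This helper is introduced only for this illustration.
def mat_vec(A, v):
    return [A[0][0]*v[0] + A[0][1]*v[1],
            A[1][0]*v[0] + A[1][1]*v[1]]

A = [[5, 6], [6, 10]]

# A generic vector changes direction: [1,1] goes to [11,16].
assert mat_vec(A, [1, 1]) == [11, 16]

# The two special vectors only get scaled:
assert mat_vec(A, [3, -2]) == [3, -2]     # factor (eigenvalue) 1
assert mat_vec(A, [2, 3]) == [28, 42]     # 14*[2,3], factor (eigenvalue) 14
```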

There are some questions one might ask.
1. Are there always some particular vectors that retain the direction if transformed by some particular matrix?
2. If yes, how to find them and how to find the corresponding multiplication factors?
3. How many such vectors exist, if any?
4. How to find all the multiplication factors for a particular matrix transformation?

Let's analyze the linear transformation by a matrix that leaves the direction of a vector without change, just changes the magnitude by some factor λ.

Assume, we have a matrix A=||ai,j||, where i,j∈{1,2}, in our two-dimensional Euclidean space.
This matrix converts any vector v=||v1,v2|| into some other vector, but we are looking for such vector v that is converted by this matrix into a collinear one.
If matrix A transforms vector v to a collinear one with the magnitude of the original one multiplied by a factor λ, the following matrix equation must hold
A·v = λ·v
or in coordinate form
||a1,1 a1,2; a2,1 a2,2|| · ||v1,v2|| = λ·||v1,v2||
which is equivalent to
(a1,1−λ)·v1 + a1,2·v2 = 0
a2,1·v1 + (a2,2−λ)·v2 = 0

This is a system of two linear equations with three unknowns λ, v1 and v2.

One trivial solution would be v1=0 and v2=0, in which case λ can take any value.
This is not a case worthy of analyzing.

If the matrix of coefficients of this system has a non-zero determinant, this trivial solution would be the only one.
Therefore, if we are looking for a non-trivial solution, the matrix's determinant must be zero, which gives a specific condition on the value of λ.

Therefore, a necessary condition for existence of other than null-vector v is
(a1,1−λ)·(a2,2−λ) − a1,2· a2,1 = 0
or
λ² − (a1,1+a2,2)·λ + a1,1·a2,2−a1,2·a2,1 = 0

Since we are looking for real values of λ, we have to examine the discriminant D of this quadratic equation.
D = (a1,1+a2,2)² − 4·(a1,1·a2,2−a1,2·a2,1) =
= (a1,1−a2,2)² + 4·a1,2·a2,1

If D is negative, there are no real solutions for λ.
If D is zero, there is one real solution for λ.
If D is positive, there are two real solutions for λ.
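This case analysis can be sketched in Python by solving the characteristic quadratic directly (the function name is ours):

```python
import math

# Real eigenvalues of a 2x2 matrix from the quadratic equation derived above:
# lambda^2 - (a11 + a22)*lambda + (a11*a22 - a12*a21) = 0
def eigenvalues_2x2(a11, a12, a21, a22):
    trace = a11 + a22
    det = a11*a22 - a12*a21
    D = trace*trace - 4*det        # equals (a11 - a22)^2 + 4*a12*a21
    if D < 0:
        return []                  # no real eigenvalues
    if D == 0:
        return [trace / 2]         # one real eigenvalue
    r = math.sqrt(D)
    return [(trace - r) / 2, (trace + r) / 2]

assert eigenvalues_2x2(5, 6, 6, 10) == [1.0, 14.0]   # the example matrix
assert eigenvalues_2x2(0, -1, 1, 0) == []            # 90-degree rotation: D < 0
```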

Consider now that we have determined λ and would like to find vectors transformed into collinear ones by matrix A with this exact factor of change in magnitude.

If some vector v=||v1,v2|| that is transformed into a collinear one with a factor λ exists, vector s·v, where s is any real non-zero number, would have exactly the same quality because of associativity and commutativity of multiplication by a scalar.
A·(s·v) = (A·s)·v = (s·A)·v =
= s·(A·v) = s·(λ·v) = λ·(s·v)

Therefore, we don't need to determine exact values v1 and v2, we just need to determine only the direction of vector v=||v1,v2||, and this direction is determined by the factor v1/v2 or v2/v1 (to cover all cases, when one of them might be zero).

If v2≠0, the directions of a vector v and that of vector ||v1/v2,1|| are the same.
If v1≠0, the directions of a vector v and that of vector ||1,v2/v1|| are the same.

From this follows that, firstly, we can search for eigenvectors among those with v2≠0, restricting our search to vectors ||x=v1/v2,1||.
Then we can search for eigenvectors among those with v1≠0, restricting our search to vectors ||1,x=v2/v1||.
In both cases we will have to solve a system of two linear equations with two unknowns λ and x.

Searching for vectors ||x,1||
In this case the matrix equation that might deliver the required vector looks like this
||a1,1 a1,2; a2,1 a2,2|| · ||x,1|| = λ·||x,1||
Performing the matrix by vector multiplication on the left side and scalar by vector on the right side and equating each component, we obtain a system of two equations with two unknowns - λ and x:
a1,1·x+a1,2·1 = λ·x
a2,1·x+a2,2·1 = λ·1

Take the right side of the second equation λ and substitute into the right side of the first equation, obtaining a quadratic equation for x:
a1,1·x+a1,2 = (a2,1·x+a2,2)·x
or
a2,1·x² + (a2,2−a1,1)·x − a1,2 = 0
The two solutions x1,2 of this equation, assuming they are real values, produce two vectors ||x1,1|| and ||x2,1||, each of which satisfies the condition of collinearity after the matrix transformation.
Generally speaking, the factor λ will be different for each such vector.
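The procedure just derived can be sketched in Python for the ||x,1|| family (helper name is ours; the degenerate case a2,1=0 is simply skipped here):

```python
import math

# Directions ||x,1|| of vectors kept collinear by the matrix:
# solve a21*x^2 + (a22 - a11)*x - a12 = 0, as derived above.
def eigen_directions(a11, a12, a21, a22):
    a, b, c = a21, a22 - a11, -a12
    if a == 0:
        return []                  # degenerate case: not a quadratic equation
    D = b*b - 4*a*c
    if D < 0:
        return []                  # no real solutions, no such directions
    r = math.sqrt(D)
    return [(-b - r) / (2*a), (-b + r) / (2*a)]

# For the example matrix ||5 6; 6 10||: eigenvectors ||-3/2,1|| and ||2/3,1||
assert eigen_directions(5, 6, 6, 10) == [-1.5, 2/3]
```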

Searching for vectors ||1,x||
In this case the matrix equation that might deliver the required vector looks like this
||a1,1 a1,2; a2,1 a2,2|| · ||1,x|| = λ·||1,x||
Performing the matrix by vector multiplication on the left side and scalar by vector on the right side and equating each component, we obtain a system of two equations with two unknowns - λ and x:
a1,1·1+a1,2·x = λ·1
a2,1·1+a2,2·x = λ·x

Take the right side of the first equation λ and substitute into the right side of the second equation, obtaining a quadratic equation for x:
a2,1·1+a2,2·x = (a1,1·1+a1,2·x)·x
or
a1,2·x² + (a1,1−a2,2)·x − a2,1 = 0
The two solutions x1,2 of this equation, assuming they are real values, produce two vectors ||1,x1|| and ||1,x2||, each of which satisfies the condition of collinearity after the matrix transformation.
Generally speaking, the factor λ will be different for each such vector.

Once again, let's emphasize important definitions.
Vectors transformed into collinear ones by a matrix of transformation are called eigenvectors or characteristic vectors for this matrix.
The factor λ corresponding to some eigenvector is called eigenvalue or characteristic value of the matrix and this eigenvector.

Let's determine eigenvectors and eigenvalues for matrix
A = ||5 6; 6 10||
used as an example above.

The quadratic equation to determine the multiplier λ for this matrix is
λ² − (a1,1+a2,2)·λ + a1,1·a2,2−a1,2·a2,1 = 0

which amounts to
λ² − 15λ + 14 = 0
with solutions
λ1 = 1 and λ2 = 14

Let's find the eigenvectors of this matrix.
The quadratic equation for eigenvectors of type ||x,1|| is
6x² + (10−5)x − 6 = 0 or
6x² + 5x − 6 = 0
Solutions are
x1,2 = (1/12)·(−5±√(25+4·36)) =
= (1/12)·(−5±13)

Therefore,
x1 = 2/3
x2 = −3/2
Two eigenvectors are:
v1 = ||2/3,1|| which is collinear to vector ||2,3|| used in the example above and
v2 = ||−3/2,1|| which is collinear to vector ||3,−2|| used in the example above.

The matrix transformation of these eigenvectors are
||5 6; 6 10|| · ||2/3,1|| = ||28/3,14||
But the resulting vector ||28/3,14|| equals to 14·||2/3,1||, which means that eigenvector ||2/3,1|| has eigenvalue 14.
||5 6; 6 10|| · ||−3/2,1|| = ||−3/2,1||
But the resulting vector ||−3/2,1|| equals to eigenvector ||−3/2,1||, which means that eigenvector ||−3/2,1|| has eigenvalue 1.

Not surprisingly, both eigenvectors found above have eigenvalues already found (1 and 14).

The quadratic equation for eigenvectors of type ||1,x|| is
6x² + (5−10)x − 6 = 0 or
6x² − 5x − 6 = 0
Solutions are
x1,2 = (1/12)·(5±√(25+4·36)) =
= (1/12)·(5±13)

Therefore,
x1 = 3/2
x2 = −2/3
Two eigenvectors are:
v1 = ||1,3/2|| which is collinear to vector ||2,3|| used in the example above and
v2 = ||1,−2/3|| which is collinear to vector ||3,−2|| used in the example above.
So, we did not gain any new eigenvalues by searching for vectors of a form ||1,x||.

The above calculations showed that for a given matrix we have two eigenvectors, each with its own eigenvalue.

Based on these calculations, we can now answer the questions presented before.

Q1. Are there always some particular vectors that retain the direction if transformed by some particular matrix?
A1. Not always, but only if the quadratic equations for x
a2,1·x² + (a2,2−a1,1)·x − a1,2 = 0
and
a1,2·x² + (a1,1−a2,2)·x − a2,1 = 0
where ||ai,j|| (i,j∈{1,2}) is a matrix of transformation, have real solutions.

Q2. If yes, how to find them?
A2. Solve the quadratic equations above and, for each real solution x of the first equation, vector ||x,1|| is an eigenvector and, for each real solution x of the second equation, vector ||1,x|| is an eigenvector. Then apply the matrix of transformation to each eigenvector ||x,1|| or ||1,x|| and compare the result with this vector. It should be equal to some eigenvalue λ multiplied by this eigenvector.

Q3. How many such vectors exist, if any?
A3. As many as there are real solutions to the quadratic equations above, but no more than two.
Incidentally, in the three-dimensional case our equations will be polynomials of the 3rd degree, and the number of solutions will be restricted to three.
In the N-dimensional case this maximum number will be N.

Q4. How to find all the multiplication factors for a particular matrix transformation?
A4. Solve the quadratic equation for λ
λ² − (a1,1+a2,2)·λ + a1,1·a2,2−a1,2·a2,1 = 0
which can have 0, 1 or 2 real solutions.

The concept of eigenvectors and eigenvalues (characteristic vectors and characteristic values) can be extended to N-dimensional Euclidean vector spaces and even to abstract vector spaces, like, for example, a set of all real functions integrable on a segment [0,1].
The detailed analysis of these cases is, however, beyond the current course, which is aimed, primarily, at introducing advanced concepts.

Problem A
Determine the conditions under which a diagonal matrix (only elements along the main diagonal are not zero) has eigenvalues.

Solution A
Matrix of transformation A=||ai,j|| has zeros for i≠j.
So, it looks like this
||a1,1 0; 0 a2,2||

The equation for eigenvalues in this (a1,2=a2,1=0) case is
λ² − (a1,1+a2,2)·λ + a1,1·a2,2 = 0
with immediately obvious solutions
λ1=a1,1 and λ2=a2,2
So, the values along the main diagonal of a diagonal matrix are the eigenvalues of this matrix.

Determine the eigenvectors now among vectors ||x,1||.
Original quadratic equation for this case is
a2,1·x² + (a2,2−a1,1)·x − a1,2 = 0

With a2,1=a1,2=0 it looks simpler:
(a2,2−a1,1)·x = 0
From this we conclude that, if a2,2≠a1,1, the only solution is x=0, so our eigenvector is ||0,1||.
The eigenvalue for this eigenvector is a2,2.
If a2,2=a1,1, any x is good enough, so any vector is an eigenvector.

Determine the eigenvectors now among vectors ||1,x||.
Original quadratic equation for this case is
a1,2·x² + (a1,1−a2,2)·x − a2,1 = 0

With a2,1=a1,2=0 it looks simpler:
(a1,1−a2,2)·x = 0
From this we conclude that, if a2,2≠a1,1, the only solution is x=0, so our eigenvector is ||1,0||.
The eigenvalue for this eigenvector is a1,1.
If a2,2=a1,1, any x is good enough, so any vector is an eigenvector.

If matrix of transformation is diagonal
||a1,1 0; 0 a2,2||
and a2,2≠a1,1,
the two eigenvectors are base unit vectors and the eigenvalues are a1,1 for base unit vector ||1,0|| and a2,2 for base unit vector ||0,1||.
In the case of a1,1=a2,2 any vector is an eigenvector with eigenvalue a1,1.
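The conclusion of Problem A is easy to verify numerically; a sketch with a sample diagonal matrix (the helper `mat_vec` and the sample numbers are ours):

```python
def mat_vec(A, v):
    # 2x2 matrix (list of rows) times a vector [v1, v2]
    return [A[0][0]*v[0] + A[0][1]*v[1],
            A[1][0]*v[0] + A[1][1]*v[1]]

# Sample diagonal matrix with a11 = 7, a22 = 3 (our choice, a11 != a22)
D = [[7, 0], [0, 3]]
assert mat_vec(D, [1, 0]) == [7, 0]   # base unit vector ||1,0||: eigenvalue 7
assert mat_vec(D, [0, 1]) == [0, 3]   # base unit vector ||0,1||: eigenvalue 3
```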

Problem B
Prove that a symmetrical matrix always has real eigenvalues.

Solution B

Matrix of transformation A=||ai,j|| is symmetrical, which means a1,2=a2,1.

Recall that a necessary condition for existence of real eigenvalues λ is
(a1,1−λ)·(a2,2−λ) − a1,2· a2,1 = 0
or
λ² − (a1,1+a2,2)·λ + a1,1·a2,2−a1,2·a2,1 = 0

Since we are looking for real values of λ, we have to examine the discriminant D of this quadratic equation.
D = (a1,1−a2,2)²+4·a1,2·a2,1
Since a1,2=a2,1, their product is non-negative, which makes the whole discriminant non-negative.
If D is zero, there is one real solution for λ.
If D is positive, there are two real solutions for λ.
So, one or two solutions always exist.
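The non-negativity of the discriminant for symmetrical matrices can also be spot-checked numerically (a sketch over random symmetric 2⨯2 matrices; the numbers are arbitrary):

```python
import random

# For a symmetrical matrix a12 == a21, so
# D = (a11 - a22)^2 + 4*a12*a21 = (a11 - a22)^2 + (2*a12)^2 >= 0.
random.seed(0)
for _ in range(1000):
    a11 = random.uniform(-10, 10)
    a22 = random.uniform(-10, 10)
    a12 = random.uniform(-10, 10)        # and a21 = a12
    D = (a11 - a22)**2 + 4 * a12 * a12
    assert D >= 0                        # real eigenvalues always exist
```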

## Thursday, August 29, 2024

### Matrices+ 01 - Matrix as Operator: UNIZOR.COM - Math+ & Problems - Matrices

Notes to a video lecture on http://www.unizor.com

Matrices+ 01
Matrix as Operator

In the beginning matrices were just tables of real numbers with m rows and n columns; we called them m⨯n matrices.
We knew how to add two matrices with the same numbers of rows and columns, how to multiply a matrix by a scalar number and how to multiply an m⨯n matrix by an n⨯k matrix.

Here is a short recap.

Addition of two matrices
Let Am⨯n be an m⨯n matrix with elements ai,j, where the first index signifies the row where this element is located and the second index is a column.
Let Bm⨯n be another m⨯n matrix with elements bi,j.
Then a new matrix C=A+B is an m⨯n matrix with elements ci,j=ai,j+bi,j.

Multiplication of a matrix by a scalar
Let s be any real number (scalar) and Am⨯n be an m⨯n matrix with elements ai,j.
Multiplication of s by Am⨯n produces a new matrix Bm⨯n with elements bi,j=s·ai,j.
Obviously, since the product of numbers is commutative and associative, the product of a matrix by a scalar is commutative, and the product of a matrix by two scalars is associative.

Product of two matrices
Let Am⨯n be an m⨯n matrix with elements ai,j.
Let Bn⨯k be an n⨯k matrix with elements bi,j.
It's important for a definition of a product of matrices that the number of columns in the first matrix Am⨯n equals the number of rows in the second Bn⨯k (in our case this number is n).
The product Cm⨯k of Am⨯n and Bn⨯k is an m⨯k matrix with elements cp,q calculated as sums of products of n elements of the p-th row of matrix Am⨯n by n elements of the q-th column of matrix Bn⨯k
∀p∈[1,m], ∀q∈[1,k]:
cp,q = Σ[1≤t≤n] ap,t·bt,q
It's important to note that, generally speaking, multiplication of two matrices IS NOT commutative, see Problem C below.
The product of three matrices, however, is associative, see Problem B below.
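The definition of the product translates into a short Python sketch (the helper `mat_mul` is ours; matrices are lists of rows):

```python
# Product of an m x n matrix A by an n x k matrix B, following
# c[p][q] = sum over t of a[p][t]*b[t][q].
def mat_mul(A, B):
    m, n, k = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "columns of A must equal rows of B"
    return [[sum(A[p][t] * B[t][q] for t in range(n)) for q in range(k)]
            for p in range(m)]

# A 2x3 matrix times a 3x2 matrix gives a 2x2 matrix:
A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8], [9, 10], [11, 12]]
assert mat_mul(A, B) == [[58, 64], [139, 154]]
```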

For our purposes we will only consider "square" matrices with the same number of rows and columns n and the n-dimensional Euclidean vector space of sequences of n real numbers organized in a row (row-vector, which can be viewed as a 1⨯n matrix) or in a column (column-vector, which can be viewed as an n⨯1 matrix).

What happens if we multiply an n⨯n matrix by a column-vector, which we can consider as an n⨯1 matrix, according to the rules of multiplication of two matrices?
In theory, the multiplication is possible because the number of columns in the first matrix n equals the number of rows in the second one (the same n) and, according to the rules of multiplication, the resulting matrix should have the number of rows of the first matrix n and the number of columns of the second one, that is 1. So, the result is an n⨯1 matrix, that is a column-vector.

As we see, multiplication of our n⨯n matrix by a column-vector with n rows results in another column-vector with n rows.
In other words, multiplication by an n⨯n matrix on the left transforms one column-vector into another; that is, this multiplication represents an operation in the n-dimensional vector space of column-vectors. The n⨯n matrix itself, therefore, acts as an operator in the vector space of column-vectors of n components.

Similar operations can be considered with row-vectors and their multiplication by matrices. The only difference is the order of this multiplication.
In the case of column-vectors we multiplied an n⨯n matrix by an n⨯1 column-vector, getting another n⨯1 column-vector.
In the case of row-vectors, we change the order and multiply a 1⨯n row-vector by an n⨯n matrix on the right, getting another 1⨯n row-vector.
Therefore, an n⨯n matrix can be considered an operator in a vector space of row-vectors, if we apply the multiplication by matrix from the right.

Let's examine the properties of the transformation of n⨯1 column-vectors by multiplying them by n⨯n matrices.

Problem A
Prove that multiplication by an n⨯n matrix is a linear transformation in the n-dimensional Euclidean space ℝⁿ of n⨯1 column-vectors.
(a) ∀u∈ℝⁿ, ∀K∈ℝ, ∀An⨯n:
A·(K·u) = K·(A·u) = (K·A)·u
(b) ∀u,v∈ℝⁿ, ∀An⨯n:
A·(u+v) = A·u + A·v
(c) ∀u∈ℝⁿ, ∀An⨯n,Bn⨯n:
(A+B)·u = A·u + B·u

Hint A
The proof is easily obtained straight from the definition of matrix multiplications.

Problem B
Prove that consecutive multiplication by two n⨯n matrices is associative in the n-dimensional Euclidean space ℝⁿ of n⨯1 column-vectors.
∀u∈ℝⁿ, ∀An⨯n,Bn⨯n:
A·(B·u) = (A·B)·u

Hint B
It's all about changing the order of summation.
Let's demonstrate it for n=2.

Given two matrices A and B and a column-vector u.

Matrix A
||a1,1 a1,2; a2,1 a2,2||

Matrix B
||b1,1 b1,2; b2,1 b2,2||

Column-vector u
||u1,u2||

Let v=B·u and w=A·(B·u)=A·v
Then column-vector v's components are
v1 = b1,1·u1 + b1,2·u2
v2 = b2,1·u1 + b2,2·u2

The components of vector w are
w1 = a1,1·v1 + a1,2·v2 =
= a1,1·(b1,1·u1 + b1,2·u2) +
+ a1,2·(b2,1·u1 + b2,2·u2) =
= a1,1·b1,1·u1 + a1,1·b1,2·u2 +
+ a1,2·b2,1·u1 + a1,2·b2,2·u2 =
= (a1,1·b1,1 + a1,2·b2,1)·u1 +
+ (a1,1·b1,2 + a1,2·b2,2)·u2

w2 = a2,1·v1 + a2,2·v2 =
= a2,1·(b1,1·u1 + b1,2·u2) +
+ a2,2·(b2,1·u1 + b2,2·u2) =
= a2,1·b1,1·u1 + a2,1·b1,2·u2 +
+ a2,2·b2,1·u1 + a2,2·b2,2·u2 =
= (a2,1·b1,1 + a2,2·b2,1)·u1 +
+ (a2,1·b1,2 + a2,2·b2,2)·u2

Let's calculate the product of two matrices A·B
Matrix A·B:
||a1,1·b1,1+a1,2·b2,1 a1,1·b1,2+a1,2·b2,2; a2,1·b1,1+a2,2·b2,1 a2,1·b1,2+a2,2·b2,2||

Notice that coefficients at u1 and u2 in expression for w1 above are the same as elements of the table (A·B) of the first row.
Analogously, coefficients at u1 and u2 in expression for w2 above are the same as elements of the table (A·B) of the second row.
That means that, if we multiply matrix (A·B) by a column-vector u, we will get the same vector w as above.
That proves the associativity for a two-dimensional case.
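The two-dimensional associativity argument above can be repeated numerically; a minimal sketch with helpers of our own (the sample numbers are arbitrary):

```python
# Helpers for 2x2 matrices stored as lists of rows (ours, for illustration).
def mat_mul(A, B):
    return [[sum(A[p][t] * B[t][q] for t in range(2)) for q in range(2)]
            for p in range(2)]

def mat_vec(A, u):
    return [sum(A[p][t] * u[t] for t in range(2)) for p in range(2)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
u = [1, -1]

# A·(B·u) equals (A·B)·u, as proved above
assert mat_vec(A, mat_vec(B, u)) == mat_vec(mat_mul(A, B), u)
```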

Problem C
Prove that consecutive multiplication by two n⨯n matrices, generally speaking, IS NOT commutative.
That is, in general,
(A·B)·u ≠ (B·A)·u

Proof C
To prove it, it is sufficient to present a particular case when a product of two matrices is not commutative.

Consider matrix A
||1 2; 3 4||

Matrix B
||−1 2; −3 4||

Then matrix A·B will be
||−7 10; −15 22||

Reverse multiplication B·A will be
||5 6; 9 10||

As you see, matrices A·B and B·A are completely different, and, obviously, their product with most vectors will produce different results.
Take a vector with components (1,0), for example. Matrix A·B will transform it into (−7,−15), while matrix B·A will transform it into (5,9).
End of Proof.
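The counterexample can be verified numerically with the same kind of helpers (ours, not part of the lecture):

```python
def mat_mul(A, B):
    # 2x2 matrix product, c[p][q] = sum of a[p][t]*b[t][q]
    return [[sum(A[p][t] * B[t][q] for t in range(2)) for q in range(2)]
            for p in range(2)]

def mat_vec(A, u):
    return [sum(A[p][t] * u[t] for t in range(2)) for p in range(2)]

A = [[1, 2], [3, 4]]
B = [[-1, 2], [-3, 4]]
AB = mat_mul(A, B)
BA = mat_mul(B, A)
assert AB == [[-7, 10], [-15, 22]]
assert BA == [[5, 6], [9, 10]]
assert AB != BA                          # the product is not commutative
assert mat_vec(AB, [1, 0]) == [-7, -15]  # and the operators differ on (1,0)
assert mat_vec(BA, [1, 0]) == [5, 9]
```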

## Monday, August 26, 2024

### Vectors+ 11 - 2D Vector · Complex scalar: UNIZOR.COM - Math+ & Problems ...

Notes to a video lecture on http://www.unizor.com

Vectors+ 11
2D Vector · Complex scalar

We know that a two-dimensional vector multiplied by a real scalar changes its length proportionally, but the direction remains the same for a positive multiplier or changes to opposite for a negative multiplier.
Let's examine how vectors on a two-dimensional plane are affected if multiplied by a non-zero complex scalar.

Our first task is to define a multiplication operation of a vector by a complex number.
To accomplish this, let's recall that any complex number a+i·b, where both a and b are real numbers and i²=−1, can be represented by a vector in a two-dimensional Euclidean space (that is, on the coordinate plane) with abscissa a and ordinate b.
This representation establishes one-to-one correspondence between vectors on a plane and complex numbers.

Using this representation, let's define an operation of multiplication of a vector on a coordinate plane {a,b} by a complex number z=x+i·y as follows:
1. Find a complex number that corresponds to our vector. So, if a vector has coordinates {a,b}, consider a complex number a+i·b.
2. Multiply this complex number by a multiplier z=x+i·y using the rules of multiplication of complex numbers. This means
(a+i·b)·z = (a+i·b)·(x+i·y) =
= a·x+a·i·y+i·b·x+i·b·i·y =
= (a·x−b·y) + i·(a·y+b·x)

3. Find the vector that corresponds to the result of multiplication of two complex numbers in the previous step. This vector should have abscissa a·x−b·y and ordinate a·y+b·x.
4. Declare the vector in step 3 as a result of an operation of multiplication of the original vector {a,b} by a complex multiplier z=x+i·y. So, the result of multiplication of vector {a,b} by a complex multiplier z=x+i·y is a vector {a·x−b·y,a·y+b·x}
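The four steps can be condensed into a few lines of Python, whose built-in complex type performs step 2 (the function name and sample numbers are ours):

```python
# Multiply a plane vector {a,b} by a complex scalar z = x + i*y,
# following steps 1-4: represent, multiply, convert back.
def mul_by_complex(vec, z):
    a, b = vec
    w = complex(a, b) * z          # step 2: (a + i*b)*(x + i*y)
    return (w.real, w.imag)        # steps 3-4: {a*x - b*y, a*y + b*x}

# Sample vector {2,1} and multiplier 3 + 4i (our choice of numbers):
a, b, x, y = 2, 1, 3, 4
assert mul_by_complex((a, b), complex(x, y)) == (a*x - b*y, a*y + b*x)  # (2, 11)
```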

Now we have to examine the geometric aspect of this operation.
For this, let's represent a multiplier z=x+i·y as
z = √(x²+y²)·(x/√(x²+y²) + i·y/√(x²+y²))

Two numbers, x/√(x²+y²) and y/√(x²+y²), are both in the range from −1 to 1, and the sum of their squares equals 1.
Find an angle φ such that
cos(φ)=x/√(x²+y²) and
sin(φ)=y/√(x²+y²).
Now the multiplier looks like
z = |z|·[cos(φ) + i·sin(φ)]
where |z| = √(x²+y²)

Using this representation, the product of a vector {a,b} by multiplier z=x+i·y looks like
{a·x−b·y,a·y+b·x} = {a',b'}
where
a' = |z|·[a·cos(φ)−b·sin(φ)] and
b' = |z|·[a·sin(φ)+b·cos(φ)]

The geometric meaning of the transformation from vector {a,b} to vector {a',b'} is a rotation of the vector by angle φ combined with a change of its length by a factor of |z|.
Here is why.
The length of a vector {a,b} is L=√a²+b².
If the angle this vector makes with an X-axis is α, the abscissa and ordinate of our vector can be expressed as
a = L·cos(α)
b = L·sin(α)

Using this representation, let's express the coordinates of a vector {a',b'} obtained as a result of multiplication of the original vector {a,b} by a complex
z=x+i·y=|z|·[cos(φ) + i·sin(φ)]
in terms of L and α.

a'=|z|·L·[cos(α)·cos(φ)−sin(α)·sin(φ)]
and
b'=|z|·L·[cos(α)·sin(φ)+sin(α)·cos(φ)]

Recall from Trigonometry:
cos(α+φ)=cos(α)·cos(φ)−sin(α)·sin(φ)
sin(α+φ)=cos(α)·sin(φ)+sin(α)·cos(φ)

Now you see that
a'=|z|·L·cos(α+φ)
b'=|z|·L·sin(α+φ)

So, the multiplication of a vector {a=L·cos(α),b=L·sin(α)} by a complex number z=x+i·y=|z|·[cos(φ)+i·sin(φ)], can be interpreted as changing the length of an original vector by a factor |z| and a rotation by angle φ.
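This conclusion is easy to confirm numerically with Python's cmath module (the sample numbers are our choice):

```python
import cmath
import math

# Multiplying by z scales the length by |z| and adds phase(z) to the angle.
z = complex(1, 1)        # |z| = sqrt(2), phi = 45 degrees
v = complex(3, 0)        # vector {3,0}: length 3, angle 0
w = v * z

assert math.isclose(abs(w), abs(v) * abs(z))                           # length scaled
assert math.isclose(cmath.phase(w), cmath.phase(v) + cmath.phase(z))   # angle rotated
```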

### Vectors+ 10 Complex Hilbert Space: UNIZOR.COM - Math+ & Problems - Vectors

Notes to a video lecture on http://www.unizor.com

Vectors+ 10
Complex Hilbert Spaces

As we know, a Hilbert space consists of
(a) an abstract vector space V with operation of addition
∀v1,v2∈V: v1+v2∈V
(b) a scalar space S of numbers (we consider only two cases: S=ℝ - a set of all real numbers or S=ℂ - a set of all complex numbers) with operation of multiplication of a scalar from this set S by a vector from vector space V
∀λ∈S and ∀v∈V: λ·v∈V
(c) an operation of a scalar product of two vectors from V denoted as ⟨v1,v2⟩, resulting in a scalar from S
∀v1,v2∈V: ⟨v1,v2⟩∈S

Operations of addition of two vectors, resulting in a vector, multiplication of a vector by a scalar, resulting in a vector, and scalar product of two vectors, resulting in a scalar, must satisfy certain set of axioms presented in lecture Vectors 08 of this part of a course, including commutative, associative and distributive laws.

In all the previously considered cases our scalar space S was a set of all real numbers ℝ.
Now we will consider a set of all complex numbers ℂ as a scalar space S and examine the validity of our axioms as applied to a simple vector space.
By all accounts this should not present any problems since arithmetic operations with complex numbers are similar to those with real numbers.
There is, however, one complication.

One of the axioms of a scalar product was:
For any vector a from a vector space V, which is not a null-vector, its scalar product with itself is a positive real number
∀a∈V, a≠0: ⟨a,a⟩ > 0
This axiom is needed to introduce a concept of a length (or a magnitude, or a norm) of any vector as
||a|| = √⟨a,a⟩

Consider now a simple case of a vector space V - a set of all complex numbers with a scalar space S being a set of complex numbers as well.
We will use the word vector as an element of a vector space V and the word scalar as an element of a scalar space S, but both spaces are sets of all complex numbers ℂ.

Addition of two vectors (complex numbers) and multiplication of a vector (a complex number) by a scalar (also a complex number) are defined as regular operations with complex numbers.
Scalar product of two vectors (two complex numbers) is defined as their regular operation of multiplication.
Let's check how valid our definition of a length of a vector is in this case with all the axioms previously introduced for a set of all real numbers being the scalar space S.

Vector a from V (a complex number) can be expressed as a1+i·a2, where a1 and a2 are also vectors from V, but belong to a subset of only real numbers, while i is an imaginary unit √(−1), an element of a scalar space S (also a set of all complex numbers).

Let's calculate a scalar product of vector a with itself using the old laws of operations previously defined
⟨a1+i·a2,a1+i·a2⟩ =
[since a1 and a2 are real numbers]
= a1² + 2·i·a1·a2 + i²·a2² =
= a1² + 2·i·a1·a2 − a2²

The problem is, this is not a positive number since it contains an imaginary part.
So, the set of axioms we introduced before for a scalar space being a set of all real numbers is contradictory in case of complex numbers as a scalar space for a simple case above.

To overcome this problem, we have to revise our axioms in such a way that they will hold for examples like above for complex scalars, while being held as well for real numbers as scalars, since real numbers are a subset of complex ones.

There are two small changes in the axioms for a scalar product that we introduce to solve this problem.
Let a and b be two vectors and λ - a complex number as a scalar.
(1) ⟨λ·a,b⟩ = λ·⟨a,b⟩ = ⟨a,conj(λ)·b⟩
(2) ⟨a,b⟩ = conj(⟨b,a⟩)
where conj denotes the complex conjugate number, that is conj(x+i·y) = x−i·y and conj(x−i·y) = x+i·y.
Incidentally, the conjugate of imaginary unit i (that is, 0+i·1) is −i (that is, 0−i·1): conj(i) = −i.

First of all, if a complex number x+i·y is, actually, a real one (that is, if y=0), its conjugate number is the same as the original. So, in case of a scalar space being a set of all real numbers these axioms are exactly the same as the old previously accepted ones.

Secondly, for complex scalars these modified axioms solve the problem in a simple case of V=ℂ and S=ℂ mentioned before for a vector a=a1+i·a2, where a1 and a2 are from a subset of only real numbers.

Let's calculate a scalar product using the modified rules.
⟨a1+i·a2,a1+i·a2⟩ =
[using distributive law]
= ⟨a1,a1⟩ + ⟨a1,i·a2⟩ +
+ ⟨i·a2,a1⟩ + ⟨i·a2,i·a2⟩ =
= a1² + ⟨a1,i·a2⟩ +
+ ⟨i·a2,a1⟩ + i·conj(i)·⟨a2,a2⟩ =
[since i·conj(i) = i·(−i) = −i² = 1 and addition is commutative]
= a1² + a2² +
+ ⟨a1,i·a2⟩ + ⟨i·a2,a1⟩ =
[using the newly modified rule
⟨λ·a,b⟩ = λ·⟨a,b⟩ = ⟨a,conj(λ)·b⟩]
= a1² + a2² +
+ conj(i)·⟨a1,a2⟩ + i·⟨a2,a1⟩ =
= a1² + a2² +
+ (−i)·⟨a1,a2⟩ + i·⟨a2,a1⟩ =
[since a1 and a2 are vectors from a subset of real numbers, their scalar product is commutative, and the last two terms cancel each other]
= a1² + a2²

which is a positive real number for any non-zero complex number.

This expression fully corresponds to a concept of an absolute value of a complex number a1+i·a2 and to a concept of a length of a vector in two-dimensional Euclidean space with abscissa a1 and ordinate a2 used as graphical representation of a complex number a1+i·a2.

Moreover, similar calculations can be applied to N-vector of complex numbers as a vector space with a set of all complex numbers as a scalar space.
Indeed, consider for brevity a case of N=2 since the more general case of any N is totally analogous.
The element of our two-dimensional vector space is a pair {a,b}, where both components are complex numbers.
Its scalar product with itself is defined as
⟨{a,b},{a,b}⟩ = a·conj(a) + b·conj(b)
and each term on the right of this equation, as we determined, is a non-negative real number, so the sum is a positive real number for any non-null vector.

As we see, a small modification of the axioms of scalar product in case of complex scalars solves the problem and, at the same time, does not change anything we knew about scalar product with real scalars.

As we did before, we can say that in case of a complex scalar space a scalar product of a non-null vector by itself is positive and the square root of it is its length or magnitude or norm
||a||² = ⟨a,a⟩
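For the simple case V=ℂ the modified scalar product can be sketched in one line of Python (`conjugate` is the built-in method; the function name is ours):

```python
# Modified scalar product for V = C, S = C: <a,b> = a * conj(b).
def scalar_product(a, b):
    return a * b.conjugate()

a = complex(3, -4)                    # a1 = 3, a2 = -4
p = scalar_product(a, a)
assert p.imag == 0 and p.real == 25   # a1^2 + a2^2 = 9 + 16, a positive real
```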

Problem A
Prove the Cauchy-Schwarz-Bunyakovsky inequality for an abstract vector space V and a complex scalar space S=ℂ
∀a,b∈V: |⟨a,b⟩|² ≤ ⟨a,a⟩·⟨b,b⟩
where absolute value of any complex number Z=X+i·Y is defined as
|Z|² = X² + Y² = Z·conj(Z)

Proof A
Consider any non-zero scalar (complex number) x and the non-negative (by axiom) scalar product of vector a+x·b by itself
0 ≤ ⟨a+x·b,a+x·b⟩
Use distributive properties to open all parentheses
0 ≤ ⟨a,a⟩+⟨a,x·b⟩+⟨x·b,a⟩+⟨x·b,x·b⟩
Using the modified commutative rules introduced in this lecture, we can transform individual members of this expression as follows
⟨a,x·b⟩ = conj(x)·⟨a,b⟩
⟨x·b,a⟩ = x·⟨b,a⟩
⟨x·b,x·b⟩ = x·conj(x)·⟨b,b⟩

Now our inequality looks like
0 ≤ x·conj(x)·⟨b,b⟩+conj(x)·⟨a,b⟩+x·⟨b,a⟩+⟨a,a⟩

This inequality is true for any complex number x.
If vector b equals a null-vector, the inequality holds because ⟨b,b⟩=0 and ⟨a,b⟩=0 (see Problem A of lecture Vectors 08 of this part of a course).
Assume, vector b is not a null-vector, and let's see how our inequality looks for specific value of x:
x = −⟨a,b⟩/⟨b,b⟩

Let's evaluate each member of the inequality above.
x·conj(x)·⟨b,b⟩ = ⟨a,b⟩·conj(⟨a,b⟩)/⟨b,b⟩
conj(x)·⟨a,b⟩ = −conj(⟨a,b⟩)·⟨a,b⟩/⟨b,b⟩
x·⟨b,a⟩ = −⟨a,b⟩·conj(⟨a,b⟩)/⟨b,b⟩

Putting these values into our inequality and cancelling plus and minus of the same numbers, we obtain
0 ≤ −⟨a,b⟩·conj(⟨a,b⟩)/⟨b,b⟩ + ⟨a,a⟩
Multiplying by positive ⟨b,b⟩ and separating members into different sides of the inequality, we obtain
⟨a,b⟩·conj(⟨a,b⟩) ≤ ⟨a,a⟩·⟨b,b⟩
or, since |Z|²=Z·conj(Z) for any complex Z,
|⟨a,b⟩|² ≤ ⟨a,a⟩·⟨b,b⟩
End of proof.
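The proved inequality can be spot-checked numerically for pairs of complex numbers with the scalar product ⟨u,v⟩ = Σ uᵢ·conj(vᵢ) (a sketch; names and sample ranges are ours):

```python
import random

# Scalar product of two lists of complex numbers: sum of u_i * conj(v_i).
def sp(u, v):
    return sum(x * y.conjugate() for x, y in zip(u, v))

random.seed(1)
for _ in range(100):
    u = [complex(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(2)]
    v = [complex(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(2)]
    # |<u,v>|^2 <= <u,u> * <v,v> (both right-side factors are real)
    assert abs(sp(u, v))**2 <= sp(u, u).real * sp(v, v).real + 1e-9
```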

## Thursday, August 15, 2024

### Vectors+ 09 Example of Hilbert Space: UNIZOR.COM - Math+ & Problems - Ve...

Notes to a video lecture on http://www.unizor.com

Vectors+ 09
Examples of Hilbert Spaces

Let's illustrate our theory of Hilbert spaces with a few examples.

Example 1

Consider a set V of all polynomials of real argument x defined on a segment [0,1].
It's a linear vector space with any polynomial acting as a vector in this space because all the previously mentioned axioms for an abstract vector space are satisfied:
(A1) Addition of any two polynomials a(x) and b(x) is commutative
∀a(x),b(x)∈V: a(x) + b(x) = b(x) + a(x)
(A2) Addition of any three polynomials a(x), b(x) and c(x) is associative
∀a(x),b(x),c(x)∈V:
[a(x)+b(x)]+c(x) = a(x)+[b(x)+c(x)]
(A3) There is one polynomial that is equal to 0 for any argument in segment [0,1] called null-polynomial, denoted 0(x) (that is, 0(x)=0 for any x of a domain [0,1]), with a property of not changing the value of any other polynomial a(x) if added to it
∀a(x)∈V: a(x) + 0(x) = a(x)
(A4) For any polynomial a(x) there is another polynomial called its opposite, denoted as −a(x), such that the sum of a polynomial and its opposite equals to null-polynomial (that is a polynomial equaled to zero for all arguments)
∀a(x)∈V ∃−a(x)∈V: a(x)+(−a(x))=0(x)

(B1) Multiplication of any scalar (element of a set of all real numbers) α by any polynomial a(x) is commutative
∀a(x)∈V, ∀real α: α·a(x) = a(x)·α
(B2) Multiplication of any two scalars α and β by any polynomial a(x) is associative
∀a(x)∈V, ∀real α,β:
(α·β)·a(x) = α·(β·a(x))
(B3) Multiplication of any polynomial by scalar 0 results in the null-polynomial
∀a(x)∈V: 0·a(x) = 0(x)
(B4) Multiplication of any polynomial a(x) by scalar 1 does not change the value of this polynomial
∀a(x)∈V: 1·a(x) = a(x)
(B5) Multiplication is distributive relative to addition of polynomials. ∀a(x),b(x)∈V, ∀real α:
α·(a(x)+b(x)) = α·a(x)+α·b(x)
(B6) Multiplication is distributive relative to addition of scalars.
∀a(x)∈V, ∀real α,β: (α+β)·a(x) = α·a(x)+β·a(x)

Let's define a scalar product of two polynomials as an integral of their algebraic product on a segment [0,1].
To differentiate a scalar product of two polynomials from their algebraic product under integration we will use notation [a(x)·b(x)] for a scalar product.
[a(x)·b(x)] = ∫[0,1] a(x)·b(x) dx
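This scalar product can be computed exactly for polynomials given by their coefficients; a minimal sketch (the function name is illustrative, not from the lecture):

```python
import numpy as np

# Scalar product [a(x)·b(x)] = integral of a(x)·b(x) over [0,1],
# for polynomials given as coefficient lists (lowest degree first).
def poly_scalar_product(a, b):
    prod = np.polynomial.polynomial.polymul(a, b)
    # integrate term by term: the integral of x^k over [0,1] is 1/(k+1)
    return sum(c / (k + 1) for k, c in enumerate(prod))

# a(x) = x, b(x) = x^2: the scalar product is the integral of x^3, i.e. 1/4
print(poly_scalar_product([0, 1], [0, 0, 1]))  # 0.25
```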

This definition of a scalar product satisfies all the axioms we set for a scalar product in an abstract vector space.
(1) For any polynomial a(x) from V, which is not a null-polynomial, its scalar product with itself is a positive real number
∀a(x)∈V, a(x)≠0(x):
∫[0,1] a(x)·a(x) dx > 0
(2) For null-polynomial 0(x) its scalar product with itself is equal to zero
∫[0,1] 0(x)·0(x) dx = 0
(3) Scalar product of any two polynomials a(x) and b(x) is commutative because an algebraic multiplication of polynomials is commutative
∀a(x),b(x)∈V:
∫[0,1] a(x)·b(x) dx = ∫[0,1] b(x)·a(x) dx
(4) Scalar product is homogeneous: a scalar factor applied to one of the polynomials can be factored out
∀a(x),b(x)∈V, for any real γ:
∫[0,1] (γ·a(x))·b(x) dx =
= γ·∫[0,1] a(x)·b(x) dx
(5) Scalar product is distributive relative to addition of polynomials. ∀a(x),b(x),c(x)∈V:
∫[0,1] (a(x)+b(x))·c(x) dx =
= ∫[0,1] a(x)·c(x) dx + ∫[0,1] b(x)·c(x) dx

Based on the above axioms satisfied by polynomials with the scalar product defined this way, we can say that this set is a pre-Hilbert space.
The only missing part to be a complete Hilbert space is that this set does not contain the limits of certain convergent sequences of polynomials.
Indeed, we can approximate many smooth non-polynomial functions with sequences of polynomials (recall, for example, Taylor series).

However, the Cauchy-Schwarz-Bunyakovsky inequality was proven for any abstract vector space with scalar product (pre-Hilbert space), and we can apply it to our set of polynomials.
According to this inequality, the following is true for any pair of polynomials:
[a(x)·b(x)]² ≤ [a(x)·a(x)]·[b(x)·b(x)]
or, using our explicit definition of a scalar product,
[∫[0,1] a(x)·b(x) dx]² ≤
≤ [∫[0,1] a²(x) dx]·[∫[0,1] b²(x) dx]

Just out of curiosity, let's see how it looks for a(x)=xᵐ and b(x)=xⁿ.
In this case
a(x)·b(x) = xᵐ⁺ⁿ
a²(x) = x²ᵐ
b²(x) = x²ⁿ
Calculating all the scalar products
∫[0,1] xᵐ⁺ⁿ dx = 1/(m+n+1)
∫[0,1] x²ᵐ dx = 1/(2m+1)
∫[0,1] x²ⁿ dx = 1/(2n+1)
Now the Cauchy-Schwarz-Bunyakovsky inequality looks like
1/(m+n+1)² ≤ 1/[(2m+1)(2n+1)]

The validity of this inequality is not obvious, so it would be nice to check if it's really true for any m and n.
To check it, let's transform it into an equivalent inequality between denominators with reversed sign of inequality
(m+n+1)² ≥ (2m+1)(2n+1)
Opening all the parenthesis leads us to this equivalent inequality
m²+n²+1+2mn+2m+2n ≥
≥ 4mn+2m+2n+1

After obvious simplification the resulting inequality looks like
m²+n²−2mn ≥ 0
which is always true because the left side equals (m−n)².
All transformations were equivalent and reversible, which proves the original inequality.
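The equivalent inequality between integers can also be brute-force checked for a range of exponents; a minimal sketch:

```python
# Brute-force check of (m+n+1)^2 >= (2m+1)(2n+1), which is
# equivalent to the integral inequality derived above.
for m in range(20):
    for n in range(20):
        assert (m + n + 1) ** 2 >= (2 * m + 1) * (2 * n + 1)
print("verified for all m, n in 0..19")
```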

Example 2

Elements of our new vector space are infinite sequences of real numbers {xₙ} (n changes from 1 to ∞) for which the series Σn∈[1,∞) xₙ² converges to some limit.

Addition and multiplication by a scalar are defined as addition and multiplication of the individual members of the sequences involved.
These operations preserve the convergence of sum of squares of elements.

Scalar product is defined as
{xₙ}·{yₙ} = Σn∈[1,∞) xₙ·yₙ
In some sense this is an expansion of N-dimensional Euclidean space to infinite number of dimensions, as long as a scalar product is properly defined, which in our case is assured because of convergence of the sum of squares of the elements.
Indeed, this definition makes sense because each member of a sum that defines a scalar product is bounded
|xₙ·yₙ| ≤ ½(xₙ² + yₙ²)
and the sum of the right sides of this inequality over all n∈[1,∞) converges.

This set is a Hilbert space (we skip the proof that this space is complete for brevity); its properties are very much the same as the properties of N-dimensional Euclidean space.
All the axioms of Hilbert space are satisfied.
As a consequence, the Cauchy-Schwarz-Bunyakovsky inequality holds:
[{xₙ}·{yₙ}]² ≤ [{xₙ}·{xₙ}]·[{yₙ}·{yₙ}]
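We can illustrate this inequality numerically on truncated sequences whose sums of squares converge; the particular sequences 1/n and 1/n² are chosen here just for the example.

```python
# Spot-check of the inequality on truncated l2-type sequences
# x_n = 1/n and y_n = 1/n^2 (both have convergent sums of squares).
N = 10000
x = [1.0 / n for n in range(1, N + 1)]
y = [1.0 / n ** 2 for n in range(1, N + 1)]
xy = sum(p * q for p, q in zip(x, y))
xx = sum(p * p for p in x)
yy = sum(q * q for q in y)
assert xy ** 2 <= xx * yy
print(xy ** 2, "<=", xx * yy)
```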

Problem A

Given a set of all real two-dimensional vectors (a₁,a₂) with standard definitions of addition and multiplication by a scalar (real number)
(a₁,a₂) + (b₁,b₂) = (a₁+b₁,a₂+b₂)
λ·(a₁,a₂) = (λ·a₁,λ·a₂)
So, it's a linear vector space.

The scalar product we will define in a non-standard way:
(a₁,a₂)·(b₁,b₂) =
= a₁·b₁ + 2·a₁·b₂ + 2·a₂·b₁ + a₂·b₂

Is this vector space a Hilbert space?

Hint A
Check if a scalar product of some vector by itself is zero, while the vector is not a null-vector.

Solution A
Let's examine all vectors that have the second component equal to 1 and find the first component x that breaks the rule that a scalar product of a vector with itself must be positive, unless the vector is a null-vector
(x,1)·(x,1) = 0

According to our non-standard definition of a scalar product, this means the following for a₁=b₁=x and a₂=b₂=1
x·x + 2·x·1 + 2·1·x + 1·1 = 0
x² + 4·x + 1 = 0
x₁ = −2 + √3
x₂ = −2 − √3
So, both vectors (x₁,1) and (x₂,1) have the property that the scalar product of a vector with itself gives zero, while the vectors themselves are not null-vectors.
Indeed, for vector (x₁,1) this scalar product with itself is
(x₁,1)·(x₁,1) = (−2+√3,1)·(−2+√3,1) =
= (4−4√3+3)+4·(−2+√3)+1 = 0

Therefore, the scalar product defined this way does not satisfy the axioms for a scalar product in Hilbert space, and our space is not a Hilbert space.
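The computation above is easy to confirm in floating-point arithmetic; a small sketch with an illustrative helper function:

```python
import math

# Non-standard scalar product (a1,a2)·(b1,b2) = a1·b1 + 2·a1·b2 + 2·a2·b1 + a2·b2
def dot(a, b):
    return a[0] * b[0] + 2 * a[0] * b[1] + 2 * a[1] * b[0] + a[1] * b[1]

# The non-null vector (x1, 1) with x1 = -2 + sqrt(3) has zero "length":
v = (-2 + math.sqrt(3), 1)
print(dot(v, v))  # ~0 up to floating-point rounding
```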

Problem B

Prove the parallelogram law in Hilbert space V
∀a,b∈V:
||a−b||² + ||a+b||² = 2||a||² + 2||b||²

Note B
For vectors in two-dimensional Euclidean space this statement geometrically means that the sum of the squares of the two diagonals of a parallelogram equals the sum of the squares of all four of its sides.
The parallelogram law can be proven geometrically in this case, using, for example, the Law of Cosines.

Hint B
The definition of a norm or magnitude of a vector x in Hilbert space is
||x|| = √(x·x)
Using this, all you need to prove the parallelogram law is to open the parentheses in the magnitudes of a−b and a+b.
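A numerical illustration (not the abstract proof requested above) for the standard scalar product in three-dimensional Euclidean space:

```python
import numpy as np

# Parallelogram law: ||a-b||^2 + ||a+b||^2 = 2||a||^2 + 2||b||^2
rng = np.random.default_rng(1)
a = rng.normal(size=3)
b = rng.normal(size=3)
lhs = np.dot(a - b, a - b) + np.dot(a + b, a + b)
rhs = 2 * np.dot(a, a) + 2 * np.dot(b, b)
print(abs(lhs - rhs))  # ~0 up to rounding
```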

## Saturday, August 10, 2024

### Vectors+ 08 Hilbert Space: UNIZOR.COM - Math+ & Problems - Vectors

Notes to a video lecture on http://www.unizor.com

Vectors+ 08 - Hilbert Space

Let's continue building our abstract theory of vector spaces introduced in the previous lecture Vectors 07 of this Vectors chapter of this course Math+ & Problems on UNIZOR.COM.

Our next addition is a scalar product of two elements of an abstract vector space.
In case of N-dimensional Euclidean space with two vectors
R(R₁,R₂,...,R_N) and
S(S₁,S₂,...,S_N)
we defined scalar product as
R·S = R₁·S₁+R₂·S₂+...+R_N·S_N

Thus defined, the scalar product had certain properties and characteristics that we have proven based on this definition.
In case of abstract vector spaces, the scalar product is not explicitly defined; instead, it is defined as any function of two vectors from our vector space that satisfies certain axioms resembling the properties and characteristics of a scalar product of two vectors in N-dimensional Euclidean space.

Let's assume, we have a vector space V and a scalar space S associated with it - a set of all real numbers in our case.
All axioms needed for V and S to be a vector space were described in the previous lecture mentioned above.
Now we assume that for any two vectors a and b from V there exists a real number called their scalar product denoted a·b that satisfies the following set of axioms.

(1) For any vector a from V, which is not a null-vector, its scalar product with itself is a positive real number
∀a∈V, a≠0: a·a > 0
(2) For null-vector 0 its scalar product with itself is equal to zero
a=0: a·a = 0
(3) Scalar product of any two vectors a and b is commutative
∀a,b∈V: a·b = b·a
(4) Scalar product is homogeneous: a scalar factor applied to one of the vectors can be factored out of the product
∀a,b∈V, ∀γ∈S: (γ·a)·b = γ·(a·b)
(5) Scalar product is distributive relative to addition of vectors. ∀a,b,c∈V: (a+b)·c = a·c+b·c

As before, the square root of the scalar product of a vector with itself will be called the magnitude, or length, or norm of this vector
||a|| = √(a·a)

Using the above defined scalar product, we can define a distance between vectors a and b as the magnitude of the vector a+(−b), which we will write as a−b.
||a−b|| = √[(a−b)·(a−b)]

A vector space with a scalar product defined above is called a pre-Hilbert space.

To be a Hilbert space, we need one more condition - the completeness of the vector space, which means that every sequence of vectors that converges in a certain sense has a limit vector within the same vector space.
More rigorously, the convergence is defined in terms of the Cauchy criterion. It states that a sequence of vectors {aᵢ} converges in itself (is a Cauchy sequence) if
for any ε>0 there is a natural number N such that the distance between aₘ and aₙ is less than ε for any m,n ≥ N
∀ε>0 ∃N: m,n ≥ N ⇒ ||aₘ−aₙ|| < ε
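A simple illustration of the Cauchy criterion in the familiar space of real numbers, using a sequence whose tail is easy to bound (the sequence and the bound are chosen for this example only):

```python
# a_n = sum of 1/k^2 for k=1..n; the tail beyond N is less than 1/N,
# so for any eps > 0 any N > 1/eps satisfies the Cauchy criterion.
def a(n):
    return sum(1.0 / k ** 2 for k in range(1, n + 1))

eps = 1e-3
N = int(1 / eps) + 1
assert abs(a(5 * N) - a(N)) < eps
print("Cauchy criterion satisfied for eps =", eps)
```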

Problem A
Prove that a scalar product of null-vector with any other is zero.

Proof A
∀a∈V: 0·a = (0·a)·a = 0·(a·a) = 0
[here the null-vector 0 is represented as the product of scalar 0 by vector a according to axiom (B3), and then the scalar factor is moved out of the scalar product by axiom (4)]
End of proof.

Problem B
Prove that a scalar product changes the sign, if one of its components is replaced with its opposite.

Proof B
[Problem C from the previous lecture Vectors 07 stated that −a=−1·a]
∀a,b∈V: (−a)·b = (−1·a)·b = −1·(a·b) = −(a·b)
End of proof.

Problem C
Prove the Cauchy-Schwartz-Bunyakovsky inequality
∀a,b∈V: (a·b)² ≤ (a·a)·(b·b).

Proof C
If either a or b equals the null-vector, we, obviously, get zero on both sides of the inequality, which satisfies the ≤ sign.
Assume, both vectors are non-null.
Consider any non-zero γ and non-negative scalar product of a+γb by itself
0 ≤ (a+γ·b)·(a+γ·b)
Use commutative and distributive properties to open all parenthesis
0 ≤ a·a+2γ·a·b+γ²·b·b
Set γ=−(a·b)/(b·b)
With this the inequality above takes form
0 ≤ a·a−2·(a·b)²/(b·b)+(a·b)²/(b·b)
Multiplying this inequality by a positive b·b, we obtain
0 ≤ (a·a)·(b·b)−2·(a·b)²+(a·b)²
which transforms into
(a·b)² ≤ (a·a)·(b·b)
End of proof.

## Thursday, August 8, 2024

### Vectors+ 07 Abstract Vector Space: UNIZOR.COM - Math+ & Problems - Vectors

Notes to a video lecture on http://www.unizor.com

Vectors+ 07 - Abstract Vector Space

This lecture represents a typical mathematical approach: to move from a relatively simple and concrete object to some abstraction that allows us to transfer properties of the original simple object to many others without proving these properties for each particular case.

Let's start with a simple prototype - a two-dimensional Euclidean plane with coordinates and vectors we studied before. Coordinates are real numbers; vectors can be added to each other and multiplied by a real number, resulting in some other vector.

Our first step towards abstraction was to move to N-dimensional space, which retains most of the properties of the two-dimensional one. However, this is not yet a full step towards abstraction, because we still considered a concrete object - an N-dimensional space - and demonstrated its properties.

The real abstraction is to choose any object that satisfies certain axioms and derive the properties from these axioms without referring to any concrete object.
Here is what can be done in this direction.

First of all, we introduce an abstract set V, whose elements will be called vectors. It's modelled after an N-dimensional Euclidean space as a prototype, but, in theory, it can be any set satisfying the rules below.

We also consider a set S, whose elements will be called scalars. Its prototype is a set of all real numbers used for multiplication by vectors to change their magnitude, and for our purposes, to concentrate attention on vectors, we will consider this set of scalars to be exactly that, a set of all real numbers. However, it can be some other set, like all complex numbers.

We postulate that these sets satisfy the following rules.

(A) The operation of addition is defined for each pair of vectors in V that establishes a correspondence of this pair of vectors to a new vector called their sum. What's important, we don't define this operation, we don't establish any process it should follow. The only requirement is that this operation must satisfy the following axioms.
(A1) Addition of any two vectors a and b is commutative
∀a,b∈V: a + b = b + a
(A2) Addition of any three vectors a, b and c is associative
∀a,b,c∈V: (a + b) + c = a + (b + c)
(A3) There is one vector called null-vector, denoted 0, with a property of not changing the value of any other vector a if added to it
∀a∈V: a + 0 = a
(A4) For any vector a there is another vector called its opposite, denoted as −a, such that the sum of a vector and its opposite equals the null-vector
∀a∈V ∃(−a)∈V: a + (−a) = 0
Note: in many cases an expression a+(−b) will be shortened to a−b. Sign '−' here does not mean a new operation of subtraction, but just an indication of addition with an opposite element.

(B) The operation of multiplication of vector by scalar is defined for each vector in V and each scalar in S, taken in any order. This operation establishes a correspondence of this vector and this scalar to a new vector called the product of a vector and a scalar.
This operation must satisfy the following axioms.
(B1) Multiplication of any scalar α by any vector a is commutative
∀a∈V, ∀α∈S: α·a = a·α
(B2) Multiplication of any two scalars α and β by any vector a is associative
∀a∈V, ∀α,β∈S: (α·β)·a = α·(β·a)
(B3) Multiplication of any vector by scalar 0 results in null-vector
∀a∈V: 0·a = 0
(B4) Multiplication of any vector a by scalar 1 does not change the value of this vector
∀a∈V: 1·a = a
(B5) Multiplication is distributive relative to addition of vectors. ∀a,b∈V, ∀α∈S: α·(a+b) = α·a+α·b
(B6) Multiplication is distributive relative to addition of scalars.
∀a∈V, ∀α,β∈S: (α+β)·a = α·a+β·a
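As a sanity check that the prototype really is a model of these axioms, here is an illustrative verification of A1-A4 and B1-B6 on sample vectors in R² with componentwise operations:

```python
import numpy as np

# Spot-check of the vector-space axioms for R^2 with componentwise
# addition and scalar multiplication, on random sample vectors.
rng = np.random.default_rng(2)
a, b, c = rng.normal(size=2), rng.normal(size=2), rng.normal(size=2)
alpha, beta = 0.7, -1.3
zero = np.zeros(2)

assert np.allclose(a + b, b + a)                              # A1
assert np.allclose((a + b) + c, a + (b + c))                  # A2
assert np.allclose(a + zero, a)                               # A3
assert np.allclose(a + (-a), zero)                            # A4
assert np.allclose(alpha * a, a * alpha)                      # B1
assert np.allclose((alpha * beta) * a, alpha * (beta * a))    # B2
assert np.allclose(0 * a, zero)                               # B3
assert np.allclose(1 * a, a)                                  # B4
assert np.allclose(alpha * (a + b), alpha * a + alpha * b)    # B5
assert np.allclose((alpha + beta) * a, alpha * a + beta * a)  # B6
print("all sampled axioms hold in R^2")
```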

Problem A
Prove that there must be only one null-vector in vector space V.

Proof A
Assume there are two null-vectors 0₁ and 0₂.
Since 0₂ is a null-vector, its addition to 0₁ does not change 0₁.
0₁ + 0₂ = 0₁
Since 0₁ is a null-vector, its addition to 0₂ does not change 0₂.
0₂ + 0₁ = 0₂
Since addition is commutative, the left sides in the two equations above are equal.
0₁ + 0₂ = 0₂ + 0₁
Therefore, the right sides are equal:
0₁ = 0₂
End of proof.

Problem B
Prove that for any vector there must be only one opposite to it vector in vector space V.

Proof B
Assume that for some vector a there are two opposite vectors (−a)₁ and (−a)₂.
Since (−a)₁ is an opposite to vector a, its addition to a results in null-vector.
Therefore,
((−a)₁+a) + (−a)₂ = 0 + (−a)₂ = (−a)₂
Since (−a)₂ is an opposite to vector a, its addition to a results in null-vector.
Therefore,
(−a)₁ + (a+(−a)₂) = (−a)₁ + 0 = (−a)₁
Since addition is associative, the left sides in the two equations above are equal.
Therefore, the right sides are equal:
(−a)₁ = (−a)₂
End of proof.

Problem C

Prove that for any vector its product with scalar −1 results in an opposite vector.

Proof C
Since vector multiplication by scalar is distributive,
(1+(−1))·a = 1·a + (−1)·a = a + (−1)·a
On the other hand, the same initial statement can be transformed differently
(1+(−1))·a = 0·a = 0
Since left sides in the two equations above are equal, right sides are equal as well
a + (−1)·a = 0
Therefore, (−1)·a satisfies the definition of a vector opposite to a. But, as has been proven in Problem B, there must be only one vector opposite to a, which we denoted as (−a).
Hence, (−1)·a = (−a)
End of proof.