Wednesday, October 9, 2024

Physics+ Work Lemmas: UNIZOR.COM - Classic Physics+ - Laws of Newton

Notes to a video lecture on UNIZOR.COM

Field Work Lemmas

In the previous lecture we introduced the concepts of a field and of a field intensity force that equals the gradient of the field potential taken with the opposite sign, F = −∇U.
Also, we have proven that, dealing with such force, the work of this field intensity force along any trajectory of an object moving in the field depends only on the field potential at the beginning and at the end of a trajectory and is independent of a path between these two points.

There is a converse theorem that states that, if the work performed by some force on an object depends on the object's position in the beginning and at the end of its movement and does not depend on a trajectory between these points, then this force can be represented as a gradient of some scalar function, the field potential.

This lecture presents certain auxiliary theorems (lemmas) that will help to prove the above mentioned theorem in the next lecture.


Lemma A

This lemma, in short, is about comparing the work performed by a force, when an object moves along the same trajectory in two opposite directions.

More rigorously, assume, some vector of force F(x,y,z) is defined at each point of a certain area within our three-dimensional Euclidean space with Cartesian coordinates.
The coordinate components of the vector of force F at point (x,y,z) are
||Fx(x,y,z),Fy(x,y,z),Fz(x,y,z)||.

An object is moving along certain trajectory from point A to point B within this area, while the force F acts on it and performs certain work.
The coordinate components of the position vector r are
||x,y,z||.
The coordinate components of the infinitesimal increment, the differential, of the position vector dr are
||dx,dy,dz||.

Compare the work done by the force F along the object's trajectory from point A to point B with the work done when an object moves from point B to point A along the same trajectory in the opposite direction.

Solution A

Recall the definition of work performed by force F on an object moving along a trajectory described by position vector r from starting point A to finishing point B:
W[AB] = ∫[AB]dW = ∫[AB]F·dr
The above integral is an infinite sum of infinitesimal work increments dW(x,y,z) performed by force F(x,y,z) on an infinitesimal interval of a trajectory r(x,y,z) from point (x,y,z) to point (x+dx,y+dy,z+dz), where F·dr is a scalar product of two vectors, so
dW(x,y,z) = F(x,y,z)·dr(x,y,z) =
= Fx·dx+Fy·dy+Fz·dz


Assume, at some moment of time our object, moving from A to B, is at position (x,y,z).
During an infinitesimal increment of time its new position will be (x+dx,y+dy,z+dz) and the force will do an infinitesimal amount of work
dW = Fx·dx+Fy·dy+Fz·dz.

Now assume that the object moves in the opposite direction from B to A along the same trajectory.
Being at the same position (x,y,z) at some moment in time, the infinitesimal increments dx, dy and dz will have signs opposite to those when an object moved from A to B, while the force vector will be the same.

Therefore, the differential of work dW will also have the opposite sign, and subsequent integration gives total work along the trajectory from B to A that has the same magnitude but the opposite sign compared with the movement from A to B.

Answer A
W[AB] = ∫[AB]F·dr =
= −∫[BA]F·dr = −W[BA]
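A numerical illustration of Lemma A can be sketched in Python. The force field, the helical trajectory and the helper names below are assumptions made only for this example; the integral of F·dr is approximated by a sum over small straight segments.

```python
import math

def force(x, y, z):
    # Hypothetical force field chosen only for illustration
    return (-y, x * x, z)

def path(s):
    # Hypothetical trajectory: point A at s=0, point B at s=1 (an arc of a helix)
    return (math.cos(s), math.sin(s), s)

def work(point_at, s_start, s_end, steps=20000):
    """Approximate the integral of F·dr along the path by a midpoint sum."""
    total = 0.0
    ds = (s_end - s_start) / steps
    for i in range(steps):
        p0 = point_at(s_start + i * ds)
        p1 = point_at(s_start + (i + 1) * ds)
        mid = tuple((a + b) / 2 for a, b in zip(p0, p1))
        f = force(*mid)
        # dW = Fx·dx + Fy·dy + Fz·dz on this small segment
        total += sum(fc * (b - a) for fc, a, b in zip(f, p0, p1))
    return total

w_ab = work(path, 0.0, 1.0)   # from A to B
w_ba = work(path, 1.0, 0.0)   # from B to A along the same trajectory
print(w_ab, w_ba)
```

The two printed values should have the same magnitude and opposite signs, up to rounding.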


Lemma B

As in Lemma A, assume, some vector of force F(x,y,z) is defined at each point of a certain area within our three-dimensional Euclidean space with Cartesian coordinates.

An object is moving along certain trajectory within this area and the force F(x,y,z) acts on it, performing some work along its trajectory.
The object's position at time t is r(t)=||x(t),y(t),z(t)||.

It's given that the work of this force on an object moving along any trajectory between any pair of points A and B depends only on a choice of these two endpoints and does not depend on a choice of trajectory between them.

Prove that the work this force performs on an object moving along a closed trajectory, when points A and B coincide, equals to zero.

Proof B

In our case of a closed trajectory the endpoints A and B coincide. So, we will deal only with point A.

Choose any closed trajectory, its starting and ending point A and point M on it that does not coincide with point A.
Now we have two different paths from A to M, let's call them path #1 and path #2.

If an object moves along a closed trajectory, it moves from point A to point M along path #1 and then moves from point M to point A along path #2.

According to the condition of the problem, the work W1[AM] performed by the force acting on our object along path #1 from A to M should be equal to the work W2[AM] performed by the force acting on our object along path #2 from A to M:
W1[AM] = W2[AM]

As has been proven in Lemma A, during the second part of the trajectory, when the object moves from M to A along path #2, the work of the force has the same magnitude as if the object moved from A to M along the same path #2, but with the opposite sign:
W2[MA] = −W2[AM]

Therefore, the total work along path #1 from A to M followed by moving from M to A along path #2 equals to
W1[AM]+2[MA] = W1[AM] + W2[MA] =
= W1[AM] − W2[AM] =
= W1[AM] − W1[AM] = 0


Lemma C

This lemma is a converse to Lemma B.
As in Lemma A, assume, some vector of force F(x,y,z) is defined at each point of a certain area within our three-dimensional Euclidean space with Cartesian coordinates.

An object is moving along certain trajectory within this area, and the force F(x,y,z) acts on it, performing some work.
The object's position at time t is r(t)=||x(t),y(t),z(t)||.

It's given that the work of this force on an object moving along any closed trajectory that starts and ends at the same point equals to zero.

Prove that the work this force performs on an object moving from any fixed point A to any fixed point B does not depend on trajectory between these points.

Proof C

Choose any two paths from point A to point B - path #1 and path #2.
We will use the symbols introduced in Lemma B.
As stated in the condition of this lemma,
W1[AB]+2[BA] =
= W1[AB] + W2[BA] = 0
At the same time
W2[BA] = −W2[AB]
Therefore,
0 = W1[AB] + W2[BA] =
= W1[AB] − W2[AB]
from which follows that
W1[AB] = W2[AB]

As we see, the statement that the work performed by a force on an object moving along any closed trajectory equals zero is equivalent to the statement that the work performed by this force on an object moving from any point A to any point B does not depend on the trajectory along which the object moves between these points.
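The following Python sketch illustrates Lemma B numerically. Both force fields and the choice of the unit circle as the closed trajectory are assumptions made for this example: the first force is the negative gradient of a hypothetical potential, so its work is path-independent, while the second one is not a gradient of any potential.

```python
import math

def loop_work(force, steps=20000):
    """Work of `force` around the unit circle in the plane z=0 (midpoint sum)."""
    total = 0.0
    for i in range(steps):
        a0 = 2 * math.pi * i / steps
        a1 = 2 * math.pi * (i + 1) / steps
        x0, y0 = math.cos(a0), math.sin(a0)
        x1, y1 = math.cos(a1), math.sin(a1)
        fx, fy, fz = force((x0 + x1) / 2, (y0 + y1) / 2, 0.0)
        total += fx * (x1 - x0) + fy * (y1 - y0)   # dz = 0 on this loop
    return total

def gradient_force(x, y, z):
    # F = −∇U for the hypothetical potential U = x²·y + z
    return (-2 * x * y, -x * x, -1.0)

def rotational_force(x, y, z):
    # Hypothetical force that is not a gradient of any potential
    return (-y, x, 0.0)

w_gradient = loop_work(gradient_force)
w_rotational = loop_work(rotational_force)
print(w_gradient, w_rotational)
```

For the gradient force the work around the loop comes out as zero up to the approximation error, while for the rotational force it is close to 2π, so its work between two points must depend on the path.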


Lemma D

As in Lemma A, assume, some vector of force F(x,y,z) is defined at each point of a certain area within our three-dimensional Euclidean space with Cartesian coordinates.
It's given that the work of this force on an object moving along any trajectory between any pair of points depends only on a choice of these two endpoints and does not depend on a choice of trajectory between them.

Assume, there are three points in the area where the force is acting, A, B and C.
Consider amounts of work the force performs on an object during its movements between these points
W[AB], W[AC], W[BC].

Prove that
W[BC] = W[AC] − W[AB].

Proof D

From Lemma B above follows that the amount of work performed by a force on an object moving along a closed trajectory from A to B to C and back to A equals to zero
W[AB] + W[BC] + W[CA] = 0

From Lemma A above follows that, if object moves in the opposite direction along the same trajectory, the amount of work performed by a force is the same in magnitude but opposite in sign to the amount of work performed on a directly moving object
W[AC] = −W[CA]

Therefore,
W[AB] + W[BC] − W[AC] = 0
from which follows that
W[BC] = W[AC] − W[AB].

Coincidentally, this equality resembles the rule about a difference between two vectors
BC = AC − AB

Tuesday, October 1, 2024

Physics+ Field, Potential: UNIZOR.COM - Classic Physics+ - Laws of Newto...

Notes to a video lecture on UNIZOR.COM

Field, Potential

The prototype for an abstract concept of a field presented below is a gravitational field. The force of this field acting on a unit mass is a prototype for a concept of field intensity.

On UNIZOR.COM these concepts were introduced in the Physics 4 Teens course, part Energy, chapters Energy of Gravitational Field and Gravitational Potential.
Here we will formally define these and other concepts and derive a few important field properties.

A field is an area (a subset) of points P{x,y,z} in our three-dimensional space with a vector of force F(P), called field intensity, defined at each point P of this area and a real function U(P), called potential, defined at exactly the same points, such that the following equation between the force and the potential holds at each point:
∀P{x,y,z}: F(P) = −∇U(P)
where symbol ∇ signifies the gradient of function U, a vector of the function's partial derivatives with respect to each coordinate:
∇U(P) = ∇U(x,y,z) =
= ||∂U/∂x,∂U/∂y,∂U/∂z||


It should be noted that in many cases authors do not differentiate between the field and the field intensity force, defining the field as the force, which in our opinion is misleading.
That's why we define field as an area of a space, which corresponds to the usual meaning of this word, and field intensity as a force acting inside this area.

Let's assume that a material point acted upon by the force of a field is moving during the time period from t=t1 to t=t2 from point A to point B along some trajectory
P(t) = {x(t),y(t),z(t)}
where P(t1)=A and P(t2)=B

The first important property of a field is that the work of the field intensity force F(P) along any trajectory of an object moving in the field depends only on the field potential at the beginning and at the end of a trajectory and is independent of a path between these two points. In other words, no matter how an object that experiences a field force moves from point A to point B, the work of the force remains the same and depends only on the field potential at end points A and B.

Here is the proof of this statement.
Assume, at moment t1 our object is at point A, and it moves along a trajectory r(t)=||x(t),y(t),z(t)|| until at moment t2 it reaches point B.
The field intensity F is defined for all points of a field, including the points along the trajectory of an object and is equal to F(t)=F(x(t),y(t),z(t)).

Work performed by any force during the observed time is, by definition,
Wt∈[t1,t2] = ∫t∈[t1,t2]F(t)·dr(t)
In our case of a field force
F(t) = F(x(t),y(t),z(t)) =
= −∇U(x(t),y(t),z(t)) =
= −||∂U/∂x,∂U/∂y,∂U/∂z||

and
dr(t)=||dx(t),dy(t),dz(t)||

Let's evaluate the scalar product of two vectors under the integral
||∂U/∂x,∂U/∂y,∂U/∂z||·
||dx(t),dy(t),dz(t)|| =
= (∂U/∂x)·dx(t) +
+ (∂U/∂y)·dy(t) +
+ (∂U/∂z)·dz(t) =
= dU(x(t),y(t),z(t)) = dU(t)

Above is a full differential (infinitesimal increment) of function U(t)=U(x(t),y(t),z(t)) on infinitesimal time interval from t to t+dt.

Therefore, the work performed by a field force F(t) equals to
Wt∈[t1,t2] = −∫t∈[t1,t2]dU(t) =
= −U(t2) + U(t1) = U(t1) − U(t2)

As we see, the total amount of work performed by a field intensity force depends only on the field potentials at end points and does not depend on the path (trajectory) an object took to reach from start to finish.
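This property can be checked numerically with a short Python sketch. The potential, the endpoints A and B and the two routes between them are hypothetical choices made for this illustration; the force is written out by hand as F = −∇U.

```python
def potential(x, y, z):
    # Hypothetical potential chosen only for illustration
    return x * x + y * z

def field_force(x, y, z):
    # F = −∇U for the potential above
    return (-2 * x, -z, -y)

def work(path, steps=20000):
    """Midpoint-sum approximation of the integral of F·dr for s from 0 to 1."""
    total = 0.0
    for i in range(steps):
        p0 = path(i / steps)
        p1 = path((i + 1) / steps)
        mid = [(a + b) / 2 for a, b in zip(p0, p1)]
        f = field_force(*mid)
        total += sum(fc * (b - a) for fc, a, b in zip(f, p0, p1))
    return total

A = (0.0, 0.0, 0.0)
B = (1.0, 2.0, 3.0)

def straight(s):
    return (s, 2 * s, 3 * s)

def curved(s):
    # Another route with the same endpoints A (s=0) and B (s=1)
    return (s ** 3, 2 * s * s, 3 * s)

w_straight = work(straight)
w_curved = work(curved)
expected = potential(*A) - potential(*B)   # U(t1) − U(t2)
print(w_straight, w_curved, expected)
```

Both routes should give the same work, equal to the difference of potentials at the endpoints.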

An obvious consequence of this property of the field is that an amount of work the field intensity force performed when an object moves along a trajectory with the finishing point coinciding with the starting one equals to zero.

Recall that amount of work the field intensity force performed when an object moves from one point to another equals to an increment of the object's kinetic energy (see the previous lecture Newton's Laws of this chapter of a course)
Wt∈[t1,t2] = T(t2) − T(t1)
Therefore,
T(t2) − T(t1) = U(t1) − U(t2)
from which follows the Law of Conservation of Energy
T(t1) + U(t1) = T(t2) + U(t2)
It states that the sum of kinetic and potential energy is not changing during an object's movement within a field.
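As a hedged illustration, the sketch below simulates a material point in a hypothetical spring-like field U = |r|²/2 and compares the total energy T+U at the start and at the end. The mass, the initial data, the time step and the velocity-Verlet integration scheme are all assumptions of this example, so the two energies agree only up to a small numerical error.

```python
def potential(r):
    # Hypothetical potential U = |r|²/2
    return 0.5 * sum(c * c for c in r)

def force(r):
    # F = −∇U
    return [-c for c in r]

def simulate(r, v, m=2.0, dt=1e-3, steps=5000):
    """Velocity-Verlet integration; returns total energy T+U at start and end."""
    def energy(r, v):
        return 0.5 * m * sum(c * c for c in v) + potential(r)
    e_start = energy(r, v)
    f = force(r)
    for _ in range(steps):
        v = [vc + 0.5 * dt * fc / m for vc, fc in zip(v, f)]   # half step in velocity
        r = [rc + dt * vc for rc, vc in zip(r, v)]             # full step in position
        f = force(r)
        v = [vc + 0.5 * dt * fc / m for vc, fc in zip(v, f)]   # second half step
    return e_start, energy(r, v)

e_start, e_end = simulate([1.0, 0.0, 0.5], [0.0, 1.0, -0.3])
print(e_start, e_end)
```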

Sunday, September 29, 2024

Physics+ Recap of Newton's Laws: UNIZOR.COM - Classic Physics+ - Laws of...

Notes to a video lecture on UNIZOR.COM

Recap of Newton's Laws

The Laws of Mechanics introduced by Newton assume the existence of so-called inertial frames of reference (systems of coordinates), in which these laws hold and which we will use for our purposes. These frames of reference allow us to express the position and velocity of objects under observation.
Also assumed is a concept of force as the action that affects the movements of physical objects.

The important items to be considered when discussing Newton's Laws are the concepts of a material point (an object of zero dimensions but having some inertial mass), space coordinates (usually, Euclidean coordinates in three-dimensional space) and velocities (vectors, components of which are derivatives of corresponding coordinates).

We will usually use symbol m for inertial mass of an object, vector r=(x,y,z) for Euclidean coordinates, vector v=(vx,vy,vz) for velocity vector and vector p=m·v for momentum of an object of mass m moving with velocity vector v.
We will further assume that space coordinates and velocities are functions of time t and v(t)=dr(t)/dt.

The First Newton's Law states that a material point that is not acted upon by external forces maintains constant velocity, so
dv(t)/dt = 0

This law can be derived from the Second Newton's Law that deals with vectors of forces F=(Fx,Fy,Fz).
This law relates the vector of force to the rate of change of the momentum of an object upon which this force acts.
dp(t)/dt = d(m·v(t))/dt = F(t)

By multiplying all terms by dt, this same law can be also expressed in terms of infinitesimal increment of momentum during an infinitesimal time interval as a result of an impulse of a force, the product of a force vector by that same infinitesimal time interval:
dp(t) = d(m·v(t)) = F(t)·dt

Finally, the Third Newton's Law states that action of one object upon another is always symmetrical. If the force FAB is exerted by object A upon object B, the same by magnitude and opposite by direction force FBA is exerted by object B upon object A.
FAB = −FBA

Important notes
(a) Mass is additive. The mass of two objects combined together is a sum of their masses.
(b) Vectors of forces acting on the same object can be added by the rules of vector algebra, resulting in one vector, whose action is the same as a combination of actions of individual forces.
(c) Each vector equation mentioned above can be broken into three individual equations for each coordinate.
(d) All statements and equations above should be taken as axioms, because they are in good agreement with our day-to-day practice. They represent a theoretical model that we can study further and, based on them, derive numerous properties of moving objects.

Conservation of momentum
Let's do the following experiment.
Two material points A and B are connected with a massless rigid rod. We take this pair and throw it out to open space, where no other forces act on these two material points except a force of one object upon another via a rigid rod that connects them.
Let's analyze the change of momentum of these two objects with time.
Assume, at time t1 the momenta of our objects are pA(t1) and pB(t1). As time goes on, our objects move in space in some way that depends on the initial push and subsequent interaction with each other via the rod that connects them. At the end of our experiment at time t2 our objects have momenta pA(t2) and pB(t2).
Momentum pA(t2) is the result of object A's initial momentum pA(t1) and the combined (that is, integrated) infinitesimal increments of this momentum during the experiment
pA(t2) = pA(t1) + ∫t∈[t1,t2]dpA(t)
As mentioned above, in general,
dp(t) = F(t)·dt
In our case the only force acting on object A is the force FBA(t) exerted by object B.
Therefore,
pA(t2) = pA(t1) + ∫t∈[t1,t2]FBA(t)·dt
Similarly, considering an object B and force acting on it from object A, we have
pB(t2) = pB(t1) + ∫t∈[t1,t2]FAB(t)·dt
Combining these two statements to have a total momentum of the system of these two objects in the beginning and at the end of experiment and taking into consideration the Third Newton's Law FAB=−FBA, we obtain
pA(t2)+pB(t2) = pA(t1)+pB(t1)
That is, the total momentum of a closed system (no external forces) is constant.
This is a simple derivation of the Law of Conservation of Momentum.
Granted, it is proven here only in a simple case of two objects, but the proof can be easily extended to a case of any closed system with any number of objects acting upon each other without external forces.
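The derivation above can be imitated numerically. In the Python sketch below the rigid rod is replaced by a stiff spring, which is an assumption made only to have an explicit formula for the interaction force; the masses, the spring constant and the initial data are hypothetical as well. At every step the momenta are updated by dp = F·dt with FAB = −FBA.

```python
def simulate(steps=10000, dt=1e-3):
    # Hypothetical setup: masses, spring constant and initial data are illustrative
    m_a, m_b, k, rest = 1.0, 3.0, 50.0, 1.0
    r_a, r_b = [0.0, 0.0, 0.0], [1.2, 0.0, 0.0]
    p_a, p_b = [0.5, 1.0, 0.0], [-0.2, 0.0, 0.7]
    p_start = [a + b for a, b in zip(p_a, p_b)]
    for _ in range(steps):
        d = [b - a for a, b in zip(r_a, r_b)]
        dist = sum(c * c for c in d) ** 0.5
        # Force exerted by B upon A (pulls A toward B when the spring is stretched)
        f_ba = [k * (dist - rest) * c / dist for c in d]
        f_ab = [-c for c in f_ba]                          # Newton's Third Law
        p_a = [p + f * dt for p, f in zip(p_a, f_ba)]      # dp = F·dt
        p_b = [p + f * dt for p, f in zip(p_b, f_ab)]
        r_a = [r + p / m_a * dt for r, p in zip(r_a, p_a)]
        r_b = [r + p / m_b * dt for r, p in zip(r_b, p_b)]
    p_end = [a + b for a, b in zip(p_a, p_b)]
    return p_start, p_end

p_start, p_end = simulate()
print(p_start, p_end)
```

Each individual momentum changes during the simulation, but the printed total momentum at the end should coincide with the initial one up to rounding.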

Work and Kinetic Energy
Assume that a material point moves in three-dimensional Euclidean space from time t1 to time t2 along a trajectory described by time-dependent vector r(t) and there is a time-dependent vector of force F(t) acting upon it.

Work performed by this force during the observed time is, by definition,
Wt∈[t1,t2] = ∫t∈[t1,t2]F(t)·dr(t)
Notice that differential of a vector r(t) is a velocity vector v(t) multiplied by differential of time dt.
Also notice that in the formula above we deal with a scalar (dot) product of two vectors - F(t) and dr(t).
According to Newton's Second Law,
dp(t)/dt = F(t)
Also,
dr(t) = v·dt = (p/m)·dt

Therefore,
Wt∈[t1,t2] = ∫t∈[t1,t2](1/m)·p(t)·dp(t)
Integrating this, we obtain
Wt∈[t1,t2] = (1/m)·[p²(t2)−p²(t1)]/2 =
= m·[v²(t2)−v²(t1)]/2 = T(t2)−T(t1)
where T(t)=m·v²(t)/2 is kinetic energy of an object of mass m moving with velocity v(t).

In other words, work performed by a force equals to an increment of kinetic energy of an object this force acts upon.
Another formulation might be that work performed upon an object is transformed into its kinetic energy.
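A small Python sketch can illustrate this equality. The time-dependent force, the mass, the initial velocity and the time step are assumptions of the example. On every step the momentum is advanced by dp = F·dt, the displacement dr is taken with the average velocity on that step, and the work F·dr is accumulated.

```python
import math

def force(t):
    # Hypothetical time-dependent force vector F(t), for illustration only
    return (math.cos(t), 0.5 * t, -1.0)

def simulate(m=2.0, dt=1e-3, steps=3000):
    v = [0.3, -0.2, 1.0]          # initial velocity (assumed)
    t, work = 0.0, 0.0
    kinetic_start = 0.5 * m * sum(c * c for c in v)
    for _ in range(steps):
        f = force(t + dt / 2)
        v_new = [vc + fc / m * dt for vc, fc in zip(v, f)]      # dp = F·dt
        dr = [(a + b) / 2 * dt for a, b in zip(v, v_new)]       # dr = v·dt
        work += sum(fc * c for fc, c in zip(f, dr))             # dW = F·dr
        v, t = v_new, t + dt
    kinetic_end = 0.5 * m * sum(c * c for c in v)
    return work, kinetic_end - kinetic_start

total_work, kinetic_increment = simulate()
print(total_work, kinetic_increment)
```

With this choice of dr the accumulated work and the increment of kinetic energy T = m·v²/2 coincide up to rounding, mirroring the integration of (1/m)·p·dp above.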

Friday, September 27, 2024

Physics+ Introduction: UNIZOR.COM - Physics+

Notes to a video lecture on UNIZOR.COM

Classic Physics+ Introduction

This course contains material not usually addressed in high school course of Physics. However, it's still a part of Classic Physics, and it's essential to understand the concepts presented here, as they play a very important role in contemporary Physics.

In the previous course Physics 4 Teens, part Waves, chapter Phenomena of Light, lecture Angle Refraction we have mentioned the intuitively understandable and natural Fermat's Principle of the Least Time.
Based on this principle we derived the optimal trajectory and angle of refraction of the ray of light going from one medium to another with a different refraction index.

Briefly speaking, if a ray of light moves along certain trajectory from point A to point B, the time it spends during this movement should be less than if it moved along any other trajectory.
If both points A and B are in empty space, the trajectory will be a straight line.
If, however, there are different media between them and, consequently the light propagates there with different speed, the Principle of Least Time can help to determine the angle of refraction on each change of medium along a trajectory to minimize the time to travel.

What's important to pay attention to in this phenomenon is that there are many trajectories to reach point B from point A, but the ray of light chooses the one that minimizes a certain numerical characteristic that depends on the entire trajectory: the time of travel.

There is nothing wrong with application of Newton's Laws to find the trajectory of movement, but in many practical cases the complexity of such an approach is very high, so it would be quite a challenging endeavor.

Generalizing from the above example, the points A and B might not be real points in our three-dimensional Euclidean space, but some numerical characteristics of a state of a physical system under our observation. It can be a combination of spherical coordinates and velocities, for example, or positions relative to the center of our galaxy and impulses etc.
In any case, it's intuitively easy to accept that the change of a system from one state to another should go along a trajectory that minimizes or maximizes some numerical characteristic of the entire trajectory.

We will introduce a quantity that depends on an entire trajectory of movement of a physical system from one state to another. Stationary value (minimum, maximum, saddle point) of this quantity characterizes the trajectory of movement, which is similar to the Fermat's principle that a trajectory of the ray of light is the one that minimizes the time light travels from one point to another.

At the end of the 18th century the Italian-French mathematician Joseph-Louis Lagrange developed exactly this theory, suggested a function that depends on the system's characteristics and showed that finding the stationary point of this function leads to a system of differential equations identical to Newton's Laws.
This function was called action.
Importantly, this approach to finding the trajectory of complex systems significantly simplified the calculations compared to directly applying Newton's Laws.

Yet another approach, based on Lagrange's work, was suggested by the Irish mathematician and astronomer William Hamilton in 1833. His approach made it possible to build a system that successfully bridged Classic Physics with Quantum Physics.

Details of both the Lagrangian and Hamiltonian approaches to formulating Classic Mechanics are the subject of this course.

Sunday, September 22, 2024

Matrices+ 03 - Eigenvalues in 3D: UNIZOR.COM - Math+ & Problems - Matrices

Notes to a video lecture on http://www.unizor.com

Matrices+ 03
Eigenvalues in 3D


Problem A

Find all eigenvalues and eigenvectors of this 3⨯3 matrix, if it's known that one of the eigenvalues is 10.
|| 4   8  −4||
|| 4  −2  −4||
||−8  −4 −10||


Note A
We specify one of the eigenvalues because general calculation of all eigenvalues in 3D leads to a polynomial equation of the 3rd degree.
Since we want to avoid the necessity to solve it, we specify one eigenvalue, which leads to finding the other two by solving a quadratic equation, which should not present any problem.

Solution A

If matrix A transforms vector v to a collinear one with the magnitude of the original one multiplied by a factor λ, the following matrix equation must hold
A·v = λ·v
or in coordinate form for a 3⨯3 matrix
||a1,1 a1,2 a1,3||   ||v1||       ||v1||
||a2,1 a2,2 a2,3|| · ||v2|| = λ · ||v2||
||a3,1 a3,2 a3,3||   ||v3||       ||v3||
which is equivalent to
(a1,1−λ)·v1+a1,2·v2+a1,3·v3 = 0
a2,1·v1+(a2,2−λ)·v2+a2,3·v3 = 0
a3,1·v1+a3,2·v2+(a3,3−λ)·v3 = 0

This is a system of three linear equations with four unknowns λ, v1, v2 and v3.

One trivial solution would be v1=0, v2=0 and v3=0, in which case λ can take any value.
This is not a case worthy of analyzing.

If the matrix of coefficients of this system has a non-zero determinant, this trivial solution would be the only one.
Therefore, if we are looking for a non-trivial solution, the matrix's determinant must be zero, which gives a specific condition on the value of λ.

Therefore, a necessary condition for the existence of eigenvectors v other than the null-vector is
det(A−λ·I) =
= (a1,1−λ)·(a2,2−λ)·(a3,3−λ) +
+ a2,1·a3,2·a1,3 + a1,2·a2,3·a3,1 −
− a1,3·(a2,2−λ)·a3,1 −
− a1,2·a2,1·(a3,3−λ) −
− (a1,1−λ)·a2,3·a3,2 = 0
where I is the identity matrix.

Using values of a matrix of this problem, this equation is
(4−λ)·(−2−λ)·(−10−λ)+
+8·(−4)·(−8)+4·(−4)·(−4)−
−(−4)·(−2−λ)·(−8)−
−8·4·(−10−λ)−
−(4−λ)·(−4)·(−4) = 0
Simplifying this equation, we get
−λ³−8·λ²+108·λ+720=0

First of all, we can check if the eigenvalue 10 given in the problem is the root of this equation.
Indeed, the following is true.
−10³−8·10²+108·10+720=0

Since we know one root λ=10 of the cubic equation, we can represent the left side of this equation as a product of (λ−10) and a quadratic polynomial with easily calculated coefficients, getting the equation
(λ−10)·(−λ²−18·λ−72) = 0

To get all the roots of the original cubic equation, we have to solve a quadratic equation
−λ²−18·λ−72 = 0
or, using a canonical form,
λ²+18·λ+72 = 0
Its roots are
λ2,3 = −9±√(81−72) = −9±3

So, we have three eigenvalues for our matrix: −12, −6 and 10.
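The steps above can be retraced in a short Python sketch. The helper names char_poly and divide_by_root are introduced here only for illustration; the determinant is expanded by the same six-term formula, and the known root 10 is split off by synthetic division before solving the remaining quadratic equation.

```python
import math

A = [[4, 8, -4],
     [4, -2, -4],
     [-8, -4, -10]]

def char_poly(lam):
    """det(A − λ·I) for the 3x3 matrix A, expanded as in the text."""
    a = [[A[i][j] - (lam if i == j else 0) for j in range(3)] for i in range(3)]
    return (a[0][0] * a[1][1] * a[2][2]
            + a[1][0] * a[2][1] * a[0][2]
            + a[0][1] * a[1][2] * a[2][0]
            - a[0][2] * a[1][1] * a[2][0]
            - a[0][1] * a[1][0] * a[2][2]
            - a[0][0] * a[1][2] * a[2][1])

# Coefficients of −λ³ − 8·λ² + 108·λ + 720, highest power first
coeffs = [-1, -8, 108, 720]

def divide_by_root(coeffs, root):
    """Synthetic division by (λ − root); returns (quotient, remainder)."""
    out = []
    carry = 0
    for c in coeffs:
        carry = c + carry * root
        out.append(carry)
    return out[:-1], out[-1]

quotient, remainder = divide_by_root(coeffs, 10)
qa, qb, qc = quotient
disc = qb * qb - 4 * qa * qc
roots = sorted([(-qb + math.sqrt(disc)) / (2 * qa),
                (-qb - math.sqrt(disc)) / (2 * qa),
                10])
print(quotient, remainder, roots)
```

The quotient reproduces the quadratic polynomial −λ²−18·λ−72, the remainder is zero, and char_poly vanishes at each of the three roots.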

Consider now that we have determined eigenvalue λ and would like to find eigenvector v=||v1,v2,v3|| transformed into a collinear one by matrix A with this exact factor of change in magnitude.

If some vector v=||v1,v2,v3|| that is transformed into a collinear one with a factor λ exists, vector s·v, where s is any real non-zero number, would have exactly the same quality because of associativity and commutativity of multiplication by a scalar.
A·(s·v) = (A·s)·v = (s·A)·v =
= s·(A·v) = s·(λ·v) = λ·(s·v)


Therefore, we don't need to determine exact values v1, v2 and v3, we just need to determine only the direction of vector v=||v1,v2,v3||.

Let's start with the first eigenvalue λ1=−12.
Using it in the equation A·v=λ·v, we have the following system of equations
4·v1+8·v2−4·v3=−12·v1
4·v1−2·v2−4·v3=−12·v2
−8·v1−4·v2−10·v3=−12·v3
Bringing everything to the left side, get this system
16·v1+8·v2−4·v3=0
4·v1+10·v2−4·v3=0
−8·v1−4·v2+2·v3=0

As expected, the determinant of the coefficients of this system is zero and this system of equations is linearly dependent (third equation multiplied by −2 gives the first), as it should be, because otherwise the only solution would be a null-vector.
So, we drop the first equation and resolve the remaining two equations for v2 and v3 in terms of v1.
Here is what remains
4·v1+10·v2−4·v3=0
−8·v1−4·v2+2·v3=0
Divide by 2 all of them to simplify and find v3 from the second equation
2·v1+5·v2−2·v3=0
v3 = 4·v1+2·v2
Substituting v3 into the first equation:
2·v1+5·v2−2·(4·v1+2·v2)=0
or
−6·v1+v2=0
Therefore,
v2=6·v1 and v3 = 16·v1

Regardless of the value of v1, vector ||v1,6·v1,16·v1|| is an eigenvector.
Set v1=1 for simplicity, and vector ||1,6,16|| should be an eigenvector.

Let's check it out.
|| 4   8  −4||   || 1||   ||  4+48−64||   || −12||         || 1||
|| 4  −2  −4|| · || 6|| = ||  4−12−64|| = || −72|| = −12 · || 6||
||−8  −4 −10||   ||16||   ||−8−24−160||   ||−192||         ||16||
Which confirms that vector ||1,6,16|| (and any collinear to it) is an eigenvector with −12 as its eigenvalue.

Let's do the same calculations with the second eigenvalue λ2=−6.
Using it in the equation A·v=λ·v, we have the following system of equations
4·v1+8·v2−4·v3=−6·v1
4·v1−2·v2−4·v3=−6·v2
−8·v1−4·v2−10·v3=−6·v3
Bringing everything to the left side, get this system
10·v1+8·v2−4·v3=0
4·v1+4·v2−4·v3=0
−8·v1−4·v2−4·v3=0
Dividing the first equation by 2, the second - by 4 and the third - by −4, all these equations yield a simpler system
5·v1+4·v2−2·v3=0
v1+v2−v3=0
2·v1+v2+v3=0

As expected, the determinant of the coefficients of this system is zero and the equations are linearly dependent (the third equation equals the first minus triple the second one), as it should be, because otherwise the only solution would be a null-vector.
So, we drop the first equation and resolve the remaining two equations for v2 and v3 in terms of v1.
Here is what remains
v1+v2−v3=0
2·v1+v2+v3=0

Find v3 from the first equation
v3 = v1+v2
Substituting v3 into the second equation:
2·v1+v2+(v1+v2)=0
or
3·v1+2·v2=0

Therefore,
v2=−(3/2)·v1 and v3 = −(1/2)·v1

Regardless of the value of v1, vector ||v1,−(3/2)·v1,−(1/2)·v1|| is an eigenvector.
Set v1=2 for simplicity, and vector ||2,−3,−1|| should be an eigenvector.

Let's check it out.
|| 4   8  −4||   || 2||   ||   8−24+4||   ||−12||        || 2||
|| 4  −2  −4|| · ||−3|| = ||    8+6+4|| = || 18|| = −6 · ||−3||
||−8  −4 −10||   ||−1||   ||−16+12+10||   ||  6||        ||−1||
Which confirms that vector ||2,−3,−1|| (and any collinear to it) is an eigenvector with −6 as its eigenvalue.

Finally, let's do the same for the third eigenvalue 10.
Using it in the equation A·v=λ·v, we have the following system of equations
4·v1+8·v2−4·v3=10·v1
4·v1−2·v2−4·v3=10·v2
−8·v1−4·v2−10·v3=10·v3
Bringing everything to the left side, get this system
−6·v1+8·v2−4·v3=0
4·v1−12·v2−4·v3=0
−8·v1−4·v2−20·v3=0
Dividing the first equation by −2, the second - by 4 and the third - by −4, all these equations yield a simpler system
3·v1−4·v2+2·v3=0
v1−3·v2−v3=0
2·v1+v2+5·v3=0

As expected, the determinant of the coefficients of this system is zero and the equations are linearly dependent (the first equation multiplied by 7 equals the second multiplied by 11 plus the third multiplied by 5), as it should be, because otherwise the only solution would be a null-vector.
So, we drop the first equation and resolve the remaining two equations for v2 and v3 in terms of v1.
Here is what remains
v1−3·v2−v3=0
2·v1+v2+5·v3=0

Find v3 from the first equation
v3 = v1−3·v2
Substituting v3 into the second equation:
2·v1+v2+5·(v1−3·v2)=0
or
7·v1−14·v2=0
or
v1−2·v2=0

Therefore,
v2=(1/2)·v1 and v3 = −(1/2)·v1

Regardless of the value of v1, vector ||v1,(1/2)·v1,−(1/2)·v1|| is an eigenvector.
Set v1=2 for simplicity, and vector ||2,1,−1|| should be an eigenvector.

Let's check it out.
|| 4   8  −4||   || 2||   ||   8+8+4||   || 20||        || 2||
|| 4  −2  −4|| · || 1|| = ||   8−2+4|| = || 10|| = 10 · || 1||
||−8  −4 −10||   ||−1||   ||−16−4+10||   ||−10||        ||−1||
Which confirms that vector ||2,1,−1|| (and any collinear to it) is an eigenvector with 10 as its eigenvalue.

Answer A

Matrix
|| 4   8  −4||
|| 4  −2  −4||
||−8  −4 −10||
has three eigenvalues:
−12, −6 and 10.
Their corresponding eigenvectors are:
||1,6,16||, ||2,−3,−1|| and ||2,1,−1||.
Of course, any vector collinear to a particular eigenvector would also be an eigenvector with the same eigenvalue.
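The answer can be double-checked with a few lines of Python. The helper names mat_vec and is_eigenpair are hypothetical; the check simply compares A·v with λ·v for each pair and for one scaled eigenvector.

```python
A = [[4, 8, -4],
     [4, -2, -4],
     [-8, -4, -10]]

# (eigenvalue, eigenvector) pairs from the answer
pairs = [(-12, [1, 6, 16]), (-6, [2, -3, -1]), (10, [2, 1, -1])]

def mat_vec(m, v):
    """Product of a matrix (list of rows) and a column-vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def is_eigenpair(m, lam, v):
    """True if m·v equals lam·v component by component."""
    return mat_vec(m, v) == [lam * c for c in v]

checks = [is_eigenpair(A, lam, v) for lam, v in pairs]
# A vector collinear to an eigenvector keeps the same eigenvalue
scaled_ok = is_eigenpair(A, -12, [3 * c for c in [1, 6, 16]])
print(checks, scaled_ok)
```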

Monday, September 16, 2024

Matrices+ 02 - Eigenvalues: UNIZOR.COM - Math+ & Problems - Matrices

Notes to a video lecture on http://www.unizor.com

Matrices+ 02
Matrix Eigenvalues


The concepts addressed in this lecture for two-dimensional real case are as well applicable to N-dimensional spaces and even to real or complex abstract vector spaces with linear transformations defined there.
Presentation in a two-dimensional real space is chosen for its relative simplicity and easy exemplification.

Let's consider a 2⨯2 matrix A as a linear operator in the two-dimensional Euclidean vector space. In other words, multiplication of any vector v on a coordinate plane by this 2⨯2 matrix A linearly transforms it into another vector on the plane w=A·v.
Assume, matrix A is
||5  6||
||6 10||

Let's see how this linear operator works, if applied to different vectors.

We will use a row-vector notation in the text for compactness, but column-vector notation in the transformation examples below.
The coordinates of our vectors we will enclose into double bars, like matrices, because a row-vector is a matrix with only one row, and a column-vector is a matrix with only one column.

Our first example of a vector to apply this linear transformation is v=||1,1||.
||5  6||   ||1||   ||11||
||6 10|| · ||1|| = ||16||
Obviously, the resulting vector w=||11,16|| and the original one v=||1,1|| are not collinear.

Applied to a different vector v=||3,−2||, we obtain a somewhat unexpected result
||5  6||   || 3||   || 3||
||6 10|| · ||−2|| = ||−2||
Interestingly, the resulting vector w=||3,−2|| and the original one are the same. So, this operator leaves this particular vector in place. In other words, it retains the direction of this vector and multiplies its magnitude by a factor of 1.

Finally, let's apply our operator to a vector v=||2,3||.
||5  6||   ||2||   ||28||
||6 10|| · ||3|| = ||42||
Notice, the resulting vector w=||28,42|| is the original one v=||2,3|| multiplied by 14. So, this operator transforms this particular vector to a collinear one, just longer in magnitude by a factor of 14.

As we see, for this particular matrix we found two vectors that, if transformed by this matrix as by a linear operator, retain their direction, while changing the magnitude by some factor.
These vectors are called eigenvectors. For each eigenvector there is a factor that characterizes the change in its magnitude if this matrix acts on it as an operator. This factor is called eigenvalue. This eigenvalue in the example above was 1 for v=||3,−2||, and 14 for v=||2,3||.

There are some questions one might ask.
1. Are there always some particular vectors that retain the direction if transformed by some particular matrix?
2. If yes, how to find them and how to find the corresponding multiplication factors?
3. How many such vectors exist, if any?
4. How to find all the multiplication factors for a particular matrix transformation?

Let's analyze the linear transformation by a matrix that leaves the direction of a vector without change, just changes the magnitude by some factor λ.

Assume, we have a matrix A=||ai,j||, where i,j∈{1,2}, in our two-dimensional Euclidean space.
This matrix converts any vector v=||v1,v2|| into some other vector, but we are looking for such vector v that is converted by this matrix into a collinear one.
If matrix A transforms vector v to a collinear one with the magnitude of the original one multiplied by a factor λ, the following matrix equation must hold
A·v = λ·v
or in coordinate form
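$$\begin{Vmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{Vmatrix} \cdot \begin{Vmatrix} v_1 \\ v_2 \end{Vmatrix} = \lambda \cdot \begin{Vmatrix} v_1 \\ v_2 \end{Vmatrix}$$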
which is equivalent to
(a1,1−λ)·v1 + a1,2·v2 = 0
a2,1·v1 + (a2,2−λ)·v2 = 0

This is a system of two equations with three unknowns λ, v1 and v2. For any fixed value of λ it is a homogeneous linear system for v1 and v2.

One trivial solution would be v1=0 and v2=0, in which case λ can take any value.
This is not a case worthy of analyzing.

If the matrix of coefficients of this system has a non-zero determinant, this trivial solution would be the only one.
Therefore, if we are looking for a non-trivial solution, the matrix's determinant must be zero, which gives a specific condition on the value of λ.

Therefore, a necessary condition for the existence of a vector v other than the null-vector is
(a1,1−λ)·(a2,2−λ) − a1,2·a2,1 = 0
or
λ² − (a1,1+a2,2)·λ + a1,1·a2,2−a1,2·a2,1 = 0


Since we are looking for real values of λ, we have to examine the discriminant D of this quadratic equation.
D = (a1,1+a2,2)² − 4·(a1,1·a2,2−a1,2·a2,1) = (a1,1−a2,2)² + 4·a1,2·a2,1

If D is negative, there are no real solutions for λ.
If D is zero, there is one real solution for λ.
If D is positive, there are two real solutions for λ.
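The three cases for the discriminant can be sketched in Python as follows. The function name `real_eigenvalues` is an assumption made for this illustration; it simply applies the quadratic formula to the equation for λ derived above.

```python
import math

def real_eigenvalues(a11, a12, a21, a22):
    """Real roots of  λ² − (a11+a22)·λ + (a11·a22 − a12·a21) = 0."""
    trace = a11 + a22
    # Discriminant in the simplified form derived above.
    d = (a11 - a22) ** 2 + 4 * a12 * a21
    if d < 0:
        return []                      # no real solutions
    if d == 0:
        return [trace / 2]             # one real solution
    root = math.sqrt(d)
    return [(trace - root) / 2, (trace + root) / 2]  # two real solutions

print(real_eigenvalues(5, 6, 6, 10))   # the matrix from the examples above
print(real_eigenvalues(0, -1, 1, 0))   # rotation by 90 degrees, D < 0
print(real_eigenvalues(2, 0, 0, 2))    # D = 0
```

The rotation by 90° is included as an example of a matrix with negative D: it turns every non-null vector, so no direction is retained.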

Consider now that we have determined λ and would like to find vectors transformed into collinear ones by matrix A with this exact factor of change in magnitude.

If some vector v=||v1,v2|| that is transformed into a collinear one with a factor λ exists, vector s·v, where s is any real non-zero number, would have exactly the same quality because of associativity and commutativity of multiplication by a scalar.
A·(s·v) = (A·s)·v = (s·A)·v = s·(A·v) = s·(λ·v) = λ·(s·v)


Therefore, we don't need to determine the exact values of v1 and v2. We only need to determine the direction of vector v=||v1,v2||, and this direction is determined by the ratio v1/v2 or v2/v1 (both are needed to cover the cases when one of the components is zero).

If v2≠0, the directions of a vector v and that of vector ||v1/v2,1|| are the same.
If v1≠0, the directions of a vector v and that of vector ||1,v2/v1|| are the same.

From this it follows that, firstly, we can search for eigenvectors among those with v2≠0, restricting our search to vectors ||x,1||, where x=v1/v2.
Then we can search for eigenvectors among those with v1≠0, restricting our search to vectors ||1,x||, where x=v2/v1.
In both cases we will have to solve a system of two equations with two unknowns λ and x.

Searching for vectors ||x,1||
In this case the matrix equation that might deliver the required vector looks like this
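$$\begin{Vmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{Vmatrix} \cdot \begin{Vmatrix} x \\ 1 \end{Vmatrix} = \lambda \cdot \begin{Vmatrix} x \\ 1 \end{Vmatrix}$$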
Performing the matrix by vector multiplication on the left side and scalar by vector on the right side and equating each component, we obtain a system of two equations with two unknowns - λ and x:
a1,1·x+a1,2·1 = λ·x
a2,1·x+a2,2·1 = λ·1

Take the expression for λ given by the second equation and substitute it into the right side of the first equation, obtaining a quadratic equation for x:
a1,1·x+a1,2 = (a2,1·x+a2,2)·x
or
a2,1·x² + (a2,2−a1,1)·x − a1,2 = 0
The two solutions x1,2 of this equation, assuming they are real values, produce two vectors ||x1,1|| and ||x2,1||, each of which satisfies the condition of collinearity after the matrix transformation.
Generally speaking, the factor λ will be different for each such vector.

Searching for vectors ||1,x||
In this case the matrix equation that might deliver the required vector looks like this
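$$\begin{Vmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{Vmatrix} \cdot \begin{Vmatrix} 1 \\ x \end{Vmatrix} = \lambda \cdot \begin{Vmatrix} 1 \\ x \end{Vmatrix}$$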
Performing the matrix by vector multiplication on the left side and scalar by vector on the right side and equating each component, we obtain a system of two equations with two unknowns - λ and x:
a1,1·1+a1,2·x = λ·1
a2,1·1+a2,2·x = λ·x

Take the expression for λ given by the first equation and substitute it into the right side of the second equation, obtaining a quadratic equation for x:
a2,1·1+a2,2·x = (a1,1·1+a1,2·x)·x
or
a1,2·x² + (a1,1−a2,2)·x − a2,1 = 0
The two solutions x1,2 of this equation, assuming they are real values, produce two vectors ||1,x1|| and ||1,x2||, each of which satisfies the condition of collinearity after the matrix transformation.
Generally speaking, the factor λ will be different for each such vector.
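Both searches can be sketched in Python. The names `solve_quadratic` and `eigen_directions` are hypothetical helpers introduced for this illustration. The sketch also assumes that the equation may degenerate to a linear one when its leading coefficient is zero, a case the derivation above does not single out.

```python
import math

def solve_quadratic(a, b, c):
    """Real solutions of a·x² + b·x + c = 0, in increasing order."""
    if a == 0:
        if b == 0:
            return []          # no unique solution: either none or every x fits
        return [-c / b]        # the equation is linear
    d = b * b - 4 * a * c
    if d < 0:
        return []
    r = math.sqrt(d)
    return sorted({(-b - r) / (2 * a), (-b + r) / (2 * a)})

def eigen_directions(a11, a12, a21, a22):
    # Vectors ||x,1||:  a21·x² + (a22−a11)·x − a12 = 0
    x_one = [(x, 1.0) for x in solve_quadratic(a21, a22 - a11, -a12)]
    # Vectors ||1,x||:  a12·x² + (a11−a22)·x − a21 = 0
    one_x = [(1.0, x) for x in solve_quadratic(a12, a11 - a22, -a21)]
    return x_one, one_x

x_one, one_x = eigen_directions(5, 6, 6, 10)
print(x_one)   # candidates of the form ||x,1||
print(one_x)   # candidates of the form ||1,x||
```

The eigenvalue for each candidate can then be read from the remaining equation: λ = a2,1·x + a2,2 for ||x,1|| and λ = a1,1 + a1,2·x for ||1,x||.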

Once again, let's emphasize important definitions.
Vectors transformed into collinear ones by a matrix of transformation are called eigenvectors or characteristic vectors for this matrix.
The factor λ corresponding to some eigenvector is called an eigenvalue or characteristic value of the matrix and this eigenvector.

Let's determine eigenvectors and eigenvalues for the matrix
$$A = \begin{Vmatrix} 5 & 6 \\ 6 & 10 \end{Vmatrix}$$
used as an example above.

The quadratic equation to determine the multiplier λ for this matrix is
λ² − (a1,1+a2,2)·λ + a1,1·a2,2−a1,2·a2,1 = 0

which amounts to
λ² − 15λ + 14 = 0
with solutions
λ1 = 1 and λ2 = 14

Let's find the eigenvectors of this matrix.
The quadratic equation for eigenvectors of type ||x,1|| is
6x² + (10−5)x − 6 = 0 or
6x² + 5x − 6 = 0
Solutions are
x1,2 = (1/12)·(−5±√(25+4·36)) = (1/12)·(−5±13)

Therefore,
x1 = 2/3
x2 = −3/2
Two eigenvectors are:
v1 = ||2/3,1|| which is collinear to vector ||2,3|| used in the example above and
v2 = ||−3/2,1|| which is collinear to vector ||3,−2|| used in the example above.

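The matrix transformations of these eigenvectors are
$$\begin{Vmatrix} 5 & 6 \\ 6 & 10 \end{Vmatrix} \cdot \begin{Vmatrix} 2/3 \\ 1 \end{Vmatrix} = \begin{Vmatrix} 28/3 \\ 14 \end{Vmatrix}$$
The resulting vector ||28/3,14|| equals 14·||2/3,1||, which means that eigenvector ||2/3,1|| has eigenvalue 14.
$$\begin{Vmatrix} 5 & 6 \\ 6 & 10 \end{Vmatrix} \cdot \begin{Vmatrix} -3/2 \\ 1 \end{Vmatrix} = \begin{Vmatrix} -3/2 \\ 1 \end{Vmatrix}$$
The resulting vector ||−3/2,1|| equals the eigenvector ||−3/2,1|| itself, which means that eigenvector ||−3/2,1|| has eigenvalue 1.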

Not surprisingly, both eigenvectors found above have eigenvalues already found (1 and 14).

The quadratic equation for eigenvectors of type ||1,x|| is
6x² + (5−10)x − 6 = 0 or
6x² − 5x − 6 = 0
Solutions are
x1,2 = (1/12)·(5±√(25+4·36)) = (1/12)·(5±13)

Therefore,
x1 = 3/2
x2 = −2/3
Two eigenvectors are:
v1 = ||1,3/2|| which is collinear to vector ||2,3|| used in the example above and
v2 = ||1,−2/3|| which is collinear to vector ||3,−2|| used in the example above.
So, we did not gain any new eigenvectors or eigenvalues by searching for vectors of the form ||1,x||.
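The four eigenvectors found above can be checked exactly with rational arithmetic. This is a small sketch using Python's standard `fractions` module; the helper `apply` and the dictionary `candidates` are names chosen here for illustration.

```python
from fractions import Fraction as F

A = [[5, 6], [6, 10]]

# Hypothetical helper: multiply a 2x2 matrix by a 2-vector.
def apply(m, v):
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

# Eigenvector -> expected eigenvalue, as computed in the text.
candidates = {
    (F(2, 3), F(1)): 14,    # ||2/3,1||
    (F(-3, 2), F(1)): 1,    # ||-3/2,1||
    (F(1), F(3, 2)): 14,    # ||1,3/2||
    (F(1), F(-2, 3)): 1,    # ||1,-2/3||
}

for v, lam in candidates.items():
    # A·v must equal λ·v component by component.
    ok = apply(A, v) == (lam * v[0], lam * v[1])
    print(tuple(str(c) for c in v), "eigenvalue", lam, "confirmed:", ok)
```

Fractions are used instead of floating-point numbers so that the comparison A·v = λ·v is exact.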

The above calculations showed that for a given matrix we have two eigenvectors, each with its own eigenvalue.

Based on these calculations, we can now answer the questions presented before.

Q1. Are there always some particular vectors that retain the direction if transformed by some particular matrix?
A1. Not always, but only if the quadratic equations for x
a2,1·x² + (a2,2−a1,1)·x − a1,2 = 0
and
a1,2·x² + (a1,1−a2,2)·x − a2,1 = 0
where ||ai,j|| (i,j∈{1,2}) is a matrix of transformation, have real solutions.

Q2. If yes, how to find them?
A2. Solve the quadratic equations above. For each real solution x of the first equation, vector ||x,1|| is an eigenvector and, for each real solution x of the second equation, vector ||1,x|| is an eigenvector. Then apply the matrix of transformation to each eigenvector ||x,1|| or ||1,x|| and compare the result with this vector. It should be equal to some eigenvalue λ multiplied by this eigenvector.

Q3. How many such vectors exist, if any?
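A3. As many distinct directions as the quadratic equations above have real solutions, but no more than two, except for the degenerate case examined in Problem A below, where every vector is an eigenvector. Each such direction represents a whole line of collinear eigenvectors.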
Incidentally, in three-dimensional case our equations will be polynomial of the 3rd degree, and the number of solutions will be restricted to three.
In N-dimensional case this maximum number will be N.

Q4. How to find all the multiplication factors for a particular matrix transformation?
A4. Solve the quadratic equation for eigenvalues
λ² − (a1,1+a2,2)·λ + a1,1·a2,2−a1,2·a2,1 = 0
which can have 0, 1 or 2 real solutions.

The concept of eigenvectors and eigenvalues (characteristic vectors and characteristic values) can be extended to N-dimensional Euclidean vector spaces and even to abstract vector spaces, like, for example, a set of all real functions integrable on a segment [0,1].
The detailed analysis of these cases is, however, beyond the current course, which is aimed, primarily, at introducing advanced concepts.


Problem A
Research conditions when a diagonal matrix (only elements along the main diagonal are not zero) has eigenvalues.

Solution A
Matrix of transformation A=||ai,j|| has zeros for i≠j.
So, it looks like this
$$\begin{Vmatrix} a_{1,1} & 0 \\ 0 & a_{2,2} \end{Vmatrix}$$

The equation for eigenvalues in this (a1,2=a2,1=0) case is
λ² − (a1,1+a2,2)·λ + a1,1·a2,2 = 0
with immediately obvious solutions
λ1=a1,1 and λ2=a2,2
So, the values along the main diagonal of a diagonal matrix are the eigenvalues of this matrix.

Determine the eigenvectors now among vectors ||x,1||.
Original quadratic equation for this case is
a2,1·x² + (a2,2−a1,1)·x − a1,2 = 0

With a2,1=a1,2=0 it looks simpler:
(a2,2−a1,1)·x = 0
From this we conclude that, if a2,2≠a1,1, the only solution is x=0, so our eigenvector is ||0,1||.
The eigenvalue for this eigenvector is a2,2.
If a2,2=a1,1, any x is good enough, so any vector is an eigenvector.

Determine the eigenvectors now among vectors ||1,x||.
Original quadratic equation for this case is
a1,2·x² + (a1,1−a2,2)·x − a2,1 = 0

With a2,1=a1,2=0 it looks simpler:
(a1,1−a2,2)·x = 0
From this we conclude that, if a2,2≠a1,1, the only solution is x=0, so our eigenvector is ||1,0||.
The eigenvalue for this eigenvector is a1,1.
If a2,2=a1,1, any x is good enough, so any vector is an eigenvector.

Answer A
If the matrix of transformation is diagonal
$$\begin{Vmatrix} a_{1,1} & 0 \\ 0 & a_{2,2} \end{Vmatrix}$$
and a2,2≠a1,1,
the two eigenvectors are base unit vectors and the eigenvalues are a1,1 for base unit vector ||1,0|| and a2,2 for base unit vector ||0,1||.
In the case of a1,1=a2,2 any vector is an eigenvector with eigenvalue a1,1.
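Answer A can be illustrated numerically. The following is a minimal sketch with arbitrarily chosen diagonal entries (3 and 7, then 4 and 4); the helper names `apply` and `diagonal` are assumptions made for this example.

```python
# Hypothetical helper: multiply a 2x2 matrix by a 2-vector.
def apply(m, v):
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

def diagonal(a11, a22):
    return [[a11, 0], [0, a22]]

# Different diagonal entries: the base unit vectors are eigenvectors.
D = diagonal(3, 7)
print(apply(D, (1, 0)))   # 3·||1,0||
print(apply(D, (0, 1)))   # 7·||0,1||
print(apply(D, (1, 1)))   # not collinear with ||1,1||

# Equal diagonal entries: any vector is simply multiplied by that entry.
S = diagonal(4, 4)
print(apply(S, (2, -5)))  # 4·||2,-5||
```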


Problem B
Prove that a symmetrical matrix always has real eigenvectors.

Solution B

Matrix of transformation A=||ai,j|| is symmetrical, which means a1,2=a2,1.

Recall that a necessary condition for existence of real eigenvalues λ is
(a1,1−λ)·(a2,2−λ) − a1,2· a2,1 = 0
or
λ² − (a1,1+a2,2)·λ + a1,1·a2,2−a1,2·a2,1 = 0


Since we are looking for real values of λ, we have to examine the discriminant D of this quadratic equation.
D = (a1,1−a2,2)²+4·a1,2·a2,1
Since a1,2=a2,1, their product equals a1,2² and is non-negative, which makes the whole discriminant, a sum of two squares, non-negative.
If D is zero, there is one real solution for λ.
If D is positive, there are two real solutions for λ.
So, one or two solutions always exist.
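As a numerical illustration (not a substitute for the proof), the discriminant can be evaluated for many randomly chosen symmetrical matrices. This is a sketch only: the function name `discriminant`, the integer range and the fixed random seed are arbitrary choices made here.

```python
import random

def discriminant(a11, a12, a21, a22):
    # D = (a11 − a22)² + 4·a12·a21, as derived above.
    return (a11 - a22) ** 2 + 4 * a12 * a21

random.seed(0)  # fixed seed so that the run is repeatable
all_non_negative = True
for _ in range(1000):
    a11, a12, a22 = (random.randint(-50, 50) for _ in range(3))
    # Symmetrical matrix: the same value is used for a12 and a21.
    if discriminant(a11, a12, a12, a22) < 0:
        all_non_negative = False

print("D >= 0 for all sampled symmetrical matrices:", all_non_negative)

# For contrast, a non-symmetrical matrix (rotation by 90 degrees) has D < 0.
print(discriminant(0, -1, 1, 0))
```

Integer entries are used so that the comparison with zero is exact.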