Notes to a video lecture on UNIZOR.COM
Laws of Newton - Problem 1
Problem A
Prove that gravitational force of a point mass is conservative.
That is, prove that the work performed by the gravitational force of a point mass on an object moving along a trajectory from point A to point B depends only on the positions of these endpoints and is independent of the trajectory the object moves along.
Proof
As was proven in the earlier lecture Field, Potential of this chapter of the course, to prove that a force is conservative it is sufficient to show that the force is the negative gradient of some scalar function called the potential.
According to Newton's Law of Gravitation, the force of gravity produced by a point mass M and attracting a point mass m positioned at a distance r from mass M is directed along the line connecting them towards mass M, and its magnitude equals
F = G·M·m/r²
Let's define Cartesian coordinates with a center at a point mass M.
The coordinates of point mass M are (0,0,0).
Vector r = ||x,y,z|| represents the position of point mass m in this system, and the scalar r denotes its length, that is, the distance from M to m.
Now we can express the force as a vector in this system using the fact that r/r (the position vector r divided by its length r) is a unit vector directed from mass M to mass m.
F = −(G·M·m)·(r/r³)
In this formula we have added to the magnitude of force a multiplier r/r that represents the unit radial vector directed from M to m and the minus sign to change the direction of the vector towards mass M because the gravity attracts.
Now we will define a scalar function whose gradient, up to a constant multiplier, equals the vector of gravitational force.
Consider for now only the variable part of the force vector, r/r³.
Its representation in coordinate form is ||x/r³,y/r³,z/r³||, where r=(x²+y²+z²)^(1/2).
Let's define a function
R(x,y,z) = R(r) =
= 1/r = 1/(x²+y²+z²)^(1/2) = (x²+y²+z²)^(−1/2).
Its gradient
∇R = ||∂R/∂x,∂R/∂y,∂R/∂z||
can be explicitly calculated as
∂R/∂x = −½(x²+y²+z²)^(−3/2)·2x =
= −x/(x²+y²+z²)^(3/2) = −x/r³
Analogously,
∂R/∂y = −½(x²+y²+z²)^(−3/2)·2y =
= −y/(x²+y²+z²)^(3/2) = −y/r³
∂R/∂z = −½(x²+y²+z²)^(−3/2)·2z =
= −z/(x²+y²+z²)^(3/2) = −z/r³
Comparing this with the expression for the force of gravity, we see that the vector of force
F = −(G·M·m)·(r/r³)
and the gradient of the function R(x,y,z) defined above
∇R=||−x/r³,−y/r³,−z/r³||=−r/r³
differ only by a constant multiplier.
Hence, F = (G·M·m)·∇R
Therefore, scalar function
U(r) = −(G·M·m)·R(r) =
= −(G·M·m)/r
where r=(x²+y²+z²)^(1/2)
has the required property: its negative gradient equals the vector of the gravitational force.
From this it follows that the force of gravity is conservative.
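The relation F = −∇U can also be checked numerically. The sketch below is only an illustration under stated assumptions: units are chosen so that G·M·m = 1, the sample point is arbitrary, and the function names (`potential`, `force`, `negative_gradient`) are hypothetical helpers, not part of the lecture. It compares the analytic force with a central-difference estimate of the negative gradient of U.

```python
import math

GMm = 1.0  # assumption: units chosen so that G*M*m = 1

def potential(x, y, z):
    """U = -G*M*m / r, with the source mass M at the origin."""
    return -GMm / math.sqrt(x * x + y * y + z * z)

def force(x, y, z):
    """F = -(G*M*m) * r_vec / r^3, directed from m towards M."""
    r = math.sqrt(x * x + y * y + z * z)
    k = -GMm / r**3
    return (k * x, k * y, k * z)

def negative_gradient(f, x, y, z, h=1e-5):
    """Central-difference estimate of -grad f at the point (x, y, z)."""
    return (
        -(f(x + h, y, z) - f(x - h, y, z)) / (2 * h),
        -(f(x, y + h, z) - f(x, y - h, z)) / (2 * h),
        -(f(x, y, z + h) - f(x, y, z - h)) / (2 * h),
    )

point = (1.0, 2.0, -2.0)  # arbitrary sample position of mass m (here r = 3)
analytic = force(*point)
numeric = negative_gradient(potential, *point)

for name, a, n in zip("xyz", analytic, numeric):
    print(f"F{name}: analytic {a:+.8f}   numeric {n:+.8f}")
```

The two columns should agree to within the accuracy of the finite-difference step, which is what the proof above predicts.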
Note 1
This function depends not only on the field properties (mass M of the source of the field and distance r from it), but also depends linearly on a property of the other object (its mass m).
To make the potential a property of the field only, the function U(r) is called a field potential when mass m is a unit of mass, in which case
U(r) = −G·M/r
Note 2
We can prove that gravity is a conservative force directly by following the same logic we used to prove that, if the force can be represented by a gradient of a potential, the work performed by this force is independent of the trajectory.
Consider now any two points in space A and B and some trajectory that point mass m takes to move from A to B.
The work done by a force on an object moving along some path is the sum of the small amounts of work the force performs on each small piece of the trajectory. By definition, it equals an infinite sum (that is, an integral along the path) of infinitesimal increments (that is, differentials) of work, each of which is the scalar product of the force vector F and the infinitesimal increment of position along the trajectory dr
W[AB] = ∫[AB]dW = ∫[AB]F·dr
To calculate the scalar product F·dr, we can express both in coordinate form
F = −(G·M·m/r³)·||x,y,z|| =
= −(G·M·m/r³)·r
dr = ||dx,dy,dz||
F·dr = −(G·M·m/r³)·
·(x·dx+y·dy+z·dz) - a scalar
Notice that
x·dx + y·dy + z·dz =
d(x²/2 + y²/2 + z²/2) =
= (1/2)d(r²) = r·dr
Therefore,
F·dr = −(G·M·m/r³)·r·dr =
= −(G·M·m/r²)·dr =
= (G·M·m)·d(1/r) =
= d(G·M·m/r)
Here we used d(1/r) = −dr/r². In terms of the potential U(r) = −G·M·m/r defined above, this reads F·dr = −dU.
Since F·dr is a full differential of some function, its integral along a path from A to B equals the difference of the values of this function at the endpoints of integration
∫[AB]F·dr = ∫[AB]d(G·M·m/r) =
= G·M·m/r(B) − G·M·m/r(A) =
= U(A) − U(B)
which proves that the work performed by the force of gravity on an object moving along some trajectory is independent of the trajectory, that is, the gravitational force is conservative.
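Path independence can be illustrated numerically. The following sketch assumes units with G·M·m = 1, two arbitrarily chosen endpoints A and B, and two hypothetical paths between them (a straight segment and a detour). The helper `work` approximates the line integral of F·dr by a midpoint rule; none of these names come from the lecture. Both results are compared with U(A) − U(B), where U(r) = −G·M·m/r.

```python
import math

GMm = 1.0  # assumption: units chosen so that G*M*m = 1

def force(p):
    """Gravitational force on m at position p, source mass M at the origin."""
    x, y, z = p
    r = math.sqrt(x * x + y * y + z * z)
    k = -GMm / r**3
    return (k * x, k * y, k * z)

def potential(p):
    """U = -G*M*m / r."""
    x, y, z = p
    return -GMm / math.sqrt(x * x + y * y + z * z)

def work(path, n=20000):
    """Midpoint-rule approximation of the integral of F·dr along path(t), 0 <= t <= 1."""
    total = 0.0
    prev = path(0.0)
    for i in range(1, n + 1):
        cur = path(i / n)
        mid = tuple((a + b) / 2 for a, b in zip(prev, cur))
        f = force(mid)
        total += sum(fc * (b - a) for fc, a, b in zip(f, prev, cur))
        prev = cur
    return total

A = (1.0, 0.0, 0.0)
B = (0.0, 2.0, 1.0)

def straight(t):
    """Straight segment from A to B."""
    return tuple(a + t * (b - a) for a, b in zip(A, B))

def detour(t):
    """Same endpoints, plus a bump that vanishes at t = 0 and t = 1."""
    bump = math.sin(math.pi * t)
    x, y, z = straight(t)
    return (x + 0.5 * bump, y - 0.3 * bump, z + 2.0 * bump)

w_straight = work(straight)
w_detour = work(detour)
expected = potential(A) - potential(B)

print("work along straight path:", w_straight)
print("work along detour path:  ", w_detour)
print("U(A) - U(B):             ", expected)
```

Both paths stay away from the origin, where the force is undefined. The work is negative here because B is farther from M than A, so the object moves against the attraction.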
Sunday, October 13, 2024
Friday, October 11, 2024
Physics+ Potential Theorem: UNIZOR.COM - Classic Physics+ - Laws of Newton
Potential Theorem
As you recall, we have defined a field as an area (a subset) of points in our three-dimensional space with a vector of force F(x,y,z), called field intensity, defined at each point of this area and a real scalar function U(x,y,z), called potential, defined at all these points, such that the following equation between the force and the potential holds at each point P(x,y,z):
F(x,y,z) = −∇U(x,y,z)
where the symbol ∇ signifies the gradient of a function U(x,y,z), that is, the vector of the function's partial derivatives with respect to each coordinate
∇U(x,y,z) =
= ||∂U/∂x, ∂U/∂y, ∂U/∂z||
We have also proven that the work performed by such a field intensity force on an object moving along some trajectory depends only on the endpoints of the object's movement and is independent of the path chosen between these two endpoints.
Thus, independence of the work from the trajectory is a necessary condition for the existence of a field potential function U(x,y,z) whose gradient with a minus sign equals the field intensity force F(x,y,z).
In this lecture we will prove that the condition of work being independent of the trajectory between the endpoints of an object's movement is also a sufficient condition for the field force to be the negative gradient of some scalar function, the field potential.
Theorem
A vector of force F(x,y,z) is defined at each point P(x,y,z) of a certain area within our three-dimensional Euclidean space with Cartesian coordinates.
An object, acted upon by this force, moves within this area.
It's given that the work performed by this force on an object moving along some trajectory between any two points depends only on positions of these endpoints and is independent of the trajectory between them.
Prove that there exists a scalar function of position U(x,y,z), called potential, such that the vector of force equals the negative gradient of this potential, that is
F(x,y,z) = −∇U(x,y,z) =
= −||∂U/∂x, ∂U/∂y, ∂U/∂z||
Since vector F(x,y,z) at point P(x,y,z) can be expressed in coordinate form as
||Fx(x,y,z),Fy(x,y,z),Fz(x,y,z)||
the above statement can be formulated in coordinate form as
Fx(x,y,z) = −∂U(x,y,z)/∂x
Fy(x,y,z) = −∂U(x,y,z)/∂y
Fz(x,y,z) = −∂U(x,y,z)/∂z
Proof
As a proof, we will explicitly define a potential function and prove that it satisfies the required equalities.
Choose arbitrarily some fixed point A(x0,y0,z0) in the area where our force is defined; we will use it as the beginning of every trajectory considered below.
Choose any other point B(x,y,z) there; this is where we will explicitly define the scalar function U(x,y,z) (the potential) that satisfies the conditions of the problem.
Let's choose any particular path from point A(x0,y0,z0) to point B(x,y,z) and define the function U(x,y,z) (the field potential) at point B(x,y,z) as the work performed by force F during an object's movement along the chosen path from A(x0,y0,z0) to B(x,y,z), taken with a negative sign.
This definition of a potential is legitimate since the work performed by force F does not depend on the path chosen, but only on the positions of points A and B.
So, by definition, for any point B(x,y,z) where the field is defined
U(B) = U(x,y,z) = −W[AB]
Let's choose a point C(x+dx,y+dy,z+dz) at an infinitesimal distance from point B(x,y,z).
Let dr be the vector of displacement from B to C:
dr = ||dx,dy,dz||
According to the definition of work, when the field intensity force F acts on an object that moves from point B(x,y,z) to an infinitesimally close point C(x+dx,y+dy,z+dz), the infinitesimal amount of work performed by the force is equal to
dW[BC] = F(x,y,z)·dr =
= Fx(x,y,z)·dx + Fy(x,y,z)·dy +
+ Fz(x,y,z)·dz
At the same time, using Lemma D of the previous lecture Work Lemmas, this same amount of work equals
dW[BC] = W[AC] − W[AB]
where fixed point A(x0,y0,z0) was arbitrarily chosen above.
Expressions W[AB] and W[AC] represent the amounts of work the field force performs if an object moves from point A to B and from point A to C, respectively. These same amounts were used to define the potential at these points
W[AB] = −U(x,y,z)
W[AC] = −U(x+dx,y+dy,z+dz)
Therefore,
dW[BC]= −U(x+dx,y+dy,z+dz)+
+ U(x,y,z) = −dU(x,y,z)
where dU(x,y,z) is a full differential (infinitesimal increment) of function U(x,y,z) on an interval from B(x,y,z) to C(x+dx,y+dy,z+dz).
As known from Calculus, the full differential of a function can be expressed in terms of partial derivatives and differentials of arguments (see Partial Derivatives - Basic Properties lecture of the Calculus chapter in the course Math 4 Teens on UNIZOR.COM).
Therefore,
dW[BC] = −dU(x,y,z) =
= −(∂U(x,y,z)/∂x)·dx −
− (∂U(x,y,z)/∂y)·dy −
− (∂U(x,y,z)/∂z)·dz
Comparing this expression of dW[BC] with the one in terms of the field force components above, we come to an equation
Fx(x,y,z)·dx + Fy(x,y,z)·dy +
+ Fz(x,y,z)·dz =
= −(∂U(x,y,z)/∂x)·dx −
− (∂U(x,y,z)/∂y)·dy −
− (∂U(x,y,z)/∂z)·dz
While dx, dy and dz are infinitesimal increments of position along some infinitesimal displacement for each coordinate, the direction of this displacement can be chosen freely.
Choosing infinitesimal dx and dy=0, dz=0 leads to an equality
Fx(x,y,z) = −∂U(x,y,z)/∂x
Similarly, leaving only dy or dz as infinitesimal increments and setting displacement along other coordinates to 0, we obtain the equalities
Fy(x,y,z) = −∂U(x,y,z)/∂y
Fz(x,y,z) = −∂U(x,y,z)/∂z
End of Proof
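The construction used in this proof can be imitated numerically. The sketch below is an illustration only: it assumes a hypothetical conservative force F = ||2xy, x²+z, y||, a reference point A at the origin, and a midpoint-rule integration along straight segments. The names `work_from_A`, `U` and `minus_grad_U` are illustrative. The potential is defined as U(B) = −W[AB], and its negative gradient, estimated by central differences, is then compared with the force.

```python
def force(p):
    """A hypothetical conservative force chosen for illustration only."""
    x, y, z = p
    return (2 * x * y, x * x + z, y)

A = (0.0, 0.0, 0.0)  # arbitrary fixed reference point

def work_from_A(B, n=2000):
    """W[AB] along the straight segment from A to B (midpoint rule)."""
    d = tuple(b - a for a, b in zip(A, B))
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        p = tuple(a + t * dc for a, dc in zip(A, d))
        f = force(p)
        total += sum(fc * dc for fc, dc in zip(f, d)) / n
    return total

def U(B):
    """Potential defined as in the proof: U(B) = -W[AB]."""
    return -work_from_A(B)

def minus_grad_U(B, h=1e-3):
    """Central-difference estimate of -grad U at point B."""
    result = []
    for i in range(3):
        plus, minus = list(B), list(B)
        plus[i] += h
        minus[i] -= h
        result.append(-(U(tuple(plus)) - U(tuple(minus))) / (2 * h))
    return tuple(result)

B = (1.0, 2.0, 0.5)
recovered = minus_grad_U(B)
expected = force(B)

print("force at B:       ", expected)
print("-grad U at B (num):", recovered)
```

Integrating along a straight segment is a convenience; since the work is assumed path independent, any other path from A to B would define the same U.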
The question now arises: is the field potential U(x,y,z) uniquely defined by the field force intensity?
The answer is NO, since we have chosen point A(x0,y0,z0) as, basically, an arbitrary fixed reference point where the movement of an object begins.
We can choose any other point A'(x1,y1,z1) as the beginning, and the work W[A'B] needed to reach the same point B(x,y,z) will be different. More precisely, by Lemma D it differs from W[AB] by W[AA'], the amount of work the force performs moving an object from A to A', which does not depend on B.
This means that our field potential is not uniquely defined by the field force; only its partial derivatives are, since they must correspond to the force components. This is similar to the fact that, given the derivative of a function, the function is defined as an integral of the derivative plus some freely chosen constant.
Traditionally, for gravitational or electrostatic fields, physicists choose as the starting point a point infinitely far from the source of the field force. There the force is equal to zero. With this convention the field potential U(x,y,z) is fully defined and equals the negative work performed by the force F(x,y,z) when an object moves from an infinitely far point to the point P(x,y,z).
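The effect of changing the reference point can be seen in a small numerical experiment. This is a sketch under assumptions: the force is the gravitational force with units chosen so that G·M·m = 1, the reference points A and A' and the sample points are arbitrary, and the helper names are hypothetical. For each sample point P it computes the potential with respect to A and with respect to A' and prints the difference, which should be the same constant, −W[AA'], for every P.

```python
import math

GMm = 1.0  # assumption: units chosen so that G*M*m = 1

def gravity(p):
    """Gravitational force on m at position p, source mass M at the origin."""
    x, y, z = p
    r = math.sqrt(x * x + y * y + z * z)
    k = -GMm / r**3
    return (k * x, k * y, k * z)

def work(force, path, n=20000):
    """Midpoint-rule approximation of the integral of F·dr along path(t), 0 <= t <= 1."""
    total = 0.0
    prev = path(0.0)
    for i in range(1, n + 1):
        cur = path(i / n)
        mid = tuple((a + b) / 2 for a, b in zip(prev, cur))
        f = force(mid)
        total += sum(fc * (b - a) for fc, a, b in zip(f, prev, cur))
        prev = cur
    return total

def segment(P, Q):
    """Straight path from P to Q, parameterized by t in [0, 1]."""
    return lambda t: tuple(a + t * (b - a) for a, b in zip(P, Q))

def potential_from(ref, P):
    """U(P) = -W[ref, P], the potential relative to the reference point ref."""
    return -work(gravity, segment(ref, P))

A = (3.0, 0.0, 0.0)
A_prime = (0.0, 4.0, 0.0)
points = [(1.0, 2.0, 2.0), (2.0, 0.5, 1.0), (0.5, 0.5, 3.0)]

shifts = [potential_from(A, P) - potential_from(A_prime, P) for P in points]
w_a_to_a_prime = work(gravity, segment(A, A_prime))

for P, s in zip(points, shifts):
    print(P, "U_A - U_A' =", s)
print("-W[AA'] =", -w_a_to_a_prime)
```

All chosen segments avoid the origin, where the force is undefined.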
Wednesday, October 9, 2024
Physics+ Work Lemmas: UNIZOR.COM - Classic Physics+ - Laws of Newton
Field Work Lemmas
In the previous lecture we introduced the concepts of a field and of the field intensity force, which is equal to the negative gradient of the field potential.
We also proved that the work of such a field intensity force along any trajectory of an object moving in the field depends only on the field potential at the beginning and at the end of the trajectory and is independent of the path between these two points.
There is a converse theorem which states that, if the work performed by some force on an object depends only on the object's position at the beginning and at the end of its movement and does not depend on the trajectory between these points, then this force can be represented as the negative gradient of some scalar function, the field potential.
This lecture presents certain auxiliary theorems (lemmas) that will help to prove the above-mentioned theorem in the next lecture.
Lemma A
This lemma, in short, is about comparing the work performed by a force, when an object moves along the same trajectory in two opposite directions.
More rigorously, assume that some vector of force F(x,y,z) is defined at each point of a certain area within our three-dimensional Euclidean space with Cartesian coordinates.
The coordinate components of the vector of force F at point (x,y,z) are
||Fx(x,y,z),Fy(x,y,z),Fz(x,y,z)||.
An object is moving along a certain trajectory from point A to point B within this area, while the force F acts on it and performs a certain amount of work.
The coordinate components of the position vector r are
||x,y,z||.
The coordinate components of the infinitesimal increment, the differential, of the position vector dr are
||dx,dy,dz||.
Compare the work done by the force F along the object's trajectory from point A to point B with the work done when an object moves from point B to point A along the same trajectory in the opposite direction.
Solution A
Recall the definition of work performed by force F on an object moving along a trajectory described by position vector r from starting point A to finishing point B:
W[AB] = ∫[AB]dW = ∫[AB]F·dr
The above integral is an infinite sum of infinitesimal work increments dW(x,y,z) performed by force F(x,y,z) on an infinitesimal interval of a trajectory r(x,y,z) from point (x,y,z) to point (x+dx,y+dy,z+dz), where F·dr is a scalar product of two vectors, so
dW(x,y,z) = F(x,y,z)·dr(x,y,z) =
= Fx·dx+Fy·dy+Fz·dz
Assume that at some moment of time our object, moving from A to B, is at position (x,y,z).
During an infinitesimal increment of time its new position will be (x+dx,y+dy,z+dz) and the force will do an infinitesimal amount of work
dW = Fx·dx+Fy·dy+Fz·dz.
Now assume that the object moves in the opposite direction from B to A along the same trajectory.
When the object is at the same position (x,y,z) at some moment in time, the infinitesimal increments dx, dy and dz have signs opposite to those they had when the object moved from A to B, while the force vector is the same.
Therefore, the differential of work dW also has the opposite sign, and subsequent integration shows that the total work done by the force on the trajectory from B to A has the same magnitude but the opposite sign compared with the movement from A to B.
Answer A
W[AB]=∫[AB]F·dr =
= −∫[BA]F·dr = −W[BA]
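Lemma A holds for any force, conservative or not, and this can be checked numerically. The sketch below assumes a hypothetical force F = ||−y, x, z/2|| and a hypothetical trajectory r(t) = ||cos 2t, sin 2t, t|| for t from 0 to 1; the helper `work` is a midpoint-rule approximation of the integral of F·dr. Along this trajectory F·dr = (2 + t/2)·dt, so the exact work from A to B is 2.25.

```python
import math

def force(p):
    """A hypothetical force (not conservative) chosen for illustration only."""
    x, y, z = p
    return (-y, x, 0.5 * z)

def work(path, n=20000):
    """Midpoint-rule approximation of the integral of F·dr along path(t), 0 <= t <= 1."""
    total = 0.0
    prev = path(0.0)
    for i in range(1, n + 1):
        cur = path(i / n)
        mid = tuple((a + b) / 2 for a, b in zip(prev, cur))
        f = force(mid)
        total += sum(fc * (b - a) for fc, a, b in zip(f, prev, cur))
        prev = cur
    return total

def a_to_b(t):
    """Trajectory from A (t = 0) to B (t = 1)."""
    return (math.cos(2 * t), math.sin(2 * t), t)

def b_to_a(t):
    """The same trajectory traversed in the opposite direction."""
    return a_to_b(1.0 - t)

w_ab = work(a_to_b)
w_ba = work(b_to_a)

print("W[AB] =", w_ab)
print("W[BA] =", w_ba)
print("W[AB] + W[BA] =", w_ab + w_ba)
```

The sum of the two values should vanish up to rounding, in agreement with W[AB] = −W[BA].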
Lemma B
As in Lemma A, assume that some vector of force F(x,y,z) is defined at each point of a certain area within our three-dimensional Euclidean space with Cartesian coordinates.
An object is moving along a certain trajectory within this area and the force F(x,y,z) acts on it, performing some work along its trajectory.
The object's position at time t is r(t)=||x(t),y(t),z(t)||.
It's given that the work of this force on an object moving along any trajectory between any pair of points A and B depends only on the choice of these two endpoints and does not depend on the choice of trajectory between them.
Prove that the work this force performs on an object moving along a closed trajectory, when points A and B coincide, equals zero.
Proof B
In our case of a closed trajectory the endpoints A and B coincide. So, we will deal only with point A.
Choose any closed trajectory, its starting and ending point A, and a point M on it that does not coincide with point A.
Point M splits the closed trajectory into two different paths between A and M; let's call them path #1 and path #2.
If an object moves along the closed trajectory, it moves from point A to point M along path #1 and then moves from point M to point A along path #2.
According to the condition of the problem, the work W1[AM] performed by the force acting on our object along path #1 from A to M should be equal to the work W2[AM] performed by the force along path #2 from A to M:
W1[AM] = W2[AM]
As has been proven in Lemma A, during the second part of the trajectory, when the object moves from M to A along path #2, the work of the force has the same magnitude as if the object moved from A to M along the same path #2, but the opposite sign:
W2[MA] = −W2[AM]
Therefore, the total work along path #1 from A to M followed by moving from M to A along path #2 equals
W1[AM]2[MA] = W1[AM]+W2[MA] =
= W1[AM] − W2[AM] =
= W1[AM] − W1[AM] = 0
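The steps of this proof can be followed numerically for a force known to be conservative. The sketch assumes the gravitational force with units chosen so that G·M·m = 1 and a hypothetical closed trajectory `loop` that stays away from the origin; point A corresponds to t = 0 and point M to t = 0.5. Path #1 runs from A to M along the first half of the loop, path #2 runs from A to M along the second half traversed backwards. All names are illustrative.

```python
import math

GMm = 1.0  # assumption: units chosen so that G*M*m = 1

def gravity(p):
    """Gravitational force on m at position p, source mass M at the origin."""
    x, y, z = p
    r = math.sqrt(x * x + y * y + z * z)
    k = -GMm / r**3
    return (k * x, k * y, k * z)

def work(path, n=20000):
    """Midpoint-rule approximation of the integral of F·dr along path(t), 0 <= t <= 1."""
    total = 0.0
    prev = path(0.0)
    for i in range(1, n + 1):
        cur = path(i / n)
        mid = tuple((a + b) / 2 for a, b in zip(prev, cur))
        f = gravity(mid)
        total += sum(fc * (b - a) for fc, a, b in zip(f, prev, cur))
        prev = cur
    return total

def loop(t):
    """A closed trajectory: loop(0) and loop(1) are the same point A = (3, 0, 0)."""
    angle = 2 * math.pi * t
    return (2 + math.cos(angle), math.sin(angle), 0.5 * math.sin(2 * angle))

def path1(t):
    """Path #1: from A to M along the first half of the loop."""
    return loop(0.5 * t)

def path2(t):
    """Path #2: from A to M along the second half of the loop, traversed backwards."""
    return loop(1.0 - 0.5 * t)

w1_am = work(path1)
w2_am = work(path2)
w_closed = work(loop)

print("W1[AM] =", w1_am)
print("W2[AM] =", w2_am)
print("work along the closed trajectory =", w_closed)
```

Here A = (3,0,0) and M = (1,0,0), so both W1[AM] and W2[AM] should be close to G·M·m·(1/1 − 1/3) = 2/3, and the closed-trajectory work should be close to zero.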
Lemma C
This lemma is a converse to Lemma B.
As in Lemma A, assume that some vector of force F(x,y,z) is defined at each point of a certain area within our three-dimensional Euclidean space with Cartesian coordinates.
An object is moving along a certain trajectory within this area, and the force F(x,y,z) acts on it, performing some work.
The object's position at time t is r(t)=||x(t),y(t),z(t)||.
It's given that the work of this force on an object moving along any closed trajectory that starts and ends at the same point equals zero.
Prove that the work this force performs on an object moving from any fixed point A to any fixed point B does not depend on the trajectory between these points.
Proof C
Choose any two paths from point A to point B - path #1 and path #2.
We will use the notation introduced in Lemma B.
As stated in the condition of this lemma,
W1[AB]2[BA] =
= W1[AB] + W2[BA] = 0
At the same time
W2[BA] = −W2[AB]
Therefore,
0 = W1[AB] + W2[BA] =
= W1[AB] − W2[AB]
from which follows that
W1[AB] = W2[AB]
As we see, the statement that the work performed by a force on an object moving along any closed trajectory equals zero is equivalent to the statement that the work performed by the force on an object moving from any point A to any point B does not depend on the trajectory the object takes between these points.
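Because the two statements are equivalent, they also fail together. The sketch below illustrates this with a hypothetical force F = ||−y, x, 0||, which is not conservative; the paths and helper names are assumptions made for the example. The work along the unit circle (a closed trajectory) is not zero, and the work from A = (1,0,0) to B = (−1,0,0) differs between the upper and the lower half of the circle.

```python
import math

def force(p):
    """A hypothetical non-conservative force chosen for illustration only."""
    x, y, z = p
    return (-y, x, 0.0)

def work(path, n=20000):
    """Midpoint-rule approximation of the integral of F·dr along path(t), 0 <= t <= 1."""
    total = 0.0
    prev = path(0.0)
    for i in range(1, n + 1):
        cur = path(i / n)
        mid = tuple((a + b) / 2 for a, b in zip(prev, cur))
        f = force(mid)
        total += sum(fc * (b - a) for fc, a, b in zip(f, prev, cur))
        prev = cur
    return total

def circle(t):
    """Closed trajectory: the full unit circle, starting and ending at A."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t), 0.0)

def upper_half(t):
    """Path #1 from A = (1,0,0) to B = (-1,0,0) through y >= 0."""
    return (math.cos(math.pi * t), math.sin(math.pi * t), 0.0)

def lower_half(t):
    """Path #2 from A = (1,0,0) to B = (-1,0,0) through y <= 0."""
    return (math.cos(math.pi * t), -math.sin(math.pi * t), 0.0)

w_closed = work(circle)
w1_ab = work(upper_half)
w2_ab = work(lower_half)

print("work along the closed trajectory:", w_closed)
print("W1[AB] =", w1_ab)
print("W2[AB] =", w2_ab)
```

For this force the exact values are 2π for the closed trajectory and π and −π for the two half-circles, so neither of the two equivalent conditions holds.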
Lemma D
As in Lemma A, assume that some vector of force F(x,y,z) is defined at each point of a certain area within our three-dimensional Euclidean space with Cartesian coordinates.
It's given that the work of this force on an object moving along any trajectory between any pair of points depends only on the choice of these two endpoints and does not depend on the choice of trajectory between them.
Assume there are three points in the area where the force is acting: A, B and C.
Consider amounts of work the force performs on an object during its movements between these points
W[AB], W[AC], W[BC].
Prove that
W[BC] = W[AC] − W[AB].
Proof D
From Lemma B above it follows that the amount of work performed by the force on an object moving along a closed trajectory from A to B to C and back to A equals zero
W[AB] + W[BC] + W[CA] = 0
From Lemma A above it follows that, if an object moves in the opposite direction along the same trajectory, the amount of work performed by the force is the same in magnitude but opposite in sign
W[AC] = −W[CA]
Therefore,
W[AB] + W[BC] − W[AC] = 0
from which it follows that
W[BC] = W[AC] − W[AB].
This equality resembles the rule for the difference between two vectors
BC = AC − AB
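Lemma D can be checked numerically for a conservative force. The sketch assumes the gravitational force with units chosen so that G·M·m = 1 and three arbitrarily chosen points A, B and C; the work between each pair is approximated along a straight segment by a midpoint rule. The helper names are hypothetical.

```python
import math

GMm = 1.0  # assumption: units chosen so that G*M*m = 1

def gravity(p):
    """Gravitational force on m at position p, source mass M at the origin."""
    x, y, z = p
    r = math.sqrt(x * x + y * y + z * z)
    k = -GMm / r**3
    return (k * x, k * y, k * z)

def work_between(P, Q, n=20000):
    """Midpoint-rule approximation of W[PQ] along the straight segment from P to Q."""
    d = tuple(q - p for p, q in zip(P, Q))
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        point = tuple(p + t * dc for p, dc in zip(P, d))
        f = gravity(point)
        total += sum(fc * dc for fc, dc in zip(f, d)) / n
    return total

A = (1.0, 0.0, 0.0)
B = (0.0, 2.0, 1.0)
C = (2.0, 2.0, 2.0)

w_ab = work_between(A, B)
w_ac = work_between(A, C)
w_bc = work_between(B, C)

print("W[BC]         =", w_bc)
print("W[AC] - W[AB] =", w_ac - w_ab)
```

The three segments avoid the origin, where the force is undefined. The two printed values should agree to within the accuracy of the numerical integration.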
Field Work Lemmas
In the previous lecture we have introduced the concepts of a field and field intensity force that is equal to a gradient of the field potential.
Also, we have proven that, dealing with such force, the work of this field intensity force along any trajectory of an object moving in the field depends only on the field potential at the beginning and at the end of a trajectory and is independent of a path between these two points.
There is a converse theorem that states that, if the work performed by some force on an object depends on the object's position in the beginning and at the end of its movement and does not depend on a trajectory between these points, then this force can be represented as a gradient of some scalar function, the field potential.
This lecture presents certain auxiliary theorems (lemmas) that will help to prove the above mentioned theorem in the next lecture.
Lemma A
This lemma, in short, is about comparing the work performed by a force, when an object moves along the same trajectory in two opposite directions.
More rigorously, assume, some vector of force F(x,y,z) is defined at each point of a certain area within our three-dimensional Euclidean space with Cartesian coordinates.
The coordinate components of the vector of force F at point (x,y,z) are
||Fx(x,y,z),Fy(x,y,z),Fz(x,y,z)||.
An object is moving along certain trajectory from point A to point B within this area, while the force F acts on it and performs certain work.
The coordinate components of the position vector r are
||x,y,z||.
The coordinate components of the infinitesimal increment, the differential, of the position vector dr are
||dx,dy,dz||.
Compare the work done by the force F along the object's trajectory from point A to point B with the work done when an object moves from point B to point A along the same trajectory in the opposite direction.
Solution A
Recall the definition of work performed by force F on an object moving along a trajectory described by position vector r from starting point A to finishing point B:
W[AB] = ∫[AB]dW = ∫[AB]F·dr
The above integral is an infinite sum of infinitesimal work increments dW(x,y,z) performed by force F(x,y,z) on an infinitesimal interval of a trajectory r(x,y,z) from point (x,y,z) to point (x+dx,y+dy,z+dz), where F·dr is a scalar product of two vectors, so
dW(x,y,z) = F(x,y,z)·dr(x,y,z) =
= Fx·dx+Fy·dy+Fz·dz
Assume, at some moment of time our object, moving from A to B, is at position (x,y,z).
During an infinitesimal increment of time it's new position will be (x+dx,y+dy,z+dz) and the force will do an infinitesimal amount of work
dW = Fx·dx+Fy·dy+Fz·dz.
Now assume that the object moves in the opposite direction from B to A along the same trajectory.
Being at the same position (x,y,z) at some moment in time, the infinitesimal increments dx, dy and dz will have signs opposite to those when an object moved from A to B, while the force vector will be the same.
Therefore, the differential of work dW will also be of an opposite sign, and subsequent integration will result in total work done by the force on trajectory from B to A to have the same magnitude but opposite sign comparing with object movement from A to B.
Answer A
W[AB]=∫[AB]F·dr =
= −∫[BA]F·dr = −W[BA]
Lemma B
As in Lemma A, assume, some vector of force F(x,y,z) is defined at each point of a certain area within our three-dimensional Euclidean space with Cartesian coordinates.
An object is moving along certain trajectory within this area and the force F(x,y,z) acts on it, performing some work along its trajectory.
The object's position at time t is r(t)=||x(t),y(t),z(t)||.
It's given that the work of this force on an object moving along any trajectory between any pair of points A and B depends only on a choice of these two endpoints and does not depend on a choice of trajectory between them.
Prove that the work this force performs on an object moving along a closed trajectory, when points A and B coincide, equals to zero.
Proof B
In our case of a closed trajectory the endpoints A and B coincide. So, we will deal only with point A.
Choose any closed trajectory, its starting and ending point A and point M on it that does not coincide with point A.
Now we have two different paths from A to M, let's call them path #1 and path #2.
If an object moves along a closed trajectory, it moves from point A to point M along path #1 and then moves from point M to point A along path #2.
According to the condition of the problem, a work W1[AM] performed by a force acting on our object along a path #1 from A to M should be equal to a work W2[AM] performed by a force acting on our object along a path #2 from A to M:
W1[AM] = W2[AM]
As has been proven in the Lemma 1, during the second part of the trajectory, when object moves from M to A along path #2, the work of a force is of the same magnitude as if an object moved from A to M along the same path #2 but with an opposite sign:
W2[MA] = −W2[AM]
Therefore, the total work along path #1 from A to M followed by moving from M to A along path #2 equals to
W1[AM]2[MA] = W1[AM]+W2[MA] =
= W1[AM] − W2[AM] =
= W1[AM] − W1[AM] = 0
Lemma C
This lemma is a converse to Lemma B.
As in Lemma A, assume, some vector of force F(x,y,z) is defined at each point of a certain area within our three-dimensional Euclidean space with Cartesian coordinates.
An object is moving along certain trajectory within this area, and the force F(x,y,z) acts on it, performing some work.
The object's position at time t is r(t)=||x(t),y(t),z(t)||.
It's given that the work of this force on an object moving along any closed trajectory that starts and ends at the same point equals to zero.
Prove that the work this force performs on an object moving from any fixed point A to any fixed point B does not depend on trajectory between these points.
Proof C
Choose any two paths from point A to point B - path #1 and path #2.
We will use the symbols introduced in Lemma 2.
As stated in the condition of this lemma,
W1[AB]2[BA] =
= W1[AB] + W2[BA] = 0
At the same time
W2[BA] = −W2[AB]
Therefore,
0 = W1[AB] + W2[BA] =
= W1[AB] − W2[AB]
from which follows that
W1[AB] = W2[AB]
As we see, a statement that the work performed by a force on an object moving along any closed trajectory equals to zero is equivalent to a statement that the work performed by a force on an object moving from any point A to any point B does not depend on a trajectory an object moves between these points.
Lemma D
As in Lemma A, assume that a force vector F(x,y,z) is defined at each point of a certain area within our three-dimensional Euclidean space with Cartesian coordinates.
It's given that the work of this force on an object moving along any trajectory between any pair of points depends only on the choice of these two endpoints and does not depend on the choice of trajectory between them.
Assume there are three points, A, B and C, in the area where the force is acting.
Consider the amounts of work the force performs on an object as it moves between these points
W[AB], W[AC], W[BC].
Prove that
W[BC] = W[AC] − W[AB].
Proof D
From Lemma B above it follows that the amount of work performed by a force on an object moving along a closed trajectory from A to B to C and back to A equals zero
W[AB] + W[BC] + W[CA] = 0
From Lemma A above it follows that, if an object moves in the opposite direction along the same trajectory, the amount of work performed by a force is the same in magnitude but opposite in sign to the amount of work performed on an object moving in the original direction
W[AC] = −W[CA]
Therefore,
W[AB] + W[BC] − W[AC] = 0
from which it follows that
W[BC] = W[AC] − W[AB].
This equality resembles the rule for the difference of two vectors
BC = AC − AB
Tuesday, October 1, 2024
Physics+ Field, Potential: UNIZOR.COM - Classic Physics+ - Laws of Newto...
Notes to a video lecture on UNIZOR.COM
Field, Potential
The prototype for an abstract concept of a field presented below is a gravitational field. The force of this field acting on a unit mass is a prototype for a concept of field intensity.
On UNIZOR.COM these concepts were introduced in the Physics 4 Teens course, part Energy, chapters Energy of Gravitational Field and Gravitational Potential.
Here we will formally define these and other concepts and derive a few important field properties.
A field is an area (a subset) of points P{x,y,z} in our three-dimensional space with a vector of force F(P), called field intensity, defined at each point P of this area and a real function U(P), called potential, defined at exactly the same points, such that the following relation between the force and the potential holds at each point:
∀P{x,y,z}: F(P) = −∇U(P)
where the symbol ∇ signifies the gradient of a function U, that is, the vector of the function's partial derivatives with respect to each coordinate:
∇U(P) = ∇U(x,y,z) =
= ||∂U/∂x,∂U/∂y,∂U/∂z||
It should be noted that in many cases authors do not differentiate between the field and the field intensity force, defining the field as the force, which in our opinion is misleading.
That's why we define field as an area of a space, which corresponds to the usual meaning of this word, and field intensity as a force acting inside this area.
Let's assume that a material point acted upon by the force of a field is moving during the time period from t=t1 to t=t2 from point A to point B along some trajectory
P(t) = {x(t),y(t),z(t)}
where P(t1)=A and P(t2)=B
The first important property of a field is that the work of the field intensity force F(P) along any trajectory of an object moving in the field depends only on the field potential at the beginning and at the end of a trajectory and is independent of a path between these two points. In other words, no matter how an object that experiences a field force moves from point A to point B, the work of the force remains the same and depends only on the field potential at end points A and B.
Here is the proof of this statement.
Assume that at moment t1 our object is at point A, and it moves along a trajectory r(t)=||x(t),y(t),z(t)|| until at moment t2 it reaches point B.
The field intensity F is defined for all points of a field, including the points along the trajectory of an object, and is equal to F(t)=F(x(t),y(t),z(t)).
Work performed by any force during the observed time is, by definition,
Wt∈[t1,t2] = ∫t∈[t1,t2]F(t)·dr(t)
In our case of a field force
F(t) = F(x(t),y(t),z(t)) =
= −∇U(x(t),y(t),z(t)) =
= −||∂U/∂x,∂U/∂y,∂U/∂z||
and
dr(t)=||dx(t),dy(t),dz(t)||
Let's evaluate the scalar product of two vectors under the integral
||∂U/∂x,∂U/∂y,∂U/∂z||·
||dx(t),dy(t),dz(t)|| =
= (∂U/∂x)·dx(t) +
+ (∂U/∂y)·dy(t) +
+ (∂U/∂z)·dz(t) =
= dU(x(t),y(t),z(t)) = dU(t)
Above is a full differential (infinitesimal increment) of function U(t)=U(x(t),y(t),z(t)) on infinitesimal time interval from t to t+dt.
Therefore, the work performed by a field force F(t) equals
Wt∈[t1,t2] = −∫t∈[t1,t2]dU(t) =
= −U(t2) + U(t1) = U(t1) − U(t2)
As we see, the total amount of work performed by a field intensity force depends only on the field potentials at end points and does not depend on the path (trajectory) an object took to reach from start to finish.
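Let's check this result numerically. The sketch below is an illustration under assumptions that are not in the lecture: a potential of the gravitational type U = −1/r with all constants set to 1, two arbitrarily chosen endpoints A and B, and two hypothetical paths between them. The work along each path is approximated by a midpoint sum and compared with U(A) − U(B).

```python
import math

def U(x, y, z):
    # Assumed potential of the gravitational type, U = -1/r
    return -1.0 / math.sqrt(x * x + y * y + z * z)

def F(x, y, z):
    # Field intensity F = -grad U = -(x, y, z)/r^3
    r3 = (x * x + y * y + z * z) ** 1.5
    return (-x / r3, -y / r3, -z / r3)

def work(path, n=20000):
    """Midpoint-sum approximation of the work along path(s), s in [0, 1]."""
    total = 0.0
    for i in range(n):
        p0 = path(i / n)
        p1 = path((i + 1) / n)
        f = F(*path((i + 0.5) / n))
        total += sum(fc * (b - a) for fc, a, b in zip(f, p0, p1))
    return total

A = (1.0, 0.0, 0.0)
B = (0.0, 2.0, 1.0)

def straight(s):
    # Straight segment from A to B
    return tuple(a + s * (b - a) for a, b in zip(A, B))

def curved(s):
    # Same endpoints, with a detour that vanishes at s=0 and s=1
    x, y, z = straight(s)
    bump = math.sin(math.pi * s)
    return (x + bump, y, z + 2 * bump)

w1 = work(straight)
w2 = work(curved)
expected = U(*A) - U(*B)
print(w1, w2, expected)
```

Both approximations agree with U(A) − U(B) up to the discretization error, even though the two paths are quite different.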
An obvious consequence of this property of the field is that the amount of work the field intensity force performs when an object moves along a trajectory whose finishing point coincides with the starting one equals zero.
Recall that the amount of work the field intensity force performs when an object moves from one point to another equals the increment of the object's kinetic energy (see the previous lecture Newton's Laws of this chapter of the course)
Wt∈[t1,t2] = T(t2) − T(t1)
Therefore,
T(t2) − T(t1) = U(t1) − U(t2)
from which follows the Law of Conservation of Energy
T(t1) + U(t1) = T(t2) + U(t2)
It states that the sum of kinetic and potential energy is not changing during an object's movement within a field.
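A small simulation can make this law tangible. The sketch below is only an illustration: the potential U = k·|r|²/2, the mass, the initial conditions and the velocity Verlet integration scheme are all assumptions chosen for simplicity, not something taken from the lecture. It moves a material point under the force F = −∇U = −k·r and compares T+U at the start and at the end.

```python
def total_energy(r, v, m, k):
    # T = m*v^2/2, U = k*|r|^2/2 (assumed illustrative potential)
    kinetic = 0.5 * m * sum(c * c for c in v)
    potential = 0.5 * k * sum(c * c for c in r)
    return kinetic + potential

def simulate(steps=10000, dt=1e-3, m=2.0, k=3.0):
    r = [1.0, 0.0, 0.5]   # assumed initial position
    v = [0.0, 1.0, -0.3]  # assumed initial velocity
    e_first = total_energy(r, v, m, k)
    f = [-k * c for c in r]  # F = -grad U = -k*r
    for _ in range(steps):
        # velocity Verlet: half kick, drift, recompute force, half kick
        v = [vc + 0.5 * dt * fc / m for vc, fc in zip(v, f)]
        r = [rc + dt * vc for rc, vc in zip(r, v)]
        f = [-k * c for c in r]
        v = [vc + 0.5 * dt * fc / m for vc, fc in zip(v, f)]
    return e_first, total_energy(r, v, m, k)

e_start, e_end = simulate()
print(e_start, e_end)
```

The two printed values differ only by a small numerical error of the integration scheme, which shrinks as the time step dt decreases.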
Sunday, September 29, 2024
Physics+ Recap of Newton's Laws: UNIZOR.COM - Classic Physics+ - Laws of...
Notes to a video lecture on UNIZOR.COM
Recap of Newton's Laws
The Laws of Mechanics introduced by Newton assume the existence of so-called inertial frames of reference (systems of coordinates), in which these laws hold and which we will use for our purposes. These frames of reference allow us to express the position and velocity of objects under observation.
Also assumed is a concept of force as the action that affects the movements of physical objects.
The important items to be considered when discussing Newton's Laws are the concepts of a material point (an object of zero dimensions but having some inertial mass), space coordinates (usually, Euclidean coordinates in three-dimensional space) and velocities (vectors, components of which are derivatives of corresponding coordinates).
We will usually use symbol m for the inertial mass of an object, vector r=(x,y,z) for Euclidean coordinates, vector v=(vx,vy,vz) for the velocity vector and vector p=m·v for the momentum of an object of mass m moving with velocity vector v.
We will further assume that space coordinates and velocities are functions of time t and v(t)=dr(t)/dt.
The First Newton's Law states that a material point that is not acted upon by external forces maintains constant velocity, so
dv(t)/dt = 0
This law can be derived from the Second Newton's Law, which deals with vectors of forces F=(Fx,Fy,Fz).
The Second Law relates the vector of force to the rate of change of the momentum of the object upon which this force acts.
dp(t)/dt = d(m·v(t))/dt = F(t)
By multiplying all terms by dt, this same law can be also expressed in terms of infinitesimal increment of momentum during an infinitesimal time interval as a result of an impulse of a force, the product of a force vector by that same infinitesimal time interval:
dp(t) = d(m·v(t)) = F(t)·dt
Finally, the Third Newton's Law states that the action of one object upon another is always symmetrical. If the force FAB is exerted by object A upon object B, a force FBA of the same magnitude and opposite direction is exerted by object B upon object A.
FAB = −FBA
Important notes
(a) Mass is additive. The mass of two objects combined together is a sum of their masses.
(b) Vectors of forces acting on the same object can be added by the rules of vector algebra, resulting in one vector, whose action is the same as a combination of actions of individual forces.
(c) Each vector equation mentioned above can be broken into three individual equations for each coordinate.
(d) All statements and equations above should be taken as axioms, because they are in good agreement with our day-to-day practice. They represent a theoretical model that we can study further and, based on them, derive numerous properties of moving objects.
Conservation of momentum
Let's do the following experiment.
Two material points A and B are connected with a massless rigid rod. We take this pair and throw it out to open space, where no other forces act on these two material points except a force of one object upon another via a rigid rod that connects them.
Let's analyze the change of momentum of these two objects with time.
Assume that at time t1 the momenta of our objects are pA(t1) and pB(t1). As time goes on, our objects move in space in some way that depends on the initial push and the subsequent interaction with each other via the rod that connects them. At the end of our experiment, at time t2, our objects have momenta pA(t2) and pB(t2).
Momentum pA(t2) is the result of an object A initial momentum pA(t1) and combined (that is, integrated) infinitesimal increments of this momentum during the experiment
pA(t2) = pA(t1) + ∫t∈[t1,t2]dpA(t)
As mentioned above, in general,
dp(t) = F(t)·dt
In our case the only force acting on object A is the force FBA(t) exerted by object B.
Therefore,
pA(t2) = pA(t1) + ∫t∈[t1,t2]FBA(t)·dt
Similarly, considering an object B and force acting on it from object A, we have
pB(t2) = pB(t1) + ∫t∈[t1,t2]FAB(t)·dt
Combining these two statements to have the total momentum of the system of these two objects at the beginning and at the end of the experiment and taking into consideration the Third Newton's Law FAB=−FBA, we obtain
pA(t2)+pB(t2) = pA(t1)+pB(t1)
That is, the total momentum of a closed system (no external forces) is constant.
This is a simple derivation of the Law of Conservation of Momentum.
Granted, it is proven here only in a simple case of two objects, but the proof can be easily extended to a case of any closed system with any number of objects acting upon each other without external forces.
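The derivation can be mirrored in a short simulation. The sketch below is an illustration only and makes several assumptions not found in the lecture: the rigid rod is replaced by a hypothetical stiff spring (so that the force between the objects is easy to compute), and the masses, initial momenta and the simple step-by-step integration are made up. At every step the momentum increments are dp = F·dt with FBA = −FAB.

```python
def simulate(steps=5000, dt=1e-3):
    m_a, m_b = 1.0, 3.0           # assumed masses
    k, rest = 50.0, 1.0           # hypothetical stiff spring standing in for the rod
    r_a, r_b = [0.0, 0.0, 0.0], [1.2, 0.0, 0.0]
    p_a, p_b = [0.5, 1.0, 0.0], [-0.2, 0.0, 0.9]
    p_first = [a + b for a, b in zip(p_a, p_b)]
    for _ in range(steps):
        d = [b - a for a, b in zip(r_a, r_b)]
        dist = sum(c * c for c in d) ** 0.5
        # Force exerted by A upon B: pulls B toward A when the spring is stretched
        f_ab = [-k * (dist - rest) * c / dist for c in d]
        f_ba = [-c for c in f_ab]  # Third Law: FBA = -FAB
        # dp = F*dt for each object
        p_a = [p + f * dt for p, f in zip(p_a, f_ba)]
        p_b = [p + f * dt for p, f in zip(p_b, f_ab)]
        # dr = v*dt = (p/m)*dt
        r_a = [r + p / m_a * dt for r, p in zip(r_a, p_a)]
        r_b = [r + p / m_b * dt for r, p in zip(r_b, p_b)]
    p_last = [a + b for a, b in zip(p_a, p_b)]
    return p_first, p_last

p_start, p_end = simulate()
print(p_start, p_end)
```

The individual momenta change all the time, but their sum stays the same up to rounding, because the two increments cancel each other at every step.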
Work and Kinetic Energy
Assume that a material point moves in three-dimensional Euclidean space from time t1 to time t2 along a trajectory described by time-dependent vector r(t) and there is a time-dependent vector of force F(t) acting upon it.
Work performed by this force during the observed time is, by definition,
Wt∈[t1,t2] = ∫t∈[t1,t2]F(t)·dr(t)
Notice that differential of a vector r(t) is a velocity vector v(t) multiplied by differential of time dt.
Also notice that in the formula above we deal with a scalar (dot) product of two vectors - F(t) and dr(t).
According to Newton's Second Law,
dp(t)/dt = F(t)
Also,
dr(t) = v·dt = (p/m)·dt
Therefore,
Wt∈[t1,t2] = ∫t∈[t1,t2](1/m)·p(t)·dp(t)
Integrating this, we obtain
Wt∈[t1,t2]=(1/m)[p²(t2)−p²(t1)]/2=
=m·[v²(t2)−v²(t1)]/2 = T(t2)−T(t1)
where T(t)=m·v²(t)/2 is kinetic energy of an object of mass m moving with velocity v(t).
In other words, the work performed by a force equals the increment of kinetic energy of the object this force acts upon.
Another formulation might be that work performed upon an object is transformed into its kinetic energy.
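Here is a numerical illustration of this statement. It is a sketch under assumptions: the time-dependent force, the mass, the initial velocity and the stepping scheme below are hypothetical choices, not taken from the lecture. On every small time step the force is sampled at the middle of the step, the velocity is updated using Newton's Second Law, and the work F·dr is accumulated with dr taken as the average velocity times dt.

```python
import math

def force(t):
    # Hypothetical time-dependent force vector F(t)
    return (math.cos(t), 0.5 * t, -1.0)

def kinetic(m, v):
    # T = m*v^2/2
    return 0.5 * m * sum(c * c for c in v)

def run(m=2.0, t1=0.0, t2=3.0, n=3000):
    dt = (t2 - t1) / n
    v = [1.0, 0.0, 0.5]  # assumed initial velocity
    t_first = kinetic(m, v)
    total_work = 0.0
    for i in range(n):
        f = force(t1 + (i + 0.5) * dt)  # force at the middle of the step
        v_new = [vc + fc / m * dt for vc, fc in zip(v, f)]  # m*dv = F*dt
        # dr = (average velocity over the step) * dt
        dr = [0.5 * (a + b) * dt for a, b in zip(v, v_new)]
        total_work += sum(fc * dc for fc, dc in zip(f, dr))
        v = v_new
    return total_work, kinetic(m, v) - t_first

work_done, kinetic_gain = run()
print(work_done, kinetic_gain)
```

With this particular scheme the accumulated work and the increment of kinetic energy coincide up to rounding, since on each step F·dr = m·(v_new−v)·(v_new+v)/2, which is exactly the change of m·v²/2.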
Friday, September 27, 2024
Physics+ Introduction: UNIZOR.COM - Physics+
Notes to a video lecture on UNIZOR.COM
Classic Physics+ Introduction
This course contains material not usually addressed in high school course of Physics. However, it's still a part of Classic Physics, and it's essential to understand the concepts presented here, as they play a very important role in contemporary Physics.
In the previous course Physics 4 Teens, part Waves, chapter Phenomena of Light, lecture Angle Refraction we have mentioned the intuitively understandable and natural Fermat's Principle of the Least Time.
Based on this principle we derived the optimal trajectory and angle of refraction of the ray of light going from one medium to another with a different refraction index.
Briefly speaking, if a ray of light moves along certain trajectory from point A to point B, the time it spends during this movement should be less than if it moved along any other trajectory.
If both points A and B are in empty space, the trajectory will be a straight line.
If, however, there are different media between them and, consequently the light propagates there with different speed, the Principle of Least Time can help to determine the angle of refraction on each change of medium along a trajectory to minimize the time to travel.
What's important to pay attention to in this phenomenon is that there are many trajectories to reach point B from point A, but the ray of light chooses the one that minimizes a certain numerical characteristic that depends on the entire trajectory: the time of travel.
There is nothing wrong with application of Newton's Laws to find the trajectory of movement, but in many practical cases the complexity of such an approach is very high, so it would be quite a challenging endeavor.
Generalizing from the above example, the points A and B might not be real points in our three-dimensional Euclidean space, but some numerical characteristics of a state of a physical system under our observation. It can be a combination of spherical coordinates and velocities, for example, or positions relative to the center of our galaxy and impulses etc.
In any case, it's intuitively easy to accept that the change of a system from one state to another should go along a trajectory that minimizes or maximizes some numerical characteristic of the entire trajectory.
We will introduce a quantity that depends on an entire trajectory of movement of a physical system from one state to another. Stationary value (minimum, maximum, saddle point) of this quantity characterizes the trajectory of movement, which is similar to the Fermat's principle that a trajectory of the ray of light is the one that minimizes the time light travels from one point to another.
At the end of the 18th century the Italian-French mathematician Joseph-Louis Lagrange developed exactly this theory, suggested a function that depends on the system's characteristics and showed that finding the stationary point of this function leads to a system of differential equations identical to Newton's Laws.
This function was called action.
What was quite important, this approach to finding the trajectory of complex systems significantly simplified the calculations compared to directly applying Newton's Laws.
Yet another approach, based on Lagrange's work, was suggested by the Irish mathematician and astronomer William Hamilton in 1833. His approach made it possible to build a system that successfully bridged Classic Physics with Quantum Physics.
Details of both Lagrangian and Hamiltonian approach to formulate Classic Mechanics are the subject of this course.
Sunday, September 22, 2024
Matrices+ 03 - Eigenvalues in 3D: UNIZOR.COM - Math+ & Problems - Matrices
Notes to a video lecture on http://www.unizor.com
Matrices+ 02
Eigenvalues in 3D
Problem A
Find all eigenvalues and eigenvectors of the 3⨯3 matrix A with rows ||4,8,−4||, ||4,−2,−4|| and ||−8,−4,−10||, if it's known that one of the eigenvalues is 10.
Note A
We specify one of the eigenvalues because general calculation of all eigenvalues in 3D leads to a polynomial equation of the 3rd degree.
Since we want to avoid the necessity to solve it, we specify one eigenvalue, which leads to finding the other two by solving a quadratic equation, which should not present any problem.
Solution A
If matrix A transforms vector v to a collinear one with the magnitude of the original one multiplied by a factor λ, the following matrix equation must hold
A·v = λ·v
or, in coordinate form for a 3⨯3 matrix,
a1,1·v1+a1,2·v2+a1,3·v3 = λ·v1
a2,1·v1+a2,2·v2+a2,3·v3 = λ·v2
a3,1·v1+a3,2·v2+a3,3·v3 = λ·v3
which is equivalent to
(a1,1−λ)·v1+a1,2·v2+a1,3·v3 = 0
a2,1·v1+(a2,2−λ)·v2+a2,3·v3 = 0
a3,1·v1+a3,2·v2+(a3,3−λ)·v3 = 0
This is a system of three linear equations with four unknowns λ, v1, v2 and v3.
One trivial solution would be v1=0, v2=0 and v3=0, in which case λ can take any value.
This is not a case worthy of analyzing.
If the matrix of coefficients of this system has a non-zero determinant, this trivial solution would be the only one.
Therefore, if we are looking for a non-trivial solution, the matrix's determinant must be zero, which gives a specific condition on the value of λ.
So, a necessary condition for the existence of eigenvectors v other than the null-vector is the following one, where I is the identity matrix:
det(A−λ·I) =
= (a1,1−λ)·(a2,2−λ)·(a3,3−λ)+
+a2,1·a3,2·a1,3+a1,2·a2,3·a3,1−
−a1,3·(a2,2−λ)·a3,1−
−a1,2·a2,1·(a3,3−λ)−
−(a1,1−λ)·a2,3·a3,2 = 0
Using values of a matrix of this problem, this equation is
(4−λ)·(−2−λ)·(−10−λ)+
+8·(−4)·(−8)+4·(−4)·(−4)−
−(−4)·(−2−λ)·(−8)−
−8·4·(−10−λ)−
−(4−λ)·(−4)·(−4) = 0
Simplifying this equation, we get
−λ³−8·λ²+108·λ+720=0
First of all, we can check if the eigenvalue 10 given in the problem is the root of this equation.
Indeed, the following is true.
−10³−8·10²+108·10+720=0
Since we know one root λ1=10 of cubic equation, we can represent the left side of this equation as a product of (λ−10) and a quadratic polynomial with easily calculated coefficients, getting equation
(λ−10)·(−λ²−18·λ−72) = 0
To get all the roots of the original cubic equation, we have to solve a quadratic equation
−λ²−18·λ−72 = 0
or, using a canonical form,
λ²+18·λ+72 = 0
Its roots are
λ2,3 = −9±√(81−72) = −9±3
So, we have three eigenvalues for our matrix: −12, −6 and 10.
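These values are easy to double-check with a few lines of code. The sketch below is only a verification aid: the helper names det3 and char_poly are made up, and the matrix entries are the ones read from the equations A·v=λ·v used in this solution. It evaluates det(A−λ·I) directly and compares it with the simplified cubic polynomial.

```python
# Matrix of the problem, as read from the equations A·v = λ·v of this solution
A = [[4, 8, -4],
     [4, -2, -4],
     [-8, -4, -10]]

def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def char_poly(lam):
    """det(A - lam*I): subtract lam from the diagonal, then take the determinant."""
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(3)]
               for i in range(3)]
    return det3(shifted)

def cubic(lam):
    # The simplified equation from the text: -λ³ - 8·λ² + 108·λ + 720
    return -lam**3 - 8 * lam**2 + 108 * lam + 720

values = {lam: char_poly(lam) for lam in (-12, -6, 10)}
same = all(char_poly(lam) == cubic(lam) for lam in range(-20, 21))
print(values, same)
```

All arithmetic here is in integers, so the three values of the determinant are exactly zero, and the determinant agrees with the simplified cubic at every integer point tested.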
Consider now that we have determined eigenvalue λ and would like to find eigenvector v=||v1,v2,v3|| transformed into a collinear one by matrix A with this exact factor of change in magnitude.
If some vector v=||v1,v2,v3|| that is transformed into a collinear one with a factor λ exists, vector s·v, where s is any real non-zero number, would have exactly the same quality because of associativity and commutativity of multiplication by a scalar.
A·(s·v) = (A·s)·v = (s·A)·v =
= s·(A·v) = s·(λ·v)= λ·(s·v)
Therefore, we don't need to determine the exact values v1, v2 and v3; we only need to determine the direction of vector v=||v1,v2,v3||.
Let's start with the eigenvalue λ=−12.
Using it in the equation A·v=λ·v, we have the following system of equations
4·v1+8·v2−4·v3=−12·v1
4·v1−2·v2−4·v3=−12·v2
−8·v1−4·v2−10·v3=−12·v3
Bringing everything to the left side, get this system
16·v1+8·v2−4·v3=0
4·v1+10·v2−4·v3=0
−8·v1−4·v2+2·v3=0
As expected, the determinant of the coefficients of this system is zero and this system of equations is linearly dependent (third equation multiplied by −2 gives the first), as it should be, because otherwise the only solution would be a null-vector.
So, we drop the first equation and resolve the remaining two equations for v2 and v3 in terms of v1.
Here is what remains
4·v1+10·v2−4·v3=0
−8·v1−4·v2+2·v3=0
Divide both equations by 2 to simplify and find v3 from the second equation
2·v1+5·v2−2·v3=0
v3 = 4·v1+2·v2
Substituting v3 into the first equation:
2·v1+5·v2−2·(4·v1+2·v2)=0
or
−6·v1+v2=0
Therefore,
v2=6·v1 and v3 = 16·v1
Regardless of the value of v1, vector ||v1,6·v1,16·v1|| is an eigenvector.
Set v1=1 for simplicity, and vector ||1,6,16|| should be an eigenvector.
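Let's check it out.
A·||1,6,16|| = ||4·1+8·6−4·16, 4·1−2·6−4·16, −8·1−4·6−10·16|| =
= ||−12,−72,−192|| = −12·||1,6,16||
which confirms that vector ||1,6,16|| (and any vector collinear to it) is an eigenvector with −12 as its eigenvalue.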
Let's do the same calculations with the eigenvalue λ=−6.
Using it in the equation A·v=λ·v, we have the following system of equations
4·v1+8·v2−4·v3=−6·v1
4·v1−2·v2−4·v3=−6·v2
−8·v1−4·v2−10·v3=−6·v3
Bringing everything to the left side, get this system
10·v1+8·v2−4·v3=0
4·v1+4·v2−4·v3=0
−8·v1−4·v2−4·v3=0
Dividing the first equation by 2, the second - by 4 and the third - by −4, all these equations yield a simpler system
5·v1+4·v2−2·v3=0
v1+v2−v3=0
2·v1+v2+v3=0
As expected, the determinant of the coefficients of this system is zero and the equations are linearly dependent (the third equation equals the first minus triple the second one), as it should be, because otherwise the only solution would be a null-vector.
So, we drop the first equation and resolve the remaining two equations for v2 and v3 in terms of v1.
Here is what remains
v1+v2−v3=0
2·v1+v2+v3=0
Find v3 from the first equation
v3 = v1+v2
Substituting v3 into the second equation:
2·v1+v2+(v1+v2)=0
or
3·v1+2·v2=0
Therefore,
v2=−(3/2)·v1 and v3 = −(1/2)·v1
Regardless of the value of v1, vector ||v1,−(3/2)·v1,−(1/2)·v1|| is an eigenvector.
Set v1=2 for simplicity, and vector ||2,−3,−1|| should be an eigenvector.
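Let's check it out.
4·2+8·(−3)−4·(−1) = −12 = −6·2
4·2−2·(−3)−4·(−1) = 18 = −6·(−3)
−8·2−4·(−3)−10·(−1) = 6 = −6·(−1)
This confirms that vector ||2,−3,−1|| (and any vector collinear to it) is an eigenvector with −6 as its eigenvalue.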
Finally, let's do the same for the third eigenvalue 10.
Using it in the equation A·v=λ·v, we have the following system of equations
4·v1+8·v2−4·v3=10·v1
4·v1−2·v2−4·v3=10·v2
−8·v1−4·v2−10·v3=10·v3
Bringing everything to the left side, get this system
−6·v1+8·v2−4·v3=0
4·v1−12·v2−4·v3=0
−8·v1−4·v2−20·v3=0
Dividing the first equation by −2, the second - by 4 and the third - by −4, all these equations yield a simpler system
3·v1−4·v2+2·v3=0
v1−3·v2−v3=0
2·v1+v2+5·v3=0
As expected, the determinant of the coefficients of this system is zero and the equations are linearly dependent (the first equation multiplied by 7 equals the second multiplied by 11 plus the third multiplied by 5), as it should be, because otherwise the only solution would be a null-vector.
So, we drop the first equation and resolve the remaining two equations for v2 and v3 in terms of v1.
Here is what remains
v1−3·v2−v3=0
2·v1+v2+5·v3=0
Find v3 from the first equation
v3 = v1−3·v2
Substituting v3 into the second equation:
2·v1+v2+5·(v1−3·v2)=0
or
7·v1−14·v2=0
or
v1−2·v2=0
Therefore,
v2=(1/2)·v1 and v3 = −(1/2)·v1
Regardless of the value of v1, vector ||v1,(1/2)·v1,−(1/2)·v1|| is an eigenvector.
Set v1=2 for simplicity, and vector ||2,1,−1|| should be an eigenvector.
Let's check it out.
4·2+8·1−4·(−1) = 20 = 10·2
4·2−2·1−4·(−1) = 10 = 10·1
−8·2−4·1−10·(−1) = −10 = 10·(−1)
This confirms that vector ||2,1,−1|| (and any vector collinear to it) is an eigenvector with 10 as its eigenvalue.
Answer A
Matrix
| 4 | 8 | −4 |
| 4 | −2 | −4 |
| −8 | −4 | −10 |
has three eigenvalues:
−12, −6 and 10.
Their corresponding eigenvectors are:
||1,6,16||, ||2,−3,−1|| and ||2,1,−1||.
Of course, any vector collinear to a particular eigenvector would also be an eigenvector with the same eigenvalue.
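The answer can be verified mechanically by multiplying the matrix by each eigenvector and comparing the result with the eigenvalue times that vector. This is a minimal sketch in plain Python; the names mat_vec and is_eigenpair are hypothetical helpers.

```python
# Matrix and eigenpairs copied from this problem.
A = [[4, 8, -4],
     [4, -2, -4],
     [-8, -4, -10]]

EIGENPAIRS = [(-12, [1, 6, 16]),
              (-6, [2, -3, -1]),
              (10, [2, 1, -1])]


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a column vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def is_eigenpair(matrix, eigenvalue, vector):
    """True when matrix*vector equals eigenvalue*vector exactly."""
    return mat_vec(matrix, vector) == [eigenvalue * x for x in vector]


for eigenvalue, vector in EIGENPAIRS:
    print(eigenvalue, vector, is_eigenpair(A, eigenvalue, vector))
```

Since all numbers are integers, the comparison is exact and no rounding tolerance is needed.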
Matrices+ 02
Eigenvalues in 3D
Problem A
Find all eigenvalues and eigenvectors of this 3⨯3 matrix, if it's known that one of the eigenvalues is 10.
| 4 | 8 | −4 | 
| 4 | −2 | −4 | 
| −8 | −4 | −10 | 
Note A
We specify one of the eigenvalues because general calculation of all eigenvalues in 3D leads to a polynomial equation of the 3rd degree.
Since we want to avoid the necessity to solve it, we specify one eigenvalue, which leads to finding the other two by solving a quadratic equation, which should not present any problem.
Solution A
If matrix A transforms vector v to a collinear one with the magnitude of the original one multiplied by a factor λ, the following matrix equation must hold
A·v = λ·v
or in coordinate form for 3⨯3 matrix
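a1,1·v1+a1,2·v2+a1,3·v3 = λ·v1
a2,1·v1+a2,2·v2+a2,3·v3 = λ·v2
a3,1·v1+a3,2·v2+a3,3·v3 = λ·v3
which is equivalent to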
(a1,1−λ)·v1+a1,2·v2+a1,3·v3 = 0
a2,1·v1+(a2,2−λ)·v2+a2,3·v3 = 0
a3,1·v1+a3,2·v2+(a3,3−λ)·v3 = 0
This is a system of three linear equations with four unknowns λ, v1, v2 and v3.
One trivial solution would be v1=0, v2=0 and v3=0, in which case λ can take any value.
This is not a case worthy of analyzing.
If the matrix of coefficients of this system has a non-zero determinant, this trivial solution would be the only one.
Therefore, if we are looking for a non-trivial solution, the matrix's determinant must be zero, which gives a specific condition on the value of λ.
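Therefore, a necessary condition for existence of eigenvectors v other than null-vector is that the determinant of the matrix of coefficients of this system, A−λ·I (where I is the identity matrix), equals zero:
det(A−λ·I) =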
= (a1,1−λ)·(a2,2−λ)·(a3,3−λ)+
+a2,1·a3,2·a1,3+a1,2·a2,3·a3,1−
−a1,3·(a2,2−λ)·a3,1−
−a1,2·a2,1·(a3,3−λ)−
−(a1,1−λ)·a2,3·a3,2 = 0
Using values of a matrix of this problem, this equation is
(4−λ)·(−2−λ)·(−10−λ)+
+8·(−4)·(−8)+4·(−4)·(−4)−
−(−4)·(−2−λ)·(−8)−
−8·4·(−10−λ)−
−(4−λ)·(−4)·(−4) = 0
Simplifying this equation, we get
−λ³−8·λ²+108·λ+720=0
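The simplification can be double-checked numerically. The sketch below evaluates the six-term determinant of A−λ·I for many integer values of λ and compares it with the cubic polynomial; two cubic polynomials that agree at more than three points are identical. The function names det3, shifted and cubic are hypothetical.

```python
A = [[4, 8, -4],
     [4, -2, -4],
     [-8, -4, -10]]


def det3(m):
    """Six-term expansion of a 3x3 determinant, in the same order
    of terms as the formula in the text."""
    return (m[0][0] * m[1][1] * m[2][2]
            + m[1][0] * m[2][1] * m[0][2]
            + m[0][1] * m[1][2] * m[2][0]
            - m[0][2] * m[1][1] * m[2][0]
            - m[0][1] * m[1][0] * m[2][2]
            - m[0][0] * m[1][2] * m[2][1])


def shifted(m, lam):
    """Return the matrix m with lam subtracted along the main diagonal."""
    return [[m[i][j] - (lam if i == j else 0) for j in range(3)]
            for i in range(3)]


def cubic(lam):
    """The simplified left side of the equation for eigenvalues."""
    return -lam**3 - 8 * lam**2 + 108 * lam + 720


agree = all(det3(shifted(A, lam)) == cubic(lam) for lam in range(-15, 16))
print(agree)
```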
First of all, we can check if the eigenvalue 10 given in the problem is the root of this equation.
Indeed, the following is true.
−10³−8·10²+108·10+720=0
Since we know one root λ1=10 of cubic equation, we can represent the left side of this equation as a product of (λ−10) and a quadratic polynomial with easily calculated coefficients, getting equation
(λ−10)·(−λ²−18·λ−72) = 0
To get all the roots of the original cubic equation, we have to solve a quadratic equation
−λ²−18·λ−72 = 0
or, using a canonical form,
λ²+18·λ+72 = 0
Its roots are
λ2,3 = −9±√(81−72) = −9±3
So, we have three eigenvalues for our matrix: −12, −6 and 10.
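The factoring step can be sketched in code: divide the cubic by (λ−10) using synthetic division, then solve the remaining quadratic equation. This is an illustration only, and the function names are hypothetical; coefficients are listed from the highest degree down.

```python
from math import sqrt


def divide_by_root(coeffs, root):
    """Synthetic division of a polynomial by (x - root).
    Returns (quotient coefficients, remainder)."""
    carried = []
    value = 0
    for c in coeffs:
        value = value * root + c
        carried.append(value)
    return carried[:-1], carried[-1]


def real_quadratic_roots(a, b, c):
    """Real roots of a*x^2 + b*x + c = 0 (assumes a != 0), sorted."""
    d = b * b - 4 * a * c
    if d < 0:
        return []
    s = sqrt(d)
    return sorted({(-b - s) / (2 * a), (-b + s) / (2 * a)})


CUBIC = [-1, -8, 108, 720]        # -x^3 - 8x^2 + 108x + 720
quotient, remainder = divide_by_root(CUBIC, 10)
other_roots = real_quadratic_roots(*quotient)
print(quotient, remainder, other_roots)
```

A zero remainder confirms that 10 is a root, and the quotient coefficients are those of the quadratic polynomial in the factored form.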
Consider now that we have determined eigenvalue λ and would like to find eigenvector v=||v1,v2,v3|| transformed into a collinear one by matrix A with this exact factor of change in magnitude.
If some vector v=||v1,v2,v3|| that is transformed into a collinear one with a factor λ exists, vector s·v, where s is any real non-zero number, would have exactly the same quality because of associativity and commutativity of multiplication by a scalar.
A·(s·v) = (A·s)·v = (s·A)·v =
= s·(A·v) = s·(λ·v)= λ·(s·v)
Therefore, we don't need to determine the exact values of v1, v2 and v3; we only need to determine the direction of vector v=||v1,v2,v3||.
Let's start with the first eigenvalue λ1=−12.
Using it in the equation A·v=λ·v, we have the following system of equations
4·v1+8·v2−4·v3=−12·v1
4·v1−2·v2−4·v3=−12·v2
−8·v1−4·v2−10·v3=−12·v3
Bringing everything to the left side, get this system
16·v1+8·v2−4·v3=0
4·v1+10·v2−4·v3=0
−8·v1−4·v2+2·v3=0
As expected, the determinant of the coefficients of this system is zero and this system of equations is linearly dependent (third equation multiplied by −2 gives the first), as it should be, because otherwise the only solution would be a null-vector.
So, we drop the first equation and resolve the remaining two equations for v2 and v3 in terms of v1.
Here is what remains
4·v1+10·v2−4·v3=0
−8·v1−4·v2+2·v3=0
Divide both equations by 2 to simplify, and find v3 from the second equation
2·v1+5·v2−2·v3=0
v3 = 4·v1+2·v2
Substituting v3 into the first equation:
2·v1+5·v2−2·(4·v1+2·v2)=0
or
−6·v1+v2=0
Therefore,
v2=6·v1 and v3 = 16·v1
Regardless of the value of v1, vector ||v1,6·v1,16·v1|| is an eigenvector.
Set v1=1 for simplicity, and vector ||1,6,16|| should be an eigenvector.
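Let's check it out.
4·1+8·6−4·16 = −12 = −12·1
4·1−2·6−4·16 = −72 = −12·6
−8·1−4·6−10·16 = −192 = −12·16
This confirms that vector ||1,6,16|| (and any vector collinear to it) is an eigenvector with −12 as its eigenvalue.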
Let's do the same calculations with the second eigenvalue λ2=−6.
Using it in the equation A·v=λ·v, we have the following system of equations
4·v1+8·v2−4·v3=−6·v1
4·v1−2·v2−4·v3=−6·v2
−8·v1−4·v2−10·v3=−6·v3
Bringing everything to the left side, get this system
10·v1+8·v2−4·v3=0
4·v1+4·v2−4·v3=0
−8·v1−4·v2−4·v3=0
Dividing the first equation by 2, the second - by 4 and the third - by −4, all these equations yield a simpler system
5·v1+4·v2−2·v3=0
v1+v2−v3=0
2·v1+v2+v3=0
As expected, the determinant of the coefficients of this system is zero and the equations are linearly dependent (the third equation equals the first minus triple the second one), as it should be, because otherwise the only solution would be a null-vector.
So, we drop the first equation and resolve the remaining two equations for v2 and v3 in terms of v1.
Here is what remains
v1+v2−v3=0
2·v1+v2+v3=0
Find v3 from the first equation
v3 = v1+v2
Substituting v3 into the second equation:
2·v1+v2+(v1+v2)=0
or
3·v1+2·v2=0
Therefore,
v2=−(3/2)·v1 and v3 = −(1/2)·v1
Regardless of the value of v1, vector ||v1,−(3/2)·v1,−(1/2)·v1|| is an eigenvector.
Set v1=2 for simplicity, and vector ||2,−3,−1|| should be an eigenvector.
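Let's check it out.
4·2+8·(−3)−4·(−1) = −12 = −6·2
4·2−2·(−3)−4·(−1) = 18 = −6·(−3)
−8·2−4·(−3)−10·(−1) = 6 = −6·(−1)
This confirms that vector ||2,−3,−1|| (and any vector collinear to it) is an eigenvector with −6 as its eigenvalue.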
Finally, let's do the same for the third eigenvalue 10.
Using it in the equation A·v=λ·v, we have the following system of equations
4·v1+8·v2−4·v3=10·v1
4·v1−2·v2−4·v3=10·v2
−8·v1−4·v2−10·v3=10·v3
Bringing everything to the left side, get this system
−6·v1+8·v2−4·v3=0
4·v1−12·v2−4·v3=0
−8·v1−4·v2−20·v3=0
Dividing the first equation by −2, the second - by 4 and the third - by −4, all these equations yield a simpler system
3·v1−4·v2+2·v3=0
v1−3·v2−v3=0
2·v1+v2+5·v3=0
As expected, the determinant of the coefficients of this system is zero and the equations are linearly dependent (the first equation multiplied by 7 equals the second multiplied by 11 plus the third multiplied by 5), as it should be, because otherwise the only solution would be a null-vector.
So, we drop the first equation and resolve the remaining two equations for v2 and v3 in terms of v1.
Here is what remains
v1−3·v2−v3=0
2·v1+v2+5·v3=0
Find v3 from the first equation
v3 = v1−3·v2
Substituting v3 into the second equation:
2·v1+v2+5·(v1−3·v2)=0
or
7·v1−14·v2=0
or
v1−2·v2=0
Therefore,
v2=(1/2)·v1 and v3 = −(1/2)·v1
Regardless of the value of v1, vector ||v1,(1/2)·v1,−(1/2)·v1|| is an eigenvector.
Set v1=2 for simplicity, and vector ||2,1,−1|| should be an eigenvector.
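Let's check it out.
4·2+8·1−4·(−1) = 20 = 10·2
4·2−2·1−4·(−1) = 10 = 10·1
−8·2−4·1−10·(−1) = −10 = 10·(−1)
This confirms that vector ||2,1,−1|| (and any vector collinear to it) is an eigenvector with 10 as its eigenvalue.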
Answer A
Matrix
| 4 | 8 | −4 |
| 4 | −2 | −4 |
| −8 | −4 | −10 |
has three eigenvalues:
−12, −6 and 10.
Their corresponding eigenvectors are:
||1,6,16||, ||2,−3,−1|| and ||2,1,−1||.
Of course, any vector collinear to a particular eigenvector would also be an eigenvector with the same eigenvalue.
Monday, September 16, 2024
Matrices+ 02 - Eigenvalues: UNIZOR.COM - Math+ & Problems - Matrices
Notes to a video lecture on http://www.unizor.com
Matrices+ 02
Matrix Eigenvalues
The concepts addressed in this lecture for two-dimensional real case are as well applicable to N-dimensional spaces and even to real or complex abstract vector spaces with linear transformations defined there.
Presentation in a two-dimensional real space is chosen for its relative simplicity and easy exemplification.
Let's consider a 2⨯2 matrix A as a linear operator in the two-dimensional Euclidean vector space. In other words, multiplication of any vector v on a coordinate plane by this 2⨯2 matrix A linearly transforms it into another vector on the plane w=A·v.
Assume matrix A is
| 5 | 6 |
| 6 | 10 |
Let's see how this linear operator works, if applied to different vectors.
We will use a row-vector notation in the text for compactness, but column-vector notation in the transformation examples below.
The coordinates of our vectors we will enclose into double bars, like matrices, because a row-vector is a matrix with only one row, and a column-vector is a matrix with only one column.
Our first example of a vector to apply this linear transformation is v=||1,1||.
5·1+6·1 = 11
6·1+10·1 = 16
Obviously, the resulting vector w=||11,16|| and the original one v=||1,1|| are not collinear.
Applied to a different vector v=||3,−2||, we obtain a somewhat unexpected result
5·3+6·(−2) = 3
6·3+10·(−2) = −2
Interestingly, the resulting vector w=||3,−2|| and the original one are the same. So, this operator leaves this particular vector in place. In other words, it retains the direction of this vector and multiplies its magnitude by a factor of 1.
Finally, let's apply our operator to a vector v=||2,3||.
5·2+6·3 = 28
6·2+10·3 = 42
Notice, the resulting vector w=||28,42|| is the original one v=||2,3|| multiplied by 14. So, this operator transforms this particular vector to a collinear one, just longer in magnitude by a factor of 14.
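The three examples can be reproduced with a few lines of code. The sketch below applies the matrix with rows 5, 6 and 6, 10 to each vector and tests collinearity with the condition v1·w2 − v2·w1 = 0; the names transform and collinear are hypothetical.

```python
A = [[5, 6],
     [6, 10]]


def transform(matrix, v):
    """Multiply a 2x2 matrix by a column vector."""
    return [matrix[0][0] * v[0] + matrix[0][1] * v[1],
            matrix[1][0] * v[0] + matrix[1][1] * v[1]]


def collinear(v, w):
    """Two plane vectors are collinear exactly when v1*w2 - v2*w1 = 0."""
    return v[0] * w[1] - v[1] * w[0] == 0


for v in ([1, 1], [3, -2], [2, 3]):
    w = transform(A, v)
    print(v, "->", w, "collinear" if collinear(v, w) else "not collinear")
```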
As we see, for this particular matrix we found two vectors that, if transformed by this matrix as by a linear operator, retain their direction while changing their magnitude by some factor.
These vectors are called eigenvectors. For each eigenvector there is a factor that characterizes the change in its magnitude if this matrix acts on it as an operator. This factor is called eigenvalue. This eigenvalue in the example above was 1 for v=||3,−2|| and 14 for v=||2,3||.
There are some questions one might ask.
1. Are there always some particular vectors that retain the direction if transformed by some particular matrix?
2. If yes, how to find them and how to find the corresponding multiplication factors?
3. How many such vectors exist, if any?
4. How to find all the multiplication factors for a particular matrix transformation?
Let's analyze the linear transformation by a matrix that leaves the direction of a vector without change, just changes the magnitude by some factor λ.
Assume, we have a matrix A=||ai,j||, where i,j∈{1,2}, in our two-dimensional Euclidean space.
This matrix converts any vector v=||v1,v2|| into some other vector, but we are looking for such a vector v that is converted by this matrix into a collinear one.
If matrix A transforms vector v to a collinear one with the magnitude of the original one multiplied by a factor λ, the following matrix equation must hold
A·v = λ·v
or in coordinate form
a1,1·v1 + a1,2·v2 = λ·v1
a2,1·v1 + a2,2·v2 = λ·v2
which is equivalent to
(a1,1−λ)·v1 + a1,2·v2 = 0
a2,1·v1 + (a2,2−λ)·v2 = 0
This is a system of two linear equations with three unknowns λ, v1 and v2.
One trivial solution would be v1=0 and v2=0, in which case λ can take any value.
This is not a case worthy of analyzing.
If the matrix of coefficients of this system has a non-zero determinant, this trivial solution would be the only one.
Therefore, if we are looking for a non-trivial solution, the matrix's determinant must be zero, which gives a specific condition on the value of λ.
Therefore, a necessary condition for existence of a vector v other than null-vector is
(a1,1−λ)·(a2,2−λ) − a1,2· a2,1 = 0
or
λ² − (a1,1+a2,2)·λ +
+ a1,1·a2,2−a1,2·a2,1 = 0
Since we are looking for real values of λ, we have to examine the discriminant D of this quadratic equation.
D =
=(a1,1+a2,2)²−4·(a1,1·a2,2−a1,2·a2,1)
=(a1,1−a2,2)²+4·a1,2·a2,1
If D is negative, there are no real solutions for λ.
If D is zero, there is one real solution for λ.
If D is positive, there are two real solutions for λ.
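The three cases can be sketched as a small function that returns the real eigenvalues of a 2⨯2 matrix from its discriminant. The function name real_eigenvalues is hypothetical and the code is only an illustration of the formulas above.

```python
from math import sqrt


def real_eigenvalues(a11, a12, a21, a22):
    """Real roots of the quadratic equation for eigenvalues,
    chosen by the sign of the discriminant D."""
    d = (a11 - a22) ** 2 + 4 * a12 * a21
    if d < 0:
        return []                     # no real eigenvalues
    middle = (a11 + a22) / 2
    if d == 0:
        return [middle]               # one real eigenvalue
    half = sqrt(d) / 2
    return [middle - half, middle + half]


print(real_eigenvalues(5, 6, 6, 10))   # the matrix of this lecture
print(real_eigenvalues(0, -1, 1, 0))   # rotation by 90 degrees
print(real_eigenvalues(2, 0, 0, 2))    # stretching by a factor of 2
```

The rotation matrix is an example with negative D: no vector on the plane keeps its direction after a 90 degree turn.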
Consider now that we have determined λ and would like to find vectors transformed into collinear ones by matrix A with this exact factor of change in magnitude.
If some vector v=||v1,v2|| that is transformed into a collinear one with a factor λ exists, vector s·v, where s is any real non-zero number, would have exactly the same quality because of associativity and commutativity of multiplication by a scalar.
A·(s·v) = (A·s)·v = (s·A)·v =
= s·(A·v) = s·(λ·v)= λ·(s·v)
Therefore, we don't need to determine the exact values of v1 and v2; we only need to determine the direction of vector v=||v1,v2||, and this direction is determined by the ratio v1/v2 or v2/v1 (to cover all cases, when one of them might be zero).
If v2≠0, the directions of a vector v and that of vector ||v1/v2,1|| are the same.
If v1≠0, the directions of a vector v and that of vector ||1,v2/v1|| are the same.
From this follows that, firstly, we can search for eigenvectors among those with v2≠0, restricting our search to vectors ||x=v1/v2,1||.
Then we can search for eigenvectors among those with v1≠0, restricting our search to vectors ||1,x=v2/v1||.
In both cases we will have to solve a system of two equations with two unknowns λ and x.
Searching for vectors ||x,1||
In this case the matrix equation that might deliver the required vector looks like this
A·||x,1|| = λ·||x,1||
Performing the matrix by vector multiplication on the left side and scalar by vector on the right side and equating each component, we obtain a system of two equations with two unknowns - λ and x:
a1,1·x+a1,2·1 = λ·x
a2,1·x+a2,2·1 = λ·1
Take the expression for λ given by the second equation and substitute it into the right side of the first equation, obtaining a quadratic equation for x:
a1,1·x+a1,2 = (a2,1·x+a2,2)·x
or
a2,1·x² + (a2,2−a1,1)·x − a1,2 = 0
The two solutions x1,2 of this equation, assuming they are real values, produce two vectors ||x1,1|| and ||x2,1||, each of which satisfies the condition of collinearity after the matrix transformation.
Generally speaking, the factor λ will be different for each such vector.
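The search among vectors ||x,1|| can be sketched as follows: solve the quadratic equation for x, then recover λ from the second equation of the system, λ = a2,1·x+a2,2. The function name eigenvectors_x1 is hypothetical, and the sketch assumes a2,1≠0 so that the equation is really quadratic.

```python
from math import sqrt


def eigenvectors_x1(a11, a12, a21, a22):
    """Return (x, eigenvalue) pairs for eigenvectors of the form ||x,1||.
    Assumption of this sketch: a21 != 0."""
    if a21 == 0:
        raise ValueError("this sketch assumes a21 != 0")
    b = a22 - a11
    d = b * b + 4 * a21 * a12          # discriminant of a21*x^2 + b*x - a12
    if d < 0:
        return []
    xs = {(-b - sqrt(d)) / (2 * a21), (-b + sqrt(d)) / (2 * a21)}
    return [(x, a21 * x + a22) for x in sorted(xs)]


# Matrix of this lecture: rows 5, 6 and 6, 10.
for x, eigenvalue in eigenvectors_x1(5, 6, 6, 10):
    print("vector", [x, 1], "eigenvalue", eigenvalue)
```

Results are floating-point numbers, so they should be compared with the exact fractions only up to rounding.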
Searching for vectors ||1,x||
In this case the matrix equation that might deliver the required vector looks like this
A·||1,x|| = λ·||1,x||
Performing the matrix by vector multiplication on the left side and scalar by vector on the right side and equating each component, we obtain a system of two equations with two unknowns - λ and x:
a1,1·1+a1,2·x = λ·1
a2,1·1+a2,2·x = λ·x
Take the expression for λ given by the first equation and substitute it into the right side of the second equation, obtaining a quadratic equation for x:
a2,1·1+a2,2·x = (a1,1·1+a1,2·x)·x
or
a1,2·x² + (a1,1−a2,2)·x − a2,1 = 0
The two solutions x1,2 of this equation, assuming they are real values, produce two vectors ||1,x1|| and ||1,x2||, each of which satisfies the condition of collinearity after the matrix transformation.
Generally speaking, the factor λ will be different for each such vector.
Once again, let's emphasize important definitions.
Vectors transformed into collinear ones by a matrix of transformation are called eigenvectors or characteristic vectors for this matrix.
The factor λ corresponding to some eigenvector is called eigenvalue or characteristic value of the matrix and this eigenvector.
Let's determine eigenvectors and eigenvalues for the matrix A
| 5 | 6 |
| 6 | 10 |
used as an example above.
The quadratic equation to determine the multiplier λ for this matrix is
λ² − (a1,1+a2,2)·λ +
+ a1,1·a2,2−a1,2·a2,1 = 0
which amounts to
λ² − 15λ + 14 = 0
with solutions
λ1 = 1 and λ2 = 14
Let's find the eigenvectors of this matrix.
The quadratic equation for eigenvectors of type ||x,1|| is
6x² + (10−5)x − 6 = 0 or
6x² + 5x − 6 = 0
Solutions are
x1,2 = (1/12)·(−5±√(25+4·36)) =
= (1/12)·(−5±13)
Therefore,
x1 = 2/3
x2 = −3/2
Two eigenvectors are:
v1 = ||2/3,1|| which is collinear to vector ||2,3|| used in the example above and
v2 = ||−3/2,1|| which is collinear to vector ||3,−2|| used in the example above.
The matrix transformations of these eigenvectors are
5·(2/3)+6·1 = 28/3
6·(2/3)+10·1 = 14
The resulting vector ||28/3,14|| equals 14·||2/3,1||, which means that eigenvector ||2/3,1|| has eigenvalue 14.
5·(−3/2)+6·1 = −3/2
6·(−3/2)+10·1 = 1
The resulting vector ||−3/2,1|| equals the eigenvector ||−3/2,1|| itself, which means that eigenvector ||−3/2,1|| has eigenvalue 1.
Not surprisingly, both eigenvectors found above have eigenvalues already found (1 and 14).
The quadratic equation for eigenvectors of type ||1,x|| is
6x² + (5−10)x − 6 = 0 or
6x² − 5x − 6 = 0
Solutions are
x1,2 = (1/12)·(5±√(25+4·36)) =
= (1/12)·(5±13)
Therefore,
x1 = 3/2
x2 = −2/3
Two eigenvectors are:
v1 = ||1,3/2|| which is collinear to vector ||2,3|| used in the example above and
v2 = ||1,−2/3|| which is collinear to vector ||3,−2|| used in the example above.
So, we did not gain any new eigenvalues by searching for vectors of a form ||1,x||.
The above calculations showed that for a given matrix we have two eigenvectors, each with its own eigenvalue.
Based on these calculations, we can now answer the questions presented before.
Q1. Are there always some particular vectors that retain the direction if transformed by some particular matrix?
A1. Not always, but only if the quadratic equations for x
a2,1·x² + (a2,2−a1,1)·x − a1,2 = 0
and
a1,2·x² + (a1,1−a2,2)·x − a2,1 = 0
where ||ai,j|| (i,j∈{1,2}) is a matrix of transformation, have real solutions.
Q2. If yes, how to find them?
A2. Solve the quadratic equations above. For each real solution x of the first equation, vector ||x,1|| is an eigenvector and, for each real solution x of the second equation, vector ||1,x|| is an eigenvector. Then apply the matrix of transformation to each eigenvector ||x,1|| or ||1,x|| and compare the result with this vector. It should be equal to some eigenvalue λ multiplied by this eigenvector.
Q3. How many such vectors exist, if any?
A3. As many as the quadratic equations above have real solutions, but no more than two.
Incidentally, in three-dimensional case our equations will be polynomial of the 3rd degree, and the number of solutions will be restricted to three.
In N-dimensional case this maximum number will be N.
Q4. How to find all the multiplication factors for a particular matrix transformation?
A4. Quadratic equation for eigenvalues
λ² − (a1,1+a2,2)·λ +
+ a1,1·a2,2−a1,2·a2,1 = 0
can have 0, 1 or 2 real solutions.
The concept of eigenvectors and eigenvalues (characteristic vectors and characteristic values) can be extended to N-dimensional Euclidean vector spaces and even to abstract vector spaces, like, for example, a set of all real functions integrable on a segment [0,1].
The detailed analysis of these cases is, however, beyond the current course, which aims primarily to introduce advanced concepts.
Problem A
Research conditions when a diagonal matrix (only elements along the main diagonal are not zero) has eigenvalues.
Solution A
Matrix of transformation A=||ai,j|| has zeros for i≠j.
So, it looks like this
| a1,1 | 0 |
| 0 | a2,2 |
The equation for eigenvalues in this (a1,2=a2,1=0) case is
λ² − (a1,1+a2,2)·λ + a1,1·a2,2 = 0
with immediately obvious solutions
λ1=a1,1 and λ2=a2,2
So, the values along the main diagonal of a diagonal matrix are the eigenvalues of this matrix.
Determine the eigenvectors now among vectors ||x,1||.
Original quadratic equation for this case is
a2,1·x² + (a2,2−a1,1)·x − a1,2 = 0
With a2,1=a1,2=0 it looks simpler:
(a2,2−a1,1)·x = 0
From this we conclude that, if a2,2≠a1,1, the only solution is x=0, so our eigenvector is ||0,1||.
The eigenvalue for this eigenvector is a2,2.
If a2,2=a1,1, any x is good enough, so any vector is an eigenvector.
Determine the eigenvectors now among vectors ||1,x||.
Original quadratic equation for this case is
a1,2·x² + (a1,1−a2,2)·x − a2,1 = 0
With a2,1=a1,2=0 it looks simpler:
(a1,1−a2,2)·x = 0
From this we conclude that, if a2,2≠a1,1, the only solution is x=0, so our eigenvector is ||1,0||.
The eigenvalue for this eigenvector is a1,1.
If a2,2=a1,1, any x is good enough, so any vector is an eigenvector.
Answer A
If matrix of transformation is diagonal
| a1,1 | 0 |
| 0 | a2,2 |
and a2,2≠a1,1,
the two eigenvectors are base unit vectors and the eigenvalues are a1,1 for base unit vector ||1,0|| and a2,2 for base unit vector ||0,1||.
In the case of a1,1=a2,2 any vector is an eigenvector with eigenvalue a1,1.
Problem B
Prove that symmetrical matrix always has real eigenvectors.
Solution B
Matrix of transformation A=||ai,j|| is symmetrical, which means a1,2=a2,1.
Recall that a necessary condition for existence of real eigenvalues λ is
(a1,1−λ)·(a2,2−λ) − a1,2· a2,1 = 0
or
λ² − (a1,1+a2,2)·λ +
+ a1,1·a2,2−a1,2·a2,1 = 0
Since we are looking for real values of λ, we have to examine the discriminant D of this quadratic equation.
D = (a1,1−a2,2)²+4·a1,2·a2,1
Since a1,2=a2,1, their product is non-negative, which makes the whole discriminant non-negative.
If D is zero, there is one real solution for λ.
If D is positive, there are two real solutions for λ.
So, one or two solutions always exist.
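The conclusion can be spot-checked numerically. This is not a proof, only a sketch that evaluates the discriminant for every symmetrical matrix with small integer elements; the names discriminant, symmetric_ok and zero_cases are hypothetical.

```python
from itertools import product


def discriminant(a11, a12, a21, a22):
    """Discriminant D of the quadratic equation for eigenvalues."""
    return (a11 - a22) ** 2 + 4 * a12 * a21


values = range(-3, 4)

# Symmetrical matrices: a12 = a21 = b.
symmetric_ok = all(discriminant(a, b, b, c) >= 0
                   for a, b, c in product(values, repeat=3))

# Cases with D = 0, that is, with a single eigenvalue.
zero_cases = [(a, b, c) for a, b, c in product(values, repeat=3)
              if discriminant(a, b, b, c) == 0]

print(symmetric_ok, zero_cases)
```

In this range D is zero only when a1,1=a2,2 and a1,2=a2,1=0, which agrees with the formula: a sum of two non-negative terms vanishes only when both terms do.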
Matrices+ 02
Matrix Eigenvalues
The concepts addressed in this lecture for two-dimensional real case are as well applicable to N-dimensional spaces and even to real or complex abstract vector spaces with linear transformations defined there.
Presentation in a two-dimensional real space is chosen for its relative simplicity and easy exemplification.
Let's consider a 2⨯2 matrix A as a linear operator in the two-dimensional Euclidean vector space. In other words, multiplication of any vector v on a coordinate plane by this 2⨯2 matrix A linearly transforms it into another vector on the plane w=A·v.
Assume, matrix A is
| 5 | 6 | 
| 6 | 10 | 
Let's see how this linear operator works, if applied to different vectors.
We will use a row-vector notation in the text for compactness, but column-vector notation in the transformation examples below.
The coordinates of our vectors we will enclose into double bars, like matrices, because a row-vector is a matrix with only one row, and a column-vector is a matrix with only one column.
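Our first example of a vector to apply this linear transformation is v=||1,1||.
5·1+6·1 = 11
6·1+10·1 = 16
Obviously, the resulting vector w=||11,16|| and the original one v=||1,1|| are not collinear.
Applied to a different vector v=||3,−2||, we obtain a somewhat unexpected result
5·3+6·(−2) = 3
6·3+10·(−2) = −2
Interestingly, the resulting vector w=||3,−2|| and the original one are the same. So, this operator leaves this particular vector in place. In other words, it retains the direction of this vector and multiplies its magnitude by a factor of 1.
Finally, let's apply our operator to a vector v=||2,3||.
5·2+6·3 = 28
6·2+10·3 = 42
Notice, the resulting vector w=||28,42|| is the original one v=||2,3|| multiplied by 14. So, this operator transforms this particular vector to a collinear one, just longer in magnitude by a factor of 14.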
As we see, for this particular matrix we found two vectors that, if transformed by this matrix as by a linear operator, retain their direction while changing their magnitude by some factor.
These vectors are called eigenvectors. For each eigenvector there is a factor that characterizes the change in its magnitude if this matrix acts on it as an operator. This factor is called eigenvalue. This eigenvalue in the example above was 1 for v=||3,−2|| and 14 for v=||2,3||.
There are some questions one might ask.
1. Are there always some particular vectors that retain the direction if transformed by some particular matrix?
2. If yes, how to find them and how to find the corresponding multiplication factors?
3. How many such vectors exist, if any?
4. How to find all the multiplication factors for a particular matrix transformation?
Let's analyze the linear transformation by a matrix that leaves the direction of a vector without change, just changes the magnitude by some factor λ.
Assume, we have a matrix A=||ai,j||, where i,j∈{1,2}, in our two-dimensional Euclidean space.
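This matrix converts any vector v=||v1,v2|| into some other vector, but we are looking for such a vector v that is converted by this matrix into a collinear one.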
If matrix A transforms vector v to a collinear one with the magnitude of the original one multiplied by a factor λ, the following matrix equation must hold
A·v = λ·v
or in coordinate form
a1,1·v1 + a1,2·v2 = λ·v1
a2,1·v1 + a2,2·v2 = λ·v2
which is equivalent to
(a1,1−λ)·v1 + a1,2·v2 = 0
a2,1·v1 + (a2,2−λ)·v2 = 0
This is a system of two linear equations with three unknowns λ, v1 and v2.
One trivial solution would be v1=0 and v2=0, in which case λ can take any value.
This is not a case worthy of analyzing.
If the matrix of coefficients of this system has a non-zero determinant, this trivial solution would be the only one.
Therefore, if we are looking for a non-trivial solution, the matrix's determinant must be zero, which gives a specific condition on the value of λ.
Therefore, a necessary condition for existence of a vector v other than null-vector is
(a1,1−λ)·(a2,2−λ) − a1,2· a2,1 = 0
or
λ² − (a1,1+a2,2)·λ +
+ a1,1·a2,2−a1,2·a2,1 = 0
Since we are looking for real values of λ, we have to examine the discriminant D of this quadratic equation.
D =
=(a1,1+a2,2)²−4·(a1,1·a2,2−a1,2·a2,1)
=(a1,1−a2,2)²+4·a1,2·a2,1
If D is negative, there are no real solutions for λ.
If D is zero, there is one real solution for λ.
If D is positive, there are two real solutions for λ.
Consider now that we have determined λ and would like to find vectors transformed into collinear ones by matrix A with this exact factor of change in magnitude.
If some vector v=||v1,v2|| that is transformed into a collinear one with a factor λ exists, vector s·v, where s is any real non-zero number, would have exactly the same quality because of associativity and commutativity of multiplication by a scalar.
A·(s·v) = (A·s)·v = (s·A)·v =
= s·(A·v) = s·(λ·v)= λ·(s·v)
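Therefore, we don't need to determine the exact values of v1 and v2; we only need to determine the direction of vector v=||v1,v2||, and this direction is determined by the ratio v1/v2 or v2/v1 (to cover all cases, when one of them might be zero).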
If v2≠0, the directions of a vector v and that of vector ||v1/v2,1|| are the same.
If v1≠0, the directions of a vector v and that of vector ||1,v2/v1|| are the same.
From this follows that, firstly, we can search for eigenvectors among those with v2≠0, restricting our search to vectors ||x=v1/v2,1||.
Then we can search for eigenvectors among those with v1≠0, restricting our search to vectors ||1,x=v2/v1||.
In both cases we will have to solve a system of two equations with two unknowns λ and x.
Searching for vectors ||x,1||
In this case the matrix equation that might deliver the required vector looks like this
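A·||x,1|| = λ·||x,1||
Performing the matrix by vector multiplication on the left side and scalar by vector on the right side and equating each component, we obtain a system of two equations with two unknowns - λ and x: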
a1,1·x+a1,2·1 = λ·x
a2,1·x+a2,2·1 = λ·1
Take the expression for λ given by the second equation and substitute it into the right side of the first equation, obtaining a quadratic equation for x:
a1,1·x+a1,2 = (a2,1·x+a2,2)·x
or
a2,1·x² + (a2,2−a1,1)·x − a1,2 = 0
The two solutions x1,2 of this equation, assuming they are real values, produce two vectors ||x1,1|| and ||x2,1||, each of which satisfies the condition of collinearity after the matrix transformation.
Generally speaking, the factor λ will be different for each such vector.
Searching for vectors ||1,x||
In this case the matrix equation that might deliver the required vector looks like this
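A·||1,x|| = λ·||1,x||
Performing the matrix by vector multiplication on the left side and scalar by vector on the right side and equating each component, we obtain a system of two equations with two unknowns - λ and x: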
a1,1·1+a1,2·x = λ·1
a2,1·1+a2,2·x = λ·x
Take the expression for λ given by the first equation and substitute it into the right side of the second equation, obtaining a quadratic equation for x:
a2,1·1+a2,2·x = (a1,1·1+a1,2·x)·x
or
a1,2·x² + (a1,1−a2,2)·x − a2,1 = 0
The two solutions x1,2 of this equation, assuming they are real values, produce two vectors ||1,x1|| and ||1,x2||, each of which satisfies the condition of collinearity after the matrix transformation.
Generally speaking, the factor λ will be different for each such vector.
Once again, let's emphasize important definitions.
Vectors transformed into collinear ones by a matrix of transformation are called eigenvectors or characteristic vectors for this matrix.
The factor λ corresponding to some eigenvector is called eigenvalue or characteristic value of the matrix and this eigenvector.
Let's determine eigenvectors and eigenvalues for a matrix A
| 5 | 6 | 
| 6 | 10 | 
The quadratic equation to determine the multiplier λ for this matrix is
λ² − (a1,1+a2,2)·λ +
+ a1,1·a2,2−a1,2·a2,1 = 0
which amounts to
λ² − 15λ + 14 = 0
with solutions
λ1 = 1 and λ2 = 14
Let's find the eigenvectors of this matrix.
The quadratic equation for eigenvectors of type ||x,1|| is
6x² + (10−5)x − 6 = 0 or
6x² + 5x − 6 = 0
Solutions are
x1,2 = (1/12)·(−5±√(25+4·36)) =
= (1/12)·(−5±13)
Therefore,
x1 = 2/3
x2 = −3/2
Two eigenvectors are:
v1 = ||2/3,1|| which is collinear to vector ||2,3|| used in the example above and
v2 = ||−3/2,1|| which is collinear to vector ||3,−2|| used in the example above.
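The matrix transformations of these eigenvectors are
5·(2/3)+6·1 = 28/3
6·(2/3)+10·1 = 14
The resulting vector ||28/3,14|| equals 14·||2/3,1||, which means that eigenvector ||2/3,1|| has eigenvalue 14.
5·(−3/2)+6·1 = −3/2
6·(−3/2)+10·1 = 1
The resulting vector ||−3/2,1|| equals the eigenvector ||−3/2,1|| itself, which means that eigenvector ||−3/2,1|| has eigenvalue 1.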
Not surprisingly, both eigenvectors found above have eigenvalues already found (1 and 14).
The quadratic equation for eigenvectors of type ||1,x|| is
6x² + (5−10)x − 6 = 0 or
6x² − 5x − 6 = 0
Solutions are
x1,2 = (1/12)·(5±√(25+4·36)) =
= (1/12)·(5±13)
Therefore,
x1 = 3/2
x2 = −2/3
Two eigenvectors are:
v1 = ||1,3/2|| which is collinear to vector ||2,3|| used in the example above and
v2 = ||1,−2/3|| which is collinear to vector ||3,−2|| used in the example above.
So, we did not gain any new eigenvalues by searching for vectors of a form ||1,x||.
The above calculations showed that for a given matrix we have two eigenvectors, each with its own eigenvalue.
Based on these calculations, we can now answer the questions presented before.
Q1. Are there always some particular vectors that retain the direction if transformed by some particular matrix?
A1. Not always, but only if the quadratic equations for x
a2,1·x² + (a2,2−a1,1)·x − a1,2 = 0
and
a1,2·x² + (a1,1−a2,2)·x − a2,1 = 0
where ||ai,j|| (i,j∈{1,2}) is a matrix of transformation, have real solutions.
Q2. If yes, how to find them?
A2. Solve the quadratic equations above. For each real solution x of the first equation, vector ||x,1|| is an eigenvector and, for each real solution x of the second equation, vector ||1,x|| is an eigenvector. Then apply the matrix of transformation to each eigenvector ||x,1|| or ||1,x|| and compare the result with this vector. It should be equal to some eigenvalue λ multiplied by this eigenvector.
Q3. How many such vectors exist, if any?
A3. As many as the quadratic equations above have real solutions, but no more than two.
Incidentally, in three-dimensional case our equations will be polynomial of the 3rd degree, and the number of solutions will be restricted to three.
In N-dimensional case this maximum number will be N.
Q4. How to find all the multiplication factors for a particular matrix transformation?
A4. Solve the quadratic equation for eigenvalues
λ² − (a1,1+a2,2)·λ + (a1,1·a2,2−a1,2·a2,1) = 0
It can have 0, 1 or 2 real solutions, and each real solution is a multiplication factor (eigenvalue).
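The answers A2 and A4 can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: the function names `real_roots`, `eigenvectors_2x2` and `eigenvalues_2x2` are hypothetical, floating-point arithmetic is used, and the degenerate case where every x satisfies an equation is deliberately left out.

```python
import math

def real_roots(a, b, c):
    """Real solutions of a*x^2 + b*x + c = 0, in increasing order."""
    if a == 0:
        if b == 0:
            return []  # either no x or every x fits; not handled in this sketch
        return [-c / b]
    d = b * b - 4 * a * c
    if d < 0:
        return []
    s = math.sqrt(d)
    return sorted({(-b - s) / (2 * a), (-b + s) / (2 * a)})

def eigenvectors_2x2(a11, a12, a21, a22):
    """A2: (eigenvalue, eigenvector) pairs among vectors ||x,1|| and ||1,x||."""
    pairs = []
    for x in real_roots(a21, a22 - a11, -a12):   # vectors ||x,1||
        pairs.append((a21 * x + a22, (x, 1.0)))  # second component gives the factor
    for x in real_roots(a12, a11 - a22, -a21):   # vectors ||1,x||
        pairs.append((a11 + a12 * x, (1.0, x)))  # first component gives the factor
    return pairs

def eigenvalues_2x2(a11, a12, a21, a22):
    """A4: real solutions of the quadratic equation for eigenvalues."""
    return real_roots(1, -(a11 + a22), a11 * a22 - a12 * a21)

# Matrix elements 5, 6, 6, 10 are assumed to match the example above.
print(eigenvalues_2x2(5, 6, 6, 10))   # [1.0, 14.0]
# Rotation by 90 degrees: no vector retains its direction.
print(eigenvalues_2x2(0, -1, 1, 0))   # []
print(eigenvectors_2x2(0, -1, 1, 0))  # []
```

Both forms ||x,1|| and ||1,x|| are searched because each one misses a direction: ||x,1|| cannot represent ||1,0||, and ||1,x|| cannot represent ||0,1||.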
The concept of eigenvectors and eigenvalues (characteristic vectors and characteristic values) can be extended to N-dimensional Euclidean vector spaces and even to abstract vector spaces like, for example, the set of all real functions integrable on a segment [0,1].
The detailed analysis of these cases is, however, beyond the scope of the current course, which is aimed primarily at introducing advanced concepts.
Problem A
Investigate the conditions under which a diagonal matrix (only elements along the main diagonal can be non-zero) has eigenvalues.
Solution A
Matrix of transformation A=||ai,j|| has zeros for i≠j.
So, it looks like this
| a1,1 | 0 | 
| 0 | a2,2 | 
The equation for eigenvalues in this (a1,2=a2,1=0) case is
λ² − (a1,1+a2,2)·λ + a1,1·a2,2 = 0
with immediately obvious solutions
λ1=a1,1 and λ2=a2,2
So, the values along the main diagonal of a diagonal matrix are the eigenvalues of this matrix.
Determine the eigenvectors now among vectors ||x,1||.
Original quadratic equation for this case is
a2,1·x² + (a2,2−a1,1)·x − a1,2 = 0
With a2,1=a1,2=0 it looks simpler:
(a2,2−a1,1)·x = 0
From this we conclude that, if a2,2≠a1,1, the only solution is x=0, so our eigenvector is ||0,1||.
The eigenvalue for this eigenvector is a2,2.
If a2,2=a1,1, any x satisfies the equation, so any vector ||x,1|| is an eigenvector with eigenvalue a1,1.
Determine the eigenvectors now among vectors ||1,x||.
Original quadratic equation for this case is
a1,2·x² + (a1,1−a2,2)·x − a2,1 = 0
With a2,1=a1,2=0 it looks simpler:
(a1,1−a2,2)·x = 0
From this we conclude that, if a2,2≠a1,1, the only solution is x=0, so our eigenvector is ||1,0||.
The eigenvalue for this eigenvector is a1,1.
If a2,2=a1,1, any x satisfies the equation, so any vector ||1,x|| is an eigenvector with eigenvalue a1,1.
Answer A
If the matrix of transformation is diagonal
| a1,1 | 0 | 
| 0 | a2,2 | 
the two eigenvectors are base unit vectors and the eigenvalues are a1,1 for base unit vector ||1,0|| and a2,2 for base unit vector ||0,1||.
In the case of a1,1=a2,2 any vector is an eigenvector with eigenvalue a1,1.
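Answer A is easy to confirm numerically. The sketch below uses hypothetical sample values (a1,1=2, a2,2=7 and a1,1=a2,2=3) and illustrative helper names; collinearity is tested with the two-dimensional cross product.

```python
def apply_diagonal(a11, a22, v):
    """Apply the diagonal matrix with entries a11, a22 to the vector v."""
    return (a11 * v[0], a22 * v[1])

def keeps_direction(v, w):
    """True if w is collinear with v (their 2D cross product is zero)."""
    return v[0] * w[1] - v[1] * w[0] == 0

# Hypothetical sample values: a11 = 2, a22 = 7.
print(apply_diagonal(2, 7, (1, 0)))   # (2, 0), that is 2·||1,0||
print(apply_diagonal(2, 7, (0, 1)))   # (0, 7), that is 7·||0,1||
# A vector off the axes changes direction when a11 differs from a22.
print(keeps_direction((1, 1), apply_diagonal(2, 7, (1, 1))))    # False
# With a11 = a22 = 3 an arbitrary vector keeps its direction.
print(keeps_direction((4, -5), apply_diagonal(3, 3, (4, -5))))  # True
```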
Problem B
Prove that a symmetrical matrix always has real eigenvectors.
Solution B
Matrix of transformation A=||ai,j|| is symmetrical, which means a1,2=a2,1.
Recall that a necessary condition for existence of real eigenvalues λ is
(a1,1−λ)·(a2,2−λ) − a1,2· a2,1 = 0
or
λ² − (a1,1+a2,2)·λ + (a1,1·a2,2−a1,2·a2,1) = 0
Since we are looking for real values of λ, we have to examine the discriminant D of this quadratic equation.
D = (a1,1−a2,2)²+4·a1,2·a2,1
Since a1,2=a2,1, their product equals (a1,2)², which is non-negative. The first term (a1,1−a2,2)² is also non-negative, which makes the whole discriminant non-negative.
If D is zero, there is one real solution for λ.
If D is positive, there are two real solutions for λ.
So, one or two solutions always exist.
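As a quick sanity check of this argument (not a replacement for the proof), the discriminant can be evaluated over a small grid of integer symmetric matrices. The function name `discriminant` and the grid range are arbitrary choices for this sketch.

```python
def discriminant(a11, a12, a21, a22):
    """D = (a11 - a22)^2 + 4*a12*a21 for the eigenvalue quadratic."""
    return (a11 - a22) ** 2 + 4 * a12 * a21

values = range(-3, 4)
# Symmetric case: the off-diagonal elements are equal (a12 = a21 = b).
symmetric_ok = all(discriminant(a, b, b, d) >= 0
                   for a in values for b in values for d in values)
print(symmetric_ok)                # True

# A non-symmetric counterexample: rotation by 90 degrees.
print(discriminant(0, -1, 1, 0))   # -4, so no real eigenvalues
```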

