Thursday, July 10, 2025

Physics+ MIN/MAX of Functional: UNIZOR.COM - Physics+ 4 All - Lagrangian

Notes to a video lecture on UNIZOR.COM

Lagrangian - Definition of
Min/Max of Functional


In this lecture we will discuss a concept of a local minimum or maximum of a functional.

Consider an N-dimensional real function f(x1,...,xN) defined on a domain of sets of N real numbers.
We can always assume that these N real numbers are represented by a point in N-dimensional vector space with Cartesian coordinates and point O(0,...,0) as the origin.

What is the meaning of a statement that this function has a local minimum at point
P(x1,...,xN)?

In plain language it means that within a sufficiently small neighborhood around point P, no matter where we move from point P, the value of our function at that new point will be greater than or equal to f(P).

Let's formalize this definition in a way that will be used to define a local minimum of a functional.

Firstly, for convenience, we will use vectors originated at the origin of coordinates and ending at some point instead of N-dimensional coordinates of that point.
So, vector OP that stretches from the origin of coordinates O to point P will replace coordinates (x1,...,xN) of that point.
Using this, our function can be viewed as f(OP).

A "sufficiently small neighborhood" of point P(x1,...,xN) (or of vector OP) can be described as all points Q(y1,...,yN) at a sufficiently small distance from point P according to the regular definition of distance in Cartesian coordinates (or as all vectors OQ, also originating at the origin of coordinates, such that the magnitude of the difference vector |OQ−OP|=|PQ| is sufficiently small).

Since we are dealing with N-dimensional Cartesian space, we know how to determine the distance between two points P and Q or a magnitude of a vector PQ that represents a difference between two vectors OQ−OP.

We can also approach it differently getting an equivalent definition of a minimum.
Consider any vector e of unit length.
Now, all vectors OP+t·e, where t is a "sufficiently small" non-negative real value and e can be any vector of unit length, describe a "sufficiently small neighborhood" of vector OP.

This representation of "sufficiently small neighborhood" might be more convenient since it depends on a single real value t.

Using the above, we can define a local minimum of N-dimensional function f() as follows.
Vector OP is defined as a local minimum of function f() if there exists a real positive τ such that
f(OP) ≤ f(OP+t·e)
for all 0 ≤ t ≤ τ and
for any unit vector e.
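
The definition above can be tried out numerically on a sample of directions. The following is a minimal Python sketch, not a proof: the example function f, the helper name is_local_min_sampled, the value of tau and the sample sizes are all assumptions chosen only for illustration.

```python
import math

def f(x, y):
    # Hypothetical example function of two variables with a minimum at P = (1, -3)
    return (x - 1.0) ** 2 + 2.0 * (y + 3.0) ** 2

def is_local_min_sampled(f, p, tau=1e-2, n_dirs=360, n_steps=20):
    """Check f(OP) <= f(OP + t*e) for sampled unit vectors e and 0 <= t <= tau."""
    fp = f(p[0], p[1])
    for k in range(n_dirs):
        angle = 2.0 * math.pi * k / n_dirs
        e = (math.cos(angle), math.sin(angle))  # unit vector in direction 'angle'
        for j in range(n_steps + 1):
            t = tau * j / n_steps
            if f(p[0] + t * e[0], p[1] + t * e[1]) < fp:
                return False  # found a nearby point with a smaller value
    return True

print(is_local_min_sampled(f, (1.0, -3.0)))  # True: P = (1, -3) passes the sampled check
print(is_local_min_sampled(f, (0.0, 0.0)))   # False: moving toward (1, -3) decreases f
```

A sampled check like this can only refute a candidate point; passing it for finitely many directions and steps does not prove that the point is a minimum.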

Obviously, a local maximum can be defined analogously by changing "less than or equal to" to "greater than or equal to" in the above definition.

For a function of one argument we usually look for a local minimum by solving an equation with a function's derivative equal to zero.
This is a very geometrical approach of looking for minimum because on one side of a minimum our function is decreasing, on another - increasing, so a derivative is changing its sign from negative for decreasing interval to positive for increasing, and, therefore, must be equal to zero at the point of minimum itself.

With functions of two and more arguments this geometric logic is not so obvious, but our alternative way of defining a minimum using unit vector e originated at the point of minimum and scalar multiplier t helps to return to geometrical meaning of a local minimum.

You can imagine that each direction of unit vector e defines a plane parallel to Z-axis going through point P and unit vector e cutting a paraboloid on a picture above with a parabola as an intersection.
This parabola is a function of one variable t and, therefore, at point of minimum P must have its derivative by t equal to zero.
This derivative is called directional derivative with vector e being a direction.

Indeed, if a directional derivative by t of f(OP+t·e) is zero at point P for each unit vector e, regardless of its direction, then point P is a good candidate for a local minimum (or maximum).
This approach allows us to deal with one-dimensional case many times (for each direction of unit vector e) instead of once but for more complicated case of multiple dimensions.

Of course, the number of possible directions of unit vector e is infinite, but for a differentiable function it's not difficult to prove that if the directional derivatives along each and every coordinate axis (the partial derivatives) are zero, a derivative along any other direction will be zero as well.

So, for a function of N variables it's sufficient to check N first partial derivatives of this function, and finding all candidates for minimum and maximum points requires solving a system of N equations (each partial derivative set to zero) with N unknowns. It might not be simple, but it is doable.
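
To illustrate why checking the partial derivatives is enough, here is a small Python sketch that estimates a directional derivative by a central finite difference and compares it with the combination of partial derivatives weighted by the components of e. The example function, the helper names and the step dt are assumptions made for this illustration.

```python
def f(x, y):
    # Hypothetical example function with a minimum at P = (1, -3)
    return (x - 1.0) ** 2 + 2.0 * (y + 3.0) ** 2

def directional_derivative(f, p, e, dt=1e-6):
    """Central-difference estimate of d/dt f(OP + t*e) at t = 0."""
    forward = f(p[0] + dt * e[0], p[1] + dt * e[1])
    backward = f(p[0] - dt * e[0], p[1] - dt * e[1])
    return (forward - backward) / (2.0 * dt)

def partial_derivatives(f, p, dt=1e-6):
    """Directional derivatives along the two coordinate axes."""
    return (directional_derivative(f, p, (1.0, 0.0), dt),
            directional_derivative(f, p, (0.0, 1.0), dt))

e = (0.6, 0.8)                      # a unit vector: 0.6² + 0.8² = 1
q = (2.0, -2.0)                     # an arbitrary point that is not the minimum
fx, fy = partial_derivatives(f, q)
along_e = directional_derivative(f, q, e)
combined = fx * e[0] + fy * e[1]    # partial derivatives weighted by components of e
print(along_e, combined)            # the two numbers should agree closely

p = (1.0, -3.0)                     # the minimum: both partial derivatives vanish
print(partial_derivatives(f, p), directional_derivative(f, p, e))
```

At the point of minimum both partial derivatives are zero, so the weighted combination, and with it the derivative along any direction e, is zero too.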

Let's try to transfer the above definition of a minimum of a function defined on N-dimensional vector space to a functional defined on a set of functions.

First of all, we will concentrate only on sets of "nice" functions - those defined on some segment [a,b] (including the ends) and differentiable, at least to a derivative of a second order.

Secondly, to make the analogy with vectors even better, we introduce a scalar product [·] of two such "nice" functions
[f(x)·g(x)] = ∫[a,b] f(x)·g(x)·dx

Now our functions behave pretty much like vectors and we will try to transfer the definition of a minimum from a function defined on N-dimensional vector space to a functional defined on an infinite set of "nice" functions.

Consider functional F(f) defined for each function f(x) from a set of "nice" functions defined above.
Assume, we define a function f0(x) as a point where the functional F(f) has a local minimum.
It implies that there is a neighborhood of function f0(x) such that for any function f(x) located within this neighborhood
F(f0(x)) ≤ F(f(x))

The problem with this definition is that we have not defined a concept of "neighborhood" yet.
But that is not difficult provided we have defined a scalar product of two functions.

Recall that a magnitude of a vector can be defined as a square root of its scalar product with itself
|v| = √[v·v]
So, the distance between two points P and Q in N-dimensional space, which is the length of vector PQ=OQ−OP, can be expressed as the magnitude of this vector using its scalar product with itself.

Replacing function with a functional and vector in N-dimensional space with a "nice" function, we can define a neighborhood of a function f as a set of all functions g such that magnitude of a difference between functions
||g−f|| = √[(g−f)·(g−f)]
is sufficiently small.
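
As a rough numerical illustration of the scalar product and the distance between functions, the following Python sketch approximates the integral over [a,b] by the trapezoidal rule. The helper names inner, norm and distance, the number of subintervals and the choice of sin and cos as sample functions are assumptions for this example only.

```python
import math

def inner(f, g, a, b, n=1000):
    """Approximate the scalar product of f and g on [a, b] by the trapezoidal rule."""
    step = (b - a) / n
    total = 0.5 * (f(a) * g(a) + f(b) * g(b))
    for i in range(1, n):
        x = a + i * step
        total += f(x) * g(x)
    return total * step

def norm(f, a, b):
    """Magnitude of a function: square root of its scalar product with itself."""
    return math.sqrt(inner(f, f, a, b))

def distance(f, g, a, b):
    """Magnitude of the difference g - f."""
    return norm(lambda x: g(x) - f(x), a, b)

a, b = 0.0, math.pi
print(inner(math.sin, math.cos, a, b))     # close to 0: sin and cos are "perpendicular" on [0, π]
print(norm(math.sin, a, b))                # close to sqrt(π/2)
print(distance(math.sin, math.cos, a, b))  # close to sqrt(π)
```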

As in a case of N-dimensional vector space, let's consider an alternative definition that will allow us to use differentiation to find a point of local minimum of a functional.

Consider any "nice" function f0(x) in a sense described above where functional F(f) has a local minimum.
Also consider any other "nice" function h(x) that defines a direction we can shift from point f0(x).
The neighborhood of this function f0(x) in the direction h(x) of radius τ consists of all functions
f0(x)+t·h(x)
where 0 ≤ t ≤ τ.

Now we can define a point f0(x) as a local minimum of a functional F if the functional has a minimum at this point regardless of the choice of direction h(x).
In other words,

Function f0(x) is a local minimum of functional F() if for any direction defined by function h(x) there exists a real positive number τ such that
F(f0(x)) ≤ F(f0(x)+t·h(x))
for all 0 ≤ t ≤ τ

Analogously,

Function f0(x) is a local maximum of functional F() if for any direction defined by function h(x) there exists a real positive number τ such that
F(f0(x)) ≥ F(f0(x)+t·h(x))
for all 0 ≤ t ≤ τ

The above definitions simplify a complicated dependency of a functional on an infinite set of argument functions to a much simpler dependency on a single real variable.
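
A small Python sketch may help to see this reduction to a single real variable t. It uses a hypothetical functional F(f) equal to the integral over [0,1] of (f(x)−x²)², which is an assumption chosen only because its minimum f0(x)=x² is known in advance. The directions h(x) and the radius τ are also arbitrary illustrative choices.

```python
import math

def integrate(func, a, b, n=2000):
    """Trapezoidal approximation of the integral of func over [a, b]."""
    step = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * step)
    return total * step

def F(f):
    # Hypothetical functional: squared distance of f from the target x² on [0, 1]
    return integrate(lambda x: (f(x) - x * x) ** 2, 0.0, 1.0)

def f0(x):
    return x * x  # the function where this particular F has its minimum

def shifted(f, h, t):
    """The function f(x) + t*h(x): a shift from f in the direction h."""
    return lambda x: f(x) + t * h(x)

directions = [math.sin, math.exp, lambda x: 1.0 - x]
tau = 0.1
holds = all(
    F(f0) <= F(shifted(f0, h, tau * k / 10))
    for h in directions
    for k in range(11)
)
print(holds)  # True for these sampled directions and values of t

# Along a fixed direction the functional becomes an ordinary function of t
g = lambda t: F(shifted(f0, math.sin, t))
dt = 1e-4
slope_at_zero = (g(dt) - g(-dt)) / (2.0 * dt)
print(slope_at_zero)  # close to 0 at the point of minimum
```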

The usefulness of these definitions is in our ability to differentiate by parameter t, assuming that derivative should be zero at points of local minimum or maximum.
But that is a subject of the next lecture.

Monday, June 30, 2025

Physics+ Functional, Variation: UNIZOR.COM - Physics+ 4 All - Lagrangian

Notes to a video lecture on UNIZOR.COM

Lagrangian -
Functional and Variation


To introduce concepts of Functional (a noun, not an adjective) and Variation which happen to be very important mathematical tools of Physics, let's consider the following problem.

Imagine yourself on a river bank at point A.
River banks are two parallel straight lines with distance d between them.
You have a motor boat that can go with some constant speed V relative to water.
The river has a uniform current with known speed v which we assume to be less than the speed of a boat V.
You want to cross a river to get to point B exactly opposite to point A, so segment AB is perpendicular to the river's current.

Problem:

How should you navigate your boat from point A to point B to reduce the time to cross the river to a minimum?

It sounds like a typical problem of finding a minimum of a function (to minimize time). But this resemblance is only on the surface.

In Calculus we used to find minimum or maximum of a real function of real argument by differentiating it and checking when its first derivative equals to zero.

In our case the problem is much more complex, because we are not dealing with a function (time to cross the river) whose argument is a real number. The argument to our function (time to cross the river) is a trajectory of a boat from point A to point B.
And what is a trajectory of a boat?

A trajectory is a set of positions of a boat, which, in its own right, can be a function of some argument (a trajectory can be a function of time, of an angle with segment AB or of a distance from line AB in the direction of the river's current).
A trajectory is definitely not a single real number.

In our case the trajectory is determined by two velocity vectors:
velocity vector of a boat V and
velocity vector of a river's current v.

The boat's velocity vector, while having a constant magnitude V can have variable direction depending on navigation scenario.
The current's velocity vector has constant direction along a river bank and constant magnitude v.

So, the time to cross the river is not a function in our traditional meaning as a function of real argument, it's "a function of a function", which is called Functional (a noun, not an adjective).
Examples of Functionals as "functions of functions" are
- definite integral of a real function on some interval,
- maximum or minimum of a real function on some interval,
- average value of a real function on some interval,
- length of a curve that represents a graph of a real function on some interval,
etc.
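
The listed examples can be sketched in Python as functions whose argument is itself a function. This is only an illustration: the names are hypothetical, and every value is a numerical approximation on a finite grid of sample points rather than an exact result.

```python
import math

def sample(f, a, b, n=1000):
    """Values of f at n+1 equally spaced points of [a, b]."""
    return [f(a + (b - a) * i / n) for i in range(n + 1)]

def definite_integral(f, a, b, n=1000):
    ys = sample(f, a, b, n)
    step = (b - a) / n
    return step * (sum(ys) - 0.5 * (ys[0] + ys[-1]))  # trapezoidal rule

def maximum(f, a, b, n=1000):
    return max(sample(f, a, b, n))  # maximum over the sample points only

def average(f, a, b, n=1000):
    return definite_integral(f, a, b, n) / (b - a)

def curve_length(f, a, b, n=1000):
    ys = sample(f, a, b, n)
    step = (b - a) / n
    # length of the polyline through the sample points of the graph
    return sum(math.hypot(step, ys[i + 1] - ys[i]) for i in range(n))

square = lambda x: x * x
for functional in (definite_integral, maximum, average, curve_length):
    # each functional maps the whole function 'square' on [0, 1] to one real number
    print(functional.__name__, functional(square, 0.0, 1.0))
```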

It is impossible to determine minimum or maximum of a Functional by differentiating it by its argument using traditional Calculus, because its argument is not a real number, it's a function (in our case, it's a trajectory as a function of time or some other parameter).
We need new techniques, more advanced Calculus - the Calculus of Variations to accomplish this goal.

We have just introduced two new concepts - a Functional (a noun, not an adjective) as "a function of a function" and Calculus of Variations as a new technique (similar to, but more advanced than, Calculus) that allows us to find a minimum or maximum of a Functional.
These concepts are very important and we will devote a few lectures to address these concepts from purely mathematical point before starting using them for problems of Physics.

Before diving into completely new math techniques, let's mention that in some cases, when solving a problem of finding a minimum or maximum of a functional, we can still use the classic approach of Calculus.
This can be done if an argument to a functional (a function in its own right) can be defined by a single parameter. In this case a functional can be viewed as a regular function of that parameter and, as such, can be analyzed by classic Calculus techniques.

Here is an example that is based on a problem above, but with an additional condition about trajectories.
Instead of minimizing the time to cross the river among all possible trajectories, we will consider only a special class of trajectories achieved by a specific scenario of navigation that allows one single real number to define the whole trajectory.

Assume, your navigation strategy is to maintain a constant angle φ between your course and segment AB with positive φ going counterclockwise from segment AB.

Obviously, angle φ should be in the range (−π/2,π/2).

With an angle φ chosen, a boat will reach the opposite side of a river, but not necessarily at point B, in which case the second segment of a boat's trajectory is to go along the opposite bank of a river up or down a stream to get to point B.

The problem now can be stated as follows.
Find the angle φ to minimize traveling time from A to B.
For this problem a functional (time to travel from A to B), which depends on trajectory from A to B, can be considered as a regular function (time to travel from A to B) with real argument (an angle φ).

Solution for Constant Angle φ:

If you maintain this constant angle φ, you can represent the velocity vector of a boat going across a river before it reaches the opposite bank as a sum of two constant vectors
V = V⊥ + V||
where
V⊥ is a component of the velocity vector directed across the river (perpendicularly to its current) and
V|| is a component of the velocity vector directed along the river (parallel to its current).
The magnitudes of these vectors are
|V⊥| = V·cos(φ)
|V||| = V·sin(φ)

The time for a boat to reach the opposite bank across a river is
T⊥(φ) = d / |V⊥| = d / [V·cos(φ)]

In addition to moving in a direction perpendicular to a river's current across a river with always positive speed V⊥=V·cos(φ), a boat will move along a river because of two factors: a river's current v and its own component V|| of velocity.
The resulting speed of a boat in a direction parallel to a river's current is V·sin(φ)−v, which can be positive, zero or negative.

Therefore, when a boat reaches the opposite side of a river, depending on angle φ, it might deviate from point B up or down the current by the distance
h(φ) = [V·sin(φ)−v]·d / [V·cos(φ)]
This expression equals to zero if the point of reaching the other bank coincides with point B, our target.
The condition for this is
V·sin(φ)−v=0 or
sin(φ)=v/V or
φ=arcsin(v/V)=φ0.
So if we choose a course with angle φ0=arcsin(v/V), we will hit point B, and no additional movement will be needed.
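
The formulas for the crossing time and the deviation from point B are easy to try out numerically. In the Python sketch below the values of d, V and v are hypothetical, and the function names crossing_time and drift are introduced only for this illustration.

```python
import math

def crossing_time(phi, d, V):
    """Time to reach the opposite bank when steering at constant angle phi."""
    return d / (V * math.cos(phi))

def drift(phi, d, V, v):
    """Deviation h(phi) from point B: positive upstream, negative downstream."""
    return (V * math.sin(phi) - v) * crossing_time(phi, d, V)

d, V, v = 100.0, 5.0, 3.0      # hypothetical river width, boat speed, current speed
phi0 = math.asin(v / V)        # the angle of a direct hit of point B

print(math.degrees(phi0))      # the course angle in degrees
print(drift(phi0, d, V, v))    # close to 0: the boat lands at B
print(drift(0.0, d, V, v))     # negative: steering straight across lands downstream of B
print(drift(1.0, d, V, v))     # positive: a steeper angle lands upstream of B
```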

Positive h(φ) is related to crossing a river upstream from point B when the angle of navigation φ is greater than φ0 and negative h(φ) signifies that we crossed the river downstream of B when the angle of navigation φ is less than φ0.

In both cases, after crossing a river we will have to travel along a river's bank up or down the current to cover this distance h(φ) to get to point B.

Our intuition might tell that an angle φ0 of direct hit of point B at the moment we reach the opposite side of a river, when h(φ)=0, should give the best time because we do not have to cover additional distance from a point we reached the opposite bank to point B.

It's also important that the actual trajectory of a boat in this case will be a single straight segment - segment AB - the shortest distance between the river banks.

In general, the time to reach the other side of a river depends only on component V⊥ of the boat's velocity and it equals
T⊥(φ) = d / [V·cos(φ)]

If we choose a course with angle φ=φ0 to reach the opposite side exactly at point B, the following equations hold
sin(φ0)=v/V
V²·sin²(φ0) = v²
V²·cos²(φ0) = V²−v²
V·cos(φ0) = √(V²−v²)
T⊥(φ0) = d / √(V²−v²)
This is the total time to get to point B.
If we choose some other angle φ≠φ0, we have to add to the time of crossing a river T(φ) the time to reach point B going up or down a stream along the opposite river bank.

Let's prove now that the course with angle φ=φ0 results in the best travel time from A to B.

The distance from a point where we reach the opposite bank to point B is
h(φ) = [V·sin(φ)−v]·T⊥(φ) = [V·sin(φ)−v]·d / [V·cos(φ)]

This distance must be covered by a boat by going down (if h(φ) is positive) or up (if h(φ) is negative) the river's current.
Let's consider these cases separately.

Case h(φ) is positive

This is the case when angle φ is greater than φ0.
Obviously, the time to reach point B in this case will be worse than if φ=φ0 with h(φ0)=0.
First of all, with an angle φ greater than φ0, the river crossing with speed V⊥(φ)=V·cos(φ) will take longer than with speed V⊥(φ0)=V·cos(φ0) because cos() monotonically decreases for angles from 0 to π/2.
Secondly, in addition to this time, we have to go downstream to reach point B.
So, we should not increase the course angle above φ0.

Case h(φ) ≤ 0 because φ ≤ φ0

This scenario is not so obvious because crossing the river with angle φ smaller than φ0 but greater than 0 takes less time than with angle φ0.
But it adds an extra segment to go upstream after a river is crossed.

The extra distance h(φ) is negative because V·sin(φ) is less than v, which allows a current to carry a boat below point B.
Since the point of crossing the river is below point B, the distance |h(φ)| should be covered by going upstream with speed V−v, which will take time
Th(φ) = |h(φ)| / (V−v)

Using the same expression for h(φ) but reversing its sign to deal with its absolute value, we get the additional time to reach point B after crossing a river
Th(φ) = d·[v−V·sin(φ)] / [V·cos(φ)·(V−v)]
The total travel time from point A to point B in this case is T(φ)=T⊥(φ)+Th(φ), where T⊥(φ)=d/[V·cos(φ)] is the time of crossing the river, which after trivial simplification looks like
T(φ) = d·[1−sin(φ)] / [(V−v)·cos(φ)]
This function is monotonically decreasing in φ because its derivative
T'(φ) = d·[sin(φ)−1] / [(V−v)·cos²(φ)]
is negative for all φ in (−π/2,π/2).
Therefore, its minimum is reached when its argument is the largest, that is, if φ=φ0 and h=0, and the time to get to point B is
TAB = T⊥(φ0) = d / [V·cos(φ0)] = d / √(V²−v²)

So, the answer to our simplified problem, when we managed to solve it using the classic methodology, is to choose the course of navigation from A to B at angle φ0=arcsin(v/V).
The minimum time of traveling is TAB = d / [V·cos(φ0)] = d / √(V²−v²).
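
The conclusion can be cross-checked by a brute-force scan over the angle φ. The Python sketch below is only an illustration with hypothetical values of d, V and v. It also makes one assumption not stated in the text above: when the boat lands upstream of B, it returns downstream along the bank with speed V+v.

```python
import math

def total_time(phi, d, V, v):
    """Total travel time from A to B for a constant course angle phi."""
    t_cross = d / (V * math.cos(phi))
    h = (V * math.sin(phi) - v) * t_cross
    if h > 0:
        # landed upstream of B; assumed downstream speed along the bank is V + v
        return t_cross + h / (V + v)
    # landed at or downstream of B; go upstream with speed V - v
    return t_cross + (-h) / (V - v)

d, V, v = 100.0, 5.0, 3.0                 # hypothetical width and speeds, v < V
phi0 = math.asin(v / V)

# scan angles in (-pi/2, pi/2) with a step of 0.001 radian
grid = [-1.5 + 3.0 * i / 3000 for i in range(3001)]
best = min(grid, key=lambda phi: total_time(phi, d, V, v))

print(best, phi0)                                   # the best grid angle is next to phi0
print(total_time(phi0, d, V, v), d / math.sqrt(V * V - v * v))
```

A grid scan only locates the minimum up to the grid step, so it supports the analytic argument rather than replaces it.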

As you see, in some cases, when a set of functions that are arguments to a functional can be parameterized by a single real value (like with an angle φ in the above problem), optimization problems can be solved using classic Calculus.

The subject of the next few lectures is the Calculus of Variations, which allows us to solve optimization problems in more complicated cases, when parameterization of arguments to a functional by a single real value is not possible.

Friday, June 20, 2025

Physics+ Kepler Third Law: UNIZOR.COM - Physics+ 4 All - Laws of Newton

Notes to a video lecture on UNIZOR.COM

Laws of Newton -
Kepler's Third Law


Kepler's Third Law states that for all objects moving around a fixed source of gravitational field along elliptical orbits the ratio of a square of their period of rotation to a cube of a semi-major axis is the same.

As in the case of Kepler's First Law, this Third Law was originally based on numerous experiments and years of observation.

Based on all the knowledge conveyed in previous lectures on Kepler's Laws, we will derive this Third Law theoretically.

Let's make a simple derivation of Kepler's Third Law in case of a circular orbit.

In this case the velocity vector of an object circulating around a central point is always perpendicular to a position vector from a center to an object.
Since the gravitational force is collinear with a position vector, it is also perpendicular to velocity, which is tangential to a circular orbit. Therefore, the gravitational force has no component along the velocity vector, which keeps the magnitude of the velocity vector constant.

Let's introduce the following characteristics of motion:
t - absolute time,
r - radius of a circular orbit of a moving object,
F - vector of gravity,
M - mass of the source of gravitational field,
m - mass of object moving in the gravitational field,
r - position vector from the source of gravitational field to a moving object,
r'=v - velocity vector of a moving object,
r"=v'=a - acceleration vector of a moving object,
T - period of circulation,
ω=2π/T - scalar value of angular velocity,
Here bold letters signify vectors, regular letters signify scalars and magnitudes of corresponding vectors, single and double apostrophes signify first and second derivative by time.

Constant magnitude v of velocity vector means constant angular velocity ω and obvious equality v=r·ω.

Magnitude a of an acceleration vector can be simply found by representing a position vector as a pair of Cartesian coordinates (x,y):
x = r·cos(ωt)
y = r·sin(ωt)
x' = −r·ω·sin(ωt)
y' = r·ω·cos(ωt)
x" = −r·ω²·cos(ωt)=−ω²·x
y" = −r·ω²·sin(ωt)=−ω²·y
and, therefore,
a = r" = −ω²·r
(collinear with r and F)
from which follows
a = |a| = |−ω²·r| = ω²·r

According to the Newton's Second Law,
F = m·a
According to the Universal Law of Gravitation,
F = G·M·m/r²
Therefore,
a = ω²·r = G·M/r²
from which follows
ω² = G·M/r³

Since ω=2π/T,
4π²/T² = G·M/r³
T²/r³ = 4π²/(G·M) - constant
End of proof for circular orbit.
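
As a quick sanity check of the circular-orbit formula, the Python sketch below estimates the period of an Earth-like circular orbit around the Sun. The numerical constants are rounded reference values supplied for this illustration, and the function name circular_period is hypothetical.

```python
import math

G = 6.674e-11             # gravitational constant, m^3 kg^-1 s^-2 (rounded)
M_SUN = 1.989e30          # mass of the Sun, kg (rounded)
R_EARTH_ORBIT = 1.496e11  # mean Sun-Earth distance, m (rounded)

def circular_period(r, M):
    """Period T from T²/r³ = 4π²/(G·M) for a circular orbit of radius r."""
    return 2.0 * math.pi * math.sqrt(r ** 3 / (G * M))

T = circular_period(R_EARTH_ORBIT, M_SUN)
print(T / 86400.0)  # period in days, roughly one year

# The ratio T²/r³ does not depend on the radius of the orbit
for r in (0.5e11, 1.496e11, 7.8e11):
    print(circular_period(r, M_SUN) ** 2 / r ** 3)
```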

Let's prove it in a more complicated general case of any elliptical orbit.
We will use the First and the Second Kepler's Laws as well as the results presented in the previous lecture Planet Orbit Geometry to derive this Third Law.

Recall the Kepler's Second Law (see the lecture Kepler's Second Law in this course).
We have introduced a function A(t) that represents an area of a sector bounded by r(0), r(t) and a trajectory from a planet's position P(0) at time t=0 to its position at any moment of time P(t).
Then the area swept by position vector r(t) during the object's motion from time t1 to t2 equals
ΔA[t1,t2] = A(t2) − A(t1)

Using the above symbols, the Kepler's Second Law can be formulated as
If t2−t1 = t4−t3 then
A(t2)−A(t1) = A(t4)−A(t3)
The above condition is equivalent to a statement that
dA(t)/dt is constant or, equivalently, that A(t) is a linear function of time t with A(0)=0.

In the same lecture we have proven that
dA(t)/dt = ½|L|/m
where L is an angular momentum of a moving object (constant in a central force field) and m is object's mass.

Therefore,

A(t) = t·½|L|/m

Assume, we want to know how much area is swept by position vector r(t) during a period T of a complete round movement of a planet around the Sun.
Obviously, it's
A(T) = T·½|L|/m
At the same time, A(T) is an area of an elliptical orbit of a planet, and we know that the area of an ellipse along which a planet moves equals to
A(T) = π·a·b
where a is semi-major and b is semi-minor axes (see the lecture More on Ellipse in this course).

Therefore,
π·a·b = T·½|L|/m

Let's assume that at time t=0 a planet is at the point furthest from the Sun (aphelion), its initial position vector is r0 and its velocity vector is v0.
At this initial point on an orbit the position vector r0, lying along the major axis of an ellipse, and a tangential to an ellipse vector of velocity v0 are perpendicular to each other.

Therefore, the magnitude of the angular momentum |L| equals the product of the planet's mass, the magnitude of its position vector (that is, its distance from the Sun) and the magnitude of its velocity vector:
|L| = m·|r0|·|v0| or |L|/m = r0·v0

Now the above formula for a period T of a planet's rotation around the Sun is
π·a·b = ½T·r0·v0

In the previous lecture Planet Orbit Geometry we have derived the expressions for major and minor axes of an elliptical orbit of a planet in terms of its initial position and velocity at aphelion:
a = r0 / (2−β)
b = r0·√(β·(2−β)) / (2−β)
where β=r0·v0²/(G·M)

Let's substitute these expressions into the formula connecting the period T with the area of an ellipse:
π·r0²·√(β·(2−β)) / (2−β)² = ½T·r0·v0

Let's square both sides to get rid of the radical:
π²·r0⁴·β / (2−β)³ = ¼T²·r0²·v0²

The rest is just a technicality.
Cancel one r0 on both sides
π²·r0³·β / (2−β)³ = ¼T²·r0·v0²

Replace r0³/(2−β)³ with a³ (since a = r0/(2−β))
π²·a³·β = ¼T²·r0·v0²

Replace β with r0·v0²/(G·M) (by the definition of β)
π²·a³·r0·v0²/(G·M) = ¼T²·r0·v0²

Cancel r0·v0² on both sides
π²·a³/(G·M) = ¼T²

Final result:

T²/a³ = 4π²/(G·M)

The right side is a constant that contains no planet-specific parameters (like initial position and velocity), which means that any planet has the following property (Kepler's Third Law).
The ratio of a square of the period of a planet's rotation around the Sun to a cube of a semi-major axis of its elliptical orbit is constant that depends only on a mass of the Sun.
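
The derivation can be checked numerically: for several different initial conditions at aphelion the ratio T²/a³ should come out the same. The Python sketch below uses the expressions for a and b in terms of β and the relation π·a·b = ½T·r0·v0. The initial distances and speeds are hypothetical (chosen so that β is less than 1), and the constants are rounded reference values.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2 (rounded)
M = 1.989e30    # mass of the Sun, kg (rounded)

def orbit_from_aphelion(r0, v0):
    """Semi-axes a, b and period T from distance r0 and speed v0 at aphelion."""
    beta = r0 * v0 ** 2 / (G * M)
    a = r0 / (2.0 - beta)
    b = r0 * math.sqrt(beta * (2.0 - beta)) / (2.0 - beta)
    T = 2.0 * math.pi * a * b / (r0 * v0)   # from π·a·b = ½T·r0·v0
    return a, b, T

expected = 4.0 * math.pi ** 2 / (G * M)
ratios = []
for r0, v0 in [(1.5e11, 2.0e4), (1.5e11, 2.9e4), (8.0e11, 6.0e3)]:
    a, b, T = orbit_from_aphelion(r0, v0)
    ratios.append(T ** 2 / a ** 3)
    print(r0, v0, T ** 2 / a ** 3, expected)  # the last two columns should agree
```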

Thursday, June 19, 2025

Physics+ Orbit Geometry: UNIZOR.COM - Physics+ 4 All - Laws of Newton

Notes to a video lecture on UNIZOR.COM

Laws of Newton -
Planet Orbit Geometry


Before studying this material we strongly recommend studying the geometrical aspects of an ellipse (for example, in Math+ 4 All => Geometry => Ellipse) and the physical aspects of movement in a gravitational field in the lectures of this course Central Force Field, Planet Orbits, More on Ellipse and Kepler's First Law.

The purpose of this lecture is to determine the geometric characteristics of a planet's orbit based on its initial position and velocity relative to the Sun.

Before addressing these issues let's recall the characteristics of an object of mass m circulating on a circular orbit around a fixed in space central object of mass M.
In particular, we are interested in some relationship between the moving object's linear speed v and a radius r of rotation around a central source of gravitation of mass M in order to stay on a circular orbit.

According to the Newton's Universal Law of Gravitation, the force of gravity F is
F = G·M·m/r²
where G is the gravitational constant.
The Newton's Second Law connects this force to a mass m and centripetal acceleration a of a moving object
F = m·a
Therefore,
G·M·m/r² = m·a
from which follows
a = G·M/r²

We can express the linear speed along an orbit v and centripetal acceleration a in terms of constant radius of uniform rotation r on a circular orbit with constant angular speed of rotation ω using time-dependent Cartesian coordinates {x(t),y(t)} of a moving object as follows.

Position vector:
x(t) = r·cos(ω·t)
y(t) = r·sin(ω·t)

Velocity vector:
x'(t) = −r·ω·sin(ω·t)
y'(t) = r·ω·cos(ω·t)
v = √(x'(t)²+y'(t)²) = r·ω

Acceleration vector:
x"(t) = −r·ω²·cos(ω·t)
y"(t) = −r·ω²·sin(ω·t)
a = √(x"(t)²+y"(t)²) = r·ω² = v²/r

Therefore, returning to the Universal Law of Gravitation,
v²/r = G·M/r²
and we conclude that for a circular rotation of an object in a gravitational field

β = r·v²/(G·M) = 1

In the above formula we have introduced a symbol β, with which we expressed the main condition for an object to stay on a circular orbit.
This symbol will be used in our analysis of an elliptical orbit of an object in a central gravitational field.

Let's switch now to a more complicated case of elliptical orbit.

Kepler's First Law states that all planets move around the Sun along elliptical orbits with the Sun in one of the two focal points of their orbits.
Kepler had come up with this law experimentally based on many years of observations.
In the lecture Kepler's First Law we have proven it based on the Newton's Second Law and the Universal Law of Gravitation.

The geometric properties of a planet's elliptical orbit are completely defined by two parameters: its major axis of the length 2a and eccentricity e, from which we can derive a minor axis of the length 2b and focal distance 2c using equations
c = a·e
b²+c²=a² => b=a·√(1−e²)

Assume, a point-mass M (the Sun) is the source of a gravitational field and is fixed in our space.
Assume further that a point-mass m (a planet) is moving relatively to this point-mass M in its gravitational field with no other forces involved.
Two major characteristics of this motion are the planet's position and velocity.

In the lecture Planet Orbits we have proven that the trajectory of a planet's movement around the Sun lies in a plane that we called the plane of motion.



This plane goes through the source of gravity and contains two vectors - the vector of initial position r0=r(0) from the source of gravity to a moving object at some chosen moment in time t=0 (from the Sun to a planet) and the vector of object's initial velocity v0=v(0) (tangential to a trajectory) at the same time.

In the lecture Central Force Field we have proven that the Angular Momentum vector L=m·r(t)×v(t) of a planet moving around the Sun is independent of time and is a constant vector perpendicular to the plane of motion.

In this lecture we will examine the geometric properties of an elliptical trajectory of a planet moving around the Sun and see how these geometric properties relate to a planet's initial position and velocity relative to the Sun at some chosen moment in time t=0.

We know that the trajectory of a planet is an ellipse lying within a plane of motion with the Sun at one of this ellipse focal points.
Using this, let's choose the coordinate system and the initial moment in time t=0 to simplify the equation of an orbit.

We will use the polar coordinate system lying within the plane of motion.
The pole of the polar system will be at the Sun.
The polar axis will coincide with the major axis of an elliptical orbit and will be directed from the Sun towards the furthest from it point on a major axis (aphelion).

As the initial time t=0 we will choose a moment in time when a planet is at this furthest point from the Sun.
At this point of intersection of an ellipse and its major axis the initial velocity, being tangential to an ellipse, is perpendicular to a major axis and, therefore, perpendicular to a position vector.

Therefore, vectors r(0) and v(0) are perpendicular to each other with their magnitudes, correspondingly, r0 and v0.

So, at time t=0 a planet is at the polar angle θ=0 and distance from a pole (the Sun) r(0)=r0.
The velocity vector is perpendicular to a major axis and its magnitude is v(0)=v0.
Parameters r0 and v0 are given. Based on them, we have to determine the characteristics of an elliptical orbit - its semi-major axis a and the eccentricity e.

As described in the Math+ 4 All - Geometry - Ellipse lecture, the canonical equation of our elliptical orbit in these polar coordinates is
r(θ) = a·(1−e²) / [1−e·cos(θ)]
Setting θ=0 gives the equation
r(0) = r0 = a·(1+e)
It's one equation with two variables a and e. We need more equations to find these characteristics of an elliptical orbit.

Setting θ=π in the above equation of an ellipse (that is, considering the opposite point on an ellipse along the major axis called perihelion) gives
r(π) = a·(1−e)
This adds the second equation for a and e but adds another unknown r(π).

While r(π) and v(π) (the magnitudes of the position and the velocity vectors at the perihelion) are unknown, we can use two Laws of Conservation between points θ=0 and θ=π to determine them - the Law of Conservation of Angular Momentum and the Law of Conservation of Energy.

Recall the Conservation of Angular Momentum Law for an object in a central force field
L = m·r×v

The velocity and position vectors are perpendicular to each other at point θ=0 as well as at point θ=π.
Therefore, the magnitude of the vector product of these two vectors at both points equals to the product of their magnitudes and is the same because of conservation of angular momentum.
L(0) = m·r0·v0 =
= m·r(π)·v(π) = L(π)

From this follows
v(π) = r0·v0/r(π)

Next we will use the conservation of energy - the sum of potential energy of an object in the gravitational field and its kinetic energy is constant.

Potential energy of an object of mass m (a planet) in the gravitational field of an object of mass M (the Sun) located at a distance r from it is
U = −G·M·m/r
Kinetic energy of this object, when its speed is v, is
K = ½m·v²
Full energy is E = K + U

Therefore,
E(0) = ½m·v0² − G·M·m/r0 =
= ½m·v²(π) − G·M·m/r(π) =
= E(π)


From the equation for conservation of angular momentum we get the value of v²(π) in terms of r(π) and initial parameters r0 and v0 as follows
v²(π) = r0²·v0²/r²(π)
and put it in the equation for energy conservation getting an equation to determine r(π)
½m·v0²−G·M·m/r0 =
= ½m·r0²·v0²/r²(π)−G·M·m/r(π)


For brevity, let's temporarily use variable x instead of r(π).
To simplify this equation, let's divide all terms by mass m:
½v0²−G·M/r0 =
= ½r0²·v0²/x²−G·M/x

and multiply by 2x², gathering all terms on the left side of the equation:
x²·(2G·M/r0−v0²) −
−x·(2G·M) + r0²·v0² = 0


Recall that earlier in this lecture we analyzed the circular rotation in a central gravitational field and introduced a symbol β=r·v²/(G·M), which was supposed to be equal to 1 in order for an object to stay on a circular orbit.
Multiplying our equation for x by r0/(G·M) and using this symbol β=r0·v0²/(G·M), the equation looks much simpler:
x²·(2−β) −x·(2r0) + β·r0² = 0

Canonical quadratic equation
P·x² + Q·x + R = 0
where
P = 2−β
Q = −2r0
R = β·r0²
has solutions
x = r(π) = [−Q ± √(Q²−4P·R)]/(2P)
with all components described above known.

The expression under a square root can be simplified.
Q²−4P·R = 4r0²−4(2−β)βr0² =
= 4r0² − 8βr0² +4β²r0² =
= 4r0²·(1−2β+β²) =
= 4r0²·(1−β)²


Therefore,
x = r(π) = [2r0 ± 2r0·(1−β)]/[2(2−β)]
or
r(π) = r0·[1±(1−β)]/(2−β)
If we choose the plus sign in the numerator, the value of r(π) will be equal to r0, which is only possible if our ellipse is a circle with coinciding focal points.
This is a trivial case and we will not consider it here.

In all other cases the solution is
r(π) = r0·β/(2−β) where β=r0·v0²/(G·M), which contains only known values - initial distance from the Sun and speed of a planet at aphelion and other known variables.
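
As a quick numerical sanity check of the algebra above, here is a minimal Python sketch that solves the quadratic equation for x=r(π) and compares its roots with r0 and r0·β/(2−β). It also checks that the non-trivial root conserves energy per unit mass. The values G·M=1, r0=1 and β=0.5 are arbitrary illustrative assumptions, not data from the lecture, and the function name perihelion_roots is hypothetical.

```python
import math

def perihelion_roots(r0, v0, GM):
    """Return β and both roots of x²·(2−β) − x·(2·r0) + β·r0² = 0."""
    beta = r0 * v0**2 / GM
    P, Q, R = 2 - beta, -2 * r0, beta * r0**2
    disc = Q**2 - 4 * P * R                 # algebraically 4·r0²·(1−β)²
    sqrt_disc = math.sqrt(max(disc, 0.0))   # guard against tiny negative round-off
    return beta, (-Q + sqrt_disc) / (2 * P), (-Q - sqrt_disc) / (2 * P)

# Illustrative values only (assumed): G·M = 1, r0 = 1, β = 0.5
GM, r0 = 1.0, 1.0
v0 = math.sqrt(0.5 * GM / r0)
beta, x_plus, x_minus = perihelion_roots(r0, v0, GM)

# Conservation checks at the non-trivial root x_minus = r(π)
v_pi = r0 * v0 / x_minus                  # from conservation of angular momentum
energy_0 = 0.5 * v0**2 - GM / r0          # energy per unit mass at θ=0
energy_pi = 0.5 * v_pi**2 - GM / x_minus  # energy per unit mass at θ=π

print(x_plus, x_minus, r0 * beta / (2 - beta))
print(energy_0, energy_pi)
```

For β<1 the root with the plus sign reproduces r0 and the root with the minus sign reproduces r0·β/(2−β); the sketch assumes β<2, so that the leading coefficient 2−β stays positive.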

Therefore, we have a simple system of two equations with two unknowns a and e:
r(0) = r0 = a·(1+e)
r(π) = a·(1−e)

This system has solutions
a = ½[r0+r(π)]
e = 2r0 / [r0+r(π)] − 1 =
= [r0−r(π)] / [r0+r(π)]
From these we can determine the focal distance
c = a·e = ½[r0−r(π)]
and semi-minor axis
b = √(a²−c²) = √[r0·r(π)]

Putting the obtained expression for r(π) into above formulas we have the geometric properties in terms of initial distance of a planet from the Sun and its linear speed at aphelion:
a = ½[r0+r(π)] =
= ½[r0+r0·β/(2−β)]
which can be transformed into

a = r0/(2−β)

Let's do the same with eccentricity e:
e = [r0−r(π)] / [r0+r(π)] =
= [r0−r0·β/(2−β)] / [r0+r0·β/(2−β)]
which can be transformed into

e = 1 − β

The focal distance:
c = a·e
which can be expressed as

c = r0·(1−β)/(2−β)

The semi-minor axis:
b² = a² − c² =
= [r0² − r0²·(1−β)²] /(2−β)² =
= r0²·β·(2−β) /(2−β)²

Therefore,

b = r0·√[β·(2−β)]/(2−β)

To conclude geometric properties of an elliptical orbit, since we know both semi-axes, we can determine an area of an ellipse A=π·a·b

A = π·r0²·√[β·(2−β)]/(2−β)²
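
The formulas above can be collected into one small routine. The following Python sketch is only an illustration: the function name orbit_geometry and the sample values (G·M=1, r0=1, β=0.5) are assumptions made for this example. It restricts β to 0<β≤1, since the lecture takes θ=0 as the aphelion.

```python
import math

def orbit_geometry(r0, v0, GM):
    """Geometric properties of the orbit from distance r0 and speed v0 at aphelion."""
    beta = r0 * v0**2 / GM
    if not 0 < beta <= 1:
        raise ValueError("this sketch assumes 0 < β ≤ 1 (start at aphelion)")
    a = r0 / (2 - beta)                                   # semi-major axis
    e = 1 - beta                                          # eccentricity
    c = a * e                                             # focal distance
    b = r0 * math.sqrt(beta * (2 - beta)) / (2 - beta)    # semi-minor axis
    return {"beta": beta, "a": a, "e": e, "c": c, "b": b,
            "r_pi": r0 * beta / (2 - beta), "area": math.pi * a * b}

# Illustrative values only (assumed)
geometry = orbit_geometry(r0=1.0, v0=math.sqrt(0.5), GM=1.0)
for name, value in geometry.items():
    print(name, value)
```

A useful consistency check is that a·(1+e) returns r0 and a·(1−e) returns r(π).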

Tuesday, May 27, 2025

Physics+ Motion in Polar Coordinates: UNIZOR.COM - Physics+ 4 App - La...

Notes to a video lecture on UNIZOR.COM

Motion in Polar Coordinates

The subject of this lecture is to describe characteristics of movement (position, velocity and acceleration) in polar coordinates.
This approach will be useful in analyzing the movement of objects in a central force field (such as a gravitational or an electrostatic field).
Using polar coordinates to analyze the movement in a central field leads to simpler derivations of important physical results, like Kepler's Laws.

Consider a model of a space with a fixed position of a point-mass M - the source of a central gravitational field.
Assume a test object of mass m moves in this gravitational field. As was proven in the lecture Planet Orbits of this course, the trajectory of this test object lies within some plane of motion, and the source of gravitation also lies in this same plane.

Let's associate the origin of some Cartesian coordinate system with the fixed position of the source of gravitation O and XY-plane coinciding with the plane of motion of our test object.
So, the coordinates of the source of gravitation are always {0,0,0} and coordinates of a test object always have Z-coordinate equal to zero, so in most cases we will not even specify it, considering we deal with a two-dimensional XY-space of a plane of motion.

Now we also introduce a polar system of coordinates {r,θ} in a plane of motion with the same origin at the source of gravitation O and the base axis coinciding with the OX-axis.

Consider a position vector r from the source of gravitation O to point P where a test object is located.
If a test object is at Cartesian coordinates {x,y}, its position vector can be represented as
r = x·i + y·j
where i and j are unit vectors along X- and Y-axes correspondingly, forming an orthogonal basis on XY-plane.

Assume, our test object is at polar coordinates {r,θ} related to its Cartesian coordinates as
x = r·cos(θ)
y = r·sin(θ)
To express the same position vector r in some orthogonal basis in the polar system of coordinates, let's introduce two unit vectors:
êr along a line from the origin O to position of a test object P, that is collinear to vector r;
êθ along a line perpendicular to êr.
This orthogonal basis is not fixed in space like unit vectors i and j in the Cartesian system of coordinates, but is moving with a test object.

In this new orthogonal basis the same position vector r can be represented as

r = r·êr

where r is the magnitude of vector r - its first coordinate in the polar system {r,θ}, which, in turn, can be expressed in Cartesian system as
r = √(x²+y²)
As you see, polar representation of a position vector as a vector in some orthogonal basis is simpler than its Cartesian representation - a very important factor for analysis of movement in a central gravitational field.

All the coordinates mentioned above, Cartesian {x,y} and polar {r,θ} are functions of time, as our test object is moving in the gravitational field.

As we know, differentiating a position vector r by time gives the velocity vector v=r', where we use a single apostrophe to indicate a derivative by time.
Vector representation of velocity in Cartesian orthogonal basis, as we know, is
v = x'·i + y'·j

We would like to represent the velocity vector in polar coordinates as well, using the orthogonal basis {êr, êθ}.
For this, first of all, we express the basis {êr, êθ} in terms of the basis {i, j}, using the fact that vector êr has unit length and is positioned at angle θ to the OX-axis, while vector êθ has unit length and is positioned at angle θ+π/2 to the OX-axis:
êr = cos(θ)·i + sin(θ)·j
êθ = −sin(θ)·i + cos(θ)·j
where we used trigonometric identities
cos(θ+π/2) = −sin(θ) and
sin(θ+π/2) = cos(θ)

From the above follows an important property of this orthogonal basis
dêr /dθ = êθ
dêθ /dθ = −êr

Furthermore, the time derivatives of these unit vectors are
êr' = (dêr /dθ)·θ' = θ'·êθ
êθ' = (dêθ /dθ)·θ' = −θ'·êr

Since r = r·êr
v = r' = dr/dt =
= d(r·êr)/dt =
= r'·êr+r·êr' =
= r'·êr+r·θ'·êθ

v = r'·êr+r·θ'·êθ
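
To see this formula at work, here is a small Python sketch that compares v = r'·êr+r·θ'·êθ with a finite-difference derivative of the Cartesian position. The functions r(t) and θ(t) below are hypothetical, chosen only for illustration, and the step h is an assumption of this sketch.

```python
import math

# Hypothetical smooth motion, chosen only for illustration
def r(t):
    return 2.0 + 0.5 * math.sin(t)

def r_dot(t):
    return 0.5 * math.cos(t)

def theta(t):
    return 0.3 * t + 0.1 * t**2

def theta_dot(t):
    return 0.3 + 0.2 * t

def velocity_polar_formula(t):
    """v = r'·êr + r·θ'·êθ, returned in Cartesian components."""
    c, s = math.cos(theta(t)), math.sin(theta(t))
    e_r, e_theta = (c, s), (-s, c)
    v_r, v_theta = r_dot(t), r(t) * theta_dot(t)
    return (v_r * e_r[0] + v_theta * e_theta[0],
            v_r * e_r[1] + v_theta * e_theta[1])

def velocity_finite_difference(t, h=1e-5):
    """Central difference of the Cartesian position (x, y)."""
    def position(u):
        return (r(u) * math.cos(theta(u)), r(u) * math.sin(theta(u)))
    (x1, y1), (x2, y2) = position(t - h), position(t + h)
    return ((x2 - x1) / (2 * h), (y2 - y1) / (2 * h))

v_formula = velocity_polar_formula(1.3)
v_numeric = velocity_finite_difference(1.3)
print(v_formula, v_numeric)
```

The two results should agree up to the small error of the finite-difference approximation.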

Let's extend these calculations and get an acceleration vector represented in the same basis of {êr , êθ}.

a = v' = dv/dt =
= r"·êr + r'·êr' +
+ r'·θ'·êθ + r·θ"·êθ + r·θ'·êθ' =
= r"·êr + r'·θ'·êθ +
+ r'·θ'·êθ + r·θ"·êθ − r·θ'²·êr =
= (r"−r·θ'²)·êr +
+ (r·θ"+2r'·θ')·êθ
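
The acceleration components can be checked numerically in the same spirit. The Python sketch below takes the radial component r"−r·θ'² and the angular component r·θ"+2r'·θ', converts them to Cartesian components, and compares the result with a second-order finite difference of the Cartesian position. The motion r(t), θ(t) is hypothetical and the step h is an assumption of this sketch.

```python
import math

# Hypothetical smooth motion with its first and second time derivatives
def r(t):
    return 2.0 + 0.5 * math.sin(t)

def r_dot(t):
    return 0.5 * math.cos(t)

def r_ddot(t):
    return -0.5 * math.sin(t)

def theta(t):
    return 0.3 * t + 0.1 * t**2

def theta_dot(t):
    return 0.3 + 0.2 * t

def theta_ddot(t):
    return 0.2

def acceleration_polar_formula(t):
    """a = (r"−r·θ'²)·êr + (r·θ"+2r'·θ')·êθ, in Cartesian components."""
    c, s = math.cos(theta(t)), math.sin(theta(t))
    a_r = r_ddot(t) - r(t) * theta_dot(t) ** 2
    a_theta = r(t) * theta_ddot(t) + 2 * r_dot(t) * theta_dot(t)
    return (a_r * c - a_theta * s, a_r * s + a_theta * c)

def acceleration_finite_difference(t, h=1e-3):
    """Second central difference of the Cartesian position (x, y)."""
    def position(u):
        return (r(u) * math.cos(theta(u)), r(u) * math.sin(theta(u)))
    (xm, ym), (x0, y0), (xp, yp) = position(t - h), position(t), position(t + h)
    return ((xp - 2 * x0 + xm) / h**2, (yp - 2 * y0 + ym) / h**2)

a_formula = acceleration_polar_formula(1.3)
a_numeric = acceleration_finite_difference(1.3)
print(a_formula, a_numeric)
```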


As we know, according to Newton's Second Law, vector of acceleration is collinear to a vector of force.
According to Newton's Universal Law of Gravitation, vector of gravitational force is collinear to a position vector.
Therefore, vectors a and r are collinear.
Consequently, a and êr are collinear, from which follows that coefficient at êθ must be equal to zero:
r·θ"+2r'·θ' = 0 and
a = (r"−r·θ'²)·êr

We can come up with the same result using the Law of Conservation of Angular Momentum.
Recall that the vector of Angular Momentum of an object in a central force field is constant because a central force has no rotational action, no torque.
It is directed along the Z-axis perpendicularly to a plane of motion.
This vector of Angular Momentum is defined as L=m·r×v.
We can express the constant magnitude of this vector in terms of {r,θ} as follows.
r = {r·cos(θ), r·sin(θ), 0}.
v = {r'·cos(θ)−r·sin(θ)·θ', r'·sin(θ)+r·cos(θ)·θ', 0}.

According to the rules of vector product in three-dimensional Cartesian coordinates, their vector product L/m = r×v is a vector with X- and Y-coordinates equal to zero and its Z-coordinate equal to
r·cos(θ)·[r'·sin(θ)+r·cos(θ)·θ'] −
− r·sin(θ)·[r'·cos(θ)−r·sin(θ)·θ']
The above expression evaluates to
r²·[cos²(θ)+sin²(θ)]·θ' = r²·θ'
Therefore,
|L|/m = L/m = r²·θ'
and is a constant of motion in a central field.
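
The identity L/m = r²·θ' is purely algebraic, so it can be spot-checked for arbitrary values. In the Python sketch below the sample tuples (r, θ, r', θ') are arbitrary assumptions, and the helper names cross_z and areal_check are hypothetical.

```python
import math

def cross_z(r_vec, v_vec):
    """Z-coordinate of the vector product of two vectors in the XY-plane."""
    return r_vec[0] * v_vec[1] - r_vec[1] * v_vec[0]

def areal_check(r, theta, r_dot, theta_dot):
    """Return (Z-coordinate of r×v, r²·θ') for given polar data."""
    position = (r * math.cos(theta), r * math.sin(theta))
    velocity = (r_dot * math.cos(theta) - r * math.sin(theta) * theta_dot,
                r_dot * math.sin(theta) + r * math.cos(theta) * theta_dot)
    return cross_z(position, velocity), r**2 * theta_dot

# Arbitrary sample values of (r, θ, r', θ')
samples = [(1.0, 0.0, 0.2, 0.7), (2.5, 1.1, -0.4, 0.3), (0.8, 4.0, 1.5, -2.0)]
results = [areal_check(*sample) for sample in samples]
for lhs, rhs in results:
    print(lhs, rhs)
```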

Since L/m is a constant of motion, its derivative by time is zero.
Therefore,
d(L/m)/dt = r²·θ" + 2r·r'·θ' = 0
Canceling r as a non-zero multiplier results in
r·θ" + 2r'·θ' = 0

This nullifies the êθ component in the above expression of acceleration in polar coordinates.

Therefore,

a = (r"−r·θ'²)·êr

Just as a check point, if the motion is circular (r is constant, so r"=0) and uniform (θ'=ω is constant), this formula looks like
a = −r·ω²·êr
which is fully in agreement with kinematics and dynamics of a uniform rotation (see lectures in UNIZOR.COM - Physics 4 Teens - Mechanics - Rotational Kinematics and Rotational Dynamics).

Using the established equality L/m=r²·θ', we can substitute L/(m·r²) for θ' in the above expression for the vector of acceleration, getting
a = [r"−r·L²/(m·r²)²]·êr or
a = [r"−L²/(m²·r³)]·êr

Again, using Newton's Laws, this vector of acceleration should be equal to
a = [r"−r·L²/(m·r²)²]·êr =
= −(G·M/r²)·êr
Therefore, we can express it as a differential equation
r"−L²/(m²·r³) = −G·M/r²
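
This differential equation can be explored numerically. The Python sketch below integrates r" = L²/(m²·r³) − G·M/r² with the classical fourth-order Runge-Kutta method, starting at a turning point with r'=0 and L/m = r0·v0, and records the smallest and largest r reached. The integrator, the step size and the sample values (G·M=1, r0=1, β=r0·v0²/(G·M)=0.5) are assumptions of this sketch, not part of the lecture.

```python
import math

def radial_extremes(r0, v0, GM, dt=1e-3, t_end=4.0):
    """Integrate r" = h²/r³ − G·M/r² (h = L/m = r0·v0) by RK4; return (min r, max r)."""
    h2 = (r0 * v0) ** 2

    def rhs(r, r_dot):
        return r_dot, h2 / r**3 - GM / r**2

    r, r_dot = r0, 0.0           # start at a turning point: r' = 0
    r_min = r_max = r
    for _ in range(int(t_end / dt)):
        k1r, k1v = rhs(r, r_dot)
        k2r, k2v = rhs(r + 0.5 * dt * k1r, r_dot + 0.5 * dt * k1v)
        k3r, k3v = rhs(r + 0.5 * dt * k2r, r_dot + 0.5 * dt * k2v)
        k4r, k4v = rhs(r + dt * k3r, r_dot + dt * k3v)
        r += dt * (k1r + 2 * k2r + 2 * k3r + k4r) / 6
        r_dot += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        r_min, r_max = min(r_min, r), max(r_max, r)
    return r_min, r_max

# Illustrative values only (assumed): G·M = 1, r0 = 1, β = 0.5
GM, r0, beta = 1.0, 1.0, 0.5
v0 = math.sqrt(beta * GM / r0)
r_min, r_max = radial_extremes(r0, v0, GM)
print(r_min, r0 * beta / (2 - beta), r_max)
```

With these values the integration time covers a little more than one revolution, so the smallest r found should be close to r0·β/(2−β), the distance at the opposite end of the major axis.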

Monday, May 19, 2025

Physics+ More on Ellipse: UNIZOR.COM - Physics+ 4 All - Laws of Newton

Notes to a video lecture on UNIZOR.COM

Laws of Newton -
More on Ellipse Characteristics


Let's get to more details about properties of an ellipse. It's important for our future discussion of Kepler's Laws described in the next few lectures of this part of a course.

Axes in Polar Coordinates

The equation in polar coordinates (r,θ) with an origin at one of the ellipse's foci and a base axis coinciding with the line between the foci is
r = a·(1−e²)/[1−e·cos(θ)]
where a is half of a major axis,
c is half of a distance between foci,
the ratio e=c/a is called eccentricity of an ellipse and it's always less than 1.

For an ellipse described above, the distance from a focus at the origin of a polar system to the farther end of an ellipse along the base axis should be equal to half of the major axis plus half of a focal distance, that is a+c.
Indeed, if we substitute θ=0 into an equation of an ellipse in polar coordinates, we obtain
r(0) = a·(1−e²)/[1−e·cos(0)] =
= a·(1−e²)/[1−e] =
= a·(1+e) = a + a·c/a = a + c


To reach the opposite end of an ellipse (the shortest distance from an origin) we have to assign θ=π, which should result in r=a−c.
Let's check it by substituting θ=π in our equation of an ellipse.
r(π) = a·(1−e²)/[1−e·cos(π)] =
= a·(1−e²)/[1+e] =
= a·(1−e) = a − a·c/a = a − c


Since
r(0) = a + c and
r(π) = a − c
we can derive the half of the major axis
a = (1/2)·[r(0) + r(π)]
and the half of the focal distance
c = (1/2)·[r(0) − r(π)]

As we know, the half of minor axis b equals
b = √(a²−c²)
Short calculations show that in terms of r(θ) it will be
b = √[r(0)·r(π)]
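
The short calculation is b² = a²−c² = (a+c)·(a−c) = r(0)·r(π). As an illustration, the Python sketch below evaluates the polar equation at θ=0 and θ=π and recovers a, c and b from these two distances. The sample ellipse (a=5, e=0.6) and the function names are assumptions made for this example.

```python
import math

def ellipse_radius(a, e, theta):
    """r(θ) = a·(1−e²)/[1−e·cos(θ)], origin at a focus."""
    return a * (1 - e**2) / (1 - e * math.cos(theta))

def axes_from_extremes(r_far, r_near):
    """Recover a, c, b from r(0) = a + c and r(π) = a − c."""
    a = 0.5 * (r_far + r_near)
    c = 0.5 * (r_far - r_near)
    b = math.sqrt(r_far * r_near)
    return a, c, b

# Illustrative ellipse (assumed): a = 5, e = 0.6, so c = 3 and b = 4
r_far = ellipse_radius(5.0, 0.6, 0.0)
r_near = ellipse_radius(5.0, 0.6, math.pi)
a, c, b = axes_from_extremes(r_far, r_near)
print(r_far, r_near, a, c, b)
```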


Ellipse Area

The equation of an ellipse in Cartesian coordinates (x,y) with X-axis coinciding with the line between the foci and the origin of coordinates being at a midpoint between foci is
x²/a² + y²/b² = 1
Here a is a half of a major axis and b is a half of a minor axis of an ellipse.

If we consider only top half of an ellipse, this equation can be resolved for y to represent it as a function y(x)
y²/b² = 1 − x²/a²
y² = b²·(1−x²/a²)
y² = (b²/a²)·(a²−x²)
y = √[(b²/a²)·(a²−x²)]
y = (b/a)·√(a²−x²)

Let's compare this function with a function describing the top half of a circle of radius a and equation
x² + y² = a²
from which follows
y = √(a²−x²)

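Graphically, the functions describing the top half of an ellipse and the top half of a circle look like this:
[Figure: top halves of the ellipse and of the circle of radius a]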
As you see, for any abscissa x the ordinate of an ellipse is smaller than the ordinate of a circle by the same factor b/a.

If you take a look at any vertical bar from the X-axis up, its height to an intersection with an ellipse is smaller than to an intersection with a circle by a factor b/a.

That means, the area of a portion of that bar below the ellipse is smaller than the area of a bar below a circle by the same factor b/a.

The area of a circle and the area of an ellipse can be composed of an infinite number of such bars of infinitesimal width (integration!), which means that the total area of an ellipse is smaller than the one of a circle by the same factor b/a.

Since the area of a circle of a radius a is πa², the area of an ellipse is
Aellipse = πa²·(b/a) = πab
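
The "bars" argument can be imitated numerically. The Python sketch below adds up the areas of many narrow vertical bars under the top half of an ellipse (a midpoint Riemann sum), doubles the result and compares it with πab. The number of bars and the sample semi-axes a=5, b=3 are assumptions of this sketch.

```python
import math

def ellipse_area_by_bars(a, b, bars=100_000):
    """Approximate the ellipse area by summing thin vertical bars."""
    width = 2 * a / bars
    top_half = 0.0
    for i in range(bars):
        x = -a + (i + 0.5) * width                 # midpoint of the i-th bar
        height = (b / a) * math.sqrt(a * a - x * x)
        top_half += height * width
    return 2 * top_half                            # top and bottom halves

# Illustrative semi-axes (assumed)
area_numeric = ellipse_area_by_bars(5.0, 3.0)
area_formula = math.pi * 5.0 * 3.0
print(area_numeric, area_formula)
```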


Area and a Period

Consider an object moving along an elliptical trajectory.
Let's introduce a system of polar coordinates with an origin at one of the ellipse foci and base axis coinciding with a line between foci.
Let the semi-axes of an elliptical trajectory be a and b.

Let r be a position vector of a moving object - a vector from the ellipse's focus chosen as an origin of polar coordinates to an object's position at any time.
Let θ be an angle between the base axis and vector r. This angle obviously changes with time as an object moves along its trajectory.

An object's movement in this system of coordinates along its elliptical trajectory is described in polar coordinates as r(θ) with angle θ being, in turn, a function of time t.

Assume further that an object moves on an elliptical trajectory with certain periodicity T, that is, it returns to the same position at each interval of time T
θ(t+T)=θ(t) + 2π.

Consider a function A(θ) equal to an area of an ellipse swept by position vector r(θ) from its position at θ=0 to a position at angle θ.

Since an angle θ, in turn, depends on time, area A(θ) can be considered as a function of time A(t) as well.

During the time from t=0 to t=T the angle θ will make a full turn by 2π and vector r will sweep the entire area of the ellipse.
Therefore,
A(T) = A(θ(T)) = πab

If we know the function A(t), we can determine the period of rotation T based on geometrical characteristics of a trajectory.
Actually, when we discuss Kepler's Laws of planetary movements, we will prove that
A(t) = k·t
where k is some constant of motion.
That allows us to calculate the period T:
A(T) = k·T = πab
Therefore,
T = πab/k
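
As an illustration of this formula, the Python sketch below computes T = πab/k for an orbit specified by its distance r0 and speed v0 at the farthest point. It assumes k = ½·|L|/m = ½·r0·v0 (the areal rate from the Kepler's Second Law lecture, evaluated where position and velocity are perpendicular) and the expressions a = r0/(2−β), b = r0·√[β·(2−β)]/(2−β) with β=r0·v0²/(G·M). The comparison value 2π·√[a³/(G·M)] is the well-known Kepler's Third Law, which is not derived in these notes. All numbers are illustrative assumptions.

```python
import math

def period_from_area(a, b, k):
    """T = π·a·b/k, where k is the constant areal rate dA/dt."""
    return math.pi * a * b / k

# Illustrative values only (assumed): G·M = 1, r0 = 1, β = 0.5
GM, r0, beta = 1.0, 1.0, 0.5
v0 = math.sqrt(beta * GM / r0)

a = r0 / (2 - beta)
b = r0 * math.sqrt(beta * (2 - beta)) / (2 - beta)
k = 0.5 * r0 * v0                      # ½·|L|/m at the farthest point

period = period_from_area(a, b, k)
period_third_law = 2 * math.pi * math.sqrt(a**3 / GM)   # for comparison only
print(period, period_third_law)
```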

Friday, May 16, 2025

Physics+ Kepler's Second Law: UNIZOR.COM - Physics+ 4 All - Laws of Newton

Notes to a video lecture on UNIZOR.COM

Laws of Newton -
Kepler's Second Law


This lecture continues studying movement of objects in a central field. The familiarity with material presented in the lectures Central Force Field and Kepler's First Law is essential for understanding this educational material.

Kepler's Second Law states that a segment, connecting our Sun with any planet moving around a Sun, sweeps out equal areas during equal intervals of time.
As in the case of the Kepler's First Law, this Second Law has been based on numerous experiments and years of observation.

This Law can be formulated more mathematically.

Imagine a three-dimensional space with a single source of gravitation - a point-mass M at point O.
Some object comes into this field - a point-mass m. Its position at time t is at point P(t) and its velocity is v(t).

At time t=0 the position of our object is P(0) and velocity vector v(0).

As we know from previous lectures of Laws of Newton part of this course, the trajectory of our object will lie in the plane defined by vectors of initial position OP(0)=r(0) and initial velocity v(0) at time t=0.
Therefore, we can restrict our analysis to a two-dimensional case of trajectory lying completely within the plane defined by r(0) and v(0).

Let's choose a system of polar coordinates in this plane with an origin at point O, where the source of gravity is located, and a base axis defined by direction from point O to a position of our object at time t=0 - point P(0).

If our object during a time interval from t1 to t2 moved from point P(t1) to P(t2), its position vector r(t) swept up a sector bounded by r(t1), r(t2) and a trajectory from P(t1) to P(t2).

Let's introduce a function A(t) that represents an area of a sector bounded by r(0), r(t) and a trajectory from P(0) to P(t).
Then the area swept by position vector r(t) during the object's motion from time t1 to t2 equals
ΔA[t1,t2] = A(t2) − A(t1)

Using the above symbols, the Kepler's Second Law can be formulated as
If t2−t1 = t4−t3 then
A(t2)−A(t1) = A(t4)−A(t3)
The above condition is equivalent to a statement that
dA(t)/dt is constant.

Indeed, let t1 and t3 be any two moments of time and t2=t1+Δt and t4=t3+Δt.
Then t2−t1=Δt and t4−t3=Δt
Therefore,
A(t1+Δt)−A(t1) =
= A(t3+Δt)−A(t3)
Dividing both sides by Δt, we get
[A(t1+Δt)−A(t1)]/Δt =
= [A(t3+Δt)−A(t3)]/Δt
Taking this to the limit, when Δt→0, we get
dA(t)/dt|t=t1 = dA(t)/dt|t=t3
Since values t1 and t3 are chosen freely, it means that the derivative of a function A(t) is constant.
Hence, we conclude, A(t) is a linear function of time t.

In reverse, if we assume that the first derivative of function A(t) is constant and, therefore, A(t) is a linear function of time t, we can easily prove that
if t2−t1 = t4−t3 then
A(t2)−A(t1) = A(t4)−A(t3)

The constant first derivative of function A(t), which represents the area swept by a position vector during the time from t=0 to some value t, is just a mathematical way of stating Kepler's Second Law.
Proving this characteristic of function A(t) is a proof of the Kepler's Second Law.

Let's prove it then.
Assume that an object's position vector during time interval Δt moved from r(t) to r(t+Δt).
These two vectors form a triangle whose area is approximately equal to the area A(t+Δt)−A(t) of a sector swept by vector r(t) during the time Δt. The approximation gets better as the interval of time Δt tends to zero.

The third side of this triangle is a vector connecting the end points of position vectors.
This vector is approximately the velocity vector at time t multiplied by time interval Δt:
r(t+Δt)−r(t) ≈ v(t)·Δt
The area of a triangle formed by two vectors a and b equals half of the magnitude of the vector product of these vectors because
(i) the area of a triangle with two sides a and b with an angle φ between them is equal to
½·a·ha = ½·a·b·sin(φ)
where ha is the altitude onto side a;
(ii) from the definition of a vector product
|a×b| = |a|·|b|·sin(φ)

Using this, we can say that the area of a triangle formed by r(t), r(t+Δt) and v(t)·Δt equals ½|r(t)×v(t)|·Δt.

Recall that Angular Momentum of an object of mass m moving in some field, having a position vector r(t) and velocity v(t), is defined as
L(t) = m·r(t)×v(t).
But if the field is central, the Angular Momentum is a constant because a central force has no torque.
Therefore,
|r(t)×v(t)| = |L|/m is a constant.

An immediate consequence of this is that the area of a triangle formed by r(t), r(t+Δt) and v(t)·Δt is ½(|L|/m)·Δt.

When Δt→0, the area of our infinitesimal triangle tends to the area of a sector swept by position vector r(t) during infinitesimal time interval Δt, that is, to ΔA(t)=A(t+Δt)−A(t).

Therefore,
lim(Δt→0) ΔA(t)/Δt = ½|L|/m
which is a constant.
The limit above is a derivative of A(t) by time. Since it is a constant, A(t) is a linear function of time.
That proves the Kepler's Second Law.
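
The law can also be observed in a simple simulation. The Python sketch below moves a point-mass in the field of a central mass using the velocity Verlet integration scheme (a choice made for this sketch, not something taken from the lecture), adds up the areas of the small triangles formed by consecutive position vectors, and reports the area swept during several consecutive equal time intervals. The values G·M=1, r0=1, v0=√0.5 and the step sizes are illustrative assumptions.

```python
import math

def swept_areas(r0, v0, GM, dt=1e-3, steps_per_interval=500, intervals=6):
    """Areas swept by the position vector in consecutive equal time intervals."""
    x, y = r0, 0.0            # start on the base axis
    vx, vy = 0.0, v0          # velocity perpendicular to the position vector

    def accel(px, py):
        d3 = (px * px + py * py) ** 1.5
        return -GM * px / d3, -GM * py / d3

    ax, ay = accel(x, y)
    areas = []
    for _ in range(intervals):
        area = 0.0
        for _ in range(steps_per_interval):
            # velocity Verlet step
            nx = x + vx * dt + 0.5 * ax * dt * dt
            ny = y + vy * dt + 0.5 * ay * dt * dt
            nax, nay = accel(nx, ny)
            vx += 0.5 * (ax + nax) * dt
            vy += 0.5 * (ay + nay) * dt
            # triangle between consecutive position vectors: ½·|r(t)×r(t+Δt)|
            area += 0.5 * abs(x * ny - y * nx)
            x, y, ax, ay = nx, ny, nax, nay
        areas.append(area)
    return areas

# Illustrative values only (assumed)
r0, v0, GM = 1.0, math.sqrt(0.5), 1.0
areas = swept_areas(r0, v0, GM)
expected = 0.5 * r0 * v0 * (1e-3 * 500)    # ½·(|L|/m)·Δt for each interval
print(areas, expected)
```

Each interval includes a different part of the orbit, with different distances and speeds, yet the swept areas should come out equal to ½·(|L|/m)·Δt up to rounding.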