The rotation problem and Hamilton's discovery of quaternions I | Famous Math Problems 13a

Channel: Insights into Mathematics. Published: 2013-05-17.

Advanced Mathematics & Geometric Physics

Transcript

Hello everyone, I'm Norman Wildberger. In today's famous math problem, we're going to look at the famous Irish mathematician William Rowan Hamilton and his discovery of quaternions in the context of what I call the rotation problem, which is the problem of how do we describe, understand, and manipulate rotations in three-dimensional

space algebraically. The motivation for this problem comes from the intimate connection between complex numbers and rotations in the plane. Complex numbers are numbers of the form a + bi, where i is a kind of imaginary number satisfying i squared equals minus 1.

Around Hamilton's time, it was well appreciated that these complex numbers somehow had an intimate connection with the geometry of the plane and, in particular, allowed an efficient computational calculus for rotations. So the problem that Hamilton posed to himself was: what algebraic structure plays an analogous role for rotations in space? We have rotations in

the plane, and we know we can rely on complex numbers to help us with the algebra, but what about rotations in three-dimensional space, which are considerably more subtle and difficult to manipulate? Well, for a long time, Hamilton thought that the answer lay in extending the idea of complex numbers from a two-dimensional algebra to a three-dimensional algebra.

In other words, he was thinking about vectors of the form t + ai + bj (forget the ck for a moment) and trying to find a way of introducing an algebraic structure on these three-dimensional vectors. He sought a way of multiplying them that allowed one to sort of capture the rotational structure of three-dimensional space.

Then, on some fateful day in 1843, while walking with his wife along a canal in Dublin, he was crossing a bridge when this inspiration came to him. He realized that the solution was to consider not a three-dimensional space of vectors, but rather a four-dimensional space of vectors,

and that the crucial properties satisfied by these new numbers i, j, and k could be captured by these equations: i squared was equal to minus 1, as was j squared and k squared (that's much like the complex number situation). In addition, there's a relationship between i, j, and k which we can write as i times j times k equals minus 1. He realized this in a flash of insight and,

in a small piece of mathematical vandalism, carved this equation into the stonework of the bridge. That's a famous mathematical anecdote—a bit of high drama in the world of pure mathematics. William Hamilton was no doubt Ireland's most

famous mathematician, a very brilliant fellow who made contributions to physics, optics, and other areas of algebra as well. In addition to his discovery of quaternions, he also built up the theory of quaternions into a very powerful and broad tool for doing physics. However, this came

into conflict with another approach to vectors, and ultimately, quaternions lost out to our current approach using dot products and cross products. But as we will see, they are actually closely connected to quaternions as well. Hamilton was also famous for an

important insight into classical mechanics. He took the framework that Lagrange established for understanding Newton's laws and twisted it in an important new direction to give the so-called Hamiltonian formulation of classical mechanics. This not only brought into being a subject called symplectic geometry but also was a very important contributor to the

20th-century development of quantum mechanics. We're going to describe this problem, and it's going to take us probably three lectures to do that. So in today's lecture, I want to start by setting the framework, so to speak, by making very clear this connection

between complex numbers and rotations in the plane. It turns out that if you look at that subject in the right way—in other words, in a rational way—then that makes it much easier to understand what happens with the story of quaternions and rotations in three-dimensional space.

That's what we're going to do today, mostly, and then in the next lecture, we're going to talk about rotations and how we think about rotations in three-dimensional space, and then we'll get to the actual quaternion algebra in four dimensions that Hamilton introduced. So we'd have to go up to four dimensions and understand a little bit of the geometry of four dimensions to get at these quaternions.

Now, this is a subject that is quite important these days because working with rotations is something that we do a lot in industrial work, in graphics, in computer programming, for example in video game construction. So there are lots of situations where we want to manipulate rotations effectively, and it turns out that

quaternions are probably still the best and most efficient way of doing that. It's a lovely subject that undergraduates can learn about, so I'm going to give you an introduction to Hamilton's quaternions in these next three lectures. I'm going to start off with this much simpler

situation: complex numbers and rotations of the plane. There will be a lot of material in this lecture, none of it too sophisticated, but some of it a bit novel because I have a rational point of view towards things, and it turns out this rational point of view is a very good way of understanding things and makes a lot of higher geometry much simpler.

So even if you're very familiar with complex numbers, you are going to learn some important things in today's lecture. If you haven't seen complex numbers before, well, this is probably something of an introduction.

You can also have a look at the WildTrig 15 video I made quite a few years ago now, which also has some information on complex numbers. You might want to watch that before you have a look at this, so don't hesitate to stop the video, go back, and make sure you understand everything I'm saying here.

Alright, we're starting with complex numbers, but I adopt a rational point of view because I don't believe in irrational numbers. It's not a religious position; it's just that I haven't seen any, and no one has actually shown me a proper irrational number, so why should I believe in them? It's a very good position that I encourage

you to adopt as well. You might think that it diminishes one's mathematics, but it turns out that it does exactly the opposite—it strengthens one's mathematics because one can then look carefully and clearly and logically at many things which previously required waffling. So, a complex number for us is a pair (a, b) of rational numbers, and this is what I

use to denote the type of rational numbers. I prefer not to think in terms of infinite sets. Pictorially, we have an XY plane, and the pair (a, b) can be thought of as this point here with x-coordinate a and y-coordinate b, or can also be thought of as the vector from the origin to this point. Sometimes it's best to think in terms of a point, and sometimes it's

better to think in terms of a vector, but in fact, logically speaking, the complex number is neither of those—it's just the pair of rational numbers. Complex numbers support operations—actually, they support all four operations of addition, multiplication, subtraction, and division—but here are the two main ones: addition and multiplication.

Addition is pointwise, corresponding to the usual vector sum of vectors. The multiplication is where all the interest lies. So the formula is that (a, b) times (c, d) is, by definition, a times c minus b times d—that's the first entry—and then the second entry is a times d plus b times c. So we make the definition—that's

just how we're going to define multiplication of these ordered pairs of rational numbers—and it turns out that's a very good choice because it has lovely properties. Namely, these operations satisfy the following familiar laws of arithmetic: first of all, the two operations are commutative: z plus w equals w plus z, and z times w equals w times z.

Please check that. We have associativity: that if you have three complex numbers, it doesn't matter in what order we pair them. So z plus (w plus u) is the same as (z plus w) plus u, and more importantly for multiplication as well: z

times (w times u) is the same as (z times w) times u. Then there's a distributive law: z times (w plus u) equals zw plus zu. And here I'm allowing myself the shortcut zw means z times w. These laws

are relatively straightforward to verify, except perhaps for this one, which is the most interesting one: the associativity of the multiplication. That turns out to be somewhat non-trivial; you actually have to make a calculation for that, and I urge you, if you haven't done this in your mathematical career yet, to please make this calculation. Check that this is true.

Then you might like to think about what happens if we modify this definition a little bit. Suppose we decide to make a new multiplication by changing that minus sign to a plus sign, or maybe that plus sign into a minus sign, or maybe sticking a factor of 2 in front of the ad, or some other variant like that.
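The checks suggested above can be spot-tested by machine. What follows is a minimal sketch, not part of the lecture: it assumes Python's `fractions.Fraction` as a stand-in for the rational numbers, and the names `add`, `mul`, `mul_variant`, and `is_associative` are made up for illustration. The variant shown is the one with a factor of 2 in front of ad. A check on a handful of sample values is only a sanity test, not a proof.

```python
from fractions import Fraction as F
from itertools import product

def add(z, w):
    """Entrywise sum of two pairs."""
    return (z[0] + w[0], z[1] + w[1])

def mul(z, w):
    """The lecture's rule: (a, b)(c, d) = (ac - bd, ad + bc)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

def mul_variant(z, w):
    """Hypothetical variant: a factor of 2 in front of ad."""
    a, b = z
    c, d = w
    return (a * c - b * d, 2 * a * d + b * c)

def is_associative(op, samples):
    """True if op(z, op(w, u)) == op(op(z, w), u) on every sample triple."""
    return all(op(z, op(w, u)) == op(op(z, w), u)
               for z, w, u in product(samples, repeat=3))

# A few sample pairs of rationals (arbitrary choices).
samples = [(F(1), F(1)), (F(1), F(2)), (F(2), F(1)), (F(-3, 4), F(5, 7))]

# Commutativity and the distributive law for the lecture's rule.
laws_hold = all(
    mul(z, w) == mul(w, z)
    and mul(z, add(w, u)) == add(mul(z, w), mul(z, u))
    for z, w, u in product(samples, repeat=3)
)

print(laws_hold)
print(is_associative(mul, samples))
print(is_associative(mul_variant, samples))
```

On these samples the original rule passes every check, while the variant already fails associativity on the triple (1, 1), (1, 2), (2, 1).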

You will find that then this multiplication property here tends to not be satisfied, so you will get an appreciation for the beauty of this particular definition. It just happens to work for that associative property. We'll also define a few special complex

numbers. So, zero is just the pair (0, 0), and one is the pair (1, 0). You can check that this number, this one, has the property that one times z is z times 1, which is z.

So it's the multiplicative identity, and this zero has the property that zero plus z equals z plus zero equals z. Okay, so we could also define subtraction and division, but I'm not going to do that. This is the essence of the algebraic structure of the

complex numbers, and it's an example of what is called an algebra in mathematics—a little bit of an unfortunate name because algebra has these different meanings, but anyway, that's the complex numbers. I've now repeated the multiplication here so we can refer to it, and I want to point out that there are some other sort of special complex numbers that play distinguished roles.

First of all, the numbers of the form (a, 0), where the second coefficient is 0—those numbers correspond to points on the x-axis, and their arithmetic is particularly simple. You can check that, additively, (a, 0) plus (b, 0) is just (a + b, 0), and (a, 0) times (b, 0), multiplicatively, we

get the product (ab, 0). So that if we restrict our attention to this real axis, then the complex numbers there act just like the usual rational numbers. So we can think of the ordinary rational numbers as being embedded in our picture in terms of the points on the x-axis.

Now, multiplication by such a real complex number—so often these ones are called the real complex numbers, which is possibly a source of confusion because I don't believe in real numbers in the usual sense—but that shouldn't prevent me from being able to use the adjective "real," and so here we're just going to say that any complex number whose second coordinate is 0, we're going to call that a real

complex number, even though in fact the entry a is actually a rational number. So if we multiply by such a real complex number (a, 0), then we see that it acts by scaling by a. So let's check: if we multiply (a, 0) by (c, d), the rule is we take a times c minus 0 times d, so that's just a

times c, and then we get a times d plus 0 times c, so that's just a times d. So what has happened is that each of these entries has gotten multiplied by a, and a is just an ordinary rational number. Let's agree that we use a sort of vector space terminology here and notation and agree that we can pull that a out front if it's common to both terms, and so we can rewrite this as a times

(c, d). So here the a is just a rational number; we're just talking about scaling the vector by multiplying by a. So that's what happens when we multiply any complex number by (a, 0). For

example, if we multiply that complex number by (a, 0), a here is somewhere between 1 and 2, say, then it would mean that this vector would enlarge by a factor of a, so it would become roughly twice as long. So the product would be up there somewhere. On the other hand, it's also interesting

to consider complex numbers that lie on the y-axis—they also have a distinguished role, and let's put them in. So these are sometimes called—well, they're called imaginary complex numbers. Imaginary complex numbers.

Alright, the ones that are lying on the y-axis and have the form (0, a). So what happens if we multiply by (0, a)? Have a look: here's our basic law for multiplication. So we get 0 times c minus a times d for a total of minus a times d, and the

second entry is 0 times d plus a times c, or ac. If we pull out the common a, as we did before, then this is a times (-d, c). So what's the relationship between (-d, c) and (c, d)? Well, if here is (c, d), then here is (-d, c). It's really the same vector except that it's been rotated by

a quarter turn. So, a quarter of the way all the way around: that's a right angle there; these two vectors are perpendicular. That's 90 degrees if we're measuring an angle, or a spread of 1 if we're measuring with rational trigonometry. So the effect of (a, 0) is just dilation, but multiplying

by (0, a) dilates and rotates by a quarter turn. So we're already seeing some geometry, some geometrical transformation associated with the algebraic structure. So the complex number which is on the y-axis and has coordinate (0, 1) has a special role in the subject, and it's

usually given a special name. So that's the complex number i. It's interesting because when we square it, let's see what happens. i squared is equal to (0, 1) times (0, 1).

We're going to get 0 times 0 minus 1 times 1, so that's minus 1, and we're going to get 0 times 1 plus 1 times 0, that's 0. So we're getting this number (-1, 0), which is really like the number, the rational number, minus 1, because we've said that the numbers ending in 0 are really acting like

the rational numbers. So we can agree that we're going to call this number just minus 1. In that case, i squared is equal to minus 1. So we have this algebraic system now where we have this new number called i, which has this remarkable property that no rational number does, namely its square is minus 1.
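The three facts just discussed (scaling by (a, 0), the quarter turn from (0, a), and i squared equals minus 1) can be checked on a sample. This is a small sketch under the assumption that `fractions.Fraction` models the rationals; the sample values and the helper name `mul` are illustrative choices only.

```python
from fractions import Fraction as F

def mul(z, w):
    """(a, b)(c, d) = (ac - bd, ad + bc)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

z = (F(3), F(5))      # a sample (c, d)
a = F(3, 2)           # a sample rational scale factor

scaled = mul((a, F(0)), z)    # expect a * (c, d)
turned = mul((F(0), a), z)    # expect a * (-d, c)

i = (F(0), F(1))
i_squared = mul(i, i)         # expect (-1, 0)

# (c, d) and (-d, c) are perpendicular: their dot product is zero.
quarter = mul(i, z)
dot = z[0] * quarter[0] + z[1] * quarter[1]

print(scaled == (a * z[0], a * z[1]))
print(turned == (-a * z[1], a * z[0]))
print(i_squared == (F(-1), F(0)))
print(dot == 0)
```

All four comparisons come out true for these values.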

Now, we can see that geometrically from what we were just saying in terms of rotation. If we multiply by i, well, then we just rotate by 90 degrees or a quarter turn. And so if we do that twice—if we multiply by i

squared—we're taking every vector and changing it to its negative; we're rotating essentially by 180 degrees or a half turn, which is essentially to negate any vector. It's a geometrical interpretation of this equation. Now we're going to use this particular complex

number to simplify or give an alternate form for complex numbers. So instead of writing the pair (a, b), we'll think of this as being a times (1, 0) plus b times (0, 1). And this (1, 0) is really the number 1, which we don't need to write, and this (0, 1) is this new number i, so we

can write this expression as a + bi. That's an alternate form for complex numbers, which is the one that we're going to probably use in practice most. With this notation, this particular equation is really the only thing that we have to remember when we're multiplying.

We can almost forget about that original rule. As illustrated here, suppose we want to multiply (3 + 5i) with (-2 + i). Then if we're just going to do this using the distributive law, we get 3 times -2 is

-6, 3 times i is +3i, 5i times -2 is -10i, and 5i times i is 5i squared. Well, i squared is equal to minus 1, so we should replace this thing with a minus 1 to give us -6 - 5, and then the i's combine as 3i - 10i, giving us altogether -11 - 7i. You can check

that this is the same thing that you would get if you did the original multiplication in terms of this times this minus this times this, and this times this plus this times this. It's just that the advantage is that we don't actually have to remember the formula for multiplication—we only have to remember this very simple law and use natural distributivity. Alright, and then numbers like this (0, a) can be just written as a times i, so these are the

imaginary complex numbers that lie on the y-axis. Alright, now let's introduce a little bit more terminology and then the main theorem, the most important fact about complex numbers. So if z is equal to (a, b), or in our new notation a + bi, then let's give

a name to these numbers a and b. Let's call a the real part of z and denote it by Re(z), and let's call the rational number b the imaginary part of z and denote it by Im(z). Notice these are both rational numbers. Let's define the complex conjugate.

So if we take the complex number z and we put a little bar over it, that means complex conjugate, and what you're going to do is negate the second coefficient. In terms of a + bi, that changes to a - bi. So that's the complex conjugate of z. That's an important idea in complex numbers, and we're

going to use that to define the quadrance of the complex number. So the quadrance of the complex number z is Q(z)—it's, by definition, z times z bar. It's what you get when you multiply this times this.

Now, that's a difference of squares. If you multiply a + bi times a - bi, you get a squared - (bi) squared, which, i squared being minus 1, amounts to a squared + b squared. So what we're getting is the sum of the squares of the coefficients of the complex number.
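As a quick illustration of the definition, here is a sketch that computes z times z bar directly and compares it with a squared + b squared. It assumes `fractions.Fraction` for the rationals; `conj` and `quadrance` are names chosen here, not notation from the lecture.

```python
from fractions import Fraction as F

def mul(z, w):
    """(a, b)(c, d) = (ac - bd, ad + bc)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

def conj(z):
    """Complex conjugate: negate the second coefficient."""
    return (z[0], -z[1])

def quadrance(z):
    """Sum of the squares of the two coefficients."""
    return z[0] ** 2 + z[1] ** 2

z = (F(3), F(5))
product = mul(z, conj(z))

# z times z bar is a "real" complex number whose first entry is a^2 + b^2.
print(product == (quadrance(z), 0))
print(quadrance(z) == 34)
```

For z = 3 + 5i the product is the pair (34, 0), so the quadrance is the rational number 34 and no square root is needed.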

If you think in terms of a diagram and Pythagoras's theorem, this represents the area of the square that Euclid would have built on the segment from zero to the point z. It's the quantity that appears as the hypotenuse area in Pythagoras's theorem.

Now, an important point: in ordinary texts, one then uses this—in fact, one doesn't give this thing a name—one goes directly to what's considered more fundamental, namely the length of the complex number z by taking a square root of this. We do not want to do that, okay? We do not want to use square

roots if we can avoid it. Square roots properly take us outside of pure mathematics because the square root is actually a very subtle, problematic construction, as evidenced by the fact that if I ask you what the square root of 13 is, you can't give me a precise answer. Your

calculator will spit out a certain number of digits, but you don't actually have a number whose square is 13. In fact, the existence of square root of 13, as I argue in my Math Foundations series, is highly suspect. So we're going to avoid that.

We're not going to mention the length of vectors or moduli of complex numbers. This is a big step up conceptually, actually, okay? So not using that means that we have to make everything algebraic—we're forced essentially in the right direction in terms of our thinking.

So I know this is a little bit novel to many of you, but believe me, there's much to be said for it. In particular, we have this main theorem, which looms as an absolute pillar of the subject, which is that if you have two complex numbers z and w, then the quadrance of the product z times w is the product of quadrances: Q(z times w) = Q(z)

times Q(w). So the quadrance of a product is the product of the quadrances. Ha! So let's prove it. Let's say that z is equal to a + bi and w equals c + di.

Then the left-hand side—what does it look like? Well, we have to multiply z and w, and then there'll be two coefficients, and we have to take the first coefficient squared plus the second coefficient squared—that's what the quadrance of z times w is. So here, I hope you recognize this as the real part of the product:

ac - bd, and the imaginary part of the product is ad + bc. So we're going to take this squared plus this squared—that's the left-hand side. The right-hand side is the quadrance of z—that's a squared + b squared—times the quadrance of w, which is c squared + d squared.

So the assertion is that this equals this for any rational numbers a, b, c, and d, and this is an identity of Fibonacci, Leonardo of Pisa, and in fact possibly goes back to Diophantus, 300 years after Christ. Let's check why it's true.

We square this—we're going to get ac squared, which is this term here, plus bd squared, which is this term, minus twice the product. Let's forget about minus twice the product temporarily. When we square this thing, we get ad squared, which is this times this, plus bc squared, this times this, plus twice the product.

Alright, so minus twice the product over here is minus 2 times a times c times b times d, and plus twice the product here is 2 times a times d times b times c. Those two terms are conveniently exactly the same with opposite signs, so they cancel, and equality is obvious.
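Written out in symbols, the calculation just described reads as follows. This is a transcription of the spoken argument into LaTeX, with z = a + bi and w = c + di as above.

```latex
\begin{align*}
Q(zw) &= (ac - bd)^2 + (ad + bc)^2 \\
      &= a^2c^2 - 2abcd + b^2d^2 + a^2d^2 + 2abcd + b^2c^2 \\
      &= a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2 \\
      &= (a^2 + b^2)(c^2 + d^2) \\
      &= Q(z)\,Q(w).
\end{align*}
```

The earlier worked product gives a numerical instance: with z = 3 + 5i and w = -2 + i we have zw = -11 - 7i, and indeed 121 + 49 = 170 = 34 times 5.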

So that proves this basic result, a very important, crucial identity that's somehow at the heart of the beauty and usefulness of complex numbers in terms of rotations. So while quadrance is a rational analogue of length or distance, what is a rational analogue of angle? Traditional treatments of complex

numbers rely heavily on the notion of length and angle for polar coordinates for complex numbers. Maybe it's inconceivable to you that you could study the subject without those two concepts if you're very familiar with complex numbers, but in fact, you can. There are rational analogues of angle, and there are a number of different possibilities.

One of them is the spread from rational trigonometry, or the turn, or the half-turn. The turn is also described in my book; the half-turn I'm going to talk about in the Math Foundations or the WildTrig series. It's something else that's also very interesting. Today, I'm going to tell you about the turn, which

is a very natural idea that's closely connected with the geometry of lines in the xy-plane. So here is a number z = a + bi, x-coordinate a, y-coordinate b, and here's a vector joining the origin to z. We're going to define the turn of z to be the slope of this line.

The slope is, by definition, the change in the y-coordinate divided by the change in the x-coordinate, so the y-coordinate is b, the x-coordinate is a, and that's how we're going to define the turn of this complex number. So geometrically, we're going to relate it to a slope of a line.

Now more generally, what we really want to do is to define the turn between two vectors. So this would be the turn between the vector, say, (1, 0) and z, but more generally, if we have z1 = a1 + b1i and z2 = a2 + b2i, then it turns out that this is the formula for the turn, which is a generalization of this slope of line. What is

it? So the turn from z1 to z2 is, by definition, a1b2 - a2b1 in the numerator, and in the denominator a1a2 + b1b2. Now these two expressions look a lot like the coefficients of the products of z1 and z2, and if you think about it actually for a little while, you'll see that these are the complementary expressions that are linear in the various variables—complementary to the two

expressions that are used in the product. This you may also recognize as a determinant; essentially, it's the area of a parallelogram formed by the two vectors, and this here you recognize as a dot product between the two vectors. Now this particular expression is very nice. That's

the definition of the turn from z1 to z2, and in the special case when the first vector is, say, (1, 0), then you can check that this thing reduces to this one here. There's one more thing to be said: there is a slight problem with this definition if a is 0. In other words, if the

complex number is perpendicular to the x-axis or 90 degrees, then the turn is either undefined or infinite. And that will also happen here in this more general case—if the two vectors are perpendicular, say z1 and z2 are perpendicular, then the turn will be undefined. By the way,

here is the notation that I like to use. So it's given by this little straight line with a little arrow, and there's the "u." The arrow here denotes that the object has an orientation: it depends on which complex number is first and which one is second. You can check that if you change the order, then the turn negates.

Alright, so this is how we're going to express the turn from z1 to z2. It's a number associated with the two complex numbers that measures somehow how far apart they are—it's a replacement for angle, but it's a rational replacement—no transcendental functions or definitions are required.
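The definition of the turn and the remarks around it can be tried out with a short function. This is a sketch only: it assumes `fractions.Fraction` for the rationals, and returning `None` in the perpendicular case is a convention chosen here for illustration, not something the lecture specifies.

```python
from fractions import Fraction as F

def turn(z1, z2):
    """Turn from z1 = (a1, b1) to z2 = (a2, b2):
    (a1*b2 - a2*b1) / (a1*a2 + b1*b2).
    Returns None when the denominator is zero (perpendicular vectors)."""
    a1, b1 = z1
    a2, b2 = z2
    dot = a1 * a2 + b1 * b2
    if dot == 0:
        return None
    return F(a1 * b2 - a2 * b1) / dot

one = (F(1), F(0))
z = (F(4), F(3))
w = (F(1), F(2))

# Special case: the turn from (1, 0) to z = a + bi is the slope b/a.
print(turn(one, z) == F(3, 4))

# Swapping the order negates the turn.
print(turn(z, w) == -turn(w, z))

# Perpendicular vectors: the turn is undefined.
print(turn(one, (F(0), F(1))) is None)
```

With these sample values the turn from z to w is 1/2 and the turn from w to z is -1/2.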

And here's the main theorem for turns: the theorem that asserts that if we have three complex numbers z, w, and v, with v not equal to 0, then the turn from z to w is the same as the turn from zv to wv. In other words, when we multiply both z and w by the same complex number, we get two new complex numbers zv and wv.

The turn between them is the same as the turn between the original z and w. So this is a rational analog to the statement that when we multiply by a complex number, the angles between two complex numbers don't change. Alright, so what's the proof? Let's say that z

is a + bi, w is c + di, and let's suppose that v is x + yi. So let's calculate this thing here. Alright, so we have to calculate z times v and w times v, and then we have to apply the formula for the turn on the previous page. Alright, so z times v is going to be—well, there's going to

be an ax - by—that's going to be the first term, and then there's going to be ay + bx. And w times v is going to have terms cx - dy and the next one cy + dx. Then you can check that this numerator here is the determinant formed by those two vectors that we just mentioned.

So the first coefficient of zv times the second coefficient of this one, minus the first coefficient of this one times the second coefficient of that one. And here in the bottom is the dot product, or inner product, of those two vectors—the product of the two first coefficients plus the products of the two second coefficients.

And now you have to expand this, alright? So please do this—it's a good algebraic exercise. Expand the numerator, expand the denominator, and stare at them both individually, and convince yourself that they both factor. The numerator factors

as x squared + y squared times (ad - bc), and the denominator factors as x squared + y squared times (ac + bd). And since we're assuming this v is non-zero, the x squared + y squared has to be non-zero—we're talking about rational numbers here—and so conveniently, these two terms cancel, and we're left with the turn between z and w.
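The two factorizations and the resulting invariance can be spot-checked numerically before doing the algebra by hand. The sketch below assumes `fractions.Fraction` for the rationals; the helper names `cross` and `dot` and the sample values are chosen here for illustration. It checks one instance and is not a substitute for the expansion.

```python
from fractions import Fraction as F

def mul(z, w):
    """(a, b)(c, d) = (ac - bd, ad + bc)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

def cross(z1, z2):
    """Numerator of the turn: a1*b2 - a2*b1."""
    return z1[0] * z2[1] - z2[0] * z1[1]

def dot(z1, z2):
    """Denominator of the turn: a1*a2 + b1*b2."""
    return z1[0] * z2[0] + z1[1] * z2[1]

def turn(z1, z2):
    """Turn from z1 to z2; assumes the vectors are not perpendicular."""
    return F(cross(z1, z2)) / dot(z1, z2)

z, w, v = (F(4), F(3)), (F(1), F(2)), (F(2), F(-5))
zv, wv = mul(z, v), mul(w, v)
qv = dot(v, v)    # x^2 + y^2, the quadrance of v

numerator_factors = cross(zv, wv) == qv * cross(z, w)
denominator_factors = dot(zv, wv) == qv * dot(z, w)
turn_is_invariant = turn(zv, wv) == turn(z, w)

print(numerator_factors, denominator_factors, turn_is_invariant)
```

Here the quadrance of v is 29, the numerator is 145 = 29 times 5, the denominator is 290 = 29 times 10, and both turns equal 1/2.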

It's a lovely calculation that replaces the usual fumbling around with cosines and sines and formulas for cos(a + b), and so on. Okay, this is without any transcendental notions. That's a big step up, in fact, so

it's something well worth thinking about. So that's the invariance of turn under multiplication. Mathematics is a very conservative subject, especially pure mathematics, and many mathematicians have said to me, "Yes, Norman, that's all very good, a rational approach, probably very interesting, but what about just the additivity of angles? With angles, we can add

them—an angle of three degrees plus an angle of four degrees gives you an angle of seven degrees." Well, that's true, but it's a very heavy price to pay—the machinery and the transcendental business that you have to put into the subject in order to try to get at that linearity. It's an attempt to force linearity on the circular structure, which doesn't really want to go from a theoretical point

of view. This exercise here gives you a rational analog to this additive structure. Suppose that you have three complex numbers z1, z2, z3, and we measure the turns between all three pairs. So let's say u1 is the turn between z2 and z3, u2 is the turn between z3 and z1, and u3 is the turn

between z1 and z2. Remember, the order matters, so I'm kind of doing this in a cyclical order so it's all symmetrical. Okay, so a great exercise: prove then that these three turns satisfy the following pleasant relation: u1 + u2 + u3 = u1u2u3. And please do it without any reference to

tangents of angles, okay? This is a purely rational result—it deserves a purely rational proof. Notice that it means, in particular, that if you know two of those turns, then you have a linear equation for the third one. So while it's not quite as simple as just adding the

two turns, it's not that much more complicated. Alright, now let's have a look at a very interesting circle of ideas. I want to connect the unit circle in the complex plane, which plays a very important role, to the projective line of lines through the origin, to the idea of rotations

of the plane centered at zero. So these three subjects are all intimately connected. I'm going to remind you of our main theorem with quadrances, that quadrance of z times w is quadrance of z times quadrance of w.

And in particular, if you have a point gamma which lies on this unit circle, so the unit circle has equation x squared + y squared = 1 in xy coordinates—in terms of complex numbers, well, it's those complex numbers whose quadrance is equal to 1, or if you like, z times z bar = 1. You know, three different ways of writing down the equation

of the unit circle. So suppose that gamma is on the unit circle, and we consider what happens if we multiply, say, w by gamma. So w is out here somewhere, and if we multiply by gamma, then the quadrance of gamma w is the quadrance of gamma times the quadrance of w, but the quadrance of gamma is 1, and that tells

us that the quadrance of gamma w is the same as the quadrance of w. In other words, multiplying by gamma does not change the quadrance of the vector. So if you have a w out here and you multiply by gamma, what's going to happen is that you're going to stay on the big circle through w—you're going to rotate along that circle.

So multiplication by gamma is a rotation of the entire plane. So there's an intimate connection between rotations and points on the unit circle because, after all, if we have any rotation centered at 0, then this number 1 is going to get sent to another point on the unit circle, and the rotation is determined by that point that

we get. So the moral is that rotations of the plane centered at 0 and points on the unit circle are intimately connected. For every rotation, there's a point on the unit circle, and for every point on the unit circle, there's a rotation.
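A rational point on the unit circle makes this concrete. The sketch below is an illustration under stated assumptions: `fractions.Fraction` models the rationals, and gamma = 3/5 + (4/5)i is a sample point chosen here because 9/25 + 16/25 = 1.

```python
from fractions import Fraction as F

def mul(z, w):
    """(a, b)(c, d) = (ac - bd, ad + bc)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

def quadrance(z):
    """Sum of the squares of the two coefficients."""
    return z[0] ** 2 + z[1] ** 2

gamma = (F(3, 5), F(4, 5))    # a rational point on the unit circle
w = (F(2), F(7))              # an arbitrary sample vector

rotated = mul(gamma, w)

print(quadrance(gamma) == 1)
print(quadrance(rotated) == quadrance(w))

# The rotation sends 1 to gamma itself, which is the point that determines it.
print(mul(gamma, (F(1), F(0))) == gamma)
```

Multiplying w = 2 + 7i by this gamma gives -22/5 + (29/5)i, which has the same quadrance, 53, as w.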

Now what's more interesting now is I'm going to throw in a third aspect—the projective line. Alright, so in projective geometry, the projective line in this kind of situation is obtained by looking at lines through the origin—that one there as an example. Well, we can take any line through the origin—this space of lines is what's called

the projective line. So here's a typical line, let's say, that goes through the point a + bi, say that point is z. And I remind you again that all numbers are rational numbers, so a and b are rational numbers.

We're only considering lines that go through the origin and a rational point. Now this line is determined by z, but it's also determined by any non-zero multiple of z. So we sometimes write square brackets of z to denote the line, meaning that we can multiply z by any

non-zero rational number and it's the same line. Alright, so there's a very important but somewhat subtle connection between the projective line and the unit circle. And what is it? Well, if you've been classically trained, you probably think that what you do is you take z and you divide by its modulus.

This is the standard way of getting a unit vector or a unit complex number from a general complex number—you divide by the modulus. So that means we take this and we have to divide by the square root of a squared + b squared.

But as I tried to explain, the square root of a squared + b squared is a highly problematic concept, and if we're working rationally, as we are here, it doesn't figure in our picture, so that's not an option for us. This is a very familiar construction in standard complex analysis, but it's not available to us thinking rationally.

Now, you might say, "Well, that's a problem for you, Norman. I mean, we can do something that you can't." Well, whether you can actually do it or just talk about it is a question. But the point is that there is something better that we can do to replace this idea, and it's going to be very important for us to understand what that is. It's a replacement.

Okay, so here's what we're going to do. We are going to take this line, and we're going to take the point 1, and we're going to reflect the point 1 in this line. Alright, so how do we do a reflection? Well, we take a line through here which is perpendicular to it, something like that, and then we go up to here.

It's going to meet the circle at a second point here, and this point here is the reflection of 1 in this line. And notice that it's unique—that's quite different from the idea of taking this thing and dividing by its modulus to find a unit vector which is on this line.

In rational geometry, a line through the origin does not have to meet the circle, so the existence of this point and this point, this intersection between the line and circle, is problematic: maybe it exists and maybe it doesn't; it depends on a quadratic equation determined by a and b. But this point always exists. It's there no matter what; we can always reflect this point in that line.

Alright, so what is this point? How can we write it efficiently? Well, it turns out there's a very nice and beautiful way of doing that, and we can see that by drawing this line here and going out a little bit further. Imagine it coming out a little bit.

So the trick is to consider this complex number z and to multiply it by z. Where will that get us? Well, if we multiply this by z, we're going to get z squared, and z squared is going to be up here somewhere. I'll put it in red so you can see it: z squared, and it's going to be up there, where this turn and this turn are equal. This turn, say u, and this turn, u, are equal, because multiplying by z will take 1 to this and will take this to this, so it preserves the turn here going to the turn there.

That's the main theorem about turns: the turn between 1 and z is the same as the turn between z and z squared. And so that line is going to meet that point, and that point is actually z squared over the quadrance of z. It's the same kind of thing that we were trying to do here, but because this is z squared, the quadrance of this is the square of the quadrance of z, and that's a property of this thing here: the quadrance of z squared is the quadrance of z all squared.

So when we divide z squared by the quadrance of z, we're on the unit circle. So what a beautiful thing that is! So that's a way of associating to a line a point, and if you change that by multiplying it by a scalar—multiplied by 2, or by 3, or by -1—this point is not going to change.

If you multiply z by lambda, then there's going to be a lambda squared appearing up there, and there's going to be a lambda squared appearing down there, and they're going to cancel. So it doesn't matter what z you choose on that line, you perform z squared divided by Q(z), you're always going to get that point.
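The construction can be sketched in a few lines of Python, offered as an illustration rather than as part of the lecture. The function name `unit_point` and the pair representation (real part, imaginary part) are assumptions made here. The sketch computes z squared over Q(z) with exact rationals, then checks that the result has quadrance 1 and that rescaling z by a non-zero rational lambda gives the same point.

```python
from fractions import Fraction as F

def quadrance(z):
    """Q(z): the sum of the squares of the two coefficients of z."""
    a, b = z
    return a * a + b * b

def unit_point(z):
    """Return z squared over Q(z) for a non-zero z = (a, b)."""
    a, b = F(z[0]), F(z[1])
    q = a * a + b * b
    if q == 0:
        raise ValueError("z must be non-zero")
    # z squared = (a^2 - b^2) + (2ab) i
    return ((a * a - b * b) / q, 2 * a * b / q)

z = (F(2), F(1))                 # the line through the origin and 2 + i
p = unit_point(z)                # equals (3/5, 4/5)

assert quadrance(p) == 1         # the point lies on the unit circle

# Any non-zero rational multiple of z lies on the same line
# and is sent to the same point.
for lam in (F(2), F(-3), F(1, 2)):
    scaled = (lam * z[0], lam * z[1])
    assert unit_point(scaled) == p
```

Note that lambda = -3 also gives the same point, which is one concrete instance of the two-to-oneness mentioned in the lecture: z and -z determine the same rotation.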

This is a very important idea, and it's somehow at the heart of a certain two-to-oneness which appears throughout this subject, and it's often considered a feature of the quaternion business, but you should appreciate that it's already existing here in the complex numbers—that if we want to associate a rotation to this line, the way to do it is to associate to it the rotation by z squared over Q(z).

A very good diagram to spend a couple of hours staring at and thinking about. So let's try to understand this structure by looking at a sort of special case. But first, let's identify that what we've managed to do is to associate to every complex number z which is non-zero a rotation. Let's call it phi sub z, where this rotation is defined as follows: phi sub z acting on a complex number w multiplies w by z squared over Q(z). Because z squared over Q(z) is a unit complex number, it lies on the unit circle, and so this is necessarily a rotation that depends on z.

I'll just mention that there's another nice way of writing it: remember that Q(z) was equal to z times the complex conjugate z bar, and so if we write it that way, then two of the z's on top and bottom cancel, and we can write it as z times w over z bar.
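To see the two ways of writing phi sub z side by side, here is a hedged Python sketch, not taken from the lecture. The helper names (`mul`, `conj`, `div`, `phi_first`, `phi_second`) are assumptions made here, and division is implemented by multiplying by the conjugate and dividing by the quadrance.

```python
from fractions import Fraction as F

def mul(z, w):
    """Product of two complex numbers stored as (real, imaginary) pairs."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])

def conj(z):
    """Complex conjugate z bar."""
    return (z[0], -z[1])

def quadrance(z):
    """Q(z) = z times z bar, the sum of squares of the coefficients."""
    return z[0] * z[0] + z[1] * z[1]

def div(z, w):
    """z / w, computed as z * conj(w) / Q(w), for non-zero w."""
    n, q = mul(z, conj(w)), quadrance(w)
    return (n[0] / q, n[1] / q)

def phi_first(z, w):
    """phi_z(w) written as (z squared over Q(z)) times w."""
    z2, q = mul(z, z), quadrance(z)
    return mul((z2[0] / q, z2[1] / q), w)

def phi_second(z, w):
    """phi_z(w) written as z times w over z bar."""
    return div(mul(z, w), conj(z))

z = (F(2), F(1))
w = (F(1), F(3))

assert phi_first(z, w) == phi_second(z, w) == (F(-9, 5), F(13, 5))
assert quadrance(phi_first(z, w)) == quadrance(w)   # a rotation keeps Q(w)
```

Both forms give -9/5 + 13/5 i for this choice of z and w, and the quadrance of w (here 10) is unchanged.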

So that's quite nice, and it has the property that if you multiply z by a non-zero rational number, then the rotation doesn't change. So the rotation only depends on the line through the origin and z. And you can also check, as a nice exercise, that if you compose two rotations, phi sub z and phi sub w, that's the same as phi sub zw, the rotation associated to the product z times w.

Now I want to connect that circle of ideas with a very classical subject which is close to my heart, which I've mentioned in quite a few of my videos. It's a very important thing that all undergraduates should be well aware of, and that's the rational parameterization of the circle. And in fact, that really drops out from looking at what we've just done.

Okay, so a bit of complex analysis allows us to rethink this rational parameterization in a very pleasant way. And the idea is that, well, if, let's say, we're interested in lines through the origin, the space of lines through the origin can be described as essentially a line plus a point at infinity. So if I take this line here, this green line, which has the equation real part of z equals 1, all the complex numbers on here are of the form z = 1 + it, where t is some number.

Alright, if we take any z like this and join it to the origin, we get a line, and all lines through the origin are of that form for some t, except for the sort of special case where we're just looking at this parallel one, which is sort of the case when t is infinite. But otherwise, all lines meet this green line at exactly one point, let's call it 1 + it.

Alright, now that's a rather interesting point. Let's have a look at what happens if we do this construction that we talked about, where we calculate z squared over Q(z) when z happens to be this special complex number 1 + it. So if z is equal to 1 + it, then z squared over Q(z), what is it? Well, first of all, what is z squared? z squared will be 1 - t squared + 2t times i. But we have to divide by Q(z). What's Q(z)? It's the sum of the squares of the coefficients, so it's 1 + t squared.

So we get (1 - t squared)/(1 + t squared) plus (2t)/(1 + t squared) i, and we know that that's a point on the unit circle, and moreover, it's exactly the point that we get by doubling this angle, if you like, or taking this turn and applying an equal turn to it. There it is right there. So while z squared is up here generally, dividing by Q(z) scales it down so that it gets on the unit circle, and this point is also the reflection of the point 1 in this line.
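The calculation for z = 1 + it can be checked numerically with a short Python sketch, given here as an illustration and not as part of the lecture. The function names `circle_point` and `square_over_quadrance` are assumptions made here. The first uses the closed formula; the second carries out z squared over Q(z) directly for z = 1 + it.

```python
from fractions import Fraction as F

def circle_point(t):
    """The point ((1 - t^2)/(1 + t^2), 2t/(1 + t^2)) for rational t."""
    t = F(t)
    q = 1 + t * t
    return ((1 - t * t) / q, 2 * t / q)

def square_over_quadrance(t):
    """z squared over Q(z), computed directly for z = 1 + it."""
    a, b = F(1), F(t)
    q = a * a + b * b                  # Q(z) = 1 + t^2
    real, imag = a * a - b * b, 2 * a * b   # z squared
    return (real / q, imag / q)

for t in (F(0), F(1, 2), F(1), F(2), F(-3)):
    x, y = circle_point(t)
    assert (x, y) == square_over_quadrance(t)   # the two computations agree
    assert x * x + y * y == 1                   # the point is on the unit circle
    print(t, x, y)
```

For example, t = 1/2 gives the point (3/5, 4/5) and t = 2 gives (-3/5, 4/5), both coming from the familiar 3, 4, 5 triple.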

Now this is the familiar rational parameterization of the unit circle that goes back essentially to Euclid: every rational point on the unit circle is of exactly this form. That's the x-coordinate, that's the y-coordinate, for some rational number t, except for the special case when you have minus one, which sort of corresponds to t equals infinity.

There are a few other things I would just like to mention about this which are interesting too. The other property is that if you take the line from -1 to this point z squared over Q(z), that's actually parallel to the line that we started with. Why is that? Well, it's because here you see that's a turn of u, and that's a turn of u, and so this turn subtended by this chord at the center is related to this turn subtended by that same chord at the circumference, and that's going to be a turn of u as well. Elementary geometry of the circle.
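The parallelism claim can also be checked algebraically. The Python sketch below is an illustration added here, with the hypothetical helper names `circle_point` and `slope_from_minus_one`. It compares the slope of the line from -1 to the point on the circle with the slope of the original line through the origin and 1 + it, which is t.

```python
from fractions import Fraction as F

def circle_point(t):
    """The point ((1 - t^2)/(1 + t^2), 2t/(1 + t^2)) for rational t."""
    t = F(t)
    q = 1 + t * t
    return ((1 - t * t) / q, 2 * t / q)

def slope_from_minus_one(t):
    """Slope of the line joining (-1, 0) to the circle point for t."""
    x, y = circle_point(t)
    # x + 1 equals 2/(1 + t^2), which is never zero for rational t.
    return y / (x + 1)

for t in (F(0), F(1, 2), F(2), F(-3), F(7, 4)):
    # The line through the origin and 1 + it has slope t / 1 = t,
    # so equal slopes mean the two lines are parallel.
    assert slope_from_minus_one(t) == t
```

Equal slopes here stand in for the inscribed-angle argument given in the lecture; the sketch confirms the conclusion on sample values but is not a proof.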

So this line is actually parallel to this line. And if we just move everything over by 1, then the point 1 + it, if we translate it over by 1, just becomes the point it. And so the meaning of t, if you like, is that it's the point on the y-axis that you need to choose, and then connect it to -1, and then that line will meet the unit circle at this point: (1 - t squared)/(1 + t squared) plus (2t)/(1 + t squared) i.

And then connecting things with the rotation phi sub z: once we have this point on the unit circle, associated with that is the rotation given by multiplication by that point, or what we're calling phi sub z, and that's the rotation that physically takes the point 1 and rotates it to that point there, takes this point and rotates it to there. So it's a rotation like that. That's the rotation phi sub z.

And for future reference, I'd like to also remind you that that rotation can also be thought of as a product of two reflections. Alright, so there's this line here, and there's this line here that we're considering. If we take the product of the reflection in this line, followed by the reflection in this line, what do we get? Take some point here and reflect it first in the x-axis, and then reflect it in this line here, and you're going to get a point here, which is the same as the rotation phi sub z applied to the original point. So this rotation, determined by this point on the unit circle, is also associated with this line in another way: it is basically the product of this reflection times this reflection.

Alright, so that's quite a lot of material. If you haven't seen complex numbers before, of course it'll probably be overwhelming to you. If you have seen some complex numbers before, I hope this provides a somewhat different view on things, and this two-to-one aspect is going to be very important for us when we try to understand quaternions. So this is a very good thing to try to understand first; it makes understanding quaternions so much easier.

As good practice for you, you might like to calculate the turn between the vector, say (1, 0), and this unit vector that we found here.

So in other words, the turn u, this turn here, is t because that's t and that's 1, so the slope is t. So that's a turn of u, and that's a turn of u. Then what's the combined turn from here to here? In fact, you can think about what happens if you combine more turns of u, three turns of u, four turns of u, and so on. What is the turn of the composite turn? It's a very nice little formula, and of course we all want to do that without any transcendental functions: no circular functions, no mention of angles.
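Separately from the exercise, the earlier remark that phi sub z is a product of two reflections can be tested on examples. The Python sketch below is an illustration added here, not the lecture's own computation. It assumes the standard formulas that reflection in the x-axis is complex conjugation and that reflection in the line through the origin and z sends w to z squared times w bar over Q(z); neither formula is stated in the lecture, and all function names are hypothetical.

```python
from fractions import Fraction as F

def mul(z, w):
    """Product of two complex numbers stored as (real, imaginary) pairs."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])

def conj(z):
    """Complex conjugate z bar."""
    return (z[0], -z[1])

def unit_point(z):
    """z squared over Q(z) for non-zero z."""
    z2, q = mul(z, z), z[0] * z[0] + z[1] * z[1]
    return (z2[0] / q, z2[1] / q)

def reflect_in_x_axis(w):
    """Assumed formula: reflection in the x-axis is conjugation."""
    return conj(w)

def reflect_in_line(z, w):
    """Assumed formula: reflection in the line through 0 and z."""
    return mul(unit_point(z), conj(w))

def phi(z, w):
    """The rotation phi_z: multiplication by z squared over Q(z)."""
    return mul(unit_point(z), w)

z = (F(2), F(1))
w = (F(1), F(3))

assert reflect_in_line(z, z) == z                    # points on the line stay fixed
assert reflect_in_line(z, (F(1), F(0))) == unit_point(z)   # 1 goes to z^2 / Q(z)
# x-axis reflection first, then reflection in the line, gives phi_z.
assert reflect_in_line(z, reflect_in_x_axis(w)) == phi(z, w)
```

The second assertion matches the description of z squared over Q(z) as the reflection of the point 1 in the line, and the third matches the order of reflections described in the lecture.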

So in our next video, we're now going to go up to three-dimensional space and start discussing rotations—the problem of rotations. How do we describe rotations in three-dimensional space, and in particular, how do we compose rotations? I hope you'll join me for that.

I'm Norman Wildberger. Thanks for listening.
