Game Production and Pipeline Services

Geek Zone

Homogeneous coordinates with GA (in progress...)

It is assumed that the reader is familiar with 3-d computer graphics math-related topics as well as having some knowledge of Geometric Algebra.

1 Intro

In standard 3-d engines, little distinction is made between points and vectors. For example, a 3-d vector a = (1, 2, 3) is treated either as a position vector or as a direction, depending on the context.

Homogeneous coordinates attempts to rectify that by appending an extra coordinate (or 'dimension', if you prefer) to vectors. In other words, if we're working in N-d, we add an extra dimension to the base space, and call this new space projective N-d.

Thus, the 3-d point version of a looks like a = (1, 2, 3, w ≠ 0) where w is the extra coordinate.

(Note that I use bold to indicate vectors, and italics for scalars and other non-vector quantities.)

What happens if w does equal 0? Then we have an improper point (also called a point at infinity), which is indistinguishable from a vector in the base space.

Why does this matter? Well, the typical way of converting a point back to its base space- representation (i.e., a vector) is to divide all of the point's components by w (which is then dropped). If w is equal to 0, this causes a divide-by-zero error.

Now, when using APIs like OpenGL or DirectX, the extra coordinate allows the use of 4x4 transformation matrices to perform not only scaling and rotation, but translation as well. Other parts of the 3-d pipeline also use homogeneous coordinates (clipping, perspective transformations, etc.).

1.1 Using homogeneous coordinates without being entirely aware of it

Homogeneous coordinates are also used implicitly when 3-d programmers manipulate planes in [n;-d] form, where n is the plane's 3-d normal vector, and d is the plane's distance from the origin.

Creating a plane X from three position vectors {a, b, c}, involves the following:

Let n = ((b - a) ⊗ (c - a)) / ||((b - a) ⊗ (c - a))|| , with ⊗ denoting the 3-d vector cross product.

Then X = [n, ⟨n, a⟩] , where ⟨u, v⟩ is the bra-ket notation denoting the dot product of vectors.

Taken as a whole, the above represents the typical state-of-the-art with regards to the use of homogeneous coordinates in 3-d apps. On rare occasions, we may also hear talk of Plücker coordinates, which offers interesting ways of manipulating lines and planes in 3-d.

Unfortunately, Plücker coordinates are usually presented de facto, that is, without explaining the structural 'method to the madness'.


On the other hand, GA provides an elegant and unified method for creating geometric entities such as lines and planes, extensible to any dimension.


2 Demonstrating GA

By way of example, let's create a plane using GA techniques:

Let {a, b, c} be three points in P3.

Then computing a normalized plane | Π | using GA can be done in the following manner.

Define a function dir(X) that computes the direction of k-flat X in PN as

dir(X) = (eN + 1X) ,

and function weight(X) as

weight(X) = || dir(X) ||

Then | Π | = X / weight(X) , where

  • X = ( abc ) ,
  • ∧ denotes the progressive product,
  • eN + 1 denotes the unit basis blade corresponding to the homogeneous coordinate, and
  • ⌋ denotes the left contraction product.

(Note: I prefer the term 'progressive' rather than the more frequently-used 'outer' since GA also offers a regressive product, which is dual in function to the progressive product.)


Now, some definitions are in order:

  • PN denotes the projective space of N dimensions, i.e., having dimensionality (1 + N) for a N-d base space.
  • A k-flat refers to a point, line or plane, each of which are k-vectors representing geometric entities. In other words, a 1-flat is a point (is a 1-vector), a 2-flat is a line (is a 2-vector) and a 3-flat is a plane (is a 3-vector).
  • The left contraction product, introduced by Dorst et al., is one of several inner products definable in GA, and probably the most useful for geometry.


Suppose you have two vectors of grade-m and grade-n, respectively. Then

Am ⌋ Bn = ⟨ A B ⟩ n - m , if (n - m) ≥ 0.
* A B denotes the geometric product of multivectors A and B, and
* ⟨ M ⟩ k denotes the grade-k part selector (a.k.a. grade projection operator) of multivector M.

With this understanding, the function dir(X), which computes the direction of k-flat X, returns

  • a scalar if (k == 1), i.e., if X is a point. This corresponds to the w-coordinate of the point.
  • a vector if (k == 2), i.e., if X is a line. This corresponds to the direction of the line defined by b - a.
  • a bivector if (k ==3), i.e., if X is a plane. This corresponds to the bivector representing the plane's surface normal.

The function weight() computes the magnitude of the value returned by dir(), and returns a k-flat's weight (a scalar).


In summary, a k-flat is a k-vector, and has non-zero weight if it's not degenerate or improper (i.e., 'at infinity'). Furthermore, dividing the flat by its weight normalizes it.

It is advisable to work with normalized k-flats, much in the same way as it is good practice to work with unit-length normal vectors in a standard 3-d app setting.


We saw how to create a normalized plane; now let's make a normalized line. The procedure is even simpler: for points {a, b}, a != b, simply compute a ^ b and divide by the weight.

The above method extends to any dimension: no having to create specialized classes and code for each geometric entity, no messing around with Plücker coordinates, etc!


Now let's look at things like distance queries, incidence relationships (i.e., finding intersections), etc.


Define the scalar distance between 3-d points and planes as

dp, Π = p ⌋ ∼Π ≡ ⟨p,∼Π

where ∼ denotes geometric dual, i.e.,

∼X = X IN-1

which is the geometric product of multivector X with the unit antiscalar inverse for PN

Q: What's going on here?

A: Let us first understand what Π represents: it is a 4-d trivector constructed by 'wedging' three points together (i.e., by having computed the progressive product of three points).

* The dual of any k-flat in PN is a (N + 1 - k) - vector in RN + 1

* Write the dual of a k-flat in lower case, so that ∼Π is rewritten as π (for planes), ∼Λ is rewritten as λ (for lines), etc..

Then π is a 4-d vector in [n, -d] form, while Π = d e123 + nz e124 - ny e134 + nx e234 , where the ni denote the coefficients of the plane's normal vector.

Since p is both a point in P3 and a vector in R4, and π is a vector in R4 as well, then clearly

dp, Πpπ ≡ ⟨p,π

In other words, the distance of any 3-d point to any 3-d plane is the result of evaluating the dot product between the point and the dual plane!

* Note also that the value of pπ = the value of pΠ .


Let p = (1, 2, -3, 1), and Π = (0, 0, 1, 1) ∧ (3, 0, 1, 1) ∧ (0, 5, 1, 1) = (15, 15, 0, 0) .

Then | Π | = (1, 1, 0, 0) and |π| = (0, 0, 1, -1). Interpret this as a plane at distance 1 from the origin and having normal = Z.

Computing p ⌋ |π| gives -4, and computing p ∧ | Π | gives -4e1234 .

Thus we see that p is at distance -4 from the plane.

  • It is not hard to imagine that if pπ = 0 (or if pΠ = 0), then the point lies on the plane.
  • It is also not hard to imagine that if the above works for point-plane distance queries, might not it also apply to point-line distance queries as well?


In summary,

  • the 'planes' that 3-d programmers are accustomed to working with are in fact 'dual planes' in the parlance of projective GA.
  • N-d points are just (N + 1)-d vectors.
  • Points and planes are geometrically dual to each other.


Jason WoodComment