Sententia cdsmithus

June 15, 2009

Thoughts on an Associative Metric for the Non-Negative Reals

Filed under: math — cdsmith @ 12:00 pm

A recent reddit post posed the question of whether there exists an associative metric on the non-negative reals.  I didn’t see it until after most people moved on, so I’m posting my thoughts on the subject here.  It is likely that none of this is new, but perhaps some people will find it interesting to have it collected in one place.

As a brief review just to collect the relevant definitions in one place… a function d : X^2 \to \mathbb{R}_{\geq 0} is a metric if:

  1. d(x,y) = 0 when x = y, and d(x,y) > 0 when x \neq y.  This is called definiteness.
  2. d(x,y) = d(y,x).  This is called symmetry.
  3. d(x,y) \leq d(x,z) + d(z,y).  This is the triangle inequality.

In particular, if X = \mathbb{R}_{\geq 0}, then d is a binary operation.  One may then ask whether it is associative.  That is, if d(x, d(y,z)) = d(d(x,y),z).  The problem is whether there exists a metric d on the non-negative reals that is in fact associative.

A Key Observation

To get started, let’s just play some games with manipulating the axioms above.  Let’s consider any non-negative real number x, and let k = d(0,x).  That is, k is the distance of x from the origin.  Then something interesting happens.  By the associative property we’ve just discussed, we have:

d(0, d(x,k)) = d(d(0,x), k)

But remember, d(0,x) on the right-hand side of that equation is the same as k, and by the definiteness property of a metric, d(k,k) = 0.  So we really have:

d(0, d(x,k)) = d(k,k) = 0

Now, applying definiteness again in the other direction, we can conclude that 0 = d(x,k), and then applying it yet a third time, that x = k.  In other words, we have:

Proposition 1: Suppose d is an associative metric on the non-negative reals.  Then for all non-negative reals x, d(0,x) = x.

This is not an unusual property of a metric on the reals.  For example, the standard metric d(x,y) = |x-y| has this property for all non-negative numbers.  However, it’s not true of all metric (not the discrete metric, for example.)  Note that so far, we have used only associativity and definiteness.  The really interesting property of metric, though, is the triangle inequality.  So let’s apply that to try to get some bounds on the possible choices of d, by decomposing distances into distances from the origin.  Suppose we choose x and y as arbitrary non-negative reals.  Then we have:

  1. d(0,x) \leq d(0,y) + d(y,x)
  2. d(0,y) \leq d(0,x) + d(x,y)
  3. d(x,y) \leq d(x,0) + d(0,y)

Now, just using symmetry and proposition 1 to simplify those expressions, we get:

Proposition 2: Suppose d is an associative metric on the non-negative reals.  Then for all non-negative reals x and y,|x-y| \leq d(x,y) \leq x + y.

These are not very tight bounds on the possible values of d, really.  But they do tell us something interesting: namely, that a metric that works for us will necessarily have something to do with the magnitudes of the numbers involved.  This was not necessarily a given — one can imagine metrics on many sets of numbers that are entirely unrelated to the magnitudes of the numbers involved.  It turns out that in our case, our willingness to interchange the value of the metric with its inputs via the associative property somehow forces the metric to give significance to the magnitude of the numbers.

Group Structure

The original post to Reddit made the casual observation that it can be shown that if this is true, then \mathbb{R}_{\geq 0} is a group under this operation.  To review another basical definition, a group is any set together with a binary operation (which I’ll call \bullet), which satisfies:

  1. Associativity: that is, x \bullet (y \bullet z) = (x \bullet y) \bullet z.
  2. Identity: that is, there exists e such that for all x, e \bullet x = x \bullet e = x.
  3. Inverses: that is, for any x, there is a x^{-1} such that x \bullet x^{-1} = x^{-1} \bullet x = e.

As a quick change of notation (so that we’re more consistent with the infix notation normally used for groups), we define x \bullet y = d(x,y).  Of course, the associativity of \bullet was something we assumed from the very beginning.  Proposition 1, together with the symmetry property, give us the identity, which is zero.  Definiteness gives us inverses: for any x, we know d(x,x) = 0, so that x is its own inverse.  Therefore, \langle \mathbb{R}_{\geq 0}, \bullet\rangle is a group.

A good bit could be said about the structure of this group.  In fact, that every element is its own inverse gives us very specific information about what this group looks like (it’s built from copies of \mathbb{Z}_2, for example.)  But even just knowing that it’s a group gives us some interesting ways to think about the problem.  The bounds we found in the previous section were actually compatible with the standard metric on the real numbers.  But seen as a group operation, our associative metric starts to develop some interesting properties that are somewhat different from the standard metric.

For example: suppose you give me a non-negative real x, and ask me what other non-negative reals are a distance y away from it.  Because distance is a group operation, there is precisely one answer to this question: if x \bullet z = y, then z = x^{-1} \bullet y (and in this case, since every number is its own inverse, that’s the same as x \bullet y).  In other words the function d_x(y) = d(x,y) is necessarily one-to-one, for any x.

That’s a pretty significant statement!  In fact, it’s fairly simple now to show that for x \neq 0, d_x therefore cannot be a continuous function in the topology induced by the standard metric |x-y|.  We might previously have hoped that perhaps the metric that worked would look something like the standard metric, though perhaps stretched or distorted in some way.  In fact, the earlier proposition gave us extra hope, by limiting the amount by which our metric can differ from the standard metric.  It’s now clear, though, that an associative metric will be quite  different from the standard metric.

(As an aside, it’s worth noting that d_x above is continuous in the topology induced by the metric d.  This is basically equivalent to requiring that for any \epsilon > 0, there exists \delta > 0 such that whenever y \bullet y_0 < \delta, we also have d_x(y) \bullet d_x(y_0) < \epsilon.  But d_x(y) \bullet d_x(y_0) = x \bullet y \bullet x \bullet y_0 = y \bullet y_0, so choosing \delta = \epsilon always works.)

Where Next?

Who knows.  We have two conclusions now that seem rather at odds with each other: first, then an associative metric would need to overestimate the standard metric by a bounded (though, admittedly, very loosely bounded) amount.  Second, we have that in terms of topological properties, our metric differs significantly.  Resolving these tensions — or showing that they can’t be resolved — would perhaps be an important next step in addressing the problem.

May 23, 2009

Approximations of Pi (and other numbers) as Fractions

Filed under: Uncategorized — cdsmith @ 2:38 pm

A discussion with a friend and his daughter this morning led me to remember an intriguing function from the Haskell standard library – approxRational.  We were talking about his daughter learning in school that pi is (according to her class notes; it’s quite likely that the teacher was actually more correct) “3.14, or 22/7″.  Even the 11-year-old among us knew that something was wrong; her calculator told her that 22/7 was not exactly 3.14.  They are both approximations, of course, which was easy to explain.  But then she wanted to know why we choose those particular approximations.  And that was an interesting question.

With decimal numbers, of course, you just pick a number of digits and round it off.  Sure, this depends on the base used to represent the number, and that’s ugly… but it’s called the decimal system for a reason.  With fractions, the situation is less clear.  Why would you want to remember 22/7?  There’s a whole host of other possibilities: for instance, 3/1, or 7/2, or 16/5, or 311/99, or 355/113.  (That last one, as we’ll see later, is a particularly good choice.)  Some of those fractions are easier to remember than 22/7.  Others are closer to pi than 22/7.

We’ve got two competing values here:

  1. We want the result to be easy to remember.  If we seek a result (as I do) that’s not somewhat arbitrary (note that using base 10 itself is arbitrary, and therefore so are criteria like wanting round numbers, or numbers whose digits form patterns) then the only good interpretation of “easy to remember” is “using small numbers”.
  2. We also want a number that’s close to pi.

Clearly, we’re looking for some kind of a balance here.  Well, Haskell’s approxRational function to the rescue.  The documentation for this function says:

approxRational, applied to two real fractional numbers x and epsilon, returns the simplest rational number within epsilon of x. A rational number y is said to be simpler than another y' if

  • abs (numerator y) <= abs (numerator y'), and
  • denominator y <= denominator y'.

Any real interval contains a unique simplest rational; in particular, note that 0/1 is the simplest rational of all.

The claim that any real interval contains a unique simplest rational is an interesting one.  Simplicity, as defined here, is a partial order.  That is, 1/2 is simpler than 2/3, which is in turn simpler than 3/4.  But if one asks which of 2/3 or 3/2 is simpler, there is no answer.  Those numbers are incomparable with respect to the simplicity order; neither is simpler than the other.  In general, with a partial order, there may be any number of minimal elements in a given set; possibly none at all, or possibly infinitely many.  The claim here is that with this particular partial order of simplicity, and this particular set being an interval of real number, there is precisely one.

This turns out to be not terribly hard to prove.  Suppose you have two rational numbers a/b and c/d. Then let e=min(a,c) and f=min(b,d).  It’s not hard to see that e/f is at least as simple as either a/b or c/d (this by definition, basically; if e/f can be reduced, the result is more simple, not less), and also that it falls between a/b and c/d so that it must be in any interval that contains them.  It’s not necessarily the simplest rational in the interval, but it proves that no two rational numbers can both be minimally simple over an interval than contains both of them.

This also provides a pretty powerful tool in answering our question.  If we decide in advance how much inaccuracy we’re willing to accept in our approximation, then there is a unique approximation that’s easiest to remember, and approxRational will tell us what it is.  This immediately eliminates some possible approximations.  For example, you should never approximate pi as 7/2, because 3 is both easier to remember, and closer to pi.  By considering ever smaller intervals and using approxRational, one can obtain a sequence of approximations that are simplest for some amount of allowable inaccuracy.

Prelude Data.Ratio> map (approxRational pi) [0.5^k | k <- [1..6] ]
[3%1,3%1,13%4,16%5,19%6,22%7]

Up to a certain level of accuracy, the best approximations appear to be 3, 13/4, 16/5, 19/6, 22/7, or 201/64.   But ah, how to choose.  One clue is afforded by the fact that the number 3 appears twice in the list.  In fact, if we extended the list further, we’d discover that 22/7 appears 3 times!  If a number is simple enough to beat out other approximations for somewhat large intervals, but also good enough to keep winning for very small intervals, then this bodes well for its being a good general-purpose choice.  It’s a decent measure of the efficiency of the number.  Let’s get some more precision in the choice of intervals, and then count up the number of occurrences:

Prelude Data.Ratio Data.List Control.Arrow Data.Function> take 10 $ reverse $ sortBy (compare `on` fst) $ map (length &&& head) $ group $ map (approxRational pi) [0.99^k | k <- [1..10000] ]
[(6414,884279719003555%281474976710656),(572,355%113),(297,22%7),(288,5419351%1725033),(194,3%1),(168,1146408%364913),(143,312689%99532),(120,833719%265381),(110,80143857%25510582),(109,165707065%52746197)]

I did the same sort of thing, but considered many more values for epsilon, then grouped recurring elements of the list, measured their lengths, and sorted on them.  Initially, the best approximation seems to be quite ugly:884279719003555/281474976710656.  That’s an illusion, though.  Because pi is itself a floating point constant in Haskell, it is actually equal to a rational number, and that’s the one.  So it’s occurrence at the top of the list means only that we’ve looked as far as we can.  The remaining answers are true approximations.  22/7 does indeed fair pretty well, though it’s narrowly edged out by 355/113, which is an even more efficient (in fact, to get closer to pi than 355/113, you have to go all the way to 52518/16717!

The lesson?  Who knows?  22/7 is a pretty decent choice for teaching 11-year-olds about fractions that are close to pi.  355/113 would be better, if you think they can remember numbers that big.  And approxRational is fun.

Exercise for the interested reader: implement an inverse to approxRational.  That is, given a rational number, can you determine the adjacent simpler rationals?  If so, you could work backwards: take the most precise possible rational approximation of pi that fits in a Double, and then determine how far you have to move away to get a simpler rational that approximates it.  (Hint: there are only finitely many rationals simpler than a given rational.  You shouldn’t consider them all, but if you think about what it would look like to list them, you can find a decent search strategy.)

April 20, 2009

Old Memories About Trisecting Angles

Filed under: Uncategorized — cdsmith @ 10:24 pm

This is just some remembering. When I was in middle school (about 12 or 13 years old), I was convinced I could find a way to trisect an angle. Of course, this is known to be impossible, by an argument that I now (but didn’t, at the time) understand. That didn’t stop me from trying, though.  Here’s the construction I spent most of my time on:

Conjeceture for trisecting an angle

Conjecture for trisecting an angle

I started out with an angle, which is the two outermost lines on the given drawing.  Using the compass, I marked off points equally far along the sides.  I then constructed their midpoint.  Then I drew three overlapping circles centered at those three points, as shown.  The intersection points of these circles furthest from the vertex of the angle were used to draw my “trisecting” lines.

Of course, this doesn’t really trisect an angle, since it’s a construction with a compass and straight edge, and no such construction can possibly trisect an angle.  But if you didn’t know that, there are a few sanity checks you might perform.

  • Does it appears to work?  Yes, it does, as it turns out.  In fact, I performed this construction fifty or sixty times, and measured the result with a protractor.  Each time, it was close enough to trisected that I chalked up the difference to human error.  (Having just performed the construction now using the Kig software, to capture the image above, I can now see that the result is visibly off for extremely small angles; but of course these are the ones that I was least able to check without reasonable computer software at the time.)
  • Does it work for cases that are easily calculated?  Yes!  A few easy examples: If you start with a 180 degree angle, you get back 60 degree angles… perfectly trisected.  A zero-degree angle is vacuously correct.  It’s a slightly more involved argument, but the construction also trisects a 90 degree angle correctly.

At this point, one can’t be blamed for getting a little excited.  We have a construction that gives a division of an angle into three parts.  The parts always seem to look equal, and are always close enough to be within reasonable measurement error.  It works for the cases we’ve checked so far.  One can’t be blamed for shifting gears a little here, gaining a little confidence, and looking a little less for a counter-example, and a little more for a proof.

Of course, those familiar with the impossibility of this task will be looking for the case where we start with a 60 degree angle.  And, of course, it turns out not to work.  In particular, the “trisected” angle works out to about 19.1066 degrees.  Close, but not quite.

So what am I really constructing?  Well, something pretty ugly.  I’m leaving out the math (basically, just pick out a couple triangles, and apply known triangle relations such as the law of sines and thelaw of cosines), but, the answer in Maxima notation (and simplified as best as Maxima can) is -asin((sin((3*%theta-5*%pi)/6)*abs(sin(%theta/2)))/sqrt(-2*sin(%theta/2)*cos((3*%theta-5*%pi)/6)+sin(%theta/2)^2+1)).  Yeah, wow.  But how could we have fooled ourselves into thinking that mass of complicated stuff is actually theta / 3?  Well, take a look at Maxima’s plot of this angle versus theta (that starting angle):

Graph of the constructed angle versus the original

Graph of the constructed angle versus the original

Ifthe construction were correct, that graph would be a straight line passing through (3,1).  As it turns out, it may be somewhat difficult to tell the difference in this graph!  So while this construction is wrong, it’s also remarkably close.

Code for Manipulating Graphs in Haskell

Filed under: Uncategorized — cdsmith @ 5:48 pm

Here’s a bunch of code I wrote, most of it about a year ago, for doing things with the graphs from the Data.Graph module in Haskell.  The choice of functions, from among the many generally useful functions acting on graphs, comes from a specific project.  The actually functionality is pretty generic, though.  So I’m just throwing this out there. If someone else wanted to package it and throw it on Hackage (likely with a different module name), they would be welcome to do so.

(more…)

April 19, 2009

Re: Conditional Convergence, or when Addition Isn’t Commutative

Filed under: math — cdsmith @ 11:46 pm

I’m writing in response to an apparently deleted blog post that was submitted to Reddit about conditional convergence.  The post was incorrect… well, perhaps nonsensical would be a better phrase.  But even though I’m more of an algebraist, conditional convergence is still an interesting idea.  So I’m taking my response there, and turning it into a new post.

Most people probably know that the infinite series (1 + 1/2 + 1/3 + 1/4 + …) diverges.  Even though the numbers in the sum keep getting smaller and approach zero the further you get out the sequence, you can still add up enough terms of this sequence to surpass any number you like.  It just might take a long time.  Now here are a couple other simple consequences of that.

  • Even if you throw away a bunch of terms from the beginning of the sequence, it still diverges.  Hopefully, this is obvious.
  • Even if you only add up every other term, it still diverges.  (Proving this by contradiction is fun if you haven’t done a lot of basic analysis.  All you need to know is that if each term of one series A_n is closer to zero than the corresponding term of a series B_n, and B_n converges, then A_n converges, too.)

Okay, now lets consider the sequence we’re really interested in:

1 – 1/2 + 1/3 – 1/4 + 1/5 – 1/6 …

The difference between this sequence and the original is that every other term is negative.  Now, it’s not hard to see that this series actually converges.  But what does it converge to?  Well, in the order it’s given here, it converges to something between 1/2 and 1.  But the interesting point here is that I can rearrange the terms of this sequence to give it any convergence behavior I want.  I can make it converge to any real number x. I can make it diverge toward positive infinity.  I can make it diverge toward negative infinity.  I can even make it jump around back and forth between 0 and 1 but without actually converging anywhere.  And with all of these different convergence behaviors, I am still going to use the same terms of this series… just add them in different orders.

Making The Series Converge to x: Suppose I choose a real number x, and I want my series to converge to x.  Easy.  For simplicity, I’ll assume x is positive, but it’s easy (and obvious) to modify this for x negative, as well.  I’ll start out just taking the positive terms of the sequence… 1 + 1/3 + 1/5 + 1/7 + …, and I’ll keep going that way until the sum is strictly greater than x.  (How do I know that it will eventually become strictly greater than x?  Because I am just adding up every other term of that first sequence, and we agreed that sum diverges to infinity.  So I’ll reach x in some finite number of terms.  As soon as the sum surpasses x, I’m going to start including negative terms… – 1/2 – 1/4 – 1/6 … and so on.  I’ll continue that way until the sum drops back down below x again.  Then I’ll swap back to the positive terms, and so on.

I claim this uses all the same terms as the original series… but instead of converging to something between 1/2 and 1, it converges to any real number x.  (If x is negative, the process is the same, except you start with negative terms.  And if x is zero, you start with whichever you prefer.)  Here’s why that is.  Suppose you’re looking for term n of the original series.  In the rearranged series, it won’t occur at position n.  But every time I switch directions, I’m taking at least one term from that direction… so term number n is guaranteed to occur somewhere in the first n direction changes.  Each direction change is only finitely long (as we saw above; this is a consequence of the fact that the strictly positive series diverges), so term n occurs at some finite index in the sum.  So every term appears, and by the way I chose terms, I know I didn’t introduce any new ones, or duplicate any of them.  The terms are the same; only the order differs!

But this new series definitely converges to x.  Right at each direction change, we overshoot x by a little bit.  But the amount that we overshoot x in the sum is always no more than the last term picked in the sequence… and as we choose more terms both on the positive and negative sides, the largest unpicked term is decreasing.  So over time, we start overshooting by smaller and smaller amounts, and the sum converges.

Making the Series Diverge to Positive Infinity: I can also make the same series, with the same terms, increase without bound, thereby diverging to positive infinity.  The trick it to use the negative terms, but just use them much less often.  So add positive terms until you get to, say, 2.  Then negative terms until you get back down to one.  Then positive terms until you reach 3.  Then negative terms until you’re back down to 2.  Keep that up and you get a divergent series.  Change the numbers a bit, and your series diverges to negative infinity instead.

You can even keep the sum from approaching infinity, while also preventing it from converging to any particular finite sum.  Add positive terms up to 1.  Add negative terms until you get back to zero.  Positive terms back up to 1.  Repeat.  The details of proving that you’ve got the same terms, and that the series actually shows the desired behavior, are similar to the first case.

So that’s conditional convergence.  You really can take a series that conditionally converges, and make it show any behavior you like simply by rearranging the terms.

April 11, 2009

Needing Intuition in Math(s): one example

Filed under: Uncategorized — cdsmith @ 11:55 am

I’m helping a few people with abstract algebra at the moment, and I came to this realization.  Most people learning abstract algebra, as far as I can tell, have no idea why homomorphisms and factor groups are sensible things to think about.  They quickly come to understand the idea of a group, and enough varied examples are usually given that they can see how the idea of a group applies to a number of things.  They quickly come to terms with subgroups, though the idea looks rather trivial to them.  Then you get to homomorphisms and factor groups; at this point, most classes run out of intuition and just jump in for some unmotivated mathematical constructions.

I’m not quite sure why this is, honestly.  Anyone with the slightest modicum of mathematical curiosity probably has thoughtn about factor groups since they were seven or eight years old.  In the context of integers and addition, most children realize on their own (whether it’s taught to them or not) that the sum of two even numbers, or of two odd numbers, is even, while the sum of an even number and an odd number is odd.  This is, of course, a factor group.  Students who are presented with the mathematical definition of a factor group should first have, in their set of mental tools, this simple intuitive definition:

Factor Group: For any group (G,*), a factor group is a group that is obtained by being sufficiently sleep-deprived (or perhaps drunk, depending on the university) that one can’t tell the difference between some members of the original group, and then trying to write down a group table.

Of course, one then goes on to point out that sometimes this works, but sometimes it doesn’t.  If one looks at the integers and only sees “even” or “odd”, then it works.  If one looks at the integers and only sees “negative” or “non-negative”, then it doesn’t work, since the sum of a negative number and a positive number could be either negative or positive.  It then becomes natural to ask when it works, and when it doesn’t.  This provides a justification, then, for nailing down the abstract definitions, defining normal subgroups, proving that the factor group is well-defined when modding out a normal subgroup, and so on.  First, though, the student needs to be convinced that these are natural things to think about.

(A quick aside: I’m not pulling out the idiotic canard that students need to be convinced that mathematics is useful in “the real world” or anything so ridiculous as that.  But even the purest mathematicians are more interested in answering questions that naturally arise than those that seem quite arbitrary.)

Speaking of defining normal subgroups, it is really inexcusable how many students have never even noticed the close relationship between normal subgroups and commutativity.  Sure, everyone knows that all subgroups of an abelian group are normal; but this seems to be treated as a sort of occasionally useful curiosity.  Few students are even exposed to the simple fact that normality of subgroups is inextricably entwined in the degree to which the subgroup commutes with the surrounding group.

Case in point: one of the equivalent conditions for normality that is often taught is this: Let H be a subgroup of G.  Then H is normal in G if and only if for any h in H, and any g in G, the product (g’ h g) is in H (where by g’, I mean the inverse of g).  This statement is absolutely correct… but it is utterly useless to a student who will most certainly not recognize (g’ h g) as a statement about the commutativity of h and g.  So first, it helps to show a few other results:

  1. The following are equivalent: (g’ h g) = h, and gh = hg.  The proof is trivial is both directions, and students can likely figure it out on their own.  But, and this is the important part, students will likely not recognize this simple fact if they haven’t seen it before.
  2. Let H be a subgroup of G.  Then H is a subgroup of Center(G) if and only if for any h in H, and any g in G, (g’ h g) = h.  This is just applying the previous result and the definition of a center.

From this point, the path to the earlier condition is clear.  We’re really talking about commutativity, but of course our ultimate goal is to be able to blur our eyes (or get sleep-deprived, or drunk) such that we can’t tell the difference between the members of H.  Then one may start with #2 above, and simply remove the distinction between members of H — so instead of looking for (g’ h g) = h, we look for (g’ h g) being “close enough” to h… in other words, it should be in the same subgroup.  One may now recognize that normality really is a weaker analogue of commutativity.

So the literal point here is that students ought to be taught these two intuitions.  The larger point is to wonder why it’s apparently considered appropriate to teach abstract algebra without teaching these two intuitions.

June 6, 2008

Another Screenshot of the Ray Tracer

Filed under: Haskell — cdsmith @ 4:30 pm

Some people are remarkably picky. They actually want to see reflections in a ray tracer. Can you believe that? Here you go then!

Ray Tracer Second Screen Shot

Ray Tracing in Haskell

Filed under: Haskell — cdsmith @ 10:03 am

I’m giving a talk at a user group meeting next week on how to program in Haskell. Rather than just listing off language constructs, I wanted to motivate the whole thing with an actual application. So that’s how I ended up spending yesterday afternoon writing a ray tracer in Haskell. I’m certainly not the first to do so, nor the best, but I am writing this blog post to explain the process I went through. I’ll try making this a literate Haskell post, so you can copy and paste it into a source file (extension .lhs) and run it. The actual code is split into three (very short) modules, but this should work.

Some imports to start us off:

> import Data.Maybe (mapMaybe)
> import Data.Function (on)
> import Data.List (minimumBy)
> import Data.Ix (range)
> import Control.Monad (forM_)
> import Graphics.UI.Gtk hiding (Color, Point, Object)
> import qualified Graphics.UI.Gtk as GTK

First, I built a couple basics of vector math. Out of a desire to demonstrate some list manipulation functions in the presentation, my Vector type is simply [Double].

> type Vector = [Double]
>
> (<+>)       = zipWith (+)
> (<->)       = zipWith (-)
> (<*>)       = zipWith (*)
> vlength u   = sqrt (sum (map (^2) u))
> normalize u = map (/ vlength u) u
> u .*. v     = sum (u <*> v)
> k #*  v     = map (*k) v

Since vectors don’t really even approximate a ring, it seems quite inappropriate to call them Num. Therefore, I’ve invented my own operators. The operators in angle brackets are component-wise, while the normal dot and scalar products are given their own operators, .*. and #*, respectively. I found this to be quite usable, though it would be equally reasonable to eschew operators, and define these as named functions to be used in infix notation.

I then define a few precedences, to make these easier to use.

> infixl 6 <+>
> infixl 6 <->
> infixl 7 <*>
> infixl 7 .*.
> infixl 7 #*

I’ll treat points as displacement vectors from an origin, and colors as vectors in a color space, but it helps to use separate names for them when writing type signatures:

> type Point = Vector
> type Color = Vector

Now for a few real data types. The idea is that an object has some general-purpose properties, plus a shape. The shape could be arbitrary, but is currently limited to only spheres and infinite planes. (Hey, this talk is only an hour and a half long!) So I’ve separated it out. The other properties I’ve got are material properties: how shiny (that is, reflective) is the surface, and what is its color. These are defined as functions from a point in space to the desired values, but in practice I’m just going to use const.

> data Object = Obj          { shape       :: Shape,
>                              objectColor :: Point -> Color,
>                              objectShine :: Point -> Double }
>
> data Shape  = Sphere       { center  :: Point, radius      :: Double }
>             | Plane        { point   :: Point, normal      :: Vector }

And lights are important to. Right now, I’ve got point lights and ambient lights. Adding directed lights (e.g. spot lights) wouldn’t be too difficult, but it is neither of critical importance nor necessary to demonstrate a Haskell language idea, so I’ve left it out.

> data Light  = PointLight   { lightPt :: Point, lightColor  :: Color  }
>             | AmbientLight { lightColor :: Color                     }

The two things we need to know about any shape are how to compute the distance to it along a given ray (specified by origin and a unit direction vector), and how to find the normal (another unit vector) at a given point on the surface of the object. These next two functions do this. They are defined for spheres and planes, but can of course be extended to new variants of Shape via additional pattern matching.

> shapeDistance :: Shape -> Point -> Vector -> Maybe Double
>
> shapeDistance (Sphere c r) orig dir =
>     let p    = orig <+> (c <-> orig) .*. dir #* dir
>         d_cp = vlength (p <-> c)
>         d    = vlength (p <-> orig) - sqrt (r^2 - d_cp^2)
>         ans | d_cp >= r                 = Nothing
>             | (p <-> orig) .*. dir <= 0 = Nothing
>             | otherwise                 = Just d
>     in  ans
>
> shapeDistance (Plane p n) orig dir
>     | dir .*. n >= 0  = Nothing
>     | otherwise       = Just (((p <-> orig) .*. n) / (dir .*. n))
>
> shapeNormal :: Shape -> Point -> Vector
> shapeNormal (Sphere c r) pt = normalize (pt <-> c)
> shapeNormal (Plane p n)  pt  = n

The math there can be derived without a whole lot of trouble, provided you’re relatively familiar with the basic ideas of vector operations. It surprises me, though, how many people only understand dot products in terms of their implementation (the sum of the component-wise products, or a b cos(theta)), and not in terms of what they mean. In particular, when one of the vectors in a dot product is a unit vector, the dot product is the component of the other vector along the direction of the unit vector. That’s a pretty powerful tool to have.

Next come the lights. For the lights, the general question is how much (in each color component) the given light illuminates a particular point with a particular normal. The object list is needed here as well, since lights can be occluded. This is all captured in a function applyLight, which is implemented for each type of light. It’s trivial for ambient lights, since by definition they ignore all the specifics of the situation; but it’s non-trivial for point lights. I also fence all light components to eliminate “negative” lights.

> fence a b x | x < a     = a
>             | x > b     = b
>             | otherwise = x
>
> applyLight :: [Object] -> Point -> Vector -> Light -> Vector
> applyLight objs pt n (PointLight lpt color) =
>     let dir   = normalize (lpt <-> pt)
>         dist  = vlength   (lpt <-> pt)
>         lval  = (n .*. dir) / dist^2 #* color
>         final = map (fence 0 9999) lval
>     in  case findNearest objs pt dir of
>             Just (_,opt)
>                 | vlength (opt <-> pt) < dist -> [0,0,0]
>                 | otherwise                   -> final
>             Nothing                           -> final
>
> applyLight objs pt n (AmbientLight color) = color

Now for the common stuff. A very common thing we want is to take our list of objects and find the nearest point of intersection for a given ray. This function does that, by calling objectDistance for each object in turn, and then choosing the minimum one. It gives back Nothing if there is no intersection, or Just (obj,point) if there is.

> findNearest :: [Object] -> Point -> Vector -> Maybe (Object, Point)
> findNearest objs orig dir =
>         let consider obj = case shapeDistance (shape obj) orig dir of
>                                 Nothing -> Nothing
>                                 Just d  -> Just (obj, d)
>             result = mapMaybe consider objs
>         in  case result of
>                 [] -> Nothing
>                 _  -> let (obj,d) = minimumBy (compare `on` snd) result
>                       in  Just (obj, orig <+> d #* dir)

And finally, the ray tracing itself. I’ve also thrown in defaultColor, which is the color assigned to rays that have either reflected back and forth beyond the recursion limit, or that proceed into infinity. For now, this is always black.

> defaultColor :: Vector -> Color
> defaultColor _ = [0, 0, 0]
>
> trace :: [Object] -> [Light] -> Point -> Vector -> Int -> Color
> trace objs lights orig dir 0     = defaultColor dir
> trace objs lights orig dir limit =
>     case findNearest objs orig dir of
>         Nothing        -> defaultColor dir
>         Just (obj, pt) ->
>             let n        = shapeNormal (shape obj) pt
>                 ndir     = dir <-> 2 * (n .*. dir) #* n
>                 refl     = trace objs lights pt ndir (limit - 1)
>                 color    = objectColor obj pt
>                 shine    = objectShine obj pt
>                 lvals    = map (applyLight objs pt n) lights
>                 lighting = (foldl (<+>) [0,0,0] lvals)
>                 in_light = shine #* refl <+> (1 - shine) #* lighting
>             in map (fence 0 1) (color <*> in_light)

Finally, there’s the GUI stuff. I won’t discuss this much, except to say that it’s horridly bad. I’m doing the actual ray tracing in an onExpose event handler. If I had even a modicum of common sense, I’d be rendering to an image or something, so that I wouldn’t have to redo all the ray tracing just because you switched over to a different window. Oh well. I’ll improve this before the talk on Tuesday.

> objs = [
>     Obj (Sphere [-50, -50, 100] 100)
>         (const [1.0, 0.3, 1.0])
>         (const 0.1),
>     Obj (Sphere [ 50,  50, 100] 100)
>         (const [0.2, 1.0, 0.2])
>         (const 0.1),
>     Obj (Plane [-100,100,0] (normalize [1,-0.2,-0.2]))
>         (const [1.0,1.0,1.0])
>         (const 0.0)
>     ]
>
> lights = [ PointLight   [175, -200, 10] [40000.0, 20000, 20000],
>            AmbientLight                [0.0, 0.05, 0.1]
>          ]
>
> drawScene d ev = do
>     dw    <- widgetGetDrawWindow d
>     (w,h) <- widgetGetSize d
>     gc    <- gcNew dw
>
>     let eye = [0, 0, -500]
>
>     forM_ (range ((0,0), (w-1, h-1))) $ \(x,y) -> do
>         let (x', y') = (fromIntegral x - fromIntegral w / 2,
>                         fromIntegral y - fromIntegral h / 2)
>         let [r,g,b]  = trace objs lights eye
>                              (normalize ([x',y',0] <-> eye)) 5
>         let fg       = GTK.Color (round (65535 * r))
>                                  (round (65535 * g))
>                                  (round (65535 * b))
>         gcSetValues gc $ newGCValues { foreground = fg }
>         drawPoint dw gc (x,y)
>     return True
>
> main = do
>     initGUI
>     w <- windowNew
>     d <- drawingAreaNew
>     windowSetTitle w "Ray Tracer"
>     containerAdd   w d
>     onExpose  d (drawScene d)
>     onDestroy w mainQuit
>     widgetShowAll w
>     mainGUI

And there you go: a ray tracer! Of course, there are many things that ought to be improved. They range from the trivial (render to an off-screen image, perhaps save to an image file, and read from a file instead of coding the scene as an immediate data structure) to the very complex (implement diffuse effects, more complex interpolated curves, etc.). But as a quick demo goes, this one didn’t turn out so bad.

Screenshot

June 4, 2008

Some video talks on programming topics

Filed under: Uncategorized — cdsmith @ 7:51 pm

Last Thursday, I attended a meeting of the Colorado Springs Open Source Meetup Group. I ended up offering to record and post videos of the presentations. They are now available. Matthew McCullough’s presentation on JavaFX is here, and Scott Ryan’s presentation on jQuery is here. I should mention that I had absolutely nothing to do with either of these presentations, except that I showed up with the video camera and microphone. This was a test run for recording my presentations at C-SPLAT (the Colorado Springs Programming Languages and Tools interest group) next week. So as far as content goes, I take no responsibility for inaccuracies or unusual views expressed in the talks.

As test runs go, all was pretty smooth. The biggest problem was that Google Video has been having some problems, and took most of a week to actually make the videos available for viewing after I uploaded them. Also, Google’s processing reduced the quality to the point that you can’t easily read the presentation slides. Next time around, I might see about seeding them in BitTorrent as well as uploading to Google Video; and of course I’ll make my slides available in PDF alongside the video.

Web Site Issues: Resolution

Filed under: Uncategorized — cdsmith @ 7:50 pm

Well, after the catastrophic failure of my web site earlier this week, I’ve put everything back at http://www.pphsg.org/cdsmith.

In case you’re wondering, the PPHSG is a group I started in 2001 to organize science and math education activities for home-schooled children. At the time, I had a neighbor who was home-schooling. The PPHSG still continues to this day, but I’ve passed on leadership to some actual home-schooling parents who have decided not to use the web site. So voila, instant replacement personal web site!

Next Page »

Blog at WordPress.com.