Monday, September 29, 2008

Overlap

Though the typical mathematics-heavy curriculum for 'Computer Science' studies seems to suggest otherwise, even in its simplest form with only one dimension this is probably the most advanced mathematical problem many programmers ever encounter in their jobs: "Do two intervals overlap?" Naturally, a whole range of applications need to have such a check in place with respect to the scheduling of resources within a time frame.

Over time I've stumbled on a number of implementations in production code by different programmers who undoubtedly set out to take the task very seriously. I imagine sheets of paper with pairs of bars in all mutual configurations possible... (No, my imagination is not usually that creative... Let's say I know someone... And no, I am not mocking anybody.) Still, the resulting code was either overly complicated or flat out wrong, possibly because one possible configuration of two intervals was overlooked. The possibility of open ended intervals, sometimes in combination with using separate fields for date and time, didn't help in keeping the intent of the code immediately obvious, let alone easy to review for correctness.

Of course the solution to the 'interval overlap' question is very simple and the code can be very short and understandable. In mathematical terms: two intervals [a1, a2] and [b1, b2] overlap if and only if a1 < b2 ∧ a2 > b1.

The details of writing that in your language of choice, dealing with open ended intervals and separate fields for date and time are left as an exercise to the reader.


"But wait...", you say. "That looks so simple it can't be true. That expression can't possibly cover all the bar configurations I drew on these sheets of paper... Oh well, I can check with them of course. Hmm, it seems it might be true after all..."

That was my voice (perhaps without the drama added here for effect). My colleague challenged me to proof that the expression above holds after all. Only months later, after encountering yet another piece of complex code that was written with the exact same goal, did I sit down to take up the challenge. Trying to remember from 'Analytics' class how one goes about proving stuff, not much more came up than just the phrases 'Complete Induction' and 'Complete Intimidation'. The latter proving technique is pretty attractive and often rather effective as well, but it's really frowned upon by the elite (of which I obviously want to be part desperately). As the Induction technique didn't seem suitable either... well, I came up with the following, which I think is conclusive enough:

Intervals [a1, a2] and [b1, b2] overlap if a c exists that is part of both intervals:
   a1 ≦ c ≦ a2
   b1 ≦ c ≦ b2

From these equations one can easily deduce that:
    (a1 ≦ b2) ∧ (b1 ≦ a2)

Note that when
    (a1 = b2) ∨ (b1 = a2)
the overlap has a length of 0, which you wouldn't consider an actual overlap. So, for an overlap with length>0 the condition remains:
    a1 < b2 ∧ a2 > b1

Q.E.D.

Please let me know if I'm oversimplifying things... although you may be forgiven for thinking I just 'overcomplicated' the issue. And in case you were really looking for something slightly more challenging, please have a go at this. ;-)

Update:An anonymous commenter points out that I was indeed oversimplifying and provides the remainder of the proof. Thanks for that! ;-)

I really like it when a little thinking helps to keep our code base maintainable by expressing the logic as compactly (yet readable) as possible! I do hope however that the above is hardly my most important contribution towards that goal...

1 comment:

  1. What you have proved is "if [a1, a2] and [b1, b2] overlap then a1 <= b2 ∧ a2 >= b1". But for an "if and only if", you also need to prove it the other way, i.e that if a1 <= b2 ∧ a2 >= b1 then [a1, a2] and [b1, b2] overlap. So assume that a1 <= b2 ∧ a2 >= b1 and show that it necessarily follows that there must be at least one element in both [a1,a2] and [b1,b2].

    To do so, you could consider the two numbers a1 and b1. One of the following statements must be true:
    1) a1 = b1
    2) a1 < b1
    3) a1 > b1

    If a1 = b1, then we are done because a1 is in both ranges.

    If a1 < b1, recall we have assumed that a2 >= b1. So
    a1< b1 <= a2. Thus b1 is in [a1,a2] and so is in both ranges.

    If a1 > b1, recall we have assumed that a1 <= b2. So b1 < a1 <= b2. Thus a1 is in [b1,b2] and so is in both ranges.

    For each possibility we have shown that it follows that there exists at least one one element in both [a1,a2] and [b1,b2]. Q.E.D.

    I disagree that you should change the <= to <
    I would consider a1=b1 to be an actual overlap - the ranges would then overlap at a point.

    ReplyDelete