John Stracke’s Work


The M1 Project

The M1 project was eCal’s effort to build a new version of our Web-based calendar. We decided that none of the existing technologies really met our needs, so I got to build what was, in essence, a new sort of application server. I was given one constraint: it had to be in Perl, since that’s what the team had been using (and getting good results with). I took a fresh approach to the problem; my goals were:

The approach I took was inspired by an idea I picked up while working at Netscape, from a presentation on the Magellan rendering engine which was planned for Javagator. (Javagator was never released, but I believe the Magellan design was eventually used for Gecko, the rendering engine for Mozilla.) The basic idea of Magellan is that each HTML tag is implemented by a separate class; the class’s constructor takes an HTML element (and its children) and tells its parent element how much space the element needs on the page. Each parent element then takes responsibility for arranging its children and computing how much space it needs for them. In essence, it’s very similar to the layout techniques used by GUI toolkits, where each parent widget takes responsibility for laying out its children. (It might be interesting to try making this correspondence explicit, by implementing a layout engine on top of custom widgets corresponding to HTML tags. The resulting widgets could be reused by any browser-like app.)

So. In applying this idea to HTML generation, my tactic was to have handler classes registered for various custom tags; each class had a "begin" method (for start tags) and an "end" method (for end tags). (Tags for which there was no handler registered were assumed to be plain HTML tags, and were passed through unmodified.) In addition, a begin method could grab the parser (much as an X Windows application can grab the mouse) so that the current handler would be invoked for all elements (or text) inside the current element; this was used for elements which needed to control the processing of their child elements. For example, an <if> element whose condition came out false would grab the parser and not let any of its children be processed.
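The dispatch-and-grab scheme can be sketched as follows. This is a minimal illustration in Python rather than the original Perl; the names `Engine`, `IfHandler`, `grab`, and `grabbed`, and the `var` attribute on `<if>`, are all hypothetical and are not taken from the M1 code.

```python
from html.parser import HTMLParser


class IfHandler:
    """Hypothetical handler for <if var="...">: hides children when false."""

    def begin(self, engine, attrs):
        if not engine.variables.get(attrs.get("var")):
            engine.grab(self)  # take over until our matching end tag

    def end(self, engine):
        pass

    def grabbed(self, engine, kind, payload):
        pass  # every nested tag or text event lands here and is dropped


class Engine(HTMLParser):
    def __init__(self, handlers, variables):
        super().__init__(convert_charrefs=True)
        self.handlers = handlers
        self.variables = variables
        self.out = []
        self.grabber = None
        self.grab_tag = None
        self.grab_depth = 0
        self.current_tag = None

    def grab(self, handler):
        self.grabber = handler
        self.grab_tag = self.current_tag
        self.grab_depth = 1

    def handle_starttag(self, tag, attrs):
        if self.grabber:
            if tag == self.grab_tag:
                self.grab_depth += 1  # nested element of the same kind
            self.grabber.grabbed(self, "start", (tag, attrs))
            return
        handler = self.handlers.get(tag)
        if handler:
            self.current_tag = tag
            handler.begin(self, dict(attrs))
        else:
            self.out.append(self.get_starttag_text())  # plain HTML: pass through

    def handle_endtag(self, tag):
        if self.grabber:
            if tag == self.grab_tag:
                self.grab_depth -= 1
            if self.grab_depth > 0:
                self.grabber.grabbed(self, "end", tag)
                return
            self.grabber = None  # matching end tag: release the grab
        handler = self.handlers.get(tag)
        if handler:
            handler.end(self)
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.grabber:
            self.grabber.grabbed(self, "text", data)
        else:
            self.out.append(data)


def render(template, variables):
    engine = Engine({"if": IfHandler()}, variables)
    engine.feed(template)
    engine.close()
    return "".join(engine.out)
```

With this sketch, `render('<p>Hi</p><if var="admin"><b>secret</b></if>!', {"admin": False})` gives `<p>Hi</p>!`, and the same template with `admin` set to `True` lets the `<b>` element through unmodified.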

The second major feature of the engine was a system of variables. There were <get> and <set> tags for getting and setting variable values, and <get-cookie> and <set-cookie> for accessing cookies. This was important, because one big decision made to improve scalability was that all state information should be stored on the client, rather than in a database keyed by session ID (as is commonly done in ASP, for example). This meant slightly greater bandwidth use, as all state variables had to be sent over the wire; but it improved scalability by reducing load on the bottleneck database.
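To illustrate the client-side state idea, here is a rough Python sketch of packing all state variables into a single cookie value and unpacking them on the next request. The JSON-plus-base64 encoding and the function names `encode_state` and `decode_state` are assumptions made for the example; the original engine's wire format isn't described here.

```python
import base64
import json


def encode_state(state):
    """Pack every state variable into one cookie-safe string."""
    payload = json.dumps(state, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(payload).decode("ascii")


def decode_state(value):
    """Rebuild the state variables from the cookie value sent by the client."""
    if not value:
        return {}  # first visit: no state yet
    return json.loads(base64.urlsafe_b64decode(value))
```

The server keeps nothing between requests: each response carries `encode_state(...)` back to the client, and each request starts by calling `decode_state(...)` on whatever the client returned. That is where the extra bandwidth goes, and also why the database never sees a session lookup.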

And that was it; that was about all one had to know about the engine in order to write HTML pages. There was some authentication functionality, but only login pages needed to worry about it; other pages could just count on the engine to apply the authentication rules before running them. And there was one other convention we used, which was that, if handling a form required complex logic, then that logic was embodied in a script rather than an HTML template. This let us further separate logic from presentation. By convention, the scripts would receive two CGI arguments, forelink and backlink, which were the HTML pages to redirect to on success or failure, respectively. This way, one could write separate UIs that used the same form handlers.
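The forelink/backlink convention might look roughly like this in Python. Only the two argument names come from the convention above; `handle_form`, `add_event`, and the use of `ValueError` to signal failure are hypothetical.

```python
from urllib.parse import parse_qs


def handle_form(query_string, action):
    """Run `action` on the form fields and return the page to redirect to."""
    params = {key: values[0] for key, values in parse_qs(query_string).items()}
    forelink = params.pop("forelink")  # where to go on success
    backlink = params.pop("backlink")  # where to go on failure
    try:
        action(params)
    except ValueError:
        return backlink
    return forelink


def add_event(fields):
    """Hypothetical form logic: reject events that have no title."""
    if not fields.get("title"):
        raise ValueError("title is required")
```

Because the handler never names a page itself, two different UIs can post to the same script and each get sent back to its own pages.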

It was all very straightforward, and the team loved it. We did occasionally have to add new custom tags for features I hadn’t thought of; the most significant was <list>, which corresponded to Perl’s foreach or Lisp’s mapcar. <list> came about because the calendar UI needed to be able to display lists of items (e.g., all the events scheduled for a day), and we didn’t want to have to write special-purpose tags for them. (Apart from the effort involved, it would have meant that a customer could not have customized the list without writing code.) So, instead, we had <list>, which took an array variable (probably populated via a database call) and a fragment of HTML, and repeated that fragment once for each element of the array.
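The behaviour of <list> can be sketched in a few lines of Python. How the fragment referred to the current element isn't stated above, so the `<get name="item">` placeholder, the `item_name` parameter, and the helper names are assumptions for illustration only.

```python
import re
from html import escape

# Hypothetical syntax for reading a variable inside a fragment.
GET_TAG = re.compile(r'<get name="(\w+)"\s*/?>')


def expand_get(fragment, variables):
    """Replace each <get name="..."> with the (escaped) variable value."""
    return GET_TAG.sub(
        lambda m: escape(str(variables.get(m.group(1), ""))), fragment
    )


def expand_list(fragment, items, variables, item_name="item"):
    """Repeat `fragment` once per element, binding the element to `item_name`."""
    out = []
    for item in items:
        scope = {**variables, item_name: item}
        out.append(expand_get(fragment, scope))
    return "".join(out)
```

For example, `expand_list('<li><get name="item"></li>', ["Lunch", "Standup"], {})` returns `<li>Lunch</li><li>Standup</li>`. The fragment is ordinary HTML, so changing how the list looks means editing the template, not the code.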

It performed pretty well, too, once I’d spent some time speeding it up. One of the most interesting speedups came when I started compiling template pages down to Perl, instead of just interpreting them. This in turn provided the opportunity for an additional improvement: tag handlers could provide specialized code to insert into the generated code. So, for example, since <get> tags were so common, we sped them up by having the generated code be (roughly) print(get("foo")) instead of getHandler("get")->start(attrs). And <list> got sped up immensely by having it generate a for loop, containing the generated code of the template HTML, instead of calling the interpreter in a similar loop.
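Here is a toy version of the compile step, generating Python source where the real engine generated Perl. The tag syntax (`<get name="...">`, `<list name="...">`), the `item` loop variable, and the function names are assumptions; the point is only that <get> becomes an inline expression and <list> becomes a real for loop in the generated code.

```python
import re

TOKEN = re.compile(r'<get name="(\w+)"\s*/?>|<list name="(\w+)">|(</list>)')


def compile_template(template):
    """Translate a template into Python source defining render(variables)."""
    lines = ["def render(variables):", "    out = []", "    scope = dict(variables)"]
    indent = 1
    pos = 0

    def emit(code):
        lines.append("    " * indent + code)

    for match in TOKEN.finditer(template):
        text = template[pos:match.start()]
        if text:
            emit(f"out.append({text!r})")  # literal HTML between tags
        get_name, list_name, list_end = match.groups()
        if get_name:
            # Specialized inline code instead of a handler dispatch.
            emit(f"out.append(str(scope.get({get_name!r}, '')))")
        elif list_name:
            # <list> becomes a for loop wrapped around the compiled body.
            emit(f"for item in scope.get({list_name!r}, []):")
            indent += 1
            emit("scope['item'] = item")
        elif list_end:
            indent -= 1
        pos = match.end()
    if indent != 1:
        raise ValueError("unbalanced <list> tags")
    tail = template[pos:]
    if tail:
        emit(f"out.append({tail!r})")
    emit("return ''.join(out)")
    return "\n".join(lines)


def load_template(template):
    """Compile once, then reuse the returned function for every request."""
    namespace = {}
    exec(compile_template(template), namespace)
    return namespace["render"]
```

Calling `load_template('<ul><list name="events"><li><get name="item"></li></list></ul>')` and then invoking the result with `{"events": ["Lunch", "Standup"]}` yields `<ul><li>Lunch</li><li>Standup</li></ul>`. The template is parsed once at load time; each request just runs the generated loop. (This toy skips HTML escaping and any other tags.)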

The upshot of the whole thing was that it was an enormous success, technically. (The company was unable to sell it and went under, but that was due more to some strategic problems, such as failing to predict that moving from hosted consumer calendars to the enterprise market would put us in head-to-head competition with Microsoft and IBM. The fact that nobody wanted to spend money on IT in the summer of 2001 was another problem, of course.) The product was easy to build, easy to maintain, and fabulously scalable: we wound up able to support 1,000,000 users on a fairly modest hardware platform (two Sun ES450s, I believe), yielding a per-seat hardware cost 1/10 that of Exchange. (In fact, Exchange can’t handle 1,000,000 users in any case. I have a friend whose company has 110,000 users on Exchange, and it performs miserably. They asked Microsoft for help getting it to scale better, and the response was, broadly, "We can’t help you; we didn’t think it would ever get that far anyway. Could you tell us how you did it?") It was great.

I’ve since learned that the M1 design may be similar to the design Paul Graham describes for Viaweb (now Yahoo! Store). Someday I’d like to combine the two ideas.
