The Story So Far
I thought I couldn’t do it. I thought I wouldn’t do it. But I was wrong. Behold, the thus-far-unnamed homegrown blogging backend written entirely by yours truly. It has bugs, as I’m sure you will notice, but I promise, this version is much better than it was a few builds ago (not that the code is compiled or anything), when my statistics module was reporting upwards of four hundred SQL queries on every request. For those who don’t speak computer, this means that, each time someone went to Organon, the database where all the entries are stored would be hit 400 times for various pieces of data in order to build the page. By comparison, a “good” or “normal” amount is around 25 (hopefully even less than that, if possible).
And so now we’re down to a lovely eighteen queries per request, which just plain kicks ass, considering the amount of code there is behind-the-scenes.
Your next thought is probably, “If there is so much backend code, why does the page look so horrible? I mean Arial? C’mon, you can do better than that!”
Yes, I can do better than that, but no one has the time to finish a custom blogging system in a night, and I certainly didn’t either. For now, this is what you get. But I’m on a roll, so look out for more features/better design in the next week or two.
Under the Hood
For you programmers out there, or if you’re just curious, I thought I’d write a bit about the more ingenious coding that makes up this project. I won’t describe everything line-by-line, but I’ll at least put in a few bits about the parts that I am particularly proud of.
Versatile Event System
There are many situations in programming when you need one part of the program to react to the actions of another part of the program automatically, without having to explicitly tell it to do so. To handle this, programmers use an event system, which basically acts as the go-between for the two separate parts of the program and communicates what each part is doing so that the other can react.
Now, most event systems (there is probably a better term for this, but I’ll just use mine instead) are complex and require a lot of coding around by the programmer to make sure that they will work correctly. Mine, however, is almost completely transparent, only coming into being when I need it. Here’s how it works:
An object called Page is designed to construct the HTML page that users see. In the Page object’s constructor function, which is essentially the “setup” program for that particular object (objects are parts of a program), the Events object (the event system) is passed to the Page object so that it can be used. This is the only prerequisite for using the event system: any object that uses it must have it passed to it in its constructor.
Once the Page object has finished building the page, it needs to connect to the database and perform a query to get some final data. When the query is performed, the Page object also calls the recordEvent function of the Events object; in other words, the Page object is telling the Events object what it has done. At the same time, the Page object passes the query_id of that particular query to the Events object as well, just so that other objects that are watching for that particular event (a database query) to take place have a little bit more to work with. Objects that watch for a certain event are called “listeners” or “observers”.
Let’s recap a bit. We have a Page object that has just sent the Events object a quick notification that it is doing a database query. How will the Events object respond?
Before we get to that, I should note that not every object can be a listener. If an object wants to be a listener, it must first register with the Events object, telling it that a) it wants to listen, b) which events it wants to listen to, and c) what it will do if a particular event occurs. In our example, we have a third object, Statistics, that has already registered with the Events object and told it that, if a SQL_QUERY_EXECUTED event occurs (a database query), the Events object should call one of the methods of the Statistics object, “incrementQueryCount”.
Now, what is a method, exactly? Well, if an object is a part of a program, then a method is a part of an object. More specifically, it is a block of code that has the sole purpose of performing a specific function, which in this case is to add one to the query count. When the Page object executes a database query, the event is recorded by the Events object, which then looks through its registry (nothing like the Windows registry, don’t worry) for objects such as Statistics that have requested that Events tell them that this event has occurred. When it sees that Statistics is one of these objects, Events acts according to Statistics’ initial instructions - it calls the “incrementQueryCount” method, which adds one to the query count statistic.
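For the programmers following along, here is a rough sketch of that flow in Python. It is not the actual code behind Organon. The class names Events, Page, and Statistics and the SQL_QUERY_EXECUTED event come from the description above, and record_event and increment_query_count mirror recordEvent and incrementQueryCount. The register method, the keyword-argument details, the build method, and the query_id value are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List

SQL_QUERY_EXECUTED = "SQL_QUERY_EXECUTED"


class Events:
    """Go-between that keeps a registry of listeners and notifies them."""

    def __init__(self) -> None:
        # Maps an event name to the callbacks registered for it.
        self._registry: Dict[str, List[Callable[..., Any]]] = {}

    def register(self, event_name: str, callback: Callable[..., Any]) -> None:
        """A listener says which event it wants and what to call when it occurs."""
        self._registry.setdefault(event_name, []).append(callback)

    def record_event(self, event_name: str, **details: Any) -> None:
        """Called by whoever did something; forwards the news to every listener."""
        for callback in self._registry.get(event_name, []):
            callback(**details)


class Statistics:
    """Listener: counts database queries without knowing who runs them."""

    def __init__(self, events: Events) -> None:
        self.query_count = 0
        events.register(SQL_QUERY_EXECUTED, self.increment_query_count)

    def increment_query_count(self, **details: Any) -> None:
        self.query_count += 1


class Page:
    """Reports its queries to Events; it never refers to Statistics."""

    def __init__(self, events: Events) -> None:
        # The only prerequisite: Events is passed in through the constructor.
        self._events = events

    def build(self) -> str:
        query_id = 1  # hypothetical stand-in for a real query handle
        self._events.record_event(SQL_QUERY_EXECUTED, query_id=query_id)
        return "<html>...</html>"


if __name__ == "__main__":
    events = Events()
    stats = Statistics(events)
    Page(events).build()
    print("queries so far:", stats.query_count)
```

Notice that Page and Statistics each hold a reference to Events but never to each other, which is the whole point of the go-between.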
So how is this useful? Well, though I only said that Page was trained to report a database query as SQL_QUERY_EXECUTED, all other objects were instructed to do so as well. As the program runs, dozens of SQL_QUERY_EXECUTED reports are forwarded to Events, and each time it tells Statistics to increment its query count by one. Just before ending the program’s execution, after all instructions have been followed and most of the page has already been sent to the user, the program will get the final query count from the Statistics object and put it somewhere on the page. (You can see this on Organon at the bottom of every page.) While knowing the query count is of no use whatsoever to visitors, it is extremely important to programmers. Had I not checked the query count at the bottom of the page, I would never have known that I was racking up over 400 queries per page view, and most likely that error would not have been fixed. So indirectly the query count does help you, the visitor, because it helps the programmer keep the page generation time low.
There you go, a crash course in object-oriented programming, which is probably a college-level subject. I know it took me a while to get my head around it. But if it’s all you know to begin with, it really isn’t a hard thing to grasp.
The real reason that I was so ecstatic over developing this code (only 75 lines, which is unbelievable) was the way the Events object acts as a go-between: the listener and the listenee (?) never have to know the other exists. It’s sort of like the way the newsmedia acts as the middleman between an American and the war in Iraq. I don’t know every single thing that happens there, and I’m pretty sure there aren’t any Iraqis who know me or who I am, but I know about the important things that are happening because the newsmedia (like the Events object) gets reports from Iraq and passes the ones I’m interested in on to me. (Unlike the newsmedia, the Events object can’t be biased or warp the truth, however, so it is much more dependable.) The Iraq war and I can go on separately, but because I am “listening” to events happening there, I can then act on what I hear (though I probably wouldn’t do much in this case).
How was that? I was trying to teach you something without overwhelming you or babying you. That can be hard to do, because the line between them is a fine one. Only the best technology writers are able to walk it. (Not that I would count myself as one of the best, or even good.)
Edit: One More Thought on Events
Theoretically, my Events object could be used to link together an infinite number of other objects, resulting in a fully-autonomous program that would need only to be started up to function beautifully. Hmm…
Organized Data Pulling
When you click on a Read more link, a lot of things have to happen to get all the information pertaining to a single entry and present it on the page. The base data, such as the entry content, title, date, etc. is easy to grab. It can be done with one query, even. But there is also more data, such as the labels and their names, titles, and descriptions, as well as the comments (turned off for now) and the information for the user who posted the entry (also turned off). More queries must be performed to get each little chunk of information, sometimes two or more just for a small thing, like the names of an entry’s labels.
You see, in the database, most things rely upon unique identifiers to relate to one another. Every user (posting user, not visiting user), label, comment, and entry has its own unique ID that no other piece of data of its type will have. Rather than store information like this: “The user Aristotle has posted this entry,” it is better to do it like this: “The user with unique id #5 has posted this entry.” (Not in that format, of course, I’m just using English rather than SQL, which is the language that database queries are represented in.) Why is this better? Because users might want to change their username, and they can’t change their username and still keep authorship of their blog entries if the username is used to describe authorship. The unique ID number will never change and is unknown to the user, so it is ideal for this. Also, in the event that two users have the same username, there could be some problems figuring out which one was the actual author of an entry in the database. But since they each have unique ID numbers, it is easy for the database server to tell the difference.
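To make the ID idea concrete, here is a small sketch using Python’s built-in sqlite3 module and an in-memory database. The table and column names (users, entries, user_id) are made up for illustration and are not the real Organon schema. Looking the author up with a single JOIN is also just a choice to keep the example short.

```python
import sqlite3
from typing import Optional


def build_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with one user and one entry."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (
            id       INTEGER PRIMARY KEY,
            username TEXT NOT NULL
        );
        CREATE TABLE entries (
            id      INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL REFERENCES users(id),
            title   TEXT NOT NULL
        );
        -- "The user with unique id #5 has posted this entry."
        INSERT INTO users (id, username) VALUES (5, 'Aristotle');
        INSERT INTO entries (id, user_id, title) VALUES (1, 5, 'The Story So Far');
        """
    )
    return conn


def author_of(conn: sqlite3.Connection, entry_id: int) -> Optional[str]:
    """Return the current username of an entry's author, or None if not found."""
    row = conn.execute(
        "SELECT users.username FROM entries "
        "JOIN users ON users.id = entries.user_id "
        "WHERE entries.id = ?",
        (entry_id,),
    ).fetchone()
    return row[0] if row else None


def rename_user(conn: sqlite3.Connection, user_id: int, new_name: str) -> None:
    """Change a username; entries are untouched because they store only the ID."""
    conn.execute("UPDATE users SET username = ? WHERE id = ?", (new_name, user_id))
```

Calling rename_user changes only the users table. The entry still points at ID 5, so author_of picks up the new name without any entry being rewritten.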
When the base data for the entry is pulled from the database, the user ID for the entry’s author comes with it. In order to get the rest of the information about the author, such as their username and email address, another query is performed using the user ID to find the author’s information. Normally, this would all be squashed together with queries to get the comments and the labels and such. But I have chosen to farm out task-specific work to certain methods of the Entries object. For example, the “getUser” method does exactly that - it gets the information for the user who authored the entry. Likewise, the “getComments” method gets the comments, and the “getLabels” method gets the labels. All nice and simple, and all I have to do to put them all together is execute all the methods in rapid succession. In some respects, the methods provide plugin-like functionality, similar to the way the Flash plugin adds extra functionality to your web browser (which had better be Firefox or Safari). Again, we’re talking about separation of parts of code from one another. This is important because, if you have too many pieces of code all intermingling and breeding with one another (okay, maybe not breeding, but it could happen), any attempt to change one of the base pieces of code results in the entire program falling apart, like a house of cards. So you have to be careful.
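Here is a hedged Python sketch of what such an Entries object might look like, again using sqlite3 with an in-memory database. Only the Entries name and the three method names (written here as get_user, get_comments, get_labels) come from the description above. The schema, the get_entry and get_full_entry helpers, and the sample data are assumptions for illustration.

```python
import sqlite3
from typing import Any, Dict, List, Optional


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, email TEXT);
        CREATE TABLE entries (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
        CREATE TABLE labels (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE entry_labels (entry_id INTEGER, label_id INTEGER);
        CREATE TABLE comments (id INTEGER PRIMARY KEY, entry_id INTEGER, body TEXT);
        INSERT INTO users VALUES (5, 'Aristotle', 'aristotle@example.com');
        INSERT INTO entries VALUES (1, 5, 'The Story So Far');
        INSERT INTO labels VALUES (1, 'programming'), (2, 'meta');
        INSERT INTO entry_labels VALUES (1, 1), (1, 2);
        """
    )
    return conn


class Entries:
    """Each method fetches exactly one kind of data with its own query."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.row_factory = sqlite3.Row  # rows can be read by column name

    def get_entry(self, entry_id: int) -> Optional[Dict[str, Any]]:
        row = self._conn.execute(
            "SELECT id, user_id, title FROM entries WHERE id = ?", (entry_id,)
        ).fetchone()
        return dict(row) if row else None

    def get_user(self, user_id: int) -> Optional[Dict[str, Any]]:
        row = self._conn.execute(
            "SELECT id, username, email FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return dict(row) if row else None

    def get_labels(self, entry_id: int) -> List[str]:
        rows = self._conn.execute(
            "SELECT labels.name FROM entry_labels "
            "JOIN labels ON labels.id = entry_labels.label_id "
            "WHERE entry_labels.entry_id = ? ORDER BY labels.name",
            (entry_id,),
        ).fetchall()
        return [row["name"] for row in rows]

    def get_comments(self, entry_id: int) -> List[str]:
        rows = self._conn.execute(
            "SELECT body FROM comments WHERE entry_id = ? ORDER BY id", (entry_id,)
        ).fetchall()
        return [row["body"] for row in rows]

    def get_full_entry(self, entry_id: int) -> Optional[Dict[str, Any]]:
        """Run the task-specific methods one after another and combine the results."""
        entry = self.get_entry(entry_id)
        if entry is None:
            return None
        entry["user"] = self.get_user(entry["user_id"])
        entry["labels"] = self.get_labels(entry_id)
        entry["comments"] = self.get_comments(entry_id)
        return entry
```

Because each method stands alone, turning off comments (as they are for now) would only mean skipping one call in get_full_entry. Nothing else has to change.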
END
I hope you enjoyed my little outpouring of tutorial goodness, even if it was all on programming theory and not anything practical. I’ll try to put in some practical examples soon.
By the Way
RSS hasn’t been reimplemented. Guess you’ll have to live with my horrible design for a while, then you can go back to your fancy NetNewsWire and FeedDemon and Sage (which I use) and so on. And if you haven’t checked out RSS yet, I highly suggest it. Google for it, and I’m sure you’ll figure it out. No time to talk about it here. Late. Need sleep. *thunk*