
How the mungenetengine works

By Nick Hodge | November 30, 1999

The database

There are three basic tables in the MySQL database. The first contains content and is referenced by mcid. The second contains any binary content (images, PDFs) and is referenced by miid. The third contains sections, referenced by msid, which are a mechanism for organising and classifying content and images.
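
As a rough illustration of this layout, here is a minimal sketch in Python using the standard sqlite3 module in place of MySQL. Only the key names mcid, miid and msid come from the description above; the table names, the other columns and the `open_database` helper are assumptions made for the example, not the engine's actual schema.

```python
import sqlite3

# Hypothetical schema: only mcid, miid and msid are named in the article.
# Every table name and every other column is an assumption.
SCHEMA = """
CREATE TABLE section (
    msid   INTEGER PRIMARY KEY,
    parent INTEGER REFERENCES section(msid),
    title  TEXT NOT NULL
);
CREATE TABLE content (
    mcid INTEGER PRIMARY KEY,
    msid INTEGER NOT NULL REFERENCES section(msid),
    body TEXT NOT NULL
);
CREATE TABLE image (
    miid INTEGER PRIMARY KEY,
    msid INTEGER NOT NULL REFERENCES section(msid),
    mime TEXT NOT NULL,
    data BLOB NOT NULL
);
"""


def open_database(path=":memory:"):
    """Create the three basic tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Each content and image row points at exactly one section, and each section row points at an optional parent section.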

Around these three tables live some utility tables that contain basic logging information (a simple page count per fragment) and caching. Each page, when first rendered, is cached to a separate table. This enables a faster response on the second and subsequent hits. To ensure that only current pages are cached, there is a simple cache tree implemented such that when a record is changed or deleted, the cache is pruned. (The page cache is presently disabled.)
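
The pruning idea can be sketched as follows, in Python and in memory rather than in a database table. The article does not say how the cache tree is stored, so this assumes one plausible reading: each cached page remembers which fragments it was built from, and a change to any of them drops the page. The `PageCache` class and its method names are hypothetical.

```python
class PageCache:
    """In-memory stand-in for the cache table (an assumed design)."""

    def __init__(self):
        self._pages = {}    # page id -> rendered html
        self._used_by = {}  # fragment id -> ids of pages built from it

    def get(self, page_id):
        """Return the cached html, or None if the page is not cached."""
        return self._pages.get(page_id)

    def put(self, page_id, html, fragment_ids):
        """Store a rendered page and remember every fragment it used."""
        self._pages[page_id] = html
        for fragment_id in fragment_ids:
            self._used_by.setdefault(fragment_id, set()).add(page_id)

    def prune(self, fragment_id):
        """Drop every cached page built from a changed or deleted fragment."""
        for page_id in self._used_by.pop(fragment_id, set()):
            self._pages.pop(page_id, None)
```

After `prune`, the next request for an affected page finds nothing in the cache and renders it afresh.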

These sections provide the underlying hierarchical structure of the site. This structure is tightly bound to the navigation through the site. As this is in the database, it is very easy to change the underlying navigation structure without changing how individual pieces of content are referenced. Usually, a site's structure is defined as a folder hierarchy on the server. As soon as you start moving the folders around, the references from one piece of content to another break, forcing a mass update of the site. External search engines that may have the links stored in their database get 404-style errors. In the mungenetengine, each content fragment, image and section is referenced individually, without relation to its place in the navigation structure.
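
A small Python sketch shows why the links survive a reorganisation. The data and the `mcid` query parameter are assumptions made for illustration; the article only says that content is referenced by its id and that the engine script is called mne.php.

```python
# Hypothetical data: each section id (msid) maps to its parent section,
# and each content id (mcid) maps to the section that holds it.
section_parent = {1: None, 2: 1, 3: 1}
content_section = {10: 2}


def content_url(mcid):
    """Build a link from the content id alone (parameter name is assumed)."""
    return f"mne.php?mcid={mcid}"


def section_path(msid):
    """Walk up the parents to find where a section currently sits."""
    path = []
    while msid is not None:
        path.append(msid)
        msid = section_parent[msid]
    return list(reversed(path))
```

Re-parenting section 2 under section 3 changes what `section_path(2)` returns, but `content_url(10)` is untouched, so no stored link goes stale.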

Each fragment of content or image belongs to a section. These sections are the hierarchy behind the site: each section can belong to another section, and a particular section can have multiple child sections. A section can only have one parent. As you move to another section, or to a page within another section, the mungenetengine (renamed to mne.php) can generate a navigation fragment, which is a template-driven hierarchy containing links to the parent section, sibling content pages or sections, and any children.
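
The lookup behind such a navigation fragment might look like the Python sketch below. It covers sections only and leaves the template step out; the `navigation` function and the dictionary layout are assumptions, not the engine's code.

```python
def navigation(sections, msid):
    """Return the parent, sibling and child section ids for one section.

    `sections` maps each msid to its parent msid (None for the root),
    which encodes the rule that a section has exactly one parent.
    """
    parent = sections[msid]
    siblings = sorted(
        other for other, p in sections.items() if p == parent and other != msid
    )
    children = sorted(other for other, p in sections.items() if p == msid)
    return {"parent": parent, "siblings": siblings, "children": children}
```

A template would then turn the three lists into links.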

In fact, each content entry is just a fragment of html. A whole page may actually comprise other referenced content elements, each inserted into the final html that you see in the browser. Building one page from others is a process called templating. It's not as simple as one page referencing another; the process is recursive, with one page referring to another until the final page is built. The concept here is to enter data only once into the database. Each content entry can insert other fragments (or images) into the html output.
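
Recursive templating of this kind can be sketched in a few lines of Python. The `{{content:N}}` placeholder syntax is invented for the example, since the article does not show how one fragment refers to another, and the guard against a fragment including itself is an addition of the sketch.

```python
import re

# Hypothetical placeholder syntax; the engine's real markers are not shown.
PLACEHOLDER = re.compile(r"\{\{content:(\d+)\}\}")


def render(fragments, mcid, _active=()):
    """Expand one fragment, recursively inlining the fragments it refers to.

    `fragments` maps each mcid to its stored html. `_active` tracks the
    chain of fragments being expanded so that a cycle raises an error
    instead of recursing forever.
    """
    if mcid in _active:
        raise ValueError(f"fragment {mcid} includes itself")

    def expand(match):
        return render(fragments, int(match.group(1)), _active + (mcid,))

    return PLACEHOLDER.sub(expand, fragments[mcid])
```

Because each fragment is stored once and pulled in by reference, editing a shared footer fragment changes every page that includes it.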

There are different types of content fragments. For instance, one is an external link: these contents are resolved and wrapped into an HREF-style link.

Images are in fact binary objects. They are wrapped into the final html, except that they refer back to the mungenetengine, which grabs the images out of the database and serves them up. The html fragments that are generated automatically add the width/height, alts and hrefs normally associated with images, if they are in the database. PDFs and other non-image content can be linked to so the user can download them, or embedded into the HTML fragment.
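
Generating the image markup could look roughly like this Python sketch. The `image_tag` function and the `miid` query parameter are assumptions; the article only says that the tag points back at the engine and carries whatever width, height, alt and href values are stored.

```python
from html import escape


def image_tag(miid, width=None, height=None, alt=None, href=None):
    """Build an <img> tag, adding only the attributes stored for the image."""
    # The URL shape is assumed: the src points back at the engine script.
    attrs = [f'src="mne.php?miid={int(miid)}"']
    if width is not None:
        attrs.append(f'width="{int(width)}"')
    if height is not None:
        attrs.append(f'height="{int(height)}"')
    if alt is not None:
        attrs.append(f'alt="{escape(alt, quote=True)}"')
    tag = "<img " + " ".join(attrs) + ">"
    if href is not None:
        # Wrap the image in a link when an href is stored for it.
        tag = f'<a href="{escape(href, quote=True)}">{tag}</a>'
    return tag
```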

SWF and SVG can also be stored in the database. When rendering out these elements, the engine generates object embed tags. The width and height are not read from either format as yet. Whilst it is possible to read the 'twips' from a SWF, it seems overly complex to do just yet.

The engine can also generate what are known as breadcrumbs (next and previous links). This is easy, as the structure and relationships between objects are known and map to the navigation.
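
Given the ordered list of content ids within a section, the next and previous links fall out directly. This Python sketch assumes such an ordered list is available; how the engine orders content is not stated, and `next_previous` is a hypothetical name.

```python
def next_previous(ordered_ids, current):
    """Return (previous, next) ids around `current`, or None at either end."""
    i = ordered_ids.index(current)
    previous_id = ordered_ids[i - 1] if i > 0 else None
    next_id = ordered_ids[i + 1] if i + 1 < len(ordered_ids) else None
    return previous_id, next_id
```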

PHP, the server-side scripting language, munges the data together from the MySQL database and serves it up. As of early February 2003 this code is about 1870 lines long. I am testing this on Windows XP and transferring it here to a Linux box for production. I really only have a choice of three server-side scripting systems (PHP, Perl or Python). Learning PHP has been a great experience.

The next stage is to transfer the experience here into using a better language that has more object orientation. I have chosen Python as the language to accomplish this task. How long this takes, only time will tell!

Topics: mungenet