
Database Driven Content

Many organisations hold information in a database which they would like to make available over the Web. This data also represents potential content that can be indexed by search engines. More content means more keyword coverage, more PageRank to be distributed within the site and more potentially relevant anchor text in links.

The database will usually be accessible through some kind of HTML form interface. This allows the user to retrieve data based on some search parameters. Forms are too complicated for search engine robots to use, so an alternative approach is needed. The data can either be exported to a set of static pages or it can be accessed programmatically in a search engine friendly way.

It seems a pity to turn all that nice, dynamic data into a set of static pages. If the database is rarely or never updated this can be a satisfactory approach, at the cost of storing the data twice: once in the database and once in the exported pages. The programmatic approach has the advantage of keeping the data in one place, at the expense of performance and complexity. Each time a page is accessed, a program is run on the Web server which retrieves the data from the database management system (DBMS). What is effectively being created is a bespoke content management system, with all that implies for project management, cost and security.

As an example we may have a database that holds information on motorcars. I worked on just such a system as part of the OneSwoop.com online car retailer. The data is organised as manufacturer, model name and description. To make the URLs search engine friendly we could structure the system so that the resource:

    cars

retrieves information about all the car manufacturers in the database. It is effectively a contents page and an entry point to the cars database. The page is a Common Gateway Interface (CGI) program that selects all of the car manufacturers from the database and displays them as a series of hyperlinks.
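As a rough illustration of such an entry page, here is a minimal Python sketch. It is not the code used on the original system. The `cars` table, its column names, the sample rows and the absolute `/cars/...` link form are all assumptions made for the example, and SQLite stands in for whatever DBMS is actually used.

```python
import sqlite3
from html import escape
from urllib.parse import quote


def manufacturers_page(conn):
    """Build an HTML list with one hyperlink per manufacturer."""
    rows = conn.execute(
        "SELECT DISTINCT manufacturer FROM cars ORDER BY manufacturer"
    )
    links = [
        # Lower-case, URL-quoted name in the link; escaped name as the text.
        f'<li><a href="/cars/{quote(name.lower())}">{escape(name)}</a></li>'
        for (name,) in rows
    ]
    return "<ul>\n" + "\n".join(links) + "\n</ul>"


def demo_connection():
    """In-memory database with made-up sample rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cars (manufacturer TEXT, model TEXT, description TEXT)"
    )
    conn.executemany(
        "INSERT INTO cars VALUES (?, ?, ?)",
        [
            ("Ford", "Focus", "sample description"),
            ("Ford", "Fiesta", "sample description"),
            ("Vauxhall", "Astra", "sample description"),
        ],
    )
    return conn


if __name__ == "__main__":
    print(manufacturers_page(demo_connection()))
```

Every link on the page is an ordinary hyperlink with no form and no query string, which is what lets a robot follow it.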

Clicking on the link:

    cars/ford

will retrieve all the car models that Ford manufactures from the database, displaying each one as a hyperlink. Finally:

    cars/ford/focus

retrieves information about the Ford Focus from the database. From a search engine's viewpoint it is accessing a hierarchy of static pages. From a programmer's viewpoint there may be a single program that uses information in the URL to decide what action to perform. If you are familiar with Web programming you may remember that parameters to a program are normally passed in a section of the URL called the query string, that is, everything after the question mark '?':

    cars?manufacturer=ford&model=focus

Here the 'cars' CGI script is passed the manufacturer and model name as parameters. A search engine may index only the root cars page, and some dislike following links with query strings; such links are said to be search engine unfriendly. The solution is to create pseudo-static URLs that look like the ones we saw earlier in this section and then use some trickery to extract the parameters from the URL. The popular Apache server has an extension called mod_rewrite that can do this for us. Users of Microsoft's Internet Information Server (IIS) should check out ISAPI Rewrite. Both are powerful but have a complicated syntax.
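To make the idea concrete, the following Python sketch mimics what a rewrite rule does: it turns a pseudo-static path into the query string form and then lets a single program pick an action from the parameters. It is a stand-in for the technique, not mod_rewrite or ISAPI Rewrite syntax. The function names, the action names and the regular expression are assumptions made for illustration.

```python
import re
from urllib.parse import parse_qs, urlencode

# Matches "cars", "cars/ford" and "cars/ford/focus" (hypothetical pattern).
PRETTY_URL = re.compile(
    r"^/?cars(?:/(?P<manufacturer>[^/?]+))?(?:/(?P<model>[^/?]+))?/?$"
)


def rewrite(path):
    """Map a pseudo-static path to its query string form, or None if no match."""
    match = PRETTY_URL.match(path)
    if match is None:
        return None
    # Keep only the path segments that were actually present, in URL order.
    params = [
        (name, match.group(name))
        for name in ("manufacturer", "model")
        if match.group(name)
    ]
    return "cars?" + urlencode(params) if params else "cars"


def choose_action(url):
    """Decide which page the single 'cars' program should produce."""
    _, _, query = url.partition("?")
    params = parse_qs(query)
    manufacturer = params.get("manufacturer", [None])[0]
    model = params.get("model", [None])[0]
    if manufacturer and model:
        return ("show_model", manufacturer, model)
    if manufacturer:
        return ("list_models", manufacturer)
    return ("list_manufacturers",)


if __name__ == "__main__":
    for path in ("cars", "cars/ford", "cars/ford/focus"):
        internal = rewrite(path)
        print(path, "->", internal, "->", choose_action(internal))
```

The search engine only ever sees the left-hand form, while the program still receives ordinary named parameters. In a real deployment the mapping would normally live in the Web server's rewrite configuration rather than in application code.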

Search Engine Optimization Book            

See Also

Content Management Systems
