Web-Enabled
Filesystem-Based Databases
Leo Liberti
Constructing a Web-enabled database is advantageous for many reasons.
Virtually all techniques for implementing product catalogs, shopping
carts, and even portal sites rely on a database containing the information
and some form of dynamic Web page construction, be it ASP, PHP,
or CGIs. The downside to having a Web-enabled database is that it
is difficult to set up and administer. The usual options are:
- Buy a commercial all-in-one product (e.g., Oracle).
- Contact a specialized computer services firm that will take
care of the job.
- Dowload, configure, install, and administer freely available
software packages (e.g., Apache, PHP, MySQL), and make them work
together seamlessly.
The first solution circumvents having to know all the nitty-gritty
details of systems administration, but it still requires quite a
lot of technical involvement (using Oracle or other commercial software,
however "graphical" the interfaces may appear, is usually
not an easy task). Furthermore, commercial software costs money.
The second solution simplifies administration, but it is usually
very expensive. The third solution costs no money at all, apart
from the cost of the hardware. However, it is not easy to set up
such a system from scratch. Although there are plenty of resources
(on the Web and elsewhere) that teach how to do this, admins may
not have the time or the inclination to spend hours or days reading
technical manuals.
Considerations
The worst problem in setting up a Web-enabled database exclusively
based on free software is that it requires a broad range of technical
skills:
- A systems administrator to set up and administer the operating
system (e.g., Linux) and make the Web server, database server,
and server-side scripting modules work together.
- A Webmaster to properly compile, configure, and maintain the
Web server (e.g., Apache).
- A DBA (database administrator) to properly compile, configure,
and maintain the database server (e.g., MySQL).
- A programmer to write server-side scripts (e.g., PHP) or CGI
programs.
- An HTML designer to write the HTML pages.
Because small firms can rarely afford to employ all of the professionals
described above, the pressure will be put on the systems administrator
to tackle all of the tasks. Because operating systems (especially
UNIX-like ones) represent data in the form of files, systems administrators
are usually very skilled at manipulating files and directories.
However, when manipulating databases (creation, data input, searches,
output data representation), junior administrators may encounter
more difficulties. Furthermore, there are cases when a full-fledged
database is either overkill or just not the best way to encapsulate
the data. This is the case with product catalog files, for example,
where each product needs an image and a description.
With these issues in mind, I propose a simple filesystem-based
database model with the necessary PHP scripts to generate the dynamic
HTML pages on the fly. One disadvantage to using files and directories
instead of a proper database system is the unsuitability for dealing
with large quantities of records, especially where each record field
is small. The main advantage is that inputting data into the system
can be as simple as copying or moving files around.
Overview and Requirements
The filesystem-based database I describe here is based on the
principle that a directory represents a record and a file represents
a field. This is a good data representation for databases where
each record contains fields with variable (and large) sizes. Because
a product catalog, for example, usually contains product images
and product descriptions for each listed product, it fits the requirements
well. Thus, our sample implementation will be a product database.
For this example, we assume that a company that sells old computer
hardware and systems is setting up the database. The database records
simply consist of product ID, name, price, image, and description.
The software requirements are:
1. Linux OS (or any other OS where Apache and PHP can be installed)
2. Apache Web server
3. PHP Web-scripting engine
(Installation and configuration of Linux, Apache, and PHP are
outside the scope of this article.)
Filesystem DB Administration
We'll start by creating the database:
cd /home/www
mkdir products
Note that the parent directory of products must be in the document
paths of the Web server. Assume that the Apache document root directory
(corresponding to the URL http://127.0.0.1/) has absolute path
/home/www, and the products directory has absolute path /home/www/products.
Next, we must input data relative to a product. Our example company
just collected and refurbished an old SparcStation 1 and wants to
insert it in its product catalog. A picture has been saved in ??/images/sparcstation1.jpg,
a thumbnail image has been created in ??/images/sparcstation1.gif,
and a brief description of the product has been written in the (ASCII)
text file ??/descriptions/sparcstation1.txt. To find the
next available product ID (which is assumed to be a number starting
from 0 and incremented each time a product record is added to the
database), we save the following lines to a shell script getnewID
and make it executable. Typing ./getnewID will then return
a new product ID:
#!/bin/sh
# getnewID
cd products
[ "`echo *`" == "*" ] && echo 0 || echo * | wc -w | sed -e 's/ //g'
We can now insert a new (empty) record by creating a subdirectory
named sparcstation1-ID (where ID is the new product ID found
by getnewID) under the products directory. This can be automated
by running the following shell script with argument sparcstation1:
#!/bin/sh
# newrecord
if [ "$1" == "" ] ; then
echo "error: syntax is 'newrecord recordname'"
else
cd products
NEWID=`../getnewID`
DIR="$1-$NEWID"
mkdir $DIR >/dev/null 2>&1 || ( echo "error: record exists" )
[ ! -f $RECORDNAME/name ] && echo $1 > $RECORDNAME/name
[ ! -f $RECORDNAME/id ] && echo $NEWID > $RECORDNAME/id
fi
Because in this example the database had just been created, the product
ID referring to the product sparcstation1 is 0. Thus, the directory
is sparcstation1-0.
The product image, thumbnail, and description files are inserted
simply by copying them:
cp ~/images/sparcstation1.jpg products/sparcstation1-0/
cp ~/images/sparcstation1.gif products/sparcstation1-0/
cp ~/descriptions/sparcstation1.txt products/sparcstation1-0/
The price can be set (e.g., to 100 pounds) manually by typing:
echo 100 > sparcstation1-0/price
Finally (and optionally), we can make an HTML page with additional
information for each product by creating an index.html file
with appropriate content within each directory. The whole process
can be automated by modifying the above script (newrecord)
to include the copying of files and price setting.
Database Display by PHP
Next, I will present an HTML page containing a very simple PHP
script that displays the whole product database. Here is the HTML
code for the page design:
<html>
<head>
<title>Old Computers and Hardware</title>
</head>
<body bgcolor="#000040">
<table width="100%" border="0" cellspacing="2" cellpadding="2">
<tr>
<th bgcolor="#5000BB" colspan="3" align="left">
<font color="#9090FF" size="6">
<i>Products</i>
</font>
</th>
</tr>
Here is the PHP script (commented):
1. initialization;
2.
3. <?php
4. $pdir = "/home/www/products";
5. $d = dir($pdir);
6. $i = 0;
7. cycle over the directory entries:
8.
9. while($entry = $d->read()) {
10. skip entries . and ..
11.
12. if(strcmp($entry, ".") != 0 && strcmp($entry, "..") != 0) {
13. display three products on each table row:
14.
15. if ($i % 3 == 0) {
16. echo "<tr>\n";
17. }
18. // cell
19. echo "<td bgcolor=\"white\">";
20. find product name and ID from directory name:
21.
22. $id = strrchr($entry, "-");
23. $name = substr(strrev(strchr(strrev($entry), "-")), 0, -1);
24. display product name:
25.
26. echo "<h3>";
27. echo "<a href=\"".$pdir."/".$entry."/\">".$name."</a>";
28. echo "</h3>";
29. display thumbnail image:
30.
31. echo "<a href=\"".$pdir."/".$entry."/".$name.".jpg\">";
32. echo "<img src=\"".$pdir."/".$entry."/".$name.".gif\">";
33. echo "</a><br>\n";
34. display description:
35.
36. readfile($pdir."/".$entry."/".$name."-briefdesc.txt");
37. end the row every three products:
38.
39. echo "</td>\n";
40. if ($i % 3 == 2) {
41. echo "</tr>\n";
42. }
43. $i++;
44. }
45. }
46. close the directory for reading:
47.
48. $d->close();
49. end the PHP script:
50.
51. ?>
52. end the HTML page:
53.
54. </table>
55.
56. <address>
57. <a href="mailto:liberti@iris-tech.net">Leo Liberti</a>
58. </address>
59. </body>
60. </html>
Database Query
To perform a query on the filesystem-based database, just run
the grep command with a few bells and whistles. The following
shell script (search.sh) looks for a regular expression in
a database directory and returns an HTML-formatted list:
#!/bin/sh
# search.sh: perform a regexp search in a database directory
if [ "$1" == "" -o "$2" == "" ]; then
echo "error: syntax is $0 search_regexp db_directory"
exit 2
fi
cd $2
echo "<ul>"
for i in 'find . -exec grep -isl $1 \{\} \;' ; do
DIRFILE=`echo $i | cut -b 3-`
DIR=`echo $DIRFILE | cut -d '/' -f 1`
FILE=`basename $i`
if [ $FILE == name ] ; then
URL=$2/$DIR/
ITEM=`cat $i`
else
URL=$2/$DIRFILE
ITEM=$DIRFILE
fi
echo "<li><a href=\"$URL\">$ITEM</a></li>"
done
echo "</ul>"
Next, prepare a simple HTML form (search.html) that asks the
user for a search expression. The form passes the input data to the
PHP script search.php in the variable $s:
<html>
<head>
<title>catalogue search</title>
</head>
<body bgcolor="#000040">
<table width="100%" border="0" cellspacing="2" cellpadding="2">
<tr>
<th bgcolor="#5000BB" align="left">
<font color="#9090FF" size="6">
<i>Catalogue Search</i>
</font>
</th>
</tr>
<tr>
<td bgcolor="#BBBBBB">
<form method="get" action="search.php">
Input search term: <input name="s" size=50 maxlength=200 value="">
<input type="submit" value="Search">
</form>
</td>
</tr>
</table>
<address>
<a href="mailto:liberti@iris-tech.net">Leo Liberti</a>
</address>
</body>
</html>
Finally, run the shell script search.sh above from within the
PHP script search.php:
<html>
<head>
<title>products catalogue search</title>
</head>
<body bgcolor="#000040">
<table width="100%" border="0" cellspacing="2" cellpadding="2">
<tr>
<th bgcolor="#5000BB" align="left">
<font color="#9090FF" size="6">
<i>Database Search for term: <? echo $s ?></i>
</font>
</th>
</tr>
<tr>
<td bgcolor="white">
<?php
/* search the filesystem-based DB for term in $s */
$pdir = "products";
$cmd = "./search.sh " . $s . " " . $pdir;
$ret = 0;
System($cmd, $ret);
?>
</td>
</tr>
</table>
<address>
<a href="mailto:liberti@iris-tech.net">Leo Liberti</a>
</address>
</body>
</html>
Conclusion
I have presented a scheme for constructing a Web-enabled database
that circumvents using a true database engine, thus simplifying
the administration. The implementation is based on the principle
that a record is represented by a directory and a field is represented
by a file. This makes the scheme very useful in cases where the
fields contain large amounts of variable-sized data, such as images
or text descriptions. I have illustrated a simple PHP script, which
creates a Web page displaying the contents of the whole database.
There are countless improvements to this simple scheme, and all
or most of them should be easy to implement for sys admins with
a good command of shell scripting and even a modest knowledge of
PHP (or other CGI-like) Web scripting. A few other options include:
- Implementing shell scripts to query the database according
to some logical test.
- Implementing a Web-based database search with PHP.
- Implementing a security scheme based on the UNIX file permissions
so that only certain users have access to changing the records.
(Users accessing the database through the Web can be accounted
for by having Apache running under a special user, such as wwwrun).
Visit http://www.ipnos.co.uk/products.php for a working
implementation of this scheme.
Leo Liberti has graduated in Mathematics from Imperial College,
London, in 1992. He holds an M.Sc. in Mathematics from Turin University,
Italy. He is now a research assistant and part-time systems administrator
at Imperial College; he is also the Technical Director at IrisTech,
Como, Italy, an Italian firm that supplies customers with Web-based
and electronic services.
|