Prequel to SQL - The Perl Journal, Fall 1999

Craig McElwee

The thing about magicians is that they do some rather amazing things and make it look easy. You know it's not really magic, but without a key secret or two, there is simply no way you can manage the same feat. The same can be said for putting together a little SQL-based web application on Windows; seeing the pros do it in a few paragraphs of documentation makes it look easy, then you try it yourself and it's no-go. Where to turn?

This is what I faced some months ago. I considered myself a competent Perl programmer who simply hadn't yet had the need for a SQL (Structured Query Language) database. I knew I'd eventually need one, but I'd survived with tied hashes and CSV files until then.

CSV files
A CSV (Comma-Separated Values) file is simply a text file with lines containing data separated by commas, like this: 'Johannes,Smythe,505 Oyster Street,555-5555'. This is a good method of saving data in "flat file" (that is, plain old text file) databases, since each line is a record which can be read by a Perl script, and the data extracted by splitting on the commas. Assuming the above line is in a file called addresses.txt, you could do something like this:
open(IN, "<addresses.txt"); while($line = <IN>) { ($first, $last, $address, $phone) = split /,/, $line; mangle data to heart's content... }

If there are already commas in your data, this won't work; you'll have to use a different delimiter, such as the pipe symbol ("|"). There is even a DBI::DBD for CSV files, so you could start writing scripts with DBI now, and when your needs require a "real" database, migrating the scripts will be as easy as changing one line in your program.

The turning point came after I had put together a small web application at work: a customer address book with search capabilities. I had combined several databases of customer info into one CSV file, (they were pipes, not commas, but no one seems comfortable with my calling it a PSV file), one record per line.

The application consisted of three HTML pages. One was a frameset, one was a "Welcome to this app" page, and the last was a menu bar with several buttons to call CGI scripts which generated the content pages. One of the buttons, LIST ALL, called a CGI script which read through the CSV file one line at a time, pulling out each customer name and putting it into an array to be sorted. The user could then click on a customer name and be shown that customer's full contact information, formatted to resemble a rolodex card.

There was also a text box with a search button activating a CGI script, which displayed the rolodex cards matching the search keyword. On each card was a MODIFY button, which redisplayed the rolodex card with each field in a text box so it could be edited and saved. Finally, there was a NEW button to add a new customer. Yes, it was a very sophisticated web application, now known as SupportNet (no patents pending).

My manager called me in one day and, as he was wont to do, discussed a few features he wanted my application to have. He wanted to add "a few more" functions to each card to help our support people be more productive. Click a button and an email window would pop open with the customer's email address already in the TO field. Click a different button and Microsoft Word would open up with a partially written letter. Click a button and a composition window would pop up to let you make notes about the customer's problem. A button which would autodial the phone. Oh, and each and every above action should be journaled, one journal per Rolodex card, with the action, date, time, and who did what. And the journals would have to be searchable, able to be sorted on different values, and each item listed would be hyperlinked to the actual item for perusal. I was given the standard in-house development budget: $0 and a gig of disk space on an NT server. He asked how long it would take; I said I'd get back to him and left to sketch out the design.

I figured most of these tasks wouldn't be difficult at all. Cookies would keep track of what each support person did. Email and note composition would be a new browser window with an appropriately formatted form. I was pretty sure I had read something about using Perl and OLE to manipulate Word and dial phones. The stumbling blocks I saw were intertwined: how to construct the journals, and that CSV file.

There were several schemes to keep all the client files centrally located and keep track of who they belonged to. Most of these seemed like workarounds to cope with the CSV file, which also had inherent problems with record locking since the file had to be manipulated by multiple users simultaneously. It quickly became obvious that I should let a proper database do the work; the CSV file had to go.

Let me qualify "proper." I had to use Microsoft Access - that's what was already installed and all my budget would allow. Before I added any new features I wanted to convert the CSV file to Access and prove it still worked. Essentially my conversion task list looked like this:

Put data into database: Convert PSV file to Access.
Get data out of database: Modify scripts to connect to database and then use SQL SELECT statements instead of while(<INFILE>) to retrieve data. So, on with it already!

I would have to create a new database, but I had already discovered that CSV files - even my pipe-delimited ones - could be imported into Access with little difficulty. Open Access, select File/New Database... and from the resulting window, select Blank Database and click OK. After choosing a location for the database, feed it: Select File/Get External Data/Import... First change the select box labeled Files of type to Text Files. (If you only have Microsoft Access and ODBC Databases available as choices, then follow the message below (Not all file types are installed by default...) and run Office Setup to add all the data access options, which makes Text Files available. This is painless and takes less than 60 seconds to do.)

Now just change to the directory where your CSV file is and double-click it. You should be presented with an Import Text Wizard window, in the middle of which are the first several lines of your CSV file. If Fixed Width is selected (the default), click on the Delimited radio button and click Next. In Choose the delimiter... click Other and fill in your delimiter. If your delimiter is a tab, semicolon, comma, or space, then by all means click that radio button. The first few lines of your CSV file should now be displayed with lines separating the data. Each column represents a field, and each row a record, in your database table. You are almost there - click Next.

Your next choice is to import this data as a new or existing table. This is rather self-evident, but if you are not sure, choose New and click Next. Now you give names to all the fields; click in a column to highlight it, and fill in a field name at the top. Anything will do, but let me recommend keeping field names short and without spaces - this will save headaches later when you are debugging Perl scripts with SQL statements inside. When you are done, click Next and on the next screen, let Access add a Primary Key field and click Next again. Enter a name for your table, and click Finish. You can quit now or double-click on the table icon to see a spreadsheet view of your database table. You can add, delete, or edit data as you will. When you are done, save your database and close Access.

Back to Perl: for getting the data out, I had read about DBI (the Database Interface) in TPJ. It had an ODBC database driver, and Access was ODBC compliant (or so I hoped, since Microsoft developed ODBC and it stands for Open DataBase Connectivity). I figured the script conversion from flat-file to SQL would be a piece of cake: just toss in a few Perl modules and add a few lines of code and presto: I look like a genius.

So I started looking closely at the articles and documentation. I managed to install DBI and DBD::ODBC without incident - I recommend Steve Lidie's article in TPJ #12 for details. So far so good. In my head, my test script went like this: open a connection to the database, for every record in the table print the record, close the database. So I started to write it.

I had a lot of documentation at hand, the Camel, Panther, and Ram books, Perl Journals with DBI articles, and PODs and other myriad information from the Internet. And do you know what? While being chock full of useful information, most of these sources all made the same assumptions. Specifically, people tended to say things like "enter the connection string" and "DSN" assuming I knew what these were. I didn't.

Eventually I realized that I needed an Access ODBC driver installed - I couldn't believe it wasn't automatically installed in the standard Microsoft "ActiveBloat" installation methodology. After the standard exploration through the nether regions of Microsoft's site, I found the download: http://www.microsoft.com/data/download/mdacfull.exe. Yes, there is other stuff in this installation, but there seems to be no way to get just the ODBC driver without a bunch of unneeded stuff.

Once I had it, installation was a snap. If you are following along, the procedure is:

Double-click on mdacfull.exe (Forgot where you downloaded it? Type dir /s \mdacfull.exe at a DOS prompt).

Click Yes to accept Microsoft's license statement.

Click Continue to go forward with the installation.

At this point Setup looks for installed components and asks whether you want a Complete or Custom installation. Complete is fine, automatically installing everything. If you would prefer a more discriminating install, choose Custom. To do what's mentioned in this article you only need three components: ODBC, OLE DB, and the ODBC driver for Microsoft Access. You can save yourself about 2 megabytes of disk space by unchecking the other components.

Setup checks for sufficient disk space, copies files to your system, and presents a dialog box informing you that setup was completed successfully. Click OK and you're done.

Still my simple test script didn't work:

  'Database connection not made: [Microsoft][ODBC Driver Manager] 
  Data source name not found and no default driver specified 
  (SQL-IM002)(DBD: db_login/SQLConnect err=-1) at test.pl line 14.'

Figure 1

Errors like this abounded. Luckily, I happened to decide that I didn't like the editor I was using and downloaded and installed a different one from the net. When I went to the control panel to remove the old one. And there I saw the 32bit ODBC applet, which I opened (Figure 1), and which had tabs saying User DSN, System DSN, and File DSN. Jackpot! (DSN stands for Data Source Name, and is akin to using a filehandle in your scripts to reference a file. Just think of it as a database handle.)

Figure 2

You want to use System DSN since it allows access to your database by other users and system services. Click the Add... button, select Microsoft Access Driver (*.mdb) from the displayed list, and click Finish. The "ODBC Microsoft Access 97 Setup" screen will be displayed (Figure 2). For DSN, put in anything you like, but keep it short and sweet since you will use this in every script. Fill in the Description field if you like. In the Database section, click Select... and use the Explorer-like interface to find and select your database file. Then click OK. If you would like to add a username and password, click Advanced... and enter it in the appropriate fields (Figure 3). Now click OK and you should see your database listed under the System DSN tab (Figure 4). Click OK to continue.

Figure 3

Figure 4

That mystery solved, it only took a minute to get a valid connection string, and presto - database output. The test script was simply this:

#!/usr/bin/perl
#
# test.pl  - test connection to Access database

use DBI;

$dsn  = 'TPJ';        # Configure $dsn, $user, and $pass in...
$user = 'perl';       # ...Control Panel/32bit ODBC
$pass = 'lerp';      
$dbd  = 'ODBC';       # A must if using ODBC.

# Connect to the database
$dbh = DBI->connect($dsn, $user, $pass, $dbd, {RaiseError => 1})
    or die "Database connection not made: $DBI::errstr";

# Prepare the statement for execution
# The table in the CSV file was named "Rolodex"
$stmt = $dbh->prepare(" SELECT * FROM Rolodex " );   

# Check if statement prepared correctly
die "ERROR: Cannot prepare statement: $DBI::errstr\n" 
unless 					  (defined $stmt);

# Execute the statement at the database level
$stmt->execute;

# Print out first 5 records, each field on a line
for (1..5) {
    @record = $stmt->fetchrow();
    print "\n\n\nRECORD:\n";
    foreach $field (@record) {
	   print "$field\n";
    }
}

# Clean up and go home
$stmt->finish;        # Release the statement handle resources
$dbh->disconnect;     # Close database handle

When I incorporated the above into a CGI script complete with HTML statements, it became the LIST ALL script available on the TPJ web site:

#!/usr/bin/perl
# Lists all practices

use DBI;

$dsn  = 'SupportNet';    # Configure $dsn, $user, and $pass
$user = 'perl';         # in Control Panel/32bit ODBC
$pass = 'lerp';
$dbd  = 'ODBC';         # A must if using ODBC.

##==========================###=========================##

# This is just a bunch of HTML to start the document
print <<END;
Content-type: text/html\n\n
<html><head><title>SupportNet</title>
<script language="javascript">
<!-- hide code from browsers that dislike embedded scripts
    function display(x) {
        document.forms[0].elements[0].value = x;
        document.forms[0].submit();
    }
// -->
    </script>
    </head>
    <body  background="../img/backgrnd.gif">
    <a href="../SupportNet/home.html">Home</a>
    <center>
    <h1>Client List</h1>
    <form action="/cgi-bin/rolodex" method="POST" target="main"
    name="rolodexform">
    <input type="hidden" name="ID" value=" ">
    </form>
    <table border="1" align="SENTER"><tr><td>
	
END

# Connect to the database
$dbh = DBI->connect($dsn, $user, $pass, $dbd, 
                    {RaiseError => 1, AutoCommit => 0 })
  || print "<p><b>Database connection not made: $DBI::errstr</b><p>";

# Prepare the statement for execution
$stmt = $dbh->prepare( "SELECT ID, practice FROM Rolodex" );

if ( !defined $stmt ) {
    print "<b>ERROR: Cannot prepare statement: $DBI::errstr\n</b>";
}

$stmt->execute;     # Execute the statement at the database level

# Build hash from the extracted data
while (( $id, $prac ) = $stmt->fetchrow()) {  
    $practices{$prac} = $id;                  #  PracticeName = DB_ID#
}

$stmt->finish;      # Release the statement handle resources

# This "foreach" line can be read backwards as:
# 'Take the keys from the hash %practices, sort them, 
#  then process them one-by-one'

foreach $pract (sort keys %practices) {
    next if $pract =~ /^$/;                # Skip blanks
    print "<a href='javascript:display(\"$practices{$pract}\")'
onMouseOver=\"window.status=\'ROLODEX this\';return true\">$pract</a>
<br>\n";
}

print "</td></tr></table>";           # Print HTML footers
print "</center>\n\n<p>";
print "</body></html>";

$dbh->disconnect;                    # Close database handle

The HTML snippet to call this script looks like this:

<form action="/cgi-bin/listall.pl" method="post">
      <input type="submit" value="List All">
</form>

Armed with this knowledge, converting my little CSV-based web application to a SQL-based one became the rather trivial task it was supposed to be. Here's to your first attempt being just as easy - just don't tell people how little time it took, and accept the compliments graciously.

The Win32-Access-ODBC-DBI::DBD Checklist:

I'm assuming you have Perl and Access already installed, and have a comma-separated-values file ready to go.

Download and Install MS Access ODBC drivers
Import your CSV file into Access
Define the Data Source Name and login/password for your database in Control Panel/32bit ODBC
Run a test script to check connectivity

References

Interactive SQL Tutorial: http://torresoft.netmegs.com/.

Alligator Descartes' DBI page: http://www.symbolstone.org/technology/perl/DBI/index.html.

Freeware Win32 web servers:

Apache: http://www.apache.org
Pi3Web: http://www.johnroy.com/pi3/pi3web/
Sambar: http://www.sambar.com/
Xitami: http://www.imatix.com/html/xitami/index.htm
Micro$oft Personal Web Server: http://www.microsoft.com

Win32 SQL databases: While not free, MS Access is usually available in most companies as part of the MS Office Suite, and MySQL is available for Win32 for $100.

__END__

Craig McElwee can be rearranged to spell "Grew Camel Ice" and "Magic Eel Crew", though not simultaneously. He can be reached at cmcelwee@mindspring.com.

TABLE OF CONTENTS