Beyond Hardcoded Database Applications with DBIx


INTRODUCTION

Perl has been dubbed ``The Duct Tape of the Internet.'' Since being bestowed this title, it has moved on to providing a comfortable interface to an ever-growing number of external technologies. In this article I provide an introduction to one of the most convenient yet powerful interfaces to relational databases from Perl: DBIx::Recordset.


Look Mom, No SQL!

By and large, when using databases from a programming language, you are creating, reading, updating, or deleting data --- CRUD for short. DBIx::Recordset supports CRUD through the functions Insert(), Search(), Update(), and Delete(), respectively. The big difference between DBIx::Recordset and the other Perl tools available for CRUD is that you don't have to concern yourself with SQL generation. DBIx::Recordset generates SQL and ensures that it is syntactically correct for whatever database you are using. There are large applications written using DBIx::Recordset that require absolutely no code changes to run on MySQL, Access, and Oracle. While this is not important for in-house applications which will always run on the same relational database, having completely database-independent Perl code is a necessity for applications which will ship off-site.


SAMPLE USAGE

Though DBIx::Recordset provides a large range of functionality, this article will focus on a real-world example which I developed during a recent contract job. The basic scenario was that the company had a user registration table containing basic information about a client, such as name, email address, and phone number. A new table was then developed to store demographic information about a client in addition to the original information. My task was to copy all the current information from the original table to the new table designed to record user demographics. The basic information would simply be copied over, and the demographic information would be generated randomly. Listing 1 shows uregisternew, the table designed for this purpose. The last four fields of uregisternew are the added demographics fields. Each of these fields contains a non-negative integer indicating the user's status for a particular demographic: the integer is an index into an anonymous array. For example, the demographic for gender may be undetermined, Male, or Female. Once a user indicates his gender, the index of his gender is looked up in the following array:

 $Angryman::User::Profile{gender} = [ 'not determined', 'Male', 'Female' ];

Thus, a user whose gender is Male will have a 1 for his gender demographic field. There are similar anonymous arrays for age bracket, occupation, and income.
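The lookup itself is plain Perl array indexing. Here is a minimal, self-contained sketch; the %Profile data below is illustrative, modeled on (not copied from) the Angryman::User module:

```perl
use strict;
use warnings;

# Illustrative profile data; the real arrays live in Angryman::User.
my %Profile = (
    gender => [ 'not determined', 'Male', 'Female' ],
);

# A stored demographic value is just an index into the matching array:
my $stored_index = 1;
my $label        = $Profile{gender}[$stored_index];

print "$label\n";    # Male
```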

Listing 2 shows the actual program which performs my assigned task. We will take it step by step.

  #!/usr/bin/perl
  use Angryman::User;
  use DBIx::Recordset::Recordset;
  use Math::Random;
  use strict;

Here we load in the modules we need. Angryman::User has all the demographic anonymous arrays discussed above. DBIx::Recordset is what we are discussing in this article. Math::Random is an excellent Perl module that allows one to create random numbers of all sorts. And of course, if one wants bulletproof code, then one always turns on strict.

  $::table{in}  = 'uregister';
  $::table{out} = 'uregisternew';

In these two lines we define the original table with basic user information and the target table which holds basic user info as well as demographics info.

  %::connect =  (
     '!DataSource' => 'DBI:mysql:db_name:192.168.201.103',
     '!Username' => 'mysql_username',
     '!Password' => 'mysql_password'
     );

Here we define the information for connecting to the database. In the real script, this information was sequestered in a system configuration file, but, for the purposes of illustration, the actual code is shown here.

  # connect to database and SELECT * FROM uregister
  *::uregister = DBIx::Recordset::Recordset->Search ({            
        %::connect, 
        '!Table' => $::table{in}  
        });

This one call to DBIx::Recordset packs quite a punch: we connect to the database and retrieve all records from the original table. The connect hash gives DBIx::Recordset instructions on how to connect. The table directive indicates which table we want to retrieve records from. Because we bind the result of the Search() call to a typeglob, we actually get three return values. A scalar ($::uregister) is set up to allow object-oriented access to the database. An array (@::uregister) is set up which allows us to access the retrieved records one at a time. A hash (%::uregister) is set up which allows us to access the fields of the current record.
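The triple binding works because assigning references to a typeglob populates the scalar, array, and hash slots of the same symbol name. Here is a standalone sketch of the mechanism in plain Perl, with made-up values standing in for what Search() would return:

```perl
use strict;
use warnings;

our ($result, @result, %result);

# Each assignment to the glob fills a different slot of the same name:
*result = \ 'object stand-in';         # scalar slot: $result
*result = [ 'row1', 'row2' ];          # array slot:  @result
*result = { field => 'value' };        # hash slot:   %result

print "$result / @result / $result{field}\n";
# object stand-in / row1 row2 / value
```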

  # because we will re-use the target table many times, we separate the
  # connection and insert steps with this particular recordset

  *::uregisternew = DBIx::Recordset::Recordset->Setup({  
        %::connect, 
        '!Table' => $::table{out} 
        });

While all of the CRUD functions support connection and database processing in one call, we will be inserting many records into the target table. We therefore connect to the database and target table in the above lines and re-use the connection on each iteration of the loop, avoiding the overhead of repeated connects and disconnects.

 # iterate through the recordsets from the old table:
 while (my $uregister = $::uregister->Next) {

Remember how, when we connected to our original table, we set up the scalar, the array, and the hash? Well, in the line above we make use of the scalar's Next() method so that we can iterate across the retrieved records. On databases which reliably return the number of rows retrieved, we can make use of the array we set up as follows: for my $uregister (@::uregister).

    &randomize_user_profile;

Here we randomize the demographic fields and store the results in a hash (%::profile).

    # INSERT 
    # the old table data into the new table and
    # the computed hash of profile data
    $::uregisternew->Insert({%{$uregister},%::profile});

Next we take the data in both hashes and insert it into the database. Note that Insert() takes an anonymous hash (a hash reference) as its argument, not a hash. DBIx::Recordset automatically takes the key-value pairs of the hash and creates the necessary SQL statement to insert the data. This is a great time-saver: you needn't worry about getting your data in order, and you don't have to quote your string fields manually.
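The {%{$uregister}, %::profile} construct builds that single anonymous hash by merging the two; where both hashes share a key, the later one (%::profile) wins. A plain-Perl illustration, with invented field names:

```perl
use strict;
use warnings;

my $row     = { firstname => 'Ada', gender => 0 };   # old-table record (hashref)
my %profile = ( gender => 1, income => 2 );          # generated demographics

# List the old record first so the generated demographics win on collision:
my $merged = { %{$row}, %profile };

print "$merged->{gender}\n";    # 1, taken from %profile
```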


A DBI Version

One of the most important steps in Perl history was the authoring of DBI. Prior to DBI, several database-specific products such as oraperl and sybperl existed. With DBI, all basic database tasks such as connection, statement execution, and disconnection can be performed on many databases with no changes in Perl code. This section will discuss Listing 3, a DBI implementation of the same task.

  # connect to database and SELECT * FROM uregister
  my $dbh = DBI->connect('DBI:mysql:db_name:192.168.201.103', 
    'username', 
    'password');
  my $sth = $dbh->prepare('SELECT * FROM uregister');
  my $ret = $sth->execute;

Here we connect to the database and retrieve all records from the original table. Notice that it takes three lines to do so, and we still do not have our database data in any Perl variables ready for use. It is possible to reduce the number of lines by one if we use DBI's selectall_arrayref() method to return results. But we will never get it down to one line, because connection and statement execution are separate calls in the DBI API.

  &determine_target_database_field_order;

While this line of code seems rather innocuous, it reveals the most about the differences between DBI and DBIx::Recordset. The whole idea of this program is to retrieve records from one table and put them in another, right? With DBI, you may retrieve data into an array, an array reference, or a hash reference. If I wanted to ignore the possibility of the target table ever changing, then I could retrieve my data into an array or array reference and then commit it to the target database. However, I wanted my code to keep working even if fields in either table were moved around or deleted. I thus disqualified arrays and array references as viable means for this project and retrieved my records into hash references. Because one cannot be assured of key order with hashes, I needed an array with the field names in correct order. The function &determine_target_database_field_order() does just that. With DBIx::Recordset, such a function is not necessary: it takes the hash references you give it and correctly quotes and orders them for writing to the database.
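Because Perl hashes have no guaranteed key order, any INSERT built from a hash must be ordered by an explicit list of field names. A plain-Perl sketch of the idea; the field names and row values here are made up for illustration:

```perl
use strict;
use warnings;

# Field order as discovered from the target table (hypothetical names):
my @order = qw(firstname email gender);

# A fetched row; hash key order is NOT guaranteed to match @order:
my %row = ( gender => 1, firstname => 'Ada', email => 'ada@example.com' );

# Walk the field list, not the hash, so column order is deterministic:
my @values = map { $row{$_} } @order;
my $fields = join ',', @order;

print "$fields\n";    # firstname,email,gender
```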

  # iterate through the recordsets from the old table:
  while ($::uregister = $sth->fetchrow_hashref) {

We fetch each row of the original table into a hash reference.

    &randomize_user_profile;

We create a hash with the demographics data.

    &fiddle_with_my_data_to_get_it_to_work_with_the_DBI_API();

The purpose of this function is to take the data to be inserted and manually order and quote it. This is unnecessary with DBIx::Recordset as it handles these issues for you automatically.

    # INSERT 
    # the old table data into the new table and
    # the computed hash of profile data
    my $sql = "INSERT into $::table{out}($::sql_field_term) values($::INSERT_TERM)";
    $dbh->do($sql);

Here we insert the record into the target table.
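For concreteness, this is roughly what the interpolated $sql string looks like once the two helper subs have filled in the field and value terms. The names and values below are invented for illustration; in the real script the value term is built with $dbh->quote():

```perl
use strict;
use warnings;

my %table          = ( out => 'uregisternew' );
my $sql_field_term = 'firstname,email,gender';       # from the field-order sub
my $INSERT_TERM    = "'Ada','ada\@example.com',1";    # already-quoted values

my $sql = "INSERT into $table{out}($sql_field_term) values($INSERT_TERM)";
print "$sql\n";
# INSERT into uregisternew(firstname,email,gender) values('Ada','ada@example.com',1)
```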


Conclusion

DBIx::Recordset contains a wealth of functionality. It is currently the only CPAN database interface which allows for completely database-independent code. Common tasks can be coded up quickly and concisely. In addition to the functionality discussed in this article, DBIx::Recordset has support for many other common tasks such as debugging, multi-table processing, and HTML-based navigation of tables. So, the next time you want your database code up and running quickly and portably, give DBIx::Recordset a try.


Listing 1: uregisternew

   Field         Type   
 firstname   varchar(50) 
 lastname    varchar(50) 
 userid      varchar(50) 
 password    varchar(50) 
 address1    varchar(50) 
 city        varchar(50) 
 state       char(2)     
 zippostal   varchar(20) 
 email       varchar(50) 
 phone       varchar(50) 
 fax         varchar(50) 
 dob         date        
 occupation  varchar(50) 
 gender      varchar(20) 
 income      varchar(25) 
 age         varchar(255)


Listing 2: DBIx::Recordset implementation of program to copy user records from a source table (uregister) to a target table (uregisternew)

 #!/usr/bin/perl
 =head1
 uregisternew is a table with all the fields of uregister plus a few profile fields (i.e., salary bracket, occupation, age), each of which contains a non-negative integer which serves as an index into the array for that particular profile field.
 The purpose of this script is to copy over the same fields and generate a valid array index for the new profile fields.
 =cut
 use Angryman::User;
 use DBIx::Recordset::Recordset;
 use Math::Random;
 use strict;
 $::table{in}  = 'uregister';
 $::table{out} = 'uregisternew';
 %::connect =  (
 '!DataSource' => 'DBI:mysql:db_name:192.168.201.103',
 '!Username' => 'mysql_username',
 '!Password' => 'mysql_password'
 );
 # connect to database and SELECT * FROM uregister
 *::uregister = DBIx::Recordset::Recordset->Search ({            
                                          %::connect, 
                                          '!Table' => $::table{in}  
                                         });
 # because we will re-use the target table many times, we separate the 
 # connection and insert steps with this recordset
 *::uregisternew = DBIx::Recordset::Recordset->Setup({  
                                          %::connect, 
                                          '!Table' => $::table{out} 
                                         });
 # iterate through the recordsets from the old table:
 while (my $uregister = $::uregister->Next) {
   &randomize_user_profile;
   # INSERT 
   # the old table data into the new table and
   # the computed hash of profile data
   $::uregisternew->Insert({%{$uregister},%::profile});
 }
 # Angryman::User::Profile is a hash in which each key is a reference 
 # to an array of profile choices. For example:
 # $Angryman::User::Profile{gender} = [ 'male', 'female' ];
 # $Angryman::User::Profile{age} = ['under 14', '14-19', '20-25', ... ];
 # Because we don't have the actual data for the people in uregister,
 # we randomly assign user profile data over a uniform distribution
 # when copying it to uregisternew.
 sub randomize_user_profile {
   for (keys %Angryman::User::Profile) {
     my @tmp = @{$Angryman::User::Profile{$_}};
     $::profile{$_} = random_uniform_integer(1, 0, $#tmp);
   }
   $::profile{dob} = '1969-05-11';
 }


Listing 3: DBI implementation of program to copy user records from a source table (uregister) to a target table (uregisternew)

 #!/usr/bin/perl
 use Angryman::User;
 use DBI;
 use Math::Random;
 use strict;
 $::table{in}  = 'uregister';
 $::table{out} = 'uregisternew';
 # connect to database and SELECT * FROM uregister
 my $dbh = DBI->connect('DBI:mysql:db_name:192.168.201.103', 
                        'username', 
                        'password');
 my $sth = $dbh->prepare('SELECT * FROM uregister');
 my $ret = $sth->execute;
 &determine_target_database_field_order;
 # iterate through the recordsets from the old table:
 while ($::uregister = $sth->fetchrow_hashref) {
   &randomize_user_profile;
   &fiddle_with_my_data_to_get_it_to_work_with_the_DBI_API();
   # INSERT 
   # the old table data into the new table and
   # the computed hash of profile data
   my $sql = "INSERT into $::table{out}($::sql_field_term) values($::INSERT_TERM)";
   $dbh->do($sql);
 }
 # Angryman::User::Profile is a hash in which each key is a reference 
 # to an array of profile choices. For example:
 # $Angryman::User::Profile{gender} = [ 'male', 'female' ];
 # $Angryman::User::Profile{age} = ['under 14', '14-19', '20-25', ... ];
 # Because we don't have the actual data for the people in uregister,
 # we randomly assign user profile data over a uniform distribution
 # when copying it to uregisternew.
 sub randomize_user_profile {
   for (keys %Angryman::User::Profile) {
     my @tmp=@{$Angryman::User::Profile{$_}};
     $::profile{$_} = random_uniform_integer(1,0,$#tmp);
   }
   $::profile{dob}='';
 }
 # Hmm, I can't just give DBI my data and have it figure out the order
 # of the database fields... So here we go, getting the field
 # order dynamically so this code doesn't break with the least little
 # switch of field position.
 sub determine_target_database_field_order {
   my $order_sth = $dbh->prepare("SELECT * FROM $::table{out} LIMIT 1");
   $order_sth->execute;
   # In DBIx::Recordset, I would just say $handle->Names()... but here we 
   # must iterate through the fields manually and get their names.
   for (my $i = 0; $i < $order_sth->{NUM_OF_FIELDS}; $i++) {
     push @::order_data, $order_sth->{NAME}->[$i];
   }
   $::sql_field_term = join ',',  @::order_data;
 }
 # As ubiquitous as hashes are in Perl, the DBI API does not
 # offer a way to commit hashes to disk.
 sub fiddle_with_my_data_to_get_it_to_work_with_the_DBI_API {
   my @output_data;
   for (@::order_data) {
     push @output_data, $dbh->quote
       (
        defined($::uregister->{$_}) 
        ? $::uregister->{$_} 
        : $::profile{$_}
       );
   }
   $::INSERT_TERM=join ',', @output_data;
 }