.gif) Features / Angela J.R. Jones / October 1,
1999
15 Steps to Convert Access Data to SQL
Server
How to convert the data in small- to mid-sized
Access JET database projects to SQL Server 6.x
I work for a small consulting company that has provided
clients with Microsoft Access desktop solutions since the
early 1990s. Three clients had recently outgrown their
multiuser Access applications and were looking for help with
upsizing to SQL Server. Although my consulting company had
offered only desktop solutions up to this point, it decided to
enter the client-server market. In preparation, several other
consultants and I decided to upsize a mid-sized Access system
to SQL Server.
For our training project, we chose to upsize an Access
system that we had set up for a pharmaceutical company. This
system, which tracked the required courses and physical exams
that employees took, was a typical Access database for us in
terms of size, complexity, and design. The system had an
Access interface and a Web browser interface. Although this
database wasn’t large, the IT staff members at this site
wanted to upsize it so that they didn't have to disconnect the
data from the Web server every time they needed to repair and
compact the data.
When we had initially set up this client's Access system,
we had separated the data (i.e., tables) and the code (i.e.,
queries, forms, reports, macros, and modules) into two .mdb
files. The data .mdb file contained about 7MB of data in 36
tables. The code .mdb file contained about 5MB of code.
Because we always normalize to at least the third normal form
and enforce referential integrity, we had created 39 enforced
relationships. All the table relationships had cascading
deletes. (Cascading updates weren't necessary.)
In every table, we had used an AutoNumber field as a
bookmark primary key to eliminate multiple-field primary keys,
which can greatly slow insert and update performance. Figure
1 illustrates how this approach works. As Figure 1 shows,
the intersection table between tblTrainee and tblCourse is
tblTraineeCourse. This table, which records the courses that
trainees take, has three fields:
- bmk_TraineeCourse, the AutoNumber primary key
- fgn_Trainee, a long integer foreign key to the
tblTrainee table bookmark
- fgn_Course, a long integer foreign key to the tblCourse
table bookmark
Of similar design is tblTraineeExamReq, an intersection
table that uses the AutoNumber primary key of
bmk_TraineeExamReq to record the physical exams that trainees
take.
In the conversion to SQL Server, we needed to preserve the
value of all the AutoNumber fields. To find out how to
preserve Access data, we conducted research. We found a lot of
information about upsizing Access applications. However, most
of the information discussed how to optimize the application
after you upsize the data. Little information existed on how
to use the Upsizing Wizard, which is part of the Microsoft SQL
Server Upsizing Tools for Access 97, to load the Access data
into SQL Server. (You can download the Microsoft SQL Server
Upsizing Tools for Access 97 from
http://officeupdate.microsoft.com/downloadcatalog/dldaccess.htm
if you don’t already have them.) Therefore, we thought that
using the Upsizing Wizard would be easy. We were wrong. We
worked for almost 24 hours before we finally had the 7MB of
data and constraints in place.
Instead of discussing our many failures, I'll discuss data
upsizing procedure that we now use regularly to upsize small
and mid-sized Access databases. (If you have a large Access
database, you need a bulk copy program—bcp.) Our data upsizing
procedure has 15 steps.
Step 1: Create the Database and Log
Devices We had heard that if you create a database
device twice the size of the Access .mdb file, you'll have
plenty of room for the conversion. However, that estimate is
incorrect if you have Memo fields. Memo fields, which upsize
to Text columns, take up an enormous amount of space because
the Upsizing Wizard creates a Timestamp column for each field.
For example, in our case, one particularly memo-heavy
application went from being a 3MB file in Access to a 20MB
file in SQL Server. Thus, we created database and log devices
that were at least five times bigger than the.mdb file.
Step 2: Create the Database
You must create the database and an ODBC data
source for SQL Server’s master database. The ODBC data source
obtains storage information when you ask the Upsizing Wizard
to create the database.
Step 3: Gather Data on Table
Relationships You must use the Access Database
Documenter to gather information about the table relationships
in the database. If you didn’t install the Documenter, you
need to. You can download the Documenter from the Microsoft
Office setup disk.
To gather the table relationship data, choose Tools,
Analyze, Documenter from your Access .mdb file database window
menu. After selecting the Current Database tab, choose
Relationships and click OK to print the details of all the
relationships in the database. If the Documenter isn’t working
properly, run it from the Relationship window. Review the
printout to make sure the referential integrity and cascades
are set correctly. Keep this printout handy because it can
help you determine the table dependency order in step 6.
Step 4: Make the .mdb File Compliant
The names in the .mdb file must comply with the
rules for SQL Server 6.5 identifiers. If the table names or
field names aren't valid SQL Server identifiers, the Upsizing
Wizard forces the table to become SQL Server compliant. If a
name is longer than 30 characters, the wizard truncates the
name to 30 characters. If a name contains a space or an
illegal character, the wizard changes the space or character
to an underscore (_). (For a list of illegal characters, see
SQL Server Books Online—BOL.) The wizard then
creates an Aliasing query, which is an Access query that
references the new SQL Server table. The Aliasing query takes
the name of the original table and translates all the new SQL
Server-compliant field names to their original names.
For example, suppose you have an Access (JET) table named
Employee Job Title, which has the field
DescriptionOfJobResponsibilities. Because SQL Server
doesn't recognize names with spaces or names containing more
than 30 characters, the Upsizing Wizard creates the table as
Employee_Job_Title and the field as
DescriptionOfJobResponsibiliti. The wizard then
attaches the new SQL Server table Employee_Job_Title to
the Access table Employee_Job_Title_remote. Finally,
the wizard creates an Aliasing query called Employee Job
Title that references Employee_Job_Title_remote and
renames the field DescriptionOfJobResponsibiliti to
DescriptionOfJobResponsibilities. Although the
application will work, it references an Aliasing query rather
than a linked table. As a result, the application will run
much slower that if it referenced a linked table.
You can use your favorite search-and-replace utility to
make the names in the .mdb file compliant with the rules for
SQL Server 6.5 identifiers. Our favorite utility is Find and
Replace, a search-and-replace utility for Access
(http://www.rickworld.com). Don't try to manually find and
replace all spaces and illegal characters. If you miss just
one space or character, you’ll have to rerun the Upsizing
Wizard.
Step 5: Remove All Recursive
Relationships Recursive relationships are
relationships between two fields in the same table that you
enforce. An example is an Employee table with a foreign key
Supervisor that references the primary key of that Employee
table. If you try to upsize this table, the Upsizing Wizard
will have no problems creating the table structure. However,
the wizard will encounter numerous difficulties loading the
data because it has to load all the Supervisors before it
loads the other Employees—and most Access systems don't
organize data that neatly. Therefore, you must remove all
recursive relationships and write triggers to enforce
recursive relationships. Listing 1 contains example SQL
trigger code, including code that enforces recursive
relationships. Another way to enforce recursive relationships
is to remove all recursive relationships and then, after the
conversion, create foreign keys that reference tables' primary
keys.
Step 6: Determine the Table Dependency
Order When you load data, you must first load the
tables with no foreign keys (i.e., independent or first-level
tables), followed by the tables that depend only on
first-level tables (i.e., second-level tables). Then you load
the tables that depend only on first- or second-level tables
(i.e., third-level tables), and so on. (First-level tables are
parent tables to second-level tables, which are child tables
to first-level tables. This analogy continues down through the
table levels.) Establishing the table dependency order is
important because if you load data out of order, the load
fails.
Step 7: Check All Default Values
Checking all default values is important,
especially if you have Number fields with a default value of 0
in tables that you don’t want to allow 0 values. For example,
if you reference the Identity field (which is similar to the
Access AutoNumber field) of a SQL Server parent table as an
optional foreign key in a child table, the default is 0, even
if you allow only nulls and those values in the parent table.
Here's why this problem occurs. Access displays the defaults
before you save a record. Thus, you could see a 0 in the
Access interface and remove it to set the field to null.
However, when you attach SQL Server tables, Access no longer
displays the defaults. Thus, it looks like you're entering a
null, but when you save the record, SQL Server tries to insert
the 0 value and the record update fails.
Step 8: Move the Data Tables into the
Code File You need to perform this step only if your
data and code are in separate databases. (In our training
project, we performed this step because we had separated the
data and the code into two .mdb files.) When the Upsizing
Wizard runs, it attaches all the SQL Server tables to the
database from which you're running the wizard. If you leave
the tables in a separate database and run the wizard from that
tables database, when you link your code database to the new
SQL Server tables, the wizard names all the newly linked SQL
Server tables dbo_TableName. You must then manually
rename each table. You can avoid this hassle by importing all
the tables from the data .mdb file into the code .mdb file
before you run the Upsizing Wizard. Be sure to include the
relationships, structure, and data.
Step 9: Make a Backup Copy of Your
Files Backing up your files at this point in the
data upsizing procedure is crucial. You’ve probably spent at
least 2 hours preparing the files for upsizing. If anything
goes wrong and you must rerun the Upsizing Wizard, you don't
want to have to start over. Although this step might seem
obvious, we were so excited about running the wizard that we
forgot to save our SQL-ready files in our first two upsizing
operations. We weren't too happy when we had to reconstruct
all our work.
Step 10: Run the Upsizing Wizard
When you run the Upsizing Wizard, you should upsize
all the tables at once to ensure that the table relationships
stay intact. If Table A has a relationship with Table B and
you upsize Table A but not Table B, the wizard breaks the
relationship between the two tables.
The Upsizing Wizard lets you specify the table attributes
to upsize, the data options to include, and any database
modifications. As Screen
1 shows, run the Upsizing Wizard with these options:
- Use Declarative Referential Integrity (DRI), not
triggers, for relationships. If you don't have much
experience with SQL Server, DRI is more predictable and
easier to use than triggers. You can have problems with
referential integrity if you choose triggers; the insert and
delete operations might behave unpredictably. However, if
you use DRI, you must write triggers for cascading deletes.
- Don't let the wizard decide where to put the timestamps
if you don't have a strong preference. The upside to
timestamps is that they track the sequence of changes in
your database if you have the time and expertise to figure
out how they work. The downside to timestamps is twofold.
First, timestamps add data storage space to your database.
Second, they make it significantly more difficult to bcp
data from Access tables to SQL Server tables. If you let the
wizard place the timestamps, you must do more advanced bcp
formatting so SQL Server knows how to map the fields from
the text file to the database fields. Advanced bcp
formatting is a time-consuming, tedious process.
- Create the table structure only—don't upsize any data.
Telling the wizard to only create the table structure is
essential if you have a system with referential integrity.
If you let the wizard upsize the data, it might upsize child
tables before parent tables, causing the data load for those
tables to fail.
- Link newly created SQL Server tables and save the
Password information with the linked tables (unless this
setup violates security constraints). Otherwise, when you
open the Access file, you'll likely get a prompt for a
password to reestablish your connection with SQL Server.
(Whether you get this prompt depends on the SQL Server
security mechanism in place.)
After you specify these options, click Next to go to the
final wizard screen. This screen gives you the option of
creating a log report. We can’t stress strongly enough how
helpful log reports are in determining what happened during
the upsizing operation. After you select whether you want this
report, click Finish to run the wizard.
Step 11: Check the Log After
the Upsizing Wizard runs, print the log immediately because
the wizard deletes this log. Check the log for Aliasing
queries and other errors. If you get errors, you must decide
whether to work around them or redo the data upsizing
procedure. If you decide on the latter, you must delete your
newly created database, revert to your saved .mdb files, fix
the problems on the Access side, back up your .mdb files
again, and rerun the Upsizing Wizard.
The log might show that your table-level validation rules
didn't upsize. The Upsizing Wizard doesn't upsize these rules,
so you must write triggers for them.
Step 12: Back Up the New SQL Server
Database Now that you have the data structures
intact, run a full SQL Server database backup. Don’t forget to
also back up your master database.
Step 13: Load Access Data into SQL
Server Following the order you established in step
6, load the Access data into SQL Server, table by table. If
you have a database with no AutoNumber fields, the data should
migrate to SQL Server with few problems. You can simply run
Append queries for all your tables. If you have large tables,
you might need to run several smaller queries for each
table.
If you have a database with AutoNumber fields, you can't
use Append queries. These queries don't preserve the data
values in the AutoNumber fields because the fields become SQL
Server Identity columns. As a result, if another table uses
the AutoNumber field as a referencing field in an enforced
relationship, the child record often references the wrong
parent record or won’t load into SQL Server at all.
In our case, every table except for the Constants table had
AutoNumber fields. Thus, we ran a Pass-Through query for each
table. We used this type of query to set the table's
IDENTITY_INSERT to ON, append the data, and then set the
table's IDENTITY_INSERT to OFF. (Only one table per database
can have its IDENTITY_INSERT enabled at a time.)
To partially automate the Pass-Through process, we used the
LoadData subroutine in Listing 2. You can’t totally automate
the process because you must load the tables individually in
the correct order. Before you use this subroutine, you need to
substitute the correct Data Source Name (DSN), UserID,
password, and database name for both qdf.Connect strings. You
run this subroutine from the Debug window, passing it the
names of the tables one by one.
If a table is too big to load at one time, uncomment the
line that sets the ODBCTimeout to 0 so the process can
continue indefinitely. If you uncomment this line, be
patient—your computer might appear to hang, but it’s probably
running fine.
Another approach for handling large tables is to break the
load into manageable segments by adding a WHERE clause to the
qdf2.SQL statement. Put the name of the AutoNumber field in
the WHERE clause, and choose appropriate values so you load
only a few thousand records at a time. If the table has
numerous OLE or Memo fields, you might want to load only a few
hundred at a time. Keep changing the value in the WHERE clause
and rerunning the procedure until all the data is in the
table.
Every few tables, make sure all the data is loading.
Sometimes we had data that wouldn't load because of omitted
optional foreign keys. Other times we couldn’t figure out why
records weren’t loading. We even checked for triggers that
didn’t allow nulls but found nothing. Oddly enough, the
solution in those cases was to delete the DRI foreign key in
the child table and recreate the table.
Step 14: Write Triggers For
relationships with cascading events, you must delete the DRI
foreign keys and write triggers. You also need to write
triggers for recursive relationships and table-level
validation rules. Writing triggers is typically a
time-consuming process—but don’t rush it and be thorough
because triggers ensure data integrity. Listing 1 contains
example SQL Server triggers.
Step 15: Test the New SQL Server System
You must test the SQL Server system numerous times.
Add new data to every table and break all the rules to make
sure the system acts predictably.
15 Steps to Success When
planning an Access-to-SQL Server upsizing project, don’t
underestimate the time you need for data conversion. By
following the tips in these 15 steps, I hope you'll be spared
a few painful hours we spent banging our heads against the
Upsizing Wizard. Now that we’ve performed this data upsizing
procedure a few times, we’re down to an average of 11 hours
for small data-conversion
projects.
|