Access Wizard Newsletter



The Access Wizard Newsletter		Tips, Tricks and Traps for Access Users and Developers September 2004

Normalization 101 or The Perils of Building a House from the Roof Down

When you build a house, you start from the foundation. If you started with the roof, you would certainly entertain the neighbors, but there is little likelihood that you would end up with a satisfactory place to live. The foundation must be solid, of quality materials, appropriately planned, and built with a high level of craftsmanship. If your foundation is flawed, the house might be OK for a few years, but eventually it would start developing problems that could cost you lots of money. In the worst case, a faulty foundation could develop serious cracks that could lead to catastrophic problems.

Databases, like houses, must also have a solid foundation. Databases that are well thought out from the beginning will take up less space, perform more quickly, and be easier to modify if the need arises. In addition, a well designed database will preserve the integrity of the data stored within it. This month begins a series that will help you to learn the proper ways to build a solid foundation for your database projects. As we move forward, we'll discuss specific methods and details of solid construction, and the different types of table relationships. In this issue, we'll focus on the most common and serious problems that should trigger alarm bells suggesting that your database foundation may not be as strong as it should be.

A database foundation consists of properly constructed tables, fields and relationships - techies call the process of optimizing the design "normalization". If you properly normalize then you will have a solid platform on which to build your application. If you do it inadequately or haphazardly, it will be like a house with a poorly poured foundation - it may work in the beginning, but it will cause you major hassles as you progress. It's very tempting when building a database to slap some fields into a couple of tables and then start blasting away with forms. You must avoid this path at all costs. Rather, invest time up front, to analyze your data needs and determine how to best lay out the data.

Over the next few issues I'll give you concrete advice on how to do this. This month, we'll use an example of a database for a medical clinic to examine four key symptoms -- red flags indicating that you have problems waiting to happen

In This Issue - Symptoms You Have a Problem Waiting to Happen

Tip of the Month
Symptom One: Tables go on forever
Symptom Two: One Potato, Two Potato, Three Potato, Four - Repeating Fields
Symptom Three: The same data appears again, and again, and again and again ... (all in the same table)
Symptom Four: 2 + 2 = 5 Storing data you can calculate as needed

Symptom One: Tables go on forever

Do you find that you have one monster table in your database that has so many fields that to see them all you have to scroll again and again? Do you find yourself adding more and more fields to this table and ending up with a queasy feeling that maybe this isn't the right way to do things?
If you find that you have very few tables with loads of fields in the tables, then you likely are not taking advantage of the power of a database. Some people make the mistake of developing an Access database that resembles a spreadsheet they would build in Excel. Databases are not Excel Spreadsheets where you just keep on adding columns; rather they are sets of related data. It's the existence of relationships that begins to open the power of databases. A data item needs to appear only once. With properly normalized tables a single change appropriately ripples throughout your application. To achieve this end you have to analyze the data, determine how the various pieces of data relate to each other (hence the term relational) and then break them down into the appropriate related tables.
When you are designing tables, always examine them in the relationship window (Tools | Relationships). Of course there is no hard and fast rule you can apply, but if you see that any one of your tables doesn't fit comfortably on the screen, then you may have this problem. We'll come back to this in future issues and more closely examine how to best solve it; as you may guess, normalization is the answer. Although it can be a complex topic, we'll break it down step by step to reach a solution.

Symptom Two: One Potato, Two Potato, Three Potato, Four - Repeating Fields

This one I see a lot, and it's due to a poor understanding of how to construct the relationships. For instance, let's say that you're building a database to track various diagnoses (Dx) for a patient. It might be tempting to set up a table with the fields Diagnosis1, Diagnosis2, and Diagnosis3. Each patient could then have 3 separate diagnoses. There are at least two problems with this. Inevitably you'll have a patient that needs to have a fourth diagnosis. As soon as you get one of those, it's back to the tables to add a field and back to your forms and reports to deal with the new field - an expensive proposition. The second problem comes about when its time to do analysis. Let's say that you want to know how many patients have a diagnosis of Cerebral Palsy. If you have one table with three diagnoses for each patient, then you have to write some complex SQL statements to aggregate each field or write some code to loop through each field in the table to look for Cerebral Palsy. It's possible to do that, but it's neither easy nor cheap.
The correct normalization, which builds flexibility into your database, is shown in the picture below. The critical table here is trelPatientDx. This table contains a single Dx for each patient is in a separate row of the table. I've included an additional field to set the priority of diagnoses for a single patient called fldRank. You'll need this field only if the order of diagnosis is important (e.g. DSM IV diagnoses for the fellow psychologists out there). With this construct, finding how many patients have Cerebral Palsy is a snap. If a patient needs more than 3 it's not a problem either, as a matter of fact having 103 is easy as well. A patient can have as many diagnoses as s/he requires, and there's no need to change forms or reports if you've set them up with this type of relationship to start with.

Symptom Three: The same data appears again, and again, and again and again ... (all in the same table)

Let's say you're tracking patients and their doctor with the doctor's telephone number. You may be tempted to have the fields Doctor and DrTelNumber in the patient table. But if Dr. Jones is physician for 250 of the patients in your database, she and her telephone number would appear 250 times in the same table (a major red flag!). Although you could do it this way, what happens when the doctor gets a new telephone number or the Area Code changes? You would have to run a query that updates every record with the new number. A better approach would be to set it up as shown below, with a separate table to hold the information about the doctor. The doctor is identified in the patient table with the DrKey from tblDr. This is a classic example of a "one to many" relationship (one doctor has many patients). In this construct you would simply go to the doctor table (or form) make a single change there and it would appropriately ripple throughout the database - a much easier and user- friendly way to update data.

Symptom Four: 2 + 2 = 5 Storing data you can calculate as needed

This is a really common problem with big consequences so I'll give you two examples.
First Example: Full Name stored as a field: Let's say you want to use a person's full name in a report. You may decide to create the fields FirstName, LastName and FullName. The problem is that if Mary Smith gets married and becomes Mary Jones, then two changes are necessary to update the data rather than one; if someone forgets to make that second change then you have a data problem on your hands. The bottom line is that there is no need for the FullName field. Whenever you need it, simply calculate it using the concatenation function (FirstName & " " & LastName). The ampersands will concatenate the First Name and Last Name fields, the quotes in the middle will put the space in between. This solution preserves the integrity of the data and saves space in your database.
Second Example: You want to set up a field to flag the date of the next physical exam, which is always one year from the last exam. You might be tempted to have two fields: LastExamDate and NextExamDue. The problem with this is two-fold:

Anytime you are required to figure out one date based on another date, you've created an opportunity to make an error.

If Last Exam Date was entered incorrectly, someone must manually correct Last Exam Date and recalculate Next Exam Due when the error is detected. The biggest danger, of course, is that the user will simply forget to recalculate the date the next exam is due - once again you've opened the door to bad data.

The solution to this is not quite as simple as concatenation; you have to use the built in function "DateAdd." Its purpose in life is exactly this scenario. You can read up on that on your own - Press F1 to bring up help and then do a search on DateAdd and it will give you the particulars. Access is a powerful program with the ability to create data based on a calculation much like a formula in Excel. This is one of the many examples
Conclusion and a Look Ahead
Proper analysis and creation of tables and fields and proper establishment of relationships between tables (normalization) is a critical exercise for all but the most simple database applications. In this issue, we've reviewed common symptoms of a shaky foundation. Take extra time in the beginning to get your foundation right and you'll save money, time and aggravation down the road.
In the coming months we'll continue our series of how to build a solid foundation as we talk more about normalization and review the different kinds of table relationships (one-to-one, one-to-many, and many-to- many.)

Tip of the Month

To get a quick assessment of some ideas to normalize your tables you can use the built in Table Analyzer (From the Database Menu -- Tools | Analyze | Table.)

This feature will not automatically redesign your database, but it will make some suggestions that you can analyze to better approach a properly normalized set of tables.





	Quick Links... Our Services Selected Presentations from Jim Connell of Custom Software Future Newsletter Suggestions, Questions and Comments Past Newsletters

     email: jim@custom-software.biz
     voice: 978-392-3462
     web: http://www.custom-software.biz

Custom Software · - · Westford · MA · 01886

Forward email

This email was sent to jim@custom-software.biz, by Custom Software.
Update your profile |Instant removal with SafeUnsubscribe™ | Privacy Policy.