$Account.OrganizationName
The Access Wizard Newsletter Tips Tricks and Traps for Access Users and Developers
April 2005

Normalization 104 – Dependency: Calculate When Possible

Recently, I received a request to convert an existing Excel spreadsheet for tracking clients in a sales organization into a friendly, easy to use Access Application. One of the first things I noticed when reviewing the workbook was that it contained separate columns for First Name, Last Name, and Full Name. The Full Name field was the first name, a space, and the last name. I asked the owner why he had set it up that way. He said that there were occasions when he needed to use the Full Name as displayed in the spreadsheet, and that it was most convenient to have it in one field. He felt it was important that the new Access application I was building have the same ease of use.

I asked what happened when a client's last name changed, for example, when they got married. He said that his staff would go into the spreadsheet and change the name in both the Last Name and Full Name columns. I asked if that ever caused errors in the spreadsheet. He rather sheepishly said that, yes, a couple of times they had forgotten to make the change in the Full Name column, and that once he had spelled the last name of a very good client differently in the two places. In that instance, he received a call from the customer, who asked him to correct the spelling.

Our conversation turned to the new Access database, which gave me an opportunity to educate my client about the power of Access and databases in general. I explained that the Full Name field can be derived from the First Name and Last Name fields (see Wizard issue November 2004 on atomicity for a review of concatenation), therefore eliminating the need to update the same information in two or more places, while still giving him the data in the configuration he needed, when he needed it. I also explained that redundant data (the same data entered in multiple places in the spreadsheet) is a common problem that can lead to many mistakes, and that deriving or calculating data from other data when it's needed is the best way to design a database application.

Dependency, or deriving data from other pieves of data, is the topic for this month's Wizard. There are effective ways to create dependent data (letting Access do the work), and some ways that are not so efficient (in most cases, manually entering the data). This month, we'll discuss why it is typically most effective to let Access do the work.

In this issue
  • Tip of the Month - The Microsoft Knowledgebase
  • The Basics
  • Examples of Problem Dependencies and Their Effects
  • There are Exceptions
  • Conclusion
  • Trap of the Month - Help! I have blank pages in my reports!

  • The Basics

    As a general rule, whenever data (e.g. the Full Name field) can be derived from other pieces of data within the database (e.g. the First Name and Last Name fields), it is a mistake to store the redundant data. There are multiple reasons this is true.

    First, data changes, and when it does, information that duplicates that data should change also. If you rely on the user to change the same information in two places, then you've opened the door for potential errors. The more opportunity there is for human error, the more frequently you'll have incorrect data.

    Secondly, if you have two fields that should have consistent data and they disagree, which one is right? In the above examples, how would you know which last name is correct? Unless you have a sophisticated auditing tool in place, you won't be able to tell which piece of data was most recently changed by a user.

    The last issue is storage. Creating a field for data that can easily be derived from other data takes up more storage space in your database. If your database is relatively small, and considering that the cost of storage has come down dramatically in the last few years, this argument is a minor issue. However, it could make a difference with very large databases.


    Examples of Problem Dependencies and Their Effects

    As I mentioned in the Introduction, there are effective vs. ineffective ways to create dependencies. The ineffective (or problem) dependencies result from poor database design decisions that open the door to potential errors. In this section, we'll take a look at specific examples of ineffective dependencies that typically occur in databases. As I'll demonstrate, these ineffective dependencies can be eliminated, yielding more consistent and accurate data.

    Here are some common examples of dependent data found in databases:

    Date of Next Review -I see this one a lot. It's common for a database to contain a field that stores a date that is dependent on another date. For example, many organizations require a yearly performance review. In this case, the employee database would have a field for "date of last review", as well as a field for "date of next review" (which is typically one year from the date of the last review). To carry this date of next review, however would be a mistake, let me explain why.

    When this date is stored, rather than calculated, the first problem comes in if the "date of next review" is manually calculated and entered by a human. A person must calculate the next year's review date (a chance to make a mistake every time a review is entered). If someone happens to get the original date wrong, and later changes it, s/he has to also remember to change the date of the next review - providing two chances to make a mistake. With just a tad more work in the beginning, the database can be designed with a query to calculate the "date of next review" exactly one year in the future, based on the "date of last review". For more information on how to do this, search for DateAdd in Access Help.

    Total Cost - This is a favorite among the accounting crowd. They design their database with one field to record the number of units bought or sold, and include the price per unit in another field. The problem comes in when they record the total cost in a third field and populate that field manually after crunching numbers on their calculators -- a prime opportunity to make a mistake. The dependent "total cost" field is easily calculated by Access in a query, form or report by multiplying the number of units by the price per unit (simply use the * symbol for multiplication). This method is of real benefit when, for example, the unit cost changes. In that case, the value must be changed in only one place.

    Age - I'm surprised when this one occurs, but it does! Some databases actually have a field for the age of an entity (e.g. person, pet, or object) which is populated manually. With this method, the age is guaranteed to be wrong eventually, simply because time passes. Manually updating this data would be a never ending task! This one is also easy to fix using the Date() function, which returns the current system date, and then using the DateDiff function to calculate elapsed years. Again, search for these functions in Access Help if you would like more information.

    Balance - This example is different from the previous examples in that the value is not only dependent on another field, but also on data that was previously entered in the same field (much like a spreadsheet used to keep track of a budget). Periodically I'll see a table that stores customer invoices (money owed to a company) and customer receipts, along with a field called Balance Due.

    Usually the balance due is not entered manually simply because the person who set it up realizes that one mistake in the beginning would lead to mistakes in subsequent entries. The problem arises, however, when an incorrect invoice or receipt is corrected. Every record in the Balance Due field after the correction, which depends on the invoice or receipt that was corrected, must also be updated. Writing the code to calculate or correct the Balance Due is time consuming (and that assumes that a human actually remembers to do so).

    Let's dig into this in more detail. Take a look at the following table:

    This table is designed to track receipts and invoices and to keep a running total of the balance due. The values in the Balance Due column are calculated manually. There are a couple of major problems with this design. First, the balance due is accurate only if the invoices and receipts are entered in the correct order. Second, if someone makes a mistake in entering data, the balance due will be incorrect for that date and for every row that comes after it. Every balance after the corrected row must be changed manually, one-by one (multiple chances to make a mistake there!), or by a program coded by the database administrator (a better solution, but time consuming to write).

    A much better approach to managing this information would be to design a table like this:

    Here both the invoices (money requested from the customer) and receipts (money actually received) are tracked in a single field, InvRcptAmt. Invoices are entered as positive amounts, receipts as negative (to show a credit against the invoice). This has the advantage of not requiring the entries to be in any specific order and allows for easy calculation of the balance due. The actual balance due (a value dependent on the numbers in the InvRcptAmt column) is not shown in this table, but it can be calculated in a query, form, or report. To do this, you would utilize the following function:

    Dsum("InvRcptAmt","tblInvRcpt","CustKey = 1")

    In this case, the function will return 6, the balance due for this Customer. I won't go into much detail about this Dsum function since I'll be covering that in a future Wizard. At this point, all you need to know is that it can add up a series of numbers and give you the result.

    Let me mention that this Dsum function is not the same as a running sum. With a running sum you get a balance run for every data row you show. The Dsum function, provides just a single number for all of the rows you show in your report or form.

    What if you really want a running sum (balance due at each point in time) in a report? As it turns out, Access has this ability built in. It's a tad complex and beyond the scope of this article; however, you can get the details by typing "running sum" into Access Help and you'll get a detailed explanation. If you need to get a running sum in a query or a report, it's also a bit complex, but Microsoft has also detailed how to do this. See http://support.microsoft.com/default.asp x?scid=kb;en-us;290136 for details.


    There are Exceptions

    There are times when striving for efficiency and calculating data from other data when you need it doesn't make sense. Sometimes it makes better sense to calculate and store the data in the database. For instance:

    • If you have data that doesn't change (e.g. historical data or data in a data warehouse), you may opt to store the data resulting from calculations in the database instead of calculating it in a query, form, or report.


    • If you have tons of data or complex calculations, the processing cycles required to make the calculations every time you need them may drag down the system and result in poor performance. In this case, you may consciously decide to trade the need for careful data entry and modification for improved performance by storing the calculated data in the database and recalculating only when necessary. This is a tricky decision and the recommendation will really depend on your particular circumstance. If you think you may fall into this category you may want to consult with your local Access guru, prior to committing to a design.

    • Theoretically, it is possible to derive any city and state from a Zip Code. In order to do this you must have and maintain a City/State Zip Code table. However, the overhead to set up and maintain this table is typically not justified.


    Conclusion

    Be on the lookout for data that can be created based on a dependency (calculated rather than stored in your application). Every time you avoid storing data that can be calculated on the fly, you reduce the opportunity for someone to make a mistake. Fewer mistakes result in better data, which ultimately leads to happier users!


    Trap of the Month - Help! I have blank pages in my reports!

    Sometimes when you run a report, you'll find that you have a blank page between each meaningful page of your report.

    The most likely cause of this problem is that the width of your report is greater than the width of your page. The limits here in the US are 8 ½ inches for portrait and 11 inches for landscape.

    The width of your report page will automatically expand as you move controls around in the design view. You can shrink back to an acceptable size in the report design view by hovering on the right edge of the report until your cursor changes to a double headed arrow. When it does, hold down the left mouse button and drag the right edge of the report to the left.

    When you're doing your calculation of total page width, make sure to include margins (you can see and change them in report design by choosing File | Page Setup from the main menu). Remember, in order to avoid the blank pages you must select a report width that is less than the physical width of the paper.

    © Copyright Custom Software All Rights Reserved


    Tip of the Month - The Microsoft Knowledgebase

    Microsoft maintains a tool called the Knowledgebase-a treasure trove of information featuring techniques, product bugs, and general information about its family of products. This resource is what the helpline techs at Microsoft use when you call them up and they don't know the answer to your question or problem.

    The Microsoft Knowledgebase is available at http://support.m icrosoft.com, and I frequently use it when I run into sticky problems. The Knowledgebase is home to the link I pointed you to earlier in this article for more information on running sums.

    Be aware that the Knowledgebase is not particularly friendly or easy to use; however, it may be worth a visit the next time you get stuck with a tough problem.

    Quick Links...

    Custom Software Home

    Access Wizard Archives

    Our Services



    Join our mailing list!

    Forward email

    This email was sent to jim@custom-software.biz, by jim@custom-software.biz
    Powered by

    Custom Software | - | Westford | MA | 01886