To solve this problem, we need to
identify the addresses that are almost, but not quite the same, and then
send a single mailing to each address. We do that by using the Left
function (introduced in the December
2004 AccessWizard) to determine the duplicates, and then take the
first item that occurs.
Let's
start with the following table

This
table, like many I've seen, contains repetitive data. This is often
caused by importing data from different sources or poor control of data
entry -- almost certainly the case with the Tree Killing Catalog Company.
Our
goal is to send out only 5 mailings, one each to Sally Barnes, Rocco
Connell, Sam Jones, Mary Lawrence, and Patricia Quigley
If we
try the standard approach of either a "group by" or
"selecting unique values" in a query, we get:

This
query eliminated the exact duplicate record for Mary Lawrence, but we are
still left with the near exact match of Sally Barnes (10 Vinton Dr and 10
Vinton Drive) and Rocco Connell with his variations on Street.
To
solve this problem, we'll use a built-in Access function called Left to
get the first 6 characters of the Street Address. We will then use the
isolated string from the address field to select the first record in the
database with that address.
There's
no magic in the number 6, I just scanned the data and figured it would
that it was sufficient to make a match, but it depends on your database.
For example, in Maryland, street addresses typically have 4-6 digits in the
address, so you'd need to choose a higher number.
Back to
our left function -- it takes two arguments:
1. The
string that contains the characters
2. The number of characters to return.
We
construct it as follows in the query window:
AddressFirst6: Left ([Address], 6)
When we
put this into our "group by" query we get:

Note
that this is a "group by" or Totals query. Do you see the
highlighted ∑ (in some Access versions it will be depressed) button at
the top of the screen in the menu bar? That provides the "group
by" - it's a way of telling Access that the records that are exactly
the same should be grouped together. We've accepted the default of
"group by" for all fields except our primary key field,
Customer Key. For this field we've chosen First, which means that it will
not group on that field, but rather select the first one it comes to for
the rest of the grouped fields. This is an important issue since this is
the lever that we'll use to return the full field info from the table.
When we
run this we get:

We're
now down to 5 records, which is what we're looking for, however we still
don't have the full address. We do, however, have a result called
"First of Customer key", our query above. We'll use this to get
the unique records with the complete address.
Our
plan will be to save this query as qryAddressFirst6, then use that query
to join with our address table to get only the unique addresses.
In
design view it looks like this:

Notice
that we've joined the query (qryAddressFirst6) to our address table
(tblAddress). We've joined on the Customer Key on both sides. In our
qryAddressFirst6, it's the Customer Key that we came to in our group by
query. In this type of join (called an inner join - See the October
2004 issue for more information on joins and relationships), only
those records which are exactly alike in the joined fields are returned.
When we
run this we get:

Here we
have our 5 addresses with no duplicates that we can use to create our
mailing list.
|