Building an Analytical Database
What do you do when confronted with filing cabinets full of paper (accounting
records, purchase orders, sales reports) and you can’t follow the natural
human inclination to walk away or hand it to someone else? One answer is to
build a database, to capture and cross-reference data for pattern identification
and summarization. We faced this situation in investigating a $600 million Ponzi
scheme which generated enough documents to fill a 15’ by 10’ room to the
ceiling. Our database design and review guidelines may be useful to you in your
document-intensive challenges.
Categorize the available information
What documents do you have? What do they purport to represent? Where did
they originate? Are there totals, summaries, or reconciliations to indicate
when all items have been captured? Is any information available on magnetic
media, or will it need to be keyed in?
Identify relevant items of information
What are the issues under review? How do the documents relate to those
issues? How can you maintain an audit trail? What is the expected end product,
and where will it be used? How is it likely to be challenged?
Design and implement a structure to capture information
Even if all documents are imaged to optical storage (increasingly common in
document-intensive cases), optical character recognition and full-text
retrieval do not substitute for a well-designed database. When totals,
averages, or counts of particular items are needed, entering data elements
into a database supports useful analysis. The data structure needs to capture
all potentially relevant items – what might take a short time to enter on
the first pass through the documents may require mammoth efforts to add
missing data later on.
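As an illustration only, a capture structure of this kind might be sketched in Python with SQLite. The table and column names below are hypothetical, not the ones used in the investigation, and the sample row is made up. The point is that every captured record points back to its source document (the audit trail), and that fields which might matter later, such as phone numbers and product codes, are captured on the first pass.

```python
import sqlite3

def create_schema(conn):
    """Create two hypothetical tables: source documents and purchase orders."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS source_document (
            doc_id     INTEGER PRIMARY KEY,
            box_label  TEXT NOT NULL,   -- where the paper physically lives
            doc_type   TEXT NOT NULL,   -- purchase order, sales report, ...
            entered_by TEXT NOT NULL    -- operator who keyed the record
        );
        CREATE TABLE IF NOT EXISTS purchase_order (
            po_id         INTEGER PRIMARY KEY,
            doc_id        INTEGER NOT NULL REFERENCES source_document(doc_id),
            seller_name   TEXT,
            buyer_name    TEXT,
            buyer_phone   TEXT,
            upc           TEXT,
            amount_cents  INTEGER,      -- whole cents avoid rounding drift
            delivery_date TEXT
        );
    """)

# Made-up sample data, for demonstration only.
conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute(
    "INSERT INTO source_document VALUES (1, 'Box 12', 'purchase order', 'operator A')"
)
conn.execute(
    "INSERT INTO purchase_order VALUES "
    "(1, 1, 'Seller X', 'Grocer Y', '555-0100', '036000291452', 1250000, '2000-01-01')"
)

# Totals and counts come straight from the structured data.
total, count = conn.execute(
    "SELECT SUM(amount_cents), COUNT(*) FROM purchase_order"
).fetchone()
```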
Use all the information available
For example, in addition to accounting information, we captured phone
numbers for purchasers and sellers. We then obtained long distance telephone
call records and matched numbers called per the phone bill against calls
purportedly authorizing transactions. Comparison to a phone directory database
showed several anomalies. In one instance, numbers which the transaction logs
indicated belonged to two different groceries were actually listed to an auto
towing service in a different city. That anomaly might not be conclusive by
itself, but it provides a powerful indicator that something was wrong with the
transaction.
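A cross-check of this kind could be sketched as follows. This is a minimal Python illustration under stated assumptions, not the procedure used in the investigation: the record layouts, function names, and all sample names and numbers are hypothetical.

```python
import re

def normalize_phone(raw):
    """Reduce a phone number to its last ten digits so formats compare equal."""
    digits = re.sub(r"\D", "", raw)
    return digits[-10:]

def find_phone_anomalies(transactions, directory, calls_made):
    """Flag transactions whose phone number is listed to someone else,
    is missing from the directory, or never appears on the phone bill.

    directory:  dict mapping normalized number -> listed name (assumed layout)
    calls_made: set of normalized numbers taken from long distance records
    """
    anomalies = []
    for txn in transactions:
        number = normalize_phone(txn["phone"])
        listed = directory.get(number)
        if listed is None:
            anomalies.append((txn["id"], "number not in directory"))
        elif listed.lower() != txn["party"].lower():
            anomalies.append((txn["id"], f"listed to {listed}"))
        if number not in calls_made:
            anomalies.append((txn["id"], "no matching call on phone bill"))
    return anomalies

# Made-up sample data: two "groceries" share a number listed to a tow company.
transactions = [
    {"id": 1, "party": "Grocery A", "phone": "(555) 555-0101"},
    {"id": 2, "party": "Grocery B", "phone": "555-555-0101"},
    {"id": 3, "party": "Grocery C", "phone": "1-555-555-0102"},
]
directory = {"5555550101": "Towing Service", "5555550102": "Grocery C"}
calls_made = {"5555550102"}

anomalies = find_phone_anomalies(transactions, directory, calls_made)
```

Each flagged item is a lead for follow-up rather than proof; the list simply tells the reviewer which transactions to pull first.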
Establish validation routines to promote confidence in the data
Validation can range from monotonous (manually checking data entered
against source documents) to accounting-intensive (comparing totals to
transactions reflected in bank records) to high-tech (computerized comparison
of source document data to valid data obtained from external sources). For
example, this fraud involved grocery arbitrage, so we obtained a file of valid
Universal Product Codes (UPC) and compared UPCs reflected on purchase orders
against that valid file.
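A computerized comparison of this sort might look like the Python sketch below. It assumes 12-digit UPC-A codes and adds a standard check-digit test before the lookup against the reference file; the function names and sample codes are illustrative assumptions, not the routine used in the investigation.

```python
import re

def upc_check_digit_ok(upc):
    """Return True if upc is 12 digits with a correct UPC-A check digit."""
    if not re.fullmatch(r"[0-9]{12}", upc):
        return False
    digits = [int(c) for c in upc]
    # Digits in positions 1, 3, ..., 11 are weighted 3; positions 2, ..., 10 weigh 1.
    total = 3 * sum(digits[0:11:2]) + sum(digits[1:11:2])
    return (10 - total % 10) % 10 == digits[11]

def classify_upc(upc, valid_upcs):
    """Classify a UPC taken from a purchase order against a reference set."""
    if not upc_check_digit_ok(upc):
        return "malformed"
    if upc not in valid_upcs:
        return "not in reference file"
    return "ok"

# Hypothetical reference file loaded into a set for fast lookup.
valid_upcs = {"036000291452"}

results = {
    upc: classify_upc(upc, valid_upcs)
    for upc in ["036000291452", "036000291453", "012345678905"]
}
```

A code that fails the check digit was probably mis-keyed or invented, while a well-formed code absent from the reference file deserves a closer look at the underlying document.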
Standardize and expedite data entry
Set up a table of commonly used terms (names of banks, product names and
descriptions, frequent customers, standard comments, etc.) to speed up data
entry as operators select from a list of choices. But make sure that valuable
information is not lost in the process. For example, consistent misspellings
in documents for fraudulent transactions could be masked by the use of a data
table and "cleaning up" the database, concealing potential
indicators of fraud.
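One way to get the speed of a standard list without discarding what the document actually says is to store both values. The Python sketch below is an illustration under assumptions: the bank names, field names, and similarity cutoff are hypothetical. The text as written on the document is always kept, and a near match is offered only as a suggestion, never silently substituted.

```python
import difflib

# Hypothetical list of standard terms offered to data entry operators.
STANDARD_TERMS = ["First National Bank", "Second State Bank"]

def record_entry(raw_text, standard_terms=STANDARD_TERMS):
    """Keep the text exactly as written, plus any standard term it matches."""
    by_key = {" ".join(t.lower().split()): t for t in standard_terms}
    key = " ".join(raw_text.lower().split())
    exact = by_key.get(key)
    # Near matches are suggestions only; the raw spelling stays on file.
    close = difflib.get_close_matches(key, list(by_key), n=1, cutoff=0.8)
    suggestion = by_key[close[0]] if close and exact is None else None
    return {"raw": raw_text, "standard": exact, "suggestion": suggestion}

entry = record_entry("Frist National Bnak")
clean = record_entry("first  national bank")
```

Because the raw spelling survives, a later query can still group records by a recurring misspelling.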
Use the database to search for patterns
After compiling data on thousands of purchase orders and related documents,
we reviewed summaries of data elements in various combinations. We noted that
legitimate transactions were generally paid by a grocery company check within
30 days of delivery, while bogus deals were settled by wire transfers from a
known confederate company 85 days after purported delivery. Other patterns
(geographic location, reciprocal trading, size of transactions, person
authorizing) became clear through study, aiding in development of an effective
checklist to test whether or not a transaction was genuine.
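A summary of the settlement pattern described above could be produced along these lines. This is a hedged Python sketch: the field names, company names, and dates are made up for illustration, and only the 85-day figure comes from the text.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

def settlement_profile(transactions):
    """Group transactions by (payment method, payer) and summarize
    how many there are and how long they took to settle."""
    groups = defaultdict(list)
    for t in transactions:
        days = (t["paid"] - t["delivered"]).days
        groups[(t["method"], t["payer"])].append(days)
    return {
        key: {"count": len(days), "avg_days": mean(days)}
        for key, days in groups.items()
    }

# Made-up sample data for demonstration only.
delivered = date(2000, 1, 1)
transactions = [
    {"method": "check", "payer": "Grocery Co",
     "delivered": delivered, "paid": delivered + timedelta(days=28)},
    {"method": "check", "payer": "Grocery Co",
     "delivered": delivered, "paid": delivered + timedelta(days=30)},
    {"method": "wire", "payer": "Confederate Co",
     "delivered": delivered, "paid": delivered + timedelta(days=85)},
    {"method": "wire", "payer": "Confederate Co",
     "delivered": delivered, "paid": delivered + timedelta(days=85)},
]

profile = settlement_profile(transactions)
```

Swapping the grouping key for location, authorizing person, or transaction size gives the other combinations mentioned above.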
Back up your data frequently
Our data entry teams worked for months against tight deadlines and at great
cost. Losing even a portion of the database to system crashes, operator error,
or deliberate incursion could mean the failure of the entire project, as we
needed to control data integrity throughout the process. We couldn’t afford to
waste more than a day in recovering from a data disruption, so we made
comprehensive daily backups, with even more frequent attention paid to
"mission-critical" files, and kept copies of backups securely
off-site.
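For a database kept in SQLite, a dated daily backup could be sketched in Python as below. This is an assumption-laden illustration, not the procedure we used: the file names and the "offsite" folder are hypothetical stand-ins, and a real off-site copy would go to separate, secured media.

```python
import sqlite3
import tempfile
from datetime import date
from pathlib import Path

def backup_database(db_path, backup_dir, today=None):
    """Copy a SQLite database to a dated file and return the new path."""
    today = today or date.today()
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / f"{Path(db_path).stem}-{today.isoformat()}.sqlite"
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(target)
    try:
        src.backup(dst)  # consistent copy even if the source is in use
    finally:
        dst.close()
        src.close()
    return target

# Demonstration in a temporary directory with made-up data.
with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "case.sqlite"
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE po (id INTEGER PRIMARY KEY, amount INTEGER)")
    conn.execute("INSERT INTO po (amount) VALUES (100)")
    conn.commit()
    conn.close()

    copy_path = backup_database(db_path, Path(tmp) / "offsite", date(2000, 1, 1))

    # Verify the backup by reading it, not just by checking that it exists.
    check = sqlite3.connect(copy_path)
    copied_rows = check.execute("SELECT COUNT(*) FROM po").fetchone()[0]
    check.close()
    copy_name = copy_path.name
```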
For More Information Contact:
Bill Black
William H. Black, PC
Tel: 770.698.8020
FAX: 770.399.6731
Internet: http://billblackcpa.com
Email: whb@billblackcpa.com