Introduction to Document Management
Document management can mean many things
to many people, and can serve a variety of purposes. The
intention of this article is to list the components involved
in document management and to briefly describe each one.
At the conclusion, there are also some important factors
to help in selecting a document management system.
If you’ve never used a document
management system, then it is entirely possible that you
aren’t aware of how valuable these products can be.
Companies and individuals who manage a diverse array of documents
have found that document management systems serve to simplify
their lives and make both storing documents and later obtaining
those documents much easier.
Many companies are forced to go the way
of electronic documents because of the Sarbanes-Oxley Act
of 2002, industry compliance rules (such as HIPAA), or because it is required
by their customers or vendors. The simplest form of electronic
document management is storing files in an organized directory
and categorizing files by the folder in which they are located.
If your company manages more than a few documents, this method
can quickly become very inflexible. Incorrect filing can
cause a document to disappear into a virtual black hole,
never to be seen again.
The entire process of document management
can be broken down into four categories: file capture, file
processing, file management and file storage. A company may
require one, two or all four of these processes.
File Capture
File capture may consist of scanning paper
documents, capturing existing electronic files (e.g., .doc,
.pdf, and .tif files, including previously scanned documents), and capturing documents
from applications with print drivers.
Scanning – If you
have a large quantity of paper documents that need to be
scanned and introduced to a document management system (DMS),
then you must consider two things: (1) how you want the information to
be stored and retrieved; and (2) how you want indexed information
to be introduced to the DMS.
The way you plan to access
the documents later will determine the file format in which
the files are saved. The most common formats for
scanning output are .tif and .pdf. The advantage of .tif
files is that they are typically small and therefore
take up little storage space. The advantage
of .pdf files is that they provide better options for
content text searching (searching every word of a document),
are easier to edit, and are overall more flexible. If you
will be using content text search, you might lean toward
.pdf output, but if you will be retrieving files from indexed
information only, you might prefer .tif.
The method of introducing indexed information
(any field used to search and categorize documents) can range
from fully automated to fully manual to somewhere in between.
The more automated the process, the more file processing
will be required. File processing will be covered later in
this article. Extensive automation will make your project
more complex and costly, but if you handle a large volume
of documents, the automation will quickly pay off in the
form of reduced manpower.
The actual physical scanning of the documents
can also be fairly automated with batch scanning, bar codes,
and database validation. Batch scanning reduces the labor
in introducing the documents to the scanner. Instead of the
scanner operator separating every document, scanning it separately,
and then saving the file into a directory, the operator simply
places all of the documents into a feeder in bulk. The
scanner then detects a document change by a blank sheet,
bar code, or some other indicator. Bar codes can also be
used to represent a group of information or a client or project
to populate multiple indexed fields. If you have an existing
client database in your current ERP system or even QuickBooks,
this data can also be used to populate or validate indexed
fields.
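The separator logic described above can be sketched in a few lines. This is only an illustration, assuming each scanned page has already been classified as a blank sheet, a bar-code sheet, or a normal page; the marker strings and the "key=value" bar-code payload are hypothetical and not taken from any particular scanner software.

```python
from typing import List

SEPARATOR_BLANK = "BLANK"      # hypothetical marker for a blank separator sheet
BARCODE_PREFIX = "BARCODE:"    # hypothetical marker for a bar-code cover sheet


def split_batch(pages: List[str]) -> List[dict]:
    """Split one scanned batch into documents at blank or bar-code sheets."""
    documents = []
    current = {"index": {}, "pages": []}
    for page in pages:
        if page == SEPARATOR_BLANK or page.startswith(BARCODE_PREFIX):
            # A separator closes the document in progress, if it has pages.
            if current["pages"]:
                documents.append(current)
            current = {"index": {}, "pages": []}
            if page.startswith(BARCODE_PREFIX):
                # One bar code can populate several indexed fields at once,
                # e.g. "BARCODE:client=Acme;project=P-17".
                payload = page[len(BARCODE_PREFIX):]
                for pair in payload.split(";"):
                    key, _, value = pair.partition("=")
                    current["index"][key.strip()] = value.strip()
        else:
            current["pages"].append(page)
    if current["pages"]:
        documents.append(current)
    return documents


batch = ["BARCODE:client=Acme;project=P-17", "page1", "page2", "BLANK", "page3"]
for doc in split_batch(batch):
    print(doc["index"], doc["pages"])
```

In a real deployment the indexed values read from the bar code would then be validated against the client database before the document is filed.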
The most common misconception about scanners
and scanner software is that they need to be compatible with
your DMS. Most scanner software will output the scanned files
into a directory where they can be handled by any DMS.
Capturing electronic files – Existing
electronic files, such as .doc, .xls, .dwg, and .dgn,
are easier to capture into a DMS. These files contain hidden
properties called metadata, which can be mapped to a field
in the DMS. This information might include created date,
author, title, title block, and other useful information.
Once this data is mapped, these fields will be automatically
populated when the electronic files are introduced. These
files may be saved to a directory into which the DMS imports
them, files may be dragged into the system, or a mass import
may be used to bring in legacy documents. During the import
process, other index fields may be populated from the hierarchy
of the directory structure of the files. Unorganized documents
are more difficult to manage this way. Index fields which
are not “intuitively” populated in some manner
may require processing or some manual input.
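As a rough sketch of how index fields might be populated from the directory hierarchy during an import, the following assumes a legacy folder layout of client/project/document type. The field names and the layout are assumptions made for illustration; a real DMS import utility would have its own mapping configuration.

```python
from pathlib import PurePosixPath

# Hypothetical mapping: folder depth -> DMS index field name.
FOLDER_FIELDS = ["client", "project", "doc_type"]


def index_from_path(relative_path: str) -> dict:
    """Derive index fields from a file's position in the folder hierarchy."""
    path = PurePosixPath(relative_path)
    folders = path.parts[:-1]  # every path component except the file name
    fields = {"title": path.stem, "extension": path.suffix.lower()}
    for name, folder in zip(FOLDER_FIELDS, folders):
        fields[name] = folder
    # Fields with no matching folder level are left for manual input.
    fields["needs_manual_input"] = [n for n in FOLDER_FIELDS if n not in fields]
    return fields


print(index_from_path("Acme/P-17/Invoices/INV-0042.PDF"))
print(index_from_path("Acme/scan001.tif"))
```

The second call shows why unorganized documents are harder to import: with only one folder level available, two of the three fields cannot be filled automatically.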
Electronic files may also be captured
through print drivers. If you commonly print reports from
an application or save them to a directory, you may use a
print driver to introduce them to the DMS, which will ultimately
save time. Faxes may also be saved to a directory from which
the DMS can pick them up for distribution before they ever
go to paper.
File capture is relatively easy, but a
simple digital file without any additional processing isn’t
much use. You may take the file and name it and file it in
Windows Explorer, but when you are cataloging hundreds or
thousands of files, this is not a feasible system. Human
error, if nothing else, will prevent every single file from
being named correctly and stored in the proper location.
Even one misplaced file can wreak havoc for a business.
File Processing
File processing can help make files more manageable. Examples
of processing tasks include separating and merging; OCR;
zonal OCR; forms recognition; conversion; routing; and database
(DMS) population. Some of the processing tasks can be completed
with scanning software and/or your DMS. Files can be processed
during the scanning process or years after they are scanned.
OCR (optical character
recognition) allows scanned documents to undergo content
text searching once the document is added into your system.
Word, Excel, and other digital files do not have to undergo
the OCR process to be content-searchable. Indexing the documents
makes the content search very fast, even if you are searching
through thousands of files.
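One common way to make content search fast is an inverted index, which maps each word to the documents that contain it so that a search never has to reread the files. The sketch below is a minimal illustration of that idea, not a description of how any particular DMS is built; the file names and text are made up.

```python
import re
from collections import defaultdict


def build_index(documents: dict) -> dict:
    """Map every word to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index: dict, query: str) -> set:
    """Return the ids of documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


documents = {
    "a.pdf": "Invoice 1001 for Acme",
    "b.pdf": "Contract for Acme",
    "c.pdf": "Invoice 1002 for Beta",
}
index = build_index(documents)
print(search(index, "invoice acme"))
```

The index is built once, when documents are added (after OCR for scanned files), and each later search is a handful of set lookups regardless of how many files are stored.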
Recent improvements in OCR make the process
very accurate (up to 99%). However, the accuracy of the OCR
is dependent on the quality of the document and, to some extent,
the hardware used to scan the documents. Most companies are
happy to enjoy the benefits of OCR and content text search
even with its imperfections.
Zonal OCR (OCR of a specific
zone on a page) – Scanned documents can also be processed to
find certain information on the document and input it into
fields in your document management system. For example, an
invoice number may be required to organize and store the
document. The location of the invoice number is predetermined
in a template, and the number is then read and input into
the document management system. This process is called “forms
recognition,” and may include many fields of information
from a single document. Depending on the type of documents
and the quantity of fields to be populated, this process
can be both complex and expensive, so it is important to
weigh the cost against the benefits.
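To make the template idea concrete, here is a minimal sketch of zonal extraction. It assumes an OCR engine has already returned each recognized word with its position on the page (expressed here as fractions of the page width and height); the template coordinates and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Word:
    text: str
    x: float  # left edge, as a fraction of the page width
    y: float  # top edge, as a fraction of the page height


Zone = Tuple[float, float, float, float]  # (left, top, right, bottom)

# Hypothetical template: where each field is expected on this form type.
INVOICE_TEMPLATE: Dict[str, Zone] = {
    "invoice_number": (0.70, 0.05, 1.00, 0.10),
    "customer": (0.05, 0.15, 0.50, 0.20),
}


def extract_fields(words: List[Word], template: Dict[str, Zone]) -> Dict[str, str]:
    """Collect the words that fall inside each template zone."""
    fields = {}
    for name, (left, top, right, bottom) in template.items():
        inside = [w for w in words if left <= w.x < right and top <= w.y < bottom]
        inside.sort(key=lambda w: (w.y, w.x))  # reading order within the zone
        fields[name] = " ".join(w.text for w in inside)
    return fields


page = [
    Word("INV-0042", 0.75, 0.06),
    Word("Acme", 0.06, 0.16),
    Word("Corp", 0.12, 0.16),
    Word("Total", 0.10, 0.90),
]
print(extract_fields(page, INVOICE_TEMPLATE))
```

Each additional form layout needs its own template, which is one reason the cost grows with the variety of documents and the number of fields.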
File Management
There are many different types of applications
available on the market with which to manage files. Choosing
the one that is right for you can be complicated, and sometimes
requires a consultant. Companies that choose to create their
own systems are reinventing the wheel and will be forced to
replace that system at some point in the future. Some critical
issues to consider when choosing the system that is right
for you are:
1. Types of documents you are managing (working vs. final)
2. Internal and External Requirements
3. Browser or Desktop Interface
4. Cost of Ownership
Final vs. Working Documents
A final document, such as a contract or an invoice, may
not need to be edited at a later date. It is saved for
reference and/or retrieval purposes, and will not necessarily
be needed again. These are called “final documents”.
Managing final documents is much cheaper and easier than
storing documents that require editing capabilities. If
you simply want to scan final documents and store them
for later retrieval, you may only require a simple and
inexpensive DMS.
A working document, on
the other hand, will need to be revised on one or more occasions.
These types of files might include manuals, sales literature,
or CAD files. The author or other colleagues may need to
edit them, or they may need distribution for specific purposes.
A more advanced and versatile document management system will
be needed so that the user can track changes, implement markups
and revise text.
Internal and External Requirements
It is always helpful to compile a list
of requirements for your proposed system, even if it is very
brief. The list should include requirements from users, industry
requirements, your internal IT requirements, and customer/vendor
requirements if they will have access to the system. A system
that does everything you require is worthless if your IT
department decides it does not comply with company policies.
Involving your IT department from the beginning is usually
a good idea.
Some industries have compliance rules
of which you will need to be aware. The health industry is
subject to HIPAA compliance, which requires a secure database,
audit trails, and passwords for any e-mail from the DMS.
All companies are subject to requirements of the Sarbanes-Oxley
Act of 2002, which was passed to prevent companies
from shredding documents and claiming stupidity as an excuse
for breaking the law. Internal IT requirements may prevent
you from using a specific type of database or prevent you
from providing external browser access to your system through
a portal.
Web vs. Desktop
Something else to consider in your research
should be whether or not you need a “web” or “browser
version” of document management. You may have seen “web-based” document
management systems, but that is not what we are discussing
here. Web-based document management systems host your
documents at an “offsite” location and allow
your users to access the host’s system. Most companies are not
keen on this idea because they lack control over what happens
to these documents. However, outsourcing the hosting of your
documents may be advantageous if you do not possess an IT
infrastructure.
Most document management systems offer both
a desktop version and a browser version. You may use both
as opposed to having to decide between one or the other.
Most vendors offer only a very limited browser
version; many are limited to search, view, and print capabilities
only.
Browser interfaces can be very beneficial because they
eliminate the need to install the application on every desktop.
They can also be used to access documents from anywhere
in the world, and they can provide a significant increase in
security depending on your method of storage (see the security
section later in this article). Browser applications access
the documents through “services”, and as a result,
the directories can be locked down to prevent “back-door” access
while still allowing storage in the native file format.
If your DMS contains an internal viewer
in a browser interface, you may give users the ability to
view documents without ever possessing the documents and
without having the native application on their desktop.
Using the combination of desktop and browser
versions is the ideal situation because some users may prefer
the desktop, like administrators and data entry users, while
standard users, customers, contractors, and vendors will
be better suited to the browser version. If you need to share
or distribute documents with external parties, then a browser
version could save you an enormous amount of time and money
in a relatively short period of time.
File Storage and Security
Application (User Profile) Security – polices
which users have access to specific files or projects
as well as which users have permission to print, edit,
or otherwise alter files.
Directory Security – polices access
to the documents via Windows Explorer or other directory
tools.
Most document management systems have
limited levels of Application Security within the application.
Usually included are an administrative level, a user level,
and a “view only” level. Some systems will allow
you to dictate how many levels you want and exactly which
functions are allowed for each specific level.
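The three typical levels can be pictured as a simple mapping from level to permitted functions. The sketch below is illustrative only; the level names follow the text, but the specific actions assigned to each level are assumptions, and a configurable DMS would let an administrator define this table rather than hard-code it.

```python
from enum import Enum


class Action(Enum):
    VIEW = "view"
    PRINT = "print"
    EDIT = "edit"
    DELETE = "delete"
    MANAGE_USERS = "manage_users"


# Hypothetical assignment of functions to each security level.
LEVELS = {
    "view_only": {Action.VIEW},
    "user": {Action.VIEW, Action.PRINT, Action.EDIT},
    "administrator": set(Action),  # every defined action
}


def is_allowed(level: str, action: Action) -> bool:
    """Return True if the given security level permits the action."""
    return action in LEVELS.get(level, set())


print(is_allowed("view_only", Action.PRINT))
print(is_allowed("administrator", Action.DELETE))
```

An unknown level is denied everything, which is the safer default when a user profile is misconfigured.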
There are three main forms of document
security. Depending on your specific goals, one of them may
be more beneficial for you than the others. Below is a brief
discussion of each form of document security.
Database Blob. This is
the most “secure” method of securing documents,
but this security does not come without a price. The files
are not stored in their native file format; rather, they
are converted into another form in the database.
Blobs also become very large. A file may
become 5-10 times larger when converted into a database blob.
Files with associated reference files, as with CAD files,
will lose their association because of name changing. Add-on
products are available to address this issue, but they can
be expensive and slow down your system. Blobs will also prevent
you from accessing your documents from an alternative method
if your document management system becomes unavailable.
Encrypting Files. Encrypting
documents changes the names of the files so that they cannot
be accessed or opened from the directory. A user may browse
the location in Windows Explorer, but that individual won’t
be able to identify a file or document by its filename. They
also would not be able to open the file in its native application because it
is encrypted and must be opened through the document management
system.
One of the disadvantages is that a user
can delete the files if they can find them (you cannot “lock” the
directory because the desktop document management application
needs access to the directory). You can mitigate
this problem with regular backups of the system. Encrypted files
have the same naming problems as blobs: reference links are
lost, and you may be held hostage by the DMS if it goes down.
Native File Format Storage. This
is the process of storing files in a directory in their original
format. This is the most flexible method because the files
are not altered. The administrator also maintains control
of the access to the documents regardless of what happens
with the DMS application.
The disadvantage is that the directory
must remain unlocked (as with encryption) for a desktop application
to access the documents. If you need to restrict access to
these documents outside of procedural regulations, then you
can store the documents on a hidden directory so that the
users do not know how to navigate the document repository.
This method is very effective but, unlike the blob method,
it is not 100% secure.
The ideal security solution would
be to store the files in their native format, but only allow
the users to access the system through a browser interface.
The browser accesses the files through services, so the directory
may be locked down to prevent accessing documents through
the “back door”. The browser option gives you
the best of both worlds: 100% file security with flexibility
and optimal storage capacity.
Benefits of Document Management
Most companies are pleasantly surprised
by added benefits of document management that they had never
considered. Document management saves time, money
and anxiety over the storage and transmission of important
documents. These are a few of the most common benefits that
companies enjoy after implementing a document management
system.
1) Time & Resource Management. The
managed distribution (transmittals) of documents is a time-consuming,
monotonous process for many companies. One or more employees
may spend hours trying to find specific documents; distributing
them to customers, clients and vendors; and then logging
the transactions into an Excel spreadsheet.
If you are using versions in this process,
the logging becomes even more time consuming. Natural byproducts
of a DMS include fully automated tracking and distribution.
The tasks that once took hours are now reduced to a matter
of seconds. Billable hours for employees who manage documents
and files are cut back significantly, and the audit trails
are assured to be accurate.
2) Workflow. Imagine
a real estate company with several agents who each have fifty
clients. Hundreds of documents are required for every real
estate transaction, and it is difficult to keep up with the
placement of these files. Document management allows them
to keep track of which documents need to be completed and
when, while providing a snapshot of the placement of each
document, all at once.
With an effective DMS this real estate
company can organize all of the documents by client and track
the status of all the documents. When an agent needs to know
the status of a client’s paperwork or wants to make
sure all of the paperwork is complete, they can quickly pull
up a snapshot of all of the documents without leaving the
desk.
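The status snapshot in the real estate example amounts to comparing the documents filed for each client against a checklist of required documents. A minimal sketch follows; the document names and client names are invented for illustration, and a real DMS would typically drive this from its own workflow configuration.

```python
# Hypothetical checklist of documents required for every transaction.
REQUIRED = ["listing_agreement", "disclosure", "purchase_contract"]


def status_snapshot(filed: dict) -> dict:
    """For each client, report which required documents are still missing."""
    snapshot = {}
    for client, documents in filed.items():
        missing = [name for name in REQUIRED if name not in documents]
        snapshot[client] = {"complete": not missing, "missing": missing}
    return snapshot


filed = {
    "Garcia": {"listing_agreement", "disclosure", "purchase_contract"},
    "Nguyen": {"listing_agreement"},
}
for client, status in status_snapshot(filed).items():
    print(client, status)
```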
3) Shipping Costs. The
examples given above don’t even touch on how much a
business can save on shipping costs. How much time do you
spend tracking down documents, packaging them, and shipping
them out by FedEx because someone at another location needs a copy?
A good DMS will allow you to quickly find documents, distribute
them, and keep a record of what you sent to whom automatically.
Cost of Ownership
Cost of ownership considerations include
system complexity, hardware requirements, licensing methodology,
and module add-ons.
If you choose a complex DMS you may have
to assign a full-time administrator to manage the system.
This is not very feasible for most companies that want to manage
documents for a small group of users. Configurable systems
that allow a “part-time” administrator to perform
simple tasks will save you a lot of money in the long run.
Some systems require multiple servers
and additional software to operate. Be careful that you do
not forget to budget the “additional requirements” just
to make the DMS work.
“Named seat” licensing can
make a DMS very expensive. Try to find an application with “concurrent” or “floating” licenses. “Named
seat” licensing will require you to purchase a license
for every single user that needs to access the DMS for any
reason. The total quantity of licenses could be 3-4 times
greater for a DMS with “named seat” licensing.
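The difference is easy to see with a little arithmetic. The sketch below assumes, purely for illustration, 40 people who need occasional access, a peak of one quarter of them using the system at the same time, and a made-up price of $500 per license; none of these figures come from a real vendor.

```python
import math


def named_seat_licenses(total_users: int) -> int:
    """Named seat: one license for every person who ever needs access."""
    return total_users


def concurrent_licenses(total_users: int, peak_share: float) -> int:
    """Concurrent: enough licenses to cover the busiest moment only."""
    return math.ceil(total_users * peak_share)


PRICE_PER_LICENSE = 500  # hypothetical price in dollars

named = named_seat_licenses(40)
floating = concurrent_licenses(40, 0.25)
print("named seat:", named, "licenses,", named * PRICE_PER_LICENSE, "dollars")
print("concurrent:", floating, "licenses,", floating * PRICE_PER_LICENSE, "dollars")
```

Under these assumptions the named seat model needs four times as many licenses. In practice the per-license price usually differs between the two models, so compare total cost rather than license count alone.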
Add-on modules can quickly send your company
into bankruptcy. Some DMS developers offer their systems
in modules and sell you an individual module according to
the functionality you desire. You might like this at first
if you only need small portions of the system, but over the
long run you will pay far more than if this functionality
were included in the base system.
DMS Pricing
The cost of your system is directly related
to your requirements. If your requirements are simple and
you can use “off-the-shelf” software without
forms recognition, you can probably acquire a system for
under $5,000. More complex systems that require onsite implementation
can range from $20,000 into the millions of dollars. The
larger DMS companies that have been around for many years
are considerably more expensive than the newer players.
The largest DMS companies will rarely install a system for
fewer than ten users or for less than about $100,000.
Newer players in the market may have more modern code and better
functionality, and they are more eager to install even the
smaller systems.
Don’t forget to also consider the
costs of importing legacy documents and setting up document
processing procedures if they are required. The DMS vendor
you choose should provide a utility for importing your legacy
documents and most of the metadata. The manual process can
cost you thousands depending on your quantity of documents.
Setting up document processing procedures and particularly
forms recognition can be very expensive depending on the
complexity. The needs analysis of document scanning and data
capture is separate from that of DMS.
In summary, choosing
a DMS can take some time and effort, but selecting the wrong
system can waste valuable time and money. If your needs are
simple, you can choose a system within a few days. If you
require a more complex system, and you do not have internal
resources experienced in document management, then save yourself
time and resources by hiring a consultant. An experienced
consultant can help you narrow your selection in a fraction
of the time it would take you otherwise. Once you have two
or three choices that are a good fit, select the one with
which you are the most comfortable.