Friday, July 10, 2015


I'm following up on a question I got this week, asking "What is a taxonomy?" The Google answer leads to the Tree of Life and the grand classification we've assigned to all living things on Earth. That grand structure is getting an overhaul, by the way: creatures are being reclassified by parsing their DNA, rather than by the traditional grouping of observable features.

But in the context of records management, "taxonomy" has been taken over to describe a particular discipline of classification of information. More on that later. I'm going to lead in to this by describing a few other concepts, from the most disordered to the most ordered. This is not the historical progression, by the way. We are now living in the "folksonomy" era, where order, apparently, has been taken over by the mob.

It is in our nature to classify and organize. It helps us leap past the irrelevant, whether literally or figuratively, and swiftly capture our prey. Does any classification scheme work equally well, or are there some underlying principles? By way of example, please spot the flaws in this hypothetical classification:

The Celestial Emporium of Benevolent Knowledge
a) belonging to the Emperor
b) embalmed
c) tame
d) suckling pigs
e) sirens
f) fabulous
g) stray dogs
h) included in the present classification
i) frenzied
j) innumerable
k) drawn with a very fine camel-hair brush
l) et cetera
m) having just broken the water pitcher
n) that from a long way off look like flies
- Jorge Luis Borges, from “The Analytical Language of John Wilkins”

I suggest you take note of the flaws and take a moment to consider what makes them so wrong, because getting a sense of the possible flaws in a classification schema will save you worlds of hurt later.

The second grade of order is the "folksonomy", where the masses decide how to title and index their entries, with no requirement to align with what has gone before. An example of a folksonomy is the photo sharing site Flickr, which invites users to tag their photos with whatever terms are meaningful to them. Folksonomies are best served by large user groups and sharing, which allow common terms to dominate. Disorder gradually resolves to order. In this age of mass information sharing, folksonomies may be the only sensible way to establish any sort of order on the internet. There is no one group of dedicated professionals large enough to impose order on the sheer volume of information that is being generated.
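To make "common terms dominate" a little more concrete, here is a minimal Python sketch. The tags and the dominant_tags helper are invented for illustration; this is not how Flickr or any real site works, only the general idea of counting what the crowd chooses.

```python
from collections import Counter

# Hypothetical tags that four different users attached to similar photos.
user_tags = [
    ["california", "beach", "summer"],
    ["cali", "Beach", "vacation"],
    ["california", "summer", "sunset"],
    ["California", "beach"],
]

def dominant_tags(tag_lists, top_n=3):
    """Count how many users applied each tag and return the most common ones."""
    counts = Counter()
    for tags in tag_lists:
        # Normalize case, and count each tag at most once per user.
        counts.update({tag.strip().lower() for tag in tags})
    return counts.most_common(top_n)

# The two tags used by three of the four users rise to the top.
print(dominant_tags(user_tags, top_n=2))
```

Notice that nobody told the users to prefer "california" over "cali". With enough participants the popular spelling simply wins, and the stray variants sink to the bottom of the count.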

Where folksonomies fail us is when a particular record needs to be found, consistently, permanently. We regularly come across these demands in business. For instance, it may not be critical to find the same information twice if we are planning a "california" "summer" "vacation" two years in a row. Google will be our friend and will happily give us the most popular result. There is no assurance, however, that we will get the same result two years running. But if we need to find the "scope change" to "Project XYZ" that was asked for by our "Critical Customer" some time last fall, we had better find that particular scope change. This level of order, where particular records must be found, demands a planned structure that is defined and managed.

The next level of order is covered by the general term "classification", and broadly describes our natural tendency to impose order. The slots or boxes that we use could follow any useful structure, whether by location, features, or date.

Before computer tagging, selecting the hierarchy of classification was critical, because the first order of organization would govern all that followed, the indexes that would be manually maintained, and so on. Take the classic library catalog card, for instance, and imagine the labor required to maintain the indexing.

This brief index includes the title, author, description, publisher, year published, cross index terms, and classification number (PS3557.R5355 F57 1991). From its structure, I recognized this reference as the Library of Congress classification.

  • P stands for Language and Literature.
  • PS - American Literature
  • 3557 falls in the range covering the years 1961-2000 [I feel the whisper of the librarian's sigh. From PS1 to PS699, there is an attempt to sub-categorize American Literature by period, region, subject, poetry, prose... No more it seems.]

This is an example of a hierarchical structure that moves from the most general to the most detailed. A good design will guide a user to the most logical place for a record.
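The general-to-specific walk described above can be sketched in a few lines of Python. The split_call_number function and the CLASS_LABELS table are my own illustration, holding only the two labels mentioned here. A real schedule is far larger, and this simple pattern is an assumption that will not handle every call number format.

```python
import re

# Only the two class labels named in the text; everything else is "unknown".
CLASS_LABELS = {
    "P": "Language and Literature",
    "PS": "American Literature",
}

def split_call_number(call_number):
    """Split a call number such as 'PS3557.R5355 F57 1991' into its levels."""
    match = re.match(r"^([A-Z]{1,3})(\d+)\.?(.*)$", call_number.strip())
    if match is None:
        raise ValueError(f"unrecognized call number: {call_number!r}")
    letters, number, rest = match.groups()
    # Build the path from the most general class letter to the full class.
    codes = [letters[:i] for i in range(1, len(letters) + 1)]
    return {
        "path": [(code, CLASS_LABELS.get(code, "unknown")) for code in codes],
        "number": int(number),
        "remainder": rest.split(),
    }

parts = split_call_number("PS3557.R5355 F57 1991")
for code, label in parts["path"]:
    print(code, "-", label)
print(parts["number"], parts["remainder"])
```

Each step narrows the one before it: P, then PS, then the number within PS, then the remaining pieces that single out one particular book.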

However, when I recall a book from the library these days, the default search screen is a single box. I put in a term that is significant to me, and the search engine cruises the entire index, all levels, all terms. Keep this in mind when setting up a hierarchical structure: if the search tools are available, chances are users will bypass the structure altogether to find what they want.

The most common recommendation in business these days is a functional classification scheme, where the hierarchical structure reflects the functions of the business. This may be similar but not identical to the organizational structure. We don't want to re-order the records every time there is an organizational change. These schemes will be unique to the business, though there are some commonalities.

Here's one attempt to classify all business functions, developed by APQC:

Chances are you will recognize some of these titles as common to your business. In the APQC model, under each of these main categories, the functions are more fully described down to the activity level.

And, finally, we get to taxonomies. A taxonomy is a higher-order classification that requires the rigor of organizing by function (what the organization does), in a hierarchical fashion from the most general to the most specific, while also defining synonyms, related terms, and cross references. A fully developed taxonomy will guide a user to the correct section of the index regardless of the term she uses. Development of such a taxonomy demands time with business users, to capture all of their business activities, say, in workflow diagrams, and to record their common terms.
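As a rough sketch of what "regardless of the term she uses" means in practice, here is a toy taxonomy in Python. Every category name, synonym, and related term below is hypothetical, made up to echo the "scope change" example from earlier. A real taxonomy would come out of those sessions with the business users.

```python
# Hypothetical taxonomy: each preferred term has a functional path
# (most general to most specific) and a list of related terms.
TAXONOMY = {
    "scope change": {
        "path": ["Manage Projects", "Control Changes", "Scope Change"],
        "related": ["change order"],
    },
}

# Hypothetical synonyms that users might type, mapped to the preferred term.
SYNONYMS = {
    "change request": "scope change",
    "variation": "scope change",
}

def locate(term):
    """Return the index path for a user's term, or None if it is unknown."""
    key = " ".join(term.lower().split())  # normalize case and spacing
    preferred = SYNONYMS.get(key, key)    # fall back to the term itself
    entry = TAXONOMY.get(preferred)
    if entry is None:
        return None
    return " > ".join(entry["path"])

print(locate("Change Request"))
print(locate("scope change"))
```

Both lookups land in the same section of the index, which is the whole point: the user's vocabulary can vary, but the record has one home.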

Folksonomies: A User-Driven Approach to Organizing Content by Joshua Porter, April 26, 2005