Increasing volumes of digital records (aka ‘big data’) have led, through information practices, to very large managed information collections where the whole may be greater than the sum of its parts.
The collection is an artifact (an aggregate object) that may have the potential for emergent properties. Whether the records collection resides in SharePoint, another electronic document management system (EDMS), or the file system is probably irrelevant, because these are essentially just storage containers.
Combining the triplet of statistical vector space models, knowledge representations (such as ontologies, taxonomies and authority lists) and natural language processing rules creates a recipe that has the potential to turn raw ingredients (text) into something more than the sum of its parts.
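As a minimal sketch of how a statistical vector space and a knowledge representation can combine, consider term-frequency vectors whose terms are expanded through a toy taxonomy before computing cosine similarity. The documents and the taxonomy here are invented for illustration, not drawn from any real collection.

```python
import math
from collections import Counter

# Invented toy taxonomy: maps specific terms to a broader concept.
taxonomy = {"invoice": "finance", "receipt": "finance", "memo": "correspondence"}

def vectorize(text):
    """Term-frequency vector, with taxonomy concepts added as extra terms."""
    terms = text.split()
    expanded = terms + [taxonomy[t] for t in terms if t in taxonomy]
    return Counter(expanded)

def cosine(u, v):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

a = vectorize("invoice for supplies")
b = vectorize("receipt for supplies")
# Statistically the two texts share only 'for' and 'supplies'; the taxonomy
# adds a shared 'finance' concept, which raises their similarity.
print(round(cosine(a, b), 3))  # → 0.75
```

Without the taxonomy expansion the similarity of these two vectors would be lower (about 0.667), which is the point of combining the statistical and knowledge-based layers.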
Through text analytics techniques embedded in search-based applications, this may yield differentiating insights from latent associations and trends between words that appear together in no single explicit record, within collections too large for a human to read in practice.
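A latent association of this kind can be sketched in the style of Swanson's "A-B-C" literature-based discovery: two terms that never co-occur in any single document are linked through intermediate terms that co-occur with both. The toy corpus below is invented for illustration.

```python
from itertools import combinations
from collections import defaultdict

# Invented toy corpus: no single document mentions both 'fish' and 'raynaud'.
docs = [
    "fish oil reduces blood viscosity",
    "high blood viscosity worsens raynaud syndrome",
    "fish oil lowers platelet aggregation",
    "platelet aggregation is elevated in raynaud syndrome",
]

# Co-occurrence map: term -> set of terms it appears with in some document.
cooc = defaultdict(set)
for doc in docs:
    for a, b in combinations(set(doc.split()), 2):
        cooc[a].add(b)
        cooc[b].add(a)

def indirect_links(term_a, term_c):
    """Intermediate terms B that co-occur with both A and C, provided A and C
    never appear together in any single document (i.e. the link is latent)."""
    if term_c in cooc[term_a]:
        return set()  # the association is explicit, not latent
    return cooc[term_a] & cooc[term_c]

print(sorted(indirect_links("fish", "raynaud")))
# → ['aggregation', 'blood', 'platelet', 'viscosity']
```

The association between "fish" and "raynaud" is present in no explicit single record; it only emerges from the collection as a whole.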
Whether this potential is actualized may depend on whether the organization has the means to make that transformation.
Consider a scenario where an entire collection of digital records is automatically analyzed and presented to the user through a series of algorithmically constructed, search-driven prompts to browse.
Here the categories have been constructed from the text itself, rather than superimposed as a priori categories, where we can be blinded by what we already know. This could be significant for human information behavior and search outcomes: the business professional is no longer limited by their own agency, in terms of a priori knowledge of keywords, or by the a priori knowledge of specialists creating pre-defined categories and taxonomies, as the means to explore and discover new knowledge.
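One minimal way such categories can be constructed from the text rather than from a pre-defined taxonomy is to promote terms by document frequency into candidate browse prompts. The documents and stopword list below are invented for illustration; a production system would use richer statistics than this sketch.

```python
from collections import Counter

# Invented toy collection of record titles.
docs = [
    "contract renewal for office lease",
    "office lease termination notice",
    "staff training budget approval",
    "training schedule for new staff",
]
stopwords = {"for", "new", "of", "the"}

# Document frequency: in how many documents does each term occur?
doc_freq = Counter()
for doc in docs:
    doc_freq.update(set(doc.split()) - stopwords)

# Terms appearing in more than one document become candidate browse prompts,
# derived from the collection itself rather than from an a priori taxonomy.
categories = sorted(t for t, n in doc_freq.items() if n > 1)
print(categories)  # → ['lease', 'office', 'staff', 'training']
```

No specialist decided in advance that "lease" or "training" should be browse categories; the collection surfaced them.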
Consider another scenario where an entire collection of digital records is automatically analyzed in a way that supports analogue hunting.