What do you want?
Whose side are you on?
That would be telling. We want information. Information. Information.
You won’t get it!
Look at your estate, now back to me, look at all the documents you have, now back to me. I bet they’re scattered everywhere. I bet you’ve got weird drafts of documents from projects you stopped running 3 years ago, but nobody quite remembers if they’re important or not, which means you really can’t get shot of them.
But disks are cheap!!
Yep, they are. What’s not cheap are the dozens of beefy 4-socket, 12-core servers you’ve got to buy to index the literally bazillions of files you’ve got. You can pick up 1TB SATA drives for a hundred quid, double it if you want a fast one; but even then it’s not that fast. Or rather, it is fast if you already know where the file you want lives.
Let’s make some assumptions, and they’re pretty big ones because I can’t find anything to verify them with; but let’s say that on average you have 10 pages per document. Further, on average a worker generates 3 documents a week. Every week they then amend the previous documents ten times. Of those 3 documents, they share 2 of them with 4 colleagues and they amend them… I could go on, but you start to get to a situation where at the end of week one you have 3 documents, then 30, then 64 etc… OK, so I’ve started to confuse myself, but how about this example:
You have 20 workers, creating 3 documents a week for a year. That is 3,120 documents.
Each document is revised 4 times, reviewed 3 times and approved once. (3,120 x 4 x 3) + 3,120 = 40,560 documents.
Now if each document has 10 pages; 40,560 x 10 = 405,600 pages.
So that means if each document was 1MB in size you’ve hit about 39.6 GB of new data a year.
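If you want to play with those numbers yourself, the sums above can be sketched in a few lines of Python. This is only a rough model using the same guesses as the example (20 workers, 3 documents a week, 52 weeks, 1MB per document); the function and parameter names are my own invention, and I’m assuming 1GB = 1,024MB.

```python
def estimate_yearly_volume(workers=20, docs_per_week=3, weeks=52,
                           revisions=4, reviews=3,
                           pages_per_doc=10, mb_per_doc=1):
    """Rough yearly document volume, following the worked example."""
    # Brand new documents created in the year.
    originals = workers * docs_per_week * weeks
    # Every revision is reviewed, plus one approved copy per original.
    versions = originals * revisions * reviews + originals
    pages = versions * pages_per_doc
    # Assumes 1 GB = 1,024 MB.
    size_gb = versions * mb_per_doc / 1024
    return {
        "originals": originals,
        "versions": versions,
        "pages": pages,
        "size_gb": size_gb,
    }


if __name__ == "__main__":
    for name, value in estimate_yearly_volume().items():
        print(f"{name}: {value:,.1f}")
```

Change `workers` to 2,000 and you will see why the index, not the disk, is where the money goes.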
Again this doesn’t sound like a lot, you could fit it all on a thumb drive after all; but the hard bit, the expensive bit, is indexing all of that data so someone can search it and reasonably find what it is they’re looking for.
So can I just buy bigger boxes?
Heavy duty computing equipment is definitely an answer, but it’s actually answering the wrong question. The right question is “What is important to my business?”. The solution is in defining an Information Management Policy (IMP). If you can handle the information in your organisation better, then indexing stops being such an expensive activity and you can use that resource elsewhere.*
The key is that the business, not IT, defines what the policy is. There are a handful of key elements that need consideration:
- How sensitive is this information to your organisation? This doesn’t necessarily mean anything secret squirrelly. Even simple markings such as Low, Medium and High will allow you to create the necessary policies to handle this correctly. In practical terms you could use technology like Rights Management Services in Windows/Office to prevent sensitive information being passed on to people without a need to know.
- What activities are important to this information? If it’s your unreleased financials a week before announcing them to the market, then it is pretty important you know everyone who has read or contributed towards that document.
- Defining when a document is no longer required, or is superseded and how it should then be handled. If you are about to launch a marketing effort then you will want to store all the proofs, designs and ideas. Once you’ve gone live however, the 32nd iteration of the Sunday supplement ad that you placed is probably not relevant and there isn’t a need to keep it.
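To make those three elements a bit more concrete, here is a minimal sketch of how a policy might be written down as data. This is purely illustrative and is not how SharePoint or Rights Management Services represent policies; the class names, the `is_expired` helper and the 90-day retention figure are all assumptions of mine.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Sensitivity(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Policy:
    sensitivity: Sensitivity        # how the information is marked
    audit_access: bool              # record who reads or edits it
    retention_days: Optional[int]   # None means keep indefinitely


def is_expired(policy: Policy, last_modified: date, today: date) -> bool:
    """True once a document has outlived its retention period."""
    if policy.retention_days is None:
        return False
    return today - last_modified > timedelta(days=policy.retention_days)


# Hypothetical policies for the two examples in the list above.
UNRELEASED_FINANCIALS = Policy(Sensitivity.HIGH, audit_access=True,
                               retention_days=None)
AD_PROOF = Policy(Sensitivity.LOW, audit_access=False, retention_days=90)
```

Anything for which `is_expired` returns true is a candidate for archiving or deletion, and every document you retire is one less document to index.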
I’m going to go into this in further detail in a future post but there are some handy resources at Microsoft to read when considering how you manage your information:
- Information management policy planning (SharePoint Server 2010)
- Managing Information Management Policy in SharePoint Server 2010 (ECM) (Technical)
I’ll point out as well that SharePoint 2010 is a really, really powerful tool to help you carry out the IMP, but again that is a story for another day!
*Now I’m not saying that you shouldn’t invest in indexing, you should, and I will come on to talk about it in a future post.