What do you want?
Whose side are you on?
That would be telling. We want information. Information. Information.
You won’t get it!
-The Prisoner

Look at your estate, now back to me, look at all the documents you have, now back to me. I bet they’re scattered everywhere. I bet you’ve got weird drafts of documents from projects you stopped running 3 years ago, and nobody quite remembers whether they’re important or not. Which means you really can’t get shot of them.

But disks are cheap!!

Yep, they are. What’s not cheap are the dozens of beefy 4-socket, 12-core servers you’ve got to buy to index the literally bazillions of files you’ve got. You can pick up 1TB SATA drives for a hundred quid, double it if you want a fast one; but even then it’s not that fast. Or rather, it is fast if you already know where the file you want lives; finding it in the first place is the slow part.

Let’s make some assumptions, and they’re pretty big ones because I can’t find anything to verify them with; but let’s say that on average you have 10 pages per document. Further, on average a worker generates 3 documents a week. Every week they then amend the previous documents ten times. Of those 3 documents, they share 2 of them with 4 colleagues and they amend them… I could go on, but you start to get to a situation where at the end of week one you have 3 documents, then 30, then 64, etc. OK, so I’ve started to confuse myself, but how about this example:

You have 20 workers, creating 3 documents a week for a year. That is 3,120 documents.
Each document is revised 4 times, reviewed 3 times and approved once. (3,120 x 4 x 3) + 3,120 = 40,560 documents.
Now if each document has 10 pages; 40,560 x 10 = 405,600 pages.
So that means if each document was 1MB in size you’ve hit about 39.6 GB of new data a year.
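If you want to plug in your own numbers, the sums above can be sketched in a few lines of Python. This is only a rough sketch under the same assumptions as the example: the constant names and the `yearly_estimate` function are made up for illustration, 52 working weeks is assumed, and a GB is taken as 1,024 MB.

```python
# Back-of-envelope estimate of yearly document growth.
# Every figure here is an assumption from the worked example, not measured data.
WORKERS = 20
DOCS_PER_WORKER_PER_WEEK = 3
WEEKS_PER_YEAR = 52
REVISIONS = 4        # each document is revised 4 times
REVIEWS = 3          # ...and reviewed 3 times
PAGES_PER_DOC = 10
MB_PER_DOC = 1


def yearly_estimate():
    """Return (originals, copies, pages, size_gb) for one year."""
    originals = WORKERS * DOCS_PER_WORKER_PER_WEEK * WEEKS_PER_YEAR
    # Same formula as the example: revisions x reviews, plus the approved copy.
    copies = originals * REVISIONS * REVIEWS + originals
    pages = copies * PAGES_PER_DOC
    size_gb = copies * MB_PER_DOC / 1024
    return originals, copies, pages, size_gb


if __name__ == "__main__":
    originals, copies, pages, size_gb = yearly_estimate()
    print(f"{originals:,} originals, {copies:,} copies, "
          f"{pages:,} pages, {size_gb:.1f} GB")
```

Change the worker count or the revision numbers to match your own organisation and see how quickly the pile grows.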

Again, this doesn’t sound like a lot; you could fit it all on a thumb drive after all. But the hard bit, the expensive bit, is indexing all of that data so someone can search it and reasonably find what they’re looking for.

So can I just buy bigger boxes?

Heavy-duty computing equipment is definitely an answer, but it’s actually answering the wrong question. The right question is “What is important to my business?”. The solution is in defining an Information Management Policy (IMP). If you can handle the information in your organisation better, then indexing stops being such an expensive activity and you can use that resource elsewhere.*

Information Management

The key is that the business, not IT, defines what the policy is. There are a handful of key elements that need consideration:




I’m going to go into this in further detail in a future post but there are some handy resources at Microsoft to read when considering how you manage your information:

I’ll point out as well that SharePoint 2010 is a really, really powerful tool to help you carry out the IMP, but again that is a story for another day!

*Now I’m not saying that you shouldn’t invest in indexing. You should, and I will come on to talk about it in a future post.