Storage is cheap, right? Just buy some more capacity when you need it. I can go to Best Buy and purchase a 2TB drive for about $100. What’s the problem? Have you ever heard a developer in your shop say something like this when complaining about getting more storage capacity?
One of the things we try to help our customers with is managing the data they already have a bit better, rather than just buying more capacity all the time. I’m telling you storage isn’t cheap, and this blog post will attempt to prove it.
Think of your data as a liquid in a glass and the glass container as your storage infrastructure. We want to help our customers manage the liquid (their data) to keep the container to a manageable size. The larger the container the more our customers end up spending. Now, picture this. You are going to have more than one container if you’re managing your data properly, so now we have multiple liquid containers to purchase and manage. Here’s a little more background on the purpose of these containers and associated costs.
Container #1: This is your primary data, the data that runs the applications that run your business. This container is built not only to hold the capacity you need for your data, but also to meet the availability and performance requirements your applications demand. If you don’t have access to your data, or the access is slow, you don’t have a business. Recovery Time Objectives (RTOs) and application response times often govern these aspects of your data access. There are several technologies used to address these challenges, and guess what, they’re expensive. Things like RAID, redundant components like controllers and power supplies, load balancing, various drive types that perform differently, connectivity protocols, application integration and the like. I’m not going to deep dive on all those concepts in this blog, but know that they (and many others) exist and they all add cost to the storage solution. Hardware, software, maintenance and ongoing management costs all add up. So, you remember that $100 2TB drive? That works out to $50 per TB. For enterprise-class storage that has the resilience and performance we need, we’ll be spending more like $5,000 per useable TB versus $50 in the Best Buy example. That’s two orders of magnitude more, wow!
Container #2: As if Container #1 wasn’t bad enough, what about these other containers? We also need to help protect our customers’ data over time. We need to protect the data from accidental or malicious corruption or data loss. High availability and resilience aren’t enough. We need several points in time so we can go back in case we lose some information, and we may also need to keep information for long periods for compliance reasons. This second container is where we keep information over time for functions like backup, recovery and archival. So, not only do I need to store all my primary data in an expensive container, I need another container with several more copies to protect my data over time. Well, I hate to break it to you, but this container isn’t any less expensive than the original container #1. Come on now, tape is cheap, isn’t it? Just chuck this stuff on some cheap tape media and call it good, right? Wrong. There’s a lot more to it than that. We need software to manage the data protection, recovery, archival and discovery processes. That stuff isn’t cheap, and neither are the people, time and infrastructure to manage it all. We now need more storage capacity for this data, and the attributes of this container aren’t the same as container #1, because this data doesn’t usually have the same performance or availability requirements. Instead, it has requirements around storing multiple copies (or points in time) for longer periods. We also need to get this information off-site if we’re really going to do this right. Wait a minute, am I talking about yet another container? Yes I am: container #3 is for disaster recovery purposes. We’ll talk about that one next. Back to container #2 for a minute. Container #2 needs to be able to hold large amounts of data, preferably on disk, and then make another copy on a removable medium to store off-site, or even replicate it off-site if possible.
Also, make sure we store several points in time so we can go back to recover or discover older information. So this container consists of more capacity than the primary container, on at least two types of media and/or off-site storage. So I have to pay for the off-site storage location and shipping too, to store old tapes (and hope to god that I can recover from them if needed), plus the software, people and management to be able to recover this information. Trust me, this container is every bit as expensive as container #1.
Container #3: Seriously, what happened to that $50/TB disk drive? We’re already up to $10,000 per TB and we haven’t even discussed container #3 in any detail. Container #3 is for disaster recovery. In the event something bad happens to our primary data and we need to bring up our applications and data in another location, we need to protect our data over distance. Some of this can be mitigated with container #2, as we should have a copy of our data somewhere else, but that copy isn’t in running condition. It’s an offline copy just waiting for someone to restore it somewhere so it can come online and be useable by your applications and users. Well, guess what, this container isn’t inexpensive either. This container has to have features similar to container #1, plus all the other resources like servers, network, storage, applications and remote access (shall I continue?) in order to have a place to restore the information to. If you have this infrastructure just sitting around waiting for the disaster, it can be a very expensive insurance policy. Even if you rent this space in the event of a disaster, it’s expensive to spin all this up, and you’d better have practiced it at least a few times so you’re ready and you’ve covered all your bases with a disaster recovery plan. So, let’s be conservative and say that the people, processes and technology for container #3 cost the same as #1 and #2, and that’s before counting the additional data needed for things like test and development. Holy mackerel.
All this being said, we’re talking about a “managed” TB of data costing a minimum of $15,000: three containers at roughly $5,000 per TB each. Now factor in that storage utilization of 50% is about average, so we need to double that number again to $30,000 per useable TB of capacity. Kind of scary, isn’t it?
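For anyone who wants to plug in their own numbers, here is a rough sketch of the arithmetic in this post. The dollar figures and the 50% utilization are the ones quoted above; the function and variable names are made up purely for illustration, and the model is deliberately simple (every container is assumed to cost the same per TB).

```python
# Back-of-the-envelope model of the "three containers" cost argument.
# Dollar figures come from the post; all names here are illustrative only.

CONSUMER_DRIVE_PRICE = 100.0      # retail drive price in dollars
CONSUMER_DRIVE_TB = 2.0           # retail drive capacity in TB
ENTERPRISE_COST_PER_TB = 5_000.0  # cost per useable TB for one container
CONTAINERS = 3                    # primary, backup/archive, disaster recovery
UTILIZATION = 0.50                # fraction of purchased capacity actually used


def managed_cost_per_tb(cost_per_container_tb, containers, utilization):
    """Cost of one TB of data once every container and idle capacity is paid for."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    return cost_per_container_tb * containers / utilization


consumer_cost_per_tb = CONSUMER_DRIVE_PRICE / CONSUMER_DRIVE_TB
enterprise_multiple = ENTERPRISE_COST_PER_TB / consumer_cost_per_tb
all_containers_per_tb = ENTERPRISE_COST_PER_TB * CONTAINERS
managed_per_tb = managed_cost_per_tb(ENTERPRISE_COST_PER_TB, CONTAINERS, UTILIZATION)

print(f"Retail drive:           ${consumer_cost_per_tb:,.0f} per TB")
print(f"One enterprise TB:      ${ENTERPRISE_COST_PER_TB:,.0f} ({enterprise_multiple:.0f}x retail)")
print(f"Three containers:       ${all_containers_per_tb:,.0f} per TB")
print(f"At {UTILIZATION:.0%} utilization:     ${managed_per_tb:,.0f} per TB")
```

Change `UTILIZATION` or `ENTERPRISE_COST_PER_TB` to match your own shop and the bottom line moves accordingly.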
This is why we want to help our customers manage the liquid in the glass (the data) versus focusing on just the cost of the primary container. If we can control the data we can control container #1. Since we know that containers #2 and #3 are proportional to the primary container we end up with a compounded savings effect.
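To make the compounding concrete, here is a small sketch using the per-TB figures from this post. The 100 TB environment and the 20% reduction are hypothetical numbers picked only for illustration, as are the function names, and the sketch assumes containers #2 and #3 scale in direct proportion to the primary data.

```python
# Illustrative only: how trimming primary data ripples through all containers.
# Per-TB figures are from the post; the 100 TB / 20% scenario is hypothetical.

COST_PER_CONTAINER_TB = 5_000.0  # dollars per useable TB, per container
CONTAINERS = 3                   # primary, backup/archive, disaster recovery
UTILIZATION = 0.50               # average storage utilization


def total_cost(primary_tb):
    """Total spend when every container scales with the primary data."""
    return primary_tb * COST_PER_CONTAINER_TB * CONTAINERS / UTILIZATION


def savings(primary_tb, reduction_fraction):
    """Dollars saved across all containers by shrinking the primary data."""
    reduced_tb = primary_tb * (1 - reduction_fraction)
    return total_cost(primary_tb) - total_cost(reduced_tb)


primary_tb = 100.0   # hypothetical environment size
reduction = 0.20     # hypothetical share of data that can be eliminated

primary_only = primary_tb * reduction * COST_PER_CONTAINER_TB
compounded = savings(primary_tb, reduction)

print(f"Savings counting container #1 alone: ${primary_only:,.0f}")
print(f"Savings across all three containers: ${compounded:,.0f}")
```

Under these assumptions, the savings across all containers come out to several times what you would see by looking at the primary container alone.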
We are experts at analyzing your data and storage infrastructure. We have a data classification methodology that helps customers shrink their data, and therefore their containers and associated costs. We also have purpose-built solutions to maximize our customers’ investments in their data management and supporting infrastructure. In as little as a day or two for smaller customers, and at most a few weeks for a larger customer, we can demonstrate significant savings and provide a roadmap to help our customers with a data management strategy that works. Contact me for more information: Scott Pelletier – VP Technology, Lewan & Associates, 303-968-2338.