So it seems like an easy question – how do I decide how large a tape library I need? It’s one I get a lot so I thought I’d devote a few minutes to the topic.
Let’s assume I have 30TB of data in my datacenter, and I’m going to do a full backup once per week. We’re also going to assume that my daily incrementals are 5% of the size of my full (1.5TB) and that I write all of those to tape as well.
Now, let’s assume (like many of my customers, and my own shop years back) that I want to ship tapes offsite twice a week, say on Tuesday and Friday. And that full backups are all staged to run over the weekend. I do not want to keep any tapes in the library which contain data when I ship.
Based on this we can compute the amount of data I need to ship on Tuesday and Friday; and how fast I need to write data to tape in order to be ready to ship it.
On Tuesday I want to ship 1 full backup (30TB) + 3 incrementals (1.5TB x 3); a total of 34.5TB of data. On Friday I’m shipping just 3 incrementals (4.5TB).
For Tuesday’s shipment we have a window which probably starts Saturday morning (say 6am) and runs till Tuesday morning (say 6am) to complete the tape writing. This window is 72 hours in length, and will thus require a rate of just under 500GB/hour to complete the tape out.
For Friday’s shipment we have a window which could start at, say, 6pm Tuesday and needs to complete by 6am Friday. This window is 60 hours, and represents a minimum throughput of 75GB/hour (0.075TB/hour).
Based on this we’re probably not too concerned about the Friday shipment, and we’ll concentrate on making things happen for Tuesday.
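The arithmetic behind those two rates can be sketched in a few lines of Python. This is only an illustration of the numbers above: the function and variable names are my own, and it assumes 1TB = 1000GB.

```python
def required_rate_gb_per_hour(data_tb, window_hours):
    """Minimum sustained write rate (GB/hour) needed to finish inside the window."""
    return data_tb * 1000 / window_hours  # assumption: 1 TB = 1000 GB

full_tb = 30.0
incremental_tb = 0.05 * full_tb             # daily incremental = 5% of a full

tuesday_tb = full_tb + 3 * incremental_tb   # 1 full + 3 incrementals
friday_tb = 3 * incremental_tb              # 3 incrementals only

tuesday_rate = required_rate_gb_per_hour(tuesday_tb, 72)  # Sat 6am to Tue 6am
friday_rate = required_rate_gb_per_hour(friday_tb, 60)    # Tue 6pm to Fri 6am

print(f"Tuesday: {tuesday_tb:.1f}TB at {tuesday_rate:.0f}GB/hour")
print(f"Friday:  {friday_tb:.1f}TB at {friday_rate:.0f}GB/hour")
```

Tuesday works out to roughly 479GB/hour and Friday to 75GB/hour, which is why Tuesday drives the rest of the sizing.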
First – how many drives do I need?
Well – do I need backward compatibility with older tape? LTO can read back two generations and write back one. So an LTO6 drive can read LTO4 and write LTO5. If I need to read my old LTO2 tapes, then I need to restrict myself to deploying LTO4 drives.
LTO6 can write data at a rate of 560GB/hour before compression, LTO5 writes at 490GB/hour, and LTO4 writes at 420GB/hour. If you’re working with something older than that you’ll have to do the math yourself.
Amount of Data / Backup Window Size / Tape Drive Throughput = Number of Drives (round up)
So for the 500GB/hour target I’d need 2 LTO4 drives, 2 LTO5 drives, or 1 LTO6 drive. I might want to consider adding a drive or two as ‘spares’ in case one breaks or is in use for (gasp) data restore operations.
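Here is a small Python sketch of the drive-count formula. It takes the per-drive rates exactly as I quoted them and uses the rounded 500GB/hour target (data divided by window); the function name and the optional spares parameter are my own additions for illustration.

```python
import math

# Native write rates in GB/hour, as quoted in the text (not re-checked against vendor specs).
DRIVE_RATE_GB_PER_HOUR = {"LTO4": 420, "LTO5": 490, "LTO6": 560}

def drives_needed(target_gb_per_hour, drive_gb_per_hour, spares=0):
    """Drives required to sustain the target rate, rounded up, plus optional spares."""
    return math.ceil(target_gb_per_hour / drive_gb_per_hour) + spares

target = 500  # rounded Tuesday requirement in GB/hour
drive_counts = {gen: drives_needed(target, rate)
                for gen, rate in DRIVE_RATE_GB_PER_HOUR.items()}
print(drive_counts)  # {'LTO4': 2, 'LTO5': 2, 'LTO6': 1}
```

Note that with the unrounded figure of about 479GB/hour a single LTO5 drive would nominally squeak by, but with almost no margin, so planning against 500GB/hour is the safer choice.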
Next up – how many slots do I need?
LTO6 holds 2.5TB on a tape, LTO5 holds 1.5TB, and LTO4 holds 0.8TB. Again, if your tapes are older than that you’ll need to look up the capacities yourself. Manufacturers will also quote compressed capacities, which are roughly 2x native. I find that while data will compress some, 1.5x is probably more realistic. I’m going to use that assumption in my calculations below.
Size of Data / Compression Factor / Tape Capacity = Number of tapes. (round up)
34.5TB is going to occupy 10 LTO6 tapes, 16 LTO5 tapes, or 29 LTO4 tapes.
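The same tape-count formula as a Python sketch, using the native capacities and my 1.5x compression assumption from above. The function name is hypothetical; change the compression argument if your data behaves differently.

```python
import math

# Native capacities in TB per cartridge, as quoted in the text.
TAPE_CAPACITY_TB = {"LTO4": 0.8, "LTO5": 1.5, "LTO6": 2.5}

def tapes_needed(data_tb, capacity_tb, compression=1.5):
    """Tapes required for data_tb, assuming the given compression factor, rounded up."""
    return math.ceil(data_tb / compression / capacity_tb)

tape_counts = {gen: tapes_needed(34.5, cap)
               for gen, cap in TAPE_CAPACITY_TB.items()}
print(tape_counts)  # {'LTO4': 29, 'LTO5': 16, 'LTO6': 10}
```

Passing compression=1.0 gives the worst case where nothing compresses at all.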
At this point we know enough to size the library.
For LTO6 I need a library with 10 slots, and 1 drive.
For LTO5 I need a library with 16 slots and 2 drives.
For LTO4 I need a library with 29 slots and 2 drives.
Personally I like to add at least 10-15% to my slots and (as noted earlier) an extra drive or two. This provides headroom for growth and some inefficient tape use.
Based on this it seems that a library with 12-14 slots and 2 LTO6 drives would work well. Maybe one with 24 slots and 3 LTO5 drives.
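As a rough check on the headroom, here is a sketch that adds 10% and 15% to each of the tape counts from earlier and rounds up. The helper name is my own, and the result is only a minimum slot count, not a product recommendation.

```python
import math

def slots_with_headroom(tapes, headroom_pct):
    """Slot count after adding a percentage of headroom, rounded up."""
    return math.ceil(tapes * (100 + headroom_pct) / 100)

base_tapes = {"LTO4": 29, "LTO5": 16, "LTO6": 10}  # tape counts worked out earlier
slot_ranges = {gen: (slots_with_headroom(n, 10), slots_with_headroom(n, 15))
               for gen, n in base_tapes.items()}
print(slot_ranges)  # {'LTO4': (32, 34), 'LTO5': (18, 19), 'LTO6': (11, 12)}
```

The suggestions above sit at or above these minimums, which leaves room to round up to whatever slot counts are actually on offer.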
A last factor to consider (not really sizing) is how tapes will be loaded and unloaded from the library. If I’m going to pull 30 tapes at a time, I really don’t want a library with only a single I/E slot. Ideally the library has an I/E slot count equal to the number of tapes I expect to unload at a time; at a minimum it should have enough to keep down the number of “return trips” to the library for each unload/load session.
That covers the scenario I described at the beginning. I know some will ask about keeping tapes in the library. That’s also something you can calculate: take the total amount of data you want to keep (Number of Fulls * Size of Full + Number of Incrementals * Size of Incremental) and divide by the size of the tape. That gives you a minimum slot count. For this scenario I’d add about 25% to the number of slots for growth and “partial tapes”. The good news is that if you’re keeping your tapes in the library, then you don’t have to worry about I/E slots.
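A sketch of that in-library calculation follows. The retention numbers (4 weekly fulls and 24 daily incrementals) are purely hypothetical, since I haven’t specified a retention policy here, and the function name is my own. It divides by native tape capacity, as described above; pass a larger effective capacity if you want to assume compression.

```python
import math

def retained_slots(num_fulls, full_tb, num_incrementals, incremental_tb,
                   tape_tb, headroom_pct=25):
    """Return (minimum slots, slots with headroom) for data kept in the library."""
    total_tb = num_fulls * full_tb + num_incrementals * incremental_tb
    min_slots = math.ceil(total_tb / tape_tb)
    with_headroom = math.ceil(min_slots * (100 + headroom_pct) / 100)
    return min_slots, with_headroom

# Hypothetical retention: 4 fulls of 30TB and 24 incrementals of 1.5TB on LTO6 (2.5TB native).
min_slots, sized_slots = retained_slots(4, 30, 24, 1.5, 2.5)
print(min_slots, sized_slots)  # 63 79
```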
With that I know pretty well how big my library must be. Now I can go shopping and find a device I like.