Best technology for archiving stuff?

juggler

Name
Simon
I'm not just looking for a backup solution.
I want to permanently archive lots - initially 500GB - of old raw files - keeping only jpegs.
I'll keep multiple copies of the archive, at least one of which is off site.

So what's the best technology for this?
Bung it on a server? Writable DVD? CD-R? USB HDD? Pay someone else to transfer it to tape/DLT?
 
I use a SATA drive bay, normal SATA hard drives and a bare-drive storage box. Drives are cheap and fast, so I just fill a drive up, slip it back in its anti-static bag and store it in the aluminum storage box. A mate works in archiving (national) and they do much the same, though it was a while ago that I asked. I have found CDs and tapes to be hit and miss over the years, as well as slow! Just my own opinion. :)
 
Tape for long term storage, it's what it's designed for. Couple it with a hdd for quicker easier access and you have a practical and robust solution.
 
My only gripe with dedicated backup drives is drivers. Drivers depend on certain OS versions to function, so in the end you end up archiving whole PCs/systems just to retain access to old backups. That said, I try to keep my old backups on current tech. I still have old SCSI DAT and Jaz drives with data on, as well as old MFM hard drives running PC DOS, sat in a loft. Worth keeping in mind if your backup system requires dedicated drivers/software. Always interested to know what people use.
 
What sort of tape & drive?
Depends on budget, probably something like an LTO4 (800GB native / 1.6TB compressed). HP / Quantum / IBM would be the go-to brands.

You can pick up bargains on eBay as people move on their old kit because their data needs have grown.

A really worthwhile read (below) on why tape is superior for longer-term use, but I would run it in tandem with HDD for practical reasons as noted above.

http://serverfault.com/questions/51...hard-drive-used-for-data-archival-deteriorate
 
Given that storage volumes of contemporary storage devices will outpace your archive requirements, I suggest that you use the most cost effective HDD solution at the time, fill it, stick it somewhere, and then in two to three years when cheap storage volumes have doubled, copy the data across from the old drives to new ones.

That way you regularly refresh the data, which is key to keeping it for long periods. There's no ideal solution.

I don't think optical storage has the lifespan necessary, despite previous expectations, and you'd need to copy it onto new storage anyway, which isn't as convenient. There are specialist archival optical media, but they're very expensive.

Multiple, cheap, consumer magnetic storage devices, refreshed on a regular basis would be my choice.
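
For anyone wanting to script the "copy old drive to new drive" refresh, here is a rough Python sketch. It is only an illustration: the function names and folder layout are made up for the example, and it assumes both drives are mounted as ordinary folders.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_archive(old_root: Path, new_root: Path) -> list[Path]:
    """Copy every file from old_root to new_root.

    Returns the relative paths whose copy does not hash the same as the
    original, so an empty list means the refresh looks clean.
    """
    mismatches = []
    for source in sorted(old_root.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(old_root)
        target = new_root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256_of(source) != sha256_of(target):
            mismatches.append(relative)
    return mismatches
```

One caveat: reading a file back straight after writing it may be served from the OS cache rather than the new disk, so a second verification pass after remounting the drive is probably worth the time.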
 
Even with tape you need to store it properly and run it round the spools from time to time. I have had to send 10 to 15 year old tapes to specialist recovery companies because the tape has become fragile.

I agree with @EightBitTony, the only really reliable way is to keep writing it to new media on a regular basis. Whenever I get a new machine I keep the old hard drives, put them in external caddies and use them as back-up; the ones a couple of generations old get wiped and binned.
 
Blu-ray optical media is supposedly more robust and less prone to degradation, but it doesn't hold that much data in the scheme of things.

Tape is going to be pretty pricey: £500 for a refurb LTO4, and it needs a SAS or SCSI connection etc.

Just get a pair of large HDDs, store them off site and validate the data occasionally.
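
"Validate the data occasionally" can be as simple as recording a checksum for every file when the drive is filled and re-checking later. A minimal Python sketch follows; the function names and the JSON manifest format are my own choices for illustration, not a standard tool.

```python
import hashlib
import json
from pathlib import Path


def file_hash(path: Path) -> str:
    """SHA-256 of one file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its hash."""
    return {
        p.relative_to(root).as_posix(): file_hash(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def save_manifest(manifest: dict[str, str], manifest_file: Path) -> None:
    manifest_file.write_text(json.dumps(manifest, indent=2))


def load_manifest(manifest_file: Path) -> dict[str, str]:
    return json.loads(manifest_file.read_text())


def verify(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the drive's current contents with a saved manifest."""
    current = build_manifest(root)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        "unexpected": sorted(set(current) - set(manifest)),
    }
```

Keep the manifest file outside the archive folder (and a copy of it somewhere else), otherwise it shows up as an "unexpected" file and is lost along with the drive.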
 
Thanks for that. Buying my own tape drive will be out of budget.
The thought that HDDs are likely to fail if left idle was behind my idea of archiving to a server.
 
There is a lot to be said for off-site server backup, and the scale of such servers is mind-blowing: guys running around with trolleys of drives, replacing those that fail daily. The growth of the servers is also mind-blowing. If you have good upload speeds, unlike me, that's where I would be storing my data.
 
Remember drives are idle until you buy them and plug them in, they've probably been in a warehouse for months ;)

Like I say, get 2 (preferably from different suppliers so the chances of both coming from the same batch are slim), so should 1 fail you still have the other.

Cloud backups are great if you have sufficient time/bandwidth to upload hundreds/thousands of gigs.
 
Some guidelines I've developed over 35 years in I.T....
  1. All media deteriorates.
  2. The more reliable a medium is claimed to be, the less you can trust it in practice.
  3. The only write and forget medium yet developed is punched paper tape. You'll need an enormous, fireproof room for 500GB, though.
  4. You can get around all of these problems by using active storage. That way, you see the problem while you can still do something about it.
  5. If something is important, keeping less than three physically independent copies is equivalent to placing your foot on the table, pointing a gun at it and pulling the trigger.
  6. Re (5): four copies is better.
I make no apologies for sounding bitter and twisted about this. I've seen tape libraries crap out, RAID racks suffer catastrophic failure and clowns who kept the single copy of a vital contract on their laptop.

Apart from that, Mrs Lincoln...

:mad:
 
Yep, if you can keep the disks spinning, it increases the chance of the data remaining valid, although it also impacts the chance of the disk failing (but then, they fail when powered off for long periods too, so it's a coin flip). Any manual process (refreshing the data, taking another copy, remembering to move a drive off-site, etc.) also increases the chances of mistakes or something being forgotten, so an automated process is best.

One other option is something like Amazon Glacier (http://aws.amazon.com/glacier/). It's quite cheap, but fast recovery can be extremely expensive (slow recovery can be much cheaper, you just have to recover less than 10% of the volume in a one-month period, or something complex like that). There are client products which allow you to interact with Glacier (and there are competitors who have similar systems). I had a play with it for a couple of months and it's usable, but a little opaque - the terminology is a bit confusing and you're dealing with things other than files, but it's doable.
 
No worries, you're preaching to the converted. I've lost track of the number of times I've tried to explain that RAID is for resilience, not backup.
 
Love it, (4-6) will be my back-up bible.
 
Understandable but perhaps something to work towards longer term :).

The solution I personally use at home is as follows:
Media PC primary storage is internal
That storage is backed up to an external drive that is permanently attached to the PC
Internal storage is also backed up to a primary storage server (RAID6)
That server is then backed up to a backup server (RAID6)
I have a couple of 3.5" drives that I use with a docking station on the media PC; they live off site, with only one returning for an update at any given time
I then have some USB-powered drives that also have the data on them; generally one of those will be in my laptop or camera bag, and one generally goes everywhere with me.

Apart from the permanently connected external storage, all the copies are deliberately out of sync with each other (manual FreeFileSync tasks) to create some level of different versions between copies. It's to guard against copying corrupt data to each location, even though I do hash comparisons from time to time (ExactFile).
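
As a rough illustration of a hash comparison across copies (this is not what ExactFile does internally, just a Python sketch with made-up function names): with three or more copies of a file, a majority vote on the hashes points at the copy that has gone bad.

```python
import hashlib
from collections import Counter
from pathlib import Path
from typing import Optional


def digest(path: Path) -> Optional[str]:
    """SHA-256 of a file, or None if this copy does not have the file."""
    if not path.is_file():
        return None
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def odd_copies(relative_name: str, roots: list[Path]) -> list[Path]:
    """Return the roots whose copy of relative_name disagrees with the
    hash held by a strict majority of the roots (missing counts as
    disagreeing). Raises ValueError when no such majority exists."""
    hashes = {root: digest(root / relative_name) for root in roots}
    counts = Counter(h for h in hashes.values() if h is not None)
    if not counts:
        raise FileNotFoundError(relative_name)
    winner, votes = counts.most_common(1)[0]
    if votes * 2 <= len(roots):
        raise ValueError(f"no majority hash for {relative_name}")
    return [root for root, h in hashes.items() if h != winner]
```

This only makes sense for files that are never edited after archiving. With copies kept deliberately out of sync, a file that was legitimately changed on one copy would look exactly like the odd one out.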

-edit-
Got carried away a bit, the above doesn't fully apply to the original concept of archival usage!
 
For what it's worth, I think I have 5 copies

edit:

1) microserver array at site A
2) eSATA array at site A (synced overnight)
3) external USB3 (synced manually and rotated between site A and B)*
4) external USB3 (synced manually and rotated between site A and B)*
5) important photo sets are archived to Blu-ray along with the latest Lightroom catalogue and stored at site B

*USB drive rotation is arranged so that both drives are never at site A at the same time. The USB drives are different models, to guard against bad batches of drives.
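
A set-up like this can be written down as a small inventory and checked against the rules people have suggested in this thread (at least three independent copies, something off site, rotating drives never all at home). The sketch below is just that, a sketch: the class, field names and example entries are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveCopy:
    name: str
    device: str            # physical device holding this copy
    site: str              # where that device is right now
    rotating: bool = False  # True for drives swapped between sites


def check_copies(copies, home_site, minimum=3):
    """Return a list of rule violations; an empty list means all is well."""
    problems = []
    devices = {c.device for c in copies}
    if len(devices) < minimum:
        problems.append(f"only {len(devices)} independent devices, want {minimum}")
    if all(c.site == home_site for c in copies):
        problems.append("no copy is off site")
    rotating = [c for c in copies if c.rotating]
    if rotating and all(c.site == home_site for c in rotating):
        problems.append("all rotating drives are at the home site")
    return problems


# Example inventory loosely modelled on the five copies listed above.
EXAMPLE = [
    ArchiveCopy("microserver array", "microserver", "A"),
    ArchiveCopy("eSATA array", "esata", "A"),
    ArchiveCopy("USB3 drive 1", "usb1", "A", rotating=True),
    ArchiveCopy("USB3 drive 2", "usb2", "B", rotating=True),
    ArchiveCopy("Blu-ray photo sets", "bluray", "B"),
]
```

Calling `check_copies(EXAMPLE, home_site="A")` returns an empty list; move "USB3 drive 2" back to site A and it reports the rotation problem.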
 
All that said, Amazon Unlimited is $60/year which is about the cost of a hard drive per year
 
This is the best post I've seen here for a long time.
In line with other comments, I'd only recommend use of tape for a backup routine where the tapes are constantly refreshed and verified. In a long-term archival situation I'd expect the tapes to be stored in optimum conditions (temperature and humidity) and subject to some sort of testing regime.

I'd buy 3 HP Microservers. Then I would use the ZFS file system in RAIDZ (roughly the equivalent of RAID5) on each of them. One major feature that distinguishes ZFS from other file systems is that it is designed to protect against silent data corruption caused by data degradation. Schedule a regular scrub on each of the servers and put sufficient monitoring in place so that you are alerted if a scrub finds checksum errors.
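
For the monitoring part, one simple approach is to read the text that `zpool status` prints and flag any device whose CKSUM column is not zero. The Python sketch below assumes the usual NAME / STATE / READ / WRITE / CKSUM table layout. The sample text is made up for illustration, and the exact layout can differ between ZFS versions, so treat the parsing as an assumption to check against your own output.

```python
# Illustrative sample only, not output captured from a real system.
SAMPLE_STATUS = """\
  pool: tank
 state: ONLINE
  scan: scrub repaired 0B in 01:10:22 with 0 errors on Sun Mar  5 01:34:23 2017
config:

\tNAME        STATE     READ WRITE CKSUM
\ttank        ONLINE       0     0     0
\t  raidz1-0  ONLINE       0     0     0
\t    ada0    ONLINE       0     0     0
\t    ada1    ONLINE       0     0     3
\t    ada2    ONLINE       0     0     0

errors: No known data errors
"""


def devices_with_checksum_errors(status_text: str) -> dict[str, str]:
    """Return {device name: CKSUM value} for every row whose CKSUM is not "0".

    Values stay as strings because large counts may be abbreviated.
    """
    flagged = {}
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_table = True   # header row: device rows follow
            continue
        if not in_table:
            continue
        if line.startswith("errors:"):
            in_table = False  # summary line ends this pool's table
            continue
        if len(fields) < 5:
            continue          # blank lines and section labels
        name, cksum = fields[0], fields[4]
        if cksum != "0":
            flagged[name] = cksum
    return flagged
```

On a real server the text could come from something like `subprocess.run(["zpool", "status"], capture_output=True, text=True).stdout`, run from cron after the scrub, with an email sent whenever the returned dict is not empty.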

I built my backup server (initially) with a second-hand XFX/AMD mobo/CPU/RAM and found silent data corruption across all the disks in both RAIDZ arrays. I replaced the mobo/CPU/RAM with quality second-hand items (Asus/Intel) and I haven't seen the corruption return.
 
I have a single external HD that is backed up via USB about once every six months [emoji52]

I really need to sort myself out, even if it is for personal / family use only.
 
A few years ago I did tests on DVDs and Blu-Rays as storage medium.

DVDs will corrupt at a fantastic rate if left exposed to sunlight and heat - but if kept stored in the dark (DVD folders etc - available at Poundland) they will keep for about 10 years if a reputable brand is used - Philips etc.

Blu-Rays vastly exceed that and I tested one over 3 years with at least 2 years sat on a windowsill with no degradation, de-laminating etc, at all.

In fact, despite all the doom sayers I have never had a single DVD de-laminate.

The Blu-Rays I tested were the ablative metal kind and I imagine the dye versions would fare just as badly as DVDs.

But as mentioned on here do make more than one copy of important data - my photos are stored on Blu-Rays, HDD, Flickr and the cloud because once lost they are almost certainly gone forever.
 
Do it NOW before the universe sorts them out permanently :(
 
It gets worse. I've already lost about a decade's worth of photos when my MacBook died around five years ago, hence the backups, though so far only to a single source.

A NAS arrangement with further backups via USB may be the way to go?
 
Let's not forget the original query was about Archive, not Backup. It's a different requirement.

You shouldn't treat backups as archives, and vice versa, and you shouldn't treat live copies of data in the same building as a backup in any case.

The best online solution for Archives is Amazon Glacier by a long chalk.
 
Archival and backup are different in terms of requirement, but the former doesn't really scale down very well. Which is why Glacier is a good idea*
If backup volumes are ++ and bandwidth is --, I'd stand by my earlier suggestion.

*Data retrieval is spendy, so make sure the odds of retrieving data are low
 
Anyone tried M discs - an alternative recording surface that is supposed to have archival properties according to the info on the LG external DVD drive I bought recently?
 
Sounds good but apparently although you can replay the discs in any DVD player you need to find a burner designed for these discs.

The comments on Amazon seem to be mixed for both DVDs and Blu-Rays:

https://www.amazon.co.uk/M-DISC-Blu-ray-Permanent-Archival-Backup/dp/B00M862Q1K

https://www.amazon.com/M-DISC-4-7GB-Permanent-Archival-Backup/dp/B005Y4NKE0

Personally I'm going to stick to my tried and trusted methods - DVDs, Ablative Blu-Rays, HDDs and the cloud.
 
I only know of them because I just bought a new external DVD drive (LG - £16.99 from Amazon) and they were mentioned as compatible on the back of the box.
 