Data Handling - Storage and Backup

EdReading
I have made 20-30 posts
Posts: 28
Joined: Tue Feb 24, 2015 10:22 pm
Full Name: Edward M Reading
Company Details: MNS Engineers
Company Position Title: Supervising Project Surveyor
Country: United States
Linkedin Profile: No
Location: San Luis Obispo, CA

Data Handling - Storage and Backup

Post by EdReading »

Hello all,
I am interested in finding out how people are dealing with large data files in a network environment. Our IT people aren't crazy about me dumping terabytes of data on the servers. I'm curious what others are doing as far as storage and backup: what works and what doesn't.
Thanks!
Ed
jcoco3
Global Moderator
Posts: 1724
Joined: Sun Mar 04, 2012 5:43 pm
Full Name: Jonathan Coco
Company Details: Consultant
Company Position Title: Owner
Country: USA
Linkedin Profile: No
Has thanked: 70 times
Been thanked: 157 times

Re: Data Handling - Storage and Backup

Post by jcoco3 »

Hi Ed,

I think you will find these posts relevant:
http://www.laserscanningforum.com/forum ... =43&t=4703
http://www.laserscanningforum.com/forum ... =57&t=7395
http://www.laserscanningforum.com/forum ... =49&t=7870

From what I have seen and heard, people are doing many different things for storage: everything from a bunch of portable flash drives to incredibly expensive high-reliability systems with site-to-site mirroring. My suggestion is to use a solution that has redundancy but fits your current project size and workload. Also give yourself room to grow by budgeting for larger and more robust storage solutions in the future. The data can really stack up over time, and in my experience you want to keep most of it indefinitely.

Wait, here is one more link :lol: http://www.laserscanningforum.com/forum ... t=qnap+nas Sorry for all the reading, but it is all good stuff that you will want to know.
EdReading

Re: Data Handling - Storage and Backup

Post by EdReading »

Thanks a lot Jonathan!
Oddly, I did a search for "Data" and "Storage" and couldn't find them. :oops:
hypsometric
V.I.P Member
Posts: 201
Joined: Sun Oct 27, 2013 6:50 pm
Full Name: Arash Yaghoubi
Company Details: Hypsometric
Company Position Title: Director of Cartography
Country: USA
Linkedin Profile: No
Been thanked: 3 times

Re: Data Handling - Storage and Backup

Post by hypsometric »

Another solution is to fire everyone and use a gigabit Ethernet crossover cable straight to the server. :lol:
Dedken
V.I.P Member
Posts: 370
Joined: Fri Mar 15, 2013 10:28 am
Full Name: Kenneth Bazley
Company Details: Sir Robert McAlpine
Company Position Title: Senior Geospatial Engineer for HDS
Country: UK
Linkedin Profile: Yes
Location: London

Re: Data Handling - Storage and Backup

Post by Dedken »

Dropbox. That way your IT department doesn't even have to touch the data.
All views are my own and are not representative of my employer, The King, God or anyone else for that matter.

"we need an instrument, to take a measurement" - I.MacKaye 1992
andrew.grigg
V.I.P Member
Posts: 170
Joined: Tue Aug 07, 2012 1:29 pm
Full Name: Andrew Grigg
Company Details: 40SEVEN
Company Position Title: Senior Land Surveyor
Country: England
Linkedin Profile: Yes

Re: Data Handling - Storage and Backup

Post by andrew.grigg »

Dedken wrote: Dropbox. That way your IT department doesn't even have to touch the data.
Wow, that must take hours to upload projects. You must have a decent upload speed. I suppose you could leave a PC on over the weekend whilst it uploads...
Dedken

Re: Data Handling - Storage and Backup

Post by Dedken »

We haven't done it yet but it's in the pipeline (probably)... It would be done from the main hub. When you buy professional Dropbox accounts you get preferential upload speeds - they can afford to throttle the free accounts. That's what I was told by IT anyway!
markdwyer
I have made 10-20 posts
Posts: 10
Joined: Sun Feb 15, 2015 12:32 am
Full Name: Mark Dwyer
Company Details: Shearspace Pty Ltd
Company Position Title: Founder
Country: Australia
Linkedin Profile: Yes

Re: Data Handling - Storage and Backup

Post by markdwyer »

Hello,

I'm going to say this as a former architect for massive scale data systems. When I say that, I'm talking about storing tens of petabytes per year. Some of this was as a supercomputing expert (medical, engineering, GIS and those damn physicists), some of it as a big data specialist for LiDAR (8TB raw + 2TB cooked per day, 7 days per week, 365 days per year). Lidar was compressed, but it didn't really matter ... the raw was an order of magnitude larger in size.

The only way to store the type of data you guys generate is with magnetic tape. Now, I'm not saying you use tape for everyday I/O thrashing, but for long-term storage. Tape costs about $20 per TB for dual backups ($10 per TB for a single copy, but a copy is not strictly a backup). The second tape you store offsite. Tape lasts for 25 years. It is the longest-lasting storage medium that we currently know of and can prove. Once stored, it does not require electricity to keep active ... it can be stored on a shelf.
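
To put the tape pricing above in yearly terms, here is a rough sketch of the arithmetic. The function name and its defaults are illustrative assumptions; the $10 per TB per copy and the 10 TB per day (8 TB raw + 2 TB cooked) are the figures quoted in this post, and drives, robots and offsite storage are not included.

```python
def tape_media_cost(tb_per_day, days=365, usd_per_tb_copy=10.0, copies=2):
    """Estimated media spend in USD for `copies` tape copies of every day's data."""
    return tb_per_day * days * usd_per_tb_copy * copies


# 8 TB raw + 2 TB cooked per day, two copies (one kept offsite)
yearly = tape_media_cost(10)
print(f"${yearly:,.0f} per year")  # $73,000 per year
```
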

You need a tiered system. The top level ideally has a couple of terabytes of fast disk. SSD is one option, though I've found 10K Raptors are also sufficient (and reliable). People have complained about slowness, but when you look at the log files you see that the network switch was the bottleneck, not the disks; you'll need a 10Gbps switch or above to notice disk performance degradation. This level is your everyday thrashing level. You work it as hard as you need. The second level has about 10x the storage, on slower, standard, commodity HDDs, in an array that reflects your data 'backup' strategy (RAID of some description). The final layer is a tape system of some description. Data written to the first tier is mirrored to the second tier. Data that ages on the second tier is automatically pushed to the tape tier.
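
The "data that ages on the second tier" step could be sketched roughly as below. This only selects candidates; it does not move anything. The function name, the 90-day threshold and the use of file modification time as the measure of age are all assumptions for illustration, not something specified in this post.

```python
import os
import time
from pathlib import Path


def files_due_for_tape(tier2_root, max_age_days=90, now=None):
    """Return files under tier2_root untouched for more than max_age_days, oldest first."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    due = []
    for path in Path(tier2_root).rglob("*"):
        if path.is_file():
            mtime = path.stat().st_mtime
            if mtime < cutoff:
                due.append((mtime, path))
    # Oldest files first, so the longest-idle data reaches tape first.
    return [p for _, p in sorted(due, key=lambda item: item[0])]
```

A scheduled job could call this nightly and hand the resulting list to whatever writes the tape tier.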

If you are a smaller organisation, you can probably remove the top tier.

Tape robots are surprisingly cheap. The expensive bit is the tape drives. When I was maintaining a system recording 10TB per day, 2 tape drives were needed to guarantee that data was written to two tapes daily (LTO6 will write at 160MB/s maximum). A tape drive retails for roughly $1K. It will last 5 years, minimum. After five years, the tape density will have increased 4x, justifying a purchase review.
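
The two-drive figure follows from simple arithmetic, sketched here. It assumes decimal units (1 TB = 10^12 bytes), a drive streaming at the full 160 MB/s around the clock, and no time lost to tape changes, so treat the result as a lower bound. The function name is illustrative.

```python
import math


def drives_needed(tb_per_day, copies=2, mb_per_s=160.0, hours_per_day=24.0):
    """Minimum number of drives to write `copies` of a day's data within one day."""
    bytes_to_write = tb_per_day * 1e12 * copies
    seconds = bytes_to_write / (mb_per_s * 1e6)
    return math.ceil(seconds / (hours_per_day * 3600))


# One copy of 10 TB at 160 MB/s takes 62,500 s (about 17.4 hours),
# so two copies cannot fit in 24 hours on a single drive.
print(drives_needed(10))  # 2
```
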

A problem I noticed was the price of tape backup software: stupidly, and needlessly, expensive. A day's programming will do this. Laser data is very easy to push to tape. The files are large, so writing to tape is actually perfect (long, continuous streams, as opposed to standard business backup, which involves millions of files, each a couple of KB, with lots of starting and stopping .. 'shoe shining'). If anybody wants to start a GitHub project, I'd be happy to contribute.
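
As a starting point for that kind of project, here is a minimal sketch using Python's standard `tarfile` module in its non-seeking stream mode. It is not a complete backup tool: the function name is made up, the device path is whatever your system exposes (on Linux a non-rewinding tape device is typically something like /dev/nst0, which is an assumption about your setup), and there is no verification pass, tape spanning or catalogue beyond the returned checksums.

```python
import hashlib
import tarfile
from pathlib import Path


def write_scans_to_tape(files, device):
    """Stream files into one uncompressed tar on `device`; return {name: sha256}."""
    manifest = {}
    # "w|" is tarfile's stream mode: it writes sequentially and never seeks,
    # which is what a tape device needs.
    with tarfile.open(device, mode="w|") as tar:
        for f in map(Path, files):
            digest = hashlib.sha256()
            with f.open("rb") as fh:
                # Hash in 1 MiB chunks so huge scan files are never held in memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[f.name] = digest.hexdigest()
            tar.add(f, arcname=f.name)
    return manifest
```

Because `device` is just a path, the same function can be tried against an ordinary file first; keep the returned manifest somewhere other than the tape so a later restore can be checked.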

All the above scales wide and deep.