12th July 2017

Hoarding data and data retention needs

This blog post will be about a topic that I haven’t even thought about in my 13+ years of software development and technology. Hoarding – especially the data kind.


Tonight while eating my dinner I was watching a show on Netflix about hoarders and people collecting junk, which made me cringe. Over the years I have watched many such shows, without giving them much thought.
My house is always clean and free of clutter. I hate paper to start with, everything must have a place, and I periodically throw away or sell unwanted stuff that I don’t use.
This is not always possible because, as you know, “life happens”, but I do try, and because of this I have never given the hoarding topic much time.

Everyone who knows me will tell you that I have a thing against paper, or the use thereof. In this day and age, with cloud storage and connected everything, the use of paper is quickly becoming a thing of the past.
Tonight I realized with a shock that this can easily lead to a different kind of problem, not just in our personal lives but in the corporate world as well.

Let’s first look at what hoarding is:

Compulsive hoarding, also known as hoarding disorder, is a pattern of behavior that is characterized by excessive acquisition and an inability or unwillingness to discard large quantities of objects that cover the living areas of the home and cause significant distress or impairment.
https://en.wikipedia.org/wiki/Compulsive_hoarding 

So basically hoarding comes down to collecting junk and not being able to get rid of it. I definitely don’t have a problem there, but it triggered a thought about all the different kinds of hoarding, especially in the technology world, and I realized that an excessive collection of data can also be considered hoarding.
After a simple search on the net I quickly realized the topic has been debated quite extensively.

Digital hoarding (also known as e-hoarding) is excessive acquisition and reluctance to delete electronic material no longer valuable to the user. The behavior includes the mass storage of digital artifacts and the retainment of unnecessary or irrelevant electronic data. The term is increasingly common in pop culture, used to describe the habitual characteristics of compulsive hoarding, but in cyberspace.
https://en.wikipedia.org/wiki/Digital_hoarding

With a shock of horror I now know that I have fallen victim to data hoarding, and yes, it happens to all of us. I have a lot of old hard drives with countless backups made over the years, some going as far back as my school days, plus data lying around on cloud services, and it goes on and on.
I won’t classify this as a problem yet, as I haven’t shown a reluctance to get rid of it. I simply haven’t thought about my personal data retention policy, which we’ll get to in a bit. But the question everyone should ask themselves is “Do you really need that data from back in 2005?”.

Hoarding data in businesses and corporations even has a name now, Big Data, and people are making a living trying to give meaning to the endless amount of data that everyone, businesses included, collects over the years.
It’s perfectly understandable to store data because of legislation or local laws, such as storing medical or financial information for a number of years.

As more and more systems, people and businesses become connected and start to generate vast amounts of information, it becomes more and more pressing to know what data you should keep, what data you need, and what data is just causing clutter.

In our personal lives we are generating so much data on social networks, chatting, texting, emails, digital photography and videos that losing track of it all is a real concern.
I for one definitely did and will start to put measures in place to not only delete data but to organize the data that I need to keep.

Data retention defines the policies of persistent data and records management for meeting legal and business data archival requirements; although sometimes interchangeable.
https://en.wikipedia.org/wiki/Data_retention

Now that we know what data retention means, we need to define what we will store and why, and then lay out a plan for how we will clean up our data.

My checks for a data retention policy look like this:

  • Is the data a temporary record?
  • Does the data primarily consist of intellectual property?
  • Is the data a permanent record?
  • Have I needed or used the data in the last 3 years?
  • Is there a legal or contractual requirement to store the data?
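
To make the checklist above a bit more concrete, here is a minimal Python sketch of how the five questions could be turned into a keep / delete / review decision. The `Record` class, the `retention_decision` function and especially the order in which the answers are combined are my own assumptions for illustration; the questions themselves don't prescribe how to weigh them against each other.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Answers to the five checklist questions for one piece of data (hypothetical type)."""
    name: str
    is_temporary: bool = False
    is_intellectual_property: bool = False
    is_permanent: bool = False
    used_in_last_3_years: bool = False
    legally_required: bool = False


def retention_decision(record):
    """Return 'keep', 'delete' or 'review' for a record.

    The precedence of the checks is an assumption made for this sketch.
    """
    # Legal obligations, permanent records and IP are always kept.
    if record.legally_required or record.is_permanent or record.is_intellectual_property:
        return "keep"
    if record.used_in_last_3_years:
        # Recently used, but flagged as temporary: worth a second look.
        return "review" if record.is_temporary else "keep"
    # Not required, not permanent, not IP and unused for 3 years.
    return "delete"


if __name__ == "__main__":
    for item in (
        Record("tax returns", legally_required=True),
        Record("school backups"),
        Record("download folder", is_temporary=True, used_in_last_3_years=True),
    ):
        print(item.name, "->", retention_decision(item))
```

Anything that comes back as "review" is the stuff I would still want to look at by hand before deleting.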

My plan of action to deal with my data problems will be as follows:

  • Sort photos and videos across cloud services and social media, delete duplicates, organize into albums and consolidate into one service.
  • Look at all hard drives lying around and delete data that I have not used in 3 years or have no need to keep, then consolidate the data that I do need or use.
  • Consolidate all IP and code written to VSTS under the respective projects, including the code written for microcontrollers and hobby electronics.
  • Sort and store business related data and properly back up or archive a single copy in accordance with contracts.
  • Securely erase data from redundant or old hard drives and physically throw away the drives.
  • Ensure that I have a backup strategy in place that works, for example using the 3-2-1 strategy. This means having 3 total copies of your data, 2 of which are local but on different mediums (read: devices), and at least 1 copy offsite.
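
A few of the steps in this plan are easy to script. The following is a rough Python sketch, under some assumptions, of three helpers: finding duplicate files by content hash, listing files untouched for 3 years, and checking a set of copies against the 3-2-1 rule. All function names are made up for this example. "Not used" is approximated here by the file's last modification time, which is an assumption, since access times are often not reliable. Nothing in the sketch deletes anything; it only reports.

```python
import hashlib
import os
import time
from collections import defaultdict

THREE_YEARS_SECONDS = 3 * 365 * 24 * 60 * 60  # the 3-year rule, ignoring leap days


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def iter_files(root):
    """Yield the path of every regular, non-symlink file below root."""
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            if os.path.isfile(path) and not os.path.islink(path):
                yield path


def find_duplicates(root):
    """Group files with identical content; return only groups of 2 or more."""
    groups = defaultdict(list)
    for path in iter_files(root):
        groups[file_digest(path)].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]


def find_stale(root, now=None, max_age=THREE_YEARS_SECONDS):
    """Return files whose last modification is older than max_age seconds."""
    now = time.time() if now is None else now
    return sorted(p for p in iter_files(root) if now - os.path.getmtime(p) > max_age)


def satisfies_3_2_1(copies):
    """copies: list of (device, is_offsite) tuples, one tuple per copy of the data."""
    local_devices = {device for device, offsite in copies if not offsite}
    has_offsite = any(offsite for _device, offsite in copies)
    return len(copies) >= 3 and len(local_devices) >= 2 and has_offsite
```

For example, `satisfies_3_2_1([("laptop", False), ("usb-drive", False), ("cloud", True)])` passes the check, while three copies that all sit on local devices do not. The output of `find_duplicates` and `find_stale` is only a list of candidates, so I would still run each one past the retention questions before deleting it.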

Now that I have a plan I can start getting rid of my digital clutter, clean up my life, and get away from this data hoarding thing.

If anyone has interesting stories regarding data hoarding, please do leave me a comment or send me a message.
