Data Classification – What It Is, Types, and Best Practices
Data classification helps you secure your data in line with compliance requirements and company policy. But where should you even begin in the classification process?
To start, let’s go through the main data classification types. The four main classifications for data are:
However, these types may vary depending on the organization. Each of these levels determines who has access to the data and how long the data must be retained.
This post, the first of three, will help organizations create a data classification program, including program prerequisites and taskforce member responsibilities to ensure proper governance. I will detail the development process in a future post.
Conversations and meetings about what data classification is and how to define it within organizations have been going on for the past two decades. It is the classic “Coke can” experiment: a group of people sit around a Coke can and describe what they see, without saying “it’s a Coke can.” Everyone will have a unique view and no two descriptions will be the same.
Now imagine the same exercise but replace the Coke can with your organization’s data. Data classification becomes extremely complicated for an organization with different business functions, deliverables, and needs. It can make you want to look for other things to do with your day. Data classification is difficult, boring, and unglorified. You will, however, need to embrace it to create an effective cybersecurity program.
Any article on data classification will tell you it must factor into an organization’s information security and compliance program. This generic statement will garner universal acceptance with your management team, but data classification requires a lot of heavy lifting. Data classification desires, needs, and even definitions vary between groups in an organization.
Data classification typically uses a three- or four-level system. The three-level version discussed in this post consists of Public, Confidential, and Highly Confidential.
I recommend organizations new to data classification begin with the 3-level system as these levels and their corresponding actions and controls can be challenging to define. The 3-level system considers all internal data confidential so you can clearly communicate your goals across the business, including locations, processes, and applications. First, create the processes and procedures needed to support confidential data. You can identify the limited amount of Public and Highly Confidential data later through interviews and technical discovery.
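The default-to-Confidential idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed implementation: the `Classification` enum, the `classify` function, and the asset names in `EXCEPTIONS` are all hypothetical.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Three-level scheme; a higher value means more sensitive."""
    PUBLIC = 1
    CONFIDENTIAL = 2
    HIGHLY_CONFIDENTIAL = 3


# Hypothetical exceptions, recorded later through interviews
# and technical discovery. The asset names are made up.
EXCEPTIONS = {
    "press-releases": Classification.PUBLIC,
    "payroll-db": Classification.HIGHLY_CONFIDENTIAL,
}


def classify(asset_name: str) -> Classification:
    """Treat every internal asset as Confidential unless an exception is recorded."""
    return EXCEPTIONS.get(asset_name, Classification.CONFIDENTIAL)


if __name__ == "__main__":
    for name in ("press-releases", "team-wiki", "payroll-db"):
        print(f"{name}: {classify(name).name}")
```

Because anything not listed falls back to Confidential, the exception table can start empty and grow as Public and Highly Confidential data is identified.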
Before You Start Your Data Classification Program
A data classification program cannot be created and deployed in a vacuum. The following cybersecurity program components must be in place before any data classification planning can begin:
- Asset Management – Owned by IT. The organization needs to know which systems contain sensitive, Confidential, or Highly Confidential data. A data classification program without an effective asset management process already in place won’t work; you won’t get past the drawing board stage.
- Incident Response (IR) – Owned by Cybersecurity. You must have a plan and process in place in the event that Confidential or Highly Confidential data is breached. Organizations with immature cyber programs often struggle with Incident Response because breaches involving different data types require different response levels. These response levels must be established prior to starting a data classification program.
- Regulated Data Sets – Owned by Compliance. Most data is regulated (e.g., financial data and intellectual property). You must determine what regulated data you have before you begin a data classification program. These data sets, once defined, will also help you establish your DLP rules and location search.
- Privacy Data Sets – Owned by Privacy. Much like the regulated data sets, privacy data needs to be predetermined. Don’t cut corners here. A blanket statement like “Well, it’s just personally identifiable information” will spell disaster. Your Cyber and Privacy teams must align on privacy data definitions and rules including:
- Will the organization classify Customer IDs as personally identifiable information (PII)?
- Are any PII data types more sensitive than others?
- Do any regulations require data to be contained to any specific location or jurisdiction?
Organizations must demonstrate compliance with several additional privacy requirements to ensure a successful data classification program.
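The four prerequisites and their owners can also be tracked as a simple readiness check. The following is a minimal sketch: the component names and owning teams come from the list above, while the `missing_prerequisites` function and the example `done` set are hypothetical.

```python
# Prerequisite components mapped to the team that owns each one.
PREREQUISITES = {
    "Asset Management": "IT",
    "Incident Response": "Cybersecurity",
    "Regulated Data Sets": "Compliance",
    "Privacy Data Sets": "Privacy",
}


def missing_prerequisites(completed: set[str]) -> list[str]:
    """List each unfinished prerequisite together with its owning team."""
    return [
        f"{component} (owner: {owner})"
        for component, owner in PREREQUISITES.items()
        if component not in completed
    ]


if __name__ == "__main__":
    # Hypothetical status: two of the four components are in place.
    done = {"Asset Management", "Regulated Data Sets"}
    for item in missing_prerequisites(done):
        print("Not ready:", item)
```

An empty result means all four components are in place and classification planning can begin.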
Create a Data Classification Taskforce
A highly effective data classification program will have input from numerous business verticals.
You will find some departments more cooperative than others. For example, you will not need to convince IT to participate. Virtually any CIO will want a mature data classification program because it allows IT departments to automatically prioritize the systems, business processes, and applications they provide and maintain.
I recommend you start with the Regulators. They usually understand the program’s importance and also know their data sets very well. Next, engage with Risk and Legal. They too know their data but will probably require some training on their role and their deliverables. You can work much more efficiently and effectively once all the teams are on the same page. Make them a part of the program development process going forward. Define the data classifications together. Co-develop the training materials required to inform the business units about the program. Then communicate (rather than dictate) procedural changes in handling certain data types to ensure compliance with the new classification program.
The Taskforce: Deliverable, Role, Motivation
Data classification programs frequently fail in their implementation unless each group contributes something to make the program successful.
In my next post, we will take a deep dive into the classification schema and best practices for defining data.