Data Discovery and Classification: Key Concepts for Data Mapping

Osano

If you’re a data person, or even if you’re not, you may have heard the statistic cited by Eric Schmidt, then CEO of Google: “There were 5 exabytes of information created between the dawn of civilization through 2003, but that much information is now created every two days.”

That’s not even the crazy part: Schmidt’s quote is from 2010. Now, roughly 330 exabytes of information are created every day.

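If your brain isn’t broken yet, a single exabyte is equal to one billion gigabytes. If an exabyte were burned onto single-layer DVDs (4.7 GB each, 1.2 mm thick), the stack would stand roughly 250 kilometers tall, more than twice the altitude usually taken as the edge of space.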

Clearly, that’s a nearly unfathomable amount of data. But what is collected? Why? Where is it stored? And, perhaps most importantly, why does it all matter? On the one hand, these are existential questions. But they’re also questions that every business needs to ask for the sake of compliance, for operational excellence, and to ensure they’re using the right data in the right way—because it’s the right thing to do. 

Every organization has an ever-expanding data footprint, which makes it challenging to understand where data resides and whether it is handled in compliance with data privacy regulations like the General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA) and the California Privacy Rights Act (CPRA), and others. 

A data map helps you figure out the answers by supporting data discovery and data classification. In turn, data discovery and classification enable you to fulfill a spectrum of compliance needs. We’ll dive into what a data map is, how it relates to data discovery and classification, and how data discovery and classification can support your organization’s compliance. 

What Is Data Mapping? 

First, let’s clarify what we mean by data mapping. It can mean different things in different contexts. A common definition is the technical process of mapping fields from one database to another. But that’s not the one we’re using for the purposes of this article. 

Instead, we’ll focus on the data privacy concept of a data map. That definition refers to a visualization of all the stores and flows of personal information across your organization. To create this visualization, you’ll first need a data inventory that lists all the applications and systems in use, along with metadata about each one, such as its owner or admin, connected data stores, and types of data handled. That’s what we mean by a data map in the context of data discovery and data classification.
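To make the inventory-plus-flows idea concrete, here is a minimal Python sketch. It is only an illustration: the class names, field names, and sample systems (`SystemRecord`, `DataFlow`, `crm`, `email-vendor`) are assumptions made for this example, not part of any particular product or standard.

```python
from dataclasses import dataclass, field


@dataclass
class SystemRecord:
    """One entry in the data inventory (field names are illustrative)."""
    name: str
    owner: str
    data_types: set[str] = field(default_factory=set)
    connected_stores: list[str] = field(default_factory=list)


@dataclass
class DataFlow:
    """A directed flow of personal information between two inventoried systems."""
    source: str
    destination: str
    data_types: set[str]


def build_data_map(systems, flows):
    """Return a mapping of system name -> list of outgoing flows."""
    data_map = {system.name: [] for system in systems}
    for flow in flows:
        # Every flow must start and end at a system the inventory knows about.
        if flow.source not in data_map or flow.destination not in data_map:
            raise ValueError(f"flow references a system missing from the inventory: {flow}")
        data_map[flow.source].append(flow)
    return data_map


# Hypothetical sample inventory with a single flow between two systems.
inventory = [
    SystemRecord("crm", owner="sales-ops", data_types={"email", "phone"},
                 connected_stores=["crm-db"]),
    SystemRecord("email-vendor", owner="marketing", data_types={"email"}),
]
flows = [DataFlow("crm", "email-vendor", {"email"})]
data_map = build_data_map(inventory, flows)
```

The inventory is the list of records, and the data map is what you get once flows are attached to it. Rejecting flows that point at unknown systems is one simple way to keep the two in sync.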

What Is Data Discovery? 

So, in the context of data mapping for data privacy compliance, what is data discovery? 

Essentially, it’s the process of discovering data within your data map. Outside of data privacy compliance, you could engage in data discovery for all sorts of purposes, such as reducing spend, spotting redundant vendors, identifying new market opportunities, and more.

Typically, organizations interested in this broader sense of data discovery invest in powerful business intelligence tools and data science experts. This approach enables you to ask pretty much any question about your data, but the trouble is, that flexibility comes with complexity. Ultimately, that complexity translates into slower outcomes. 

If your interest is primarily in achieving an outcome like data privacy compliance, data discovery can be a much narrower and less complex concept. When it comes to data privacy, data discovery is the process of finding data that must be managed to achieve compliance. 

For example, with your data map, you’ll be able to discover: 

  • The data needed to fulfill subject rights requests. 
  • What data you are sending to third-party vendors. 
  • Where you collect sensitive personal information. 
  • And other compliance-related use cases for data. 
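The discovery questions in the list above can be thought of as simple queries over a data inventory. The Python sketch below assumes a toy inventory stored as a dictionary. The system names, the `third_party` and `sends_to` keys, and the `SENSITIVE_TYPES` set are all hypothetical, and what counts as sensitive in practice depends on the laws that apply to you.

```python
# Hypothetical inventory: each system lists the personal data it holds,
# whether it is a third-party vendor, and which systems it sends data to.
INVENTORY = {
    "web-app": {"third_party": False, "data_types": {"email", "ip_address"},
                "sends_to": ["crm", "analytics-vendor"]},
    "crm": {"third_party": False, "data_types": {"email", "phone", "health_notes"},
            "sends_to": ["email-vendor"]},
    "analytics-vendor": {"third_party": True, "data_types": {"ip_address"},
                         "sends_to": []},
    "email-vendor": {"third_party": True, "data_types": {"email"},
                     "sends_to": []},
}

# Assumed set of sensitive data types, for illustration only.
SENSITIVE_TYPES = {"health_notes", "biometric", "precise_geolocation"}


def systems_for_subject_request(inventory, identifier_type="email"):
    """Systems to search when a data subject is identified by `identifier_type`."""
    return sorted(name for name, system in inventory.items()
                  if identifier_type in system["data_types"])


def vendor_data_flows(inventory):
    """List (source, vendor, shared data types) for each flow into a third party.

    The shared types are approximated as the overlap between what the source
    holds and what the vendor holds; a real data map would record them per flow.
    """
    result = []
    for name, system in inventory.items():
        for destination in system["sends_to"]:
            if inventory[destination]["third_party"]:
                shared = system["data_types"] & inventory[destination]["data_types"]
                result.append((name, destination, sorted(shared)))
    return result


def systems_with_sensitive_data(inventory):
    """Systems that hold at least one sensitive data type."""
    return sorted(name for name, system in inventory.items()
                  if system["data_types"] & SENSITIVE_TYPES)
```

With this sample data, `systems_for_subject_request(INVENTORY)` returns `['crm', 'email-vendor', 'web-app']`, and `systems_with_sensitive_data(INVENTORY)` returns `['crm']`.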

While general business intelligence solutions for data discovery do exist, privacy professionals will want to invest in a privacy-focused solution for data mapping and data discovery that reduces complexity and operational headaches (such as being bottlenecked by in-demand data science resources) rather than adds to them.  

What Is Data Classification?  

Data classification is the process of categorizing data based on various characteristics, such as its sensitivity, importance, and access controls.  

These categorizations are well articulated and documented by standards organizations. In general, NIST and ISO standards suggest four classification levels:  

  • Public data: Information that is freely available to the public. This could include data from news articles, government data sets, or open-source software code.   
  • Private or internal data: Data meant for internal use within the organization. This includes employee records, internal memos, or payroll data. 
  • Confidential data: Data that requires strict access controls, such as personally identifiable information (data that can identify an individual, like an address or phone number) or information found in financial records.
  • Highly confidential or restricted data: The most sensitive data an organization holds, such as personal health records, biometric data, data subject to privacy laws, identity or access management data, and national security information.

In terms of data privacy compliance, we’re most concerned about the latter two categories (although employee data, which would fall under “private or internal data,” is also covered under the GDPR and CPRA). Your data map should classify which systems and data flows handle sensitive personal information versus regular consumer information. Outside the context of data privacy compliance, you may need additional classifications in your data map to align with cybersecurity requirements and other regulatory needs.
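As a rough illustration of how the four levels might be applied to systems in a data map, the Python sketch below treats a system as being as sensitive as its most sensitive field. The `FIELD_LEVELS` mapping, the field names, and the conservative default for unknown fields are assumptions made for this example, not rules taken from NIST, ISO, or any privacy law.

```python
from enum import IntEnum


class Classification(IntEnum):
    """The four levels described above, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


# Assumed mapping from field types to levels, for illustration only.
FIELD_LEVELS = {
    "press_release": Classification.PUBLIC,
    "internal_memo": Classification.INTERNAL,
    "payroll": Classification.INTERNAL,
    "address": Classification.CONFIDENTIAL,
    "phone": Classification.CONFIDENTIAL,
    "health_record": Classification.RESTRICTED,
    "biometric": Classification.RESTRICTED,
}


def classify_system(field_types, default=Classification.CONFIDENTIAL):
    """Classify a system by its most sensitive field.

    Unknown field types fall back to `default`, which is deliberately
    conservative so that unreviewed data is not treated as public.
    """
    if not field_types:
        raise ValueError("cannot classify a system with no known fields")
    return max(FIELD_LEVELS.get(name, default) for name in field_types)


def needs_privacy_review(level):
    """Flag the two highest levels, the ones this section focuses on.

    Internal data such as employee records can also fall under privacy laws,
    so treat this as a first-pass filter rather than a complete test.
    """
    return level >= Classification.CONFIDENTIAL
```

For example, `classify_system({"phone", "health_record"})` evaluates to `Classification.RESTRICTED`, because a single restricted field raises the level of the whole system.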

Automation and Integration: The Keys to Effective Data Mapping and Classification 

Privacy-focused data discovery and classification solutions need to provide an integrated approach to gaining visibility and control by combining data mapping capabilities with a suite of privacy compliance tools. Automation is also key: By automating data discovery, you can save significant time and effort. 

For example, the Osano platform works like this:  

  • Osano data mapping automatically discovers connected systems that process personal information by connecting with your organization’s single sign-on (SSO) provider or customer data platform (CDP). 


Here we see two sources of personal data stores: an Okta SSO instance and an assessment that will be sent to owners of data stores that sit outside of Okta. 

  • The platform scans systems containing personal data and provides metadata about the data field types, vendor data flows, and more, enabling you to prioritize high-risk systems for assessment. 


This data map shows flows of data between systems, as well as automatically and manually identified metadata. 

  • For systems outside your SSO or CDP ecosystem, Osano provides automated workflows to quickly map and track those data assets while informing relevant stakeholders of outstanding tasks. 

Then, the discovered data flows into Osano's Subject Rights Management to fulfill data subject access requests. Processing activities can also feed into Osano's Assessments to generate records of processing activities (RoPAs).

Additional capabilities within Osano, such as cookie consent, PIAs, vendor assessments, and more, are all unified around your organization’s central data map. So, the data you collect, the information you discover, and the processes you set around it become part of a unified, integrated privacy program that helps you reduce work, improve compliance and get a handle on your own stratospheric stack of data. 
