Data Discovery and Classification

Discover, classify, and assess data risks to close security and compliance gaps—all from a single pane of glass.

What is Data Discovery and Classification?

Data Discovery and Classification is the process of identifying and classifying your data and the personal information within it, allowing you to be more secure and more compliant.

Data discovery and data classification are the two most essential components of data management. 

  • Data discovery is the process of identifying and locating data sources, understanding data formats, and assessing data quality. 
  • Data classification is the process of labeling and tagging data based on its sensitivity, value, and regulatory requirements. Data is most commonly classified as public, internal, confidential, or restricted.
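As a rough illustration of the four common labels above, the sketch below orders them by sensitivity and assigns a data set the strictest label among the information types found in it. The information-type names and their mapping to labels are hypothetical examples, not taken from any product.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """The four common labels, ordered from least to most sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Hypothetical mapping from a detected information type to a label.
INFO_TYPE_LABELS = {
    "marketing_copy": Sensitivity.PUBLIC,
    "employee_directory": Sensitivity.INTERNAL,
    "email_address": Sensitivity.CONFIDENTIAL,
    "credit_card_number": Sensitivity.RESTRICTED,
}


def classify(info_types):
    """Return the strictest label among the detected information types.

    Unknown types default to INTERNAL (an assumption of this sketch);
    an empty input is treated as PUBLIC.
    """
    labels = [INFO_TYPE_LABELS.get(t, Sensitivity.INTERNAL) for t in info_types]
    return max(labels, default=Sensitivity.PUBLIC)
```

Taking the maximum means a single restricted item is enough to make the whole data set restricted, which is a conservative design choice.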

70% of enterprises are able to classify only 50% or less of their data.
– Thales Data Threat Report

Find and protect your data with CipherTrust

CipherTrust Data Discovery and Classification (DDC) discovers and classifies data, enabling organizations to become more secure and compliant. Bring agility and confidence to your data management. CipherTrust Data Discovery and Classification provides complete visibility into the location of sensitive data across your enterprise, so you can uncover and close compliance gaps.

DDC scans structured as well as unstructured data stores for named entities in different formats and global languages to help you find any type of sensitive data, in any language, anywhere across your enterprise.
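To make the idea of scanning text for a sensitive information type concrete, here is a minimal sketch that finds payment card numbers in free text. It is not how DDC works internally; it assumes a simple regular expression for candidates plus the standard Luhn checksum to cut false positives. The function names are hypothetical.

```python
import re

# Candidate: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return card-number candidates in text that pass the Luhn check."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_valid(digits):
            hits.append(digits)
    return hits
```

For example, `find_card_numbers("card 4111 1111 1111 1111 on file")` returns the well-known test number as a single hit, while a digit run that fails the checksum is ignored.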

  • Discover and Classify your Data
  • Better Secure your Data
  • Enhance your Compliance Posture

Why you'll love CipherTrust Data Discovery and Classification

Enhance your data security

  • Easily discover and classify your data

  • Central platform for data management

  • Real-time data dashboards and reports

  • Discovery of custom info types

Easily manage data security

  • Secure sensitive data
  • Understand your security risks
  • Discover secrets before they become security issues
  • Know what data needs what level of security

Address compliance requirements

  • Uncover compliance gaps
  • Reduce risk
  • Discard unnecessary data
  • Stay ahead of regulatory changes

    Data Stores

    • Local storage: SharePoint On Prem, Exchange Server, Local Windows and Linux
    • Network storage: Windows Share (CIFS/SMB), Unix File System (NFS)
    • Database: IBM DB2, Microsoft SQL, MongoDB, MySQL, Oracle DB, PostgreSQL, SAP HANA, SQL
    • Big Data: Hadoop clusters, Teradata
    • Cloud: AWS S3 Buckets, Azure Blobs and Table, Google Workspace (Gmail and Gdrive), Office 365 (Exchange, SharePoint, & OneDrive), SalesForce

    Types of files supported

    • Databases: Access, Dbase, SQLite, MSSQL MDF & LDF
    • Images: BMP, FAX, GIF, JPG, PDF (embedded), PNG, TIF
    • Compressed: Bzip2, Gzip (all types), TAR, ZIP (all types)
    • Microsoft Backup Archive: Microsoft Binary / BKF
    • Microsoft Office: v5, 6, 95, 97, 2000, XP, 2003 onwards
    • Office Files: Word, Excel, PowerPoint, Access, Outlook, Other (.pub & .xps)
    • Open Source: Star Office / Open Office / Libre Office
    • Open Standards: PDF, RTF, HTML, XML, CSV, TXT

    Regulations supported

    • APA: Australia Privacy Amendment
    • APPI: Act on Protection of Personal Information
    • CCPA: California Consumer Privacy Act
    • GDPR: General Data Protection Regulation
    • GDPR Financial
    • GDPR Healthcare
    • GDPR National ID
    • GDPR Personal Details
    • HIPAA: Health Insurance Portability and Accountability Act
    • KVKK: Turkish Personal Data Protection Law
    • LGPD: General Data Protection Law (Brazil)
    • NYDFS: New York State Department of Financial Services
    • PCI DSS: Payment Card Industry Data Security Standard
    • SHIELD: Privacy Shield Framework
    • UK-GDPR: General Data Protection Regulation (UK)

    DATA DISCOVERY AND CLASSIFICATION FEATURES

    Gain the visibility you need into sensitive data across your enterprise

    Enhanced security

    Without understanding where your data is, it’s hard to prevent data breaches and unauthorized access and implement security measures to protect your data. Classifying your data ensures it is managed according to compliance regulations.

    Data management

    Eliminate excess storage and costs by reducing redundant or obsolete stored data that has no business value to your organization. Create streamlined workflows that automate future processes and grant access based on user needs.

    Ongoing compliance

    Stay up to date automatically with all major global and regional compliance requirements, and reduce the risk of failed audits or fines by proactively identifying compliance gaps.

    Secrets discovery

    Threat actors who discover secrets such as tokens, API keys, passwords, or usernames can use them to break into IT systems. CipherTrust Data Discovery and Classification uses AI to proactively scan code for specific patterns, making developers aware of exposed secrets before they become security threats. Secrets discovery helps stop malicious actors before they gain unauthorized access to your data.
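The idea of scanning code for specific patterns can be sketched with plain regular expressions. This is a simplified stand-in for illustration, not the AI-based detection the product describes, and the three patterns and the `scan_source` helper are hypothetical examples rather than DDC's actual rules.

```python
import re

# Illustrative patterns only; real scanners use many more rules.
SECRET_PATTERNS = {
    # AWS access key IDs commonly start with "AKIA" followed by 16 characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM header that introduces a private key.
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    # A quoted literal assigned to something named "password".
    "hardcoded_password": re.compile(
        r"(?i)\bpassword\s*=\s*['\"][^'\"]+['\"]"
    ),
}


def scan_source(source: str):
    """Return (line_number, pattern_name) for every suspected secret."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

Reporting the line number along with the rule name lets a developer jump straight to the offending line and rotate or remove the secret.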

    Comprehensive platform

    Gain visibility into the location of sensitive data so you can uncover and close security and compliance gaps. DDC scans structured and unstructured data stores for named entities in different formats and global languages to help customers find sensitive data in any language anywhere across their enterprise.

    Simplified installation and management

    Identifying sensitive data across the enterprise is time-consuming, underscoring the need for tools that facilitate quick deployment, enabling prompt scanning. The installation of DDC has been streamlined through automated scripts, simplifying on-premises deployments for single-node and five-node configurations. The setup process for cloud installations requires only a single click, automatically integrating all components into the cloud environment.

    IMPROVE YOUR DATA SECURITY

    Find out how CipherTrust Data Discovery and Classification can help your business

    Connect with a Thales expert for help tailoring a data security plan to your organization's needs.

    Request a Free Consultation

    Frequently asked questions

      How does CipherTrust Data Discovery and Classification work?

      CipherTrust Data Discovery and Classification provides complete visibility into the location of sensitive data across your enterprise, so you can uncover and close compliance gaps. 

      DDC scans structured as well as unstructured data stores for named entities in different formats and global languages to help you find any type of sensitive data, in any language, anywhere across your enterprise. 

      Once you have identified security blind spots you can quickly remediate using one of the CipherTrust Platform’s market-leading encryption solutions.

      What customer problems and use cases does DDC address?

      Compliance with security and privacy regulations: Organizations need to protect Personally Identifiable Information (PII) against data leaks and improper use to comply with the requirements of privacy regulations like General Data Protection Regulation (GDPR), California Consumer Privacy Act (CCPA), Health Insurance Portability and Accountability Act (HIPAA), and Brazilian General Data Protection Law (LGPD). CipherTrust DDC provides complete visibility into the location of sensitive data, enhancing an organization’s ability to adopt appropriate data security controls and measures to protect sensitive personal data from loss and unauthorized access. 

      Increase data visibility: Organizations need greater data visibility to support better decision making for risk analysis and remediation, as well as reporting and compliance. According to the 2024 Thales Data Threat Report, 70% of enterprises are able to classify only 50% or less of their data. CipherTrust DDC automatically scans your data stores across on-premises, hybrid, and multi-cloud environments to help you protect and manage your sensitive data. 

      Reduce exposure risk during cloud migration: Major change programs like digital transformation involve moving large amounts of sensitive data from one environment to another. Uncontrolled dispersal of data across cloud platforms increases the potential of a data breach event, as well as infringement of privacy regulations. As IT environments become more complex, it becomes more difficult to discover sensitive data and have oversight or manage access across data sources. CipherTrust DDC provides visibility into exactly what information you have stored so you can plan an effective strategy for transformation to safeguard data at each stage of the process. 

      Secrets discovery: Modern development trends like containerization, DevOps and automation have contributed to a massive increase in the use of secrets (credentials, certificates, keys) for authentication. Secrets can be vulnerable to cyberattacks when not securely managed. The CipherTrust Data Security Platform provides a simplified workflow to address this risk. DDC automatically discovers more than 30 different types of secrets, including AES Keys, Auth Secrets, and SSH Keys. Once exposed secrets have been discovered, security teams can take actions to remediate the risk and improve security posture using CipherTrust Secrets Management.

      How does the solution help protect data and enforce compliance?

      CipherTrust Data Discovery and Classification efficiently locates sensitive data across an enterprise using a streamlined workflow that automates discovery, classification and protection, eliminates security blind spots, and provides a clear view of the sensitive data and its risks. 

      It comes with a comprehensive set of built-in templates for rapid discovery of regulated data. As a result, organizations can more easily uncover and close their data protection gaps, prioritize their remediation efforts, and proactively respond to a growing number of data privacy and data security regulations.

      How does DDC communicate securely with data stores?

      CipherTrust Data Discovery and Classification communicates with the data stores through agents. The agents can be installed locally on the data stores or remotely from them. The agents connect to the data sources using native protocols, for example NFS for Unix Share, SMB for Windows Share, and HDFS for Hadoop.

      Each protocol has its own way of protecting data. For example:

      • Databases use user and password authentication with SSL/TLS.
      • NFS can be secured using host access and file permissions configuration.
      • SMB uses user, password and domain authentication.
      • Hadoop uses a proprietary protocol.

      The customer is responsible for the following:

      • Choosing between TLS and plain-text connections
      • Hardening the servers where the agents are deployed

      What are the deployment options?

      The solution is deployed on-premises by installing an agent on the host, or remotely via a proxy agent.

      What are the pros and cons for agent-based and agentless deployments?

      Below are the benefits of agent-based and agentless deployments, with a recommendation for when to use each.

      Agent-Based

      • Value 
        • Information does not need to be transmitted over the network to be scanned
        • Faster scan 
        • No need for credentials for scanning 
      • Recommendation 
        • Scanning data stores that allow an Agent to be installed locally, for example local storage and local memory on a server or workstation with installed Agents

      Agentless

      • Value 
        • Faster deployment, as Agents do not have to be installed directly on Target hosts
        • Can scan multiple targets
        • Can scan any type of target
        • Does not consume resources on the target host
      • Recommendation 
        • Scanning data stores that can only be accessed remotely, for example database systems, email servers, cloud storage and network storage locations
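The two recommendations above can be summarized as a small decision helper. This is only a sketch of the guidance as stated here; the store-type names and the `recommend_scan_mode` function are hypothetical and do not correspond to any DDC setting.

```python
# Hypothetical store-type names, grouped according to the recommendations above.
LOCAL_STORES = {"local_storage", "local_memory"}
REMOTE_ONLY_STORES = {
    "database",
    "email_server",
    "cloud_storage",
    "network_storage",
}


def recommend_scan_mode(store_type: str, agent_installable: bool) -> str:
    """Suggest 'agent-based' or 'agentless' scanning for a data store."""
    if store_type in REMOTE_ONLY_STORES:
        # These stores can only be reached remotely.
        return "agentless"
    if store_type in LOCAL_STORES and agent_installable:
        # Data stays on the host, so nothing crosses the network to be scanned.
        return "agent-based"
    # Fall back to agentless when no Agent can be installed on the host.
    return "agentless"
```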

      How does DDC compare with Data Loss Prevention (DLP) solutions?

      DLP solutions focus on preventing sensitive data from leaving the organization’s perimeter. CipherTrust Data Discovery and Classification focuses on data privacy and protection – identifying sensitive data, and getting a clear understanding of data and its risk.

      This enables organizations to take appropriate steps to protect their data and comply with data privacy and data security regulations.