One of the biggest challenges facing IT organizations is pinpointing the location of critical data throughout the enterprise. As businesses grow, data and its use grow exponentially. From personally identifiable information and customer information to trade secrets and data governed by regulations, much of this data is sensitive and critical to the business. It must be identified, monitored, audited, and protected from misuse and theft.
The need to comply with regulations, meet auditor demands, and minimize data risk adds to the challenge. To meet these needs, the focus must shift to the data itself. Traditional security solutions that focus on the external threat are not the answer. Data-focused technology is required.
Understanding where data is located is the foundation of a sound framework for assessing governance and compliance risk. Therefore, data discovery is a critical component of risk mitigation. While much attention has been paid to discovering data on laptops and other endpoint devices, a bigger problem looms within the data center. The same principles of data discovery must be applied inside the data center itself. This article explores data discovery in the data center.
Understanding data discovery
Typically, sensitive information is scattered across the enterprise. Regulations such as the Payment Card Industry Data Security Standard (PCI DSS), the Gramm-Leach-Bliley Act (GLBA), the Sarbanes-Oxley Act (SOX), and the Personal Data Privacy Act of 2007 require companies to protect data determined to be private. Customer information, employee information, and operational information can reside in databases and file shares hidden and unprotected in the data center.
Over the years, the data center can grow to include potentially hundreds or thousands of database servers that store this sensitive information. These servers can be scattered across the globe. Developers and quality assurance testers who create their own databases with sensitive information compound this situation so that most companies simply do not know where all their data is. This lack of visibility into critical data assets leaves companies exposed to significant risks.
The ability to understand where data stores are located, what they contain, and who has access to them is a critical step in understanding data risk and achieving data governance and compliance. Identifying data, determining its location, and knowing whether it is in use or at rest enables companies to assess the effectiveness of data classification policies and procedures. Data discovery initiatives also frequently uncover data used for application development and testing, allowing an organization to control data migration and replication for both development and testing needs.
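As a rough illustration, locating unknown data stores often begins with a network sweep for well-known database listener ports. The sketch below is a minimal example of that idea in Python; the port-to-engine mapping, hosts, and timeout are illustrative assumptions, and real discovery tools add service fingerprinting, non-default ports, and credentialed scans.

```python
import socket

# Common default listener ports for popular database engines.
# Illustrative only: production scanners also fingerprint services
# and probe non-default ports.
DB_PORTS = {
    1433: "Microsoft SQL Server",
    1521: "Oracle",
    3306: "MySQL",
    5432: "PostgreSQL",
    50000: "IBM Db2",
}

def probe(host, port, timeout=0.5):
    """Return True if a TCP service is listening on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def discover_databases(hosts):
    """Build an inventory of likely database servers on the given hosts."""
    found = []
    for host in hosts:
        for port, engine in DB_PORTS.items():
            if probe(host, port):
                found.append((host, port, engine))
    return found

if __name__ == "__main__":
    for host, port, engine in discover_databases(["10.0.0.5", "10.0.0.6"]):
        print(f"{host}:{port} looks like {engine}")
```

An inventory like this answers only the "where are the data stores?" question; classifying what is inside them is a separate step, discussed below.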
Best practices for data discovery and risk management
Data risk poses significant business problems and needs to be controlled with the same discipline corporations bring to understanding and managing business risk.
Risk management requires knowledge about data assets: where they are, what is happening to them, what could go wrong, and, most importantly, the costs associated with those potential events. For companies with stable assets, this is built into standard operations. For companies with changing assets and evolving centers of value, the challenge is to become aware of the shifts and deal with them as quickly as possible. Data is one of those changing assets.
To this end, organizations must control three key areas of data risk. These risks fall into the category of "unknown unknowns": unknown data stores, unknown user access, and unknown locations of sensitive data. Simply put, organizations must:
1. Discover sensitive data -- Discovery begins with finding and identifying critical data assets in the network, determining what data stores exist, and finding information inside of those data stores.
2. Assess data activity risk -- To understand data risk, an organization must know who has access to which data. Visibility into data usage and the associated risks is essential for developing the appropriate compliance and security strategy. This includes identifying how data is being used and which users and applications are accessing it, from where, and when. This assessment must encompass all applications, users, and processes that access sensitive data.
3. Ensure data compliance -- To comply with a variety of privacy and financial regulations, dozens of data protection requirements must be addressed. Compliance with these regulations includes the ability to provide regular and detailed reports that address the requirements of outside assessors and internal stakeholders.
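The first step above, finding sensitive information inside data stores, can be sketched as a simple content classifier run over sampled rows. The regular expressions and column names below are illustrative assumptions, not production-grade detectors; real tools add checksum validation (e.g. Luhn for card numbers) and many locale-specific formats.

```python
import re

# Illustrative patterns for a few common categories of sensitive data.
PATTERNS = {
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def classify(text):
    """Return the set of sensitive-data categories matched by a value."""
    return {name for name, rx in PATTERNS.items() if rx.search(text)}

def scan_rows(rows):
    """Scan sampled rows (dicts of column -> value) and report which
    columns appear to hold sensitive data."""
    findings = {}
    for row in rows:
        for column, value in row.items():
            hits = classify(str(value))
            if hits:
                findings.setdefault(column, set()).update(hits)
    return findings
```

In practice a discovery tool would sample rows from each table found in the inventory step and feed them through a classifier like this to flag columns for monitoring.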
The data discovery market
There are many solutions in the market today that claim to help identify data assets. Let's take a closer look at some of these tools:
- Data Classification Tools – Help “categorize” data, primarily for the purpose of tiered storage and are focused on finding unstructured data on a variety of file shares. This data can be categorized by content, file type, usage and many other variables. Once categorized the tools can also help identify the most appropriate storage solution.
- Data Leak Prevention Tools – Designed to prevent sensitive data from leaving the enterprise over email, instant message, or illegal copying of data to removable devices. The discovery component of these tools scans file shares, identifies different types of unstructured data, and then classifies it. Polices are then written to monitor the flow of this data and stop unapproved activity.
- Fileshare crawlers – There are a number of free tools on the market that will simply crawl a file share looking for different file types and create an inventory list of what they find.
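A minimal version of the fileshare crawler described in the last bullet can be sketched in a few lines of Python; an inventory-by-extension listing like this is roughly what such free tools produce.

```python
import os
from collections import Counter

def inventory(root):
    """Walk a file share and count files by extension, producing the
    kind of inventory list a simple fileshare crawler generates."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower() or "(none)"
            counts[ext] += 1
    return counts
```

Note that this only catalogs unstructured files; it says nothing about database content, which is exactly the gap discussed next.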
Though these tools serve a useful purpose, they all miss a critical area of the enterprise: the data center. None of these tools identifies database servers or database systems, or discovers sensitive content inside the databases. And as we know from the Verizon report, this creates tremendous risk in the enterprise.
Database discovery in the data center provides the answer
Database activity monitoring (DAM) solutions mitigate data risk by discovering critical data in the data center, intelligently monitoring and analyzing the activity that affects it, providing detailed auditing trails, and reporting on all user access to data stored in open systems such as databases, fileservers, and legacy applications such as mainframe systems.
Database auditing and monitoring helps assure core data by addressing four critically important data issues:
1. Data discovery – Where is sensitive data stored in the data center?
2. Data activity monitoring – How, where, what, when, and by whom is data being accessed?
3. Data risk assessment – If data risks are detected, can they be managed in an automated way to assure compliance and governance?
4. Data risk management -- Can stored data be protected from data theft, including data theft by authorized users?
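To make the second and third questions above concrete, the sketch below summarizes audit-log records into a view of which users touch sensitive tables, and from which applications. The record fields (`user`, `app`, `table`) and the sensitive-table list are assumptions for the example, not a real product's schema.

```python
# Sketch of data activity risk assessment over audit-log records.
# The sensitive-table list would come from the discovery step.
SENSITIVE_TABLES = {"customers", "payroll"}

def assess(audit_log):
    """Summarize which users access sensitive tables and via which apps."""
    risk = {}
    for rec in audit_log:
        if rec["table"] in SENSITIVE_TABLES:
            entry = risk.setdefault(rec["user"], {"tables": set(), "apps": set()})
            entry["tables"].add(rec["table"])
            entry["apps"].add(rec["app"])
    return risk
```

A summary like this makes anomalies visible, for example a user reaching payroll data through an ad-hoc query tool rather than the approved application.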
How data activity monitoring works
An ideal data auditing and monitoring solution cost-effectively answers key questions about stored data: Where is the data? Who is accessing it? Can data governance and compliance be achieved? Can data breaches be prevented? To answer these questions, a DAM solution must possess four key capabilities:
1. Discover -- Customer information, employee information, and operational information must be located in order to protect it. A data auditing and monitoring solution must proactively find sensitive data at rest and data in motion, then classify it. This capability is critical for mitigating risk and identifying gaps in compliance initiatives. Discovery and classification also provide insight into which policies must be implemented for a data activity monitoring project to succeed.
2. Automate -- Automating the data activity monitoring system significantly reduces the cost of compliance and governance compared with a manual process. A sound framework of auditing policies makes it possible to audit only what is required, in an automated way, across data stores. Implementing an automated workflow to receive, review, and approve automatically generated reports ensures compliance and minimizes manual effort. Lastly, real-time alerting on critical events and automated forensic analysis of those events can minimize risk in real time.
3. Monitor -- Effective monitoring across many data types, users, and applications is required for a data activity monitoring project to succeed. All types of users must be monitored to minimize risk and maintain compliance. To monitor users and applications effectively, a solution must be able to identify suspicious or anomalous user activity or application behavior in real time.
4. Protect -- The ability to take action based on data activity provides the remediation needed to minimize the risk of non-compliance, misuse, or theft of data. Key protection capabilities include alerting in real time on policy violations or suspicious activity, terminating a user or application session to the data store when a violation occurs, and notifying an enterprise security information management (SIM) system for broad security event correlation.
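The protect capability above can be sketched as a small policy evaluator that maps one data-access event to an action. The table names, the approved-application list, and the row threshold below are illustrative assumptions; a real DAM product evaluates far richer policies inline with database traffic.

```python
# Hypothetical protection-policy sketch: decide the action for a
# single data-access event. All names and thresholds are assumptions
# for illustration, not a real product's policy language.
SENSITIVE_TABLES = {"customers", "payroll"}

def evaluate(event, approved_apps):
    """Return the protective action for one data-access event."""
    if event["table"] in SENSITIVE_TABLES:
        if event["app"] not in approved_apps:
            return "terminate_session"  # unapproved app touching sensitive data
        if event["rows_returned"] > 10000:
            return "alert_sim"          # bulk read by an approved app: notify SIM
    return "allow"
```

This mirrors the three actions named above: real-time alerting, session termination, and notification of a security event management system.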
Conclusion
Whether it takes the form of intellectual property, healthcare, financial, credit card or customer information, data is the lifeblood of the enterprise. Data auditing and monitoring is a critical technology for data protection, security and compliance.
Using database security technology, organizations can gain unprecedented control over the sensitive data in their care. From locating sensitive data and identifying data risk to obtaining ongoing intelligence about how data is being used, data auditing and monitoring is a powerful and highly cost-effective tool for mitigating the most costly and damaging forms of data risk.