The elasticity of the cloud is ideal for Big Data analytics – the practice of rapidly crunching large volumes of unstructured data to identify patterns and improve business strategies. It's now possible to parcel data out to hundreds or even thousands of real and virtual servers, process it in parallel, and return analyzed results in a matter of hours, as opposed to the days or weeks it used to take with conventional computing methods.
While agility, availability and lower TCO will drive business to the cloud, organizations must address open questions around the security and compliance of protected health information (PHI), financial records and personally identifiable information (PII), and who has access to it.
It's standard practice for enterprises to implement extensive security controls behind their own firewalls. However, once an organization moves sensitive data or applications to the cloud, controls can be spotty or even non-existent. Often an enterprise that uses public cloud computing has little choice but to trust that its data is in good hands with its service provider.
Even if that provider's controls have been audited under SSAE 16 (previously SAS 70), the fact that a third party has unrestricted access to an organization's data is problematic. Should a subscriber's data be compromised in any way, it's the data owner that's at risk of steep fines, potential litigation and brand damage resulting from the breach.
Security techniques such as event management, access controls and firewalls should be commonplace. But a fundamental best practice for protecting data in the cloud is encryption. This type of security poses a different challenge, however. How do you run applications on encrypted data?
Data needs to be in plain text or decrypted for processing, and that requires access to the encryption key. In cloud environments, the data owner will often store the key in the cloud alongside the encrypted data for easy access. This practice creates a sizable security risk, since anybody with access to that particular cloud server can also access the encryption keys. It also falls short of the security requirements needed to comply with HIPAA, PCI and most other data security regulations.
Tools that allow you to manage encryption keys separately from the encrypted data reduce these risks and provide greater flexibility in terms of where keys can be stored. In cases where your cloud or SaaS provider is offering encryption services, make sure you control the keys. This adds an extra layer of protection in case your provider is breached or subpoenaed. Revoke the key, and your data remains encrypted and indecipherable to anyone who no longer holds it.
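As a rough illustration of keeping keys apart from the data they protect, the sketch below models a key manager that lives outside the cloud data store. The names `KeyManager`, `create_key`, `get_key` and `revoke` are hypothetical and not taken from any particular product, and the ciphertext is a placeholder; a real deployment would rely on a dedicated key management service or hardware security module and an actual cipher.

```python
import secrets


class KeyRevokedError(Exception):
    """Raised when a caller asks for a key that has been revoked."""


class KeyManager:
    """Hypothetical key store kept outside the cloud that holds the data."""

    def __init__(self):
        self._keys = {}        # key_id -> key bytes
        self._revoked = set()  # key_ids that must never be released again

    def create_key(self):
        key_id = secrets.token_hex(8)
        self._keys[key_id] = secrets.token_bytes(32)  # 256-bit key material
        return key_id

    def get_key(self, key_id):
        if key_id in self._revoked:
            raise KeyRevokedError(key_id)
        return self._keys[key_id]

    def revoke(self, key_id):
        # Destroy the key material and remember the id as revoked.
        self._keys.pop(key_id, None)
        self._revoked.add(key_id)


# The cloud-side record stores a key *identifier*, never the key itself.
manager = KeyManager()
key_id = manager.create_key()
cloud_record = {"key_id": key_id, "ciphertext": b"<encrypted bytes>"}

key = manager.get_key(cloud_record["key_id"])  # allowed before revocation
manager.revoke(key_id)                         # e.g. after a provider breach
```

Because the record in the cloud carries only `key_id`, anyone who copies that record after the key is revoked has ciphertext and nothing to decrypt it with.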
Another consideration is process-based access controls. An organization likely already has user- or role-based access controls associated with its data. Adding process-based access controls ensures that even if a malicious outsider or employee gains the credentials of an authorized user, that person would also have to be running an authorized process to access the data.
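One way to picture the combined check is a policy that approves a request only when both the role and the calling process are on an allow list. The sketch below is a simplified model under that assumption; `AccessPolicy`, its fields and the example paths are illustrative, and a real product would enforce this at the operating system level rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical policy: a user role AND a process must both be approved."""
    allowed_roles: frozenset
    allowed_processes: frozenset

    def permits(self, role, process_path):
        return role in self.allowed_roles and process_path in self.allowed_processes


policy = AccessPolicy(
    allowed_roles=frozenset({"dba"}),
    allowed_processes=frozenset({"/usr/sbin/mysqld"}),
)

# An authorized user running the database daemon is let through.
ok = policy.permits("dba", "/usr/sbin/mysqld")
# The same user copying files with an unapproved tool is refused.
blocked = policy.permits("dba", "/bin/cp")
```

Here `ok` is `True` and `blocked` is `False`: valid credentials alone are not enough to read the protected files.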
A number of factors affect database and application performance, and encryption is just one. Application-level encryption tends to carry the greatest performance hit, while the file-level encryption penalty is much lower. For maximum application performance on Hadoop, run block-level encryption on systems whose processors support the Intel AES-NI instruction set, which accelerates AES operations in hardware.
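Before counting on hardware acceleration, it is worth confirming that the hosts actually expose AES-NI. The sketch below assumes an x86 Linux host, where the kernel lists the `aes` flag in `/proc/cpuinfo` when the instructions are available; the function names are illustrative, and other platforms would need a different check.

```python
def has_aes_flag(cpuinfo_text):
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags" and "aes" in value.split():
            return True
    return False


def host_supports_aes_ni(path="/proc/cpuinfo"):
    """Best-effort check on Linux; returns False if the file is unavailable."""
    try:
        with open(path) as f:
            return has_aes_flag(f.read())
    except OSError:
        return False
```

Calling `host_supports_aes_ni()` on each node gives a quick, best-effort answer; virtualized hosts may hide the flag even when the underlying hardware supports it.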
Most large organizations utilize a variety of database applications from MySQL and PostgreSQL to Cassandra, MongoDB and HBase. To ensure your encryption utility functions cross-platform and meets your performance standards, verify that it's been tested against the databases that are most important to you.
Finally, look for an OS-level solution that's transparent to your application. This means that the application (Hadoop, Cassandra, MySQL, MongoDB, Tomcat, etc.) doesn't even recognize there's an encryption utility sitting beneath it. Because the application needs no code changes and does no cryptographic work itself, it avoids the heavier performance penalty of application-level encryption.
The National Institute of Standards and Technology Computer Security Division publishes security requirements, FIPS 140-2, for cryptographic modules. If your vendor solution uses FIPS-validated crypto modules, you can feel confident in the strength of the encryption.
Much of the fear, uncertainty and doubt about data security has been unfairly cast upon the cloud providers. In fact, recent studies suggest that the cloud can be as secure as, if not more secure than, storing data on premises. If you've got sensitive data about your products, customers, employees or market, then you should always encrypt it, whether or not there's a legal obligation.