Shrink Your Sensitive Data Footprint

Protecting data is hard. Knowing for sure that you have identified and mitigated every vulnerability takes a lot of work and constant vigilance. The more servers you have to harden and the more databases you have to protect, the more work it is, and the more likely you are to leave a hole somewhere. So it makes sense to reduce your sensitive data footprint as much as possible.

Data Masking

The concept of data masking has been around for a long time, but we have found that many companies don’t yet use it in their development environments. By keeping full production data sets in dev and QA environments, we needlessly expand our sensitive data footprint and add significant vulnerability.

Think about it: our development environments are the least protected of all of our systems, and they are the most exposed. Developers frequently have full access to all the data in the development environment and can even extract it, typically loading subsets into their own, even less protected, systems for local development and testing. Many companies now also host development environments in the cloud, outside their corporate firewalls, which adds even more risk.

This significant risk can be mitigated relatively easily by masking all data in development and QA environments.

How It Works

Typically, data masking vendors provide a tool for discovering sensitive data. Discovering this data is helpful even if you don’t plan to mask it. The tools ship with default search patterns for the kinds of sensitive data they expect to find, and you can add your own if you feel that your system contains sensitive data the defaults do not cover.
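To make the discovery step concrete, here is a minimal Python sketch of pattern-based scanning. It does not reflect how any particular vendor's tool works. The pattern names, the regular expressions, the `discover_sensitive_columns` helper, and the `ACCT-` account format are all hypothetical, chosen only to show default patterns being extended with a custom one.

```python
import re

# Hypothetical "default" patterns; real tools ship far more extensive sets.
DEFAULT_PATTERNS = {
    "us_phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def discover_sensitive_columns(rows, patterns):
    """Return {column name: set of pattern names} for patterns seen in the rows."""
    findings = {}
    for row in rows:
        for column, value in row.items():
            if not isinstance(value, str):
                continue
            for name, pattern in patterns.items():
                if pattern.search(value):
                    findings.setdefault(column, set()).add(name)
    return findings


# A small sample of rows, as a scanner might read them from a table.
sample_rows = [
    {"id": "1", "contact": "555-867-5309", "notes": "prefers email: pat@example.com"},
    {"id": "2", "contact": "555-123-4567", "notes": "account ACCT-00042"},
]

# Add a site-specific pattern that the defaults do not cover (made-up format).
custom_patterns = dict(DEFAULT_PATTERNS, account_id=re.compile(r"\bACCT-\d{5}\b"))

findings = discover_sensitive_columns(sample_rows, custom_patterns)
for column, names in sorted(findings.items()):
    print(column, sorted(names))
```

A scan like this only flags candidate columns. Someone who knows the data still has to confirm the findings and decide how each column should be masked.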

After discovering the data, you specify a pattern or function that will be used to alter or obfuscate the data as it is stored in the database (or, in the case of SQL Server, as it is extracted from the database). For example, you can choose to randomize the digits within a phone number field, changing the values but keeping the formatting so that development and testing can still be done using the data.
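The phone number example can be sketched in a few lines of Python. This is an illustration of a format-preserving masking function, not the function any specific product uses. The name `mask_digits` is hypothetical.

```python
import random


def mask_digits(value, rng=random):
    """Replace each ASCII digit with a random digit; keep every other character."""
    return "".join(
        str(rng.randrange(10)) if "0" <= ch <= "9" else ch
        for ch in value
    )


original = "(555) 867-5309"
# A seeded generator makes the result repeatable; omit it for fresh values.
masked = mask_digits(original, random.Random(42))
print(masked)
```

Punctuation and spacing pass through untouched, so code that validates or parses the phone format still works against the masked value. Note that purely random replacement does not keep the same input mapped to the same output across tables. If your tests rely on joining on a masked column, you would need a consistent mapping instead.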

Data Masking Offerings by Oracle and SQL Server

Oracle has a very robust data masking solution called Data Masking and Subsetting. It is not cheap, but the cost is small compared to the costs associated with a data breach. And once the data is thoroughly masked, you no longer have to spend as much time and effort hardening the target environments, which further offsets the cost. It works by identifying and then masking the data that will be placed into development environments, so those environments never hold the real data.

SQL Server’s data masking offering, Dynamic Data Masking (DDM), masks data as it is retrieved by SQL queries. This is less secure because the target databases still contain the real data. Only users with the UNMASK permission can view the real data, but managing that permission requires additional security administration on the development environments. So while this is an improvement over having no masking, it is not as secure as Oracle’s approach.
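The difference between the two approaches can be illustrated with a small Python sketch that uses SQLite as a stand-in. This is not Oracle or SQL Server syntax, and SQLite has no masking feature or UNMASK permission. Here a plain UPDATE stands in for masking the data before it reaches dev, and a view stands in for masking at query time. The table, the view, and the `masked_phone_sql` helper are hypothetical.

```python
import sqlite3


def masked_phone_sql(column):
    """SQL expression that keeps the last four characters and hides the rest."""
    return f"'XXX-XXX-' || substr({column}, -4)"


def new_dev_db():
    """Create an in-memory database holding one 'production' phone number."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, phone TEXT)")
    db.execute("INSERT INTO customers (phone) VALUES ('555-867-5309')")
    return db


# Masking the stored data: the real value no longer exists in the dev copy.
static_db = new_dev_db()
static_db.execute(f"UPDATE customers SET phone = {masked_phone_sql('phone')}")
static_stored = static_db.execute("SELECT phone FROM customers").fetchone()[0]

# Masking at query time: readers of the view see masked data,
# but the underlying table still holds the real value.
dynamic_db = new_dev_db()
dynamic_db.execute(
    "CREATE VIEW customers_masked AS "
    f"SELECT id, {masked_phone_sql('phone')} AS phone FROM customers"
)
dynamic_seen = dynamic_db.execute("SELECT phone FROM customers_masked").fetchone()[0]
dynamic_stored = dynamic_db.execute("SELECT phone FROM customers").fetchone()[0]

print("stored after masking in place:", static_stored)
print("seen through the masked view: ", dynamic_seen)
print("still stored behind the view: ", dynamic_stored)
```

In the second case, anyone who can reach the base table sees the original number, which is why masking at query time depends on getting access control right.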

Third-Party Data Masking Offerings

There are also third-party tools for data masking, such as DataVeil, which has a free version that includes some standard masking functions and a paid version that adds more advanced ones. DataVeil works on both Oracle and SQL Server, as well as other databases.

Just Mask It

So the bottom line is that there are good options available for masking data in your development environments, and doing so can significantly reduce your vulnerability to data leaks. With any of these tools, it will take a little work up front to discover the sensitive data and determine the proper masking functions for it. Once that is done, the masking can be re-applied easily each time you refresh your dev environments, and it is definitely worth the effort.

If you know of any other great data masking tools please leave a comment. If you would like to chat about data masking or have any questions you can find us at