As companies continue to redefine IT processes to cope with the semi-structured and unstructured data that characterize big data, they are also recognizing that standard data security practices that grew up with fixed-record, transactional data no longer address every big data security concern.
For starters, there are few controls on the mountains of big data that flow into companies on a daily basis. Big data can come from anywhere and in every form.
While companies can put controls in place to regulate the in-flow of this seemingly limitless data pipeline, there are very acute security concerns that emerge once the data is in enterprise data repositories where it can be accessed or shared. Who should be authorized to see the data in its entirety, and who within the organization needs to know some of the data, but not all of it?
“We are seeing major transitions in the big data market now,” said Venkat Subramanian, CTO at Dataguise, a data protection and compliance vendor. “Companies are moving from traditional data services to the big data market, and they are beginning to move more of their standard and big data applications from on-premises data centers to the cloud. Whether big data is stored on premises or in a cloud environment, appropriate governance measures for this data are needed.”
As part of big data governance, there are several security measures that companies can take.
1: Conduct regular reviews of user access to data
On a semi-annual or annual basis, IT should sit down with the corporate stakeholders who access data from data lakes and repositories and review data access permissions for all authorized personnel. Access permissions can be adjusted upward or downward based on each employee’s or contractor’s work responsibilities. When employees or contractors leave the company, their access should be revoked immediately.
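As a rough illustration of what such a review could automate, the Python sketch below splits access grants into those to revoke (the user has left) and those overdue for review. The `AccessGrant` record, the `audit_grants` function, and the 182-day interval are hypothetical names and values chosen for this example, not part of any particular product.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "semi-annual" is approximated as 182 days.
REVIEW_INTERVAL = timedelta(days=182)


@dataclass
class AccessGrant:
    """One user's permission on one dataset (hypothetical record layout)."""
    user: str
    dataset: str
    last_reviewed: date


def audit_grants(grants, active_users, today):
    """Return (revoke, overdue) lists of grants.

    revoke:  grants held by users who are no longer active.
    overdue: grants of active users not reviewed within REVIEW_INTERVAL.
    """
    revoke, overdue = [], []
    for grant in grants:
        if grant.user not in active_users:
            revoke.append(grant)
        elif today - grant.last_reviewed > REVIEW_INTERVAL:
            overdue.append(grant)
    return revoke, overdue
```

In practice the list of active users would come from an HR or identity system, and the revoke list would feed whatever process removes the permissions.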
2: Mask sensitive data
In some cases, masking can be used to redact sensitive data elements (e.g., social security numbers, names) so this data isn’t shared with others outside of the company. Masking should especially be considered if the company wants to sell big data to third parties.
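A minimal Python sketch of this kind of masking follows. It assumes social security numbers appear in the common `NNN-NN-NNNN` text form and that names live in a known field. The function names `mask_ssn` and `mask_record` and the field name `name` are illustrative assumptions only.

```python
import re

# Assumption: SSNs are written as three digits, two digits, four digits.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_ssn(text):
    """Replace every SSN-shaped substring with a fixed mask."""
    return SSN_PATTERN.sub("***-**-****", text)


def mask_record(record, sensitive_fields=("name",)):
    """Return a copy of record with sensitive fields redacted.

    Fields listed in sensitive_fields are replaced outright; other
    string fields are scanned for SSN-shaped values.
    """
    masked = {}
    for key, value in record.items():
        if key in sensitive_fields:
            masked[key] = "[REDACTED]"
        elif isinstance(value, str):
            masked[key] = mask_ssn(value)
        else:
            masked[key] = value
    return masked
```

Pattern-based masking like this only catches values in the expected format, so real deployments typically combine it with field-level rules for each data source.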
3: Encrypt data
If big data is stored in a single data repository that all employees with appropriate clearances are able to access, encryption can be used on the data. “The idea behind data encryption is that you give everyone maximum flexibility to get at the data that they need, and they can do so safely,” said Subramanian. “The encryption is a secure ‘wrap’ around the data.”
4: Monitor user behavior
Security software adds another dimension: it continuously monitors the access habits of each user, develops behavioral models of how that user accesses data, and issues an alert whenever an access anomaly or usage pattern surfaces that does not match how the user normally uses the data.
“What we are concerned with here is that there could be a security breach in the making,” said Subramanian. “This is very important, because the data we have tells us that it can take the average company as much as 250 days to figure out that there is a security breach. The behavioral monitoring that the security software uses is based on access rulesets that the business defines. It looks at how each user normally uses data. When a usage anomaly for a user occurs, the user’s access is immediately suspended and an investigation takes place. In some cases, an early stage security breach might be detected. In other cases, it is simply a situation where an employee has been given a new job role or responsibility that requires different ways of accessing and sharing data.”
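One simple way to picture this kind of behavioral check is a statistical baseline per user. The Python sketch below is an assumption-laden illustration, not how any specific security product works: it treats a user’s past daily access counts as the behavioral model and flags a day that deviates from the mean by more than a chosen number of standard deviations. The function names and the threshold of 3.0 are hypothetical.

```python
from statistics import mean, pstdev


def is_anomalous(history, todays_count, threshold=3.0):
    """Return True if todays_count is far from the user's baseline.

    history: past daily access counts for one user.
    threshold: allowed distance from the mean, in standard deviations
               (an assumed, business-defined value).
    """
    if len(history) < 2:
        return False  # not enough data to form a baseline
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # Perfectly constant history: any change counts as an anomaly.
        return todays_count != mu
    return abs(todays_count - mu) / sigma > threshold


def access_decision(history, todays_count):
    """Map the anomaly check to an action: suspend pending investigation, or allow."""
    return "suspend" if is_anomalous(history, todays_count) else "allow"
```

As the quote notes, a flagged anomaly is a prompt for investigation rather than proof of a breach, since a change in job role can produce the same signal.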
Make data security a high priority
Collectively, these approaches help companies with their big data governance, but it is still up to CIOs to work with end users in an area where cooperation has historically been hard to get: persuading users’ managers to regularly review the access privileges of their employees, and to cooperate when usage abnormalities are detected.
“No one wants their feet to be held to the fire for a security breach,” confided one CIO at a marketing firm. “But when it comes down to scheduling a meeting to review data access policies and who should have what, it is always treated as a low priority meeting that can only be scheduled after campaign launches and project priorities are met.”