Published: Oct 20, 2020 by Matt Wood
Whether you’ve built or bought your data warehouse, and no matter how robust its feature set is, the fact remains that it’s just one of the many pieces that make up your organization’s Enterprise Data Management System (EDMS). So if you’ve got the latest and greatest data security features with your data warehouse, great – but let’s talk about building security into the rest of the links in the chain that protects your firm’s data and its reputation. This article will discuss some EDMS security controls that should be considered for implementation as part of your organization’s larger Information Security Program.
Data Inventory and Classification
Building an inventory of data assets is one of the best first steps to take in building your security program. Once you have it, you can begin to think about protecting the assets it has catalogued. Often this involves thinking about your org’s data from both the top-down perspective (“what type of data do we store?” - payment card data, HR data, etc.), as well as the bottom-up perspective (“what systems/system locations hold our data?” - shared folders, SharePoint, email, databases, etc.). To do this, build a list containing the data location, data description, business unit, and data owner (a specific person who can make decisions about the data). I prefer to keep this list in a system like a wiki, where it’s easy to find and several contributors can update it, so it doesn’t become instantly outdated.
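As a rough illustration of the inventory list described above, here is a minimal Python sketch. The `DataAsset` class, the `assets_by_owner` helper, and every sample entry (paths, names, business units) are hypothetical placeholders, not a prescribed schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataAsset:
    """One row of the data inventory (field names follow the list above)."""
    location: str        # bottom-up view: where the data lives
    description: str     # top-down view: what kind of data it is
    business_unit: str
    owner: str           # a specific person who can make decisions about it


# Hypothetical sample entries, for illustration only.
inventory = [
    DataAsset("shared-folder:/hr/payroll", "Payroll exports", "Human Resources", "Jane Doe"),
    DataAsset("database:payments", "Payment card transactions", "Finance", "John Smith"),
    DataAsset("wiki:/hr/policies", "HR policy documents", "Human Resources", "Jane Doe"),
]


def assets_by_owner(assets):
    """Group inventory rows by data owner so each owner can review their assets."""
    grouped = {}
    for asset in assets:
        grouped.setdefault(asset.owner, []).append(asset)
    return grouped


for owner, owned in assets_by_owner(inventory).items():
    print(owner, [asset.location for asset in owned])
```

Keeping the owner as a required field makes it harder for an entry to be added without someone accountable for it.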
Once your data is inventoried, it can be classified, which is a deceptively difficult task. The benefit of completing this task is that it should make the remaining decisions about controls and policies much more straightforward. Before naming the classifications, begin by thinking about how many different buckets your data fits into from a security perspective. These buckets will reflect how the data is handled and the consequences of a leak – is it public data and freely distributable? Is it data only for HR? What kind of harm will result from a data leak of a given class of data? When defining the name and number of levels, I usually like to start with the four often prescribed by the cybersecurity community:
With your data inventory and classifications in hand, you can now decide how you’re going to go about protecting the data in each of the categories. As an example, certain levels may require encryption or multifactor-authentication while others can be publicly available. I’ll go into some of the recommended controls to apply to some or all of your data classifications.
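One way to make that mapping concrete is a small lookup from classification level to required controls. The sketch below is only an illustration: the four level names and the control names are hypothetical placeholders and may differ from the levels your organization settles on.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Hypothetical level names; substitute your own classifications."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Assumed example policy: which controls each level requires.
REQUIRED_CONTROLS = {
    Classification.PUBLIC: set(),
    Classification.INTERNAL: {"sso"},
    Classification.CONFIDENTIAL: {"sso", "mfa"},
    Classification.RESTRICTED: {"sso", "mfa", "encryption_at_rest"},
}


def missing_controls(level, implemented):
    """Return the controls a system still needs for data at the given level."""
    return REQUIRED_CONTROLS[level] - set(implemented)


# A system that only has SSO, but holds restricted data:
print(sorted(missing_controls(Classification.RESTRICTED, {"sso"})))
```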
Least-Access and Role-Based Access Control
There shouldn’t be anything new here, but for those needing review: For any of the systems and controls we get into below, you’ll want the ability to define access by job role rather than by individual user. So rather than allowing “Frank” access to your HR data, you’d instead allow access to “HR members”. This way, when a person leaves the organization or changes roles, clean-up is much easier – no need to update every system the individual had access to. Ensure that components in your EDMS support this functionality – anything that doesn’t is operating with a seriously outdated feature set.
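A minimal sketch of the idea, assuming hypothetical role and permission names: permissions attach to roles, users attach to roles, and a role change is a single update.

```python
# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS = {
    "hr_member": {"hr_data:read", "hr_data:write"},
    "analyst": {"warehouse:read"},
}

# Users are granted roles, never individual permissions.
USER_ROLES = {
    "frank": {"hr_member"},
}


def is_allowed(user, permission):
    """A user is allowed only if one of their roles carries the permission."""
    roles = USER_ROLES.get(user, set())
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(is_allowed("frank", "hr_data:read"))   # Frank is an HR member

# Frank changes jobs: one assignment update, no per-system clean-up.
USER_ROLES["frank"] = {"analyst"}
print(is_allowed("frank", "hr_data:read"))
```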
Authentication - SSO and MFA
Single Sign-on (SSO) - This is an important technology worth calling out. Striving to align all of your SaaS providers, internal applications, etc. behind single sign-on is important for ensuring adherence to password policies, account expiration policies, etc. Among other things, it helps mitigate the risk of abandoned accounts left behind when a user leaves, a common attack vector. Additionally, it provides a centralized audit trail for login activity, a common regulatory requirement in certain sectors.
Multi-Factor Authentication (MFA) - Enabling SSO will also allow easier integration with multi-factor authentication, an industry best practice. Whether attackers are using social engineering or dictionary attacks, MFA is one of the most effective controls for stopping them at the front door to your systems.
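To give a feel for what a second factor can look like, here is a sketch of time-based one-time passwords (TOTP, as described in RFC 6238), one common MFA mechanism among several. This is an assumption for illustration; the article does not prescribe a particular MFA method, and in practice you would rely on your SSO provider's MFA rather than implement it yourself. The function names and the sample secret are hypothetical.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute a TOTP code (HMAC-SHA1 variant) for the given time."""
    counter = struct.pack(">Q", unix_time // step)        # 8-byte big-endian time counter
    digest = hmac.new(secret, counter, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                            # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, unix_time: int,
                step: int = 30, window: int = 1) -> bool:
    """Accept codes from the current time step and `window` steps either side."""
    return any(
        hmac.compare_digest(totp(secret, unix_time + drift * step, step), submitted)
        for drift in range(-window, window + 1)
    )


secret = b"hypothetical-shared-secret"
code = totp(secret, 1_600_000_000)
print(verify_totp(secret, code, 1_600_000_000))
```

The small acceptance window tolerates clock drift between the server and the user's device, at the cost of a slightly longer validity period for each code.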
Network Access Control Lists
Whether your data is spread across systems contained on-premises or in the cloud, you should be able to define network access control lists to limit the exposure of your data at the network level. If you’re managing (cloud) infrastructure, more advanced firewalls can permit network access based on user roles, but even without such features, broad-brush measures like preventing VPN users from connecting directly to your database are worthwhile. The more you segment the network, the finer-grained control you’ll have. Consider the interaction needed among these common groups of systems:
- workstation groups (intra-workstation communication)
- VPN users
- application servers
- logging systems
- file servers
- Internet users from a particular country (can you block access to anyone not in your country?)
Of course, if your data is stored with a SaaS provider (box.net, OneDrive, etc.), your options are more limited here, but they are still worth considering.
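The allow/deny logic behind a network ACL can be sketched with Python's standard `ipaddress` module. Real enforcement belongs in your firewall or cloud security groups; the network ranges and group names below are hypothetical.

```python
import ipaddress

# Hypothetical segments, for illustration only.
APPLICATION_SERVERS = ipaddress.ip_network("10.20.0.0/16")
VPN_USERS = ipaddress.ip_network("10.99.0.0/16")

# ACL for the database segment: explicit deny first, then allow, else deny.
DATABASE_DENY = [VPN_USERS]
DATABASE_ALLOW = [APPLICATION_SERVERS]


def may_reach_database(source_ip: str) -> bool:
    """Return True only if the source address is explicitly allowed."""
    address = ipaddress.ip_address(source_ip)
    if any(address in network for network in DATABASE_DENY):
        return False
    return any(address in network for network in DATABASE_ALLOW)


print(may_reach_database("10.20.4.7"))    # application server
print(may_reach_database("10.99.1.15"))   # VPN user, blocked from direct access
```

Defaulting to deny means a newly added network segment gets no database access until someone decides it should.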
Service and API Security
Most EDMSs are made up of a web of many different interacting software systems. These machine-to-machine interfaces are often configured and then forgotten about until a problem arises. As you would imagine, there’s often plenty of opportunity to exploit security weaknesses in these interfaces, so let’s walk through a few items to check on.
Audit Logs - These systems should log connections and login meta-data for manual review and anomaly detection.
Certificate Pinning - Many of these systems can use key-based authentication, a strong authentication method. Consider requiring a particular counter-party certificate (or checking for known certificate fingerprints), which can offer some protection against certain man-in-the-middle attacks, but at the expense of maintenance overhead.
IP ACLs - As above, locking your API network connections down to a particular IP access control list (ACL) can offer an additional layer of protection.
Key-Based Authentication - Using key-based authentication instead of password-based authentication is recommended.
Certificate Monitoring - Have in place a solution for managing certificate renewals and/or expiration warnings. This will prevent system downtime. There are third-party solutions out there that can completely automate the renewal/reinstallation of certificates for you.
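Two of the checks above, certificate pinning and expiration warnings, can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: the function names and the 30-day warning threshold are hypothetical, and the certificate bytes would come from your TLS layer (for example, `ssl.SSLSocket.getpeercert(binary_form=True)` returns the peer certificate in DER form).

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone


def fingerprint(der_certificate: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_certificate).hexdigest()


def is_pinned(der_certificate: bytes, pinned_fingerprints) -> bool:
    """Check the presented certificate against a set of known fingerprints."""
    actual = fingerprint(der_certificate)
    return any(hmac.compare_digest(actual, pin.lower()) for pin in pinned_fingerprints)


def needs_renewal(not_after: datetime, now: datetime, warn_days: int = 30) -> bool:
    """True once the certificate is within `warn_days` of expiring (or expired)."""
    return not_after - now <= timedelta(days=warn_days)


# Placeholder bytes standing in for a real DER certificate.
counterparty_cert = b"placeholder-der-bytes"
pins = {fingerprint(counterparty_cert)}
print(is_pinned(counterparty_cert, pins))

now = datetime(2020, 10, 20, tzinfo=timezone.utc)
print(needs_renewal(datetime(2020, 11, 1, tzinfo=timezone.utc), now))
```

Pinning more than one fingerprint (current and next certificate) is one way to reduce the maintenance overhead mentioned above, since a rotation no longer requires a simultaneous update on both sides.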
Continuous Log Scanning
Even legacy components of your EDMS probably have some facility for logging events. Ideally, you’d aggregate all log events to a common logging system to speed review and correlation. On top of a centralized location, you can add a security information and event management (SIEM) solution to flag any anomalies or concerns it finds in the system. It’s common for particular sectors to require regular monitoring of event logs. Additionally, many log systems can alert on and/or scrub sensitive information (credit card numbers, etc.) from logs in case you’ve accidentally sent some sensitive data into a log. In fact, this sort of feature is often part of the data loss prevention (DLP) systems that the larger organization may consider in its security program. Your policies may require this degree of data protection, especially if your security logs are monitored by a third party.
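As a simplified illustration of log scrubbing, the sketch below masks anything that looks like a payment card number before a line is stored. The pattern, the Luhn checksum filter, and the `[REDACTED-PAN]` marker are assumptions chosen for the example; commercial DLP tools use far more thorough detection.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum, used to cut down false positives such as order numbers."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:          # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def scrub(line: str) -> str:
    """Replace card-like numbers that pass the Luhn check with a marker."""
    def mask(match):
        digits = re.sub(r"\D", "", match.group())
        return "[REDACTED-PAN]" if luhn_valid(digits) else match.group()
    return CARD_CANDIDATE.sub(mask, line)


# 4111 1111 1111 1111 is a widely used test card number.
print(scrub("payment failed card=4111 1111 1111 1111 code=51"))
```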
Finally, your security posture cannot improve without regular testing. Penetration testing every six months or annually helps keep you on the path toward continuous improvement. Additionally, the results of these tests are quite often requested when integrating with new customers or partners.
Data Retention Policies
Data Retention policies can limit the magnitude of risk posed by cybersecurity threats – the less data you retain (due to secure removal) the lower the amount of data that can be compromised. Additionally, the sector in which you function may dictate a minimum (and/or maximum) amount of time to retain data. You may want to consider EDMS components that can manage data retention automatically based on configuration that you set.
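A configuration-driven retention check might look like the following sketch. The categories and retention periods are hypothetical examples, not regulatory guidance; your sector's rules determine the real values.

```python
from datetime import date, timedelta

# Hypothetical retention configuration, in days per data category.
RETENTION_DAYS = {
    "application_logs": 90,
    "hr_records": 7 * 365,
}


def is_expired(category: str, created: date, today: date) -> bool:
    """True when a record has outlived its configured retention period."""
    return today - created > timedelta(days=RETENTION_DAYS[category])


def records_to_purge(records, today: date):
    """Select (category, created) records that are due for secure removal."""
    return [record for record in records if is_expired(record[0], record[1], today)]


records = [
    ("application_logs", date(2020, 1, 15)),
    ("application_logs", date(2020, 10, 1)),
    ("hr_records", date(2018, 3, 1)),
]
print(records_to_purge(records, today=date(2020, 10, 20)))
```

Raising a `KeyError` for an unknown category is deliberate here: data with no configured retention period should be surfaced rather than silently kept or deleted.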
As more of your EDMS components are cloud-based, you may be passing more of your cybersecurity risk onto your vendors, hoping they’re taking the appropriate measures to secure your data. Do right by your organization (and meet what regulators may require) – ask vendors for information on their security programs. Whether you build a cybersecurity due-diligence questionnaire (there are good templates available) or opt for an industry-standard audit report like a SOC 2, you need to understand what policies, procedures and controls are in place with your vendors to protect your data. Once you have this, you can decide how to mitigate any organizational risks that may be revealed.
Backup and Recovery
Delivering your firm’s data in a reliable manner requires preparing for disaster and data loss. Ensure you have a backup and recovery plan in place for your data. Newer, cloud-based EDMS are likely to take advantage of built-in backup/recovery solutions offered by cloud providers transparently. As an example, a document warehouse may persist its data to AWS S3 across multiple regions for durability and fault-tolerance. On-prem solutions will naturally require more planning for backup and recovery efforts. Ensure your disaster recovery solution involves geographic separation, whether it’s spreading your systems across multiple AWS regions or replicating data between your geographically-dispersed data centers.