Data is the lifeblood of the modern enterprise. It empowers businesses to drive strategy and insights when correctly managed and used. However, growing data streams and today’s data management complexity mean that many organizations still struggle with thorough and effective data governance. Consequently, it’s much harder now for organizations to understand how their data flows throughout the business, which can make it challenging to answer essential data compliance questions such as:
- Who owns this data?
- Where does this data exist physically?
- How is this data being used within the business?
With Data Lineage, organizations can confidently answer these important questions and maintain a detailed understanding of how critical data is used in the business.
What Is Data Lineage?
Data Lineage is a foundational practice for data governance. It helps businesses identify where data originates, where it ends up, and who its end users are. There are two types of data lineage:
- Solution or data warehouse data lineage: Focuses on detailed transactions or processes, where data originates from, the processing steps it undergoes, and the logic that decides these transformation steps.
- Enterprise data lineage: Focuses on where the data resides physically in the world, which applications can access or modify this data, and how the data is used within the business.
Enterprise data lineage matters most to organizations at the level of strategic resource management and risk reduction. It helps them understand how data works as part of the organization’s technical and business landscape. Knowing the provenance of data and where it is used throughout the organization delivers more effective cybersecurity risk management and swifter compliance auditing.
For an organization’s CIO (Chief Information Officer) or CTO (Chief Technology Officer), Data Lineage provides up-to-date knowledge about the business’s data. This is important for several disciplines, such as data management, information security, risk management, and digital transformation. If executed using a modern, data-driven Enterprise Architecture (EA) tool, Data Lineage keeps the data landscape current and connected to the rest of the business architecture, ensuring critical data is used and managed optimally. This detailed, precise understanding of the organization’s essential data and how it flows is vital for making effective decisions about how the business operates from a technological point of view.
For the CISO (Chief Information Security Officer) and security organization, Data Lineage eases the documentation processes in due diligence for data compliance. It helps identify which data is being processed by applications, who manages those applications, which business capabilities they are connected to, and who owns the data. With Data Lineage, businesses can more easily ensure compliance with existing and potential regulatory and legal directives.
In addition to the questions above, Data Lineage can address:
- Which applications write information to a given data entity, and which read and use the information?
- Where in the business is the data used?
- Who is responsible for a given data entity?
- What is the confidentiality of the organization’s data entities?
- Which infrastructure are these data entities hosted on?
- Where in the world are these data entities physically stored?
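The questions above map naturally onto a small data model. The following is a minimal sketch, assuming Python 3.9 or later; the class names, fields, and sample records (`DataEntity`, `Application`, `customer_record`, and so on) are hypothetical and invented purely for illustration, not taken from any particular EA tool.

```python
from dataclasses import dataclass, field


@dataclass
class DataEntity:
    """A critical data entity and the governance facts recorded about it."""
    name: str
    owner: str             # person or role responsible for the entity
    confidentiality: str   # e.g. "public", "internal", "restricted"
    infrastructure: str    # platform the entity is hosted on
    location: str          # where in the world it is physically stored
    used_by: list[str] = field(default_factory=list)  # business capabilities


@dataclass
class Application:
    """An application and the data entities it writes to or reads from."""
    name: str
    writes: set[str] = field(default_factory=set)
    reads: set[str] = field(default_factory=set)


def writers_of(entity_name: str, apps: list[Application]) -> list[str]:
    """Names of applications that write to the given entity, sorted."""
    return sorted(app.name for app in apps if entity_name in app.writes)


def readers_of(entity_name: str, apps: list[Application]) -> list[str]:
    """Names of applications that read the given entity, sorted."""
    return sorted(app.name for app in apps if entity_name in app.reads)


# Hypothetical sample data, invented for illustration only.
customer = DataEntity(
    name="customer_record",
    owner="Head of Sales",
    confidentiality="restricted",
    infrastructure="managed PostgreSQL cluster",
    location="Frankfurt, Germany",
    used_by=["Order Management", "Customer Support"],
)
apps = [
    Application("CRM", writes={"customer_record"}),
    Application("Billing", reads={"customer_record"}),
    Application("Analytics", reads={"customer_record"}),
]

print(writers_of(customer.name, apps))  # ['CRM']
print(readers_of(customer.name, apps))  # ['Analytics', 'Billing']
```

Each question in the list corresponds to one field or one lookup: ownership, confidentiality, hosting, and physical location are attributes of the entity, while the read and write relationships live on the applications.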
Manage Risk Effectively
Large companies have data elements in the thousands, spread across all their various systems and applications. The complexity increases when factoring in infrastructure and physical location. Yet, out of all these thousands of elements, only a small fraction are truly critical to the business. It would be unfeasible and ineffective for companies to pursue mapping every single data element.
Organizations should always aim to capture and maintain data in a way that provides useful outcomes. From this perspective, the wiser approach is to focus on the critical data that carries the highest risk or has the greatest potential impact on profit, and thereby minimize the organization’s potential exposure.
Organizations can then focus the efforts of IT and architecture teams on these vital data elements. Defining and implementing Data Lineage with a narrower scope will help the business gain visibility over which systems hold the most critical data and where it physically resides. This allows lean technology teams to focus their efforts where it counts most for the business as a whole instead of pursuing the monumental task of documenting every source of data.
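One way to narrow the scope is to score each data element and keep only those above a cut-off. The sketch below assumes a hypothetical 1-to-5 scale for both risk and profit impact, a hypothetical threshold of 4, and an invented inventory; a real organization would define its own criteria.

```python
from dataclasses import dataclass


@dataclass
class DataElement:
    name: str
    risk: int           # 1 (low) to 5 (high), hypothetical scale
    profit_impact: int  # 1 (low) to 5 (high), hypothetical scale


def critical_elements(elements, threshold=4):
    """Keep elements whose risk or profit impact reaches the threshold,
    ordered by combined score, highest first."""
    selected = [
        e for e in elements
        if max(e.risk, e.profit_impact) >= threshold
    ]
    return sorted(
        selected,
        key=lambda e: e.risk + e.profit_impact,
        reverse=True,
    )


# Hypothetical inventory, invented for illustration only.
inventory = [
    DataElement("customer_payment_details", risk=5, profit_impact=4),
    DataElement("employee_lunch_menu", risk=1, profit_impact=1),
    DataElement("product_pricing", risk=3, profit_impact=5),
    DataElement("office_floor_plan", risk=2, profit_impact=1),
    DataElement("marketing_newsletter_list", risk=3, profit_impact=2),
]

for element in critical_elements(inventory):
    print(element.name)
# customer_payment_details
# product_pricing
```

Only two of the five elements pass the threshold, which is the point: lineage mapping effort goes to the small fraction of data that matters most.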
Prevention is better than cure: waiting for a data breach to happen before examining how data is created and used could spell disaster for the business. IBM’s annual Cost of a Data Breach Report found that the average cost of a data breach in 2022 was $4.35 million, continuing a year-on-year rise. Data breaches are expensive, painful lessons in security that organizations should work to avoid if they want to maintain healthy business continuity.
Avoid Costly Penalties and Violations in Data Compliance Audits
Data Lineage also eases the effort of due diligence for data compliance. It facilitates the documentation process, delivering precise information about applications and data, reducing the time and resources needed to ensure compliance in the organization.
Clearing compliance audits is a fundamental part of doing business, regardless of how local or global a business is. Regulations in different areas of business and countries have varying requirements, adding to the complexity. It’s costly and time-consuming to carry out data-intensive reporting processes, and the penalties for non-compliance are hefty.
For example, the General Data Protection Regulation (GDPR) requires businesses to abide by strict rules for managing and using the Personally Identifiable Information (PII) of EU residents. This includes commonly understood data such as names, personal identification numbers, mailing addresses, and phone numbers. It also includes very sensitive information such as personal health data, sexual orientation, religious beliefs, and union affiliations. The penalty for failing to comply can be as much as €20 million or 4% of the organization’s global annual revenue, whichever is higher.
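To make the penalty ceiling concrete: under the GDPR’s upper fine tier, the maximum is the higher of the fixed amount and the revenue-based amount. The sketch below only computes that upper bound; the function name and the sample revenue figures are hypothetical, and actual fines are set case by case by supervisory authorities.

```python
GDPR_FIXED_CAP_EUR = 20_000_000  # fixed ceiling of the upper fine tier
GDPR_REVENUE_PERCENT = 4         # percentage of global annual revenue


def max_gdpr_fine_eur(global_annual_revenue_eur: int) -> int:
    """Upper bound of the fine: the higher of the fixed cap
    and 4% of global annual revenue (whole euros)."""
    revenue_based_cap = global_annual_revenue_eur * GDPR_REVENUE_PERCENT // 100
    return max(GDPR_FIXED_CAP_EUR, revenue_based_cap)


# Hypothetical revenue figures, for illustration only.
print(max_gdpr_fine_eur(100_000_000))    # 20000000 (fixed cap applies)
print(max_gdpr_fine_eur(1_000_000_000))  # 40000000 (4% of revenue applies)
```

The two ceilings meet at €500 million in annual revenue; above that, the percentage-based figure is the larger one.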
Key Takeaways for Improving Data Compliance:
- Data Lineage is an important part of risk management and information security.
- Be proactive and leverage Data Lineage to understand where the data is created, used, and stored.
- Understanding where critical data resides lets teams focus their efforts where the risk is greatest.
Deborah wields words in the hope of demystifying the complex and ever-evolving world of Enterprise Architecture. She is excited about helping the curious understand the immense potential it has for driving effective change.