Beware the Traps of Data Governance and Data Management Practice

Author: Guy Pearce
Date Published: 1 November 2022

Data-driven decision-making (DDDM) can never entirely be about “[m]aking decisions based on hard data as opposed to intuition, observation or guesswork”¹ because most data are never in a perfect state for decision-making and hard data can never present all the data relevant to all decisions. Experience, intuition, observation or guesswork—call it judgment or lore—is information that naturally supplements complex decision-making in humans.

Lore is a nuance of data. To illustrate the key role of lore in decision-making, consider a map. It provides crucial navigation data and information, not only to fix positions and find routes, but also to highlight dangers (e.g., geographic features that can give rise to thermals and other local weather systems that impact air travel; the existence of shoals, rocks and shallows that impact maritime travel). But the best sensors and maps cannot always provide all the data and information needed to navigate a course. In the context of unknowns, lore is king.

Without mechanisms to provide assurance that data—even hard data—are fit for purpose, how can data be trusted to do anything other than to incur costs while occupying storage devices? Enter data governance and data management. In a nutshell, data governance defines personal authorities (who) with respect to an organization’s data in support of its data strategy, while data management constitutes the disciplines (how) that enable business insight.² Efficiently enabling business insight is a key goal of a data strategy (what), which is what drives data governance and data management.

However, neither data management nor data governance should be pursued mechanically—at least if sustainable success is an objective. They are subject to myriad nuances that, if neglected, can create unanticipated outcomes.

Lore vs. Data

In humans, lore has led the way for as long as humans have made decisions. Lore is valuable in everyday decision-making in familiar environments when applying time and money to data analysis is unnecessary. Indeed, lore introduces more data into a situation than can ever be captured, analyzed and used.

Intuition and feelings are types of data that are hard to capture, but they provide inputs regular data could never give. So, it is shortsighted to think that hard data could ever replace the richness of lore. Instead, hard data serve to augment lore, or vice versa, especially when individuals make decisions in uncertain situations.

However, lore is influenced by emotions and other factors that can bias decision-making. Because experiences are filtered—for example, by a person’s interpretation of the business environment, the people around them and even by themselves³—decision-making based on lore is subjective. Data are not necessarily subjective. Furthermore, data represent a fraction of the details of a situation, making them a simplifying factor in decision-making in much the same way that a mathematical model simplifies reality.

However, DDDM can elicit blind trust in decision makers, and the data can be subject to misinterpretation.⁴ The result could easily be worse quality decision-making than could be made with lore.

The difference between the shortcomings of data and of lore is that some of the shortcomings of DDDM can be mitigated by data governance and data management after the fact (ex post facto), whereas subjectivity needs to be mitigated before the fact (ex ante facto) to be effective. Ex ante mitigation is often more difficult to perform than ex post mitigation.

Figure 1 illustrates the power of decision-making with both lore and data (turquoise line). If there is only lore (green line) or only data (blue line), decision-making confidence is lower than when both lore and data are combined.

Note: Lore is shown below data, but it could just as easily be reversed in a particular situation.

Human emotions impact decision-making—beneficially or as bias—by means of changes in the depth of thought, changes in the content of thought, and changes in the content of implicit goals compared to traditional rational choice theory.⁵ Although some may argue for rational choice, an example of the benefit of emotion in decision-making is how a person “anxious about the potential outcome of a risky choice may choose a safer option rather than a potentially more lucrative option.”⁶ Rationality and emotion hold varying importance in the decision-making process; without emotions, the probability of optimal decision-making is decreased.⁷ In addition, decisions still need to be made in the absence of lore and data. Figure 2 illustrates various circumstances under which decision-making is performed in the absence of information based on a sample of 712 respondents to a BI survey.⁸ The data management response column shows at least one data management domain that can help resolve the issue.

Source: Adapted from “Why Companies Make Decisions Without All the Relevant Information to Hand,” BI-Survey.com, https://bi-survey.com/decision-making-no-information

It has been suggested that people use less data than they think for decision-making, with their minds made up long before they review what evidence is available.⁹ Furthermore, social science has shown that humans are imperfect information processors and may ignore certain critical information, especially when life-or-death decisions need to be made under highly ambiguous circumstances.¹⁰ So, DDDM is not necessarily as significant as some make it out to be. Furthermore, System 1 and System 2 thinking refer to rapid, intuitive responses and slower reflective reasoning, respectively¹¹—or lore-based thinking and data-based thinking, respectively.

Reason-based thinking tends to be less empathetic, while intuitive thinking might be more at play when firefighters run into burning buildings rather than slowing down to apply reason and second-guess themselves.¹² In other words, it is not one or the other—it is both that matter. Reason allows a more accurate understanding of the world, but intuition is what makes people human.¹³ DDDM without intuition can be left to robots, while the ability to make decisions from both bases is a human strength.

Ultimately, well-governed data coupled with lore enables the highest-quality decisions because it creates an environment of optimal information. As Mr. Spock, the data-driven character from the US television series Star Trek, keenly observes, “Insufficient facts always invite danger.”¹⁴

Danger, or risk, is a state that commands serious attention in IT governance, but its counterpart in data governance has received scant consideration other than in terms of privacy or security. There is much more risk associated with DDDM and data in general—for example, the vulnerabilities extant in the use, robustness, reliability, availability, capacity, interoperability and performance capabilities of the data environment.¹⁵ This is an important observation given that a key goal of data governance is to minimize risk and ensure the sustainability of the organization by means of effective risk management.¹⁶

Sustainable Data Management Success Demands a Nuanced Approach

Neither data governance nor data management can be deployed successfully if they are approached mechanically. Developing policies, processes, procedures, standards and guidelines—and assuming an organization will adopt them—will fail to result in adoption without additional effort in change management, coaching and training. Furthermore, a mechanical approach overlooks many crucial nuances that are vital to sustain the success of an organization’s data disciplines.

Managing the Expectation of Increased Bureaucracy
It can be challenging to introduce data governance into an organization because it can evoke a negative response based on the expectation of increased bureaucracy. Good data governance does involve effort, though. It influences an organization’s data attitudes, norms and behaviors (its data culture), and its data management activities so that the goals of data governance and the goals of the organization can best be met. In data-mature organizations, data governance may be a formalization of activities and structures that are already in place.

An instrument that can temper expectations is the business case for data governance. A business case is also a vital construct of IT governance. Here, however, it means not only a typical business case for data governance at the organization level, but also a nuanced business case at the business unit or even individual level. Figure 3 shows the benefits for data governance roles such as data owners, data stewards, data analysts, business intelligence developers and data scientists. There are differences between an executive business case for data governance and a people-centered business case for data governance; however, the latter is derived from the former.

These distinctions are important because the most relatable business case answers the question, “What is in it for me?” at an individual level. The better the answer, the stronger the buy-in from individuals for the upcoming transition—and with it the acceptance of the additional effort that may be required. In other words, top-line business case benefits might sell data governance to executives, but they will be meaningless for staff who do not see the benefit of all the effort in their day-to-day lives. They will see only more bureaucracy.

At the individual level, the benefits of data management align with key data governance roles and address the transformation and change management needs of the organization. These are vital to a successful change program and to IT performance and oversight as an IT governance discipline.

Organizational Strategic Alignment
An organization’s strategy is instrumental in determining the expectations of its technology and data, and, therefore, of its IT governance and data governance, respectively. Strategic alignment is a pillar of good IT governance.

Achieving alignment between an organization’s strategy and its technology is challenging. When introduced, the Strategic Alignment Model (SAM) of 1990 was instrumental in exploring how the alignment between the business strategy, IT strategy, business operations and IT operations domains would work.¹⁷ The SAM maintains significant value as a strategic alignment tool,¹⁸ especially with the modern focus on data already existing as part of both the business operations (demand side) and IT operations (supply side) components of the SAM. What is blurred in the SAM is the relationship between IT strategy and data strategy. Enterprise architects may appreciate that a distinction between data requirements and the technology that fulfills those requirements is an important one, as figure 4 indicates.

The SAM requires that alignment across all four perspectives be accomplished to achieve overall strategic alignment. The implications are that, for alignment, neither an IT strategy nor a data strategy (as a nuance) can exist without a business strategy, and that a business strategy does not make sense without a corresponding IT strategy and corresponding data.

Defining Objectives
The goals of a data governance program should be clear and should include (but not be limited to) eliminating the issues of blind trust and misinterpretation. The program’s goals should be disaggregated into subgoals such as those shown in figure 5, with measures for their success defined and the time frame for success articulated.

Some subgoals that can help resolve blind trust issues include data quality management, master data management, data lineage management, access control management, compliance management and data life cycle management. Metadata, which help to resolve misinterpretation issues, consist of many dimensions, yet metadata are often neglected despite their vital role in successful and sustainable enterprise data management. Metadata management is such a key part of an organization’s data activities that an entire contemporary data platform paradigm—the data fabric—centers on it.

Setting objectives is important, but achieving them requires a plan. Without the nuance of clear objectives and a time horizon to aim for, one can easily end up adrift.

The Relationship Between Methods
With the objectives defined, the next step is to determine how to accomplish them. In general, almost everything within mature organizations is accomplished by means of a set of policies, processes, procedures, standards and guidelines (figure 6), but the nuances of the relationship between these are often misunderstood.

Policies can be defined as the set of rules an organization expects its people to follow. To make the rules actionable, the policies are interpreted as processes, and some of the processes are disaggregated into detailed task-level procedures. In some cases, the processes are defined with reference to a specific standard. The final formality is a set of guidelines, which are like standards, except that they are recommendations rather than specific requirements for development of the associated processes and procedures.

For the activities the organization deems important enough to control with rules, making them actionable by means of processes and procedures can be standards-driven, guidelines-driven or standalone (figure 6). It is possible that some processes and procedures may exist in the absence of reference policies, but they should be subjected to deep scrutiny prior to approval to determine why no policy drivers exist for them.
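The scrutiny of processes and procedures that have no reference policy can be partly supported by a simple traceability check. The following is a minimal Python sketch, not a prescribed method: the Artifact structure, the policy identifiers and the sample entries are all hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """A process or procedure, with the policies it claims to implement."""
    name: str
    kind: str  # "process" or "procedure"
    policy_refs: list[str] = field(default_factory=list)


def find_orphans(artifacts, known_policies):
    """Return names of artifacts that trace to no known policy."""
    return [
        a.name
        for a in artifacts
        if not any(ref in known_policies for ref in a.policy_refs)
    ]


# Hypothetical example data, invented for illustration only.
policies = {"POL-DQ-01", "POL-ACCESS-02"}
artifacts = [
    Artifact("Customer data quality review", "process", ["POL-DQ-01"]),
    Artifact("Quarterly access recertification", "procedure", ["POL-ACCESS-02"]),
    Artifact("Ad hoc spreadsheet reconciliation", "procedure"),
    Artifact("Vendor file intake", "process", ["POL-RETIRED-09"]),
]

orphans = find_orphans(artifacts, policies)
print(orphans)
```

A check like this only surfaces candidates for review. Deciding whether an orphaned process needs a new policy, or should be retired, remains a human judgment.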

As with objectives, the best way to determine the success of a method is by using metrics. One category of metrics is input metrics, which entails measuring process throughput and then watching for and resolving bottlenecks. Another approach is to measure policy, process and procedure compliance (which requires an audit). A further category is output metrics, which includes measuring and tracking data quality per critical data element or master data domain.
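As an illustration of tracking data quality per critical data element, the sketch below computes one possible measure, completeness, for each element. It assumes that completeness (the share of records with a non-empty value) is the dimension of interest; the record layout and element names are hypothetical.

```python
def completeness_by_element(records, critical_elements):
    """Share of records with a non-empty value, per critical data element."""
    total = len(records)
    scores = {}
    for element in critical_elements:
        populated = sum(
            1 for r in records if r.get(element) not in (None, "")
        )
        scores[element] = populated / total if total else 0.0
    return scores


# Hypothetical records and element names, invented for illustration.
customers = [
    {"customer_id": "C1", "email": "a@example.com", "country": "CA"},
    {"customer_id": "C2", "email": "", "country": "ZA"},
    {"customer_id": "C3", "email": "c@example.com", "country": None},
    {"customer_id": "C4", "email": "d@example.com", "country": "US"},
]

scores = completeness_by_element(customers, ["customer_id", "email", "country"])
for element, score in scores.items():
    print(f"{element}: {score:.0%}")
```

Completeness is only one dimension of data quality. Accuracy, timeliness and consistency would each need their own rules, and the targets per element would come from the data governance program rather than from the code.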

The Relationship Between Roles
A key data governance activity is to define, by name, the responsibilities for executing the data management methods. Although data governance does not have an explicit pillar for people resourcing, IT governance does. These are not the only people factors at play. There are nonoperational data stakeholders lurking in all areas of an organization, and they must be identified and brought along to ensure organizational buy-in for the data management journey.

The roles involved in full-scope data management range from the chief data officer to data owners; data stewards; data analysts; data managers; data architects; data modelers; data operations staff; business intelligence developers; content managers; various subject-matter experts (SMEs); and privacy, security, and compliance officers. Some roles are more technical, while others are more oriented toward business and operations. Each role has very specific activities to perform within the data ecosystem, and they are all held together by data governance, which defines which roles are responsible (or accountable) for what functions. Some roles are also integrated into teams or communities by means of constructs, such as the data council or the office of the chief data officer.

Data governance requires that data management activities are appropriately allocated to roles (for example by the chief data officer or head of data) and that there is assurance that the activities are performed on schedule and according to applicable standards. The segregation of duties (SoD) between those responsible for data governance activities and those accountable for them (figure 7) is a key nuance. For a single person to be both accountable and responsible for an activity—too often shown as A/R (accountable/responsible) in some responsible, accountable, consulted and informed (RACI) charts—violates the governance requirement to distinguish between accountability and responsibility. Segregation is a vital instrument used for mitigating conflicts of interest. Establishing important distinctions and balance between accountabilities and responsibilities in governance, including data governance, is all too often overlooked.

As a regulatory example of this distinction, the Risk Data Aggregation and Risk Reporting Principles set by the Bank for International Settlements (BIS) in 2013 held global systemically important banks’ boards and senior management accountable for the identification, assessment and management of data quality risk, and for providing adequate resources to perform these activities.¹⁹ The board delegates responsibility to senior management, and senior management delegates responsibility within the organization to people ultimately responsible for task execution.

Note that responsibility can be delegated but accountability cannot. For example, if anything should go wrong at the task level of data governance, the BIS would hold the board accountable, not the person delegated responsibility for performing the task.
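The segregation of duties described above can be checked mechanically once a RACI chart is held as data. The sketch below is one possible way to flag A/R combinations, assuming the chart is stored as a mapping from activity to role assignments; the activities and role names are hypothetical.

```python
def sod_violations(raci):
    """Find (activity, person) pairs where one person is both A and R.

    `raci` maps activity -> {person: set of RACI letters}.
    """
    violations = []
    for activity, assignments in raci.items():
        for person, letters in assignments.items():
            # A person holding both letters for one activity breaks SoD.
            if {"A", "R"} <= set(letters):
                violations.append((activity, person))
    return violations


# Hypothetical RACI chart, invented for illustration only.
raci = {
    "Define data quality rules": {
        "Data owner": {"A"},
        "Data steward": {"R"},
        "Data analyst": {"C"},
    },
    "Approve access requests": {
        "Data owner": {"A", "R"},
        "Security officer": {"C"},
    },
}

violations = sod_violations(raci)
print(violations)
```

The check treats each name in the chart as a distinct person or role. If one individual holds several roles, the chart would need to be resolved to individuals first for the result to be meaningful.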

Contemporary Nuances

Data governance is still quite immature in its development, so it is likely to change form based on the shortcomings of the current approaches.

Automation
The automation of data governance is a vision of the future. Aside from the uncertainty about what individual vendors mean by data governance automation and what the scope of this intervention should be, the major challenges with this vision include:

Although technical metadata may be available for automation, the semantic layer—operational metadata—which is a key component for end users, is often ill-defined or nonexistent.
Although a data catalog may be automatable, the corresponding data dictionary is not, which means the catalog is still not meaningful to end users.
Although lineage may be automatable, it depends on the extract, transform, load (ETL) or the extract, load, transform (ELT) code written in a product by a major vendor being in a readable format. Legacy/cottage ETL/ELT is not easily automatable.
Although the curation of business terms across the organization can be automated, standardizing them is not automatable, which means the end user will still suffer from the absence of organizationally agreed upon terms or synonyms for terms.
Although data categorization and data classification are automatable, the outcomes still need to be validated by humans for completeness and accuracy.

In other words, the journey to data governance automation suffers from the same constraints as data governance in general. Data governance automation requires increasing organizational data maturity to be effective. In the absence of maturity, performing data governance—never mind data governance automation—may not be successful.

A nuance is to begin the journey to data governance and then to data governance automation (in this order) in parallel with a people transformation (change management) intervention aimed at increasing the data maturity of the organization. Believing that technology will solve data governance problems is easy, but it will not work without ensuring that as few of the organization’s staff as possible are left behind. Not everyone will survive the journey.

Centralization vs. Decentralization
The centralized data environments in use since the creation of relational databases and monolithic data warehouses are evolving not only into data lakes and data lakehouses, but also into data mesh architectures (technology independent) and data fabric platforms (multitechnology) (figure 8).²⁰ While these two approaches tend to be spoken about together, they are dissimilar.

Source: Adapted from Tesfaye, L.; “Data Management Trends in 2022: Data Fabric v. Data Mesh v. DataOps? What Is Right for Your Organization?” Enterprise Knowledge, 11 January 2022, https://enterprise-knowledge.com/data-management-trends-in-2022-data-fabric-v-data-mesh-v-dataops-what-is-right-for-your-organization/

Data mesh architectures are centered on organizational SMEs as owners of their data products (something akin to the function of traditional data marts, where data are prepared with a specific purpose in mind). The sole purpose of data mesh architectures is defining these data products.²¹ The mesh depends on identifying data owners and data stewards as a core data governance step and mapping relationships between those owners and various data domains, some of which may be master data.

Data fabric platforms connect data and processes by means of appropriate metadata to ensure the effective utilization of the organization’s data assets. The data fabric depends on current (centralized) data management tools, while a mesh architecture begins the shift toward distributed data services.²²

The pursuit of data products is a key driver of both nuances, with a data product being a structure that brings everything the business needs about the data together (e.g., in domains) to be able to produce value from it²³ either for a shorter-term initiative or for repeatable, longer-term initiatives.

Cloud Data Governance Is Not Equal to On-Premises Data Governance
It is critical to understand that data governance in the cloud is distinct from data governance and data management for on-premises data. Oversight and management of all policies, processes, procedures, standards, guidelines and RACI charts are possible on premises. In the cloud, many of these resources are out of reach and within the vendor’s ambit instead.

How can a client perform the full governance of its data management activities that are not all within the client’s reach, never mind its control? The answer to this question illustrates why it is so important—a matter of due diligence before any cloud contract is signed—to request the vendor’s relevant policies, processes, procedures, standards, guidelines and RACI charts. This will help determine the alignment between the vendor’s data governance and data management activities with those of the organization.

The due diligence process also presents an opportunity to determine how data governance and data management activities are (independently) audited. With a cloud migration, the accountability for data cannot be delegated to the (cloud) vendor.

Conclusion

No article of this brevity can reflect, in any meaningful way, all the components of modern data governance and modern data management. In fact, it is sobering to realize how much has been omitted in this overview of some of the nuances that enable sustainable data governance and data management success. For example, little relating to security, privacy, compliance, integration (fusion), life cycles, curation, ingestion, preparation, analysis, ethics, culture or even crowdsourcing has been addressed, not to mention the big theme of global data governance with respect to the Internet and to global data flows. All have nuances that enrich the data environment of a fully functional organization that plans for the future but executes for the present.

There are many subtleties required for sustainable data governance and data management practice that are not discussed in various data management or data governance courses or frameworks available today. Twelve specific subtleties worth noting are:

Recognize the place of lore in the data landscape. Data represent only a fraction of the artifacts relevant to decision-making, but for lore, memories fade and experiences become the stuff of myth and legend.
Be mindful of the overall goals of data governance and data management to efficiently enable insights and reduce organizational risk to help ensure that an organization is sustainable.
Realize that initiating a data governance journey creates an expectation of more bureaucracy, which should be countered early by a custom business case specifically highlighting the positive effects for staff.
Understand that the data journey must be tightly aligned with the organization’s strategy. Too many claims of strategic alignment are made only by quoting a strategic objective that aligns with data. Strategic alignment is a much stronger discipline than this reflects.
Define the goals that the data governance and data management journeys are meant to achieve. Many of these goals may simply be to counter blind trust in data and to prevent data-driven misunderstandings.
Understand that policies, processes, procedures, standards and guidelines do not exist independently of each other. The latter four artifacts must be aligned with and support policy. Items in the latter four artifacts that do not relate to a policy item should be questioned.
Understand SoD. There is a big difference between being accountable and being responsible, and the same person cannot fulfill both roles for the same activity.
Realize that a successful technology deployment is not equivalent to data governance or data management success.
Be aware that data governance automation can introduce more risk into the organization if the organization has low data maturity.
Recognize that data fabric architectures depend on operational metadata being in place, such as data dictionaries, data catalogs and business glossaries.
Know that data mesh architectures are decentralized. This can introduce architectural integration challenges if the predominant organizational architecture is monolithic and centralized.
Take the time to perform solid due diligence on any proposed cloud vendor as data governance in the cloud is an almost entirely different discipline from data governance on premises.

Mr. Spock says that computers make excellent and efficient servants, but that he has no wish to serve under them.²⁴ Humans have been comforted by the fabric of lore for thousands of years; they could never leave their fate to machines. Management should, therefore, revel in humans’ ability to blend data and lore in such a way that the most efficient decision can be reached, especially for complex decision-making scenarios. Indeed, humans do not need to depend totally on DDDM. To do so would make their decision-making indistinguishable from that of machines, forcing the untenable conclusion that there is no benefit to being human.

Endnotes

¹ Ohio University, Athens, Ohio, USA, “Five Essentials for Implementing Data-Driven Decision-Making,” 8 March 2021, https://onlinemasters.ohio.edu/blog/data-driven-decision-making/
² Olavsrud, T.; “Data Governance: A Best Practices Framework for Managing Data Assets,” CIO, 18 March 2021, https://www.cio.com/article/202183/what-is-data-governance-a-best-practices-framework-for-managing-data-assets.html
³ Soyer, E.; R. Hogarth; “Fooled by Experience,” Harvard Business Review, May 2015, https://hbr.org/2015/05/fooled-by-experience
⁴ Marsh, S.; “The Pros and Cons of a Data-Driven Corporate Culture,” Recruiting Blogs, 16 August 2018, https://recruitingblogs.com/profiles/blogs/the-pros-and-cons-of-a-data-driven-corporate-culture
⁵ Lerner, J.; Y. Li; P. Valdesolo; K. Kassam; “Emotion and Decision-Making,” Annual Review of Psychology, January 2015, https://www.annualreviews.org/doi/epdf/10.1146/annurev-psych-010213-115043
⁶ Ibid.
⁷ Wray, J.; “The Weight of Emotions on Decision-Making: A Comparative Analysis,” Inquiries Journal, vol. 12, iss. 9, 2020, http://www.inquiriesjournal.com/articles/1798/the-weight-of-emotions-on-decision-making-a-comparative-analysis
⁸ BI-Survey.com, “Why Companies Make Decisions Without All the Relevant Information to Hand,” https://bi-survey.com/decision-making-no-information
⁹ O’Brien, E.; “We Use Less Information to Make Decisions Than We Think,” Harvard Business Review, 7 March 2019, https://hbr.org/2019/03/we-use-less-information-to-make-decisions-than-we-think
¹⁰ Bazerman, M.; D. Chugh; “Decisions Without Blinders,” Harvard Business Review, January 2006, https://hbr.org/2006/01/decisions-without-blinders
¹¹ Pennycook, G.; “System 1 vs. System 2 Thinking: Why It Isn’t Strategic to Always Be Rational,” Big Think, 29 April 2022, https://bigthink.com/the-well/system-1-2-thinking-fast-slow/
¹² Ibid.
¹³ Ibid.
¹⁴ Daum, K.; “Fifty Star Trek Quotes Inspiring You to Boldly Go Into Your Future,” Inc., 13 October 2016, https://www.inc.com/kevin-daum/50-star-trek-inspiring-you-to-boldly-go-into-your-future.html
¹⁵ Pearce, G.; “Data Resilience Is Data Risk Management,” ISACA® Journal, vol. 3, 2021, https://www.isaca.org/archives
¹⁶ Op cit Olavsrud
¹⁷ Henderson, J.; N. Venkatraman; “Strategic Alignment: A Model for Organizational Transformation via Technology,” Center for Information Systems Research, MIT Sloan School of Management, Cambridge, Massachusetts, USA, November 1990, https://dspace.mit.edu/bitstream/handle/1721.1/49184/strategicalignme90hend.pdf?sequence=1
¹⁸ Pearce, G.; T. Gaffney; “Digital Governance: Closing the Digital Strategy Execution Gap,” ISACA Journal, vol. 4, 2020, https://www.isaca.org/archives
¹⁹ Basel Committee on Banking Supervision, “Principles for Effective Risk Data Aggregation and Risk Reporting,” Bank for International Settlements, January 2013, https://www.bis.org/publ/bcbs239.pdf
²⁰ Ibid.
²¹ Ibid.
²² Ibid.
²³ Ibid.
²⁴ Op cit Daum

GUY PEARCE | CGEIT, CDPSE

Has an academic background in computer science and commerce and has served in strategic leadership, IT governance and enterprise governance capacities. He has been active in digital transformation since 1999, focusing on the people and process integration of emerging technology into organizations to ensure effective adoption. Pearce maintains a deep interest in data and their disciplines that accelerated with the launch of his high school data start-up many years ago. He was awarded the 2019 ISACA® Michael Cangemi Best Book/Author award for contributions to IT governance, and he consults in digital transformation, data and IT.

Published in ISACA Journal, 2022, Volume 6.