THE ESSENCE OF DATA GOVERNANCE
OVERVIEW
The availability of timely and accurate data is an essential element of the everyday operations of many organizations. Equally, an inability to capitalize on data assets can have far-reaching effects: organizations may incur unnecessary costs, suffer operational delays, or be exposed to risks that could have been mitigated had effective data management processes been in place. This poses a particular problem for organizations that have limited capacity to tackle large data governance initiatives but acknowledge the need to put these processes in place.

In our Introduction to Data Governance we described the generic processes related to data governance without consideration of organizational size, team structures, information system architecture, timelines, or directives. In reality, many organizations face severe constraints in one or more of these areas. Taking account of these limitations, this paper proposes a pragmatic approach to data governance and seeks to answer the following questions:

- What are the essential processes that I need to put in place to build a platform for future governance activities?
- What tools can I use to achieve effective data governance?
- What are the key governance controls that I need to make use of to ensure better data quality?
- Which roles are relevant?
- What metrics can we use to measure our activities and enforce a culture of continuous improvement?
THE ESSENTIALS OF DATA GOVERNANCE
The figure below outlines a set of iterative steps and related deliverables that should enable an organization to implement or improve their current data governance.

[Figure: iterative steps and related deliverables - Strategy; Policies & Procedures; Workflow Analysis; Define Roles and Responsibilities; Model Definition; Master Data; Risk Analysis / SWOT Assessment; Controls. Deliverables include the Governance Lead, domain managers and data stewards, ERDs at conceptual, logical, and physical level (structure), flow diagrams (behaviour), quality dimensions (integrity, validity, timeliness, completeness), strategic goals, business rules, reactive and preventative measures, approval and issue resolution processes, relational constraints, logging, decisions, reporting and quality assurance, supporting tools such as activity and RACI matrices, and a common glossary and data dictionary.]

STRATEGY
The first step is to constitute a data governance team with a clearly defined strategy before embarking on any governance projects (Aiken, 2017). The strategy should provide the stimulus and direction for the change required by the organization to achieve an improved state of data governance. For the change to succeed, consider the following five aspects when drafting the data strategy:

- A vision: defining exactly what it is you are trying to achieve
- The skills required to drive the projects and objectives in support of the vision
- Incentives that speak to why the change should take place
- Resource availability
- An action plan to achieve objectives

Tools such as a charter, scope statement, and implementation roadmap provide additional context to achieve the strategy.

Figure 3: Adapted from Aiken, 2017 - illustrates the potential results when any of the components are not in place: without a vision, confusion; without skills, anxiety; without incentives, only gradual change; without resources, frustration; without an action plan, false starts. With all five components in place, change succeeds.
POLICIES & PROCEDURES
The next step is to draft the data governance policies with high-level statements of intent relating to the functioning and management of data. A useful strategy in defining these policies may be to first focus on the areas of difficulty or risk that the organization is currently experiencing, and then to prioritize and evaluate the courses of action that will assist in addressing these. The DAMA standard categorizes the functional areas of data management and can greatly assist with the categorization of policies.

Data policy definition is an iterative process, with continued improvement of the policies based on the outcomes of data governance efforts. For example, analyzing workflow processes may uncover internal risks to the quality of the data. In response, policies and procedures are updated to prevent the issue from recurring or to reduce its likelihood.

WORKFLOW ANALYSIS
The workflows, data steward behaviours, and flow of data throughout an enterprise have the most significant impact on data quality. It is vital to understand and document the flow of data, and any processes that act upon data. These processes can be modelled using a standard notation such as Business Process Model and Notation (BPMN) (Juric & Pant, 2008) or UML activity diagrams. Both standards enable processes to be modelled with swim lanes, which is a highly effective way to display business domains, teams, or any logical grouping of entities. In addition, detailed system architecture diagrams enrich the understanding of workflows.

DEFINE ROLES AND RESPONSIBILITIES
At this stage, roles and responsibilities can be determined. Given an understanding of the workflows, business domains, and processes that act on data, the Data Governance Lead can now identify the key stewardship roles, including the operational data stewards and data owners. These roles are essential to drive the data strategy forward and maintain momentum.
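Roles and responsibilities are commonly captured in a RACI matrix (Responsible, Accountable, Consulted, Informed). The sketch below shows one minimal way to represent and sanity-check such a matrix; the role and activity names are illustrative placeholders, not prescribed by any standard.

```python
# A minimal sketch of a RACI matrix for data governance activities.
# All role and activity names here are hypothetical examples.
raci = {
    "Define data policies":        {"Governance Lead": "A", "Data Steward": "C", "Data Owner": "R"},
    "Approve master data changes": {"Governance Lead": "I", "Data Steward": "R", "Data Owner": "A"},
    "Resolve data quality issues": {"Governance Lead": "C", "Data Steward": "R", "Data Owner": "A"},
}

def responsibilities(role: str) -> dict:
    """Return each activity and the given role's RACI assignment."""
    return {activity: roles[role] for activity, roles in raci.items() if role in roles}

def missing_single_accountable(matrix: dict) -> list:
    """RACI convention: each activity should have exactly one Accountable role."""
    return [activity for activity, roles in matrix.items()
            if sum(1 for v in roles.values() if v == "A") != 1]

for activity, code in responsibilities("Data Steward").items():
    print(f"{activity}: {code}")

assert missing_single_accountable(raci) == []  # matrix is well-formed
```

A check like `missing_single_accountable` is itself a small governance control: it makes the "one accountable owner per activity" rule from the approvals discussion mechanically verifiable.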
MODEL DEFINITION
Only now do we recommend starting to work with the data itself. Whereas workflow analysis describes the behaviours that affect data, data models describe the structure of data. Data model definitions typically take one of three forms:

- Conceptual data models represent a very high level of detail, show only basic relationships between entities and data sets, and are useful in design or analysis processes without reference to implementation details.
- Logical models may include more detail, such as attributes, primary keys, and foreign keys, to model the relationships between entities.
- Physical models describe the previous two models at the lowest level of detail, including table names, column names, and column data types. These are typically modelled using UML ERDs or Crow's Foot notation and show a high level of detail for relational attributes.

MASTER DATA
At this stage, it is important to understand what the primary sources of Master Data are, and how that data is impacted. If the potential exists for multiple sources to have an impact on data relating to key business entities, then the processes acting on the data must be fully understood in order to put the necessary controls in place.

Another area where governance is of particular importance is data enrichment. Governance controls are essential to ensure the quality of data when enrichment has taken place, whether through external sources, derived data, or data modified manually to accommodate new requirements. A simple mechanism the governance team can make use of is to introduce a review and approval process as a control measure.

RISK ANALYSIS / SWOT ANALYSIS
Given the information that has been gathered to this point, the Governance Lead or governance working group can begin to analyze the system and processes for opportunities to address threats or weaknesses, capitalize on opportunities, or leverage strengths within the existing processes.
The outcome of this analysis will provide a prioritized view of the governance controls to address. With reference to the Capability Maturity Model (see our Introduction to Data Governance), the processes that are least mature and pose the greatest risk to the organization should generally take priority when sequencing the strategies for governance controls.
CONTROLS
Governance controls are essential to bring about improvements in data governance and give visibility and meaning to the data strategy. The type and design of these controls will depend on the specific issues identified during the analysis phases, but wherever possible, automated data quality management should be a top priority. In addition, consider the following processes and controls to improve the quality of data and processes:

INTERFACES
The facilities for the management of data need to be understood and reviewed. If the manual management of data introduces risks to quality, then processes must be in place to mitigate those risks. If the same data is managed in multiple locations - such as in spreadsheets as well as domain-specific applications - then an effort needs to be made to either integrate the efforts or mitigate the risks arising from differences in how that data is managed.

APPROVALS
In order to achieve an acceptable level of quality or satisfy a specific requirement, domain or subject matter experts validate and approve the data. Visual representations of data through reports or dashboards enable the accountable user to quickly confirm the data quality. Clear accountability is vital to the approval process: all data within the system must have an owner or responsible team that understands the business requirements relating to the data.

ISSUE ESCALATION AND RESOLUTION
If a data steward becomes aware of a data or process risk, they must be able to reliably log the error for investigation and resolution. If a particular scenario cannot be resolved via a standard business process, they must be able to escalate the issue to an appropriate level for further analysis.

LOGGING
It is important to log key events relating to data and processes. Without a record of how data was modified and by whom, any effort to improve the data and related processes is undermined.
It also leaves business teams highly dependent on technical staff for the resolution of problems.

AUTOMATED REPORTING
The governance team should consider putting automated reporting processes in place. Automated reporting can be an exceptionally effective tool to monitor system states and inform the governance team about risk events. Ideally, the reporting controls should inform users about the ongoing state as well as unique events that fall outside normal operating conditions. This applies to the business domain context as well as to technical operations: if business dependencies will be impacted by a technical issue, the governance team needs to know about it. For example, certain outputs are expected to be generated within specific timelines. If the timelines relating to those outputs are monitored, the governance team can quickly be informed when a tolerance level has been breached.

DECISION LOGIC AND CONTROLS
Workflows by their nature may have conditionality, divergence and convergence, and decision points. The implementation of governance controls will result in additional control points to evaluate outcomes and state, or to facilitate manual intervention. At the other end of the spectrum, the workflow analysis phase could identify the need to eliminate redundancy and duplication of effort.

MATRICES AND PROFILES
Data matrices are invaluable references when analyzing data for subject areas, business domains, imports, and exports. Data profiles add further detail relating to the context of the data, such as data types, transformations that may occur as part of load processes, and transformations on export.
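The timeliness control described under Automated Reporting can be sketched in a few lines: expected delivery deadlines are compared against observed deliveries, and anything missing or past its tolerance is surfaced as an alert. The output names, deadlines, and tolerance below are hypothetical examples, not prescribed values.

```python
# A minimal sketch of an automated timeliness control. Output names,
# deadlines, and the 15-minute tolerance are illustrative assumptions.
from datetime import datetime, timedelta

# Expected delivery deadlines for key business outputs.
expected = {
    "daily_position_report": datetime(2024, 1, 15, 7, 0),
    "nav_file_export":       datetime(2024, 1, 15, 9, 30),
}
tolerance = timedelta(minutes=15)

# Actual delivery times as observed by the monitoring process.
actual = {
    "daily_position_report": datetime(2024, 1, 15, 6, 55),
    "nav_file_export":       datetime(2024, 1, 15, 10, 5),
}

def breaches(expected: dict, actual: dict, tolerance: timedelta) -> list:
    """Return outputs that are missing or delivered after deadline + tolerance."""
    late = []
    for name, deadline in expected.items():
        delivered = actual.get(name)
        if delivered is None or delivered > deadline + tolerance:
            late.append(name)
    return late

for name in breaches(expected, actual, tolerance):
    print(f"ALERT: {name} breached its delivery tolerance")
```

In practice a check like this would run on a schedule and feed the governance team's dashboards or alerting channel, covering both the ongoing state and breach events.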
THE ESSENCE OF DATA GOVERNANCE

[Figure 4: Iterative Cycle of Data Governance Improvement - Strategy → Policies → Workflow Analysis → Define Roles and Responsibilities → Model Definition → Master Data Process → Risk Analysis / SWOT Assessments → Controls, repeating.]

PLAN, DO, REFINE, REPEAT
Data governance is a journey, not a destination. In conclusion, it is important to note that the steps above are iterative in nature. After each iteration, or at defined intervals, key lessons are noted and refinements made. This constant feedback loop allows for quick wins, and ensures that the data governance programme accommodates organizational changes and remains aligned with stakeholder expectations. In a subsequent article, we will use a real-world scenario to show how each of the steps above can be practically applied to achieve better data governance.

REFERENCES
1. Seiner, R. S. (2014). Non-Invasive Data Governance. Technics Publications.
2. DAMA International. (2009). The DAMA Guide to the Data Management Body of Knowledge, Enterprise Server Edition. Technics Publications.
3. Aiken, P. (2017). Data Strategy and the Enterprise Data Executive. Technics Publications.
4. Berson, A., & Dubov, L. (2011). Master Data Management and Data Governance. New York: McGraw-Hill Professional.
5. Seiner, R. S. (2017, May 17). How Is Non-Invasive Data Governance Different? Retrieved July 04, 2017, from http://tdan.com/non-invasive-data-governance-2/17265
6. Juric, M. B., & Pant, K. (2008). Business Process Driven SOA Using BPMN and BPEL: From Business Process Modeling to Orchestration and Service Oriented Architecture. Birmingham, U.K.: Packt Publishing.

SOUTH AFRICA
2nd Floor, Albion Springs, 183 Main Road, Rondebosch, 7700, Cape Town
T: +27 21 685 9157  E: info@infovest.co.za

UNITED KINGDOM
Mansel Court, Mansel Road, Wimbledon, London SW19 4AA
T: +44 (0)20 8410 9876  E: info@infovest.co.za

UNITED STATES
100 High Street, Suite 1550, Boston, MA 02110
T: +1 617 692 1150  E: info@infovest.co.za