Monday, March 30, 2015

EDW 2015: Data Governance Organization


Pharmaceutical model
Janet Lichtenburger and Nancy Makos presented a terrific view of how Data Governance works at their company. They implement functional (small) data governance, with the exception of data sensitivity.

Here is my brain dump of the session today. You're welcome to it if you can make heads or tails of it. :-)

Some numbers that show the challenge of Data Governance there are:
  • 5 people in Enterprise Architecture
  • 3 people in Data Governance
  • 130 people provide Data Stewardship support
  • 350k total employees

Data Governance is about Policies and Standards and is typically independent of implementations, as part of the Enterprise Architecture or Finance groups.

To encourage adoption, Data Governance could be treated as an internal consulting service that supports projects and is not charged back.

Organizational Model
Overall Model
  1. Enterprise Data Governance Executive Committee
    1. Meets only a few times a year
    2. Limited number of senior executives
  2. Data Governance Committees
    1. Each committee is chaired by Data Governance
    2. Divided into specific domains
    3. Meets as often as required by projects
    4. Up to 15 people on each Domain Specific Committee.

Data Steward Roles
  • Executive Stewards
    • Member of the Data Governance Executive Committee
    • Strategic Direction
    • Authority
  • Enterprise Stewards
    • Member of the Data Governance Committee
    • Development Support of Data Governance Policies
  • Operational Stewards
    • Communicates to promote policies
    • Endorses Data Standards
  • Domain Stewards
    • Recommends canonical structures
    • Endorses Data Standards

However the model is constructed, it must make sense for the business. All new policies should be weighed with a Cost / Benefit analysis; the exception is regulatory requirements. Regulatory and Legal compliance are critical to avoid jail.

Policies
The goal of policies is to drive behavior changes needed for Enterprise Information Management to succeed.

People generally don't like change. One way to get buy-in is to amend existing processes instead of creating new ones. Amendments are smaller and less intrusive, and the stakeholders have already been identified. This also leads to more partnerships and shared endorsements of the changes.

There are typically a small number of extremely high value areas; policy should focus on those.

Align with the Enterprise goals and with other Enterprise-wide groups. There are a lot of shared concerns and ways that the teams can support each other.

Keep the policies easily accessible. Do not hide them in a 500 page volume; instead keep them somewhere easily discoverable, such as a wiki on the corporate intranet.

Policies that are overly broad and not enforceable can quickly cause legal / compliance problems. In those cases, no policy is a better choice than an unenforceable one.

Data Classification
The source and type of data both define the data classification required. Similarly, combining data from several more open sources can escalate the protection required.

  • Restricted
    • Financial information, such as credit cards
  • Protected
    • Regulated information, such as HIPAA data
  • Private
    • Named Persons
  • Internal
    • Business Data
  • Public
    • Everything Else

The data classification levels are combined with Information Security levels for systems to determine where the data may be transmitted.
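Here is a minimal sketch of how the levels above might be modeled. It assumes the list runs from most to least sensitive, that a combined dataset is at least as sensitive as its most sensitive source, and that a system's Information Security level can be expressed on the same scale. The names Classification, combined_classification and may_transmit are mine, and the one-level escalation is a placeholder for whatever rule the real policy uses.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Levels from the list above, ordered least to most sensitive (assumed)."""
    PUBLIC = 1
    INTERNAL = 2
    PRIVATE = 3
    PROTECTED = 4
    RESTRICTED = 5


def combined_classification(sources, escalate=False):
    """Classify a dataset built from several sources.

    The result is at least the highest level among the sources. When
    escalate is True the result is raised one level, capped at RESTRICTED.
    The one-level bump is an assumption, not a rule from the session.
    """
    highest = max(sources)
    if escalate:
        highest = min(highest + 1, Classification.RESTRICTED)
    return Classification(highest)


def may_transmit(data_level, system_level):
    """A system may receive data only if its level covers the data level."""
    return system_level >= data_level


merged = combined_classification(
    [Classification.INTERNAL, Classification.PRIVATE], escalate=True
)
print(merged.name)
print(may_transmit(merged, Classification.PRIVATE))
```

Keeping the levels in an ordered type means the "where can this go" question becomes a simple comparison between the data level and the system level.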

Anonymizing Data
Safe Harbor
Remove 18 identifying attributes from the data, which renders it fairly useless.
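As a rough sketch of the idea, stripping identifying attributes could look like the following. The column names are hypothetical and cover only an illustrative handful of attributes, not the full list of 18, and strip_identifiers is a name I made up.

```python
# Hypothetical column names: an illustrative subset only, not the full
# list of 18 identifying attributes.
IDENTIFYING_COLUMNS = {
    "name",
    "street_address",
    "phone",
    "email",
    "ssn",
    "medical_record_number",
}


def strip_identifiers(records, identifying=IDENTIFYING_COLUMNS):
    """Return copies of the records without the identifying columns."""
    return [
        {key: value for key, value in record.items() if key not in identifying}
        for record in records
    ]


patients = [
    {
        "name": "Jane Doe",
        "email": "jane@example.com",
        "diagnosis": "asthma",
        "visit_year": 2014,
    },
]
deidentified = strip_identifiers(patients)
print(deidentified)  # [{'diagnosis': 'asthma', 'visit_year': 2014}]
```

Even in this toy example you can see how little is left to analyze once the identifying columns are gone.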

Expert Determination
An expert certifies that the probability of re-identifying an individual from the available data is very low.

Automation
There are tools out there, such as Parit(?), that are capable of doing this automatically after a survey and analysis of the data.

Standards Examples

USPS Publication 28 specifies postal addressing standards
ISO 15836 has standards for tagging unstructured documents
ISO/IEC 11179-4 and -5 have naming standards for business metadata

Sunday, March 29, 2015

EDW 2015: Data Strategy

This session was presented by Lewis Broome (from Data Blueprint) and Brian Cassel (from the Massey Cancer Center).

Lewis presented his strategic model and roadmap using the case study of a logistics company implementing an ambitious project.

Brian spoke about the challenges he faced within the cancer center as he implemented analytics across data hidden in silos. The challenges were primarily cultural, but once he was past them he was able to use the existing Data Analytics hub to build a specialized data mart to support strategic review of the data.

This was really great stuff and my summary doesn't do it justice by a long shot.
 
Data Strategy
Business Needs
In order to get anywhere with discussions about data and ways to improve it throughout the organization, the value of the effort has to be made clear. Clean data may seem like the most obvious need in the world, but that view is too low level to make it onto the radar of senior management. Instead, the effort needs to clearly address a business need.

There are three aspects to consider
  • How will it mesh with the company Mission and Brand Promises?
    • Ex. FedEx: Your package will get there overnight. Guaranteed.
  • Does it improve the company's market position / provide a competitive advantage?
    • Michael Porter's Market Positioning Framework and his Competitive Advantage Framework provide a good way to think about this.
  • Will it improve the operating model and support the company's objectives?
    • Operating models improve by changing the degree of business integration or standardization.

If the data changes do not address any of these areas, it will not gain the support needed to succeed.

New capabilities that do not meet a business need aren't a program, they are a science project.

Current State of the Business
The current state assessment looks at
  • Existing Assets and Capabilities
  • Gaps in Assets and Capabilities
  • Constraints and Interdependencies
    • This can be the toughest stuff to identify.
    • BEWARE SHADOW SYSTEMS: typically Excel spreadsheets with macros or Access databases that fix data before feeding it into the next step of a process.
  • Cultural Readiness

Cultural Readiness
Cultural Readiness depends on 5 different areas
  • Vision
    • A clear message of what the program is expected to achieve
  • Skills
    • Ensure that the right people are part of the program
  • Incentive
    • The value and importance of the program should be clear to all of the participants
  • Resources
    • Backing the program will require more than just good will; tools, environments and training may all be required
  • Action Plan
    • The system boundaries being developed should be clearly defined

Capability Maturity Model Levels
  1. Starting point
    1. There is some data in a pile over there.
  2. Repeatable Process
    1. This is how we sweep the data into a pile and remove the bits of junk we find.
  3. Defined Process
    1. Sweep from left to right, avoid the dead bugs. Leave data in a pile.
  4. Managed Process
    1. The entire team has the same brooms and the dead bugs are highlighted and automatically avoided by the brooms.
  5. Optimizing
    1. Maybe we can add rules to avoid sweeping twigs into the pile as well.

Roadmap
The Roadmap establishes the path of the Data Management Program to achieve the strategic goals.

Leadership and Planning
    • Planning and Business Strategy Alignment
    • Program Management
    • Clearly Defined Imperatives, Tactics and KPIs
    • Accountable to CDO
Project Development
    • Outcome Based Targets
    • Business Case and Project Scope
    • Program Execution

Project Model
Big Projects tend to fail at least twice, sometimes more than that, as the business learns what it really needs.

Always start with crawling and walking before going to running.
  • Governance should start with a small 'g', where it matters most. There are commonly 5-10 critical data elements; take care of those before setting targets higher.
  • Data Strategy as a top-down approach works best. Otherwise it is uncoordinated and is only capable of supporting tactical initiatives.
  • Data Architecture must focus on the business needs, not individual systems or applications.

Saturday, March 28, 2015

Questions and where to seek answers

There always seems to be an unending stream of questions.

In the past, a good number of them were about things like "How can we use this shiny new feature?" or "What are the best practices for these scenarios?"

However, times change and so do the questions.

Microsoft has been following the lead of the SQL Server community and has steadily become more open with road maps and dialogs. We understand the platform and internals better than ever before, making it easier to address concerns.


Now the questions that need to be addressed are more strategic than tactical. With strategies, by their very nature, it is more challenging to find peers with whom to discuss successes and (more importantly) failures in plans and their implementations.

I'm attending Enterprise Data World this year to have a chance to discuss ideas, strategies and technologies to bring new insights and make things run better than ever.

This will be fun!