data management

Master Data Management. A way to gather data into a central data hub for a) a single point of truth b) summarized data c) historical analysis

Customer Data Integration. Includes a) uniquely identifying a customer b) arbitrary grouping of customers c) deciphering their relationships to other customers and groups

No primary keys across

Differences in attribute definitions: the same attribute may be defined differently in different systems

Differences in parent-child attributes: the same relationship may be defined differently in different systems

Although they run in batch mode, some can run in near real time. Keep the data hub in sync with the operational stores

Is there a difference between a data hub and EDW?

book: MDM and CDI for a Global Enterprise by Berson/Dubov

what is a good way to use ODS and EDW effectively?

Provides a virtualized view of a customer without creating a persistent physical image of the aggregation, perhaps using SOA or ETL

A customer may operate outside of an account or accounts. This will force identifiers tied to a customer independent of unique account numbers. This needs to be thought of in customer interactions.

what strategies would you use to expose numbers to customers? will that be a customer number or account numbers?

Find a primary key based on partial or full attributes

Discover who else is related to a given customer, similar to Google picking up relevant ads for a given email or piece of content

Being able to generate a unique key based on attributes
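One way to picture a key generated from attributes is to normalize a few identifying fields and hash them. This is only a minimal sketch: the choice of fields (name, birth date, postal code), the normalization rules, and the function names are assumptions for illustration, not something the book prescribes. Real matching engines use fuzzy and probabilistic comparison rather than an exact hash.

```python
import hashlib
import unicodedata


def normalize(value: str) -> str:
    """Lowercase, strip accents, and collapse whitespace so trivial variations match."""
    decomposed = unicodedata.normalize("NFKD", value)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return " ".join(ascii_only.lower().split())


def customer_key(name: str, birth_date: str, postal_code: str) -> str:
    """Derive a deterministic key from identifying attributes (hypothetical field choice)."""
    parts = [normalize(name), normalize(birth_date), normalize(postal_code)]
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
    return digest[:16]  # shortened for readability; keep the full digest to lower collision risk
```

The same person entered with different casing, spacing, or accents gets the same key, while any change in a field that survives normalization produces a different key. That limitation is why an exact hash can only be a first pass before real matching.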

could be the full transactional data hub and a completely self contained master of the information it manages...

why not stick to one database then???

Why a key generation service?

why not use database generated keys?

it may have partial data for a client, which means the rest must come from somewhere else

it can be updated by clients, not just read. It may have to propagate those updates back to the sources the data was originally loaded from (hub-to-source integration)

Data may be updated in an ODS requiring a sync to the data hub.

what is subject area in the context of a metadata repository?

This is a metadata table where every record in the MDM is linked to its dependent ODS records through foreign keys. Transactional safety is important.
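A rough sketch of such a cross-reference table, using an in-memory SQLite database. The table and column names (hub_customer, source_xref, source_system, source_key) are made up for illustration. The point is only that the hub row and its links to source records are written in one transaction, so a failure leaves neither behind.

```python
import sqlite3


def open_hub() -> sqlite3.Connection:
    """Create an in-memory hub with a cross-reference table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE hub_customer (
            hub_id INTEGER PRIMARY KEY,
            display_name TEXT NOT NULL
        );
        CREATE TABLE source_xref (
            hub_id INTEGER NOT NULL REFERENCES hub_customer(hub_id),
            source_system TEXT NOT NULL,
            source_key TEXT NOT NULL,
            PRIMARY KEY (source_system, source_key)
        );
    """)
    return conn


def register_customer(conn, display_name, source_records):
    """Insert a hub record and its source links atomically; returns the new hub_id."""
    with conn:  # commits on success, rolls back if any statement fails
        cur = conn.execute(
            "INSERT INTO hub_customer (display_name) VALUES (?)", (display_name,)
        )
        hub_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO source_xref (hub_id, source_system, source_key) VALUES (?, ?, ?)",
            [(hub_id, system, key) for system, key in source_records],
        )
    return hub_id
```

If a source record is already linked to another hub record, the primary key on (source_system, source_key) raises an error and the whole registration is rolled back.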

hub is not the source or owner for any entities or attributes. It just holds references to other ODSes

Data hub owns part of the data and changes to that data should be synchronized.

owns all data attributes becoming the true master in that space and propagates data up and down.

initial loads and delta loads are common strategies
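To make the distinction concrete: an initial load copies everything, while a delta load applies only what changed since the previous snapshot. The sketch below assumes snapshots are plain dictionaries keyed by a source record id, which is a simplification. Real tools often rely on change data capture or timestamps instead of comparing full snapshots.

```python
def compute_delta(previous: dict, current: dict):
    """Compare two snapshots keyed by record id; return (inserts, updates, deletes)."""
    inserts = {k: v for k, v in current.items() if k not in previous}
    updates = {k: v for k, v in current.items() if k in previous and previous[k] != v}
    deletes = [k for k in previous if k not in current]
    return inserts, updates, deletes


def apply_delta(hub: dict, inserts: dict, updates: dict, deletes: list) -> None:
    """Apply a delta to the hub's copy in place."""
    hub.update(inserts)
    hub.update(updates)
    for key in deletes:
        hub.pop(key, None)
```

An initial load is then just the special case where the previous snapshot is empty, so every record shows up as an insert.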

You may want to implement unidirectional synching as opposed to bidirectional.

Search for: Compensating transactions may be necessary

Master/slave relationships may be better for defining ownership of attributes or entities. Without that, bidirectional synching could get hairy.

Single ownership on a single data attribute is preferable.
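A small sketch of what single ownership could look like in practice: each attribute names exactly one owning system, and the hub accepts a change to an attribute only from its owner. The system names, attribute names, and the ATTRIBUTE_OWNERS mapping are invented for illustration.

```python
# Hypothetical ownership map: each attribute has exactly one master system.
ATTRIBUTE_OWNERS = {
    "address": "crm",
    "credit_limit": "billing",
}


def apply_update(hub_record: dict, source: str, changes: dict, owners: dict):
    """Apply only the changes whose attribute is owned by `source`.

    Returns (accepted, rejected) so that rejected changes can be routed
    to exception processing instead of being silently dropped.
    """
    accepted = {a: v for a, v in changes.items() if owners.get(a) == source}
    rejected = {a: v for a, v in changes.items() if owners.get(a) != source}
    hub_record.update(accepted)
    return accepted, rejected
```

With a rule like this, synching stays effectively unidirectional per attribute, which avoids two systems overwriting each other's values.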

How do transactional, summary, and historic elements work together in MDM?

In some extreme cases some data attributes have many masters. This may be queried from an attribute location service.

Metadata: recognize and address the challenge of semantic integration

The required business process granularity to be defined should be at the level of detail that is sufficient to define the logical data model of the CDI solution.

quote

Deliver an authoritative system of record for customer data that includes a complete, 360-degree view of customer data including the totality of the relationships the customer has with the organization.

Should the data hub and the EDW be the same effort?

The authors of the above book seem to think otherwise. They state

If the data warehouse is not available yet, we do not recommend mixing the MDM-CDI data hub project and a data warehousing effort, even though interdependencies between the two efforts should be well understood

They go on to say that CDI data hub will feed the EDW when one is in place.

The Data Model Resource Book

Vol 1 A library of Universal Data Models for All Enterprises

Vol 2 A library of data models for specific industries

by Len Silverston

Search for: Len Silverston data models

Evolving the CDI data model over multiple releases needs consideration

OASIS
XCRL
HL7

Contrasting MDM and Warehousing again

a) Data is cleansed during extract, transform, and load b) sources data to build a well-defined data subject area or domain to serve a particular set of applications

a) sales data warehouse b) financial data warehouse c) product data warehouse

What do you do with changing data in a data warehouse, then?

solving data quality issues in an integrated fashion across the enterprise...

what on earth does that mean??

in CDI, data quality is addressed not just during the load but also in the process of matching and linking, identification and aggregation, and data synching and reconciliation

A data hub is "referential": data quality is focused on maintaining references as opposed to the actual content.

A data warehouse cannot be referential the way a data hub can be. Is that a true statement?

A data warehouse is unidirectional whereas CDI is bidirectional. What do they mean by that?

Search for: Who is Larry English and what are information quality principles

Once a day is a norm

Instead they want customer information available in real time..

is that the only difference? what if I make the data warehouse real time? can I? why not?

DataExtend another product

They seem to suggest a referential hub for legacy corporations

They stop short of outright suggesting a "transactional hub". Not sure why.

a) record b) exception processing c) compensating transactions d) composite and complex transactions

and that it is still a young technology

have to be adaptable to the business model, rules, and semantics of a given industry.

MDM solutions are optimized for real-time operations and not for batch reporting. An important distinction.

A datamart is being recommended for reporting needs

A proven, business-domain-specific data model should be at the top of the CDI data hub criteria.

There are some solutions however which help you to realize any data model using meta models and then allow you to generate some base level services.

Enterprise Message bus - canonical message transport
Transaction Manager - record, exception management
Identity resolver
Record locator
attribute locator
Distributed Query Constructor

1. Have transaction manager receive and persist

2. Identify the keys

3. Do a distributed query after locating systems and their access paths
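The three steps above can be sketched with in-memory stand-ins for the components listed earlier (transaction manager, identity resolver, record locator, distributed query). Everything here is hypothetical: the class name, the dictionaries standing in for source systems, and the exact-match lookup on name and postal code are assumptions made to keep the sketch short.

```python
from dataclasses import dataclass, field


@dataclass
class Hub:
    log: list = field(default_factory=list)              # requests persisted by the transaction manager
    identity_index: dict = field(default_factory=dict)   # (name, postal_code) -> hub_id
    record_locator: dict = field(default_factory=dict)   # hub_id -> [(system, source_key), ...]
    sources: dict = field(default_factory=dict)          # system -> {source_key: record}

    def handle(self, request: dict) -> dict:
        # 1. Transaction manager receives and persists the request.
        self.log.append(request)

        # 2. Identity resolver maps the request attributes to a hub key.
        identity = (request["name"].lower(), request["postal_code"])
        hub_id = self.identity_index.get(identity)
        if hub_id is None:
            return {}

        # 3. Record locator finds the systems; the "distributed query" visits each one.
        view = {}
        for system, source_key in self.record_locator.get(hub_id, []):
            view[system] = self.sources[system][source_key]
        return view
```

The returned view is assembled on demand and never stored, which is the virtualized, non-persistent view of a customer that a registry-style hub provides.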

SOA doesn't show up much

a) Emerging leader in the CDI data hub space b) customer specific customizable data structures c) a few hundred basic web services d) real time services to change address, roles, relationships, grouping, alerts, matching, duplicate suspect processing e) composite transactions using web services f) pub-sub g) business object model h) interfaces i) batch framework j) originally developed for insurance industry k) currently used in financial and others

a) meta-data driven b) works with any data model c) batch and real time synching with metadata d) Hierarchy manager for relationships e) cleanse and match f) can use external tools such as Trillium g) MetaMatrix data services registry style h) originally for pharmaceutical industry

a) reference-hub b) meta-data driven federation c) real time d) organizational hierarchies e) web services f) flow-invocation g) federated data retrieval h) view records across lines of business i) auditor roles j) reporting database k) originally from medical records

Siebel Universal Application Network

Siebel Universal Customer Master

a) Customer data hub b) financial consolidation hub c) Product hub

a) analyzing customers for relationships b) Correlation Engine c) exception management d) web services

How does Purisma support bidirectional synching?

Search for: what is netweaver SAP?

Search for: SAS DataFlux

a) master customer reference database b) data quality ui and exception ui c) batch and real time sync d) rules in metadata

Search for: MultiVue

Search for: Object River

a) model driven b) enables any data model based on meta model definition c) generates all CRUD stuff d) soa e) generates basic web portal f) pub/sub on data changes

Search for: what is data profiling informatica

Search for: Similarity Systems

Search for: Acxiom an address database

Search for: Experian people database

Search for: Data Delta

Search for: Netrics

Search for: Exeros

CDI may continue to be vertical