A New Approach to Customer Data Integration, Part 2

Customer data integration (CDI) is becoming important for enterprises that want to target, acquire, develop and retain customers. To benefit from it, however, an enterprise needs to create a unified and comprehensive customer view from all of its disparate data sources, including CRM, financial, product and external data services. Once integrated, unified customer views give the entire organization the ability to drive meaningful business action within and across operational systems.

Last week, we took a look at how to create neutral, enterprise-wide customer data models and rules-based frameworks. This week, we’ll examine why CDI systems need to ensure cell-level survivorship, manage different data types, and provide proper tools for all users.

Cell-Level Survivorship

As part of its mandate to provide unified views of customers, a CDI solution must be able to determine the most reliable data — at the cell or attribute level — from multiple sources and to ensure survivorship of this data over time. Without this, the CDI solution will not be able to provide the “best version of truth” for the customer data across the organization and will fail as the system-of-record for other applications.

CDI solutions provided by application vendors have limited capabilities in this area. Typically their solutions do not support selection of the most reliable data at the cell level, instead allowing selection only at the record level. Where CDI hubs do allow for cell-level selection, it is only through static rules that do not capture the dynamic nature of reliability. For example, a phone number that has not been updated in three years is not trustworthy, even if it comes from a highly trusted source. Static rules also fail to capture all the factors that can affect the reliability of data from a given source, and hence are a poor basis for selecting the best version of the truth.

A complete solution for supporting dynamic cell-level survivorship must include a mechanism for measuring the confidence factor associated with each cell, based on its source system, change history and other business rules. The mechanism must take into account the age of the data, how much its reliability has decayed over time, and the validity of the data based on its format and completeness. The solution should permit the confidence factor to be assigned at the cell level.
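One way to picture such a confidence factor is as a product of three terms: a base trust for the source and attribute, a decay term driven by the age of the value, and a validity term for format and completeness. The sketch below is only an illustration under those assumptions. The trust table, the half-life values, the phone pattern and the function names are all hypothetical and do not describe any vendor's product.

```python
import re
from datetime import date

# Hypothetical base trust per (source system, attribute), on a 0..1 scale.
SOURCE_TRUST = {
    ("crm", "phone"): 0.9,
    ("billing", "phone"): 0.6,
}

# Hypothetical half-life in days: how quickly each attribute goes stale.
HALF_LIFE_DAYS = {"phone": 365, "name": 3650}

# Illustrative format rule: 10 to 15 digits with an optional leading plus sign.
PHONE_PATTERN = re.compile(r"^\+?\d{10,15}$")


def validity(attribute, value):
    """Score format and completeness: 0.0 if empty, 0.5 if malformed, else 1.0."""
    if not value:
        return 0.0
    if attribute == "phone":
        digits = re.sub(r"[\s\-().]", "", value)
        return 1.0 if PHONE_PATTERN.match(digits) else 0.5
    return 1.0


def confidence(source, attribute, value, last_updated, today):
    """Cell-level confidence = base trust * age decay * validity."""
    base = SOURCE_TRUST.get((source, attribute), 0.5)
    age_days = max((today - last_updated).days, 0)
    half_life = HALF_LIFE_DAYS.get(attribute, 730)
    decay = 0.5 ** (age_days / half_life)
    return base * decay * validity(attribute, value)


today = date(2024, 1, 1)
# A three-year-old phone number from the more trusted source...
stale = confidence("crm", "phone", "415-555-0100", date(2021, 1, 1), today)
# ...versus a one-month-old number from a less trusted source.
fresh = confidence("billing", "phone", "415-555-0199", date(2023, 12, 1), today)
```

With these illustrative numbers, the recently updated value from the less trusted source scores higher than the stale value from the more trusted one, which is the kind of dynamic behavior that static rules cannot express.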

For instance, consider a situation in which source system A is more reliable for “customer name,” but source system B is more reliable for “phone number.” Before merging or updating the records, a good CDI solution would calculate and compare the current confidence level for each cell on each record and determine the most reliable data value to maintain in the consolidated “best of breed” record. This ensures that the consolidated record represents the best version of the truth from all the available data sources across the organization.
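The merge described above can be sketched as a per-attribute comparison rather than a per-record one. This is a minimal illustration, assuming each cell already carries a precomputed confidence value. The `Cell` type, the `merge_records` function and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    value: str
    source: str
    confidence: float  # assumed to be computed elsewhere, on a 0..1 scale


def merge_records(records):
    """Build a consolidated record by keeping, for each attribute,
    the cell with the highest confidence across all input records."""
    best = {}
    for record in records:
        for attribute, cell in record.items():
            current = best.get(attribute)
            if current is None or cell.confidence > current.confidence:
                best[attribute] = cell
    return best


# Source A is more reliable for the name, source B for the phone number.
record_a = {
    "customer_name": Cell("Jane Q. Smith", "A", 0.9),
    "phone": Cell("415-555-0100", "A", 0.4),
}
record_b = {
    "customer_name": Cell("J Smith", "B", 0.5),
    "phone": Cell("415-555-0199", "B", 0.8),
}

merged = merge_records([record_a, record_b])
```

Record-level survivorship would have to keep all of A or all of B. Here the consolidated record takes the name from A and the phone number from B, and each surviving cell still records which source it came from.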

Hybrid Treatment

An ideal CDI solution must recognize that there are three very different types of customer data that exist within an enterprise:

  • Master reference data is the foundational entity data (such as name and address) that is critical for uniquely identifying a customer across multiple systems and channels.
  • Relationship data defines the relationships, affiliations and hierarchies among various entities (such as individual to organization, organization to organization, or individuals within households).
  • Activity data is the data generated by customer activities (such as a financial transaction or a call-center interaction) as well as data derived or inferred from such activities (such as aggregated account balances or a customer profitability index).

Each of these data types has separate characteristics and challenges and therefore requires different treatment within the CDI solution. For example, reference data, which exists in every system or data repository, is often conflicting and does not have a system of record. Activity data, on the other hand, usually does have a system of record, so there is little conflict to reconcile. Also, while reference data is a very small subset of customer data and hence has a smaller volume, the volume of activity data can be very large.

Relationship data can be managed effectively only after the underlying conflicts in reference data have been resolved. It also requires visualization tools that can display the complex relationships between entities. This means that a CDI solution must first build a persistent and trustworthy hub for reference data that can serve as the system of record before aggregating other types of data. Also, while reference data and relationship data must be aggregated in a persistent hub, not all activity data needs to be brought into the hub; much of it may simply need to be associated with the corresponding aggregated reference data. Thus, by offering the most appropriate treatment for each data type, a CDI solution can provide the most reliable, efficient and scalable platform for an enterprise.
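The separation of the three data types can be sketched as a simple data structure. In this hypothetical model, the hub stores reference and relationship data persistently, while activity data stays in its own system of record and is only linked to the hub by key. All class, field and system names are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ReferenceRecord:
    customer_id: str
    name: str
    address: str


@dataclass(frozen=True)
class Relationship:
    from_id: str
    to_id: str
    kind: str  # e.g. "individual_to_organization" or "household_member"


@dataclass(frozen=True)
class ActivityLink:
    customer_id: str
    system: str        # system of record that owns the activity data
    external_key: str  # key of the activity data in that system


@dataclass
class CustomerHub:
    reference: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)
    activity_links: list = field(default_factory=list)

    def _require(self, customer_id):
        if customer_id not in self.reference:
            raise KeyError(f"no reference record for {customer_id}")

    def add_reference(self, record):
        self.reference[record.customer_id] = record

    def relate(self, from_id, to_id, kind):
        # Relationships are only accepted between resolved reference records.
        self._require(from_id)
        self._require(to_id)
        self.relationships.append(Relationship(from_id, to_id, kind))

    def link_activity(self, customer_id, system, external_key):
        # Activity data is associated with the hub, not copied into it.
        self._require(customer_id)
        self.activity_links.append(ActivityLink(customer_id, system, external_key))

    def activity_for(self, customer_id):
        return [a for a in self.activity_links if a.customer_id == customer_id]


hub = CustomerHub()
hub.add_reference(ReferenceRecord("C1", "Jane Q. Smith", "1 Main St"))
hub.add_reference(ReferenceRecord("ORG1", "Acme Corp", "2 Market St"))
hub.relate("C1", "ORG1", "individual_to_organization")
hub.link_activity("C1", "billing", "ACCT-001")
```

The check in `relate` mirrors the ordering argued above: a relationship cannot be recorded until both parties exist as reference records in the hub.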

None of the CDI solutions offered by the application vendors provides separation of data types. Their applications do not need to treat these data types differently; consequently, the underlying data model and associated tools are not built to support separation of these data types. This burdens the CDI solution with a lot of unnecessary data and adds more complexity to an already complex data management process.

Manageability and Exception Handling

A complete CDI solution must provide a rich set of easy-to-use GUI tools for all its users (system designer, administrator, data steward, etc.) to manage the system implementation as well as the entire life cycle of the data and its exceptions. These tools must be a central aspect of the CDI solution and meet the differing requirements of its users based on their roles. For example, data administration tasks (rules management, system configuration, data transfers, etc.) require a higher level of technical expertise, while data content management tasks (data exception management, data audits, etc.) require less knowledge of the system but easier access to the data and the metadata, and hence require a different set of tools. By providing dedicated tools for each of these tasks, a CDI solution can significantly increase the productivity of its users, make the solution far more manageable, and lower the cost of ownership for the entire system.

Application vendors often do not provide comprehensive data stewardship consoles for managing the life cycle of data and exceptions. This is because application vendors have primarily built tools for end users and do not understand the requirements of a data steward. As a result, enterprises need to invest in additional IT resources and custom development, which raises implementation and maintenance costs and leads to a higher total cost of ownership.


CDI is a strategic investment for most enterprises because of its impact on many facets of their business. While it is natural for enterprises to consider CDI solutions from application vendors because of their existing relationships and past successes with other process improvement exercises, this two-part article highlights some critical limitations of these CDI solutions that need to be assessed carefully before choosing such a solution.

A rigid system that is hard to adopt and even harder to adapt to ongoing changes can cost far more than initially planned, and its impact is diminished if it cannot capture and manage all the customer information present in the enterprise.

Anurag Wadehra is the Vice President of Marketing at Siperian Inc., a leading customer data integration and management provider. For further information, visit www.siperian.com.
