Wednesday, August 6, 2014

Migration of Data from a Legacy system to an Existing Salesforce System Using Talend


This document describes the steps involved in migrating data from a legacy system to an existing Salesforce system.
Step 1: Understanding the Legacy System
Step 2: Understanding the Existing Salesforce System
Step 3: Identifying the Staging Area
Step 4: Analyzing the Staging Data
Step 5: Updating the Salesforce Data Model
Step 6: Building a Common Codes and Values Table
Step 7: Working on Salesforce Triggers and Validations
Step 8: Talend’s Role in Data Migration
Step 9: Performing the Imports


1.       Understanding the Legacy System:
A legacy system is an older application that the business has owned and relied on for years, and whose functionality it trusts. In this case, the legacy application runs on a Java platform with an Oracle database as its back end.
The first major step is to establish connectivity to the legacy system. Once the connection is in place and the data is available, we need to understand the functionality of the current application and its supporting applications.
Identify the entities in the legacy system and validate how many of them actually carry business-related information. Eliminating unwanted entities during the initial analysis of the source system avoids confusion in the later stages of the project.

2.      Understanding the Salesforce System:
Migrating data to Salesforce is always challenging, and if the Salesforce system is live, the challenge is doubled.
The Salesforce back end has two major kinds of entities: standard and custom. Entities that are common across all Salesforce applications are called standard entities, and entities designed specifically for the business are called custom entities.
Since we now have an understanding of the source system, we need to relate its business to the Salesforce system. One thing to remember throughout the analysis: “The change is only in the application, not in the business.” So we need to incorporate the source system logic into the existing Salesforce logic.
Collecting the entities and their related business information from a Salesforce perspective provides the input for the data migration analysis.


3.       Identifying the Staging Area:
Once the source and target system data has been analyzed, the next step is to identify the staging area. If the source is a relational database, it is better to build the staging layer on the same database. This avoids data loss and keeps the data quality identical between the source and the staging layer.
If the source system is a mainframe or a flat-file system, build the staging layer on a relational database. Here comes the first job: building the ETL from source to stage. As in standard ETL practice, this staging layer is a replica of the source system.
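The source-to-stage load can be sketched as follows. This is a minimal illustration assuming a flat-file source; the table and column names (stg_accounts, acct_id, acct_name, region) are hypothetical, and an in-memory SQLite database stands in for the real staging database:

```python
import csv
import io
import sqlite3

# Hypothetical flat-file extract from the legacy system.
source_file = io.StringIO(
    "acct_id,acct_name,region\n"
    "A001,Acme Corp,EMEA\n"
    "A002,Globex,APAC\n"
)

conn = sqlite3.connect(":memory:")  # stand-in for the staging database
conn.execute(
    "CREATE TABLE stg_accounts (acct_id TEXT, acct_name TEXT, region TEXT)"
)

# The staging table is a 1:1 replica of the source layout -- no transformation
# happens at this step, so data quality stays identical to the source.
rows = list(csv.DictReader(source_file))
conn.executemany(
    "INSERT INTO stg_accounts VALUES (:acct_id, :acct_name, :region)", rows
)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM stg_accounts").fetchone()[0]
print(count)  # -> 2
```

In a real project this job would be built in Talend (or plain SQL when source and stage share a database), but the shape is the same: read the source as-is, write an untransformed replica.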
4.       Analyzing the Staging Data
From a data migration perspective, not all data in the source system is needed in the staging layer, nor in the Salesforce system.
Once the required entities are available in the staging layer, analyze them to identify which records and fields actually need to be migrated.

Step 5: Updating the Salesforce Data Model
The most challenging part of the entire project comes here. The live Salesforce system has a completely different data model, and we need to fit the source data into it.
We need to identify the Salesforce objects that can be rightly mapped to the source tables.
Once the source and target entities are mapped, the next step is field-level mapping.
While doing the field-level mapping, we may find many fields in the Salesforce objects for which no matching field exists in the source system. Those can be ignored in the initial analysis.
The fields that can be mapped are mapped directly. For the source fields that cannot be mapped, we need to create new fields in the Salesforce objects.
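The field-level mapping can be captured as a simple translation table. This sketch uses hypothetical field names; the `__c` suffix is the convention Salesforce uses for custom fields, which is where the newly created fields would land:

```python
# Hypothetical mapping from staged source fields to the Account object.
field_map = {
    # source field    -> Salesforce field
    "acct_name": "Name",           # maps to a standard field
    "acct_phone": "Phone",         # maps to a standard field
    "legacy_region": "Region__c",  # no source match existed: new custom field
}

def to_salesforce_record(source_row):
    """Translate one staged source row into a Salesforce-shaped record."""
    return {sf_field: source_row[src_field]
            for src_field, sf_field in field_map.items()}

record = to_salesforce_record(
    {"acct_name": "Acme Corp", "acct_phone": "555-0100", "legacy_region": "EMEA"}
)
print(record["Region__c"])  # -> EMEA
```

Keeping the mapping in one place like this also gives the data analyst a single artifact to review with the business before any load runs.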

Building a Common Codes and Values Table
It is common that the codes and values used in the source have different values in the target. It would be tedious to write decode statements everywhere.
It is better to create a lookup reference table manually, like the one below, and consult it whenever a Salesforce value is needed.
source table | source column | value | sfdc object | sfdc field | field value
src_tab_1    | src_col_1     | 1     | Tgt_obj_1   | Tgt_col_1  | One
src_tab_1    | src_col_2     | 2     | Tgt_obj_1   | Tgt_col_2  | two
src_tab_1    | src_col_3     | 3     | Tgt_obj_1   | Tgt_col_3  | three
src_tab_1    | src_col_4     | 4     | Tgt_obj_1   | Tgt_col_4  | four
src_tab_1    | src_col_5     | 5     | Tgt_obj_1   | Tgt_col_5  | five
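Applying the reference table during the load can be sketched like this. The lookup rows mirror the illustrative table above (they are not real system names), and unmapped codes pass through unchanged:

```python
# Codes-and-values lookup, keyed the same way as the reference table:
# (source table, source column, source value) -> target field value.
lookup = {
    ("src_tab_1", "src_col_1", "1"): "One",
    ("src_tab_1", "src_col_2", "2"): "two",
    ("src_tab_1", "src_col_3", "3"): "three",
}

def translate(table, column, value):
    """Return the Salesforce value for a source code, or the raw value
    unchanged when no mapping exists."""
    return lookup.get((table, column, value), value)

print(translate("src_tab_1", "src_col_1", "1"))  # -> One
print(translate("src_tab_1", "src_col_9", "9"))  # unmapped: passes through as 9
```

Because the lookup is data rather than code, new code translations can be added by the analyst without touching the ETL jobs, which is exactly why this beats scattering decode statements.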

Talend’s Role in Data Migration:
When migrating from a more complex source with multiple relational tables, such as Access, MySQL, or MS SQL Server, using Data Loader can be time-consuming and error-prone. This is where Talend comes to the rescue.

·         Once a Talend job is set up, you can run it as many times as you like, and each run takes a matter of minutes. This allows an iterative and flexible approach to data migration projects.
·         Talend comes with built-in connectors for pretty much any data source you can think of (including SQL Server).
·         Talend natively supports the Salesforce.com Bulk Data API.
·         Talend is open source and completely free!


Working on Salesforce Triggers and Validations:
The Salesforce application has two major constraints with regard to data migration: validation rules and triggers.
Triggers:
These are functions built into the Salesforce application that fire on certain actions. If the data migration touches any of the fields covered by these triggers, those records will be rejected during the load. So the key thing for migration is: SWITCH OFF the triggers.
Validation Rules:
Validation rules come into the picture only for an existing Salesforce system. When populating certain information, we may need to populate a few extra fields that are not available in the source system. It is the data analyst’s job to hardcode a few values that look valid and can be populated to Salesforce. This way we do not violate the validation rules.

Performing the Imports:
The final step is to push the data to Salesforce. The final data in the Salesforce replica schema is the source, and the target is the Salesforce system.
In the case of an existing system, we need to migrate certain existing information, so a few loads have to be upserts. An upsert in Salesforce can be done only when there is an external ID field in Salesforce; it acts much like a primary key in a traditional database system.
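The upsert semantics keyed on an external ID can be sketched in plain Python. `Legacy_Id__c` is a hypothetical external ID field, and the dictionary below stands in for the records already in the live org:

```python
# Records already in the live Salesforce org, keyed by external ID.
existing = {
    "L-100": {"Legacy_Id__c": "L-100", "Name": "Acme Corp"},
}

def upsert(records, incoming):
    """Update a record when its external ID already exists, insert otherwise --
    the same match-or-create behavior Salesforce performs on an upsert call."""
    for rec in incoming:
        records[rec["Legacy_Id__c"]] = rec  # same key: update; new key: insert
    return records

upsert(existing, [
    {"Legacy_Id__c": "L-100", "Name": "Acme Corporation"},  # updates L-100
    {"Legacy_Id__c": "L-200", "Name": "Globex"},            # inserts L-200
])
print(len(existing))              # -> 2
print(existing["L-100"]["Name"])  # -> Acme Corporation
```

In the real load, the external ID is what lets the tool decide update-versus-insert without ever knowing the Salesforce record IDs, which is why the field must exist before the upsert loads run.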
There are cases where third-party ETL tools like Talend, Informatica, etc. do not work as expected. In such cases, we can always fall back on the Apex Data Loader (Salesforce’s native tool) for the same insert activity, provided it is a simple insert.






Regards,
Naveen
ETL Data Analyst