Migration of Data
from a Legacy System to an Existing Salesforce System Using Talend
This document describes the steps involved in migrating data
from a legacy system to an existing Salesforce system.
Step 1: Understanding the Legacy System
Step 2: Understanding the Existing Salesforce System
Step 3: Identifying the Staging Area
Step 4: Analyzing the Staging Data
Step 5: Updating the Salesforce System Data Model
Step 6: Building a Common Codes and Values Table
Step 7: Working on Salesforce Triggers and Validations
Step 8: Talend's Role in Data Migration
Step 9: Performing the Imports
1. Understanding the Legacy System:
The legacy system is an old application that the business has owned
and trusted for years. In this case, the legacy application runs on a Java
platform with an Oracle database as its back end.
The first major step is to establish connectivity to the legacy
system. Once the connection is in place and the data is accessible, we need to
understand the functionality of the current application and its supporting
applications.
Next, identify the entities in the legacy system and validate how
many of them actually carry business-related information. Eliminating unwanted
entities during the initial analysis of the source system avoids confusion
in the later stages of the project.
2. Understanding the Salesforce System:
Migrating data to Salesforce is always challenging, and if
the Salesforce system is already live, the challenge is doubled.
The Salesforce back end has two kinds of entities: standard and
custom. Entities used in common across all Salesforce applications are called
standard objects, while entities designed specifically for the business are
called custom objects.
Since we already understand the source system, we need to relate
its business processes to the Salesforce system. One thing to remember
throughout the analysis: "The change is only in the application, not in the
business." So we need to incorporate the source system's logic into the
existing Salesforce logic.
Collecting the entities and their related business information
from a Salesforce perspective provides the input for the data migration
analysis.
3. Identifying the Staging Area:
Once the source and target system data is analyzed, the next
step is to identify the staging area. If the source is a relational database,
it is better to build the staging layer on the same database itself.
This avoids data loss, and the quality of the data remains the same in
both the source and the staging layer.
If the source system is a mainframe or a flat-file
system, build the staging layer on a relational database instead. This is
where the first job comes in: building the ETL from source to stage. Following
standard ETL practice, the staging layer is a one-to-one replica of the source system.
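The source-to-stage job can be sketched as a straight table-for-table copy. This is a minimal illustration using SQLite in place of the real source and staging databases; the table names (legacy_accounts, stg_legacy_accounts) are hypothetical, not from the actual legacy system.

```python
import sqlite3

def copy_to_stage(conn, source_table, stage_table):
    """Replicate a source table one-to-one into the staging layer."""
    conn.execute(f"DROP TABLE IF EXISTS {stage_table}")
    # CREATE TABLE ... AS SELECT copies both structure and data.
    conn.execute(f"CREATE TABLE {stage_table} AS SELECT * FROM {source_table}")
    conn.commit()
    return conn.execute(f"SELECT COUNT(*) FROM {stage_table}").fetchone()[0]

# Simulate a small legacy source table, then replicate it to stage.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE legacy_accounts (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO legacy_accounts VALUES (?, ?)",
                 [(1, "Acme"), (2, "Globex")])
count = copy_to_stage(conn, "legacy_accounts", "stg_legacy_accounts")
```

Because the stage is an exact replica, any data-quality issue found later can be traced back to the source without ambiguity.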
4. Analyzing the Staging Data:
From a data migration perspective, not all data in the source
system is needed in the staging layer or in the Salesforce system.
Once the required entities are available in staging, the most
challenging part of the entire project begins. The live Salesforce system
has a completely different data model, and we need to fit the source data
into that existing Salesforce data model.
First, identify the Salesforce objects that can be rightly
mapped to the source tables. Once the source and target entities are mapped,
the next step is field-level mapping.
During field-level mapping, we may find many fields in the
Salesforce objects for which no matching field exists in the source system.
Those can be ignored in the initial analysis.
Fields that can be mapped are mapped directly. For source fields
with no Salesforce counterpart, we need to create new custom fields on the
Salesforce objects.
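The field-level mapping described above can be captured as a simple dictionary from staging column names to Salesforce field names; source fields with no entry are collected as candidates for new custom fields. All names in this sketch (cust_name, BillingCity, legacy_flag, etc.) are illustrative assumptions.

```python
# Hypothetical staging-column -> Salesforce-field map.
FIELD_MAP = {
    "cust_name":  "Name",
    "cust_phone": "Phone",
    "cust_city":  "BillingCity",
}

def map_record(staging_row):
    """Return (payload, unmapped): the Salesforce payload plus any
    source fields with no matching Salesforce field yet."""
    payload, unmapped = {}, []
    for src_field, value in staging_row.items():
        if src_field in FIELD_MAP:
            payload[FIELD_MAP[src_field]] = value
        else:
            unmapped.append(src_field)  # candidate for a new custom field
    return payload, unmapped

payload, unmapped = map_record(
    {"cust_name": "Acme", "cust_phone": "555-0100", "legacy_flag": "Y"})
```

Running the whole staging extract through such a map quickly shows which new custom fields need to be created before the load.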
A Common Codes and Values Table:
It is common for the codes and values used in the source to have
different values in the target, and it would be tedious to use decode
statements everywhere.
It is better to create a lookup reference table manually, like the
one below, and consult it whenever a Salesforce value is needed.
| source table | source column | value | sfdc object | sfdc field | field value |
| src_tab_1    | src_col_1     | 1     | Tgt_obj_1   | Tgt_col_1  | One         |
| src_tab_1    | src_col_2     | 2     | Tgt_obj_1   | Tgt_col_2  | two         |
| src_tab_1    | src_col_3     | 3     | Tgt_obj_1   | Tgt_col_3  | three       |
| src_tab_1    | src_col_4     | 4     | Tgt_obj_1   | Tgt_col_4  | four        |
| src_tab_1    | src_col_5     | 5     | Tgt_obj_1   | Tgt_col_5  | five        |
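In an ETL transformation step, the reference table above can be loaded into a lookup keyed by (source table, source column, source value), so every decode goes through one place. This sketch hardcodes the placeholder rows from the table; in practice the lookup would be read from the reference table itself.

```python
# Codes-and-values reference table as a lookup dictionary:
# (source table, source column, source value) -> (sfdc object, sfdc field, field value)
CODE_LOOKUP = {
    ("src_tab_1", "src_col_1", "1"): ("Tgt_obj_1", "Tgt_col_1", "One"),
    ("src_tab_1", "src_col_2", "2"): ("Tgt_obj_1", "Tgt_col_2", "two"),
    ("src_tab_1", "src_col_3", "3"): ("Tgt_obj_1", "Tgt_col_3", "three"),
    ("src_tab_1", "src_col_4", "4"): ("Tgt_obj_1", "Tgt_col_4", "four"),
    ("src_tab_1", "src_col_5", "5"): ("Tgt_obj_1", "Tgt_col_5", "five"),
}

def decode(src_table, src_column, src_value):
    """Translate a source code into its Salesforce target, or None
    if the code is not in the reference table."""
    return CODE_LOOKUP.get((src_table, src_column, str(src_value)))

result = decode("src_tab_1", "src_col_2", 2)
```

Unmatched codes come back as None, which makes it easy to report source values that still need a reference-table entry.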
Talend's Role in Data Migration:
When migrating from a more complex source with multiple relational
tables, such as Access, MySQL, or MS SQL Server, using Data Loader can be
time-consuming and error-prone. This is where Talend comes to the rescue.
· Once a Talend job is set up, you can run it as many times as you
like, and each run takes a matter of minutes. This allows an iterative and
flexible approach to data migration projects.
· Talend comes with built-in connectors for pretty much any data
source you can think of (including SQL Server).
· Talend natively supports the Salesforce.com Bulk Data API.
· Talend is open source and completely free!
Working on Salesforce Triggers and Validations:
The Salesforce application has two major constraints with regard to
data migration: validation rules and triggers.
Triggers:
These are functions built into the Salesforce application that fire
on certain actions. If the data migration touches any of the fields covered by
these triggers, those records will be rejected during the load. So the key
step before migration is to SWITCH OFF the triggers.
Validation Rules:
Validation rules come into the picture mainly in an existing
Salesforce system. When populating certain information, we may need to supply
a few extra values that are not available in the source system. It is the
data analyst's job to hardcode those fields with values that are valid and
can be populated into Salesforce. This way we do not violate the validation
rules.
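Those analyst-approved hardcoded values can be applied as defaults during the transformation, so any required field missing from the source is filled before the load hits a validation rule. The field names and default values below are illustrative assumptions, not real rules from this project.

```python
# Hypothetical analyst-approved defaults for required Salesforce fields
# that have no counterpart in the legacy source.
REQUIRED_DEFAULTS = {
    "LeadSource": "Legacy Migration",
    "Status__c":  "Migrated",
}

def apply_defaults(payload):
    """Return a copy of the payload with missing required fields
    filled in; values already present are never overwritten."""
    out = dict(payload)
    for field, default in REQUIRED_DEFAULTS.items():
        out.setdefault(field, default)
    return out

record = apply_defaults({"Name": "Acme", "Status__c": "Active"})
```

Using setdefault keeps genuine source values intact while still satisfying validation rules on records that lack them.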
Performing the Imports:
The final step is to push the data to Salesforce. The prepared data
in the Salesforce replica (staging) schema is the source, and the target is
the Salesforce system itself.
In an existing system, some of the information to be migrated
already exists, so a few loads must be run as upserts. An upsert in Salesforce
is possible only when the object has an external ID field, which works much
like a primary key in a traditional database system.
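An upsert keyed on an external ID can be sketched as follows. This assumes the simple_salesforce Python library, whose upsert call takes a key of the form "ExternalIdField/value"; the object name Account and field Legacy_Id__c are hypothetical.

```python
def upsert_path(ext_id_field, ext_id_value):
    """Build the 'ExternalIdField/value' key that identifies the record
    to insert or update during an upsert."""
    return f"{ext_id_field}/{ext_id_value}"

# With a live connection, the load itself would look roughly like:
#   from simple_salesforce import Salesforce
#   sf = Salesforce(username=..., password=..., security_token=...)
#   sf.Account.upsert(upsert_path("Legacy_Id__c", "A-1001"),
#                     {"Name": "Acme"})

key = upsert_path("Legacy_Id__c", "A-1001")
```

If a record with that external ID exists it is updated; otherwise it is inserted, which is exactly the behavior needed when loading into a live org.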
There are cases where third-party ETL tools like Talend,
Informatica, etc. do not work as expected. In such cases, as long as it is a
simple insert, we can always fall back on the Apex Data Loader (Salesforce's
native tool) to do the same insert activity.



