Thursday, 11 December 2014

Change Data Capture in Informatica

Change Data Capture can be used for small data integration projects which include very less workflows. We need to use Informatica mapping variable to implement the change data capture.

We will be using Informatica Mapping Variables to achieve our Change Data Capture logic.

Informatica Mapping Variable:

Informatica Mapping Variable will be created in PowerCenter Designer, which can be used in any expression in a mapping, and also in a source qualifier filter, user-defined join, or extract override, and in the Expression Editor of reusable transformations.

Mapping variable can take the starting value from:
  • Initial value
  • Default Value
  • Parameter file
  • Pre-session variable assignment
  • Value saved in the repository
In informatica, integration Service looks for the start value in the order mentioned above. Value of the mapping variable can be changed with in the session using an expression and the final value of the variable will be saved into the repository. The saved value from the repository is retrieved in the next session run and used as the session start value.

How to Set Mapping Variable Value?

You can set/change the mapping variable value with in the mapping or session using the Set Function. We need to use the set function based on the Aggregation Type of the variable. Aggregation Type of the variable can be set when the variable is declared in the mapping.
  • SetMaxVariable: Sets the variable to the maximum value of a group of values. To use this variable, the aggregation type of the mapping variable must be set to Max.
  • SetMinVariable: Sets the variable to the minimum value of a group of values. To use this variable, the aggregation type of the mapping variable must be set to Min.
  • SetCountVariable: Increments the variable value by one. In other words, it adds one to the variable value when a row is marked for insertion, and subtracts one when the row is marked for deletion. To use this variable, the aggregation type of the mapping variable must be set to Count.
  • SetVariable: Sets the variable to the configured value. At the end of a session, it compares the final current value of the variable to the start value of the variable. Based on the aggregate type of the variable, it saves a final value to the repository.

Change Data Capture Implementation:

Now, let’s start building our mapping with Change Data Capture.
Here we are going to implement Change Data Capture for CUSTOMER data load. We need to load any new customer or changed customers data to a flat file. Since the column LATEST_DT value changes for any new or updated customer record, we will be able to find the new or changed customer records using LATEST_DT column.

Firstly, create mapping variable while making mapping as shown in below image.

        $$M _END_ DATE as Date/Time



Now bring in the source and source qualifier to the mapping designer workspace. Open the source qualifier and give the filter condition to get the latest data from the source as shown below.

STG_CUST.LATEST_DT > CONVERT(DATETIME,'$$M _END_ DATE’)



Note: Latest value for the variable $$M_END_ DATE is retrieved from the repository every time the session is run.

Now map the column LATEST_DT to an expression transformation and create a variable expression as below.

        SETMAXVARIABLE($$M _END_DATE, LATEST_DT)



Note: This expression will make sure that, latest value from column LATEST_DT is stored into the repository after successful completion of the session run.

Now you can map all the remaining columns to the downstream transformation and complete all other transformation required in the mapping.
Informatica Mapping

Above explained things you need to configure Change Data Capture, Now create your workflow and run the workflow.



If you look into the session log file, you can see the mapping variable value is retrieved from the repository and used in the source SQL.

You can see mapping variable value stored in the repository, from workflow manager. Choose the session from the workspace, right click and select 'View Persistent Value’.



5 comments:

  1. Really Good blog post.provided a helpful information.I hope that you will post more updates like this Informatica Online Course

    ReplyDelete
  2. The main motive of the Big data engineering services is to spread the knowledge so that they can give more big data engineers to the world.

    ReplyDelete
  3. I feel this is a very good set of Informatica posts that are stressing on the topics of its utility and its capability to solve complex operations.

    Informatica Read JSON

    ReplyDelete