Loading Data in a Database Destination
Hevo replicates the data from your Source system to the database Destination based on the primary keys defined in the Destination table. As part of loading the data, deduplication is done to ensure that only unique records are replicated and duplicates are dropped. This deduplication is done using the value of the primary key.
If the Source table is mapped to an existing Destination table that does not have a primary key, you need to set one from the Schema Mapper. However, only the records loaded after the primary key is defined are deduplicated. Any duplicate records that already exist in the Destination table remain unchanged.
If you want to remove existing duplicated records, you must first set the primary key, truncate the Destination table, and then restart the historical load for the corresponding object.
Note: Truncating the table will permanently delete all existing data.
Deduplicating Data Loaded to the Database
The key steps involved in the deduplication process are:
- Load data into a temporary table.
- Add metadata fields to the Destination schema.
- Apply the data from the temporary table to the Destination table in one of the following ways:
  - Update existing rows with the ingested Events if the primary key already exists.
  - Append the ingested Events as new rows if the primary key does not exist.

  Note: Deletion of an Event is handled as an update, by setting the value of the __hevo__marked_deleted field to True.
- Delete the temporary table.
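The steps above can be sketched in Python, using an in-memory SQLite database as a stand-in for the Destination. This is only an illustration of the pattern, not Hevo's actual implementation: the `orders` table, its `id` and `status` columns, the `staging` table name, and the `load_batch` function are all hypothetical. Keeping the most recently ingested Event when a batch contains several Events with the same primary key is also an assumption.

```python
import sqlite3


def load_batch(conn, events):
    """Apply a batch of Events to a hypothetical `orders` table.

    Each Event is a dict with an `id` (the primary key), an optional
    `status`, and an optional `deleted` flag.
    """
    cur = conn.cursor()

    # Step 1: load the batch into a temporary table.
    cur.execute("CREATE TEMP TABLE staging (id INTEGER, status TEXT, deleted INTEGER)")
    cur.executemany(
        "INSERT INTO staging (id, status, deleted) VALUES (?, ?, ?)",
        [(e["id"], e.get("status"), int(e.get("deleted", False))) for e in events],
    )
    # Assumption: within a batch, keep only the last Event per primary key.
    cur.execute(
        "DELETE FROM staging "
        "WHERE rowid NOT IN (SELECT MAX(rowid) FROM staging GROUP BY id)"
    )

    # Step 2: add the metadata field to the Destination schema if it is missing.
    columns = {row[1] for row in cur.execute("PRAGMA table_info(orders)")}
    if "__hevo__marked_deleted" not in columns:
        cur.execute(
            "ALTER TABLE orders ADD COLUMN __hevo__marked_deleted INTEGER DEFAULT 0"
        )

    # Step 3a: update rows whose primary key already exists.
    # A delete Event carries no status, so COALESCE keeps the stored value
    # and only the deletion marker changes.
    cur.execute(
        """
        UPDATE orders
        SET status = COALESCE(
                (SELECT s.status FROM staging s WHERE s.id = orders.id), status),
            __hevo__marked_deleted =
                (SELECT s.deleted FROM staging s WHERE s.id = orders.id)
        WHERE id IN (SELECT id FROM staging)
        """
    )

    # Step 3b: append Events whose primary key does not exist yet.
    cur.execute(
        """
        INSERT INTO orders (id, status, __hevo__marked_deleted)
        SELECT s.id, s.status, s.deleted
        FROM staging s
        WHERE s.id NOT IN (SELECT id FROM orders)
        """
    )

    # Step 4: delete the temporary table.
    cur.execute("DROP TABLE staging")
    conn.commit()
```

For example, calling `load_batch` with Events for `id` 1 and `id` 2 on a table that already holds `id` 1 updates the existing row and appends one new row. A later Event of the form `{"id": 1, "deleted": True}` leaves the row in place and only sets its `__hevo__marked_deleted` value.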
The Data Loading Process is illustrated below.
Note: The CPU time and storage space of the Destination are consumed for the duration of the data loading process.
Data Loading Process
The following diagram illustrates the process of loading data to a Database Destination: