PostgreSQL
Hevo supports the following variations of PostgreSQL as a Source:
Hevo recommends logical replication as the default mode to ingest incremental data from your PostgreSQL Source database, especially for high transaction volumes. However, you can also select a different ingestion mode, such as Table, XMIN, or Custom SQL.
Refer to the documentation for your PostgreSQL variant for the steps to configure it as a Source in your Hevo Pipeline and start ingesting data.
Resolving Data Loss in Paused Pipelines
For Pipelines created with Logical Replication ingestion mode, Hevo replicates the data using the log generated by the Source. Pausing a log-based Pipeline for more than 24 hours may lead to data loss, as a result of the log being deleted. The log can get deleted due to the expiry of its retention period or limited disk storage space in the case of large log files.
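One way to gauge whether the log a paused Pipeline needs is still available is to inspect the replication slots on the Source. The query below is a minimal sketch, not a Hevo-provided check. It assumes PostgreSQL 13 or later (the `wal_status` column was added in that version) and a connection to the primary server.

```sql
-- Sketch: inspect logical replication slots and the WAL each one holds back.
-- Assumes PostgreSQL 13+ (wal_status) and a connection to the primary,
-- since pg_current_wal_lsn() cannot be called during recovery.
SELECT
    slot_name,
    active,            -- false while no consumer is connected to the slot
    wal_status,        -- 'lost' means required WAL has already been removed
    pg_size_pretty(
        pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
    ) AS retained_wal  -- WAL kept on disk on behalf of this slot
FROM pg_replication_slots
WHERE slot_type = 'logical';
```

A `wal_status` of `lost` indicates that the slot can no longer supply the missing changes, so the historical load has to be restarted to recover them.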
If data is lost after you resume a paused Pipeline, restart the historical load for all the objects to ingest the lost data. To do so, on the Pipeline Overview page:
- Select the Objects check box to select all the objects in the Pipeline. You can also select specific objects by selecting the check box next to their names.
- Select the Restart option from the MORE drop-down to start the historical data ingestion.
The historical load starts immediately. The re-ingested data does not count towards your quota consumption and is not billed.
Dropping a Replication Slot Post-Pipeline Deletion
For Pipelines with Logical Replication as the ingestion mode, Hevo creates a replication slot in the Source to record any changes. If you delete the Pipeline, Hevo automatically drops this replication slot. However, issues at the PostgreSQL Source end can occasionally prevent the slot from being dropped automatically. If that happens, use the following command to manually drop the slot:
SELECT pg_drop_replication_slot('<slot_name>');
For example, to drop a slot named test_postgreSQL_slot, use the following query:
SELECT pg_drop_replication_slot('test_postgreSQL_slot');
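If you are unsure of the slot name, or the drop command fails, it can help to list the slots first. The following is a sketch rather than a Hevo-documented procedure. `pg_replication_slots` is a standard PostgreSQL system view, and PostgreSQL refuses to drop a slot that is still active.

```sql
-- List replication slots with their current state before dropping one.
SELECT slot_name, plugin, slot_type, database, active, active_pid
FROM pg_replication_slots;

-- Drop the slot only if it is inactive; replace <slot_name> with your slot.
SELECT pg_drop_replication_slot(slot_name)
FROM pg_replication_slots
WHERE slot_name = '<slot_name>'
  AND NOT active;
```

The second query returns no rows, and drops nothing, when the slot does not exist or is still in use.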
Source Considerations
- When you delete a row in the Source table, its XMIN value is deleted as well. As a result, for Pipelines created with the XMIN ingestion mode, Hevo cannot track deletes in the Source object(s). To capture deletes, you need to restart the historical load for the respective object.
- XMIN is a system-generated column in PostgreSQL, and it cannot be indexed. Hence, to identify the updated rows in Pipelines created with the XMIN ingestion mode, Hevo scans the entire table. This action may lead to slower data ingestion and increased processing overheads on your PostgreSQL database host. Due to this, Hevo recommends that you create the Pipeline in the Logical Replication mode.
Note: The XMIN limitations specified above are applicable only to Pipelines created using the XMIN ingestion mode, which is currently available for Early Access.
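To illustrate why XMIN-based ingestion involves scanning the whole table, the sketch below shows the general shape of an xmin-filtered query. The table name `orders` and the transaction ID `1000` are hypothetical, and this is not necessarily the query Hevo issues.

```sql
-- Sketch: select rows inserted or updated by transactions after a stored
-- checkpoint. xmin is of type xid, so it is cast via text to compare it
-- numerically. Transaction ID wraparound is ignored here for simplicity.
SELECT xmin::text::bigint AS row_xmin, *
FROM orders                        -- hypothetical table
WHERE xmin::text::bigint > 1000;   -- hypothetical last-seen transaction ID

-- Inspect the plan: with no index on xmin, every row has to be examined.
-- Deleted rows are simply absent from the result, so they go undetected.
EXPLAIN
SELECT * FROM orders WHERE xmin::text::bigint > 1000;
```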