PostgreSQL (Edge)
Hevo Edge supports the following variations of PostgreSQL as a Source:
Handling Source Partitioned Tables
PostgreSQL allows partitioning tables into smaller pieces, where each piece is stored as a separate table in the database. For PostgreSQL version 12 and above, Hevo Edge supports loading data from your partitioned tables, and for this, the Split table partition option has been provided. You can enable or disable this option while configuring your PostgreSQL Source (any variant) in the Edge Pipeline. This option determines whether data from your partitioned table is loaded into a single Destination table or independent tables. However, for PostgreSQL versions 13 onwards, Hevo also uses the value of the parameter publish_via_partition_root in the specified publication key to determine this behavior.
The following table explains Edge’s behavior for loading data from partitioned tables based on these two parameters:
PostgreSQL Version: 12
Value of publish_via_partition_root |
Split table partition Enabled? | Hevo Behavior |
---|---|---|
Not applicable | No (FALSE) | An error message is displayed, as this configuration is not supported. |
Yes (TRUE) | The Source table’s partitions are treated as separate entities, and data from them is loaded into individual tables at the Destination. |
PostgreSQL Version: 13 and above
Value of publish_via_partition_root |
Split table partition Enabled? | Hevo Behavior |
---|---|---|
TRUE | Yes (TRUE) | An error message is displayed, as this configuration is not supported. |
TRUE | No (FALSE) | The partitioned table is treated as a single entity, and its data is loaded into a single Destination table. |
FALSE | No (FALSE) | An error message is displayed, as this configuration is not supported. |
FALSE | Yes (TRUE) | The Source table’s partitions are treated as separate entities, and data from them is loaded into individual tables at the Destination. |
Handling Toast Data
PostgreSQL allows for columns to store large amounts of data. For this, it uses The Oversized-Attribute Storage Technique (TOAST), where if the column value exceeds a certain threshold (around 8 KB, the default page size in PostgreSQL), it is compacted and may also be stored in a separate TOAST table. Such a column is referred to as a TOASTed column.
Hevo Edge identifies the toasted columns in the ingested data and replicates data from them to your Destination tables using the merge operation.
Note: Hevo Edge does not replicate data from toasted columns if your Pipeline loads data in the Append mode.
Resolving Data Loss in Paused Pipelines
For log-based Edge Pipelines created with any variant of the PostgreSQL Source, the data to be replicated is identified from the write-ahead logs (WAL) by the publications created on the database tables. Hence, disabling the log-based Pipeline may lead to data loss, as the corresponding WAL segment may have been deleted. The log can get deleted due to the expiry of its retention period or limited storage space in the case of large log files.
If you notice a data loss in your Edge Pipeline after enabling it, resync the Pipeline. The Resync Pipeline action restarts the historical load for all the active objects in your Pipeline, thus recovering any lost data.
Note: The re-ingested data does not count towards your quota consumption and is not billed.