Share
Generic PostgreSQL Setup Guide

PostgreSQL is a highly stable object-relational database management system (ORDBMS) used as the primary data store or data warehouse for many web, mobile, geospatial, and analytics applications. It provides various replication methods, which makes it suitable for use in high-availability and data distribution scenarios.

You can ingest data from your PostgreSQL database using Hevo Pipelines and replicate it to a Destination of your choice.


Data Type Mapping

Hevo maps the PostgreSQL Source data type internally to a unified data type, referred to as the Hevo Data Type, in the table below. This data type is used to represent the Source data from all supported data types in a lossless manner.

The following table lists the supported PostgreSQL data types and the corresponding Hevo data type to which they are mapped:

PostgreSQL Data Type Hevo Data Type
-    INT_2
-    SHORT
-    SMALLINT
-    SMALLSERIAL
SHORT
-    BIT(1)
-    BOOL
BOOLEAN
-    BIT(M), M>1
-    BYTEA
-    VARBIT
BYTEARRAY
-    INT_4
-    INTEGER
-    SERIAL
INTEGER
-    BIGSERIAL
-    INT_8
-    OID
LONG
-    FLOAT_4
-    REAL
FLOAT
-    DOUBLE_PRECISION
-    FLOAT_8
DOUBLE
-    BPCHAR
-    CIDR
-    CITEXT
-    DATERANGE
-    ENUM
-    HSTORE
-    INET
-    INT_4_RANGE
-    INT_8_RANGE
-    INTERVAL
-    LTREE
-    MACADDR
-    MACADDR_8
-    NUMRANGE
-    TEXT
-    TSRANGE
-    TSTZRANGE
-    UUID
-    VARCHAR
-    XML
VARCHAR
-    TIMESTAMPTZ TIMESTAMPTZ (Format: YYYY-MM-DDTHH:mm:ss.SSSSSSZ)
-    JSON
-    JSONB
-    POINT
JSON
-    DATE DATE
-    TIME TIME
-    TIMESTAMP TIMESTAMP
-    MONEY
-    NUMERIC
DECIMAL

At this time, the following PostgreSQL data types are not supported by Hevo:

  • TIMETZ

  • Arrays

  • Any other data type not listed in the table above.

Note: If any of the Source objects ingested by Hevo contain unsupported data types, Hevo fails such objects and does not load their data to the Destination.


Handling of Deletes

In a PostgreSQL database for which the WAL level is set to logical, Hevo uses the database logs for data replication. As a result, Hevo can track all operations, such as insert, update, or delete, that take place in the database. Hevo replicates delete actions in the database logs to the Destination table by setting the value of the metadata column, __hevo_is_deleted__ to True.


Source Considerations

  • If you add a column with a default value to a table in PostgreSQL, entries with it are created in the WAL only for the rows that are added or updated after the column is added. As a result, in the case of log-based Pipelines, Hevo cannot capture the column value for the unchanged rows. To capture those values, you need to:

    • Restart the historical load for the respective object.

    • Run a query in the Destination to add the column and its value to all rows.

  • Any table included in a publication must have a replica identity configured. PostgreSQL uses it to track the UPDATE and DELETE operations. Hence, these operations are disallowed on tables without a replica identity. As a result, Hevo cannot track the updates or deletes for such tables.

    By default, PostgreSQL picks the table’s primary key as the replica identity. If your table does not have a primary key, you must either define one or set the replica identity as FULL, which records the changes to all the columns in a row.


Limitations

  • Hevo does not support data replication from foreign tables, temporary tables, and views.

  • If your Source table has indexes (indices) and or constraints, you must recreate them in your Destination table, as Hevo does not replicate them. It only creates the existing primary keys.

  • Hevo does not set the __hevo_is_deleted__ field to True for data deleted from the Source table using the TRUNCATE command. This action could result in a data mismatch between the Source and Destination tables.

  • You cannot select Source objects that Hevo marks as inaccessible for data ingestion during object configuration in the Pipeline. Following are some of the scenarios in which Hevo marks the Source objects as inaccessible:

    • The object is not included in the publication (key) specified while configuring the Source.

    • The publication is defined with a row filter expression. For such publications, only those rows for which the expression evaluates to FALSE are not published to the WAL. For example, suppose a publication is defined as follows:

      CREATE PUBLICATION active_employees FOR TABLE employees WHERE (active IS TRUE);
      

      In this case, as Hevo cannot determine the changes made in the departments object, it marks the object as inaccessible.

    • The publication specified in the Source configuration does not have the privileges to publish the changes from the UPDATE and DELETE operations. For example, suppose a publication is defined as follows:

      CREATE PUBLICATION insert_only FOR TABLE employees WITH (publish = 'insert');
      

      In this case, as Hevo cannot identify the new and updated data in the employees table, it marks the object as inaccessible.


See Also

Last updated on Aug 17, 2024

Tell us what went wrong

Skip to the section