Share
Google Cloud PostgreSQL (Edge) Setup Guide

Google Cloud PostgreSQL is a fully managed relational database service provided by Google Cloud Platform (GCP), hosting a PostgreSQL database with high availability, scalability, and security. Google Cloud takes care of the infrastructure management tasks, such as provisioning, backups, and replication, allowing you to focus on deploying, managing, and scaling your PostgreSQL database.

You can ingest data from your Google Cloud PostgreSQL database using Hevo Pipelines and replicate it to a Destination of your choice.


Data Type Mapping

Hevo maps the PostgreSQL Source data type internally to a unified data type, referred to as the Hevo Data Type, in the table below. This data type is used to represent the Source data from all supported data types in a lossless manner.

The following table lists the supported PostgreSQL data types and the corresponding Hevo data type to which they are mapped:

PostgreSQL Data Type Hevo Data Type
-    INT_2
-    SHORT
-    SMALLINT
-    SMALLSERIAL
SHORT
-    BIT(1)
-    BOOL
BOOLEAN
-    BIT(M), M>1
-    BYTEA
-    VARBIT
BYTEARRAY
-    INT_4
-    INTEGER
-    SERIAL
INTEGER
-    BIGSERIAL
-    INT_8
-    OID
LONG
-    FLOAT_4
-    REAL
FLOAT
-    DOUBLE_PRECISION
-    FLOAT_8
DOUBLE
-    BPCHAR
-    CIDR
-    CITEXT
-    DATERANGE
-    ENUM
-    HSTORE
-    INET
-    INT_4_RANGE
-    INT_8_RANGE
-    INTERVAL
-    LTREE
-    MACADDR
-    MACADDR_8
-    NUMRANGE
-    TEXT
-    TSRANGE
-    TSTZRANGE
-    UUID
-    VARCHAR
-    XML
VARCHAR
-    TIMESTAMPTZ TIMESTAMPTZ (Format: YYYY-MM-DDTHH:mm:ss.SSSSSSZ)
-    JSON
-    JSONB
-    POINT
JSON
-    DATE DATE
-    TIME TIME
-    TIMESTAMP TIMESTAMP
-    MONEY
-    NUMERIC
DECIMAL

At this time, the following PostgreSQL data types are not supported by Hevo:

  • TIMETZ

  • Arrays

  • Any other data type not listed in the table above.

Note: If any of the Source objects contain data types that are not supported by Hevo, they are marked as unsupported during object configuration in the Pipeline.


Handling of Deletes

In a PostgreSQL database for which the WAL level is set to logical, Hevo uses the database logs for data replication. As a result, Hevo can track all operations, such as insert, update, or delete, that take place in the database. Hevo replicates delete actions in the database logs to the Destination table by setting the value of the metadata column, __hevo_is_deleted__ to True.


Source Considerations

  • If you add a column with a default value to a table in PostgreSQL, entries with it are created in the WAL only for the rows that are added or updated after the column is added. As a result, in the case of log-based Pipelines, Hevo cannot capture the column value for the unchanged rows. To capture those values, you need to:

    • Resync the historical load for the respective object.

    • Run a query in the Destination to add the column and its value to all rows.

  • Any table included in a publication must have a replica identity configured. PostgreSQL uses it to track the UPDATE and DELETE operations. Hence, these operations are disallowed on tables without a replica identity. As a result, Hevo cannot track the updates or deletes for such tables.

    By default, PostgreSQL picks the table’s primary key as the replica identity. If your table does not have a primary key, you must either define one or set the replica identity as FULL, which records the changes to all the columns in a row.


Limitations

  • Hevo does not support logical replication of partitioned tables.

  • Hevo does not support data replication from foreign tables, temporary tables, and views.

  • If your Source table has indexes (indices) and or constraints, you must recreate them in your Destination table, as Hevo does not replicate them. It only creates the existing primary keys.

  • Hevo does not set the __hevo_is_deleted__ field to True for data deleted from the Source table using the TRUNCATE command. This action could result in a data mismatch between the Source and Destination tables.

  • You cannot select Source objects that Hevo marks as inaccessible for data ingestion during object configuration in the Pipeline. Following are some of the scenarios in which Hevo marks the Source objects as inaccessible:

    • The object is not included in the publication (key) specified while configuring the Source.

    • The publication is defined with a row filter expression. For such publications, only those rows for which the expression evaluates to FALSE are not published to the WAL. For example, suppose a publication is defined as follows:

      CREATE PUBLICATION active_employees FOR TABLE employees WHERE (active IS TRUE);
      

      In this case, as Hevo cannot determine the changes made in the employees object, it marks the object as inaccessible.

    • The publication specified in the Source configuration does not have the privileges to publish the changes from the UPDATE and DELETE operations. For example, suppose a publication is defined as follows:

      CREATE PUBLICATION insert_only FOR TABLE employees WITH (publish = 'insert');
      

      In this case, as Hevo cannot identify the new and updated data in the employees table, it marks the object as inaccessible.


See Also

Last updated on Dec 16, 2024

Tell us what went wrong

Skip to the section