Google BigQuery (Edge) Setup Guide

Google BigQuery is a fully managed, serverless data warehouse that supports Structured Query Language (SQL) queries, helping you derive meaningful insights from your data. With Hevo Edge, you can ingest data from any supported Source system and load it to a Google BigQuery Destination.

Hevo stages the Source data in Google Cloud Storage (GCS) buckets before loading it to your BigQuery Destination. Buckets are the basic units of storage in GCS; they hold your data until Hevo loads it into your BigQuery tables. Hevo can load data from a GCS bucket into BigQuery only if the bucket and the dataset are co-located, except when the dataset is in the US multi-region. Read Location considerations for further details about the requirements for transferring data from GCS buckets to BigQuery tables.
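
If you want to confirm that your bucket and dataset satisfy this requirement, the following minimal sketch, assuming the google-cloud-bigquery and google-cloud-storage Python client libraries and placeholder resource names, shows one way to compare their locations:

    from google.cloud import bigquery, storage

    project_id = "my-gcp-project"      # hypothetical project
    dataset_id = "my_dataset"          # hypothetical BigQuery dataset
    bucket_name = "my-staging-bucket"  # hypothetical GCS bucket

    dataset = bigquery.Client(project=project_id).get_dataset(dataset_id)
    bucket = storage.Client(project=project_id).get_bucket(bucket_name)

    # Loading from GCS requires co-location unless the dataset is in the US multi-region.
    if dataset.location.upper() == "US" or dataset.location.upper() == bucket.location.upper():
        print("Bucket and dataset locations are compatible.")
    else:
        print(f"Location mismatch: dataset={dataset.location}, bucket={bucket.location}")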

Roles and permissions

Hevo requires the following predefined role to be granted, at the level of your Google Cloud Platform (GCP) project, to the connecting Google service account:

Role: BigQuery User

Description: Grants Hevo permission to:

-  Create new datasets in the project.
-  Read a dataset’s metadata and list the tables in the dataset.
-  Create, update, get, and delete tables and views from a dataset.
-  Read and update data and metadata for the tables and views in a dataset.
-  Run BigQuery jobs and queries in the GCP project.

Note: Hevo is also granted the BigQuery Data Owner role on the datasets it creates.
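
One way to assign this role to the connecting service account is with an IAM policy binding on the project. The following is a minimal sketch, assuming the google-api-python-client library, Application Default Credentials, and placeholder project and service account identifiers:

    import google.auth
    from googleapiclient import discovery

    project_id = "my-gcp-project"  # hypothetical project
    member = "serviceAccount:hevo-sa@my-gcp-project.iam.gserviceaccount.com"  # hypothetical

    credentials, _ = google.auth.default()
    crm = discovery.build("cloudresourcemanager", "v1", credentials=credentials)

    # Read the current IAM policy, add a binding for the BigQuery User role, and write it back.
    policy = crm.projects().getIamPolicy(resource=project_id, body={}).execute()
    policy.setdefault("bindings", []).append(
        {"role": "roles/bigquery.user", "members": [member]}
    )
    crm.projects().setIamPolicy(resource=project_id, body={"policy": policy}).execute()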

Hevo uses a GCS bucket to temporarily stage data ingested from the Source before loading it into your BigQuery Destination tables. By default, your data is staged in a Hevo-managed GCS bucket. However, you can specify your own GCS bucket while configuring the BigQuery Destination. If you use your own bucket, you must assign the following role to the connecting Google service account:

Role: Storage Admin

Description: Grants Hevo permission to list, create, view, and delete:

-  Objects in GCS buckets.
-  GCS buckets.

If you do not want to assign the Storage Admin role, you can create a custom role at the project or organization level and assign it to the connecting Google service account. Grant the following permissions to the role that you add to give Hevo access to your GCS resources:

  • resourcemanager.projects.get
  • storage.buckets.create
  • storage.buckets.delete
  • storage.buckets.get
  • storage.buckets.list
  • storage.buckets.update
  • storage.multipartUploads.abort
  • storage.multipartUploads.create
  • storage.multipartUploads.list
  • storage.multipartUploads.listParts
  • storage.objects.create
  • storage.objects.delete
  • storage.objects.get
  • storage.objects.list
  • storage.objects.update
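
The following minimal sketch, assuming the google-api-python-client library, Application Default Credentials, and a hypothetical role ID, shows one way to create such a custom role at the project level with the permissions listed above:

    import google.auth
    from googleapiclient import discovery

    project_id = "my-gcp-project"  # hypothetical project

    gcs_permissions = [
        "resourcemanager.projects.get",
        "storage.buckets.create", "storage.buckets.delete", "storage.buckets.get",
        "storage.buckets.list", "storage.buckets.update",
        "storage.multipartUploads.abort", "storage.multipartUploads.create",
        "storage.multipartUploads.list", "storage.multipartUploads.listParts",
        "storage.objects.create", "storage.objects.delete", "storage.objects.get",
        "storage.objects.list", "storage.objects.update",
    ]

    credentials, _ = google.auth.default()
    iam = discovery.build("iam", "v1", credentials=credentials)

    # Create a project-level custom role containing only the permissions Hevo needs for GCS.
    iam.projects().roles().create(
        parent=f"projects/{project_id}",
        body={
            "roleId": "hevoGcsStaging",  # hypothetical role ID
            "role": {
                "title": "Hevo GCS Staging",
                "includedPermissions": gcs_permissions,
                "stage": "GA",
            },
        },
    ).execute()

You can then grant the new role to the connecting service account with an IAM policy binding, in the same way as the predefined role shown earlier, referencing it as projects/<your_project_id>/roles/hevoGcsStaging.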

Google account authentication

In Edge, you connect to your BigQuery Destination with a service account. One service account can be mapped to your entire team. You must assign the required roles to your service account, enabling it to access GCP-hosted services, such as BigQuery and GCS buckets. Read Authentication for GCP-hosted Services to know how to assign these roles to your service account.
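
As a quick check that the service account and its private key are set up correctly, the following minimal sketch, assuming the google-cloud-bigquery library and a hypothetical key-file path, authenticates with the JSON key and runs a trivial query job:

    from google.cloud import bigquery

    # The key-file path is hypothetical; use the JSON private key of your service account.
    client = bigquery.Client.from_service_account_json("hevo-sa-key.json")

    # A trivial query job; it succeeds only if the service account can run jobs in the project.
    rows = client.query("SELECT 1 AS ok").result()
    print(list(rows))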


Modifying BigQuery Destination Configuration in Edge

You can modify some settings of your BigQuery Destination after its creation. However, any configuration changes will affect all the Pipelines using that Destination.

To modify the configuration of your BigQuery Destination in Edge:

  1. In the detailed view of your Destination, do one of the following:

    • Click the More icon to access the Destination Actions menu, and then click Edit Destination.

    • In the Destination Configuration section, click EDIT.

  2. On the <Your Destination Name> editing page:

    Note: The settings that cannot be changed are grayed out.

    • You can specify a new name for your Destination, not exceeding 255 characters.

    • In the Authentication section, you can modify the service account configured in the Pipeline. Click the Service account private key field to upload the private key associated with the desired service account.

      Note:

      • The service account must be assigned the BigQuery User role.

      • If the BigQuery Destination is configured to use your GCS bucket, the Storage Admin role or a custom role with the necessary permissions must be assigned to the service account.

  3. Click TEST & SAVE to check the connection to your BigQuery Destination and then save the modified configuration.


Data Type Mapping

Hevo internally maps the Source data type to a unified data type, which is referred to as the Hevo Data Type in the table below. This data type represents the Source data from all supported data types in a lossless manner. The Hevo data types are then mapped to the corresponding data types that are supported in each Destination.

Hevo Data Type              BigQuery Data Type
ARRAY, JSON, STRUCT         JSON
BOOLEAN                     BOOL
BYTEARRAY                   BYTES
BYTE, INTEGER, SHORT, LONG  INT64
DATE                        DATE
DATETIME                    DATETIME
DATETIMETZ                  TIMESTAMP
DECIMAL                     NUMERIC, BIGNUMERIC, or FLOAT64
DOUBLE, FLOAT               FLOAT64
TIME, TIMETZ                TIME
VARCHAR, XML                STRING

Note: If your Source data type is mapped to the UNSUPPORTED Hevo data type, it is not replicated to your BigQuery Destination.

Handling the Time with Time Zone data type

For BigQuery Destinations, Hevo maps the Source columns of the TIMETZ and DATETIMETZ data types to the Destination columns of the TIME and TIMESTAMP data types, respectively. Data values from the Source columns are loaded to the Destination columns in the Coordinated Universal Time (UTC) format. For example, the value 05:30 PM (IST) in the Source column is stored as 12:00 PM (UTC) in the Destination column.
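
The following minimal sketch, using Python's standard datetime and zoneinfo modules and a hypothetical timestamp, illustrates the conversion described above:

    from datetime import datetime
    from zoneinfo import ZoneInfo

    ist_value = datetime(2025, 5, 19, 17, 30, tzinfo=ZoneInfo("Asia/Kolkata"))  # 05:30 PM IST
    utc_value = ist_value.astimezone(ZoneInfo("UTC"))
    print(utc_value.time())  # 12:00:00, the value stored in the Destination column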

Handling the Decimal data type

For BigQuery Destinations, Hevo maps DECIMAL data values with a fixed precision (P) and scale (S) to either the NUMERIC (DECIMAL) data type or the BIGNUMERIC (BIGDECIMAL) data type. This mapping is decided based on the number of significant digits (P) in the numeric value and the number of digits in the numeric value to the right of the decimal point (S). Refer to the table below to understand the mapping:

Precision (P) and Scale (S) of the Decimal Value    BigQuery Data Type
P ≤ 38 and S ≤ 9                                    NUMERIC
P ≤ 76 and S ≤ 38                                   BIGNUMERIC
P > 76 or S > 38                                    FLOAT64
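
The following illustrative sketch, a hypothetical helper rather than Hevo's actual implementation, mirrors the mapping in the table above:

    def bigquery_type_for_decimal(precision: int, scale: int) -> str:
        """Hypothetical helper mirroring the precision/scale mapping in the table above."""
        if precision <= 38 and scale <= 9:
            return "NUMERIC"
        if precision <= 76 and scale <= 38:
            return "BIGNUMERIC"
        return "FLOAT64"

    print(bigquery_type_for_decimal(18, 4))    # NUMERIC
    print(bigquery_type_for_decimal(50, 20))   # BIGNUMERIC
    print(bigquery_type_for_decimal(100, 50))  # FLOAT64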

Read Numeric Types to know more about the data types, their range, and the literal representation BigQuery uses to represent various numeric values.

Handling of Unsupported Data Types

Hevo does not allow direct mapping of your Source data type to any of the following BigQuery data types:

  • ARRAY

  • GEOGRAPHY

  • INTERVAL

  • RANGE

  • STRUCT

  • Any other data type not listed in the table above.

Hence, if your Source object is mapped to an existing BigQuery table that contains columns of unsupported data types, the data in that table may become inconsistent.


Destination Considerations

  • BigQuery recommends using the Avro binary format to load data as it is faster. Hence, Hevo creates data files using this file format.

  • As BigQuery does not support loading encrypted files from GCS buckets, Hevo writes unencrypted data files to the GCS bucket. Hence, it is recommended to enable the Use your GCS bucket option when configuring BigQuery as a Destination in Edge.

    Note: By default, BigQuery applies encryption to data stored in your Destination tables.

  • BigQuery creates, schedules, and runs jobs, such as query, load, or copy jobs, to track the tasks that applications such as Hevo perform whenever data is queried, loaded, or copied. The query and load jobs that Hevo runs in your BigQuery Destination to identify and load data are subject to the quotas and limits imposed by BigQuery. For example, the maximum row size in a query job is 100 MB. Hence, if any rows in a query job that Hevo runs exceed this limit, Hevo does not load data from the Source object(s) containing those rows. A sketch for reviewing recent jobs and their errors follows this list.

  • BigQuery APIs and jobs are subject to daily account limits. Your Hevo Edge Pipeline may fail if these limits are exceeded, as Hevo uses the BigQuery APIs and runs jobs to complete operations such as loading data or creating a dataset.
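
The following minimal sketch, assuming the google-cloud-bigquery library and a hypothetical project ID, shows one way to review recent jobs and their errors when investigating such quota- or limit-related failures:

    from google.cloud import bigquery

    client = bigquery.Client(project="my-gcp-project")  # hypothetical project

    # List the most recent jobs and print their type, state, and any error details.
    for job in client.list_jobs(max_results=10):
        print(job.job_id, job.job_type, job.state, job.error_result)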


Limitations

  • At this time, Hevo supports only user-managed service accounts to authenticate connections to BigQuery.

  • Only private keys generated using the JSON format for user-managed service accounts are supported.

  • Hevo supports loading data to BigQuery partitioned tables. However, for field-based partitioning, your partition key must be the corresponding Source object’s primary key. Read Supporting Partitioned Tables in BigQuery for more information.

  • Hevo does not support GCS buckets created with the dual-region location type.

  • Hevo sanitizes your Source table and column names if they do not meet Hevo’s defined pattern for these identifiers. Hevo’s safe pattern allows alphanumeric characters and underscores in table and column names. Further, Hevo supports only up to 1024 characters for table names and 300 characters for column names. An illustrative sketch of this sanitization follows this list.

  • Currently, Hevo ignores the casing of Source column names. Hence, it does not distinguish between a column named COLUMN1 and column1 when creating them in your BigQuery Destination tables. However, it propagates the casing for your Source schema and table names. For example, a schema named TEST is created as the dataset <destination_prefix>_TEST in your BigQuery Destination.
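
The following illustrative sketch is an assumption about the general approach rather than Hevo's actual implementation; it shows what sanitization to the safe pattern and length limits described above could look like:

    import re

    def sanitize_identifier(name: str, max_length: int) -> str:
        """Illustrative only; replaces disallowed characters with underscores and truncates."""
        return re.sub(r"[^A-Za-z0-9_]", "_", name)[:max_length]

    print(sanitize_identifier("order details!", 1024))  # table name  -> order_details_
    print(sanitize_identifier("unit-price ($)", 300))   # column name -> unit_price____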

Last updated on May 19, 2025
