Hevo Operator
HevoPipelineOperator is Hevo’s implementation of an Airflow operator. It defines the Pipeline sync task in a Directed Acyclic Graph (DAG).
Parameters
The HevoPipelineOperator accepts the following parameters:
Required
| Name | Type | Default | Description |
|---|---|---|---|
| pipeline_id (Required). Note: Supports Jinja templating. | INT | NA | The unique identifier of the Hevo Pipeline to be synced. Note: The Pipeline must be in the Initialized state. |
Optional
| Name | Type | Default | Description |
|---|---|---|---|
| action | PipelineAction | SYNC_NOW | The operation to trigger for the specified Pipeline. Allowable values: SYNC_NOW: Trigger the incremental sync for the Pipeline. Note: The Pipeline must be in the Initialized state. RESYNC: Restart the historical load for all the active objects in the specified Pipeline. |
| job_type | JobType or STR | INCREMENTAL or RESYNC | The job to track after initiating a sync. Allowable values: INCREMENTAL: A job that syncs new and updated data; the default for the SYNC_NOW action. HISTORICAL: A job that syncs all existing data from the Source. RESYNC: A job that re-ingests historical data for all active objects in the Pipeline and updates the Destination tables if changes are detected in the re-ingested data; the default for the RESYNC action. RESYNC_WITH_DROP_AND_LOAD: A job that drops and recreates the Destination tables for all active objects in the Pipeline. During this process, the tables are temporarily unavailable. Hevo then re-ingests all data from the Source objects and loads it into the newly created Destination tables. |
| connection_id | STR | hevo_airflow_conn_id | The ID of the connection created in Airflow for Hevo API credentials and connection details. |
| poll_interval | INT | 15 | The time (in seconds) between status checks while waiting for the Pipeline job to complete. |
| retry_limit | INT | 10 | The maximum number of attempts after initiating the sync to identify the active job. |
| deferrable | BOOL | True. Note: It is recommended to set this parameter to True in production environments. | A flag to determine whether the task should initiate the sync, release the worker slot, and defer the monitoring to the Airflow triggerer. Note: The Airflow Triggerer service must be running. |
| wait_for_completion | BOOL | True | A flag to determine whether the Operator should wait for the Pipeline job to complete. Allowable values: True: Wait for the job to complete. False: Do not wait for the job to complete; capture the job ID and transfer it via Airflow’s cross-communication mechanism (XCom) to a HevoSensor. |
| accept_completed_with_failures | BOOL | False | A flag to accept partial failures for the job. Allowable values: True: Treat the Pipeline job’s Completed with Failures status as success. False: Fail the task when the status of the Pipeline job is Completed with Failures. |
| ensure_new_job | BOOL | True | A flag to prevent the Operator from triggering a new job when a previous job is still running. Allowable values: True: Fail the task if a job is already in progress in the Pipeline. False: Trigger a new job even when an active job is detected in the Pipeline. |
| drop_and_load | BOOL | False | A flag to determine whether the Drop and Load operation should run for the specified Pipeline during the RESYNC action. Allowable values: True: Drop existing data from the Destination tables for all active objects in the Pipeline, then load the ingested data into them. False: Evolve the schema and load data ingested from the Source for all active objects in the Pipeline into the existing Destination tables. |
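Taken together, poll_interval and accept_completed_with_failures govern how a synchronous run is monitored. The following is a minimal sketch of that monitoring loop in plain Python; the function name, status strings, and the `get_status` callable are illustrative stand-ins, not the provider's actual API:

```python
import time

# Illustrative final states; the operator reports Completed,
# Completed with Failures, and Failed.
FINAL_STATES = {"COMPLETED", "COMPLETED_WITH_FAILURES", "FAILED"}


def wait_for_job(get_status, poll_interval=15,
                 accept_completed_with_failures=False):
    """Poll get_status() until the job reaches a final state.

    get_status is a hypothetical stand-in for the Hevo API call
    that returns the job's current status string.
    """
    while True:
        status = get_status()
        if status in FINAL_STATES:
            break
        time.sleep(poll_interval)  # block between status checks
    if status == "COMPLETED":
        return True
    if status == "COMPLETED_WITH_FAILURES":
        # Treat partial failure as success only when configured to.
        return accept_completed_with_failures
    return False  # FAILED
```

Note that this blocking style is what a worker slot does when deferrable=False; with deferrable=True the equivalent waiting is handed off to the triggerer instead.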
Methods
The following table lists the methods that the HevoPipelineOperator provides:
| Method | Description |
|---|---|
| execute | Runs the sync operation for the specified Pipeline after validating its status. |
| execute_complete | Completes task execution when the Pipeline job reaches a final state, such as Completed, Completed with Failures, or Failed. |
| _wait_synchronously | Waits for the Pipeline job to complete, blocking the worker slot. |
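The retry_limit parameter bounds how long the Operator searches for the job it has just initiated before giving up. A rough sketch of that lookup in plain Python; `list_jobs` is a hypothetical stand-in for the Hevo API call, not the provider's actual code:

```python
import time


def find_active_job(list_jobs, retry_limit=10, poll_interval=15):
    """Identify the active job started by the sync request.

    list_jobs is a hypothetical stand-in for the Hevo API call that
    returns the Pipeline's currently running job IDs.
    """
    for attempt in range(retry_limit):
        jobs = list_jobs()
        if jobs:
            return jobs[0]  # track the first active job found
        time.sleep(poll_interval)  # wait before the next attempt
    raise RuntimeError(
        f"No active job found after {retry_limit} attempts"
    )
```

If no job appears within retry_limit attempts, the task fails rather than monitoring a job that was never started.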
Example
The following example creates three HevoPipelineOperator tasks: trigger_pipeline_sync, resync_pipeline, and trigger_only.
- The `trigger_pipeline_sync` task runs the SYNC_NOW action for the specified Pipeline in the Trigger and Asynchronous Wait mode, as indicated by the `deferrable=True` and `wait_for_completion=True` parameters.
- The `resync_pipeline` task runs the RESYNC action for the specified Pipeline in the Trigger and Asynchronous Wait mode, as indicated by the `deferrable=True` and `wait_for_completion=True` parameters.
- The `trigger_only` task runs in the Trigger and Continue mode, as indicated by the `deferrable=False` and `wait_for_completion=False` parameters.
```python
from datetime import datetime

from airflow import DAG
from airflow.hevo.models.job import JobType
from airflow.hevo.models.pipeline import PipelineAction
from airflow.hevo.operators import HevoPipelineOperator

with DAG(
    dag_id="hevo_pipeline_example",  # illustrative DAG ID
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    # Deferrable operator with SYNC_NOW (default)
    trigger_sync = HevoPipelineOperator(
        task_id="trigger_pipeline_sync",
        pipeline_id=123,
        action=PipelineAction.SYNC_NOW,  # Default - can be omitted
        job_type=JobType.INCREMENTAL,  # Default for SYNC_NOW - can be omitted
        deferrable=True,
        wait_for_completion=True,
        poll_interval=10,
        accept_completed_with_failures=False,
        ensure_new_job=True,  # Default: True - prevents duplicate jobs
        connection_id="hevo_production",
    )

    # Full historical resync
    resync_pipeline = HevoPipelineOperator(
        task_id="resync_pipeline",
        pipeline_id=123,
        action=PipelineAction.RESYNC,  # Full historical reload
        deferrable=True,
        wait_for_completion=True,
        poll_interval=30,  # Less frequent polling for long jobs
        accept_completed_with_failures=True,
    )

    # Trigger and continue
    trigger_only = HevoPipelineOperator(
        task_id="trigger_only",
        pipeline_id=456,
        deferrable=False,
        wait_for_completion=False,  # Returns job_id via XCom
    )
```