Handling Failed Events in a Pipeline
Whenever Hevo encounters an Event that it cannot process in a Pipeline, it marks the Event as Failed. Some key causes of Event failures include:
-
Source inaccessibility: The Source connection settings have changed or the Source is unavailable.
-
Destination connection failure: The Destination connection settings have changed or the Destination is unavailable.
-
Unsupported data types in the Source data: The data types of incoming Events do not match the configured schema of the Destination object.
-
Invalid data types in the Source data: The Source data types are not supported by Hevo or the Destination.
Failure Handling Modes
The success or failure of a job depends on the successful replication of the complete set of objects it contains. If any errors or failures are encountered during a job run, these are handled based on the failure handling mode you define at the time of creating the Pipeline:
-
Strict: This error-handling mode ensures that data is loaded to your Destination only if all Events in the batch of objects selected for replication are successfully processed. This option is useful if you have post-load processes with costly downstream queries or queries that cannot run on incomplete data. When this mode is selected for a job, there is no separate selection at the object level. If even one object fails, the job is aborted, and the entire batch of objects and the Pipeline are marked as Failed. Events from objects that were already processed up to that point are left as is in the Destination. The next job picks up from the last successful Event. If the load mode is Append, this may cause duplicate Events in subsequent job runs.
This mode is currently not available for selection.
-
Moderate: This error-handling mode loads data to your Destination even if some objects fail. The objects that are processed successfully are loaded to the Destination. Upon job completion, its status changes to Completed with Failures. If any object fails, the job proceeds to the next object. For this mode, you must also select the method for processing the objects that contain failed Events:
-
Don’t load data for Objects with failed Events: No data is loaded for the object if even one of its Events fails.
-
Load data for Objects with less than 10K failed Events: The successfully processed Events from the failed objects are loaded as long as the total count of failed Events for the job does not exceed 10K. If the number of failed Events across all objects and replication stages of the job exceeds 10K, the entire job is declared failed.
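The Moderate mode rules above can be sketched as a small decision function. This is only an illustration of the rules as described here, not Hevo's implementation: the names ObjectResult, ObjectPolicy, and resolve_moderate_job are hypothetical, the "Completed" status for a job with no failures is an assumed name, and the text does not say what is loaded when a job exceeds the 10K limit, so the sketch simply reports nothing as loaded in that case.

```python
from dataclasses import dataclass
from enum import Enum

FAILED_EVENT_LIMIT = 10_000  # the "10K" threshold named in the text


class ObjectPolicy(Enum):
    # Hypothetical names for the two Moderate-mode options.
    SKIP_FAILED_OBJECTS = "skip"
    LOAD_UNDER_LIMIT = "load_under_limit"


@dataclass
class ObjectResult:
    name: str
    processed_events: int  # Events processed successfully
    failed_events: int     # Events that failed at any replication stage


def resolve_moderate_job(results, policy):
    """Return (job_status, names of objects whose data is loaded)."""
    total_failed = sum(r.failed_events for r in results)

    if total_failed == 0:
        # Assumption: a job without failures is reported as "Completed".
        return "Completed", [r.name for r in results]

    if policy is ObjectPolicy.LOAD_UNDER_LIMIT:
        if total_failed > FAILED_EVENT_LIMIT:
            # The limit applies to the job-wide total, not to each object.
            # Assumption: nothing is reported as loaded for a failed job.
            return "Failed", []
        # Successfully processed Events are loaded, even for failed objects.
        return "Completed with Failures", [r.name for r in results]

    # SKIP_FAILED_OBJECTS: one failed Event excludes the whole object.
    loaded = [r.name for r in results if r.failed_events == 0]
    return "Completed with Failures", loaded
```

For example, with the objects orders (no failed Events) and users (3 failed Events), the first option loads only orders, while the second loads the successfully processed Events of both objects.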
Viewing Failed Events
An object can fail at the ingestion or data-loading stage. You can view the list of failed objects for a job and the number of failed Events within each object, grouped by the failure reason, in the Object Details section of the Job History tab of your Pipeline.
To do this:
-
Click on a failed job on the Jobs page. The job details open in the Job History tab of the associated Pipeline. If a replication stage has failed and Hevo is still attempting to process data for it, the job status appears as In Progress.
Alternatively, in the Job History tab of your Pipeline, click on a failed object to expand its view.
-
View the job and object-level failure details in the respective job details section.
-
Expand an object row to view the failed Events grouped by each failure reason.
Note: The Objects Failed and the Events Failed counts do not include objects that you have skipped during Pipeline creation.
Resolving Event Failures
Hevo provides two ways of resolving Event failures:
-
Fixing the underlying issues in the Source data
-
Deleting the failed Events from the Source data
After resolving the failures, you must restart the object so that Hevo can process it again. If any of its Events were previously loaded and the load mode is Append, this may cause some duplicates in your data.
Step 1: Resolve the failures
Option 1: Fix the underlying issues
-
Click on the failed object to view the failed Events grouped by the failure reason.
-
Click Download Events to download a .ZIP file containing the list of failed Events in CSV format.
-
Fix the errors in the data in your Source database or table.
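Before fixing the Source data, it can help to summarize the downloaded archive. The following is a minimal sketch using only the Python standard library. The layout of the CSV files is not documented here, so the column name failure_reason is an assumption. Check the header row of your downloaded file and pass the actual column name.

```python
import csv
import io
import zipfile
from collections import Counter


def count_failures_by_reason(zip_source, reason_column="failure_reason"):
    """Count failed Events per failure reason across all CSV files in the archive.

    zip_source is a path or a binary file-like object for the downloaded .ZIP.
    reason_column is an assumed column name; adjust it to match your file.
    """
    counts = Counter()
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # ignore anything that is not a CSV file
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                for row in csv.DictReader(text):
                    # Rows without the column are counted under "unknown".
                    counts[row.get(reason_column) or "unknown"] += 1
    return counts
```

Calling count_failures_by_reason("failed_events.zip").most_common() (the file name is a placeholder) lists the most frequent failure reasons first, which shows where to start fixing the Source data.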
Option 2: Delete the failed Events
You can delete the failed Events from the Source so that they are not loaded to the Destination.
Step 2: Restart the object
Once you have fixed or deleted the failed Events:
-
In the Object Configuration tab, scroll to the desired object and click the More icon.
-
Click Restart.
A historical job is created for the object, which reloads all the data available for it in the Source.
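The toy model below illustrates why restarting an object in Append mode can produce duplicates. It is only a sketch: the Destination is modelled as a plain Python list, the id field and the function names are hypothetical, and it does not reflect how Hevo stores or loads data.

```python
from collections import Counter


def append_load(destination, source_events):
    """Append mode as modelled here: every incoming Event becomes a new row."""
    destination.extend(source_events)


def duplicated_keys(destination, key="id"):
    """Return the key values that appear more than once in the Destination."""
    counts = Counter(row[key] for row in destination)
    return sorted(k for k, n in counts.items() if n > 1)


source = [{"id": 1}, {"id": 2}, {"id": 3}]
destination = []

# First job: Events 1 and 2 are loaded; Event 3 fails.
append_load(destination, source[:2])

# After the failure is fixed, the restart reloads all Source data.
append_load(destination, source)

print(duplicated_keys(destination))  # [1, 2]
```

Events 1 and 2 now appear twice because they were loaded both before and after the restart. If your downstream queries depend on unique rows, plan to deduplicate on a key such as this one after restarting an object in Append mode.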