Batch Processing
Qrambo provides a batch processing feature for submitting many tasks at once. It is useful for data annotation workloads, where you need to annotate large amounts of data and do not need results in real time.
How it works
- Create a batch processing job. Once the job is created, you will get a batch id.
- In the background, we automatically start processing the items in the batch. For example, if the action is ‘create_task’, we call the Create Task API for each item in the batch.
- You can check the status of the batch processing job by using the batch id. Possible statuses are:
  - PENDING: The job is pending and waiting to be processed.
  - PLANNED: We have sliced the job into smaller chunks and will process them in the background. You can check the slice_count to see how many chunks we have.
  - RUNNING: We are currently processing the job, you can look at current_slice to see which chunk we are processing.
  - COMPLETED: The job is completed. You can check the statuses of each chunk to see if they are completed successfully or not.
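The status check above lends itself to a simple polling loop. The following is a minimal sketch, not an official client: `fetch_job` is a hypothetical callable that you would implement around the batch status endpoint, and the response is assumed to be a dict with a `status` key. The polling interval and timeout defaults are also illustrative assumptions.

```python
import time
from typing import Any, Callable, Dict

# Job statuses listed in this section.
JOB_STATUSES = ("PENDING", "PLANNED", "RUNNING", "COMPLETED")


def wait_for_batch(
    fetch_job: Callable[[str], Dict[str, Any]],
    batch_id: str,
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job reports COMPLETED, then return the last response.

    fetch_job is a hypothetical wrapper around the status endpoint; it is
    assumed to return a dict with at least a "status" key.
    """
    waited = 0.0
    while True:
        job = fetch_job(batch_id)
        status = job["status"]
        if status not in JOB_STATUSES:
            raise ValueError(f"unexpected job status: {status!r}")
        if status == "COMPLETED":
            return job
        if waited >= timeout_seconds:
            raise TimeoutError(
                f"batch {batch_id} still {status} after {waited:.0f}s"
            )
        sleep(poll_seconds)
        waited += poll_seconds
```

Because there are no realtime requirements, a long polling interval is usually fine. Note that COMPLETED only means the job has finished, so the individual slices still need to be checked.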
Additionally, you can check each slice by calling the get slice endpoint with the batch id and the slice index. The response contains the status of the slice and its result. Possible statuses are:
- PENDING: The slice is pending and waiting to be processed.
- PLANNED: We have created items from the slice, but have not started processing them yet.
- RUNNING: We are currently processing the slice.
- SUCCESS: The slice completed successfully (i.e. no items failed).
- PARTIAL_ERROR: The slice completed, but some items failed.
- ERROR: All items in the slice failed.
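Once a job is COMPLETED, the slice statuses tell you where to look for failures. The sketch below groups slices by status and lists the ones that need attention. `fetch_slice` is a hypothetical wrapper around the get slice endpoint, the response is assumed to be a dict with a `status` key, and slice indices are assumed to start at 0; adjust these to the actual response shape.

```python
from typing import Any, Callable, Dict, List

# Slice statuses listed in this section.
SLICE_STATUSES = (
    "PENDING", "PLANNED", "RUNNING", "SUCCESS", "PARTIAL_ERROR", "ERROR",
)


def group_slices_by_status(
    fetch_slice: Callable[[str, int], Dict[str, Any]],
    batch_id: str,
    slice_count: int,
) -> Dict[str, List[int]]:
    """Return a mapping of slice status to the slice indices in that status.

    fetch_slice is a hypothetical wrapper around the get slice endpoint.
    Indices are assumed to be zero-based.
    """
    groups: Dict[str, List[int]] = {status: [] for status in SLICE_STATUSES}
    for index in range(slice_count):
        status = fetch_slice(batch_id, index)["status"]
        if status not in groups:
            raise ValueError(f"unexpected slice status: {status!r}")
        groups[status].append(index)
    return groups


def slices_needing_attention(groups: Dict[str, List[int]]) -> List[int]:
    """Indices of slices where some or all items failed."""
    return sorted(groups["PARTIAL_ERROR"] + groups["ERROR"])
```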
Characteristics
- The Batch Operations API does not retry failed items. Handling failed items is the responsibility of the user.
- The Batch Operations API does not support cancelling a job. Once a job is created, it will be processed in the background. If you need this feature, please let us know.
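Since failed items are not retried automatically, one option is to collect them yourself and submit them as a new batch. The sketch below shows only the selection step. The shape of a slice result is not specified here, so `results` (pairs of item and outcome) and the `is_failed` predicate are hypothetical; the attempt limit is likewise an illustrative choice, not a platform rule.

```python
from typing import Any, Callable, Iterable, List, Tuple


def plan_retry(
    results: Iterable[Tuple[Any, Any]],
    is_failed: Callable[[Any], bool],
    attempt: int,
    max_attempts: int = 3,
) -> List[Any]:
    """Return the items to resubmit in a new batch, or [] if none failed.

    results is a hypothetical iterable of (item, outcome) pairs extracted
    from slice results; is_failed decides whether an outcome is a failure.
    attempt is the number of submissions made so far.
    """
    failed = [item for item, outcome in results if is_failed(outcome)]
    if failed and attempt >= max_attempts:
        raise RuntimeError(
            f"{len(failed)} items still failing after {attempt} attempts"
        )
    return failed
```

Capping the number of attempts avoids resubmitting items that fail for a permanent reason, such as invalid input.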
Best Practices
- We recommend a batch size between 1,000 and 10,000 items.
- As a rough guide, expect batch processing to take 2-3 minutes for 1,000 items.
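To apply these recommendations to a larger dataset, you can split it into batches before submitting. The sketch below is one way to do that; the default size of 5,000 is an arbitrary choice within the recommended range, and the time estimate assumes processing time scales linearly with item count, which this page does not guarantee.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")

# Recommended batch size range from this page.
MIN_BATCH_SIZE = 1_000
MAX_BATCH_SIZE = 10_000


def split_into_batches(
    items: Sequence[T], batch_size: int = 5_000
) -> List[Sequence[T]]:
    """Split items into consecutive batches of at most batch_size items.

    The final batch may be smaller than MIN_BATCH_SIZE.
    """
    if not MIN_BATCH_SIZE <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError(
            f"batch_size must be between {MIN_BATCH_SIZE} and {MAX_BATCH_SIZE}"
        )
    return [
        items[start:start + batch_size]
        for start in range(0, len(items), batch_size)
    ]


def estimate_minutes(item_count: int) -> Tuple[float, float]:
    """Rough (low, high) estimate at 2-3 minutes per 1,000 items.

    Assumes linear scaling, which is an assumption rather than a guarantee.
    """
    thousands = item_count / 1_000
    return (2 * thousands, 3 * thousands)
```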