Control the retry policy from C/C++
Using Golem's retry mechanism
Golem applies a retry mechanism to all workers. In case of a failure, Golem will automatically recover the worker to the point before the failure and retry the operation. An exponential backoff and an upper limit on the number of retries are applied.
If the maximum number of retries is reached, the worker will be marked as failed and no further invocations will be possible on it.
This mechanism is automatic and applies to all kinds of failures. To rely on it, just let the C code fail (for example, by calling abort()).
Customizing the retry policy
The retry policy, which controls the maximum number of retries and the exponential backoff, is a global configuration of the Golem servers, but it can be customized for each worker.
Once the golem:api/host@0.2.0 interface is imported in the WIT file (see the previous page for more information), the retry policy can be controlled with the golem_api_host_get_retry_policy and golem_api_host_set_retry_policy functions.
golem_api_host_retry_policy_t previous;
golem_api_host_get_retry_policy(&previous); // save the current policy
golem_api_host_retry_policy_t current;
current.max_attempts = 10;
current.min_delay = 1000000000; // 1s
current.max_delay = 1000000000; // 1s
current.multiplier = 1; // u32 in the WIT record
golem_api_host_set_retry_policy(&current); // apply the custom policy
// ...
golem_api_host_set_retry_policy(&previous); // restore the saved policy
The golem_api_host_retry_policy_t type is generated from the following record in Golem's WIT definition:
/// Configures how the executor retries failures
record retry-policy {
  /// The maximum number of retries before the worker becomes permanently failed
  max-attempts: u32,
  /// The minimum delay between retries (applied to the first retry)
  min-delay: duration,
  /// The maximum delay between retries
  max-delay: duration,
  /// Multiplier applied to the delay on each retry to implement exponential backoff
  multiplier: u32
}