Getting started with small-step operational semantics
A formal model of Sagas
Small-step operational semantics (SOS) are often used for the specification and verification of programming languages — although SOS can be used for the specification and verification of entire systems.
In this post, I will design SagaLang, a tiny “programming language” for the development of sagas to illustrate the ideas behind SOS.
Sagas
In the paper Sagas from 1987, H. Garcia-Molina and K. Salem introduce sagas as long-running transactions, that is, a sequential composition of transactions with transaction-like guarantees:
A saga is a sequence of steps, any partial execution is undesirable.
A saga shall guarantee to execute observably equivalent to exactly once or observably equivalent to not at all.
In the absence of failure, this guarantee is trivially met. However, in the presence of failure, a saga must implement failure mitigation strategies:
- Forward recovery attempts to mitigate any partial execution by moving forward, a common failure mitigation strategy that comes in the form of retries.
- Backward recovery attempts to mitigate any partial execution by rolling backward, a common failure mitigation strategy that comes in the form of compensation.
A common failure mitigation strategy is to combine forward recovery and backward recovery. Attempt forward recovery and subsequently attempt backward recovery if the failure persists.
Informally, this failure mitigation strategy amounts to “Let’s try as hard as we can to make this happen (retries). If we can’t, let’s not leave a mess (compensation).”
SagaLang
SagaLang is a tiny programming language that allows you to specify a saga as a sequence of transactions along with their allotted retries and corresponding compensations.
SagaLang is minimal, with no variables, no conditionals, no loops, and no exception handling. Just enough to demonstrate the ideas of SOS.
SOS relates programs (the definition) to computation (the execution) by specifying how the program determines the execution one step at a time:
- A program is a sequence of instructions, sometimes called statements, generating a computation.
- A computation is a sequence of states and state transitions constrained by the program.
SOS bridges the gap between the program and its computation: SOS provides a set of rules that describe how the program executes, step by step.
In this blog post, I’ll introduce SagaLang in three iterations. First, a basic execution model, then sagas with forward recovery, and lastly sagas with backward recovery.
1. Basic SagaLang
The first iteration of SagaLang is straightforward: A Saga S
is either a transaction s
followed by another Saga, or the empty sequence ε
.
S = s; S
| ε
To get started, let’s define a Saga P
as the sequence of three transactions a
, b
, and c
.
P = a; b; c; ε
SOS is often presented as inference rules, which consist of an upper part (above the line) and a lower part (below the line):
- The part above the line is called the antecedent, which represents the conditions that must be met.
- The part below the line is called the consequent, which represents the state transition.
Here, each transition is labeled. The label allows us to build a trace, that is, a history of the execution.
Figure 2. shows the interference rules for the basic version of SagaLang, three rules are sufficient to specify its execution.
- ① For a non-empty saga
⟨s; S⟩
, if the transactions
executes successfully, we proceed with the continuation⟨S⟩
. - ② For an empty saga
ε
, without any condition, we proceed to complete the saga as a success. - ③ For a non-empty saga
⟨s; S⟩
, if the transactions
fails, we proceed to complete the saga as a failure.
In the absence of a failure, our saga executes trivially exactly once and to completion:
trace(P) = a • b • c • success
However, in the presence of failure, our saga simply terminates without any attempt at correction. For example, if P
fails after transaction b
attempting to execute transaction c
, the trace of P
reveals the partial application:
trace(P) = a • b • failure
So far, our saga does not have transaction-like execution guarantees. Not even close.
2. SagaLang w/ Forward Recovery
The next iteration of SagaLang includes forward recovery, so we do not give up so easily in the presence of failure: A Saga S
is either a tuple of a transaction s
and a natural number n
followed by another Saga, or the empty sequence ε
.
S = (s × n); S
| ε
Our previous sample saga P
is now a sequence of three transactions a
, b
, and c
and their retry budgets:
P = (a × 1); (b × 1); (c × 1); ε
Figure 3. shows the interference rules for this version of SagaLang, four inference rules are necessary to specify its execution.
Here, I’ll only discuss the major differences to the previous version:
- ③ For a non-empty saga
⟨s; S⟩
, if the transactions
fails and we still have retry budget, i.e.n > 0
, we proceed with the continuation⟨(s × n-1); S⟩
. - ④ For a non-empty saga
⟨s; S⟩
, if the transactions
fails and we do not have retry budget, i.e.n = 0
, we proceed to complete the saga as a failure.
As before, in the absence of a failure, our saga executes trivially exactly once and to completion. Even in the presence of failure, our saga has a chance to execute observably equivalent to exactly once, for example:
trace(P) = a • a • b • c • success
However, in the presence of failure, we still may run out of our retry budget ultimately terminating without any further attempts of correction.
trace(P) = a • a • failure
But we need the cooperation of the developer: Forward recovery relies on the principle of idempotence, that is, the effect of invoking and potentially executing a
multiple times must be equivalent to the impact of executing a
exactly once — something our SOS does not express.
a ... • a ≈ a
So far, our saga stands a better chance at success, yet still does not have transaction-like execution guarantees.
3. SagaLang w/ Backward Recovery
The last iteration of SagaLang includes backward recovery, now we do not give up in the presence of failure until we absolutely have to: A Saga S
is either a triple of a transaction s
, a compensation c
, and a natural number n
followed by another Saga, or the empty sequence ε
.
S = (s × c × n); S
| ε
Our previous sample saga P
is now a sequence of three transactions a
, b
, and c
, their compensating transactions, and their retry budgets:
P = (a × ¬a × 1); (b × ¬b × 1); (c × ¬c × 1); ε
Figure 4. shows the interference rules for this version SagaLang, our inference rules are much more involved to specify its execution.
Again, I’ll only discuss the major differences to the previous version:
- ① For a non-empty saga
⟨(s × c × n); S⟩
, if the transactions
executes successfully, we proceed with the continuation⟨S⟩
and add the compensationc
to the sequence of compensations⟨C⟩
. - ④ For a non-empty saga
⟨s; S⟩
, if the transactions
fails and we do not have retry budget, i.e.n = 0
, we proceed to replace the saga with the accumulated sequence of compensations⟨C⟩
. - ⑦ For a non-empty compensation
⟨c; C⟩
, if the compensationc
fails, we proceed to complete the saga as a failure.
Of course, we could add “forward recovery to the backward recovery,” that is, we could add retries to the compensations. However, that is a trivial, mechanical extension and would not add any new insights into SOS.
As before, in the absence of a failure, our saga executes trivially exactly once and to completion, while in the presence of failure, our saga has a chance to execute observably equivalent to exactly once.
However, now even in the presence of failure and no retry budget left, we have a chance to execute observably equivalent to not at all:
trace(P) = a • b • b • ¬a • success
But once again we need the cooperation of the developer: Backward recovery relies on the principle that the effect of executing a
and later executing its compensation ¬a
is equivalent to not executing a
in the first place — again something our SOS does not express.
a • ¬a ≈ ε
However, if any compensation fails, all is lost 💀
Conclusion
Small-step operational semantics give meaning to definitions by providing a precise specification of their execution.
SOS is often used for the specification and verification of programming languages but they are equally capable as a thinking and communication tool.