By default, the Apex engine guarantees that data is processed at-least-once and that state updates within the DAG occur exactly-once. For state mutation through interaction with external systems, the results depend on the connector (refer to Chapter 3, The Apex Library). Connectors that support exactly-once results include Files, Kafka, JDBC, Cassandra, and all others where the write operations are, or can be made, idempotent. We will look at an example application in the next section.
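To illustrate why idempotent writes let at-least-once processing produce exactly-once results, here is a minimal Python sketch. It does not use the Apex API: the two sink classes, the `deliver_at_least_once` helper, and the sample events are hypothetical names and data introduced only for illustration. The helper simulates a failure followed by a replay from the last checkpoint, so some events are written twice.

```python
class AppendSink:
    """Hypothetical non-idempotent store: every write appends a new row."""

    def __init__(self):
        self.rows = []

    def write(self, key, value):
        self.rows.append((key, value))


class IdempotentSink:
    """Hypothetical idempotent store: writes are keyed, so a replay overwrites."""

    def __init__(self):
        self.rows = {}

    def write(self, key, value):
        self.rows[key] = value


def deliver_at_least_once(events, sink, checkpoint_at, crash_at):
    """Write events until a simulated crash, then replay from the checkpoint.

    Events between checkpoint_at and crash_at reach the sink twice,
    which is what at-least-once processing permits.
    """
    for index, (key, value) in enumerate(events):
        if index == crash_at:
            break  # simulated node failure
        sink.write(key, value)
    # Recovery: resume from the last checkpointed position.
    for key, value in events[checkpoint_at:]:
        sink.write(key, value)


events = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]

append_sink = AppendSink()
deliver_at_least_once(events, append_sink, checkpoint_at=1, crash_at=3)

idempotent_sink = IdempotentSink()
deliver_at_least_once(events, idempotent_sink, checkpoint_at=1, crash_at=3)

print(len(append_sink.rows))      # duplicates are visible in the append-only sink
print(len(idempotent_sink.rows))  # one row per key in the idempotent sink
```

The events are delivered in the same way to both sinks; only the keyed sink ends up with each event reflected once, which is the property the connectors listed above rely on.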
In distributed systems, a guarantee of exactly-once processing is not really possible, since nodes may go down at any time and, when they are restored, some reprocessing of prior data, however minimal, must occur in order to guarantee correctness (or we have to accept data loss, which yields at-most-once processing). So, when we see exactly-once in published feature matrices of stream processors, it really means at-least-once along with the often implicit assumption of...