- User state exposed to extensions with stream context, check .
Fault Tolerance
By default, all the states reside in memory only which means that if the stream exits abnormally, the states will disappear.
In order to make state fault tolerance, Kuipler need to checkpoint the state into persistent storage which will allow a recovery after failure.
When things go wrong in a stream processing application, it is possible to have either lost, or duplicated results. For the 3 options of qos, the behavior will be:
- At-most-once(0): eKuiper makes no effort to recover from failures
- Exactly-once(2): Nothing is lost or duplicated
Given that eKuiper recovers from faults by rewinding and replaying the source data streams, when the ideal situation is described as exactly once does not mean that every event will be processed exactly once. Instead, it means that every event will affect the state being managed by eKuiper exactly once.
Exactly Once End to End
Source consideration
To have an end to end qos of the stream, the source must be rewindable. That means after recovery, the source can be reverted to the checkpointed offset and resend data from that so that the whole stream can be replayed from the last failure.
For extended source, the user must implement the api.Rewindable interface as well as the default api.Source interface. eKuiper will handle the rewind internally.
Sink consideration
To implement exactly-once, the user will have to implement deduplication tailored to fit the various sinking system.