apache-flink Tutorial => Savepoints: requirements and preliminary...

Example

A savepoint stores two things: (a) the positions of all data sources, and (b) the state of the operators. Savepoints are useful in many circumstances.

As of version 1.3 (also valid for earlier versions):

Checkpointing must be enabled for savepoints to be possible. If you forget to explicitly enable checkpointing using:

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// checkpointInterval is the time between two checkpoints, in milliseconds
env.enableCheckpointing(checkpointInterval);

you will get:

java.lang.IllegalStateException: Checkpointing disabled. You can enable it via the execution environment of your job

When using window operations, it is crucial to use event time (rather than ingestion or processing time) to yield proper results.
To be able to upgrade a program and reuse its savepoints, a uid must be set manually on the operators. This is because, by default, Flink generates operator IDs automatically from the structure of the program, so a generated ID can change whenever the program is modified.
Chained operators are identified by the ID of the first task. It’s not possible to manually assign an ID to an intermediate chained task, e.g. in the chain [ a -> b -> c ] only a can have its ID assigned manually, but not b or c. To work around this, you can manually define the task chains. If you rely on the automatic ID assignment, a change in the chaining behaviour will also change the IDs (see point above).
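
The reasoning behind the uid requirement can be sketched with a toy model in plain Java. It does not use Flink: the classes `Op` and `SavepointIdSketch`, the uid strings, and the rule that an automatic ID is derived from the operator's position and name are all assumptions made for illustration, not Flink's actual ID algorithm. In a real job, the equivalent step would be calling `.uid("...")` on each stateful operator.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: this is not Flink code and not Flink's real ID algorithm.
class SavepointIdSketch {

    // An operator; uid == null means "let the framework generate the ID".
    static final class Op {
        final String name;
        final String uid;
        Op(String name, String uid) { this.name = name; this.uid = uid; }
    }

    // Assumption: an automatic ID depends on the operator's position,
    // so it shifts whenever the shape of the pipeline changes.
    static String idOf(List<Op> pipeline, int index) {
        Op op = pipeline.get(index);
        return op.uid != null ? op.uid : "auto-" + index + "-" + op.name;
    }

    // "Savepoint": one state value per operator ID.
    static Map<String, Long> snapshot(List<Op> pipeline, long[] states) {
        Map<String, Long> savepoint = new HashMap<>();
        for (int i = 0; i < pipeline.size(); i++) {
            savepoint.put(idOf(pipeline, i), states[i]);
        }
        return savepoint;
    }

    // "Restore": look up the state of the named operator; null if no ID matches.
    static Long restore(List<Op> pipeline, Map<String, Long> savepoint, String name) {
        for (int i = 0; i < pipeline.size(); i++) {
            if (pipeline.get(i).name.equals(name)) {
                return savepoint.get(idOf(pipeline, i));
            }
        }
        return null;
    }

    // Builds the job; version 2 inserts a stateless filter before the counter.
    static List<Op> job(int version, boolean explicitUids) {
        List<Op> ops = new ArrayList<>();
        ops.add(new Op("source", explicitUids ? "source-id" : null));
        if (version >= 2) {
            ops.add(new Op("filter", explicitUids ? "filter-id" : null));
        }
        ops.add(new Op("counter", explicitUids ? "counter-id" : null));
        return ops;
    }

    public static void main(String[] args) {
        long[] states = {42L, 7L}; // source position, counter value
        for (boolean explicit : new boolean[] {false, true}) {
            Map<String, Long> savepoint = snapshot(job(1, explicit), states);
            Long counter = restore(job(2, explicit), savepoint, "counter");
            System.out.println("explicit uids=" + explicit + " -> counter state=" + counter);
        }
    }
}
```

In this model, the upgraded job finds the counter's state only when the IDs were assigned explicitly; with automatic IDs the inserted filter shifts the counter's ID and the lookup returns nothing.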

More info is available in the FAQ.
