Apache Flink savepoints: requirements and preliminary notes


A savepoint stores two things: (a) the positions of all data sources and (b) the state of the operators. Savepoints are useful in many circumstances:

  • slight application code updates
  • Flink update
  • changes in parallelism
  • ...

As of version 1.3 (also valid for earlier versions):

  • checkpointing must be enabled for savepoints to be possible. If you forget to explicitly enable checkpointing using:

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.enableCheckpointing(checkpointInterval); // interval in milliseconds

    you will get:

    java.lang.IllegalStateException: Checkpointing disabled. You can enable it via the execution environment of your job
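  • when using window operations, it is crucial to use event time (rather than ingestion or processing time) to yield proper results: after a restore, replayed records are then assigned to windows by their own timestamps, not by the time at which they happen to be reprocessed;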

  • to be able to upgrade a program and reuse savepoints, operator UIDs must be set manually. This is because, by default, Flink generates each operator's UID automatically, and the generated UID may change whenever the program is modified;

  • Chained operators are identified by the ID of the first task. It’s not possible to manually assign an ID to an intermediate chained task, e.g. in the chain [ a -> b -> c ] only a can have its ID assigned manually, but not b or c. To work around this, you can manually define the task chains. If you rely on the automatic ID assignment, a change in the chaining behaviour will also change the IDs (see point above).
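
The points above can be put together in a minimal sketch of a job prepared for savepoints. It assumes the flink-streaming-java dependency is on the classpath. The class name, the socket source on localhost:9999, the 10-second interval and the UID strings are all hypothetical choices made for illustration. Each operator starts its own chain so that, following the workaround described above, every UID is attached to the first task of a chain.

```java
import org.apache.flink.api.common.functions.FilterFunction;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SavepointReadyJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Enable checkpointing; the interval (in milliseconds) is an illustrative value.
        env.enableCheckpointing(10_000L);

        env
            // Hypothetical source, used only to keep the sketch short.
            .socketTextStream("localhost", 9999)
            .uid("socket-source")

            .map(new MapFunction<String, String>() {
                @Override
                public String map(String value) {
                    return value.trim();
                }
            })
            // Start a new chain so this operator is the head of its own chain.
            .startNewChain()
            .uid("trim-map")

            .filter(new FilterFunction<String>() {
                @Override
                public boolean filter(String value) {
                    return !value.isEmpty();
                }
            })
            .startNewChain()
            .uid("non-empty-filter")

            .print();

        env.execute("savepoint-ready-job");
    }
}
```

As long as the UID strings stay the same across versions of the program, the state stored in a savepoint can be matched back to the corresponding operators even if their code changes. Breaking chains gives up some of the performance benefit of chaining, so this is a trade-off rather than a general recommendation.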

More info is available in the FAQ.