Almost every MPI call returns an integer error code that indicates whether the operation succeeded. If no error occurs, the return code is MPI_SUCCESS:
if (MPI_Some_op(...) != MPI_SUCCESS)
{
    // Process error
}
If an error occurs, MPI calls an error handler associated with the communicator, window, or file object before returning to the user code. There are two long-standing predefined error handlers (MPI 4.0 adds a third, MPI_ERRORS_ABORT, and the user can define additional error handlers):
MPI_ERRORS_ARE_FATAL - errors result in termination of the MPI program
MPI_ERRORS_RETURN - errors result in the error code being passed back to the user
The default error handler for communicators and windows is MPI_ERRORS_ARE_FATAL; for file objects it is MPI_ERRORS_RETURN. The error handler for MPI_COMM_WORLD also applies to all operations that are not specifically related to an object (e.g., MPI_Get_count). Thus, checking the return value of non-I/O operations without setting the error handler to MPI_ERRORS_RETURN is redundant, as erroneous MPI calls will not return.
int size;
// The execution will not reach past the following line in case of error
int res = MPI_Comm_size(MPI_COMM_WORLD, &size);
if (res != MPI_SUCCESS)
{
    // The following code will never get executed
    fprintf(stderr, "MPI_Comm_size failed: %d\n", res);
    exit(EXIT_FAILURE);
}
To enable user error processing, one must first change the error handler of MPI_COMM_WORLD:
int size;
// From here on, errors on MPI_COMM_WORLD are returned to the caller
MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
int res = MPI_Comm_size(MPI_COMM_WORLD, &size);
if (res != MPI_SUCCESS)
{
    fprintf(stderr, "MPI_Comm_size failed: %d\n", res);
    exit(EXIT_FAILURE);
}
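The MPI standard does not require implementations to be able to recover from errors and continue program execution. Even with MPI_ERRORS_RETURN, the state of MPI after an error is undefined, so the returned code is best used for reporting and cleanup rather than for retrying.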