Intel® MPI Library Reference Manual for Linux* OS
An application sets the MPI_ERRORS_RETURN error handler and checks the return code after each communication call. If a communication call does not return MPI_SUCCESS, the application should mark the destination process as unreachable and exclude it from further communication. For example:
if(live_ranks[rank]) {
    mpi_err = MPI_Send(buf, count, dtype, rank, tag, MPI_COMM_WORLD);
    if(mpi_err != MPI_SUCCESS) {
        live_ranks[rank] = 0;
    }
}
In the case of non-blocking communications, errors can appear during wait/test operations.