Parallel run fails with "Fatal error in MPI_Recv: Message truncated, error stack"
-
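Hello everyone, when I run my case in parallel, the following error appears once the simulation reaches a certain time step. How can I resolve it?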
Fatal error in MPI_Recv: Message truncated, error stack:
MPI_Recv(224)...........................: MPI_Recv(buf=0x7ffcc7ef9140, count=4, MPI_BYTE, src=4, tag=1, MPI_COMM_WORLD, status=0x7ffcc7ef9020) failed
MPIDI_CH3_PktHandler_EagerShortSend(455): Message from rank 4 and tag 1 truncated; 8 bytes received but buffer size is 4
srun: Job step aborted: Waiting up to 62 seconds for job step to finish.
slurmstepd: error: *** STEP 2055946.0 ON x2510 CANCELLED AT 2023-06-28T15:48:37 ***
srun: error: x2510: tasks 0-63: Killed
srun: error: y2612: tasks 64-127: Killed
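I decomposed the mesh with the settings below, and I am not sure whether they contribute to the problem. I have tried both 64 and 128 cores; with 128 cores the run gets somewhat further, but it still ends with the same "Message truncated, error stack" error.
numberOfSubdomains 128;
method scotch;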