XPK Start: Thu Apr 23 16:26:14 UTC 2026
2026-04-23 16:26:18.677667: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1776961578.690932 10 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1776961578.694723 10 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1776961578.706516 10 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1776961578.706543 10 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1776961578.706545 10 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1776961578.706551 10 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
2026-04-23 16:26:37.874516: E external/local_xla/xla/stream_executor/cuda/cuda_platform.cc:51] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: UNKNOWN ERROR (303)
I0423 16:26:38.401066 134025324922688 max_utils.py:273] Attempting to initialize the jax distributed system...
INFO:2026-04-23 16:26:47,443:jax._src.distributed:140: Starting JAX distributed service on [::]:8482
I0423 16:26:47.443027 134025324922688 distributed.py:140] Starting JAX distributed service on [::]:8482
INFO:2026-04-23 16:26:47,445:jax._src.distributed:157: Connecting to JAX distributed service on mt-10-shardy-true-lmyxg-slice-job-0-0.mt-10-shardy-true-lmyxg:8482
I0423 16:26:47.445386 134025324922688 distributed.py:157] Connecting to JAX distributed service on mt-10-shardy-true-lmyxg-slice-job-0-0.mt-10-shardy-true-lmyxg:8482
F0423 16:31:52.447047 10 client.h:77] Terminating process because the JAX distributed service detected fatal errors. This most likely indicates that another task died; see the other task logs for more details. Disable Python buffering, i.e. `python -u`, to be sure to see all the previous output.
absl::Status: DEADLINE_EXCEEDED: Deadline Exceeded
RPC: /tensorflow.CoordinationService/RegisterTask
XPK End: Thu Apr 23 16:31:54 UTC 2026
EXIT_CODE=1