Log Summary

XPK Start: Sun Apr 19 03:43:22 UTC 2026
Unrecognized keys in `rope_scaling` for 'rope_type'='yarn': {'rope_theta'}
`rope_scaling`'s factor field must be a float >= 1, got 40
`rope_scaling`'s beta_fast field must be a float, got 32
`rope_scaling`'s beta_slow field must be a float, got 1
Unrecognized keys in `rope_scaling` for 'rope_type'='yarn': {'rope_theta'}
Unrecognized keys in `rope_scaling` for 'rope_type'='yarn': {'rope_theta'}
Unrecognized keys in `rope_scaling` for 'rope_type'='yarn': {'rope_theta'}
2026-04-19 03:43:45.119959: E external/local_xla/xla/stream_executor/cuda/cuda_platform.cc:51] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: UNKNOWN ERROR (303)
I0419 03:43:45.120084 137390531381056 max_utils.py:800] System Information: Jax Version: 0.8.3
I0419 03:43:45.120173 137390531381056 max_utils.py:801] System Information: Jaxlib Version: 0.8.3
I0419 03:43:52.582326 137390531381056 max_utils.py:802] System Information: Jax Backend: PJRT C API
TFRT TPU v6 lite
Built on Dec 15 2025 14:03:46 (1765836226) cl/844590465
I0419 03:43:52.770027 137390531381056 max_utils.py:238] Skipping jax distributed system due to skip_jax_distributed_system=True flag.
I0419 03:43:52.771657 137390531381056 train_rl.py:158] Running RL on a single slice
I0419 03:43:52.771716 137390531381056 train_rl.py:671] Starting RL Training
W0419 03:43:59.849029 137390531381056 file_utils.py:405] Variant folder /root/data/train/gsm8k/1.0.0 has no dataset_info.json
I0419 03:43:59.849192 137390531381056 dataset_builder.py:704] Generating dataset gsm8k (/root/data/train/gsm8k/1.0.0)
Downloading and preparing dataset Unknown size (download: Unknown size, generated: Unknown size, total: Unknown size) to /root/data/train/gsm8k/1.0.0...

I0419 03:43:59.896186 137390531381056 download_manager.py:557] Downloading https://raw.githubusercontent.com/openai/grade-school-math/master/grade_school_math/data/test.jsonl into /root/data/train/downloads/gsm8k/raw.gith.com_open_grad-scho-math_mast_grrJS2RhFr7vT_IovyhSl4MxHhzrGEOw_UQms8tDg3BQE.jsonl.tmp.6762b5f496ab410e887b4641058b9a8a...
I0419 03:43:59.898046 137390531381056 download_manager.py:557] Downloading https://raw.githubusercontent.com/openai/grade-school-math/master/grade_school_math/data/test_socratic.jsonl into /root/data/train/downloads/gsm8k/raw.gith.com_open_grad-scho-math_mast_grISouESCebaH1iFx7XNBU9uiZeiQdv4QnXpBcKLD0n1U.jsonl.tmp.c8072e739e95496f97c4e0c6d7816af6...
I0419 03:43:59.899847 137390531381056 download_manager.py:557] Downloading https://raw.githubusercontent.com/openai/grade-school-math/master/grade_school_math/data/train.jsonl into /root/data/train/downloads/gsm8k/raw.gith.com_open_grad-scho-math_mast_grb6OPNFhl2yOEHFlmWqZu3-9nNXfrgazn2uvcMFm7DsE.jsonl.tmp.553358c1619f43b7a978da62b79d339d...
I0419 03:43:59.901511 137390531381056 download_manager.py:557] Downloading https://raw.githubusercontent.com/openai/grade-school-math/master/grade_school_math/data/train_socratic.jsonl into /root/data/train/downloads/gsm8k/raw.gith.com_open_grad-scho-math_mast_gr2MZGZhL1-jEHseV9KjGHESYiWQ5DuPqNnQzNNUrP-hU.jsonl.tmp.1cd906bd5db54bdda8999283b209e7c2...
I0419 03:43:59.947460 137253112239872 download_manager.py:598] Skipping extraction for /root/data/train/downloads/gsm8k/raw.gith.com_open_grad-scho-math_mast_grISouESCebaH1iFx7XNBU9uiZeiQdv4QnXpBcKLD0n1U.jsonl (method=NO_EXTRACT).
I0419 03:43:59.956328 137253120632576 download_manager.py:598] Skipping extraction for /root/data/train/downloads/gsm8k/raw.gith.com_open_grad-scho-math_mast_grrJS2RhFr7vT_IovyhSl4MxHhzrGEOw_UQms8tDg3BQE.jsonl (method=NO_EXTRACT).
I0419 03:43:59.980989 137253103847168 download_manager.py:598] Skipping extraction for /root/data/train/downloads/gsm8k/raw.gith.com_open_grad-scho-math_mast_grb6OPNFhl2yOEHFlmWqZu3-9nNXfrgazn2uvcMFm7DsE.jsonl (method=NO_EXTRACT).
I0419 03:43:59.989173 137253095454464 download_manager.py:598] Skipping extraction for /root/data/train/downloads/gsm8k/raw.gith.com_open_grad-scho-math_mast_gr2MZGZhL1-jEHseV9KjGHESYiWQ5DuPqNnQzNNUrP-hU.jsonl (method=NO_EXTRACT).
Extraction completed...: 0 file [00:00, ? file/s]
Dl Size...: 8 MiB [00:00, 84.56 MiB/s]
Dl Completed...: 100%|██████████| 4/4 [00:00<00:00, 42.22 url/s]

I0419 03:44:00.411203 137390531381056 writer.py:431] Done writing /root/data/train/gsm8k/incomplete.7BWZLU_1.0.0/gsm8k-train.array_record*. Number of examples: 7473 (shards: [7473])
I0419 03:44:00.501586 137390531381056 writer.py:431] Done writing /root/data/train/gsm8k/incomplete.7BWZLU_1.0.0/gsm8k-test.array_record*. Number of examples: 1319 (shards: [1319])
I0419 03:44:00.924877 137390531381056 writer.py:431] Done writing /root/data/train/gsm8k/incomplete.7BWZLU_1.0.0/gsm8k-train_socratic.array_record*. Number of examples: 7473 (shards: [7473])
I0419 03:44:01.017819 137390531381056 writer.py:431] Done writing /root/data/train/gsm8k/incomplete.7BWZLU_1.0.0/gsm8k-test_socratic.array_record*. Number of examples: 1319 (shards: [1319])


I0419 03:44:01.019633 137390531381056 dataset_builder.py:892] Found random access formats: . Chose to use FileFormat.ARRAY_RECORD. Overriding file format in the dataset info.
W0419 03:44:01.023701 137390531381056 file_utils.py:405] Variant folder /root/data/test/gsm8k/1.0.0 has no dataset_info.json
I0419 03:44:01.023814 137390531381056 dataset_builder.py:704] Generating dataset gsm8k (/root/data/test/gsm8k/1.0.0)
Dataset gsm8k downloaded and prepared to /root/data/train/gsm8k/1.0.0. Subsequent calls will reuse this data.
Downloading and preparing dataset Unknown size (download: Unknown size, generated: Unknown size, total: Unknown size) to /root/data/test/gsm8k/1.0.0...

I0419 03:44:01.025207 137390531381056 download_manager.py:557] Downloading https://raw.githubusercontent.com/openai/grade-school-math/master/grade_school_math/data/test.jsonl into /root/data/test/downloads/gsm8k/raw.gith.com_open_grad-scho-math_mast_grrJS2RhFr7vT_IovyhSl4MxHhzrGEOw_UQms8tDg3BQE.jsonl.tmp.e2de604a140c45078e63b63cd8e5965e...
I0419 03:44:01.026622 137390531381056 download_manager.py:557] Downloading https://raw.githubusercontent.com/openai/grade-school-math/master/grade_school_math/data/test_socratic.jsonl into /root/data/test/downloads/gsm8k/raw.gith.com_open_grad-scho-math_mast_grISouESCebaH1iFx7XNBU9uiZeiQdv4QnXpBcKLD0n1U.jsonl.tmp.d1a63bad8e3a4d3687b6f6c380d04deb...
I0419 03:44:01.027750 137390531381056 download_manager.py:557] Downloading https://raw.githubusercontent.com/openai/grade-school-math/master/grade_school_math/data/train.jsonl into /root/data/test/downloads/gsm8k/raw.gith.com_open_grad-scho-math_mast_grb6OPNFhl2yOEHFlmWqZu3-9nNXfrgazn2uvcMFm7DsE.jsonl.tmp.44ed48f3732a4249ad8a3f2dda0c404e...
I0419 03:44:01.029275 137390531381056 download_manager.py:557] Downloading https://raw.githubusercontent.com/openai/grade-school-math/master/grade_school_math/data/train_socratic.jsonl into /root/data/test/downloads/gsm8k/raw.gith.com_open_grad-scho-math_mast_gr2MZGZhL1-jEHseV9KjGHESYiWQ5DuPqNnQzNNUrP-hU.jsonl.tmp.eed1274e9f084b76970759b757c8e560...
I0419 03:44:01.067353 137253112239872 download_manager.py:598] Skipping extraction for /root/data/test/downloads/gsm8k/raw.gith.com_open_grad-scho-math_mast_grrJS2RhFr7vT_IovyhSl4MxHhzrGEOw_UQms8tDg3BQE.jsonl (method=NO_EXTRACT).
I0419 03:44:01.071788 137253120632576 download_manager.py:598] Skipping extraction for /root/data/test/downloads/gsm8k/raw.gith.com_open_grad-scho-math_mast_grISouESCebaH1iFx7XNBU9uiZeiQdv4QnXpBcKLD0n1U.jsonl (method=NO_EXTRACT).
I0419 03:44:01.098393 137253095454464 download_manager.py:598] Skipping extraction for /root/data/test/downloads/gsm8k/raw.gith.com_open_grad-scho-math_mast_grb6OPNFhl2yOEHFlmWqZu3-9nNXfrgazn2uvcMFm7DsE.jsonl (method=NO_EXTRACT).
I0419 03:44:01.179533 137253103847168 download_manager.py:598] Skipping extraction for /root/data/test/downloads/gsm8k/raw.gith.com_open_grad-scho-math_mast_gr2MZGZhL1-jEHseV9KjGHESYiWQ5DuPqNnQzNNUrP-hU.jsonl (method=NO_EXTRACT).
Extraction completed...: 0 file [00:00, ? file/s]
Dl Size...: 8 MiB [00:00, 51.53 MiB/s]
Dl Completed...: 100%|██████████| 4/4 [00:00<00:00, 25.74 url/s]

I0419 03:44:01.591173 137390531381056 writer.py:431] Done writing /root/data/test/gsm8k/incomplete.F8SEB1_1.0.0/gsm8k-train.array_record*. Number of examples: 7473 (shards: [7473])
I0419 03:44:01.682210 137390531381056 writer.py:431] Done writing /root/data/test/gsm8k/incomplete.F8SEB1_1.0.0/gsm8k-test.array_record*. Number of examples: 1319 (shards: [1319])
I0419 03:44:02.106935 137390531381056 writer.py:431] Done writing /root/data/test/gsm8k/incomplete.F8SEB1_1.0.0/gsm8k-train_socratic.array_record*. Number of examples: 7473 (shards: [7473])
I0419 03:44:02.200880 137390531381056 writer.py:431] Done writing /root/data/test/gsm8k/incomplete.F8SEB1_1.0.0/gsm8k-test_socratic.array_record*. Number of examples: 1319 (shards: [1319])


I0419 03:44:02.202722 137390531381056 dataset_builder.py:892] Found random access formats: . Chose to use FileFormat.ARRAY_RECORD. Overriding file format in the dataset info.
I0419 03:44:02.252399 137390531381056 train_rl.py:416] Creating reference model and also meshes for reference and rollout
I0419 03:44:02.255696 137390531381056 maxtext_utils.py:1631] Num_devices: 32, shape (1, 1, 1, 32, 1, 1, 1, 1, 1, 1, 1, 1, 1)
I0419 03:44:02.356034 137390531381056 maxtext_utils.py:1631] Num_devices: 32, shape (1, 1, 1, 32, 1, 1, 1, 1, 1, 1, 1, 1, 1)
I0419 03:44:02.465927 137390531381056 maxtext_utils.py:1631] Num_devices: 32, shape (1, 1, 1, 32, 1, 1, 1, 1, 1, 1, 1, 1, 1)
I0419 03:44:07.725217 137390531381056 maxtext_utils.py:1631] Num_devices: 32, shape (1, 1, 1, 32, 1, 1, 1, 1, 1, 1, 1, 1, 1)
I0419 03:44:07.725368 137390531381056 train_rl.py:430] Creating policy model with same config as reference model on trainer mesh
I0419 03:44:07.727586 137390531381056 maxtext_utils.py:1631] Num_devices: 32, shape (1, 1, 1, 32, 1, 1, 1, 1, 1, 1, 1, 1, 1)
I0419 03:44:07.781121 137390531381056 maxtext_utils.py:1631] Num_devices: 32, shape (1, 1, 1, 32, 1, 1, 1, 1, 1, 1, 1, 1, 1)
I0419 03:44:07.845533 137390531381056 maxtext_utils.py:1631] Num_devices: 32, shape (1, 1, 1, 32, 1, 1, 1, 1, 1, 1, 1, 1, 1)
I0419 03:44:08.341426 137390531381056 train_rl.py:700] Reference Model initialized successfully
XPK End: Sun Apr 19 03:49:07 UTC 2026
EXIT_CODE=143