first version of engram following modded nanogpt style

This commit is contained in:
Andrej Karpathy
2026-01-25 18:59:51 +00:00
parent 85b3e95e09
commit 59e36cc727
2 changed files with 58 additions and 8 deletions


@@ -1,11 +1,11 @@
 """
 Train model. From root directory of the project, run as:
-python -m scripts.base_train.py
+python -m scripts.base_train
 or distributed as:
-torchrun --nproc_per_node=8 -m scripts.base_train.py
+torchrun --nproc_per_node=8 -m scripts.base_train
 If you are only on CPU/Macbook, you'll want to train a much much smaller LLM. Example:
 python -m scripts.base_train --depth=4 --max-seq-len=512 --device-batch-size=1 --eval-tokens=512 --core-metric-every=-1 --total-batch-size=512 --num-iterations=20