nanochat/scripts at 306bc380ab62b9adb82f71d1c4eb606428329bbd

ros/nanochat

karpathy 306bc380ab add support for CPU and for MPS. I had to change a few cosmetic things. I also discovered I think a bit of a bug, where I was casting wte to bfloat16 in the wrong place (the model init) instead of in init_weights

2025-10-16 10:04:43 -07:00

base_eval.py

initial commit

2025-10-13 06:49:24 -07:00

base_loss.py

initial commit

2025-10-13 06:49:24 -07:00

base_train.py

add support for CPU and for MPS. I had to change a few cosmetic things. I also discovered I think a bit of a bug, where I was casting wte to bfloat16 in the wrong place (the model init) instead of in init_weights

2025-10-16 10:04:43 -07:00

chat_cli.py

initial commit

2025-10-13 06:49:24 -07:00

chat_eval.py

initial commit

2025-10-13 06:49:24 -07:00

chat_rl.py

initial commit

2025-10-13 06:49:24 -07:00

chat_sft.py

dont evaluate the sampling evals during SFT they are too slow. keep the multiple choice evals. delete unused imports

2025-10-15 16:42:23 +00:00

chat_web.py

also allow regenerating assistant message by clicking it, and make sure to feed good seed to generate

2025-10-16 01:28:37 +00:00

mid_train.py

fix bug in learning rate multiplier, it was ramping up instead of ramping down. see more in Issue #68. also add --dry_run option useful for experimentation

2025-10-15 16:35:04 +00:00

tok_eval.py

initial commit

2025-10-13 06:49:24 -07:00

tok_train.py

initial commit

2025-10-13 06:49:24 -07:00