ros/nanochat
mirror of https://github.com/karpathy/nanochat.git, synced 2026-01-29 20:12:03 +00:00
nanochat/scripts (master)
Andrej Karpathy
41bb2eac32
Combine AdamW and Muon into single MuonAdamW optimizer, cleaner, ty @chrisjmccormick for idea/help
2026-01-29 00:52:08 +00:00
base_eval.py
bugfix
2025-12-26 19:02:12 +08:00
base_loss.py
update the CPU/MPS script to give reasonable results. The model can at least answer that Paris is the capital of France and knows that the sky is blue, for about 40 minutes of training on my macbook. Also fixed a bug that existed due to KVCache bfloat16 dtype assumption
2026-01-17 12:27:30 -08:00
base_train.py
Combine AdamW and Muon into single MuonAdamW optimizer, cleaner, ty @chrisjmccormick for idea/help
2026-01-29 00:52:08 +00:00
chat_cli.py
upgrading all other files to be able to use cpu/mps as well as cuda. various minor other changes, e.g. changing max_iterations to num_iterations in sft script for consistency in naming
2025-10-20 10:15:17 -07:00
chat_eval.py
Fix args in readme (#438)
2026-01-15 16:26:38 -08:00
chat_rl.py
Combine AdamW and Muon into single MuonAdamW optimizer, cleaner, ty @chrisjmccormick for idea/help
2026-01-29 00:52:08 +00:00
chat_sft.py
Combine AdamW and Muon into single MuonAdamW optimizer, cleaner, ty @chrisjmccormick for idea/help
2026-01-29 00:52:08 +00:00
chat_web.py
ensure consistency of quotes within each statement
2025-11-03 21:52:02 +01:00
mid_train.py
Combine AdamW and Muon into single MuonAdamW optimizer, cleaner, ty @chrisjmccormick for idea/help
2026-01-29 00:52:08 +00:00
tok_eval.py
initial commit
2025-10-13 06:49:24 -07:00
tok_train.py
quick fix to not OOM main speedrun script
2026-01-26 22:31:42 +00:00