ros/nanochat
Mirror of https://github.com/karpathy/nanochat.git (synced 2026-01-30 04:22:02 +00:00)
4bcc3bb698b802766852c3ef1003f91d589f7b66
nanochat/scripts
svlandeg
4bcc3bb698
clarify comment
2025-11-21 13:19:45 +01:00
base_eval.py
add explicit UTF-8 encoding
2025-11-03 21:27:12 +01:00
base_loss.py
many small tweaks. base, eval, core work now i think
2025-10-16 15:46:18 -07:00
base_train.py
big change: add pretraining resumption logic so that checkpoints can now be approximately resumed and training can continue. this is useful for very long runs when you don't want the anxiety of your run crashing for some reason. alternatively, it's a way to recover training in the event of loss spikes. i mean, this should have been there in v0 but it's ok. the resumption is approximate to control complexity and bloat, but it's possible we want to change that in the future. to use, set --save_every to a step interval to write checkpoints with, and then use --resume_from_step to resume optimization from a given step. only base model training (pretraining) supports this atm, but it's ok because midtraining is comparably quite a bit faster.
2025-11-13 15:34:40 +00:00
chat_cli.py
upgrading all other files to be able to use cpu/mps as well as cuda. various minor other changes ,e.g. changing max_iterations to num_iterations in sft script for consistency in naming
2025-10-20 10:15:17 -07:00
chat_eval.py
fix typos
2025-11-14 11:20:25 +01:00
chat_rl.py
typo fixes in scripts
2025-10-28 20:17:31 +01:00
chat_sft.py
fix typo
2025-10-29 19:48:34 +01:00
chat_web.py
ensure consistency of quotes within each statement
2025-11-03 21:52:02 +01:00
mid_train.py
clarify comment
2025-11-21 13:19:45 +01:00
tok_eval.py
initial commit
2025-10-13 06:49:24 -07:00
tok_train.py
initial commit
2025-10-13 06:49:24 -07:00
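
The base_train.py commit message above describes writing checkpoints every --save_every steps and continuing from one with --resume_from_step. The following is a minimal, self-contained sketch of that general pattern, not the code in base_train.py. Only the two flag names come from the commit message; the train function, the ckpt_dir argument, the JSON file layout, and the one-float "model" are hypothetical stand-ins chosen for illustration. Unlike the resumption described in the commit message, which is approximate, this toy loop restores its full state exactly.

```python
import json
import os
import tempfile


def train(num_steps, save_every, ckpt_dir, resume_from_step=None):
    """Toy loop (hypothetical): the 'model' is one float that grows by 0.5 per step."""
    state = {"step": 0, "weight": 0.0}
    if resume_from_step is not None:
        # Load the checkpoint that was written at the requested step.
        path = os.path.join(ckpt_dir, f"step_{resume_from_step:06d}.json")
        with open(path, encoding="utf-8") as f:
            state = json.load(f)

    while state["step"] < num_steps:
        state["weight"] += 0.5  # stand-in for one optimizer update
        state["step"] += 1
        # Write a checkpoint at every multiple of save_every.
        if save_every > 0 and state["step"] % save_every == 0:
            path = os.path.join(ckpt_dir, f"step_{state['step']:06d}.json")
            with open(path, "w", encoding="utf-8") as f:
                json.dump(state, f)
    return state


ckpt_dir = tempfile.mkdtemp()
# An uninterrupted run that checkpoints at steps 4 and 8.
full = train(num_steps=10, save_every=4, ckpt_dir=ckpt_dir)
# A second run that picks up from the step-8 checkpoint and finishes the last two steps.
resumed = train(num_steps=10, save_every=4, ckpt_dir=ckpt_dir, resume_from_step=8)
```

In this sketch the resumed run ends in the same state as the uninterrupted one because the checkpoint holds everything the loop depends on. A real training script would also need to account for optimizer state, data-loader position, and random number generators.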