ros/nanochat
mirror of https://github.com/karpathy/nanochat.git synced 2026-01-30 04:22:02 +00:00
a1ccb3dc0b7095620751498b8652a6d6647d8c01
nanochat/dev
Sofie Van Landeghem a1ccb3dc0b remove rust compilation as rustbpe is now installed from separate package (#416)
2026-01-08 06:18:37 -08:00
| File | Last commit message | Last commit date |
| --- | --- | --- |
| estimate_gpt3_core.ipynb | add notebook on deriving the CORE estimates for the GPT-3 miniseries. | 2026-01-05 18:40:28 +00:00 |
| gen_synthetic_data.py | sane secrets management | 2026-01-04 19:29:22 +00:00 |
| generate_logo.html | initial commit | 2025-10-13 06:49:24 -07:00 |
| LOG.md | delete grad_clip. appears to not be necessary at all. not only was it buggy because the clipping happened per gpu before grad synchronization, but it costs ~2% MFU, and it also doesn't even help. I tried deleting it a while ago and back then it did help. So I'm guessing that some hyperparameter tuning obviated the reason for it since then | 2026-01-08 02:16:50 +00:00 |
| nanochat.png | Update logo | 2025-10-14 14:19:44 -04:00 |
| repackage_data_reference.py | initial commit | 2025-10-13 06:49:24 -07:00 |
| runcpu.sh | remove rust compilation as rustbpe is now installed from separate package (#416) | 2026-01-08 06:18:37 -08:00 |
| scaling_analysis.ipynb | add notebook used for scaling laws analysis | 2026-01-07 22:28:53 +00:00 |