AI Models Are Undertrained by 100-1000 Times – AI Will Be Better With More Training Resources
The Chinchilla compute-optimal point for an 8B (8 billion parameter) model would be to train it for ~200B (billion) tokens (if you were only interested in getting the most "bang for the buck" w.r.t. model performance at that size). So this is training ~75X beyond that point, which is unusual, but personally, [Karpathy] thinks this is extremely welcome. (A quick sanity check of this arithmetic follows below.)
June 21, 2024
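As a back-of-envelope check of the quoted numbers, here is a minimal sketch in Python. Note the assumptions: the ~200B-token budget is taken from the quote, while the ~15T actual-token figure is inferred from the quoted ~75X multiplier (200B × 75) rather than stated in the excerpt itself.

```python
# Sanity check of the quoted Chinchilla arithmetic.
# Assumption: actual_tokens (~15T) is inferred from the ~75X multiplier,
# consistent with 200B * 75; it is not stated in the excerpt.

params = 8e9             # 8B-parameter model
optimal_tokens = 200e9   # ~200B tokens, the quoted compute-optimal budget
actual_tokens = 15e12    # ~15T tokens, inferred from the ~75X multiplier

tokens_per_param = optimal_tokens / params   # implied optimal ratio
multiplier = actual_tokens / optimal_tokens  # how far past the optimal point

print(f"Chinchilla-optimal ratio: ~{tokens_per_param:.0f} tokens/parameter")
print(f"Trained ~{multiplier:.0f}X beyond the compute-optimal point")
# -> Chinchilla-optimal ratio: ~25 tokens/parameter
# -> Trained ~75X beyond the compute-optimal point
```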