Posts by jmb@mstdn.party
Post #ATdxVTbOPP147NtO1A by jmb@mstdn.party
2023-03-15T15:14:10Z
0 likes, 0 repeats
@TedUnderwood "... past 1T parameters (20T tokens), training data collection would naturally have to rely on alternative text-based and multimodal content. ... Fundamentally, it should not be an incredibly onerous process to collect petabytes of high-quality and filtered multimodal data (converted to text), though that task has not yet been accomplished by any AI lab to date (Jun/2022)." - https://lifearchitect.ai/chinchilla/