Posted by msmash on Thursday December 12, 2024 @02:35AM from the moving-forward dept.

Harvard University announced Thursday it's releasing a high-quality dataset of nearly one million public-domain books that could be used by anyone to train large language models and other AI tools. From a report: The dataset was created by Harvard's newly formed Institutional Data Initiative with funding from bot