Why wordfreq will not be updated

wordfreq will no longer be updated, for several reasons. The data it would draw on has been contaminated by generative AI, making it unreliable as a record of language usage after 2021: slop from large language models has skewed word frequencies. Sources once used for data collection, such as Twitter and Reddit, have restricted access or become unusable. The field of natural language processing is being overshadowed by generative AI controlled by companies like OpenAI and Google, a trend that the author, Robyn Speer, opposes. She no longer wants to be associated with projects that could be seen as supporting generative AI, and she hopes these companies face consequences for their actions.

https://github.com/rspeer/wordfreq/blob/master/SUNSET.md
