The Bluesky Social Dataset addresses the problem of misinformation in online social spaces by providing a comprehensive dataset of social interactions and user-generated content. It includes the post history of 4M users, covering 235M posts, along with data on follow, comment, repost, and quote interactions. Uniquely, Bluesky lets users create and bookmark content recommendation algorithms, which makes it possible to release the output of popular algorithms together with the associated interaction data. The dataset offers unprecedented insights into online behavior, human-machine engagement patterns, and the effects of content exposure and self-selection. Reduced access to social media APIs, a controversial trend, is hindering the advancement of computational social science.
https://zenodo.org/records/11082879