Predicting Personality from Book Preferences with User-Generated Content Labels
Psychological studies have shown that personality traits are associated with
book preferences. However, past findings are based on questionnaires focusing
on conventional book genres and are unrepresentative of niche content. For a
more comprehensive measure of book content, this study harnesses a massive
archive of content labels, also known as 'tags', created by users of an online
book catalogue, Goodreads.com. Combined with data on preferences and
personality scores collected from Facebook users, the tag labels achieve high
accuracy in personality prediction by psychological standards. We also group
tags into broader genres to check their validity against past findings. Our
results are robust across both tag and genre levels of analysis and consistent
with existing literature. Moreover, user-generated tag labels reveal unexpected
insights, such as cultural differences, book reading behaviors, and other
non-content factors affecting preferences. To our knowledge, this is the
largest study to date of the relationship between personality and book
content preferences.