Computer-based personality judgments are more accurate than those made by humans


  1. Edited by David Funder, University of California, Riverside, CA, and accepted by the Editorial Board December 2, 2014 (received for review September 28, 2014)


Significance

This study compares the accuracy of personality judgment—a ubiquitous and important social-cognitive activity—between computer models and humans. Using several criteria, we show that computers' judgments of people's personalities based on their digital footprints are more accurate and valid than judgments made by their close others or acquaintances (friends, family, spouse, colleagues, etc.). Our findings highlight that people's personalities can be predicted automatically and without involving human social-cognitive skills.

Abstract

Judging others' personalities is an essential skill in successful social living, as personality is a key driver behind people's interactions, behaviors, and emotions. Although accurate personality judgments stem from social-cognitive skills, developments in machine learning show that computer models can also make valid judgments. This study compares the accuracy of human and computer-based personality judgments, using a sample of 86,220 volunteers who completed a 100-item personality questionnaire. We show that (i) computer predictions based on a generic digital footprint (Facebook Likes) are more accurate (r = 0.56) than those made by the participants' Facebook friends using a personality questionnaire (r = 0.49); (ii) computer models show higher interjudge agreement; and (iii) computer personality judgments have higher external validity when predicting life outcomes such as substance use, political attitudes, and physical health; for some outcomes, they even outperform the self-rated personality scores. Computers outpacing humans in personality judgment presents significant opportunities and challenges in the areas of psychological assessment, marketing, and privacy.

  • personality judgment
  • social media
  • computational social science
  • artificial intelligence
  • big data

Perceiving and judging other people's personality traits is an essential component of social living (1, 2). People use personality judgments to make day-to-day decisions and long-term plans in their personal and professional lives, such as whom to befriend, marry, trust, hire, or elect as president (3). The more accurate the judgment, the better the decision (2, 4, 5). Previous research has shown that people are fairly good at judging each other's personalities (6–8); for example, even complete strangers can make valid personality judgments after watching a short video presenting a sample of behavior (9, 10).

Although it is typically believed that accurate personality perceptions stem from the social-cognitive skills of the human brain, recent developments in machine learning and statistics show that computer models are also capable of making valid personality judgments by using digital records of human behavior (11–13). However, the comparative accuracy of computer and human judgments remains unknown; this study addresses this gap.

Personality traits, like many other psychological dimensions, are latent and cannot be measured directly; diverse perspectives exist regarding the evaluation criteria of judgmental accuracy (3, 5). We adopted the realistic approach, which assumes that personality traits represent real individual characteristics, and that the accuracy of personality judgments may be benchmarked using three key criteria: self-other agreement, interjudge agreement, and external validity (1, 5, 7). We applied these benchmarks to a sample of 86,220 volunteers,* who filled in the 100-item International Personality Item Pool (IPIP) Five-Factor Model of personality (14) questionnaire (15), measuring the traits of openness, conscientiousness, extraversion, agreeableness, and neuroticism.

Computer-based personality judgments, based on Facebook Likes, were obtained for 70,520 participants. Likes were previously shown to successfully predict personality and other psychological traits (11). We used LASSO (Least Absolute Shrinkage and Selection Operator) linear regressions (16) with 10-fold cross-validation, so that judgments for each participant were made using models developed on a different subsample of participants and their Likes. Likes are used by Facebook users to express positive association with online and offline objects, such as products, activities, sports, musicians, books, restaurants, or websites. Given the variety of objects, subjects, brands, and people that can be liked and the number of Facebook users (>1.3 billion), Likes represent one of the most generic kinds of digital footprint. For example, liking a brand or a product offers a proxy for consumer preferences and purchasing behavior; music-related Likes reveal music taste; and liked websites allow for approximating web browsing behavior. Consequently, Like-based models offer a good proxy of what could be achieved based on a wide range of other digital footprints such as web browsing logs, web search queries, or purchase records (11).
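
The cross-validated modeling procedure can be sketched as follows. This is an illustrative outline, not the study's code: the data are randomly generated stand-ins for the Likes matrix and questionnaire scores, and the penalty `alpha=0.01` is an arbitrary choice rather than a tuned value.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)

# Stand-in data: a sparse binary participant-by-Like matrix and one
# self-rated trait score per participant.
n_users, n_likes = 1000, 500
X = (rng.random((n_users, n_likes)) < 0.05).astype(float)
weights = rng.normal(size=n_likes) * (rng.random(n_likes) < 0.1)
y = X @ weights + rng.normal(size=n_users)

# 10-fold cross-validation: each participant's judgment comes from a model
# trained on the other nine folds, mirroring the procedure in the text.
judgments = cross_val_predict(Lasso(alpha=0.01), X, y, cv=10)

# Accuracy is the correlation between predicted and self-rated scores.
accuracy = np.corrcoef(y, judgments)[0, 1]
```

In the study, one such model was fit per Big Five trait, with accuracy expressed as the correlation between cross-validated predictions and self-ratings.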

Human personality judgments were obtained from the participants' Facebook friends, who were asked to describe a given participant using a 10-item version of the IPIP personality measure. To compute self-other agreement and external validity, we used a sample of 17,622 participants judged by one friend; to calculate interjudge agreement, we used a sample of 14,410 participants judged by two friends. A diagram illustrating the methods is presented in Fig. 1.

Results

Self-Other Agreement.

The primary criterion of judgmental accuracy is self-other agreement: the extent to which an external judgment agrees with the target's self-rating (17), commonly operationalized as a Pearson product-moment correlation. Self-other agreement was determined by correlating participants' scores with the judgments made by humans and computer models (Fig. 1). Since self-other agreement varies greatly with the length and context of the relationship (18, 19), we further compared our results with those previously published in a meta-analysis by Connelly and Ones (20), including estimates for different categories of human judges: friends, spouses, family members, cohabitants, and work colleagues.

To account for the questionnaires' measurement error, self-other agreement estimates were disattenuated using the scales' Cronbach's α reliability coefficients. The measurement error of the computer model was assumed to be 0, resulting in lower (conservative) estimates of self-other agreement for computer-based judgments. Also, disattenuation allowed for direct comparisons of human self-other agreement with the values reported by Connelly and Ones (20), who followed the same procedure.
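
The disattenuation step follows the classical correction for attenuation; a minimal sketch, with made-up reliability values rather than the study's α coefficients:

```python
import math

def disattenuate(r_xy, alpha_x, alpha_y):
    """Correct an observed correlation for measurement error using the
    two scales' reliability coefficients: r / sqrt(alpha_x * alpha_y)."""
    return r_xy / math.sqrt(alpha_x * alpha_y)

# Illustrative numbers (not from the study): an observed self-other
# correlation of 0.42 with reliabilities of 0.85 (self) and 0.80 (judge).
r_human = disattenuate(0.42, 0.85, 0.80)     # about 0.509

# Setting the computer model's reliability to 1 (measurement error assumed
# to be 0) divides by a larger number, yielding the conservative estimate.
r_computer = disattenuate(0.52, 0.85, 1.0)   # about 0.564
```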

The results presented in Fig. 2 show that computers' average accuracy across the Big Five traits (red line) grows steadily with the number of Likes available on the participant's profile (x axis). Computer models need just 100 Likes to outperform an average human judge in the present sample (r = 0.49; blue point). Compared with the accuracy of the various human judges reported in the meta-analysis (20), computer models need 10, 70, 150, and 300 Likes, respectively, to outperform an average work colleague, cohabitant or friend, family member, and spouse (gray points). Detailed results for human judges can be found in Table S1.

Fig. 2.


Computer-based personality judgment accuracy (y axis), plotted against the number of Likes available for prediction (x axis). The red line represents the average accuracy (correlation) of computers' judgments across the five personality traits. The five-trait average accuracy of human judgments is positioned onto the computer accuracy curve. For example, the accuracy of an average human judge (r = 0.49) is matched by that of computer models based on around 90–100 Likes. The computer accuracy curves are smoothed using a LOWESS approach. The gray ribbon represents the 95% CI. Accuracy was averaged using Fisher's r-to-z transformation.
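
The averaging mentioned in the caption can be sketched as follows; the per-trait accuracies below are made-up illustrative values, not the study's:

```python
import math

def average_correlations(rs):
    """Average correlations by converting each r to Fisher's z (atanh),
    taking the mean in z space, and converting back (tanh)."""
    mean_z = sum(math.atanh(r) for r in rs) / len(rs)
    return math.tanh(mean_z)

# Hypothetical accuracies for the five traits at one Likes count.
r_avg = average_correlations([0.45, 0.50, 0.55, 0.60, 0.65])  # about 0.554
```

Averaging in z space avoids the bias introduced by averaging bounded r values directly.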

How accurate is the computer, given an average person? Our recent estimate of the average number of Likes per individual is 227 (95% CI = 224, 230), and the expected computer accuracy for this number of Likes equals r = 0.56. This accuracy is significantly better than that of an average human judge (z = 3.68, P < 0.001) and comparable with that of an average spouse, the best of the human judges (r = 0.58, z = −1.68, P = 0.09). The peak computer performance observed in this study reached r = 0.66 for participants with more than 500 Likes. The approximately log-linear relationship between the number of Likes and computer accuracy, shown in Fig. 2, suggests that increasing the amount of signal beyond what was available in this study could further boost the accuracy, although the gains are expected to be diminishing.
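
The z statistics above compare correlations after the same r-to-z transformation; a generic two-sample test for independent correlations looks like this (the sample sizes below are placeholders, so the result does not reproduce the reported z values):

```python
import math

def fisher_z_test(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations,
    using Fisher's r-to-z transformation and its 1/(n - 3) variance."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / se

# With hypothetical groups of 1,000 judgments each, the difference between
# r = 0.56 and r = 0.49 is significant at conventional levels (z > 1.96).
z = fisher_z_test(0.56, 1000, 0.49, 1000)  # about 2.16
```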

Why are Likes diagnostic of personality? Exploring the Likes most predictive of a given trait shows that they represent activities, attitudes, and preferences highly aligned with the Big Five theory. For example, participants with high openness to experience tend to like Salvador Dalí, meditation, or TED talks; participants with high extraversion tend to like partying, Snooki (reality show star), or dancing.

Self-other agreement estimates for the individual Big Five traits (Fig. 2) reveal that the Likes-based models are more diagnostic of some traits than of others. Particularly high accuracy was observed for openness—a trait known to be otherwise hard to judge due to its low observability (21, 22). This finding is consistent with previous research showing that strangers' personality judgments, based on digital footprints such as the contents of personal websites (23), are particularly accurate in the case of openness. As openness is largely expressed through individuals' interests, preferences, and values, we argue that the digital environment provides a wealth of relevant clues presented in a highly observable way.

Interestingly, it seems that human and computer judgments capture distinct components of personality. Table S2 lists correlations and partial correlations (all disattenuated) between self-ratings, computer judgments, and human judgments, based on a subsample of participants (n = 1,919) for whom both computer and human judgments were available. The average consensus between computer and human judgments (r = 0.37) is relatively high, but it is mostly driven by their correlations with self-ratings, as represented by the low partial correlation (r = 0.07) between computer and human judgments. Substantial partial correlations between self-ratings and both computer (r = 0.38) and human judgments (r = 0.42) suggest that computer and human judgments each provide unique information.
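
The partial-correlation logic can be illustrated with simulated data: two judgments that share signal only through the self-rating correlate with each other overall, yet show almost no partial correlation once self-ratings are controlled for. The data below are synthetic, not the study's:

```python
import numpy as np

def partial_corr(x, y, z):
    """First-order partial correlation between x and y, controlling for z."""
    rxy = np.corrcoef(x, y)[0, 1]
    rxz = np.corrcoef(x, z)[0, 1]
    ryz = np.corrcoef(y, z)[0, 1]
    return (rxy - rxz * ryz) / np.sqrt((1 - rxz**2) * (1 - ryz**2))

rng = np.random.default_rng(1)
self_rating = rng.normal(size=5000)              # shared signal
computer = self_rating + rng.normal(size=5000)   # judgment + own error
human = self_rating + rng.normal(size=5000)      # judgment + independent error

raw = np.corrcoef(computer, human)[0, 1]                 # about 0.5
controlled = partial_corr(computer, human, self_rating)  # near 0
```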

Interjudge Agreement.

Another indication of judgment accuracy, interjudge agreement, builds on the notion that two judges who agree with each other are more likely to be accurate than those who do not (3, 24–26).

The interjudge agreement for humans was computed using a subsample of 14,410 participants judged by two friends. As the judgments were aggregated (averaged) on collection (i.e., we did not store the judgments separately for each judge), a formula was used to compute their intercorrelation (SI Text). Interjudge agreement for computer models was estimated by randomly splitting the Likes into two halves and developing two separate models following the procedure described in the previous section.
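
The split-half procedure for the computer models can be sketched on synthetic data in which a latent trait drives both the Likes and the self-rated score, so that the two halves carry overlapping signal; ordinary least squares stands in here for the study's LASSO models:

```python
import numpy as np

rng = np.random.default_rng(2)
n_users, n_likes = 2000, 200

# Stand-in data: a latent trait raises the probability of each Like and
# also drives the self-rated score.
trait = rng.normal(size=n_users)
loadings = rng.uniform(0.5, 1.5, size=n_likes)
like_prob = 1 / (1 + np.exp(-(np.outer(trait, loadings) - 2)))
X = (rng.random((n_users, n_likes)) < like_prob).astype(float)
y = trait + rng.normal(size=n_users)

# Randomly split the Likes into two halves and fit one model per half.
cols = rng.permutation(n_likes)
half_a, half_b = cols[:n_likes // 2], cols[n_likes // 2:]
pred_a = X[:, half_a] @ np.linalg.lstsq(X[:, half_a], y, rcond=None)[0]
pred_b = X[:, half_b] @ np.linalg.lstsq(X[:, half_b], y, rcond=None)[0]

# The correlation between the two models' predictions is the computer
# analogue of interjudge agreement.
agreement = np.corrcoef(pred_a, pred_b)[0, 1]
```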

The average consensus between computer models, expressed as the Pearson product-moment correlation across the Big Five traits (r = 0.62), was much higher than the estimate for human judges observed in this study (r = 0.38, z = 36.8, P < 0.001) or in the meta-analysis (20) (r = 0.41, z = 41.99, P < 0.001). All results were corrected for attenuation.

External Validity.

The third measure of judgment accuracy, external validity, focuses on how well a judgment predicts external criteria, such as real-life behavior, behaviorally related traits, and life outcomes (3). Participants' self-rated personality scores, as well as humans' and computers' judgments, were entered into regression models (linear or logistic for continuous and dichotomous variables, respectively) to predict 13 life outcomes and traits previously shown to be related to personality: life satisfaction, depression, political orientation, self-monitoring, impulsivity, values, sensational interests, field of study, substance use, physical health, social network characteristics, and Facebook activities (see Table S3 for detailed descriptions). The accuracy of those predictions, or external validity, is expressed as Pearson product-moment correlations for continuous variables, or the area under the receiver-operating characteristic curve (AUC) for dichotomous variables.§
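
For the dichotomous outcomes, AUC can be computed directly from its probabilistic interpretation; a minimal sketch with toy scores (not study data):

```python
def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case is scored above a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: six participants, three per class; 8 of the 9 positive-negative
# pairs are ordered correctly, so AUC = 8/9.
score = auc([0.9, 0.8, 0.4, 0.7, 0.3, 0.2], [1, 1, 1, 0, 0, 0])
```

This quadratic-time form is fine for illustration; production code would typically use a rank-based formula or a library routine.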

As shown in Fig. 3, the external validity of the computer judgments was higher than that of the human judges for 12 of the 13 criteria (all except life satisfaction). Furthermore, the computer models' external validity was even better than that of self-rated personality for 4 of the 13 criteria: Facebook activities, substance use, field of study, and network size; and it was comparable in predicting political attitudes and social network characteristics. Because most of the outcome variables are self-reports, the high external validity of personality self-ratings is to be expected. It is therefore striking that Likes-based judgments were nonetheless better at predicting variables such as field of study or self-rated substance use, despite those variables sharing more method variance with the self-ratings of personality. In addition, the computer-based models were aimed at predicting personality scores, not life outcomes. In fact, Likes-based models directly aimed at predicting such variables can achieve even higher accuracy (11).

Fig. 3.


The external validity of personality judgments and self-ratings across the range of life outcomes, expressed as correlation (continuous variables; Upper) or AUC (dichotomous variables; Lower). The red, yellow, and blue bars indicate the external validity of self-ratings, human judgments, and computer judgments, respectively. For example, self-rated scores allow predicting network size with an accuracy of r = 0.23; human judgments achieve r = 0.17 accuracy (or 0.06 less than self-ratings), whereas computer-based judgments achieve r = 0.24 accuracy (or 0.01 more than self-ratings). Compound variables (i.e., variables representing accuracy averaged across a few subvariables) are marked with an asterisk; see Table S4 for detailed results. Results are ordered by computer accuracy.

Discussion

Our results show that computer-based models are significantly more accurate than humans in a core social-cognitive task: personality judgment. Computer-based judgments (r = 0.56) correlate more strongly with participants' self-ratings than average human judgments do (r = 0.49). Moreover, computer models showed higher interjudge agreement and higher external validity (computer-based personality judgments were better at predicting life outcomes and other behaviorally related traits than human judgments). The potential growth in both the sophistication of the computer models and the amount of the digital footprint might lead to computer models outperforming humans even more decisively.

According to the Realistic Accuracy Model, the accuracy of a personality judgment depends on the availability and the amount of relevant behavioral information, along with the judge's ability to detect and use it correctly (1, 2, 5). Such a conceptualization reveals a couple of major advantages that computers have over humans. First, computers have the capacity to store a tremendous amount of information, which is difficult for humans to retain and access. Second, the way computers use information—through statistical modeling—generates consistent algorithms that optimize judgmental accuracy, whereas humans are affected by various motivational biases (27). Nevertheless, human perceptions have the advantage of being flexible and able to capture many hidden cues unavailable to machines. Because the Big Five personality traits only represent some aspects of human personality, human judgments might still be better at describing other traits that require subtle cognition or that are less evident in digital behavior. Our study is limited in that the human judges could only describe the participants using a 10-item questionnaire on the Big Five traits. In reality, they might have more knowledge than what was assessed in the questionnaire.

Automated, accurate, and cheap personality assessment tools could affect society in many ways: marketing messages could be tailored to users' personalities; recruiters could better match candidates with jobs based on their personality; products and services could adjust their behavior to best match their users' characters and changing moods; and scientists could collect personality data without burdening participants with lengthy questionnaires. Furthermore, in the future, people might abandon their own psychological judgments and rely on computers when making important life decisions, such as choosing activities, career paths, or even romantic partners. It is possible that such data-driven decisions will improve people's lives.

However, knowledge of people's personalities can also be used to manipulate and influence them (28). Understandably, people might distrust or reject digital technologies after realizing that their government, internet provider, web browser, online social network, or search engine can infer their personal characteristics more accurately than their closest family members. We hope that consumers, technology developers, and policy-makers will tackle those challenges by supporting privacy-protecting laws and technologies, and by giving users full control over their digital footprints.

Popular culture has depicted robots that surpass humans in making psychological inferences. In the film Her, for example, the main character falls in love with his operating system. By curating and analyzing his digital records, his computer can understand and respond to his thoughts and needs much better than other humans, including his long-term girlfriend and closest friends. Our research, along with developments in robotics (29, 30), provides empirical evidence that such a scenario is becoming increasingly likely as tools for digital assessment come to maturity. The ability to accurately assess psychological traits and states, using digital footprints of behavior, marks an important milestone on the path toward more social human-computer interactions.

Acknowledgments

We thank John Rust, Thore Graepel, Patrick Morse, Vesselin Popov, Winter Mason, Jure Leskovec, Isabelle Abraham, and Jeremy Peang-Meth for their critical reading of the manuscript. W.Y. was supported by the Jardine Foundation; D.S. was supported by a grant from the Richard Benjamin Trust; and M.K. was supported by Microsoft Research, Boeing Corporation, the National Science Foundation, the Defense Advanced Research Projects Agency, and the Center for the Study of Language and Information at Stanford University.

Footnotes

  • Author contributions: W.Y. and M.K. designed research; W.Y., M.K., and D.S. performed research; W.Y. and M.K. contributed new reagents/analytic tools; W.Y. and M.K. analyzed data; and W.Y., M.K., and D.S. wrote the paper.

  • Conflict of interest statement: D.S. received revenue as the owner of the myPersonality Facebook application.

  • This article is a PNAS Direct Submission. D.F. is a guest editor invited by the Editorial Board.

  • Data deposition: The data used in the study are shared with the academic community at mypersonality.org.

  • This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1418680112/-/DCSupplemental.

  • ↵*The sample used in this study was obtained from the myPersonality project. myPersonality was a popular Facebook application that offered its users psychometric tests and feedback on their scores. Since the data are secondary, anonymized, were previously published in the public domain, and were originally gathered with explicit opt-in consent for reuse for research purposes beyond the original project, no IRB approval was needed. This was additionally confirmed by the Psychology Research Ethics Committee at the University of Cambridge.

  • This figure is very close to the average human accuracy (r = 0.48) found in Connelly and Ones's meta-analysis (20).

  • Estimate based on a 2014 sample of n = 100,001 Facebook users collected for a separate project. The sample used in this study was recorded in the years 2009–2012.

  • §AUC is equivalent to the probability of correctly classifying two randomly selected participants, one from each class, such as liberal vs. conservative political views. Note that for dichotomous variables, the random-guessing baseline corresponds to an AUC = 0.50.

Freely available online through the PNAS open access option.