MBTI From Twitter Profiles

The MBTI (Myers-Briggs Type Indicator) assesses personality through psychometric questionnaires and distinguishes sixteen possible personality types, defined by four characteristics that can each take one of two alternative forms:

  • Introversion (I) / Extroversion (E)
  • Sensing (S) / Intuition (N)
  • Thinking (T) / Feeling (F)
  • Judgment (J) / Perception (P)
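
To make the arithmetic concrete: four characteristics with two options each give 2^4 = 16 types. Here is a minimal Python sketch that enumerates them using the standard MBTI letters (I/E, S/N, T/F, J/P). The names DICHOTOMIES, MBTI_TYPES and to_binary_labels are my own illustrative choices, and the 0/1 encoding is just one possible way to represent a type, not necessarily the one used in the project.

```python
from itertools import product

# The four MBTI dichotomies, each with its two possible letters.
DICHOTOMIES = [("I", "E"), ("S", "N"), ("T", "F"), ("J", "P")]

# A type is one letter picked from each dichotomy: 2**4 = 16 combinations.
MBTI_TYPES = ["".join(letters) for letters in product(*DICHOTOMIES)]


def to_binary_labels(mbti_type: str) -> list[int]:
    """Hypothetical helper: encode a type as four 0/1 labels, one per dichotomy.

    0 means the first letter of the pair (I, S, T, J), 1 the second (E, N, F, P).
    """
    mbti_type = mbti_type.upper()
    if mbti_type not in MBTI_TYPES:
        raise ValueError(f"not a valid MBTI type: {mbti_type!r}")
    return [pair.index(letter) for pair, letter in zip(DICHOTOMIES, mbti_type)]


if __name__ == "__main__":
    print(len(MBTI_TYPES), MBTI_TYPES)
    print(to_binary_labels("INTJ"))
```

Seen this way, guessing a type can be treated either as picking one of sixteen classes or as answering four separate yes/no questions, one per characteristic.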

So I wondered whether it was possible to identify a user's personality from what they write publicly on forums and social networks (Reddit, Facebook, Twitter, …) without making them go through a questionnaire (which is really boring...).

This summer, thanks to the Big Data Computing course, I got to run this experiment and learn more about the #Spark framework and the #SparkNLP library.

I won't spoil the results; I invite you to have a look at the project, which is documented in a Python notebook and slides (you can find everything here).

Tiny spoiler: the final test was run on some rather particular characters 😄