Abstract: We developed computational models to predict the emergence of depression and post-traumatic stress disorder (PTSD) in Twitter users. Twitter data and details of depression history were collected from 204 individuals (105 depressed, 99 healthy). We extracted predictive features measuring affect, linguistic style, and context from participant tweets (N = 279,951) and used them to build models with supervised learning algorithms. The resulting models successfully discriminated between depressed and healthy content, and compared favorably to general practitioners' average success rates in diagnosing depression, albeit in a separate population. Results held even when the analysis was restricted to content posted before first depression diagnosis. State-space temporal analysis suggests that onset of depression may be detectable from Twitter data several months prior to diagnosis. Predictive results were replicated with a separate sample of individuals diagnosed with PTSD (N_users = 174, N_tweets = 243,775). A state-space time series model revealed indicators of PTSD almost immediately post-trauma, often many months prior to clinical diagnosis. These methods suggest a data-driven, predictive approach for early screening and detection of mental illness.