Abstract |
Suicide is a widespread global concern, particularly among young people with high social media usage. Detecting suicidal tendencies early can enable crucial support. This study investigates the use of Natural Language Processing (NLP) techniques, including language models, to identify suicide-related content. By training a language model on a dataset containing both public content and social media posts from individuals with documented suicidal tendencies, we aim to develop a tool that recognizes language patterns suggestive of potential suicide risk. We also integrate the LIME explainer to provide local interpretability of the model's predictions. The paper compares diverse methodologies, from classical machine learning to state-of-the-art Large Language Models (LLMs) such as BERT, RoBERTa, and DistilBERT. Notably, the DistilBERT model outperforms its more complex counterparts, achieving a validation accuracy of 0.97741 and a test accuracy of 0.97584. Adding the explainer improves model transparency by shedding light on the model's decision process. Our findings emphasize the potential of advanced NLP techniques for understanding and addressing suicide-related content, benefiting mental health professionals and digital platforms that strive to offer timely support and intervention. |