Examining Multiple Features for Author Profiling
DOI:
https://doi.org/10.5753/jidm.2014.1543
Keywords:
author profiling, classification, text mining, information retrieval
Abstract
Authorship analysis aims at classifying texts based on the stylistic choices of their authors. The idea is to discover characteristics of the authors of the texts.
This task has a growing importance in forensics, security, and marketing.
In this work, we focus on discovering age and gender from blog authors.
With this goal in mind, we analyzed a large number of features -- ranging from Information Retrieval to Sentiment Analysis.
This paper reports on the usefulness of these features.
Experiments on a corpus of over 236K blogs show that a classifier using the features explored here outperforms the state of the art.
More importantly, the experiments show that the Information Retrieval features proposed in our work are the most discriminative and yield the best class predictions.
Published
2014-10-02
How to Cite
Weren, E. R. D., Kauer, A. U., Mizusaki, L., Moreira, V. P., de Oliveira, J. P. M., & Wives, L. K. (2014). Examining Multiple Features for Author Profiling. Journal of Information and Data Management, 5(3), 266. https://doi.org/10.5753/jidm.2014.1543
Section
SBBD Articles