Abstract
Users of online social networks often use multiple identities. This paper investigates the possibility of identifying a user from his or her chat behavior in such a setting. We have collected a large corpus of multiparty chat records in Turkish, obtained from a multiplayer game database. The 978 most active users are selected according to their participation in game chat sessions. This corpus is used in a biometric identification experiment in which we seek each user among a gallery of users. Character matrices for each player are used as features, and re-centered local profiles and the cosine similarity measure are used as identification methods. We systematically assess the effect of text normalization on identification. We report comparative results, the best of which reach around 75% rank-1 accuracy for a gallery size of 978.
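The identification setup summarized above (per-user character features, recentering, cosine similarity, rank-1 lookup in a gallery) can be illustrated with a minimal sketch. It is not the paper's implementation: the use of character bigram frequencies, the recentering by subtracting the gallery-wide mean profile, and all function names (`char_bigram_profile`, `recenter`, `cosine`, `rank_users`) are assumptions made for illustration, and the toy chat lines are invented.

```python
from collections import Counter
from math import sqrt


def char_bigram_profile(text):
    """Relative frequencies of character bigrams (assumed feature choice)."""
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()} if total else {}


def recenter(profile, mean_profile):
    """Subtract the gallery-wide mean frequency from each bigram frequency."""
    keys = set(profile) | set(mean_profile)
    return {g: profile.get(g, 0.0) - mean_profile.get(g, 0.0) for g in keys}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(g, 0.0) for g, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_users(probe_text, gallery_texts):
    """Return gallery user ids sorted from most to least similar to the probe."""
    profiles = {u: char_bigram_profile(t) for u, t in gallery_texts.items()}
    # Mean profile over the gallery, used as the recentering reference.
    mean = Counter()
    for profile in profiles.values():
        for gram, freq in profile.items():
            mean[gram] += freq / len(profiles)
    probe = recenter(char_bigram_profile(probe_text), mean)
    scores = {u: cosine(probe, recenter(p, mean)) for u, p in profiles.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy gallery with two hypothetical users; the first id in the ranking
# is the rank-1 candidate for the probe text.
gallery = {"user_a": "selam nasilsin iyi misin", "user_b": "gg wp gg wp gg"}
ranking = rank_users("gg wp", gallery)
```

Rank-1 accuracy in this sketch would be the fraction of probe texts whose true author appears first in the returned ranking.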
Original language | English |
---|---|
Title of host publication | Proceedings - 2016 4th International Workshop on Biometrics and Forensics, IWBF 2016 |
Publisher | IEEE |
ISBN (Electronic) | 9781467394482 |
Publication status | Published - 7 Apr 2016 |
Event | 4th International Workshop on Biometrics and Forensics, IWBF 2016 - Limassol, Cyprus Duration: 3 Mar 2016 → 4 Mar 2016 |
Conference
Conference | 4th International Workshop on Biometrics and Forensics, IWBF 2016 |
---|---|
Country/Territory | Cyprus |
City | Limassol |
Period | 3/03/16 → 4/03/16 |
Funding
This work is supported by the Scientific and Technological Research Council of Turkey (TUBITAK) under project number 114E481 and by the Turkish Ministry of Development under TAM Project number DPT2007K120610.
Keywords
- Authorship recognition
- Chat biometrics
- Chat mining
- Machine learning
- Multiparty chat
- Text classification
- Text information retrieval