Examining the performance of classification algorithms for imbalanced data sets in web author identification

Vorobeva, Alisa A.

Proceedings of the xxth Conference of Open Innovations Association FRUCT 2016, Vol. 664, pp. 385-390

vorobeva2016examiningproceedings

Abstract

Individuals, criminals, and even terrorist organizations can use web communication for criminal purposes, and to avoid prosecution they try to hide their identity. To increase the level of safety on the Web, author (or web-user) identification and authentication procedures have to be improved. In web author identification, imbalanced data sets arise rather frequently: the number of texts by one author significantly exceeds the number of texts by the others. This situation is common on the modern web: social networks, blogs, emails, etc. Author identification is a kind of classification task. To develop methods, techniques, and tools for web author identification, the performance of classification algorithms on imbalanced data sets has to be examined. In this work, several modern classification algorithms were tested on data sets with various levels of class imbalance and different numbers of available web posts. The best accuracy in all experiments was achieved with the Random Forest algorithm.
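The experimental idea in the abstract (varying the level of class imbalance and then scoring a classifier) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the author names, the synthetic posts, the imbalance ratios, the helper `subsample_imbalanced`, and the trivial majority-class baseline that stands in for a real classifier. The abstract reports accuracy only; macro-averaged recall is added here purely to show why accuracy alone can look good on imbalanced data.

```python
import random
from collections import Counter


def subsample_imbalanced(posts_by_author, majority_author, ratio, minority_size, seed=0):
    """Build (text, author) pairs in which the majority author has
    ratio * minority_size posts and every other author has minority_size.
    Hypothetical helper, not the paper's sampling procedure."""
    rng = random.Random(seed)
    sample = []
    for author, posts in posts_by_author.items():
        n = minority_size * ratio if author == majority_author else minority_size
        sample.extend((text, author) for text in rng.sample(posts, n))
    return sample


def accuracy(y_true, y_pred):
    """Fraction of posts attributed to the correct author."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_recall(y_true, y_pred):
    """Recall computed per author, then averaged with equal weight."""
    recalls = []
    for label in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Synthetic stand-in corpus: three hypothetical authors, 100 posts each.
posts_by_author = {
    "author_a": [f"a-post-{i}" for i in range(100)],
    "author_b": [f"b-post-{i}" for i in range(100)],
    "author_c": [f"c-post-{i}" for i in range(100)],
}

results = {}
for ratio in (1, 2, 5, 10):
    data = subsample_imbalanced(posts_by_author, "author_a", ratio, minority_size=10)
    y_true = [author for _, author in data]
    # Baseline that always predicts the most frequent author.
    majority = Counter(y_true).most_common(1)[0][0]
    y_pred = [majority] * len(y_true)
    results[ratio] = (accuracy(y_true, y_pred), macro_recall(y_true, y_pred))

for ratio, (acc, rec) in results.items():
    print(f"imbalance 1:{ratio}  accuracy={acc:.3f}  macro recall={rec:.3f}")
```

In this sketch the baseline's accuracy rises with the imbalance ratio while its macro recall stays at one third, which is one reason results on imbalanced author sets are often read alongside a per-class measure.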

Keywords

chemistry; Biology (General); Engineering (General). Civil engineering (General); Technology; Science (General); Medical technology; physics; telecommunication; electronic computers. computer science

Access

URL:

https://fruct.org/publications/fruct18/files/Vor.p...

Citation

ID: 104887

Ref Key: vorobeva2016examiningproceedings
