dc.contributor.author | Aubaid, Asmaa M. | |
dc.contributor.author | Mishra, Alok | |
dc.date.accessioned | 2023-10-11T06:16:54Z | |
dc.date.available | 2023-10-11T06:16:54Z | |
dc.date.created | 2020-07-21T14:35:42Z | |
dc.date.issued | 2020 | |
dc.identifier.citation | Applied Sciences. 2020, 10 (11), 1-22. | en_US |
dc.identifier.issn | 2076-3417 | |
dc.identifier.uri | https://hdl.handle.net/11250/3095654 | |
dc.description.abstract | With the growth of online information and the rapid expansion in the number of electronic documents available on websites and in electronic libraries, categorizing text documents has become difficult. A rule-based approach is one solution to this problem; the purpose of this study is to classify documents using a rule-based method. This paper combines the rule-based approach with the document-to-vector (doc2vec) embedding technique. An experiment was performed on two data sets, Reuters-21578 and 20 Newsgroups, to classify the top ten categories of each using a document-to-vector rule-based algorithm (D2vecRule). The method gave good classification results according to the F-measure and implementation time metrics. In conclusion, the D2vecRule algorithm performed well when compared with other algorithms such as JRip, OneR, and ZeroR applied to the same Reuters-21578 dataset. Keywords: text classification; rule-based; word embedding; Doc2vec. | en_US |
dc.language.iso | eng | en_US |
dc.relation.uri | https://doi.org/10.3390/app10114009 | |
dc.rights | Attribution 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/deed.no | * |
dc.title | A rule-based approach to embedding techniques for text document classification | en_US |
dc.type | Peer reviewed | en_US |
dc.type | Journal article | en_US |
dc.description.version | publishedVersion | en_US |
dc.source.pagenumber | 1-22 | en_US |
dc.source.volume | 10 | en_US |
dc.source.journal | Applied Sciences | en_US |
dc.source.issue | 11 | en_US |
dc.identifier.doi | 10.3390/app10114009 | |
dc.identifier.cristin | 1820062 | |
cristin.ispublished | true | |
cristin.fulltext | original | |
cristin.qualitycode | 1 | |