Name2Cat: A Lightweight Autonomous Systems Classifier Using Organization Names

Thodi, Martin and Chavula, Josiah and Phokeer, Amreesh (2024) Name2Cat: A Lightweight Autonomous Systems Classifier Using Organization Names, Proceedings of Southern Africa Telecommunication Networks and Applications Conference (SATNAC) 2024, 6 - 8 October 2024, Skukuza Safari Lodge, Kruger National Park, Mpumalanga, South Africa, pp. 34 - 40.

Thodi_SATNAC2024_Proceedings-1.pdf - Published Version
Available under License Creative Commons Attribution No Derivatives.

Abstract

The Internet’s backbone consists of Autonomous Systems (ASes), each typically managed by a single organisation and providing a related set of services. Accurate AS classification is crucial for understanding various aspects of Internet infrastructure, including network performance and the economic behaviours of the organisations that manage ASes. This granular insight is invaluable for network operators and researchers alike. In this paper, we propose a lightweight method for classifying ASes based solely on the names of the ASes and their owning organisations. By employing text feature extraction techniques, we convert these names into numerical features suitable for machine learning models. Our approach achieves an overall accuracy of 80%, with F1-scores ranging from 70% to 92% across six different categories. The method performs particularly well in categories with distinctive naming conventions, and faces challenges in categories such as Transit, whose naming patterns are less distinctive. Our approach uses fewer categories than the 17 found in the state-of-the-art ASdb system, which relies on a mix of public and proprietary datasets to achieve accuracies between 75% and 93%. In exchange, it offers a quick and resource-efficient solution for AS classification when detailed AS information is unavailable.
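
The pipeline the abstract outlines (names turned into TF-IDF features, then a Ridge classifier, per the keywords listed for this item) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. The character n-gram range, the smoothed IDF formula, the intercept-free dual-form ridge solver, and all organisation names and labels other than Transit are assumptions made for illustration. In practice a library such as scikit-learn, with its `TfidfVectorizer` and `RidgeClassifier`, would normally replace the hand-written parts.

```python
import math
from collections import Counter

def char_ngrams(name, n_min=2, n_max=4):
    """Character n-grams of a lower-cased, space-padded name (assumed feature choice)."""
    text = f" {name.lower()} "
    return [text[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]

def fit_idf(names):
    """Smoothed inverse document frequency for every n-gram seen in training."""
    n_docs = len(names)
    doc_freq = Counter(g for name in names for g in set(char_ngrams(name)))
    return {g: math.log((1 + n_docs) / (1 + c)) + 1 for g, c in doc_freq.items()}

def tfidf(name, idf):
    """L2-normalised sparse TF-IDF vector; n-grams unseen in training are ignored."""
    counts = Counter(g for g in char_ngrams(name) if g in idf)
    vec = {g: c * idf[g] for g, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {g: v / norm for g, v in vec.items()}

def dot(a, b):
    """Dot product of two sparse vectors stored as dicts."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(g, 0.0) for g, v in a.items())

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + rhs[:] for row, rhs in zip(A, B)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

def train(names, labels, alpha=1.0):
    """Ridge regression on +1/-1 one-vs-rest targets, dual form: (K + alpha*I) D = Y."""
    idf = fit_idf(names)
    X = [tfidf(name, idf) for name in names]
    classes = sorted(set(labels))
    K = [[dot(a, b) + (alpha if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    Y = [[1.0 if label == c else -1.0 for c in classes] for label in labels]
    return idf, X, classes, solve(K, Y)

def predict(model, name):
    """Return the class whose ridge score is highest for the given name."""
    idf, X, classes, duals = model
    x = tfidf(name, idf)
    sims = [dot(x, xi) for xi in X]
    scores = [sum(s * d[k] for s, d in zip(sims, duals)) for k in range(len(classes))]
    return classes[max(range(len(classes)), key=scores.__getitem__)]

# Hypothetical training data: names and labels are invented for illustration.
train_names = ["University of Example Network", "Example Polytechnic College",
               "Example Backbone Transit Ltd", "Sample Carrier Transit Services"]
train_labels = ["Education", "Education", "Transit", "Transit"]
model = train(train_names, train_labels)
print(predict(model, "Sample University College"))
```

The dual form is used here only because it keeps the linear system as small as the number of training names, which suits a toy example. With a realistic dataset, a sparse primal solver would be the more usual choice.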

Item Type: Conference paper
Uncontrolled Keywords: autonomous systems, classification, Internet, Ridge classification, TF-IDF, machine learning
Subjects: Networks > Network algorithms > Network economics
Social and professional topics > Computing / technology policy > Intellectual property > Internet governance / domain names
Networks > Network types > Public Internet
Date Deposited: 05 Dec 2024 05:47
Last Modified: 05 Dec 2024 05:47
URI: https://pubs.cs.uct.ac.za/id/eprint/1708
