CESNET-DeviceType24

Overview

The CESNET-DeviceType24 dataset [1] is derived from the CESNET-TimeSeries-24 dataset [2], which captures traffic from the CESNET network. It is designed for classifying device types from their time-series behavior.

Dataset Metadata

Property	Value
Type	Recreated dataset
Category	Time Series
Primary Task	Device Type Detection
Source Dataset	CESNET-TimeSeries-24
Time Span	40 weeks (October 2023 – July 2024)
Annotated Devices	82,504 labeled devices
Number of Classes	3 (end-device, net-device, server)

Source Dataset Characteristics

CESNET-TimeSeries-24 Scope

The source CESNET-TimeSeries-24 dataset provides time series at several aggregation levels:

Overall network traffic across the entire CESNET network
297 institutions with individual time series
610 institutional subnets tracked separately
270,000+ individual IP addresses monitored

This range supports comparative analysis of neural network models and lets researchers benchmark forecasting performance at multiple hierarchical levels of the network. Because the data spans 40 weeks, it captures both long-term trends and fine-grained fluctuations.

Annotation Methodology

The annotation process employed a semi-automated approach:

Initial Annotation: Based on prior knowledge of the CESNET3 network infrastructure and connected devices
Extended Annotation: Unknown IP addresses were annotated using:
- Reverse DNS lookups
- Queries to the Shodan platform
Limitations: This approach does not allow reliable annotation of all captured devices within the CESNET3 network
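
The reverse DNS part of the extended annotation step can be illustrated with a minimal sketch. This is not the authors' pipeline: the keyword rules in `KEYWORD_RULES` and the helper names `reverse_dns` and `guess_class` are hypothetical, chosen only to show how a PTR hostname could hint at a device class. The Shodan queries are left out because they require an API key.

```python
import socket
from typing import Optional

# Hypothetical hostname prefixes per class; the source does not describe
# the rules actually used during annotation.
KEYWORD_RULES = {
    "net-device": ("router", "switch", "gw"),
    "server": ("mail", "www", "ns", "srv"),
}


def reverse_dns(ip: str) -> Optional[str]:
    """Return the PTR hostname for an IP address, or None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None


def guess_class(hostname: Optional[str]) -> Optional[str]:
    """Map a hostname to a class label, or None when no rule matches."""
    if not hostname:
        return None
    first_label = hostname.lower().split(".")[0]
    for device_class, prefixes in KEYWORD_RULES.items():
        if any(first_label.startswith(prefix) for prefix in prefixes):
            return device_class
    return None  # left unlabeled rather than guessed
```

Returning `None` when no rule matches mirrors the limitation noted above: an address without a reliable hint stays unannotated.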

The dataset annotations are also published in the open-source tool CESNET TS-Zoo [3].

Class Distribution

A total of 82,504 devices were reliably labeled and placed into three classes:

Class	Device Count	Percentage	Description
end-device	72,523	87.9%	User devices and NATs
server	7,875	9.5%	Server infrastructure
net-device	2,106	2.6%	Network equipment

As expected, the majority class is end-device, which covers both user devices and Network Address Translation (NAT) systems.
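
The percentages in the table follow directly from the device counts. The sketch below recomputes them and also derives inverse-frequency class weights, which are one common way to handle this kind of imbalance. The weighting formula is an illustrative choice, not something the dataset prescribes.

```python
# Device counts taken from the class distribution table.
counts = {"end-device": 72_523, "server": 7_875, "net-device": 2_106}

total = sum(counts.values())  # 82,504 labeled devices

# Share of each class among all labeled devices.
shares = {name: n / total for name, n in counts.items()}

# "Balanced" inverse-frequency weights: total / (number of classes * class count).
# This weighting scheme is an assumption made for illustration.
weights = {name: total / (len(counts) * n) for name, n in counts.items()}

# Accuracy of a trivial classifier that always predicts the majority class.
majority_baseline = max(shares.values())

for name in counts:
    print(f"{name}: {shares[name]:.1%} of devices, weight {weights[name]:.2f}")
```

Because a majority-class predictor already reaches the end-device share in accuracy, per-class metrics such as recall or macro-averaged F1 are usually more informative than plain accuracy here.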

Dataset Splits

The dataset uses temporal splitting to enable assessment of model generalization and stability on future data, addressing the common problem of data drift in network monitoring:

Split	Duration	Sample Count	Purpose
Training	First 26 weeks	2,401,854	Model training
Validation	Next 2 weeks	184,758	Hyperparameter tuning
Test	Final 12 weeks	1,108,548	Performance evaluation
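
The split boundaries can be expressed as week ranges. The following sketch assumes zero-based week indices counted from the start of the 40-week span. The names `SPLIT_WEEKS` and `split_for_week` are hypothetical, and the published splits may be defined by timestamps rather than week numbers.

```python
# Week ranges per split, assuming weeks are numbered 0..39 from the start
# of the capture period (an assumption, not part of the dataset description).
SPLIT_WEEKS = {
    "train": range(0, 26),   # first 26 weeks
    "val": range(26, 28),    # next 2 weeks
    "test": range(28, 40),   # final 12 weeks
}

# Sample counts taken from the splits table.
SAMPLE_COUNTS = {"train": 2_401_854, "val": 184_758, "test": 1_108_548}


def split_for_week(week: int) -> str:
    """Return the split name that a zero-based week index belongs to."""
    for name, weeks in SPLIT_WEEKS.items():
        if week in weeks:
            return name
    raise ValueError(f"week {week} is outside the 40-week span")


# Average number of samples per week in each split.
samples_per_week = {
    name: SAMPLE_COUNTS[name] / len(SPLIT_WEEKS[name]) for name in SPLIT_WEEKS
}
```

Dividing each sample count by its duration gives 92,379 samples per week in all three splits, so the split sizes are proportional to their durations.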

Benefits of Long Test Period

The extended 12-week test set enables:

Multi-week Performance Evaluation: Assess model performance across multiple weeks
Data Drift Detection: Evaluate whether performance degradation occurs over time
Stability Analysis: Determine model robustness to evolving network patterns
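
One way to use the long test period is to score a model week by week and check whether the scores trend downward. The sketch below assumes predictions are available as `(week, true_label, predicted_label)` tuples. The function names and the least-squares slope as a drift indicator are illustrative choices, not part of the dataset.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def weekly_accuracy(records: Iterable[Tuple[int, str, str]]) -> Dict[int, float]:
    """Compute accuracy per week from (week, y_true, y_pred) tuples."""
    hits: Dict[int, int] = defaultdict(int)
    totals: Dict[int, int] = defaultdict(int)
    for week, y_true, y_pred in records:
        totals[week] += 1
        hits[week] += int(y_true == y_pred)
    return {week: hits[week] / totals[week] for week in sorted(totals)}


def drift_slope(acc_by_week: Dict[int, float]) -> float:
    """Least-squares slope of accuracy over week index; negative means degradation."""
    weeks = sorted(acc_by_week)
    if len(weeks) < 2:
        raise ValueError("need at least two weeks to estimate a trend")
    mean_week = sum(weeks) / len(weeks)
    mean_acc = sum(acc_by_week[w] for w in weeks) / len(weeks)
    cov = sum((w - mean_week) * (acc_by_week[w] - mean_acc) for w in weeks)
    var = sum((w - mean_week) ** 2 for w in weeks)
    return cov / var
```

A slope close to zero across the 12 test weeks suggests stable performance, while a clearly negative slope points to drift. With imbalanced classes, the same per-week loop can be applied to per-class recall instead of accuracy.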

How to Cite

@article{koumar2025cesnet,
  title={CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting},
  author={Koumar, Josef and Hynek, Karel and {\v{C}}ejka, Tom{\'a}{\v{s}} and {\v{S}}i{\v{s}}ka, Pavel},
  journal={Scientific Data},
  volume={12},
  number={1},
  pages={338},
  year={2025},
  publisher={Nature Publishing Group UK London}
}

Download

[1] Mudruňka, K., Koumar, J., & Jeřábek, K. (2025). CESNET-DeviceType24: Dataset for Device Type Classification on ISP Network [Data set]. Zenodo.
DOI: 10.5281/zenodo.17542827

References

[2] Koumar, J., Hynek, K., Čejka, T., & Šiška, P. (2025). CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting. Scientific Data, 12(1), 338.

[3] Kureš, M., Koumar, J., & Hynek, K. (2025, October). CESNET TS-Zoo: A Library for Reproducible Analysis of Network Traffic Time Series. In 2025 21st International Conference on Network and Service Management (CNSM) (pp. 1-5). IEEE.
