loghub

Log datasets

Provides a collection of system log datasets for research on AI-driven log analytics.

A large collection of system log datasets for AI-driven log analytics [ISSRE'23]

GitHub

2k stars

59 watching

610 forks

last commit: over 1 year ago

Linked from 1 awesome list

anomaly-detection, datasets, log-analysis, log-intelligence, log-parsing, logs, unstructured-logs

Backlinks from these awesome lists:

infosecb/awesome-detection-engineering

Related projects:

Repository	Description	Stars
aitutorials/datasets	A comprehensive collection of datasets from various AI-related sources worldwide.	46
ynqa/logu	Extracts patterns from streaming log messages by tokenizing and grouping similar logs into clusters	84
dogoncouch/logdissect	Analyzes log files and other data from various sources and formats.	148
karthikncode/nlp-datasets	A curated list of Natural Language Processing datasets used to train and evaluate NLP models.	919
jehuty4949/nsl_kdd	An NSL-KDD dataset project for network intrusion detection	172
kitura/loggerapi	Provides a common logging interface for different kinds of loggers	27
alessandrogianfelici/danish_reviews_dataset	A dataset of Danish reviews scraped from the internet to train sentiment classification models	2
mirfan899/urdu	A collection of Urdu language datasets for various NLP tasks and applications	71
thu-coai/safety-prompts	Provides a dataset of safety prompts to evaluate and improve the safety of large language models.	880
causiq/logary	A high-performance logging and metrics library for .NET applications	527
laion-ai/laion-datasets	A collection of large datasets for training and testing AI models, designed to improve image-text matching.	239
poio-nlp/poio-corpus	A collection of language resources extracted from publicly available sources.	7
gopherdata/resources	A collection of Go-based resources and tools for data science tasks	879
juji-io/datalevin	A simple, fast and versatile Datalog query engine that supports recursive rules, full-text search, and concurrent read-intensive workloads.	1,169
pratyushmaini/llm_dataset_inference	Detects whether a given text sequence is part of the training data used to train a large language model.	23