loghub
Log datasets
A large collection of system log datasets for AI-driven log analytics research [ISSRE'23].
2k stars
59 watching
602 forks
last commit: 5 days ago
Linked from 1 awesome list
anomaly-detection, datasets, log-analysis, log-intelligence, log-parsing, logs, unstructured-logs
Related projects:
Repository | Description | Stars |
---|---|---|
aitutorials/datasets | A comprehensive collection of datasets from various AI-related sources worldwide. | 46 |
ynqa/logu | Extracts patterns from streaming log messages by tokenizing and grouping similar logs into clusters. | 83 |
dogoncouch/logdissect | Analyzes log files and other data from various sources and formats. | 148 |
karthikncode/nlp-datasets | A curated list of Natural Language Processing datasets used to train and evaluate NLP models. | 919 |
jehuty4949/nsl_kdd | An NSL-KDD dataset project for network intrusion detection. | 173 |
kitura/loggerapi | Provides a common logging interface for different kinds of loggers. | 27 |
alessandrogianfelici/danish_reviews_dataset | A dataset of Danish reviews scraped from the internet to train sentiment classification models. | 2 |
mirfan899/urdu | A collection of Urdu language datasets for various NLP tasks and applications. | 71 |
thu-coai/safety-prompts | Provides a dataset of safety prompts to evaluate and improve the safety of large language models. | 870 |
causiq/logary | A high-performance logging and metrics library for .NET applications. | 526 |
laion-ai/laion-datasets | A collection of large datasets for training and testing AI models, designed to improve image-text matching. | 235 |
poio-nlp/poio-corpus | A collection of language resources extracted from publicly available sources. | 7 |
gopherdata/resources | A collection of Go-based resources and tools for data science tasks. | 876 |
juji-io/datalevin | A simple, fast and versatile Datalog query engine that supports recursive rules, full-text search, and concurrent read-intensive workloads. | 1,155 |
pratyushmaini/llm_dataset_inference | Detects whether a given text sequence is part of the training data used to train a large language model. | 23 |