ChronoLog: A Distributed Shared Tiered Log Store with Time-based Data Ordering
A. Kougkas, H. Devarajan, K. Bateman, J. Cernuda, N. Rajesh, X.-H. Sun
36th Intl. Conference on Massive Storage Systems and Technology
NSF-Funded Cyberinfrastructure
Activity data describe things that happen rather than things that are.
ChronoLog is a distributed log storage ecosystem that uses physical time as the ordering mechanism -- eliminating centralized sequencers and enabling auto-tiered storage across multiple layers. It is built to capture the velocity and variety of modern activity data, from scientific instruments producing terabytes per second to AI agent audit trails.
ChronoLog rethinks distributed logging from first principles, introducing two ideas that set it apart from existing systems.
Traditional distributed logs rely on centralized sequencers or locking protocols to order entries -- creating bottlenecks at scale. ChronoLog uses physical time itself as the natural ordering principle, enabling lock-free concurrent writes with immediate entry visibility and no coordination overhead.
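As a rough illustration of time-based ordering, the sketch below lets several writer threads stamp their own entries and append to private buffers, with no lock and no shared sequence counter. The total order is derived afterwards from the timestamps. This is a toy model, not ChronoLog's implementation: `writer`, `merged_view`, and the tuple layout are hypothetical names, and `time.monotonic_ns()` stands in for the synchronized physical clock a real deployment would need.

```python
import threading
import time

def writer(client_id, n_events, out):
    # Each writer appends only to its own buffer, so writers never contend
    # on a lock or a shared counter.
    for seq in range(n_events):
        # time.monotonic_ns() is an assumed stand-in for a synchronized clock.
        out.append((time.monotonic_ns(), client_id, seq, f"event-{seq}"))

def merged_view(buffers):
    # The total order comes from (timestamp, client_id, seq):
    # client_id breaks ties between writers, seq breaks ties within one writer.
    return sorted(entry for buf in buffers for entry in buf)

buffers = [[] for _ in range(4)]
threads = [
    threading.Thread(target=writer, args=(cid, 100, buf))
    for cid, buf in enumerate(buffers)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

log = merged_view(buffers)
```

The design point is that ordering is computed from data each writer already has (its clock reading), so the write path needs no round trip to a coordinator.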
Log data flows automatically from fast ingestion nodes through intermediate tiers to persistent archival storage -- balancing access latency with capacity. This 3D distribution (horizontal across nodes, vertical across tiers, temporal by time) enables elastic capacity scaling without manual data management.
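The vertical (tier) dimension can be pictured with a small age-based model: entries land in a hot tier and migrate downward as they age. The class below is only a sketch under assumed rules. `TieredLog`, the two window parameters, and the explicit `flush(now)` call are all hypothetical; the source does not describe ChronoLog's actual flushing policy.

```python
from dataclasses import dataclass, field

@dataclass
class TieredLog:
    hot_window: int   # entries younger than this stay in the hot tier
    warm_window: int  # entries younger than this (but not hot) stay warm
    hot: list = field(default_factory=list)
    warm: list = field(default_factory=list)
    cold: list = field(default_factory=list)

    def record(self, timestamp, payload):
        # New entries always enter through the fastest tier.
        self.hot.append((timestamp, payload))

    def flush(self, now):
        # Move entries that have aged out of the hot tier down to warm.
        aged = [e for e in self.hot if now - e[0] >= self.hot_window]
        self.hot = [e for e in self.hot if now - e[0] < self.hot_window]
        self.warm.extend(aged)
        # Move entries that have aged out of the warm tier down to cold.
        archived = [e for e in self.warm if now - e[0] >= self.warm_window]
        self.warm = [e for e in self.warm if now - e[0] < self.warm_window]
        self.cold.extend(archived)

tiers = TieredLog(hot_window=10, warm_window=100)
tiers.record(0, "oldest")
tiers.record(150, "recent")
tiers.record(195, "newest")
tiers.flush(now=200)
```

After the flush at time 200, the entry stamped 0 sits in `cold`, the entry stamped 150 in `warm`, and the entry stamped 195 is still in `hot`.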
ChronoLog is designed as a foundation that other systems build upon. Its plugin architecture enables diverse workloads through a common log infrastructure.
Query log data with familiar SQL semantics. Time-based ordering enables efficient range scans without auxiliary indices.
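To see why time ordering makes range scans cheap, consider a log stored as a list of `(timestamp, payload)` tuples already sorted by timestamp. A binary search finds both ends of the range directly on the log, with no separate index structure. The function name and the half-open `[start, end)` convention are assumptions made for this sketch, not part of ChronoLog's query interface.

```python
from bisect import bisect_left

def range_scan(log, start, end):
    """Return entries with start <= timestamp < end from a time-sorted log."""
    # A 1-tuple (t,) sorts before any (t, payload), so bisect_left lands on
    # the first entry whose timestamp is >= t without comparing payloads.
    lo = bisect_left(log, (start,))
    hi = bisect_left(log, (end,))
    return log[lo:hi]

log = [(1, "a"), (3, "b"), (3, "c"), (7, "d"), (9, "e")]
window = range_scan(log, 3, 9)
```

Here `window` holds the three entries stamped 3, 3, and 7. The scan costs two binary searches plus the size of the result, regardless of how long the log is.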
Publish-subscribe patterns and real-time streaming built on ChronoGrapher's DAG pipeline. Used internally by IOWarp.
ChronoKVS provides time-series key-value semantics on top of the ordered log, with built-in consistency guarantees.
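One way to picture time-series key-value semantics is a store where every key maps to a time-ordered history and a read asks for the value as of a given timestamp. The class below is a hypothetical in-memory model for illustration only. `TimeSeriesKV`, `put`, and `get` are invented names and do not reflect the ChronoKVS API.

```python
from bisect import bisect_right
from collections import defaultdict

class TimeSeriesKV:
    def __init__(self):
        # Parallel per-key lists: sorted timestamps and their values.
        self._times = defaultdict(list)
        self._values = defaultdict(list)

    def put(self, key, timestamp, value):
        # Insert in timestamp order; later writes with the same timestamp
        # are placed after earlier ones.
        i = bisect_right(self._times[key], timestamp)
        self._times[key].insert(i, timestamp)
        self._values[key].insert(i, value)

    def get(self, key, timestamp):
        # Return the latest value written at or before `timestamp`,
        # or None if the key had no value yet.
        times = self._times.get(key, [])
        i = bisect_right(times, timestamp)
        return self._values[key][i - 1] if i else None

kv = TimeSeriesKV()
kv.put("temp", 10, 20.5)
kv.put("temp", 30, 22.0)
kv.put("temp", 20, 21.0)  # out-of-order arrival is placed by its timestamp
reading = kv.get("temp", 25)
```

In this model `reading` is `21.0`, the value written at time 20, because the write at time 30 had not yet happened as of time 25.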
TensorFlow integration for feeding time-ordered data streams directly into training and inference pipelines.
AI agents need persistent, time-ordered memory. ChronoLog provides exactly that -- a distributed log that serves as the shared memory backend for autonomous agents, LLM-based systems, and agentic workflows. Every agent action, observation, and reasoning step can be recorded as a time-stamped chronicle entry, creating a queryable audit trail across conversations and sessions.
The official ChronoLog MCP server, part of the IOWarp Agent Toolkit, exposes chronicle operations as Model Context Protocol tools -- so any MCP-compatible AI agent can create logs, record events, replay history, and query by time range natively.
ChronoLog MCP Server
// Agent memory via MCP
tools:
chronicle.create - new memory log
chronicle.record - store event
chronicle.replay - retrieve history
chronicle.query - search by time
// Use cases
- Conversation logging
- Agent audit trails
- Cross-session memory
- System monitoring
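The four operations listed above can be mocked in a few lines to show how an agent might use them. This is an in-memory stand-in, not the ChronoLog MCP server: the method signatures, the explicit `timestamp` argument, and the half-open query range are all assumptions, since the source names the tools but not their parameters.

```python
class ChronicleStore:
    """In-memory mock of the four chronicle operations (hypothetical API)."""

    def __init__(self):
        self._chronicles = {}

    def create(self, name):
        # chronicle.create: start a new, empty memory log if none exists.
        self._chronicles.setdefault(name, [])

    def record(self, name, timestamp, event):
        # chronicle.record: insert so the log stays sorted by timestamp,
        # keeping arrival order among equal timestamps.
        log = self._chronicles[name]
        i = len(log)
        while i > 0 and log[i - 1][0] > timestamp:
            i -= 1
        log.insert(i, (timestamp, event))

    def replay(self, name):
        # chronicle.replay: return the full history in time order.
        return list(self._chronicles[name])

    def query(self, name, start, end):
        # chronicle.query: return events with start <= timestamp < end.
        return [e for e in self._chronicles[name] if start <= e[0] < end]

store = ChronicleStore()
store.create("agent-1")
store.record("agent-1", 100, "observe")
store.record("agent-1", 300, "act")
store.record("agent-1", 200, "reason")
history = store.replay("agent-1")
```

Even though "reason" was recorded last, `history` returns it between "observe" and "act", which is the property an audit trail relies on.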
Existing distributed log systems (Kafka, BookKeeper, Corfu) were designed for different eras and constraints. ChronoLog explores a fundamentally different point in the design space -- one where time itself provides ordering, tiering provides capacity, and plugins provide versatility.
This isn't about replacing existing systems. It's about understanding what becomes possible when you rethink the log abstraction from the ground up for modern HPC, scientific, and AI workloads.
Five distributed services form a pipeline from ingestion to archival. Data flows through physical-time-ordered tiers automatically -- no manual data movement required.
C++ or Python application using libchronolog to create chronicles, record events, and replay history.
Central coordination: chronicle metadata, client connections, and distributed clock synchronization across all nodes.
Fast ingestion on compute nodes via RDMA. Serves record() and real-time playback() with microsecond latency.
DAG pipeline: event collection, story building, and continuous flushing to lower tiers. Real-time and elastic.
Persistent archival in HDF5 containers. Elastic capacity with device-aware access optimization.
Reads span all tiers transparently. ChronoPlayer serves replay() requests from hot, warm, or cold storage and merges results into a single time-ordered stream. Fully decoupled from the write path.
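Merging per-tier results into one time-ordered stream is a k-way merge. The sketch below assumes each tier returns its entries already sorted by timestamp and uses the standard library's `heapq.merge` to combine them lazily. The `replay` function and its arguments are hypothetical and only model the behavior described above; they are not ChronoPlayer's interface.

```python
import heapq
from operator import itemgetter

def replay(start, end, *tiers):
    """Merge time-sorted tiers and keep entries with start <= timestamp < end."""
    # heapq.merge assumes every input is already sorted by the same key
    # and yields a single sorted stream without loading everything at once.
    merged = heapq.merge(*tiers, key=itemgetter(0))
    return [entry for entry in merged if start <= entry[0] < end]

hot = [(90, "h1"), (95, "h2")]
warm = [(40, "w1"), (60, "w2")]
cold = [(5, "c1"), (20, "c2")]
stream = replay(20, 95, hot, warm, cold)
```

The caller sees one ordered list and does not need to know which tier held each entry.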
C++17: Core implementation
RDMA: Zero-copy transport
HDF5: Persistent backend
Docker: Containerized deployment
ChronoLog serves as foundational infrastructure for other projects, AI systems, and research frameworks.
Internal logging and pub/sub abstractions
MCP-based agent memory and audit trails
Task execution logging and provenance
Available at national lab partner sites
Genomic sequencing pipeline data at IIT
ChronoLog's impact extends beyond its codebase. From graduate classrooms to national lab clusters, the project has a growing community of researchers, students, and collaborators.
Attendees at ChronoLog presentations, webinars, and classroom sessions
GitHub visitors per year, with growing adoption across research communities
PhD researchers using ChronoLog as a backend for distributed logging
Cluster installations, including GRC and DOE partner sites
Active community domains: FaaS, scientific data repositories, and system telemetry
Used as teaching infrastructure in distributed systems and HPC graduate courses at IIT. Students learn real-world distributed log design using ChronoLog's APIs and deployment tools.
Beyond HPC, ChronoLog has been applied in nutrition analysis (with IIT's Department of Food Science) and genomic sequencing pipelines, demonstrating its versatility across domains.
Active installations on the GRC research cluster and available at DOE partner sites. Used for continuous research workloads including HPC system monitoring and provenance tracking.
Peer-reviewed research behind ChronoLog, published at top-tier HPC, systems, and parallel computing venues.
A. Kougkas, H. Devarajan, K. Bateman, J. Cernuda, N. Rajesh, X.-H. Sun
36th Intl. Conference on Massive Storage Systems and Technology
L. Logan, A. Kougkas, X.-H. Sun
Intl. Conference for High Performance Computing, Networking, Storage, and Analysis
M. Tang, J. Cernuda, J. Ye, L. Guo, et al.
IEEE Intl. Conference on Cluster Computing
J. Ye, J. Cernuda, A. Maurya, X.-H. Sun, A. Kougkas, B. Nicolae
39th IEEE Intl. Parallel & Distributed Processing Symposium
I. Yildirim, H. Devarajan, A. Kougkas, X.-H. Sun, K. Mohror
39th Intl. Conference on Supercomputing
ChronoLog is open source and welcomes collaborators. Whether you're a researcher exploring distributed log abstractions, a developer building on the plugin ecosystem, or an institution looking for scalable logging infrastructure -- we'd like to hear from you.