How it Works
ChronoLog is a distributed shared log for activity data with concurrent multi-writer and multi-reader access.
To understand ChronoLog, start with its foundation. A shared log is one of the most powerful abstractions in distributed systems: a durable data store, a consensus mechanism, an execution history for deterministic replay, and a data integration hub. ChronoLog provides this primitive at scale for HPC, scientific, and AI workloads.
It all starts with two core research ideas:
Physical time as ordering
Events are ordered by physical timestamps; no centralized sequencer or locks are needed. Multiple writers append concurrently from any node.
Automatic multi-tier storage
Data flows automatically from fast compute-node storage through intermediate tiers to long-term archival. Capacity scales elastically.
What Makes ChronoLog Different
With these principles in place, what sets ChronoLog apart? Traditional distributed logs rely on centralized sequencers or consensus protocols to order events. ChronoLog uses physical time itself, which enables lock-free appends, immediate visibility, and elastic capacity, but it introduces three challenges that ChronoLog solves.
Clock Uncertainty
Different machines have different clock offsets and drift rates. ChronoLog synchronizes server nodes with ChronoVisor during initialization and periodically thereafter. Clients use ChronoTicks as relative time distances from a base clock, eliminating the need for globally synchronized wall clocks.
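One way to picture ChronoTicks is as nanosecond distances from a synchronized base instant rather than raw wall-clock readings. The sketch below is a toy model under that assumption; the class and field names are illustrative, not the libchronolog API.

```python
import time

class TickClock:
    """Toy model of a ChronoTick source: ticks are nanosecond
    distances from a base instant agreed with the coordinator at
    synchronization time, so clients never compare raw wall clocks.
    Illustrative only -- not the real libchronolog interface."""

    def __init__(self, base_ns):
        self.base_ns = base_ns                  # synchronized reference instant
        self.origin = time.monotonic_ns()       # local monotonic anchor

    def tick(self):
        # Relative distance from the base clock; monotonic_ns is
        # immune to wall-clock jumps on this machine.
        return self.base_ns + (time.monotonic_ns() - self.origin)
```

Because each tick is an offset from the shared base, two clients with different wall-clock settings still produce comparable timestamps.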
Late / Backdated Events
Network non-determinism means events may arrive after later events, violating ordering. ChronoLog defines an Acceptance Time Window (ATW), a moving window equal to twice the measured network latency, within which out-of-order events are gracefully absorbed and correctly ordered.
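The ATW idea can be sketched as a small reordering buffer: hold each event until the newest arrival is more than one window ahead of it, then release in timestamp order. This is a toy illustration of the mechanism, not ChronoLog's implementation; the function name and event shape are assumptions.

```python
import heapq

def absorb_out_of_order(events, window):
    """Toy Acceptance Time Window: buffer (timestamp, payload) pairs
    in a min-heap and release one only after the newest arrival is
    more than `window` ahead of it, so anything arriving late but
    inside the window is transparently re-ordered.
    Here `window` would be twice the measured network latency."""
    heap, out = [], []
    for ts, payload in events:
        heapq.heappush(heap, (ts, payload))
        # Release events that have aged out of the moving window.
        while heap and heap[0][0] <= ts - window:
            out.append(heapq.heappop(heap))
    while heap:                     # flush the remainder in order
        out.append(heapq.heappop(heap))
    return out
```

An event arriving after a later one is absorbed as long as the gap stays under the window; events later than that would fall outside the ATW.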
Timestamp Collisions
At coarser time granularities, multiple events may share the same ChronoTick. ChronoLog disambiguates using (clientId, index) pairs and configurable collision semantics: idempotent, redundancy, ordering, or sequentiality, chosen per workload.
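The disambiguation rule amounts to a composite sort key: the tick first, then the (clientId, index) pair, which is unique per event. A minimal sketch, with illustrative field names:

```python
def event_order_key(event):
    """Total order under timestamp collisions: events that share a
    ChronoTick fall back to (client_id, index), which is unique per
    event. Field names are illustrative, not the real schema."""
    return (event["tick"], event["client_id"], event["index"])

events = [
    {"tick": 5, "client_id": 2, "index": 0, "data": "b"},
    {"tick": 5, "client_id": 1, "index": 0, "data": "a"},
    {"tick": 4, "client_id": 9, "index": 3, "data": "c"},
]
ordered = sorted(events, key=event_order_key)
```

The configurable collision semantics would then decide what to do with the tied events, e.g. deduplicate them (idempotent) or keep all copies (redundancy).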
The Result
The result is a shared log that orders events without centralized coordination: lock-free appends, immediate visibility, and elastic capacity. For detailed comparisons with Kafka, BookKeeper, Corfu, and other systems, see the Research page.
Data Model
With the core ideas and their implications understood, let's look at how data is actually structured. ChronoLog organizes data through three core concepts: Events, Stories, and Chronicles. These abstractions define how data is logically structured.
Chronicle
A named collection of related Stories that share a common context or namespace. Chronicles are the top-level unit used to organize data and manage access.
Story
A logical, time-ordered stream of Events representing a single topic, task, or activity. Stories preserve the chronological order of every Event they contain.
Event
The smallest unit of data in ChronoLog — immutable, timestamped, and uniquely identified. Events are generated by clients and attributed to a specific Story.
The diagram below shows how these three concepts nest in a real-world example: a monitoring chronicle containing three sensor stories, each holding a sequence of events.
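The nesting can be modeled directly: a Chronicle holds named Stories, and each Story holds an append-only sequence of immutable Events. A minimal in-memory sketch (illustrative field and method names, not the libchronolog types):

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Event:
    tick: int         # timestamp (ChronoTick)
    client_id: int    # producing client
    payload: bytes    # immutable record body

@dataclass
class Story:
    name: str
    events: list = field(default_factory=list)

    def record(self, event):
        self.events.append(event)   # append-only, time-ordered

@dataclass
class Chronicle:
    name: str
    stories: dict = field(default_factory=dict)

    def story(self, name):
        # Get or create a Story within this Chronicle's namespace.
        return self.stories.setdefault(name, Story(name))
```

In the monitoring example, the chronicle would hold one Story per sensor, each Story accumulating that sensor's Events in time order.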
Software Architecture
The data model defines what ChronoLog stores — the architecture defines how. Five distributed services form a pipeline from ingestion to archival. Data flows through time-ordered tiers automatically.
Client App
Your application. Uses libchronolog (C++ or Python) to create chronicles, record events, and replay history.
ChronoVisor
Central coordination: chronicle metadata, client connections, and distributed clock synchronization across all nodes.
ChronoKeeper
Hot tier. Fast ingestion on compute nodes via RDMA. Serves record() and real-time playback() with microsecond latency.
ChronoGrapher
Warm tier. DAG pipeline: event collection, story building, and continuous flushing to lower tiers. Real-time and elastic.
ChronoStore
Cold tier. Persistent archival in HDF5 containers. Elastic capacity with device-aware access optimization.
ChronoPlayer
All tiers. Reads span all tiers transparently. ChronoPlayer serves replay() requests from hot, warm, or cold storage and merges results into a single time-ordered stream. Fully decoupled from the write path.
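Merging per-tier results into one time-ordered stream is, at its core, a k-way merge: each tier already yields events sorted by time, so the streams can be combined lazily. A toy sketch of that step (not ChronoPlayer's actual code):

```python
import heapq

def replay(hot, warm, cold):
    """Toy ChronoPlayer merge: each tier yields (tick, payload)
    pairs already sorted by time; heapq.merge performs a k-way
    merge into a single time-ordered stream across tiers.
    Illustrative only."""
    return list(heapq.merge(hot, warm, cold, key=lambda e: e[0]))
```

Because the merge only ever looks at the head of each stream, the reader never needs all tiers in memory at once, which is what lets reads stay decoupled from writes.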
C++17
Core implementation
RDMA
Zero-copy transport
HDF5
Persistent backend
Docker
Containerized deployment
Software Ecosystem
There is a lot going on both inside ChronoLog and around it. Its software ecosystem spans three layers: the core distributed services (ChronoVisor, ChronoKeeper, ChronoGrapher, ChronoPlayer, ChronoStore), a client library (libchronolog) providing the chronicle API in C++ and Python, and a plugin framework enabling higher-level data systems to be built on top of the shared log.
Core Services
Five distributed services forming the ingestion-to-archival pipeline. Implemented in C++17 with RDMA and TCP transport backends and HDF5 for persistent storage.
Client Library
libchronolog exposes the full chronicle API: connect, create chronicles, acquire stories, record events, and replay history. Available in C++ and Python.
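The call sequence mirrors those verbs. The sketch below uses an in-memory stand-in so it runs anywhere; the method names echo the API's verbs but are hypothetical, not the real libchronolog signatures.

```python
class MockClient:
    """In-memory stand-in mirroring the verbs of the chronicle API
    (create chronicle, acquire story, record, replay).
    Not the real libchronolog client; for illustration only."""

    def __init__(self):
        self.chronicles = {}

    def create_chronicle(self, name):
        self.chronicles.setdefault(name, {})

    def acquire_story(self, chronicle, story):
        # Returns a handle to an append-only event list.
        return self.chronicles[chronicle].setdefault(story, [])

    def record(self, story_handle, tick, payload):
        story_handle.append((tick, payload))

    def replay(self, chronicle, story):
        # Replay returns the story's events in time order.
        return sorted(self.chronicles[chronicle][story])

client = MockClient()
client.create_chronicle("monitoring")
s = client.acquire_story("monitoring", "sensor-1")
client.record(s, 2, "temp=21")
client.record(s, 1, "temp=20")
history = client.replay("monitoring", "sensor-1")
```

The real client would additionally connect to ChronoVisor and obtain ticks from the synchronized clock, but the chronicle/story/record/replay shape is the same.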
Plugin Framework
A modular architecture that allows new plugins to be developed independently: SQL queries, streaming analytics, key-value stores, ML pipelines, and more.
Part of a broader ecosystem
ChronoLog serves as foundational infrastructure for other projects, AI systems, and research frameworks.
IOWarp
Internal logging and pub/sub abstractions
AI Agents
MCP-based agent memory and audit trails
Parsl Workflows
Task execution logging and provenance
DOE Labs
Available at national lab partner sites
IFSH Genomics
Genomic sequencing pipeline data at IIT
Dive Deeper
Now that you have the full picture, it's time to get hands-on. The documentation has everything else: architecture deep dives, API references, deployment guides, and tutorials.