About ChronoLog
ChronoLog is a multi-year research effort to build a distributed shared log storage ecosystem grounded in a novel idea: using physical time as the ordering mechanism. It was created and is led by the Gnosis Research Center (GRC) at Illinois Institute of Technology, and it is supported by a $4M National Science Foundation CSSI grant.
Background
Modern scientific instruments, IoT networks, and distributed services generate massive volumes of activity data: records of things that happen rather than things that are. Distributed log stores are the natural infrastructure for capturing, ordering, and retrieving this data, but existing systems face fundamental trade-offs between total ordering, concurrent access, and capacity scaling.
ChronoLog explores a different point in the design space: by using physical time as the ordering principle and automatic multi-tier storage for elastic capacity, it aims to resolve these trade-offs simultaneously. The project has been under active development since 2020. It is led by GRC in partnership with DOE national laboratories, and it is integrated into the broader HPC ecosystem through its collaborators.
Team
ChronoLog was created and is led by the Gnosis Research Center (GRC) at Illinois Institute of Technology. The University of Chicago serves as an ecosystem integration partner. The project is supported by a $4M NSF CSSI grant.
Principal Investigators
Dr. Xian-He Sun
Principal Investigator
Illinois Tech
University Distinguished Professor & Ron Hochsprung Endowed Chair. IEEE Fellow. Editor-in-Chief of IEEE Transactions on Parallel and Distributed Systems.
Dr. Anthony Kougkas
Project Lead, Co-Principal Investigator
Illinois Tech
Associate Research Professor & Associate Director of GRC. Founder of ChronoLog, system architect, and day-to-day project lead.
Dr. Kyle Chard
Co-Principal Investigator
University of Chicago
Associate Research Professor. Application-layer integrations and ecosystem partnerships.
Researchers & Engineers
Dr. Jaime Cernuda Garcia
Researcher
Illinois Tech
Assistant Research Professor at GRC. Developer of HStream, HFlow, and Hades. Streaming data systems and hierarchical storage.
Dr. Luke Logan
Researcher
Illinois Tech
Assistant Research Professor at GRC. Developer of LabStor and MegaMmap. Distributed storage and OS-level I/O optimization.
Izzet Yildirim
Researcher
Illinois Tech
PhD Candidate at GRC. I/O analysis, characterization, and bottleneck detection in HPC systems (WisIO).
Eneko Gonzalez
Engineer
Illinois Tech
Research Software Engineer at GRC. Core member of the ChronoLog engineering team: architecture, deployment, and optimization.
Dr. Kun Feng
Engineer
Illinois Tech
Research Software Engineer at GRC. Key developer for Hermes. Data-intensive applications and memory systems.
Inna Brodkin
Engineer
University of Chicago
Research Associate at UChicago. Collaborates with GRC on ChronoLog distributed log storage systems.
Collaborators
ChronoLog is developed in partnership with DOE national laboratories, universities, and industry. Each collaboration brings domain-specific expertise and real-world deployment environments.
Argonne National Laboratory
Logan Ward
funcX integration for event-based computing and Colmena framework for materials science.
University of Chicago
Ian Foster
Ecosystem integration partner: Parsl workflow extensions and Dark Energy Science Collaboration for Rubin Observatory.
Lawrence Livermore National Lab
Stephen Herbein
Integration with Sonar and Flux job scheduler for HPC telemetry.
SLAC National Accelerator Lab
Tom Glanzman
Dark Energy Science Collaboration and Rubin Observatory data pipeline integration.
UW-Madison
Benedikt Riedel
CyberGIS storage backend for geospatial data workloads.
UIUC
Shaowen Wang
Parsl workflow extensions for Rubin Observatory data processing.
DePaul University
Tanu Malik
Lightweight indexing mechanisms for efficient log querying.
IFSH at IIT
Genomics and bioinformatics pipelines for food safety research.
3Red Partners
Dries Kimpe, Sam Lang
Financial trading infrastructure and low-latency log requirements.
ParaTools, Inc.
Sameer Shende
Performance monitoring tools integration (TAU Performance System).
OmniBond Systems
Boyd Wilson
OrangeFS storage stack optimization.
Publications
ChronoLog: A Distributed Shared Tiered Log Store with Time-based Data Ordering
A. Kougkas, H. Devarajan, K. Bateman, J. Cernuda, N. Rajesh, X.-H. Sun
MegaMmap: Blurring the Boundary Between Memory and Storage for Data-Intensive Workloads
L. Logan, A. Kougkas, X.-H. Sun
Viper: A High-Performance I/O Framework for Transparently Updating, Storing, and Transferring Deep Neural Network Models
J. Ye, J. Cernuda, N. Rajesh, K. Bateman, et al.
HStream: A Hierarchical Data Streaming Engine for High-Throughput Scientific Applications
J. Cernuda, J. Ye, A. Kougkas, X.-H. Sun
DaYu: Optimizing Distributed Scientific Workflows by Decoding Dataflow Semantics and Dynamics
M. Tang, J. Cernuda, J. Ye, L. Guo, et al.
Characterizing the Behavior and Impact of KV Caching on Transformer Inferences under Concurrency
J. Ye, J. Cernuda, A. Maurya, X.-H. Sun, A. Kougkas, B. Nicolae
WisIO: Automated I/O Bottleneck Detection with Multi-Perspective Views for HPC Workflows
I. Yildirim, H. Devarajan, A. Kougkas, X.-H. Sun, K. Mohror
LabStor: A Modular and Extensible Platform for Developing High-Performance, Customized I/O Stacks in Userspace
L. Logan, J. Cernuda Garcia, J. Lofstead, X.-H. Sun, A. Kougkas
LuxIO: Intelligent Resource Provisioning and Auto-Configuration for Storage Services
K. Bateman, N. Rajesh, J. Cernuda Garcia, L. Logan, J. Ye, S. Herbein, A. Kougkas, X.-H. Sun
Sponsor
National Science Foundation
Grant NSF CSSI-2104013
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.