Version: 2.5.0

Performance Tuning

This page covers the configuration knobs that most directly affect ChronoLog's throughput, latency, and memory usage. All settings below live in default_conf.json under the relevant component block.

Story Chunk Settings

A story chunk is the unit of data that flows through the pipeline: Keeper buffers events into chunks, drains them to Grapher, which persists them to storage. Three parameters govern chunk behavior in DataStoreInternals.

max_story_chunk_size

Maximum size of a single story chunk, measured in number of events.

| Component | Default |
| --- | --- |
| ChronoKeeper | 4096 |
| ChronoGrapher | 4096 |
| ChronoPlayer | 4096 |

Effect: larger chunks reduce per-chunk overhead (fewer RPC calls, fewer I/O operations) but increase memory usage and the minimum latency before an event reaches persistent storage. Tune this based on your expected event payload size.
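As a sketch, a throughput-oriented configuration might raise the Keeper's chunk size. The nesting shown here (a `DataStoreInternals` block under the `chrono_keeper` key) is assumed from the description above; check your `default_conf.json` for the exact layout, and treat the value as illustrative rather than a recommendation:

```json
"chrono_keeper": {
    "DataStoreInternals": {
        "max_story_chunk_size": 16384
    }
}
```

At 16384 events per chunk, memory held per open story grows fourfold over the default, so budget accordingly when many stories are active at once.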

story_chunk_duration_secs

How long (in seconds) a chunk remains open, accumulating events, before it is sealed and drained downstream.

| Component | Default |
| --- | --- |
| ChronoKeeper | 10 |
| ChronoGrapher | 60 |
| ChronoPlayer | 60 |

Effect: shorter durations reduce end-to-end latency from write to persistent storage at the cost of more frequent RPC round-trips. Longer durations improve batching efficiency. The ChronoKeeper value is the most latency-sensitive; the ChronoGrapher/ChronoPlayer values affect how quickly data becomes queryable.
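A latency-oriented profile, using the example values from the tuning summary at the end of this page, would shorten both durations. The block nesting is assumed from the description above; verify it against your `default_conf.json`:

```json
"chrono_keeper": {
    "DataStoreInternals": {
        "story_chunk_duration_secs": 5
    }
},
"chrono_grapher": {
    "DataStoreInternals": {
        "story_chunk_duration_secs": 30
    }
}
```

Expect more frequent, smaller RPC batches between Keeper and Grapher with these settings.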

acceptance_window_secs

Maximum age (in seconds) of an incoming event timestamp, relative to the current wall clock, for the event to be accepted. Events older than this are rejected.

| Component | Default |
| --- | --- |
| ChronoKeeper | 15 |
| ChronoGrapher | 180 |

Effect: a wider window accommodates clock skew between client hosts and the ChronoKeeper nodes, and allows late-arriving events to be stored correctly. Narrowing the window tightens the ordering guarantee but rejects events from slow or clock-skewed producers.
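For a cluster with known clock skew, the Keeper window can be widened (60 seconds here matches the example in the tuning summary; the nesting is assumed from the description above):

```json
"chrono_keeper": {
    "DataStoreInternals": {
        "acceptance_window_secs": 60
    }
}
```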

inactive_story_delay_secs

How long (in seconds) a story with no new events is kept in memory before being evicted.

| Component | Default |
| --- | --- |
| ChronoKeeper | 120 |
| ChronoGrapher | 300 |

Effect: longer delays keep story state warm, reducing re-initialization cost when a story becomes active again. Shorter delays free memory sooner in workloads with many short-lived or bursty stories.
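For a workload with many short-lived stories, both eviction delays can be shortened. The values below are illustrative only, and the block nesting is assumed from the description above:

```json
"chrono_keeper": {
    "DataStoreInternals": {
        "inactive_story_delay_secs": 30
    }
},
"chrono_grapher": {
    "DataStoreInternals": {
        "inactive_story_delay_secs": 60
    }
}
```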

Recording Groups

```json
"RecordingGroup": 7
```

Applies to: chrono_keeper, chrono_grapher, chrono_player.

Recording groups are logical partitions that pair ChronoKeeper and ChronoGrapher instances. In a multi-node deployment, recording group IDs determine which ChronoGrapher drains which ChronoKeeper. The default value of 7 is a single-group configuration. When deploying multiple ChronoKeeper+ChronoGrapher pairs, give the Keeper and Grapher within each pair the same group ID, and ensure no two pairs share an ID.
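For instance, the configuration for one pair in a two-pair deployment might look like this (group IDs are illustrative; the second pair would use a different ID, such as 2):

```json
{
    "chrono_keeper":  { "RecordingGroup": 1 },
    "chrono_grapher": { "RecordingGroup": 1 }
}
```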

Clock Source

```json
"clock": {
    "clocksource_type": "CPP_STYLE",
    "drift_cal_sleep_sec": 10,
    "drift_cal_sleep_nsec": 0
}
```

clocksource_type

Controls the mechanism used to generate event timestamps across all components.

| Value | Enum | Description |
| --- | --- | --- |
| "C_STYLE" | C_STYLE (0) | Uses C gettimeofday. Portable but lowest resolution. |
| "CPP_STYLE" | CPP_STYLE (1) | Uses C++ std::chrono::high_resolution_clock. Default; good balance of portability and precision. |
| "TSC" | TSC (2) | Reads the CPU timestamp counter directly. Lowest latency and highest resolution, but requires a stable, invariant TSC across all sockets. Not suitable for systems with dynamic CPU frequency scaling unless constant_tsc is set. |
caution

TSC mode requires that all CPUs in the system have synchronized, invariant TSCs. Verify with grep -m1 constant_tsc /proc/cpuinfo on Linux. Do not use TSC on virtual machines or heterogeneous CPU clusters.

drift_cal_sleep_sec / drift_cal_sleep_nsec

Interval between clock drift calibration runs (default: every 10 seconds). Drift calibration corrects for slow drift between the ChronoLog clock and the system wall clock. Reducing the interval increases correction frequency at a small CPU cost; increasing it reduces CPU overhead but allows more drift to accumulate between corrections. Note that in ChronoLog v2.5.0 this setting currently has no effect.

Shutdown Grace Period

```json
"chrono_visor": {
    "delayed_data_admin_exit_in_secs": 3
}
```

Number of seconds ChronoVisor waits after receiving a shutdown signal before terminating the data administration service. This grace period allows in-flight RPC calls and pending data acknowledgements to complete cleanly. Increase this value if you observe data loss at shutdown under heavy write load.

Summary of Tuning Recommendations

| Goal | Recommended change |
| --- | --- |
| Lower write-to-query latency | Reduce story_chunk_duration_secs on ChronoKeeper (e.g., 5) and ChronoGrapher (e.g., 30) |
| Higher throughput / better batching | Increase max_story_chunk_size (e.g., 16384 or 65536) |
| Tolerate high clock skew across nodes | Increase acceptance_window_secs on ChronoKeeper (e.g., 60) |
| Reduce memory usage with many stories | Decrease inactive_story_delay_secs on Keeper and Grapher |
| Minimize timestamp latency on bare-metal HPC | Set clocksource_type to "TSC" (verify invariant TSC first) |
| Prevent data loss at shutdown under heavy load | Increase delayed_data_admin_exit_in_secs to 10 or more |
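Putting several of these recommendations together, a latency-oriented profile might look like the composite below. This is a sketch, not a complete config: the nesting of DataStoreInternals under each component key is assumed from the descriptions on this page, all values come from the examples above, and any settings not shown keep their defaults.

```json
{
    "chrono_keeper": {
        "RecordingGroup": 1,
        "DataStoreInternals": {
            "max_story_chunk_size": 4096,
            "story_chunk_duration_secs": 5,
            "acceptance_window_secs": 60
        }
    },
    "chrono_grapher": {
        "RecordingGroup": 1,
        "DataStoreInternals": {
            "story_chunk_duration_secs": 30
        }
    },
    "chrono_visor": {
        "delayed_data_admin_exit_in_secs": 10
    }
}
```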