Execution latency gets all the attention. Market data latency gets almost none. The asymmetry is unforced and operationally expensive — a strategy whose execution path is fast but whose market data is half a second stale is a strategy that is running on yesterday's prices and paying the spread on tomorrow's. This post is about the engineering discipline that keeps the data path honest, particularly when honesty is hardest.

The market data path
From the exchange's matching engine to the order-decision logic on your desk, market data passes through, at minimum: the venue's outbound multicast or unicast feed; the colo gateway translating it to your transport; the feed handler decoding the wire protocol; the in-memory order book reconstruction; the strategy's subscription to the relevant instruments; the snapshot mechanism that keeps your view in sync with the venue's; and the heartbeat/gap-detection layer that tells you when any of the above failed.
Each layer typically adds single-digit microseconds in normal conditions, with a 99th-percentile latency in the milliseconds. On a news spike, the 99th percentile becomes the median, and the latency of the path can rise by orders of magnitude in the span of a few seconds. Strategies that did not budget for the spike experience this as 'something has gone wrong with the data', usually after they have already placed orders against stale prices.
Three things that go wrong on a spike
1. Feed-handler queue buildup
The feed handler decodes wire-format messages into the in-memory book. Its throughput is bounded by single-core decode performance plus the cost of memory allocation per message. When the message rate spikes (a news event can produce 100x the calm-market message rate for 5-30 seconds), the handler's input queue grows. If the queue grows faster than it can be drained, messages are processed later than they arrived, and the in-memory book is stale relative to the venue's actual state.
The mitigation is unglamorous: a feed handler designed for the peak rate, not the median, with pre-allocated memory pools and a single-producer/single-consumer ring buffer that avoids lock contention. Drovix's handlers are written in C++ with manual memory management for exactly this reason; the JVM- or .NET-based alternatives can survive normal conditions but fail predictably at extreme rates.
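To see why sizing for the median fails, a back-of-envelope fluid-queue model is enough. The sketch below is illustrative only: the function name, the rates, and the 10-second spike are hypothetical numbers chosen for the arithmetic, not measurements from any handler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpikeBacklog:
    queued_messages: float     # messages still waiting when the spike ends
    staleness_seconds: float   # how far the book lags the venue at that moment
    drain_seconds: float       # time to catch up once the rate returns to calm


def backlog_after_spike(calm_rate, spike_multiplier, service_rate, spike_seconds):
    """Fluid-queue estimate. All rates are in messages per second."""
    if service_rate <= calm_rate:
        raise ValueError("handler cannot keep up even in calm conditions")
    spike_rate = calm_rate * spike_multiplier
    # The queue only grows while arrivals outpace the decode rate.
    growth_per_second = max(0.0, spike_rate - service_rate)
    queued = growth_per_second * spike_seconds
    # FIFO: the newest message waits behind everything already queued.
    staleness = queued / service_rate
    # After the spike, spare capacity is service_rate minus the calm arrival rate.
    drain = queued / (service_rate - calm_rate)
    return SpikeBacklog(queued, staleness, drain)


# Hypothetical: 50k msg/s calm, a 100x spike lasting 10 s, 2M msg/s decode capacity.
result = backlog_after_spike(50_000, 100, 2_000_000, 10)
print(result)
```

With these assumed numbers the handler ends the spike 30 million messages behind, which is a book roughly 15 seconds stale. Raising the assumed decode capacity to the peak rate of 5M msg/s brings the backlog to zero, which is the whole argument for designing to the peak.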
2. Snapshot/incremental gap
Most modern wire protocols use an incremental update model: snapshots are published periodically, and the strategy reconstructs the current book by applying incremental updates to the most recent snapshot. If any incremental update is dropped or processed out of order, the reconstructed book diverges from the venue's actual book, often without an obvious error.
The discipline that keeps this honest is gap detection on every incremental sequence number and snapshot-based recovery when a gap is detected. Snapshot recovery is expensive (it requires re-subscribing to a snapshot feed and buffering the incremental updates in flight, then discarding those the snapshot already covers) and is therefore tempting to skip. The cost of skipping it is silent corruption of the book; the cost of doing it is a brief data gap that the strategy must handle correctly.
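The gap-detection and recovery loop can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not any venue's protocol: the class and method names are hypothetical, sequence numbers are assumed to increase by exactly one per incremental, a size of zero is assumed to delete a price level, and incrementals that arrive during recovery are held and replayed rather than thrown away.

```python
from enum import Enum


class State(Enum):
    LIVE = "live"
    RECOVERING = "recovering"


class SequencedBook:
    """One side of one instrument's book, keyed price -> size (hypothetical layout)."""

    def __init__(self):
        self.state = State.RECOVERING   # nothing is trusted until a snapshot arrives
        self.last_seq = None
        self.levels = {}
        self.pending = []               # incrementals held while recovering
        self.gaps = []                  # (expected_seq, received_seq) for logging/alerts

    def on_incremental(self, seq, price, size):
        if self.state is State.RECOVERING:
            self.pending.append((seq, price, size))
            return
        expected = self.last_seq + 1
        if seq < expected:
            return                      # duplicate or already covered; ignore
        if seq > expected:
            # Gap: stop applying, record it, and wait for a snapshot.
            self.gaps.append((expected, seq))
            self.state = State.RECOVERING
            self.pending.append((seq, price, size))
            return
        self._apply(seq, price, size)

    def on_snapshot(self, snapshot_seq, levels):
        self.levels = dict(levels)
        self.last_seq = snapshot_seq
        self.state = State.LIVE
        # Replay held updates in order; those the snapshot covers are dropped
        # by the seq < expected check, and a fresh gap re-enters recovery.
        held, self.pending = sorted(self.pending, key=lambda m: m[0]), []
        for seq, price, size in held:
            self.on_incremental(seq, price, size)

    def _apply(self, seq, price, size):
        if size == 0:
            self.levels.pop(price, None)
        else:
            self.levels[price] = size
        self.last_seq = seq
```

The point of the design is that the book is never updated across a gap: while `state` is `RECOVERING`, the strategy can see that the view is untrusted and act accordingly, instead of trading on a silently diverged book.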

3. Timestamp ambiguity
Each market data update carries multiple timestamps: the venue's transact time (when the matching engine processed the event), the venue's transmit time (when it left the venue), the handler's receive time (when it landed at your network interface), and the strategy's process time (when the application logic saw it). The relevant timestamp for any given decision depends on what the decision is sensitive to.
On a spike, the gap between transact time and process time widens to milliseconds. A strategy that uses process time as a proxy for transact time will see correct-looking data that is, in event time, already stale. Decisions based on that data will be late by exactly that difference.
The discipline here is to carry all four timestamps through the data pipeline and use the most appropriate one at each decision point. Drovix's feed handlers stamp all four into the in-memory event record; the strategy specifies which timestamp it cares about per event type.
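One way to picture an event record that carries all four timestamps is sketched below. The field names, the `Clock` enum, and the per-event-type preference table are hypothetical and do not describe Drovix's actual record layout. The sketch also assumes venue and local clocks are synchronised closely enough for cross-clock differences to be meaningful.

```python
from dataclasses import dataclass
from enum import Enum


class Clock(Enum):
    TRANSACT = "transact_ns"   # matching engine processed the event
    TRANSMIT = "transmit_ns"   # event left the venue
    RECEIVE = "receive_ns"     # event landed at the local network interface
    PROCESS = "process_ns"     # application logic saw the event


@dataclass(frozen=True)
class MarketEvent:
    event_type: str
    transact_ns: int
    transmit_ns: int
    receive_ns: int
    process_ns: int

    def at(self, clock):
        """Return the timestamp recorded by the given clock."""
        return getattr(self, clock.value)

    def lag_ns(self, start, end):
        """Elapsed nanoseconds between two of the four stamps."""
        return self.at(end) - self.at(start)


# Hypothetical preference table: which clock each event type should be judged by.
PREFERRED_CLOCK = {"trade": Clock.TRANSACT, "quote": Clock.RECEIVE}


def decision_time_ns(event):
    """Pick the timestamp the strategy declared for this event type."""
    return event.at(PREFERRED_CLOCK.get(event.event_type, Clock.TRANSACT))


event = MarketEvent("trade", 1_000, 1_200, 5_000, 2_005_000)
print(event.lag_ns(Clock.TRANSACT, Clock.PROCESS))  # event-time staleness in ns
```

Keeping the stamps separate means the widening described above is directly observable as `lag_ns(Clock.TRANSACT, Clock.PROCESS)`, rather than hidden inside a single "time" field.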
What honest measurement looks like
A market-data SLO that is worth committing to has three numbers per feed:
- Median wire-to-decision latency, measured continuously, with an alert threshold at 5x median.
- 99th-percentile latency over the prior trading day, published the next morning.
- Worst-second latency in the prior trading day — the maximum latency observed within any 1-second window — published as a tail-risk measure.
The median is the easy number and is the only one most providers publish. The 99th and the worst-second are the numbers that determine whether a strategy survives a stress event. A provider that publishes only the median is a provider whose tail is undisciplined; a provider that publishes all three is one that has at least measured the right thing.
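The three numbers can be computed from the same stream of latency samples. The sketch below is one possible reading, with assumptions stated up front: the function names are hypothetical, the percentile uses the nearest-rank method, and worst-second is taken literally as the highest latency seen in any 1-second bucket, reported together with the second in which it occurred.

```python
import math
from collections import defaultdict
from statistics import median


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_slo(samples):
    """samples: (seconds_since_session_start, wire_to_decision_latency_us) pairs."""
    samples = list(samples)
    if not samples:
        raise ValueError("no samples")
    latencies = [latency for _, latency in samples]

    # Bucket by whole second and keep the worst latency seen in each bucket.
    per_second = defaultdict(float)
    for ts, latency in samples:
        second = int(ts)
        per_second[second] = max(per_second[second], latency)
    worst_second, worst_latency = max(per_second.items(), key=lambda kv: kv[1])

    med = median(latencies)
    return {
        "median_us": med,
        "p99_us": percentile(latencies, 99),
        "worst_second": worst_second,
        "worst_second_us": worst_latency,
        "alert_threshold_us": 5 * med,   # the 5x-median alert from the list above
    }


# Hypothetical samples: one bad message in second 1 dominates the tail.
report = latency_slo([(0.1, 10), (0.5, 12), (1.2, 11), (1.9, 900), (2.3, 13)])
print(report)
```

In this toy data the median barely notices the 900-microsecond outlier while the tail measures are defined by it, which is why a median-only report says so little about stress behaviour.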
Drovix's market-data discipline
Drovix runs its own feed handlers in C++ for every venue we connect to, with pre-allocated memory and single-producer/single-consumer ring buffers between layers. Heartbeat and gap detection are mandatory on every feed; a gap fires an automatic snapshot recovery. All four timestamps are carried through to the application layer.
The full latency budget — and how the data path interacts with the execution path inside our engine — is the subject of Microseconds Matter. The market-data layer is roughly 40% of the total budget on a normal day and rises to 60% on a spike day; the budget would not survive if the data path were treated as an afterthought.
We publish median, 99th-percentile, and worst-second latency per venue per day, available on request to any institutional counterparty. The discipline of publishing the worst-second is itself part of the discipline of running the system; the number is uncomfortable when it is large, and the discomfort is what motivates the engineering to make it smaller.
Operational implications for the desk
- Ask your provider for the worst-second latency, not the median. If they cannot answer, they are not measuring it.
- Backtest with the same data path you intend to trade on, not a clean historical feed. The clean feed flatters the strategy; the live feed is what the strategy will actually see.
- Define a degradation policy: when wire-to-decision latency exceeds threshold X, the strategy pauses or moves to safer order types. The decision should be in code, not in a runbook.
- Treat any market data gap as a real event, not a noise artifact. Log it, alert on it, and reconcile the book to the venue's truth after each one.
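A degradation policy that lives in code rather than in a runbook might look like the sketch below. Everything here is an assumption for illustration: the mode names, the two thresholds, the reading of "safer order types" as passive-only quoting, and the rule that escalation is immediate while recovery needs several consecutive healthy observations.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    PASSIVE_ONLY = "passive_only"   # hypothetical stand-in for "safer order types"
    PAUSED = "paused"


_SEVERITY = {Mode.NORMAL: 0, Mode.PASSIVE_ONLY: 1, Mode.PAUSED: 2}


class DegradationPolicy:
    def __init__(self, degrade_us, pause_us, recover_after=3):
        if not 0 < degrade_us < pause_us:
            raise ValueError("need 0 < degrade_us < pause_us")
        self.degrade_us = degrade_us
        self.pause_us = pause_us
        self.recover_after = recover_after
        self.mode = Mode.NORMAL
        self._calmer = []   # consecutive observations milder than the current mode

    def _classify(self, latency_us, gap_detected):
        if gap_detected or latency_us >= self.pause_us:
            return Mode.PAUSED
        if latency_us >= self.degrade_us:
            return Mode.PASSIVE_ONLY
        return Mode.NORMAL

    def observe(self, latency_us, gap_detected=False):
        """Feed one wire-to-decision measurement; returns the mode to trade in."""
        target = self._classify(latency_us, gap_detected)
        if _SEVERITY[target] >= _SEVERITY[self.mode]:
            # Escalate (or hold) immediately; any calm streak is broken.
            self.mode = target
            self._calmer = []
        else:
            # Step down only after a full streak, and only to the worst level seen in it.
            self._calmer.append(target)
            if len(self._calmer) >= self.recover_after:
                self.mode = max(self._calmer, key=_SEVERITY.get)
                self._calmer = []
        return self.mode
```

The asymmetry is deliberate: going defensive on a single bad reading is cheap, while resuming normal trading on a single good reading in the middle of a spike is not.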
Where to go next
→ FIX Tags That Decide Fill vs Re-Quote — the execution-side equivalent of this discussion: the protocol decisions that decide whether your messages survive the spike.
→ Microseconds Matter — the full latency budget for the path from venue to your decision and back, with the engineering choices that earn each microsecond.
Analyst Desk
Drovix Research Desk
Institutional Research
Drovix Research Desk publishes institutional-grade analysis covering macro events, cross-asset correlations, and execution insights for professional market participants.
