2025-03-02
Checksum habits for distributor CSV drops
By Dae Park
data · pipelines · integrity
Distributor CSVs arrive with heroic filenames and invisible encoding swaps. Before any atlas merge, we hash each file at ingest and compare its schema fingerprint against the prior night’s.
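A minimal sketch of the ingest check, under stated assumptions: the post does not define "schema fingerprint", so here it is assumed to be a SHA-256 over the normalized header row, kept separate from the whole-file hash. The function names (`file_sha256`, `schema_fingerprint`, `schema_drifted`) and the UTF-8 default are illustrative, not the pipeline's actual code.

```python
import csv
import hashlib
import io


def file_sha256(raw: bytes) -> str:
    """Hash the exact bytes received, so any byte-level change is detectable."""
    return hashlib.sha256(raw).hexdigest()


def schema_fingerprint(raw: bytes, encoding: str = "utf-8") -> str:
    """Hash only the header row (assumed definition of a schema fingerprint)."""
    # Decoding can raise UnicodeDecodeError if the file's encoding changed.
    text = raw.decode(encoding)
    header = next(csv.reader(io.StringIO(text)), [])
    normalized = "|".join(name.strip().lower() for name in header)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def schema_drifted(raw: bytes, previous_fingerprint: str) -> bool:
    """True when tonight's header does not match the stored fingerprint."""
    try:
        return schema_fingerprint(raw) != previous_fingerprint
    except UnicodeDecodeError:
        # Bytes that no longer decode are treated as drift, not as a crash.
        return True
```

Keeping the two hashes apart matters because the file hash is expected to change whenever the data does, while the header fingerprint should stay stable from night to night.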
When hashes drift, we pause automation and send a human-readable diff to both commercial and IT distribution lists. That sounds noisy, but it prevents silent corruption from infecting board decks.
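One way to produce that human-readable diff is the standard library's `difflib`, comparing column names one per line. This is a sketch only: `header_diff` and `drift_report` are hypothetical helpers, and actually pausing automation or mailing the distribution lists is left out because the post does not describe that tooling.

```python
import difflib
from typing import Optional, Sequence


def header_diff(previous: Sequence[str], current: Sequence[str]) -> str:
    """Return a unified diff of column names, one column per line."""
    lines = difflib.unified_diff(
        list(previous),
        list(current),
        fromfile="previous_header",
        tofile="current_header",
        lineterm="",
    )
    return "\n".join(lines)


def drift_report(
    filename: str, previous: Sequence[str], current: Sequence[str]
) -> Optional[str]:
    """Build a plain-text report, or return None when the headers match."""
    diff = header_diff(previous, current)
    if not diff:
        return None
    return f"Schema drift in {filename}; automation paused.\n{diff}"
```

A renamed column then shows up as a `-region` / `+territory` pair, which reads the same way to a commercial analyst as to an engineer.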
We recommend keeping a thirty-day rolling manifest even if the storage feels indulgent: regulators love predictable retention, and so will your future self.
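The rolling window can be sketched as a small prune step that runs with each ingest. The entry layout (`ingested`, `file`, `sha256` keys) and the function name `add_and_prune` are assumptions for illustration; only the thirty-day figure comes from the recommendation above.

```python
from datetime import date, timedelta

RETENTION_DAYS = 30  # the thirty-day window recommended above


def add_and_prune(manifest: list, entry: dict, today: date) -> list:
    """Append tonight's entry and drop entries outside the retention window."""
    cutoff = today - timedelta(days=RETENTION_DAYS)
    # Keep entries ingested strictly after the cutoff: today plus the 29 days before it.
    kept = [e for e in manifest if date.fromisoformat(e["ingested"]) > cutoff]
    kept.append(entry)
    return kept
```

Passing `today` in explicitly, rather than reading the clock inside the function, keeps the retention rule easy to test and to replay for an audit.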
Checksums cannot fix upstream laziness, but they make laziness visible early enough to fix without panic.