Modern autonomy development relies on stored data to train algorithms and models and to validate their performance. However, the community developing autonomous ground vehicles for national defense lacks readily available datasets that adequately cover the range of anticipated operating environments. We propose an open architecture and supporting infrastructure that enable scalable and effective collection, storage, processing, and reuse of the U.S. Army’s autonomous ground vehicle data across numerous stakeholders and programs. This paper presents the proposed architecture’s requirements, use cases, and preliminary design. We also present results from an initial prototype implementation that performs a query task on existing ground vehicle sensor data.