OpenStreetMap Traces


The availability and use of embedded systems are steadily increasing, and industry is driving the IoT to become ever more relevant in daily life. Factory lines, planes, cars, traffic lights, and even clothes are equipped with sensors and small computers that constantly communicate with the outside world. One challenge in maintaining these devices is updating their software. Because of slow connections, or simply because of the sheer number of devices, data transfers can be problematic. Data compression algorithms can be applied to reduce the amount of data that must be transferred. Data deduplication is a data reduction technique that provides high efficiency but has not been considered so far for embedded systems. In this work we present the results of a long-term study of updating a car multimedia system. The results show that deduplication can achieve significantly better results than commonly used data compression techniques.

Data set description

The data set consists of complete virtual machine images (VirtualBox) collected over a period of more than 2 years. The images are based on Debian Linux and contain a small Linux base system, a MATE display manager, standard multimedia applications, and navigation software including OpenStreetMap data for Germany. All data was updated on an almost weekly basis. During the observation period the image size grew from about 5 GiB to about 5.5 GiB, without the user installing any additional software packages.
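As a rough illustration of how deduplication could reduce the data transferred between two successive image snapshots, the following sketch counts the bytes of a new image that are not already present in an old one. It is only a minimal sketch under stated assumptions: fixed-size 4 KiB chunks and SHA-1 fingerprints are chosen here for simplicity and are not necessarily the chunking or hashing used in the study, and the function names `chunk_fingerprints` and `new_bytes_after_dedup` are hypothetical.

```python
import hashlib


def chunk_fingerprints(data: bytes, chunk_size: int = 4096) -> list[bytes]:
    """Split data into fixed-size chunks and return one SHA-1 digest per chunk."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    ]


def new_bytes_after_dedup(old: bytes, new: bytes, chunk_size: int = 4096) -> int:
    """Count bytes in `new` whose chunks are neither in `old` nor seen earlier in `new`.

    Assumption: fixed-size chunking; a chunk counts as a duplicate only if
    its fingerprint matches exactly.
    """
    known = set(chunk_fingerprints(old, chunk_size))
    to_transfer = 0
    for i in range(0, len(new), chunk_size):
        chunk = new[i:i + chunk_size]
        fingerprint = hashlib.sha1(chunk).digest()
        if fingerprint not in known:
            # First time this chunk is seen: it would have to be transferred.
            known.add(fingerprint)
            to_transfer += len(chunk)
    return to_transfer


if __name__ == "__main__":
    # Toy stand-ins for two weekly snapshots (real images are several GiB).
    old_image = b"A" * 4096 + b"B" * 4096
    new_image = b"A" * 4096 + b"C" * 4096 + b"B" * 4096
    print(new_bytes_after_dedup(old_image, new_image))
```

For real multi-gigabyte images the files would be read in a streaming fashion rather than loaded into memory. Fixed-size chunking is also sensitive to insertions that shift later data, which is why content-defined chunking is often preferred in practice.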


Please cite the following paper in case you are using our traces for your research:


  • Tim Süß, Tunahan Kaya, Markus Mäsker, and André Brinkmann. 2018. Deduplication Analyses of Multimedia System Images. In USENIX Workshop on Hot Topics in Edge Computing (HotEdge).


The metadata of the individual VMs, including their iRODS tickets, is available in the DedupVMBackups Collection.

The metadata of each individual file can be downloaded using the ticket numbers from the overall metadata, e.g., using

Individual virtual machines can then be downloaded using the ticket number obtained from the metadata, e.g.,

The image has been provided by Ken Vermette under CC 2.0.