Martin Hoppenheit
archivist ∧ computer scientist → digital preservation wizard

XML stream processing with Haskell

Memory-efficient processing of large XML documents requires a streaming parser. This post introduces XML stream processing with the Haskell programming language, in particular the streaming API of the xml-conduit package. It shows examples of reading, writing and transforming XML data in a conduit pipeline.

Writing binary by hand

This post is based on a talk I gave at iPres 2022 (slides). It explains how to read a file format specification (in this example, TIFF) and, based on that, how to build a minimal binary file by hand.

Rotating Matroska video

Videos created by mobile phones are often rotated by means of embedded metadata tags. Playback software that respects these tags applies the correct rotation when rendering a video, while software that doesn’t displays only the original, un-rotated (and therefore tilted) video. Let’s see if we can rotate a Matroska (mkv) video using metadata! (Spoiler: not really, but maybe in the future.)