Ilmari's Website


Validating MP3 Files with Schematron via XProc

Posted: 14th October 2025

Introduction

As I have been learning more about XProc, I have become curious about extending its use beyond working with XML. For my private music project, I maintain a website which serves an audio/mpeg demo stream generated with Icecast2 and Ezstream. The audio files are stored on the same server, and contain metadata that I would like to keep consistent across the site. This metadata should be the source of truth, as it is utilised by other pages, such as the archive and the newest demos feed [1]. To that end it needs to contain, for example, the file’s published date in the format YYYY-MM, which is used to sort files [2]. Additionally, I want to ensure that the MP3 source files have a certain sample rate, bit depth and the correct number of channels [3].

XProc

As the site is generated with XProc, I added a step that pulls the metadata from each file and validates it before the site is pushed live. I quite like this approach, as it fits seamlessly into the site’s pre-existing build process (itself XML-driven) and runs entirely in memory. The snippet below reflects one approach: a self-contained pipeline wrapped in p:group. Firstly, the pipeline calls ffprobe [4] via the p:os-exec step for each MP3 file returned by p:directory-list, with the output format set to XML. The output is then cast to application/xml so that the XProc engine treats it as XML and not, for example, as text/plain. Once the loop finishes, the resulting sequence of XML documents is piped into p:wrap-sequence to create a master XML document that can be conveniently processed with Schematron. The output of this pipeline is held in memory and exposed through the p:output declaration (line 2), so that it can be read by other processes elsewhere in the project.
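To make the shape of that master document concrete, here is a trimmed, hypothetical sketch of what the wrapped ffprobe output might look like. The element and attribute names follow the paths used in the Schematron pattern, but every value (file name, title, URL, email address, bit rate) is invented for illustration, and real ffprobe output carries many more attributes than shown.

```xml
<!-- Illustrative only: all values are made up. -->
<mp3-data>
  <ffprobe>
    <streams>
      <!-- The audio stream that the stream-level checks inspect -->
      <stream index="0" codec_name="mp3" codec_type="audio"
              sample_rate="44100" channel_layout="stereo"
              bit_rate="320000"/>
      <!-- An embedded cover image appears as an additional stream -->
      <stream index="1" codec_name="png" codec_type="video"/>
    </streams>
    <format filename="example-demo.mp3" format_name="mp3">
      <tags>
        <tag key="title" value="Example Demo"/>
        <tag key="comment"
             value="https://example.org,me@example.org,2025-10"/>
        <!-- ...remaining required tags... -->
      </tags>
    </format>
  </ffprobe>
  <!-- ...one ffprobe element per MP3 file in the directory... -->
</mp3-data>
```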

Schematron

Schematron provides a flexible way to assert data constraints via XPath, which is perfect for checking whether element and attribute values are as expected. An example pattern is shared below that reflects the output of the XProc pipeline above. Among other tasks, this pattern stores the required tags in a map (a key-value structure), whose entries are then iterated over to check for empty values. As another example, a required comma-separated comment entry ending in a timestamp, as mentioned above, is matched via a simple regular expression. In these checks, the validating pipeline halts with an error code if any of the assertions fails, returning a short human-readable message on standard error. But it could just as well allow the pipeline to continue while generating a validation report and delivering it via, for example, email.

Listing 1: Group example

 1: <p:group name="generate-mp3-data">
 2:   <p:output pipe="result@add-tracks"
 3:             serialization="map{ 'indent': true() }"/>
 4:   <p:directory-list path="{$uri-demo-dir}"/>
 5:   <p:for-each name="generate-mp3-data-as-xml">
 6:     <p:with-input select="//c:file"/>
 7:     <p:variable name="mp3-file-name"
 8:                 select="c:file/@name[fn:ends-with(., '.mp3')]"/>
 9:     <p:variable name="uri-mp3-file-path"
10:                 select="fn:concat($uri-demo-dir, '/', $mp3-file-name)"/>
11:     <p:os-exec command="ffprobe"
12:                cwd="{$uri-demo-dir}"
13:                message="generating xml data for {$mp3-file-name}"
14:                name="generate-mp3-date">
15:       <p:with-option name="args"
16:                      select="(
17:                              '-v',
18:                              'quiet',
19:                              '-print_format', 'xml',
20:                              '-show_format',
21:                              '-show_streams',
22:                              '-i', $uri-mp3-file-path
23:                              )"/>
24:     </p:os-exec>
25:     <p:cast-content-type content-type="application/xml"/>
26:   </p:for-each>
27:   <p:wrap-sequence wrapper="mp3-data"
28:                    name="master-mp3-data"
29:                    message="generating master mp3 data"/>
30:   <p:validate-with-schematron name="validate-mp3-files"
31:                               assert-valid="true"
32:                               message="validating mp3 files"
33:                               phase="mp3-checks"
34:                               report-format="xvrl">
35:     <p:with-input port="schema"
36:                   href="../sch/post.sch"/>
37:   </p:validate-with-schematron>
38: </p:group>
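
The non-halting alternative mentioned in the Schematron section could be sketched roughly as follows, as a replacement for lines 30 to 37 of Listing 1. It assumes the standard report port of p:validate-with-schematron and writes the report to disk with p:store. The file name mp3-validation-report.xml is a hypothetical placeholder, and delivering the report by email is left out, as that would need an additional step or tool.

```xml
<!-- Sketch: with assert-valid="false" the step no longer raises an
     error on failed assertions; the findings go to the report port. -->
<p:validate-with-schematron name="validate-mp3-files"
                            assert-valid="false"
                            message="validating mp3 files"
                            phase="mp3-checks"
                            report-format="xvrl">
  <p:with-input port="schema"
                href="../sch/post.sch"/>
</p:validate-with-schematron>
<!-- Hypothetical file name: store the report for later inspection -->
<p:store href="mp3-validation-report.xml">
  <p:with-input pipe="report@validate-mp3-files"/>
</p:store>
```

The trade-off is that a broken file no longer stops the build, so something downstream has to read the report and act on it.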

Listing 2: Schematron example

<pattern id="mp3-checks-main">
  <title>mp3 checks</title>
  <!-- Audio stream checks: codec, channel layout, sample rate, bit rate -->
  <rule context="mp3-data/ffprobe/streams/stream[@codec_type='audio']">
    <assert test="@codec_name='mp3'">
      ❌ Audio stream must be MP3. Check <value-of
      select="ancestor::ffprobe/format/tags/tag[@key='title']/@value"/>.&#10;
    </assert>
    <assert test="@channel_layout='stereo'">
      ❌ Audio stream must be stereo. Check <value-of
      select="ancestor::ffprobe/format/tags/tag[@key='title']/@value"/>.&#10;
    </assert>
    <assert test="@sample_rate='44100'">
      ❌ Audio stream must have 44100 Hz sample rate. Check
      <value-of select="ancestor::ffprobe/format/tags/tag[@key='title']/@value"/>.&#10;
    </assert>
    <!-- bit_rate is reported in bits per second, so this is a bit rate floor -->
    <assert test="xs:integer(@bit_rate) gt 20000">
      ❌ Audio stream bit rate must be above 20000 bit/s. Check <value-of
      select="ancestor::ffprobe/format/tags/tag[@key='title']/@value"/>.&#10;
    </assert>
  </rule>
  <!-- Cover image check: an embedded PNG shows up as an extra stream -->
  <rule context="mp3-data/ffprobe/streams">
    <assert test="stream[@codec_name='png']">
      ❌ mp3 file must have a cover image. Check <value-of
      select="ancestor::ffprobe/format/tags/tag[@key='title']/@value"/>.&#10;
    </assert>
  </rule>
  <rule context="mp3-data/ffprobe/format">
    <!-- Declared inside the rule so that the map is built per file, relative
         to the current format element, rather than once for all files -->
    <let name="required-tags"
         value="map{
                'title':        tags/tag[@key='title']/@value,
                'artist':       tags/tag[@key='artist']/@value,
                'album_artist': tags/tag[@key='album_artist']/@value,
                'album':        tags/tag[@key='album']/@value,
                'copyright':    tags/tag[@key='copyright']/@value,
                'genre':        tags/tag[@key='genre']/@value,
                'comment':      tags/tag[@key='comment']/@value,
                'composer':     tags/tag[@key='composer']/@value,
                'TOPE':         tags/tag[@key='TOPE']/@value,
                'date':         tags/tag[@key='date']/@value
                }"/>
    <assert test="@format_name='mp3'">
      Format must be MP3.
    </assert>
    <!-- A missing tag yields an empty sequence, which also fails this test -->
    <assert test="every $k in map:keys($required-tags)
                  satisfies $required-tags($k) != ''">
      ❌ Format must contain all required tags (title, artist,
      album_artist, album, copyright, genre, comment, composer,
      TOPE, date) and they must not be empty. Check <value-of
      select="ancestor::ffprobe/format/tags/tag[@key='title']/@value"/>.&#10;
      <value-of select="for $k in map:keys($required-tags) return $k"/>
    </assert>
    <assert test="fn:matches(
                  tags/tag[@key='comment']/@value,
                  '^[^,]+,[^,]+,\d{4}-\d{2}$'
                  )
                  ">
      ❌ Comment tag must follow format: URL,EMAIL,YYYY-MM.
      Check <value-of select='tags/tag[@key="title"]/@value'/>.
    </assert>
  </rule>
</pattern>
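
Listing 2 shows only the pattern, so for context here is a minimal sketch of the kind of schema that could enclose it. This is an assumption about the surrounding file rather than the actual contents of post.sch: it supposes an XSLT 3 query binding (needed for the map syntax), declares the fn, map and xs prefixes used in the XPath expressions, and defines the mp3-checks phase that Listing 1 passes in the phase option.

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron"
        queryBinding="xslt3">
  <!-- Prefixes used in the pattern's XPath expressions -->
  <ns prefix="fn"  uri="http://www.w3.org/2005/xpath-functions"/>
  <ns prefix="map" uri="http://www.w3.org/2005/xpath-functions/map"/>
  <ns prefix="xs"  uri="http://www.w3.org/2001/XMLSchema"/>
  <!-- The phase selects which patterns are active for this validation run -->
  <phase id="mp3-checks">
    <active pattern="mp3-checks-main"/>
  </phase>
  <!-- The pattern with id="mp3-checks-main" from Listing 2 goes here -->
</schema>
```

Using a phase means the same schema file can hold unrelated patterns (for example, checks for the site's text content) without them running against the MP3 data.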

Conclusion

While applied on a small scale, I thought that this was a convenient way to validate MP3 files in a production pipeline. XProc is quickly becoming one of my favourite tools for managing publishing scripts, but my use of it has been limited to digital text media. The p:os-exec step, however, makes it possible to draw on a number of command-line audio tools that support standard input and output. As a next step, it would be interesting to see if there are any use cases for XProc with SoX or FFmpeg for manipulating audio files. This could involve, for example, generating SoX commands dynamically and then piping the results into SoX as stdin. I am also looking for excuses to use the Invisible XML step, which could involve piping SoX stdout into an iXML grammar. However, I am not quite sure what problem this would solve (yet) 🤔.

Footnotes

[1] The demo archive here contains a table with the audio title, year and length in minutes.
[2] This string is contained in the comments metadata field.
[3] All audio assets are generated with REAPER using a standardised export profile.
[4] As per the ffprobe documentation, ffprobe is a tool that ‘gathers information from multimedia streams and prints it in human- and machine-readable fashion.’