Dear ConTeXt community,

recently, as part of my bachelor thesis, I looked into the state of
multimedia (audio, video, 3D) and other relatively obscure PDF features
in connection with TeX. I have put off this write-up long enough.
Hopefully it clarifies things and helps with the current uncertainty
about these features and their presence in ConTeXt (see the abstract of
Hans' talk for the upcoming ConTeXt meeting, chapter 8 of "On and on"
and remarks in files like "lpdf-wid.lmt").

The following text discusses the options for including multimedia in PDF
files from ConTeXt. First the different PDF mechanisms are introduced
and compared, then their support in ConTeXt is summarized. My ideas for
future steps are also included, and a patch that fixes some bugs is
below. Lastly I link to some resources on the topic.

Over the years the various versions of the PDF standard have developed
several ways of including "multimedia" in PDF files. The simplest are
XObjects, which allow raster and vector graphics -- a well known feature
that is well supported in both PDF writers and viewers. However, later
revisions of the PDF standard added what are essentially five different
mechanisms for including video, audio or 3D files (each mechanism
supports a different subset of these three).

To evaluate these mechanisms from the perspective of ConTeXt, I use the
following criteria:

- support in the PDF standard (deprecation, etc.),
- supported media types (audio, video [+ optional audio], 3D),
- supported source types (embedded file, external file, URL),
- what is possible to achieve ("usefulness") and at what cost
  ("complexity"),
- current support in ConTeXt,
- and the most important: support in PDF viewers.

Perhaps the "source types" need a bit of explanation. Essentially all
file references from PDF can be one of three types:

- Embedded file.
  The referenced file is *in* the PDF file, which means that it can also
  be compressed as part of it (not very useful for multimedia, though).
  This is nice because the result is self-contained -- the media file
  can't get lost and there is only a single file to distribute.
- URL file. The reference to the file is solely the URL. It takes almost
  no space in the PDF file, but the availability of the media file and
  that of the PDF file are not tied together.
- External file. A file path is included in the PDF file. The file
  doesn't have to be available over the internet, but it has to be
  distributed along with the PDF file (and the relative path has to
  match).

The "usefulness" aspect includes the possibility of interaction or
scripting, e.g. media player buttons ("controls"), scripting with
JavaScript, or some control with PDF actions (\goto, and triggers like
page open, which allow auto-play).

The viewers I tested were: Acrobat Reader DC, Foxit Reader and Sumatra
PDF on Windows; Evince, Okular, Xpdf, MuPDF, Firefox and Google Chrome
on Linux.

Now to the different mechanisms:

1) Sound objects

- First appeared in PDF 1.2 (1996), but have since been deprecated
  (PDF 1.5, 2003) and became unsupported (PDF 2.0, 2017).
- Only audio is supported.
- "Raw" and in practice uncompressed PCM audio can be embedded (i.e. the
  ".wav" format without the metadata). Otherwise an external file may be
  used (this one has to be in a real audio format, like ".wav", i.e.
  with metadata).
- Users usually don't have raw audio, so embedding requires
  preprocessing. Some control using PDF actions is possible.
- Not supported in ConTeXt.
- None of the viewers supports external files. Only Acrobat Reader
  supports the embedded raw audio.

2) Movie objects

- First appeared in PDF 1.2 (1996), but have since been deprecated
  (PDF 1.5, 2003) and became unsupported (PDF 2.0, 2017).
- Both video and audio are supported.
- Any source (embedded, external and URL file).
- In all regards superior to sound objects.
  It is still relatively simple and allows some customization and
  control (media player controls, PDF actions).
- This is the backing mechanism for including video and audio in ConTeXt
  (\externalfigure, \useexternalsoundtrack).
- Supported only in Evince and Okular (with their usual quirks, see
  below). Notably, Acrobat Reader no longer supports this mechanism.

3) Multimedia ("Renditions")

- First appeared in PDF 1.5 (2003). Adobe Acrobat considers them
  "legacy".
- Both video and audio are supported (as well as other unspecified types
  of multimedia, like images and Flash, but not really, see below).
- In theory all source types should be possible.
- This mechanism was supposed to replace sound and movie objects, hence
  their deprecation. The mechanism is complex (the spec is 10 times
  longer than that for movie objects). It expects the PDF viewers to
  work with plugins and introduces ways of determining whether a media
  file is really playable in some plugin. It even allows including
  several media files (to serve as fallbacks should the primary one be
  unsupported by the viewer). Another complication is that the concept
  of the rectangle where the media will be played ("screen") is
  separated from the media itself ("rendition"). In theory this allows
  mixing and matching them, but in practice it is a lot of unnecessary
  complexity, at least in my opinion.
  This mechanism allows multimedia player controls, as well as PDF
  actions. The PDF action can be either one of the predefined ones or
  entirely specified in JavaScript (an extra API is available for this).
- This is the mechanism behind \useexternalrendering. As far as I can
  tell it has been used for Flash (.swf files) and "manual" audio +
  video insertion.
- Evince and Okular support this (with the usual quirks), but not for
  external files (Evince segfaults). Acrobat and Foxit support this
  mechanism as well, but Acrobat only allows embedded files. Okular by
  mistake auto-plays all media [1].

4) 3D art

- First appeared in PDF 1.6 (2004).
- Only 3D files are supported. This means U3D and, later, PRC files. The
  3D objects described in the files are shown in a scene whose
  parameters (like camera position, angle, background color, etc.) can
  be configured.
- The source is not a file, but a "PDF stream" (which is essentially an
  embedded file with different metadata, but which also allows "external
  files" to contain the stream data).
- The 3D functionality is nice. It allows a great amount of
  interactivity (playing with the camera, selectively disabling 3D
  objects, etc.) and also scriptability (switching between predefined
  "views" with PDF actions and a _lot_ of possibilities with JavaScript
  scripts).
- This is the mechanism used for u3d and prc files in the ConTeXt
  "figure" mechanism (\externalfigure).
- Apart from the external streams (see above) everything works in Adobe
  Acrobat. Foxit Reader also has support, but it is limited (no support
  for JavaScript and printing).

5) Rich Media

- First appeared in Adobe extension level 3 to PDF 1.7 (2008). Later
  included in PDF 2.0 (2017). It was meant to replace both the
  multimedia (renditions) and 3D art mechanisms with a unified mechanism
  based on Flash, thus also supporting arbitrary Flash applications.
- Supports video, audio and 3D.
- Only embedded files are supported.
- While the mechanism is heavily based on Flash (which is dead, since
  December 2020), it also allows "plain" Rich Media without Flash. The
  old idea was that the PDF viewer would support Flash (and play its
  video formats as well as mp4), and that audio/video would be played
  not directly by the PDF viewer but by a Flash application (embedded in
  the PDF along with the media file). This means that the mechanism has
  inherent complexity that is not justified nowadays (essentially four
  levels of indirection for a plain audio / video file). While the same
  thing should have been true for 3D files, I couldn't find any real
  usage like that.
  Instead it seems that 3D files with Rich Media have always been used
  as with the "3D art" mechanism (but with different wrappers).
  There is essentially no scriptability for audio and video. (Note that
  in this regard 3D files work just like with the "3D art" mechanism.)
  There also isn't an easy way to display multimedia player controls (a
  hack works for Acrobat). One thing it does allow is playing the media
  in a customizable window, even full screen (not only in a part of a
  page, as with the previous mechanisms).
- ConTeXt uses this mechanism for Flash (SWF) files in the figure
  mechanism. This also allows audio/video (a Flash media player like
  "vplayer" is inserted and the media file is its parameter), see for
  example "java-imp-vplayer.mkiv".
- Both Flash and "plain" Rich Media are supported by Acrobat Reader.
  Okular only supports Flash Rich Media. How is this possible,
  considering that Flash Player is dead? Well, both viewers have a
  compatibility layer that detects an embedded Flash media player file
  and, instead of using it, plays the video natively. This is good,
  because there are a lot of documents out there which use Flash based
  Rich Media. But there is absolutely no need to create new documents
  with embedded Flash player applications; they only take space and
  aren't even used.
  Okular notably doesn't support plain Rich Media. The support is easy
  to add, but my proposed patch [2] depends on changes to poppler. The
  poppler developers want to take this chance to improve the Rich Media
  representation [3], but I haven't gotten to that yet. Support similar
  to Okular's should be relatively easy to add to Evince as well.
  The 3D support in Acrobat Reader is the same as with 3D art. Weirdly,
  Foxit Reader doesn't support 3D files wrapped in Rich Media, although
  there doesn't seem to be any good reason for it.
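
To make the separation of "screen" and "rendition" in mechanism 3 a bit
more concrete, here is a hand-written sketch of the PDF objects involved.
It is only an illustration based on my reading of PDF 1.7: the object
numbers, the rectangle and the name "myvideo" are made up, and this is
not the output of ConTeXt or any other PDF writer.

```pdf
% Screen annotation: the rectangle on the page where the media plays
10 0 obj
<< /Type /Annot /Subtype /Screen
   /Rect [ 72 500 392 680 ]
   /P 3 0 R                  % page the annotation belongs to
>>
endobj

% Rendition: what is to be played
11 0 obj
<< /Type /Rendition /S /MR
   /N (myvideo)
   /C 12 0 R                 % media clip
>>
endobj

% Media clip: MIME type and the source of the data
12 0 obj
<< /Type /MediaClip /S /MCD
   /N (myvideo)
   /CT (video/mp4)
   /D 13 0 R                 % file specification (embedded, URL or external)
>>
endobj

% Rendition action: ties a rendition to a screen annotation
14 0 obj
<< /Type /Action /S /Rendition
   /OP 0                     % 0 = play, 1 = stop, 2 = pause
   /R 11 0 R
   /AN 10 0 R
>>
endobj
```

An action like the last object is what a link or a page-open trigger
would execute. Even this minimal case needs four objects plus the file
specification, which is the kind of indirection I mean by "complexity".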

All in all, of the five mechanisms two are deprecated and unfortunately
no longer supported in the most used PDF viewer, and two others are
needlessly complex and in reality limited. For example, while the
multimedia mechanism supports JavaScript, (AFAIK) only Acrobat Reader
supports that, so you have to choose between viewer support and
available features.

The support for video and audio in Okular and Evince is based on
GStreamer. Explaining GStreamer is tricky, but essentially it allows the
viewers to play any media type as long as the right plugin is installed.
These plugins are distributed in bundles, and three of them cover all
reasonable formats and more. But while the media file format support is
great, these viewers don't really support PDF actions or JavaScript for
more control over the media playback.

Acrobat and Foxit both use Windows Media Player for playing the video.
Both support controls, but behave differently -- Acrobat displays the
controls outside of the multimedia annotation, Foxit within...

As if that wasn't enough, there is other trouble with playing multimedia
in Acrobat Reader and Foxit Reader. They nag you to allow the media
playback every time. You can choose to trust the file once or from now
on, but somebody who opens a foreign PDF with video isn't going to get a
smooth experience. Another thing (but I don't remember it well) is that
there is a check box in Acrobat Reader that enables the "legacy"
Multimedia mechanism. I don't remember its state in an unaltered
installation.

After evaluating these mechanisms I came to the conclusion that a PDF
writer today is best off:

- Embedding video and audio using the "multimedia" ("renditions")
  mechanism. It is supported in proprietary and open source viewers
  alike. Customization and scripting / PDF actions are out of the
  question, though.
- Embedding 3D files using the "Rich Media" mechanism.
  While it differs from 3D art essentially only in a few wrappers, it
  has real advantages (data sources are files, not streams, and multiple
  JavaScript script files are supported) that I find nice enough for the
  implementation and users alike.

Some sources for this topic are also the LaTeX centric [4] and [5]. I go
into more detail in the former. In the latter the "plain" Rich Media and
"multimedia" ("renditions") mechanisms are suggested as replacements for
the Flash media player approach.

And now for the future. What should ConTeXt do? On one hand all
available mechanisms are flawed in one way or another. On the other hand
some users may still find the functionality useful. My suggestion is to
either delete all the support for audio/video or:

1) Delete the "Movie objects" implementation of figures. It is not
   supported in the viewers where users expect it to be [6].

2) Delete all mentions of Flash. There is no reason to create new
   documents with embedded Flash files, even though they may work in
   some viewers. Plain Rich Media can be used instead, hopefully soon
   with equal support [2].

3) The "externalrendering" mechanism (multimedia/renditions) can stay.
   If the insertion of audio/video as "figures" is to stay, then I
   suggest using multimedia/renditions for it (in a simplified form).

Note that the 3D support in ConTeXt is completely fine and works in
Acrobat and Foxit.

The "externalrendering" part currently has three "bugs". A previous
discussion on this list provides some context [7]. The following is
currently "wrong":

- ConTeXt wraps the PDF file specification for an embedded file inside
  another file specification (i.e. embedded files don't work).
- As a result of "externalrendering" inheriting from \framed, the PDF
  annotation late_lua whatsit is centered inside the frame, and so the
  annotation itself is offset by half its width to the right.
- ConTeXt doesn't explicitly allow the viewer to create temporary files,
  hence the playback fails in Acrobat Reader.
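
For the first point it may help to see what a file specification for
each of the three source types looks like in PDF code. Again this is a
hand-written sketch following PDF 1.7 (object numbers and the
example.org URL are made up, the stream data is elided), not actual
ConTeXt output.

```pdf
% Embedded file: the file specification refers to an embedded file stream
20 0 obj
<< /Type /Filespec /F (video.mp4)
   /EF << /F 21 0 R >>
>>
endobj
21 0 obj
<< /Type /EmbeddedFile /Subtype /video#2Fmp4 /Length ... >>
stream
...
endstream
endobj

% URL file: /F holds the URL and /FS marks it as such
22 0 obj
<< /Type /Filespec /FS /URL /F (https://example.org/video.mp4) >>
endobj

% External file: just a (relative) path
23 0 obj
<< /Type /Filespec /F (video.mp4) >>
endobj
```

As far as I understand the current code, the embedded case ends up with
/EF referring to another complete /Filespec dictionary instead of the
small << /F ... >> dictionary shown in the first object.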

Hopefully the patch included below fixes all three bugs. But note that
while I love ConTeXt, I don't know it well and may be terribly wrong. I
was also aiming at a minimal diff for inclusion in this e-mail.

This is a test file for this:

\starttext
\setupinteraction[state=start]
\useexternalrendering[myvideo][video/mp4][video.mp4][embed=yes]
\useexternalrendering[myvideo2][video/mp4][https://gitlab.com/agrahn/media9/uploads/c7e2ae944fbd711df4ad7bd58000f83a/ni...]
\useexternalrendering[myvideo3][video/mp4][video.mp4]
\definerenderingwindow[myrenderingwindow][width=\textwidth, height=\textwidth]
\noindent
\placerenderingwindow[myrenderingwindow][myvideo]
\goto{START}[StartRendering{myvideo}]
\goto{STOP} [StopRendering{myvideo}]
\goto{PAUSE}[PauseRendering{myvideo}]
\vfil\break\noindent
\placerenderingwindow[myrenderingwindow][myvideo2]
\vfil\break\noindent
\placerenderingwindow[myrenderingwindow][myvideo3]
\stoptext

All three file source types are demonstrated. Any "video.mp4" in the
directory you compile in will do. (Works as expected in Okular on
Linux.)

This was a dump of the knowledge that I gained from writing my thesis.
Sadly it's in Czech, but part of it is PDF code snippets and tables
summarizing viewer support, which I can translate and provide if there
is interest. But a large part of what I deem practical today is
implemented and documented here:
http://mirrors.ctan.org/macros/luatex/optex/pdfextra/pdfextra-doc.pdf.
The source is probably hard to read because of the "_" and "." prefixes
in the control sequences, but those can be ignored.

I posted some "real" documents in [3] and [5]. If more documents /
snippets / explanations are needed, I hope I can provide them.

Sadly, while working on this, I didn't have access to the PDF 2.0
standard. My information mostly comes from the PDF 1.7 standard and
publicly known information about PDF 2.0 -- the Rich Media mechanism got
included in PDF 2.0, but I am not sure to what extent the Flash part got
included.
I also don't know if there really is anything new, but nothing suggests
it. Regardless, viewer support isn't complete even for something
standardized over 20 years ago, so I don't expect a revolution in the
PDF viewers, considering the price of the standard(s).

[1]: https://bugs.kde.org/show_bug.cgi?id=436709
[2]: https://invent.kde.org/graphics/okular/-/merge_requests/426
[3]: https://gitlab.freedesktop.org/poppler/poppler/-/merge_requests/855
[4]: https://tex.stackexchange.com/questions/516029/media9-is-becoming-obsolete-d...
[5]: https://gitlab.com/agrahn/media9/-/issues/9
[6]: https://wiki.contextgarden.net/Command/externalfigure
[7]: https://www.mail-archive.com/ntg-context@ntg.nl/msg88639.html

Best regards,
Michal Vlasák

--- a/tex/texmf-context/tex/context/base/mkxl/lpdf-wid.lmt
+++ b/tex/texmf-context/tex/context/base/mkxl/lpdf-wid.lmt
@@ -689,22 +689,26 @@
         --         B = start,
         --     }
         -- }
-        -- local parameters = pdfdictionary {
-        --     Type = pdfconstant(MediaPermissions),
-        --     TF   = pdfstring("TEMPALWAYS") }, -- TEMPNEVER TEMPEXTRACT TEMPACCESS TEMPALWAYS
-        -- }
+        local parameters = pdfdictionary {
+            Type = pdfconstant("MediaPermissions"),
+            TF   = pdfstring("TEMPALWAYS"), -- TEMPNEVER TEMPEXTRACT TEMPACCESS TEMPALWAYS
+            -- TEMPALWAYS - allows temporary files (needed for Acrobat / Windows Movie Player)
+        }
         local descriptor = pdfdictionary {
             Type = pdfconstant("Filespec"),
             F    = filename,
         }
         if isurl then
             descriptor.FS = pdfconstant("URL")
+            descriptor = pdfreference(pdfflushobject(descriptor))
         elseif option[v_embed] then
-            descriptor.EF = codeinjections.embedfile {
+            descriptor = codeinjections.embedfile {
                 file     = filename,
                 mimetype = mimetype, -- yes or no
                 compress = false,
             }
+        else
+            descriptor = pdfreference(pdfflushobject(descriptor))
         end
         local clip = pdfdictionary {
             Type = pdfconstant("MediaClip"),
@@ -712,8 +716,8 @@
             N    = label,
             CT   = mimetype,
             Alt  = pdfarray { "", "file not found" }, -- language id + message
-            D    = pdfreference(pdfflushobject(descriptor)),
-         -- P    = pdfreference(pdfflushobject(parameters)),
+            D    = descriptor,
+            P    = pdfreference(pdfflushobject(parameters)),
         }
         local rendition = pdfdictionary {
             Type = pdfconstant("Rendition"),
--- a/tex/texmf-context/tex/context/base/mkxl/scrn-wid.mklx
+++ b/tex/texmf-context/tex/context/base/mkxl/scrn-wid.mklx
@@ -649,6 +649,7 @@
     \letrenderingwindowparameter\c!closepageaction\empty
     \setrenderingwindowparameter\c!width {\d_scrn_rendering_width }%
     \setrenderingwindowparameter\c!height {\d_scrn_rendering_height}%
+    \setrenderingwindowparameter\c!align {\v!flushleft}% don't center annotation whatsit
 \to \everypresetrenderingwindow
 
 \permanent\tolerant\protected\def\placerenderingwindow[#window]#spacer[#rendering]% do all in lua