Troubleshooting

Some errors occur again and again. This section describes how to fix them.

Workflow stalls at executing 'make4ht nondashed'

Problem

The console output looks like:

INFO - Wrote /eoa/output/studies10/imxml/tmp_files/bib2html/chapter_04-tmp/nondashed.tex
INFO - cd /eoa/output/studies10/imxml/tmp_files/bib2html/chapter_04-tmp
INFO - Wrote nondashed.mk4.
INFO - executing 'make4ht nondashed'
INFO - output to log file

Even after several minutes, the script does not proceed.

Solution

To create the XML and HTML bibliographies, the LaTeX bibfile needs to be converted to HTML. The system uses tex4ht for this task, wrapped in a build file (nondashed.mk4) that is run with make4ht.

The most common cause is a LaTeX command in the bibfile that is not supported. To narrow down the error, call the tex4ht tool on its own. The first line of the excerpt above shows the exact location of the problematic files. That directory contains a copy of the original bibliography file as well as the files required for the TeX run.

Running make4ht nondashed in that directory shows the original LaTeX output until it is stopped by an error. The displayed line:

l.116 \verb|Goebl1996|  &\cite{Goebl1996}&
                                        \cite*{Goebl1996}&\citefield{Goebl...

points to the faulty bibliography entry.
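Because a stalled run gives no feedback, it can help to run the tool with a time limit and pick the error location out of its output. The following is a minimal Python sketch and not part of the EOA scripts: the helper names run_tool and tex_error_lines are made up for illustration, the 120-second limit is arbitrary, and the only thing relied on is that TeX prefixes the offending input line with l.<number>, as in the excerpt above.

```python
import re
import subprocess


def run_tool(command, workdir, timeout=120):
    """Run `command` inside `workdir` and return its combined output as text.

    stdin is closed so the tool has no keyboard input to wait for, and the
    run is abandoned after `timeout` seconds instead of hanging forever.
    """
    try:
        done = subprocess.run(
            command,
            cwd=workdir,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            errors="replace",
            timeout=timeout,
        )
        return done.stdout
    except subprocess.TimeoutExpired as exc:
        # Depending on the Python version the partial output is str or bytes.
        partial = exc.stdout or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return partial


def tex_error_lines(output):
    """Return the lines in which TeX reports an error position (l.<number>)."""
    return [line for line in output.splitlines() if re.match(r"l\.\d+ ", line)]


# Hypothetical usage with the directory from the log excerpt:
# output = run_tool(
#     ["make4ht", "nondashed"],
#     "/eoa/output/studies10/imxml/tmp_files/bib2html/chapter_04-tmp",
# )
# print("\n".join(tex_error_lines(output)))
```

The citekey in the reported line is the entry to inspect next.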

Open the bibliography file and the nondashed.tex file in a text editor. The file nondashed.tex contains a tabular environment with one row per citekey. Comment out all rows before and after the row that caused the error:

% \verb|Johnston2007|  &\cite{Johnston2007}&\cite*{Johnston2007}&\citefield{Johnston2007}{title}\\
% \verb|Labov1971|  &\cite{Labov1971}&\cite*{Labov1971}&\citefield{Labov1971}{title}\\
\verb|Goebl1996|  &\cite{Goebl1996}&\cite*{Goebl1996}&\citefield{Goebl1996}{title}\\
% \verb|Salverda1996|  &\cite{Salverda1996}&\cite*{Salverda1996}&\citefield{Salverda1996}{title}\\
% \verb|Toso2008|  &\cite{Toso2008}&\cite*{Toso2008}&\citefield{Toso2008}{title}\\
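For a long table, commenting out rows by hand is tedious. A small Python sketch of the same step is shown here; the function name isolate_citekey is hypothetical, and it assumes that every table row starts with \verb|citekey| as in the lines above. Lines that do not look like table rows, including rows that are already commented out, are left untouched.

```python
import re

# A table row in nondashed.tex starts with \verb|<citekey>|.
ROW = re.compile(r"\s*\\verb\|([^|]+)\|")


def isolate_citekey(tex_source, citekey):
    """Comment out every table row except the one for `citekey`."""
    result = []
    for line in tex_source.splitlines():
        match = ROW.match(line)
        if match and match.group(1) != citekey:
            line = "% " + line
        result.append(line)
    return "\n".join(result) + "\n"


# Hypothetical usage:
# from pathlib import Path
# path = Path("nondashed.tex")
# path.write_text(isolate_citekey(path.read_text(), "Goebl1996"))
```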

Delete the auxiliary files:

rm nondashed.4* nondashed.aux nondashed.b* nondashed.dvi nondashed.log nondashed.run.xml nondashed.tmp nondashed.xref
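The same cleanup can be sketched in Python, which makes it easy to list first what would be deleted. The function name remove_aux_files and its dry_run parameter are illustrative assumptions; the patterns are copied from the rm command above. Note that nondashed.b* removes every file whose extension starts with b, so check the dry-run list before deleting.

```python
from pathlib import Path

# Same patterns as in the rm command above.
PATTERNS = [
    "nondashed.4*",
    "nondashed.aux",
    "nondashed.b*",
    "nondashed.dvi",
    "nondashed.log",
    "nondashed.run.xml",
    "nondashed.tmp",
    "nondashed.xref",
]


def remove_aux_files(workdir, dry_run=True):
    """Return the names of matching files; delete them unless dry_run is set."""
    matched = []
    for pattern in PATTERNS:
        for path in sorted(Path(workdir).glob(pattern)):
            if path.is_file():
                if not dry_run:
                    path.unlink()
                matched.append(path.name)
    return matched


# Hypothetical usage:
# print(remove_aux_files("."))          # only list the files
# remove_aux_files(".", dry_run=False)  # actually delete them
```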

Then re-run make4ht. If the error persists, look at the bibliography entry itself. Most likely it contains a TeX command that is not recognized. Check the following:

Things to check 1: unrecognized commands

Are there commands like \EOAemph? They need to be replaced by normal LaTeX commands:

EOATeX Code    LaTeX Code
\EOAemph       \emph
\EOAup         \textsuperscript
\EOAdown       \textsubscript
\EOAbold       \textbf
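If a bibliography contains many of these commands, the replacement can be scripted. The following Python sketch applies exactly the four mappings listed above; the function name replace_eoa_commands is hypothetical, and other \EOA... commands are deliberately left alone so that they can be reviewed by hand.

```python
import re

# Mapping taken from the table above (EOATeX command -> LaTeX command).
REPLACEMENTS = {
    "EOAemph": "emph",
    "EOAup": "textsuperscript",
    "EOAdown": "textsubscript",
    "EOAbold": "textbf",
}

# The lookahead keeps longer command names that merely start with one of
# the listed names from being rewritten.
PATTERN = re.compile(r"\\(" + "|".join(REPLACEMENTS) + r")(?![A-Za-z])")


def replace_eoa_commands(text):
    """Replace the listed EOATeX commands by their LaTeX counterparts."""
    return PATTERN.sub(lambda m: "\\" + REPLACEMENTS[m.group(1)], text)
```

Run it on a copy of the bibfile and compare the result with the original before replacing it.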

Things to check 2: diacritics commands

Second: are there commands for diacritics like:

Author = {al-Qurash\={\i}},

They must be replaced by their Unicode counterparts:

Author = {al-Qurashī},
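For simple cases this conversion can be automated. The sketch below is an assumption about how one might do it, not part of the EOA scripts: it handles only six common accent commands applied to a single letter (or \i) in braces, builds the character from the base letter plus a combining accent, and normalizes it to the precomposed Unicode form. Anything it does not recognize is left unchanged and has to be fixed by hand.

```python
import re
import unicodedata

# TeX accent command -> Unicode combining character.
COMBINING = {
    "=": "\u0304",  # macron
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    '"': "\u0308",  # diaeresis
    "~": "\u0303",  # tilde
}

# Matches e.g. \={\i} or \'{e}: accent command, then one braced letter.
ACCENT = re.compile(r"""\\([='`^"~])\{(\\i|[A-Za-z])\}""")


def tex_accents_to_unicode(text):
    """Replace simple TeX accent commands by precomposed Unicode characters."""
    def convert(match):
        base = match.group(2)
        if base == r"\i":  # dotless i takes the accent like a plain i
            base = "i"
        return unicodedata.normalize("NFC", base + COMBINING[match.group(1)])
    return ACCENT.sub(convert, text)
```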

Things to check 3: character erroneously converted to entity

Or third: a character has erroneously been converted to an entity command:

Year = {1996\entity{8211}1997}}

This needs to be replaced by the corresponding Unicode character, not in the bib file but in the conversion script. See commit https://github.molgen.mpg.de/EditionOpenAccess/EOASkripts/commit/2e4b6c030d1a8582ea429333684ec81b09ae31f1#diff-b8b62ff21d36dafc5bdd5d73dec4e4a7R101. The line referenced there contains a list of codepoints that are excluded from the conversion. Add the new codepoint to that list.
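To find out which codepoints need to be added, the converted file can be scanned for entity commands. This Python sketch only reports what it finds and does not touch the conversion script; the function name find_entities is hypothetical, and it assumes that the number inside \entity{...} is a decimal codepoint, as in the example above (8211 is the en dash, U+2013).

```python
import re

ENTITY = re.compile(r"\\entity\{(\d+)\}")


def find_entities(bib_text):
    """Map every decimal codepoint found in \\entity{...} to its character."""
    found = {}
    for match in ENTITY.finditer(bib_text):
        codepoint = int(match.group(1))
        found[codepoint] = chr(codepoint)
    return found


# Hypothetical usage:
# for codepoint, char in find_entities(open("converted.bib").read()).items():
#     print(codepoint, hex(codepoint), repr(char))
```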

Font issues in Docker

When using the Docker pipeline, the PDF output lacks some special characters and small caps.

This is because the version of Times New Roman that is used inside the Docker image is completely outdated (see https://sourceforge.net/projects/mscorefonts2/) and its character set is limited.

The workaround is to create the final PDF outside Docker and do the rest in Docker. Because some scripts need information from the aux files, the PDF run inside Docker cannot simply be dropped. Its PDF output is just not used.

Outside Docker, the TeX Gyre fonts Termes, Cursor and Heros from http://www.gust.org.pl/projects/e-foundry/tex-gyre are needed: Cursor and Heros act as replacements for Courier and Helvetica, and Termes provides the small caps for Times New Roman.
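Whether these fonts are available on the host can be checked before the PDF run. The sketch below is an assumption about one way to do this: it presumes that the fonts are installed as system fonts under the family names TeX Gyre Termes, TeX Gyre Cursor and TeX Gyre Heros, and that the fontconfig tool fc-list is available. The function names are made up for illustration.

```python
import subprocess

REQUIRED = ("TeX Gyre Termes", "TeX Gyre Cursor", "TeX Gyre Heros")


def installed_families():
    """Return the raw family listing from fontconfig (requires fc-list)."""
    done = subprocess.run(
        ["fc-list", ":", "family"],
        capture_output=True, text=True, check=True,
    )
    return done.stdout


def missing_fonts(fc_list_output, required=REQUIRED):
    """Return the required families that do not occur in the listing."""
    families = set()
    for line in fc_list_output.splitlines():
        # One font may be listed under several comma-separated names.
        for name in line.split(","):
            families.add(name.strip())
    return [family for family in required if family not in families]


# Hypothetical usage:
# print(missing_fonts(installed_families()))
```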

So the best way to circumvent this problem is not to use Times New Roman at all, but a font with an open license.