Getting started

This is a rough overview of how to work on publications with the Edition Open Access Publication Platform 1.5.

To work conveniently on the command line, the file eoa.rc, which is found in https://github.molgen.mpg.de/EditionOpenAccess/eoa-utilities, should be set up. As is written at the top of that file:

# Please configure the path, preferably by using an environment variable
# For example, put those two lines into your zshrc.local:
# export rootpath="path to where the dockers run"
# xsource ~/eoa-utilities/eoa.rc

it needs some basic configuration.

Minimal publication

A good starting point is working with a minimal publication: https://gitlab.gwdg.de/EditionOpenAccess/minimal-publication

The main file there, minipub.tex, contains no preamble. Since the conversion pipeline uses two preambles, one for the Tralics run and one for the XeLaTeX run, these are prepended to the main file when needed by the scripts eoatex2imxml and eoatex2pdf, respectively.

The latter script, eoatex2pdf, can also be used outside Docker, in which case it uses the LaTeX installation and fonts found on the system:

python3 $PATH_TO_EOASkripts/src/eoatex2pdf.py --log-level=DEBUG -f minipub.tex .

The dot at the end of the above command is important. The script creates a new directory structure that begins with output/ and ends with pdf, where the new PDF is written.

All changes must still be made in the original TeX files, outside the output directory structure, as the copies in output are overwritten each time the above script is run.
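When the call is scripted rather than typed, the trailing dot is easy to lose. The following Python sketch builds the same argument list as the command shown above. The function names and the example installation path are hypothetical. Only the flags and the final dot are taken from the command itself.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def eoatex2pdf_command(skripts_dir: str, tex_file: str, last_arg: str = ".") -> list[str]:
    """Build the argument list for the eoatex2pdf.py call shown above."""
    script = Path(skripts_dir) / "src" / "eoatex2pdf.py"
    return [
        "python3", str(script),
        "--log-level=DEBUG",
        "-f", tex_file,
        last_arg,  # the trailing dot of the shell command, which must not be omitted
    ]


def run_eoatex2pdf(skripts_dir: str, tex_file: str, cwd: str) -> None:
    """Run the conversion in cwd; raises CalledProcessError if the script fails."""
    subprocess.run(eoatex2pdf_command(skripts_dir, tex_file), cwd=cwd, check=True)


# Hypothetical usage, assuming EOASkripts is checked out at /path/to/EOASkripts:
# run_eoatex2pdf("/path/to/EOASkripts", "minipub.tex", cwd=".")
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.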

Working locally with Docker

The EOASkripts container contains all the scripts for the conversion workflow plus some additional tools and libraries. The scripts can also be run from outside the container, but in some cases there are dependencies on specific versions of scripts. The creation of the HTML bibliography is one example.

process_eoa_latex.py is a script that runs the processing steps in a fixed order: first the PDF is produced, then the conversion to XML takes place, and from the XML the version for the publication platform and the EPUB are created. As the Linux version of the Microsoft fonts is outdated, the final PDF is produced outside the container. The aux files from the LaTeX run are nevertheless needed for the XML versions, mainly for references with page numbers (which do not really make sense there).

The relevant files need to be copied into the Docker input folder, for which a shortcut, $di, is available in eoa.rc. It expands to:

$PATH_TO_EOASkripts/runtime/input

Inside that directory, a new directory should be created, named for example after the series and number of the publication in question, such as proceedings132:

rm -rf $di/proceedings132/* ; cp -r minipub.tex images preambel publication.cfg texfiles $di/proceedings132

That is, old data is deleted and new data is copied over.
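The same delete-then-copy step can be sketched in Python, which is convenient when staging is part of a larger script. The function name is hypothetical, and the behaviour differs slightly from the shell command: the sketch removes the whole target directory, including hidden files, before recreating it.

```python
import shutil
from pathlib import Path


def stage_publication(source_dir, input_root, name, items):
    """Replace input_root/name with fresh copies of items from source_dir."""
    target = Path(input_root) / name
    if target.exists():
        shutil.rmtree(target)  # old data is deleted
    target.mkdir(parents=True)
    for item in items:
        src = Path(source_dir) / item
        if src.is_dir():
            shutil.copytree(src, target / item)  # directories are copied recursively
        else:
            shutil.copy2(src, target / item)  # raises FileNotFoundError if missing
    return target


# Hypothetical usage; the input path stands for what $di expands to:
# stage_publication(".", "/path/to/EOASkripts/runtime/input", "proceedings132",
#                   ["minipub.tex", "images", "preambel", "publication.cfg", "texfiles"])
```

A missing source file raises an exception instead of being skipped silently, so an incomplete staging directory is noticed before the pipeline starts.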

Similarly, there is a shortcut for the output directory, $do:

$PATH_TO_EOASkripts/runtime/output

The script process_eoa_latex.py starts the whole pipeline in Docker. For special cases, customised versions of that file can be created that are tailored to a single publication.

To run the pipeline, the Docker daemon needs to be running. Then, in the EOASkripts directory, the container is started:

./scripts/run.py

The next steps are run interactively inside the container, which is entered with:

./scripts/exec_in_container.py

Once inside, the command:

process_eoa_latex.py

gets the whole thing going:

  • PDF version with LaTeX

  • conversion to intermediate XML (imxml)

  • creation of Django version out of imxml

  • creation of EPUB version out of imxml

If errors occur in the later scripts, the situation can be evaluated, a temporary fix made in the intermediate file, and the subsequent steps run individually:

imxml2django.py "input/proceedings132"
imxml2epub.py "input/proceedings132"

With the fix in place, they should then terminate successfully.
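Re-running the later steps by hand can also be wrapped in a few lines of Python that stop at the first failing step. This is only a sketch and not part of EOASkripts. The helper name run_steps is hypothetical, and the commands in the usage comment are the two calls shown above.

```python
import subprocess


def run_steps(commands, cwd=None):
    """Run each command in order; return the first failing command, or None."""
    for command in commands:
        result = subprocess.run(command, cwd=cwd)
        if result.returncode != 0:
            return command  # stop here so the intermediate file can be inspected
    return None


# Inside the container, the two later steps would be passed as:
# failed = run_steps([
#     ["imxml2django.py", "input/proceedings132"],
#     ["imxml2epub.py", "input/proceedings132"],
# ])
# if failed is not None:
#     print("step failed:", " ".join(failed))
```

Stopping at the first non-zero exit code keeps a failed step from being masked by the output of the steps that follow it.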

Once successful, the output formats will be available in $PATH_TO_EOASkripts/runtime/output:

There will be an EPUB file in the epub directory, along with the raw HTML files that make up the EPUB in epub/OEBPS. The django directory contains the files for the publication platform.
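A quick check that the expected results are present can be scripted. The sketch below assumes that the epub and django directories sit directly inside the directory passed in, and that the EPUB file carries the extension .epub. Neither detail is confirmed here, and the function name is hypothetical.

```python
from pathlib import Path


def summarise_output(output_dir):
    """Report which of the expected output pieces exist below output_dir."""
    base = Path(output_dir)
    return {
        # assumption: the EPUB file ends in .epub
        "epub_files": sorted(p.name for p in (base / "epub").glob("*.epub")),
        "has_oebps": (base / "epub" / "OEBPS").is_dir(),
        "has_django": (base / "django").is_dir(),
    }


# Hypothetical usage:
# print(summarise_output("/path/to/EOASkripts/runtime/output"))
```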

GitLab CI

Through the file .gitlab-ci.yml, the project https://gitlab.gwdg.de/EditionOpenAccess/minimal-publication is set up to use the automated conversion pipeline provided by the GWDG. Whenever changes are pushed to the repository, the conversion pipeline is started and produces the desired output formats.

This service can be enabled for any other project, too. Some settings in the .gitlab-ci.yml need to be adjusted, for example file paths and options to some scripts.

Import into Publication Platform

The output files from either the local workflow or the GitLab CI workflow can then be loaded into the publication platform, which is also available as a Docker container. See https://docs.edition-open-access.de/eoa-1.5/install_docker.html for installation instructions.

Running pimp15 proceedings132 copies the relevant files from the django directory into a directory of the publication platform and starts the ingestion process (the platform container needs to be running for that). The import progress can be monitored on the command line.

NB: the pimp15 script depends on files found in $do. So if files from the GitLab CI are used, they need to be placed there.