9 Algorithm pipelines
Propagation services are supported by a number of automation pipelines.
9.0.1 General principles
- A simple design that favours static files rendered in the browser over complex web services.
- Rely on GitHub Actions for running the workflows; Argo CI could also be used.
- Use git to maintain metadata records rather than a record service.
- Tools like Quarto or Jupyter notebooks can render HTML reports that can be served.
- Object storage can be used to store artifacts.
Components:
- App Git Repo: an ESA project repository, maintained by the project. Mostly relevant if it is used by the algorithm, for instance for UDF code.
- Algorithm Catalog: a list of openEO or OGC AP based algorithms, hosted by APEx. This catalog is a generic metadata record, not the actual standard-specific process.
- Backend Catalog: a list of compliant backends. Allows users to discover the available backends.
- Algorithm runs: a Parquet file containing one row per run of a specific UDP. Allows us to plot statistics.
9.0.2 Quality & compliance test
- Static code analysis
9.0.2.1 Python tools
Python libraries can use pylint to compute a code quality score. An open question is how APEx can show an overview of related GitHub repositories together with their pylint scores, and potentially some other metrics. The simplest solution is probably to run pylint ourselves. APEx compliance guidelines can be used to ensure a proper project organization.
A UDP can link to its source git repository, which APEx can then harvest to build this overview.
The test results can be rendered either by a custom JavaScript-based web app or by a Quarto dashboard.
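As a rough illustration of how such an overview could be assembled, the sketch below extracts the score from pylint's text report, which normally ends with a line of the form `Your code has been rated at 8.50/10`. The function names `parse_pylint_score` and `score_overview` and the repository names are hypothetical; how the reports are collected (for example, by running pylint in a workflow per repository) is left open here.

```python
import re
from typing import Optional

# Pylint's text report normally ends with a line such as:
#   "Your code has been rated at 8.50/10 (previous run: 8.00/10, +0.50)"
_RATING = re.compile(r"rated at (-?\d+(?:\.\d+)?)/10")


def parse_pylint_score(report: str) -> Optional[float]:
    """Return the score found in a pylint text report, or None if absent."""
    match = _RATING.search(report)
    return float(match.group(1)) if match else None


def score_overview(reports: dict[str, str]) -> list[tuple[str, Optional[float]]]:
    """Pair each repository with its score, lowest score first.

    Repositories without a parsable score are listed last.
    `reports` maps a (hypothetical) repository name to its pylint output.
    """
    scored = [(repo, parse_pylint_score(text)) for repo, text in reports.items()]
    return sorted(
        scored,
        key=lambda item: (item[1] is None, item[1] if item[1] is not None else 0.0),
    )
```

The resulting list could be written to a static JSON or CSV file, in line with the static-file approach, and rendered by whichever dashboard is chosen.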
9.0.3 Integration test
- End-to-end test on backends that support the algorithm.
- Runs weekly or upon changes.
- Compares against reference output.
- Records performance metrics.
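The steps above can be sketched as a small harness. This is only a minimal illustration under stated assumptions: the algorithm is represented by a plain callable returning a sequence of numbers, the comparison uses an absolute tolerance, and the only metric recorded is wall-clock duration. The names `IntegrationResult` and `run_integration_test` are hypothetical; a real test would submit a job to a backend and compare raster or vector outputs.

```python
import math
import time
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class IntegrationResult:
    passed: bool          # True if the output matches the reference within tolerance
    max_abs_diff: float   # largest absolute deviation from the reference
    duration_s: float     # wall-clock run time in seconds


def run_integration_test(
    algorithm: Callable[[], Sequence[float]],
    reference: Sequence[float],
    abs_tol: float = 1e-6,
) -> IntegrationResult:
    """Run the algorithm, time it, and compare its output to a reference."""
    start = time.perf_counter()
    output = algorithm()
    duration = time.perf_counter() - start

    # Outputs of different length can never match the reference.
    if len(output) != len(reference):
        return IntegrationResult(False, math.inf, duration)

    max_diff = max((abs(a - b) for a, b in zip(output, reference)), default=0.0)
    return IntegrationResult(max_diff <= abs_tol, max_diff, duration)
```

Each `IntegrationResult` could be appended as one row to the run records, so that pass rates and durations can be tracked over time.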
9.0.4 Benchmark
- Performs specific runs to build sufficient statistics for computing a cost distribution.
- Computes cost per km² using a standardized formula such as $\mu + 2\sigma$ (the mean plus two standard deviations of the cost distribution).
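A minimal sketch of the formula above, assuming each benchmark run yields a total cost and a processed area in km². Reading the mean and sigma as the sample mean and sample standard deviation of the per-km² costs is an assumption, as are the function names `cost_per_km2` and `conservative_cost`; the cost unit (for example, platform credits) is not specified here.

```python
import statistics
from typing import Sequence


def cost_per_km2(costs: Sequence[float], areas_km2: Sequence[float]) -> list[float]:
    """Normalise the total cost of each run by the area it processed."""
    if len(costs) != len(areas_km2):
        raise ValueError("costs and areas_km2 must have the same length")
    return [cost / area for cost, area in zip(costs, areas_km2)]


def conservative_cost(unit_costs: Sequence[float], k: float = 2.0) -> float:
    """Return mean + k * standard deviation of the per-km² costs.

    Uses the sample standard deviation (an assumption), which requires
    at least two runs.
    """
    if len(unit_costs) < 2:
        raise ValueError("at least two runs are needed to estimate sigma")
    mu = statistics.mean(unit_costs)
    sigma = statistics.stdev(unit_costs)
    return mu + k * sigma


# Three hypothetical runs over 10 km² each.
unit_costs = cost_per_km2([10.0, 20.0, 30.0], [10.0, 10.0, 10.0])
estimate = conservative_cost(unit_costs)
```

Adding two standard deviations to the mean gives an estimate that most runs should stay below, which is usually more useful for budgeting than the mean alone.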
9.0.5 Release pipeline
- Releases a new version of an algorithm or software.
- Tracks changes in a changelog, linking to the issue tracker?
- Publishes/Deploys artifacts
9.0.6 Pipeline tools
Choose from:
- Argo CI
- Jenkins (not foreseen in APEx)
- NiFi (not foreseen in APEx)
- dvc: https://mlops-guide.github.io/Versionamento/pipelines_dvc/