10 UDP based upscaling service
An upscaling ‘service’ can be an openEO/OAPIP process that performs the upscaling.
10.1 User stories
As a user:
I want to manage large jobs, up to continental or even global scale.
I want to see intermediate progress and status of jobs.
When a job fails, I want to be able to restart it.
When a job finishes, I want the output data & metadata to be persisted.
10.2 Design
The upscaling service will be built on top of other APEx components. The architecture allows it to work independently of external processing platforms.
The initial proposal outlines a design that minimizes the dependency on web services, and instead favours the use of static files that can be served over http. The result is something that can run equally well locally or as a service.
Being lightweight makes it fairly easy to deploy in different environments. For certain projects and organizations, this may be favourable if they want to avoid the use of ‘as a service’ solutions. Similarly, it will be very easy to adapt the script to specific project needs.
The dynamic part of the service is an evolution of the Python-based ‘MultiBackendJobManager’ script. This script is responsible for running up to X jobs in parallel (per backend) and tracking the status of all jobs. It currently supports openEO, but can equally well support OGC API Processes.
The evolution would be built to work with UDPs or OGC processes, which allows using the schema definitions to easily create jobs for various combinations of parameters. This significantly reduces the complexity of the scripting effort required from the user, and could even result in a tool that is purely configuration-driven.
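A minimal sketch of this approach, assuming the openeo Python client's MultiBackendJobManager; the backend URL, the UDP id (my_upscaling_udp), the tile grid file and its columns are illustrative placeholders, and the exact job database argument may differ between client versions:

```python
# Sketch: driving a UDP with MultiBackendJobManager over a tile grid.
import geopandas as gpd
import openeo
from openeo.extra.job_management import MultiBackendJobManager

# One row per spatial tile / parameter combination to process (placeholder file).
tiles = gpd.read_file("processing_grid.geojson")

manager = MultiBackendJobManager()
manager.add_backend(
    "cdse",
    connection=openeo.connect("openeo.dataspace.copernicus.eu").authenticate_oidc(),
    parallel_jobs=4,  # cap on simultaneously running jobs for this backend
)

def start_job(row, connection, **kwargs):
    # Instantiate the UDP for a single tile; the parameter names follow the
    # UDP/OGC process schema, so little custom scripting is needed here.
    west, south, east, north = row.geometry.bounds
    return connection.datacube_from_process(
        "my_upscaling_udp",  # hypothetical UDP id
        spatial_extent={"west": west, "south": south, "east": east, "north": north},
        year=2024,
    ).create_job(title=f"tile {row['tile_id']}")

# Job statuses are tracked in a job database file on disk (CSV here;
# a Parquet-backed database is also possible depending on the client version).
manager.run_jobs(df=tiles, start_job=start_job, output_file="jobs.csv")
```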
The script would be started by the user in the APEx JupyterLab environment. We propose to predefine an ‘application’ (much like the instantiation services) that is configured to include all dependencies. It is important that the pod is not automatically cleaned up, but continues to run when the user logs out.
Jupyter also features a realtime collaboration extension, which could be used to allow multiple users to monitor the same job.
The persistent part of the service is a GeoParquet file that contains the status of all jobs. This file can be served over http, and is only updated by a single script. The use of a cloud native format rather than a database is what keeps the upscaling service very lightweight, and easily deployable in many environments.
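A minimal sketch of this persistence layer, assuming geopandas for GeoParquet I/O; the column names and file path are illustrative, not a fixed schema:

```python
# Sketch: one script owns the GeoParquet job database, everything else only
# reads it (e.g. served as a static file over http for the dashboard).
import geopandas as gpd

def save_job_db(jobs: gpd.GeoDataFrame, path: str = "jobs.parquet") -> None:
    # GeoParquet keeps the tile geometry together with the job bookkeeping,
    # so the same file can drive both the job manager and a map dashboard.
    jobs.to_parquet(path)

def load_job_db(path: str = "jobs.parquet") -> gpd.GeoDataFrame:
    return gpd.read_parquet(path)

# Example update by the single writer:
jobs = load_job_db()
jobs.loc[jobs["job_id"] == "j-1234", "status"] = "finished"
save_job_db(jobs)
```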
For visualization and dashboards, we propose to build upon JavaScript frameworks that can natively handle and visualize GeoParquet, in combination with openEO JavaScript components. This would make it easy to build a dashboard showing a map and some processing statistics, while also allowing, for instance, fetching the details of a specific job from openEO/OAPIP. The result should be a serverless dashboard! Advanced users can build their own dashboards in a very similar manner.
A mechanism to externally expose the dashboard would be useful to allow users to monitor the progress visually.
10.2.0.1 Dashboard prototyping
The concept of a cloud native job dashboard was prototyped using ObservableHQ. The input was a 5.7MB Parquet file containing job information for a continental scale run (EU27). Such a file is fairly fast to load, even over the internet, and thanks to the columnar format, the dashboard can restrict loading to only the required columns, which further reduces the initial loading time.
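The same column pruning can be illustrated in Python with pyarrow; the file path and column names are illustrative:

```python
# Read only the columns a view needs from the job database; with Parquet's
# columnar layout the other columns are not decoded at all.
import pyarrow.parquet as pq

table = pq.read_table("jobs.parquet", columns=["tile_id", "status"])
print(table.to_pandas()["status"].value_counts())
```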
A deck.gl map can easily visualize all 46443 tile boundaries in the browser, allowing the use of color coding to get a view on finished, failed and running jobs. A caveat in this design may be that the Parquet file is only updated in the background, by the upscaling service, and the browser would only see changes upon refresh. A more advanced design could also consider fetching changes from the openEO backend, but it is not trivial to get current statuses for 40k+ jobs in a fast manner. Hence reloading the full data from Parquet might be a better option.
When we want to extend the dashboard with capabilities to manage openEO jobs, we may need to heavily customize components. We could however envision creating links to a new window that allows controlling an individual job. In a similar manner, we may want to consider how to allow bulk updates of multiple jobs.
10.2.1 Specific considerations
10.2.2 Authentication constraint
OIDC tokens expire, but the service needs to run for multiple days. Proper accounting of these long-running jobs depends on having a specific user identity across the federation.
10.2.2.1 Alternative 1: refresh tokens
If the service is running in the APEx application hub (Jupyter), we could request the user to perform a login, and then use refresh tokens. These refresh tokens may themselves expire after some time.
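A minimal sketch, assuming the openeo Python client's OIDC support; the backend URL is only an example:

```python
# Sketch: interactive login once, then rely on the cached refresh token for
# subsequent (re)connections of the long-running script. Note that refresh
# tokens themselves expire, so very long runs may still require a re-login.
import openeo

connection = openeo.connect("openeo.dataspace.copernicus.eu")
# Tries a cached refresh token first and only falls back to an interactive
# device-code flow when no valid token is available.
connection.authenticate_oidc()
```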
10.2.2.2 Alternative 2: storing client credentials
If the service is part of APEx, we can allow storing client credentials in Vault. In this case, it is important that these credentials are only stored in a trusted service (APEx), and do not have to be shared with each processing platform.
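A minimal sketch, assuming hvac for Vault access and the openeo client's client-credentials flow; the Vault URL, secret path and key names are illustrative placeholders:

```python
# Sketch: fetch an OIDC client secret from Vault and use the client-credentials
# flow, so no interactive login is needed and the secret never has to be
# shared with the individual processing platforms.
import os
import hvac
import openeo

vault = hvac.Client(url="https://vault.apex.example", token=os.environ["VAULT_TOKEN"])
secret = vault.secrets.kv.v2.read_secret_version(path="openeo/upscaling")
creds = secret["data"]["data"]  # {"client_id": ..., "client_secret": ...}

connection = openeo.connect("openeo.dataspace.copernicus.eu")
connection.authenticate_oidc_client_credentials(
    client_id=creds["client_id"],
    client_secret=creds["client_secret"],
)
```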
10.2.3 Job database persistence
The job database is stored as GeoParquet. This will be stored in the user workspace, to ensure persistence beyond the lifetime of the upscaling job.
For global processing, an evolution to a hosted database might prove necessary. A managed OpenSearch instance could be a good way to track the jobs.
10.2.4 Data export
Large scale processing jobs produce a lot of data. Instead of downloading all data, it is often more convenient to request the processing backend to immediately write data to a cloud storage bucket. The metadata can also be written there, or even ingested into a STAC catalog.
Doing so would also allow visualizing intermediate results as they are produced.
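A minimal sketch of keeping the result metadata next to the data, assuming the openeo Python client and boto3; the bucket, S3 endpoint and job id are placeholders, and direct asset export to a bucket is backend-specific (e.g. via job options) and not shown:

```python
# Sketch: instead of downloading all assets, store the STAC-style result
# metadata of a finished job in the bucket, where it can later be ingested
# into a STAC catalog and used to visualize intermediate results.
import json
import boto3
import openeo

connection = openeo.connect("openeo.dataspace.copernicus.eu").authenticate_oidc()
job = connection.job("j-1234abcd")  # placeholder job id

# Metadata describing the job results; the assets themselves stay on the
# backend or in the bucket the backend wrote them to.
result_metadata = job.get_results().get_metadata()

s3 = boto3.client("s3", endpoint_url="https://s3.example.org")
s3.put_object(
    Bucket="upscaling-results",
    Key=f"metadata/{job.job_id}.json",
    Body=json.dumps(result_metadata).encode("utf-8"),
)
```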
10.2.5 Job queueing
An important variable for the upscaling service is to know how many jobs can be queued at a given backend at once. Scheduling multiple jobs can be a way to make sure that backend capacity is optimally used. However, when scheduling too many jobs, the backend might get flooded, and the jobs may take weeks to finish.
A potential feature could be for the job manager to keep on submitting jobs until there are at least X jobs in the ‘queued’ state.
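A minimal sketch of this policy as a polling loop on top of the job database; the column names and helper functions are illustrative, and MultiBackendJobManager already implements a comparable loop with a fixed per-backend cap on running jobs:

```python
# Sketch: keep submitting new jobs until at least TARGET_QUEUED jobs are in
# the "queued" state, then wait and re-check.
import time
import geopandas as gpd

TARGET_QUEUED = 20
POLL_INTERVAL = 60  # seconds

def refresh_statuses(jobs: gpd.GeoDataFrame) -> gpd.GeoDataFrame:
    """Ask the backend(s) for the current status of all tracked jobs."""
    ...  # placeholder
    return jobs

def start_tile_job(row) -> str:
    """Create and start one job for this tile; return its job id."""
    ...  # placeholder
    return "j-..."

jobs = gpd.read_parquet("jobs.parquet")
while not jobs["status"].isin(["finished", "error"]).all():
    jobs = refresh_statuses(jobs)
    queued = int((jobs["status"] == "queued").sum())
    for idx, row in jobs[jobs["status"] == "not_started"].iterrows():
        if queued >= TARGET_QUEUED:
            break
        jobs.loc[idx, ["job_id", "status"]] = [start_tile_job(row), "queued"]
        queued += 1
    jobs.to_parquet("jobs.parquet")  # persist progress for the dashboard
    time.sleep(POLL_INTERVAL)
```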
10.2.5.1 Job queueing service
A microservice that runs as part of a backend, with the ability to queue processing jobs. This service would allow backends to more easily handle the queueing of larger numbers of jobs, and is orthogonal to the general problem of the upscaling service.
10.2.5.2 Argo based
What if we could simply write an Argo workflow that corresponds to the large number of process invocations?
10.2.5.3 NiFi based
NiFi can talk to both YARN and K8S clusters, and is capable of handling large numbers of tasks.
It can run the job tracker and, as part of that process, submit already started jobs to the cluster.
10.2.6 Updating running jobs
How to patch a running job? Use case: updating a vector cube?