4 Interactive Development Environments

4.1 Overview

The Interactive Development Environment (IDE) within the APEx Instantiation Services runs Code Server (VS Code in the Cloud) on the User Workspace described in the previous section. This setup takes the capabilities of Microsoft’s VS Code editor and makes them available on a remote server. Developers keep the familiar environment and feature set of VS Code while using server-side computing resources. This is particularly useful for resource-intensive tasks and for anyone who needs a consistent development environment across locations and devices.

Because the IDE runs on a server, developers are not constrained by their local machine’s hardware and can draw on the computational power of remote servers. Alongside the core editing functionality of its desktop counterpart, the IDE offers features suited to remote development, such as integrated Git support, debugging tools, and extensions from the VS Code Marketplace. It also works with containerized environments, so developers can create, test, and deploy applications in isolated, reproducible environments that behave consistently across development, staging, and production.

The environment is tailored to EO tasks and provides tools and libraries for programming languages including Python, R, and Java. Key libraries such as SNAP and GDAL are integrated, supporting EO data discovery, access, processing, and analysis.

4.2 Showcase Scenarios

The Interactive Development Environment supports a variety of use cases, making it an essential tool for developers, researchers, and data scientists within the EO community. Some typical scenarios include:

Algorithm Development and Testing: Researchers and developers can write, test, and debug new algorithms for processing satellite imagery or other EO data. For instance, a user might develop a script to detect deforestation using multi-temporal satellite images.
Collaborative Projects: Teams can work together on projects, sharing code and resources in real time. A group of data scientists might jointly develop a machine-learning model to predict crop yields based on various data inputs.
Data Processing Pipelines: Users can develop and test data processing pipelines that automate the ingestion, processing, and analysis of large EO datasets. For example, a user might build a pipeline that preprocesses satellite images and extracts relevant features for further analysis.
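
As an illustration of the deforestation and pipeline scenarios above, the following is a minimal sketch of a two-step workflow: compute a vegetation index (NDVI) for each acquisition date, then flag pixels where it dropped sharply. The function names, the 0.2 threshold, and the toy pixel values are assumptions made for illustration. A real workflow would read rasters with a library such as GDAL rather than use Python lists.

```python
from typing import List

Band = List[float]


def ndvi(red: Band, nir: Band) -> Band:
    """Normalised Difference Vegetation Index per pixel: (NIR - Red) / (NIR + Red)."""
    result = []
    for r, n in zip(red, nir):
        total = n + r
        # Avoid division by zero for empty (no-signal) pixels.
        result.append((n - r) / total if total else 0.0)
    return result


def vegetation_loss(before: Band, after: Band, threshold: float = 0.2) -> List[bool]:
    """Flag pixels whose NDVI dropped by more than `threshold` between two dates."""
    return [(b - a) > threshold for b, a in zip(before, after)]


# Toy two-pixel scene at two dates (illustrative reflectance values only).
ndvi_t0 = ndvi(red=[0.1, 0.1], nir=[0.5, 0.5])
ndvi_t1 = ndvi(red=[0.1, 0.3], nir=[0.5, 0.3])
print(vegetation_loss(ndvi_t0, ndvi_t1))  # [False, True]
```

Keeping each step as a small function makes it straightforward to test the steps separately before chaining them into a larger pipeline.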

4.3 User Stories

4.4 Business Model

4.5 Technical Architecture

The IDE within the APEx Instantiation Services is built on the User Workspace architecture, which uses Kubernetes and JupyterHub for orchestration and management. VS Code Server serves as the core development environment, providing a flexible platform for coding and debugging. This setup extends the capabilities of Microsoft’s VS Code editor to a remote server environment, allowing developers to use server-side computational power while keeping a familiar interface.

Kubernetes manages the deployment, scaling, and operation of containerized applications within the IDE. Through autoscaling, self-healing, and load distribution, it provides a stable and efficient environment for development tasks. JupyterHub orchestrates the creation and management of user-specific development environments, which integrates the IDE with the broader APEx services.

VS Code Server delivers the code editing and debugging interface. Because it runs server-side, developers keep the familiar environment and feature set of VS Code while using remote computational resources instead of being limited by their local machine’s hardware.

Data management is a key aspect of the IDE’s architecture. The IDE integrates with various data sources, including external databases and APEx’s Product Catalogue, so that users can access and use them within their development workflows. Data generated or used within the IDE is stored on Kubernetes PersistentVolumeClaims (PVCs), so it persists independently of the running development session and can be retrieved as needed.
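
To make the PVC mechanism concrete, the sketch below builds a minimal PersistentVolumeClaim manifest as a Python dictionary and prints it as JSON. The claim name, size, and access mode are hypothetical values chosen for illustration; the actual storage configuration used by APEx is not specified here.

```python
import json
from typing import Any, Dict


def build_pvc(name: str, size: str, access_mode: str = "ReadWriteOnce") -> Dict[str, Any]:
    """Return a Kubernetes PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteOnce: the volume is mounted read-write by a single node.
            "accessModes": [access_mode],
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical claim name and size, not APEx defaults.
manifest = build_pvc("ide-workspace-demo", "10Gi")
print(json.dumps(manifest, indent=2))
```

Kubernetes accepts manifests in JSON as well as YAML, so the printed output could be applied to a cluster as it stands.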

Application portability is central to the IDE and is achieved through containerization. Developers can package applications with all essential configurations and dependencies, ensuring consistent behaviour across diverse deployment environments. The IDE follows the Best Practice for Earth Observation Application Package as defined by the Open Geospatial Consortium (OGC 20-089) and the EO Exploitation Platform Common Architecture (EOEPCA) spearheaded by the European Space Agency (ESA). This best practice helps developers adapt and package their existing algorithms so that they are reproducible, deployable, and executable on different platforms.

An EO Application within the IDE is treated as a command-line interface (CLI) tool that runs as a non-interactive executable program. It receives input arguments, performs a computation, and terminates after producing some output. These applications can be written in languages such as Python, Java, C++, and C#, or as shell scripts, and use software libraries such as SNAP, GDAL, and Orfeo Toolbox. Developers build container images that encapsulate their applications and command-line tools, along with the necessary runtime environments, and publish these images to container registries for access and deployment.
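
The CLI pattern described above can be sketched as follows: a non-interactive program that takes its inputs as arguments, computes a result, writes an output file, and exits with a status code. The tool itself (simple band statistics), its `--input` and `--output` options, and the text input format are hypothetical and only illustrate the shape of such an application.

```python
import argparse
import json
import sys
from pathlib import Path
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Toy EO application: band statistics")
    parser.add_argument("--input", required=True, type=Path,
                        help="text file with whitespace-separated pixel values")
    parser.add_argument("--output", required=True, type=Path,
                        help="JSON file to write the statistics to")
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    """Run once, non-interactively; return a process exit code."""
    args = build_parser().parse_args(argv)
    values = [float(token) for token in args.input.read_text().split()]
    if not values:
        print("no pixel values found", file=sys.stderr)
        return 1
    stats = {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }
    args.output.write_text(json.dumps(stats, indent=2))
    return 0
```

When `main()` is called without arguments, `argparse` reads the options from `sys.argv`, so a script entry point only needs to call `sys.exit(main())`. Passing an explicit argument list, as in `main(["--input", "pixels.txt", "--output", "stats.json"])`, keeps the function easy to test.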

The IDE supports the Common Workflow Language (CWL), allowing developers to describe and share application workflows in a recognized format. A CWL document describes the data processing application, including its parameters, software items, executables, dependencies, and metadata. This standardisation improves collaboration and clarity, and makes applications reproducible and portable across execution scenarios, including local computers, cloud resources, high-performance computing (HPC) environments, Kubernetes clusters, and services deployed through an OGC API - Processes interface.
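
As a rough illustration of what a CWL description contains, the sketch below assembles a single CWL v1.2 `CommandLineTool` as a Python dictionary and serialises it to JSON, which CWL accepts because JSON is a subset of YAML. The container image, the `band-stats` command, and the input and output names are hypothetical. A complete Application Package under OGC 20-089 involves more than this one tool description (for example a workflow entry point and metadata), which the sketch leaves out.

```python
import json
from typing import Any, Dict


def build_command_line_tool(image: str, base_command: str) -> Dict[str, Any]:
    """Return a CWL v1.2 CommandLineTool description as a plain dict."""
    return {
        "cwlVersion": "v1.2",
        "class": "CommandLineTool",
        "baseCommand": base_command,
        # Run the tool inside the given container image.
        "requirements": {"DockerRequirement": {"dockerPull": image}},
        "inputs": {
            "input_file": {"type": "File", "inputBinding": {"prefix": "--input"}},
            "output_name": {
                "type": "string",
                "default": "stats.json",
                "inputBinding": {"prefix": "--output"},
            },
        },
        "outputs": {
            # Collect the file the tool wrote under the requested name.
            "stats": {
                "type": "File",
                "outputBinding": {"glob": "$(inputs.output_name)"},
            },
        },
    }


# Hypothetical image reference and command name.
tool = build_command_line_tool("registry.example.org/eo/band-stats:0.1.0", "band-stats")
print(json.dumps(tool, indent=2))
```

Generating the description from code is only one option; in practice CWL documents are usually written by hand in YAML and kept under version control next to the application.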

Version control and continuous integration are integral components of the IDE’s technical architecture. The IDE provides access to version control platforms (e.g. GitLab, GitHub) for code repository management, collaboration, and tracking of code changes. Automated continuous integration (CI) tools run the build, test, and deployment tasks in response to code modifications, helping to keep applications in a deployable state. This automation reduces manual testing overhead and accelerates the rollout of new features and updates.

Security and compliance are prioritised within the IDE. User authentication and authorization are managed through JupyterHub, ensuring that only authorised users can access the IDE. User data is stored in isolated environments, adhering to data privacy regulations and standards. Configurations and environment settings are managed securely to prevent unauthorised access, ensuring that the development environment remains secure and compliant.

4.6 Operational Management

The deployment and scaling of the IDE within the APEx Instantiation Services are efficiently managed through Kubernetes. Kubernetes ensures that resources are allocated and scaled according to user demand, providing a flexible and resilient infrastructure.

Monitoring and maintenance are critical aspects of operational management. Continuous monitoring of the IDE environment is performed using tools such as Prometheus and Grafana. These tools provide real-time insights into system metrics, allowing administrators to track performance, detect anomalies, and address potential issues before they impact users. This proactive approach ensures optimal performance and high availability of the IDE.
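
To illustrate how such metrics might be consumed, the sketch below builds a URL for the Prometheus instant-query endpoint and extracts the pods whose value exceeds a threshold from a query response. The server address, the pod names, and the 2 GB threshold are hypothetical, and the sample response is hand-written in the documented response format rather than taken from a real cluster.

```python
from typing import Any, Dict, List
from urllib.parse import urlencode


def instant_query_url(base_url: str, expr: str) -> str:
    """Build the URL for the Prometheus instant-query endpoint (/api/v1/query)."""
    query = urlencode({"query": expr})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def pods_over_threshold(response: Dict[str, Any], threshold: float) -> List[str]:
    """Return the pod label of every series whose sample value exceeds `threshold`."""
    if response.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    flagged = []
    for series in response["data"]["result"]:
        _timestamp, value = series["value"]  # the value is encoded as a string
        if float(value) > threshold:
            flagged.append(series["metric"].get("pod", "<unknown>"))
    return flagged


# Hand-written sample in the instant-vector response format (memory in bytes).
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"pod": "ide-user-a"}, "value": [1700000000, "1200000000"]},
            {"metric": {"pod": "ide-user-b"}, "value": [1700000000, "3500000000"]},
        ],
    },
}

print(instant_query_url("http://prometheus.example.org", "container_memory_working_set_bytes"))
print(pods_over_threshold(sample, threshold=2e9))  # ['ide-user-b']
```

In a deployed setup this kind of check is more commonly expressed as a Prometheus alerting rule or a Grafana panel; the script form is shown only to make the data flow explicit.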

Maintenance tasks, including updates and backups, are automated to minimise downtime and ensure data integrity. Regular updates ensure that the IDE remains secure and incorporates the latest features and improvements. Automated backups protect user data and configurations, allowing for quick recovery in the event of a failure.

Security audits and compliance checks are conducted regularly to maintain a secure and compliant environment. These audits ensure that the IDE adheres to data privacy regulations and security standards, protecting user data and maintaining trust. Access control mechanisms are reviewed and updated as necessary to ensure that only authorised users can access the development environment.