General Description

Environments for Functions enable users to upload custom Python environments. These can contain Python packages, which can then be used in Functions. ONE DATA differentiates between System Environments and User Environments.

System Environments
System Environments are provided together with ONE DATA and cannot be uploaded, updated, or deleted by users.

User Environments

User Environments can be uploaded by users of a domain and enable them to use Environments containing additional Python packages needed for specific use cases.

Benefits of User Environments

  • Reliably working use cases, as every project can have its own isolated set of Python packages and full control over them.
  • More freedom and flexibility when creating Python scripts.
  • Independence from others, as packages can be installed without the involvement of DevOps.
  • Faster execution, as Workflows can be replaced by Functions, which reduces Spark overhead.


Necessary Rights

Different rights and roles are necessary to be able to create and use Environments.

Upload a User Environment

Every user with access to a specific domain can upload Environments to that domain. 

Update a User Environment

The uploader of the User Environment, a Domain Admin or a Super Admin.

Delete a User Environment

The uploader of the User Environment, a Domain Admin or a Super Admin.

Change Environment of a Function

Everyone who has Write rights to the Function.


Where to Find the Functionality

There is no overview of all Environments. Instead, they are managed through the Functions. Management (upload, update & delete) can be done in two places:


Functions overview page

Function detail view


Create a User Environment

Two variants for creating a User Environment are available. One requires access to https://harbor.onedata.de, which is only accessible to internal users; the other does not. Both have specific prerequisites. At the end of either variant, you will have a tar file to upload to ONE DATA. Various CLI commands are used during these processes; they are listed in the section "Used CLI Commands".

Prerequisites

For Every User

You need the Docker CLI to be installed and running. Documentation on how to set it up for different operating systems can be found here: https://docs.docker.com/get-docker/.

For Internal Users

For interactions with harbor.onedata.de the following is necessary:

  • A valid account for harbor.onedata.de (Can be requested in the #devops-support Slack channel).
  • Being logged in to harbor.onedata.de (CLI command "docker login https://harbor.onedata.de -u username -p password")
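Passing the password with "-p" leaves it in the shell history, and Docker itself prints a warning about it. As a minimal sketch, the login can also be scripted so that the password is prompted for and handed to Docker through its "--password-stdin" option. The helper names login_command and docker_login are illustrative and not part of ONE DATA.

```python
import getpass
import subprocess

REGISTRY = "https://harbor.onedata.de"


def login_command(username, registry=REGISTRY):
    """Build the docker login call; the password is supplied on stdin."""
    return ["docker", "login", "-u", username, "--password-stdin", registry]


def docker_login(username, registry=REGISTRY):
    """Prompt for the password (not echoed) and pipe it to docker login."""
    password = getpass.getpass("Password for %s: " % registry)
    subprocess.run(
        login_command(username, registry),
        input=password.encode(),
        check=True,  # raise CalledProcessError if the login is rejected
    )
```

Calling docker_login("username") then has the same effect as the command above, without the password appearing on the command line.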


With Access to harbor.onedata.de

1. Acquire a base Environment

First, you need a base Environment, which you can base your new one on. There are two ways to acquire the base image.


Option 1: You have an account on harbor.onedata.de and are logged in from your Docker CLI. Referenced images are automatically acquired (pulled) from harbor and require no further effort. You can continue with step 2.


Option 2: You can acquire the base image here (for ONE LOGIC internals), or through your contact at ONE LOGIC. Afterwards, additional steps are required before you can continue with step 2:

  • Load the base image from the tar archive to your local registry: "docker load -i base_1.2.8.tar"
  • Validate this using the command "docker images". This should print a line of this kind:
    <repo/name> <tag> <imageid> <created> <size>
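The validation can also be scripted. The following is a small sketch that asks "docker images" for a machine-readable list (using its "--format" option) and checks whether a given <repo>:<tag> is present. The function names are illustrative, and the reference you pass in must be the one printed for your loaded base image.

```python
import subprocess


def parse_image_list(output):
    """Turn one "<repo>:<tag>" entry per line into a set, skipping blank lines."""
    return {line.strip() for line in output.splitlines() if line.strip()}


def image_is_available(reference):
    """Return True if <repo>:<tag> shows up in the local image list."""
    result = subprocess.run(
        ["docker", "images", "--format", "{{.Repository}}:{{.Tag}}"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return reference in parse_image_list(result.stdout)
```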

We will use the base image as a parent image and combine it with the Conda image to build your new environment. 


2. Copy this Dockerfile for the Conda image to a local Dockerfile on your system

# If you use the base image from harbor:
FROM harbor.onedata.de/private/openfaas/runtime/base:1.2.8
# If you use a local base image:
# FROM <repo>:<tag>
 
# --------------------------------------------------------
# Install conda
# --------------------------------------------------------
ARG CONDA_VER=4.7.12.1
 
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
ENV PATH /opt/conda/bin:$PATH
 
RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-${CONDA_VER}-Linux-x86_64.sh -O ~/anaconda.sh && \
    /bin/bash ~/anaconda.sh -b -p /opt/conda && \
    rm ~/anaconda.sh && \
    ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh && \
    echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc && \
    echo "conda activate base" >> ~/.bashrc && \
    find /opt/conda/ -follow -type f -name '*.a' -delete && \
    find /opt/conda/ -follow -type f -name '*.js.map' -delete && \
    /opt/conda/bin/conda clean -afy
 
# --------------------------------------------------------
# Install mandatory python dependencies (DO NOT REMOVE!)
# --------------------------------------------------------
RUN pip install \
  flask==1.1.2 \
  waitress==1.4.4
 
# --------------------------------------------------------
# Install use case specific packages (Add here like above)
# --------------------------------------------------------
RUN pip install \
  requests==2.22.0 # optional, but required to perform OD API requests
	

3. Add needed packages in the "Install use case specific packages" section of the Dockerfile

The already listed packages flask, waitress, and requests are required for Function execution and must not be removed. Other parts of this file must not be changed either, except for the Miniconda version (if you want a newer or older version of Python). Available Miniconda versions can be found at https://repo.anaconda.com/miniconda/

We recommend using pip as the package manager for installation, as it creates lightweight Environments, which keeps Function creation time as short as possible.

If your Environment turns out not to be usable for some reason, you can also try installing the use case specific packages with Conda as an alternative. However, Conda creates larger Environments, which makes Function creation take longer.
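Before building, it can help to check that an edited Dockerfile still pins the mandatory packages. The sketch below is one way to do that; it only understands "RUN pip install" instructions with name==version pins, as used in the Dockerfile above, and the set of names in MANDATORY is an assumption taken from the "DO NOT REMOVE" section (add requests if your Functions call the OD API).

```python
import re

# Assumption: the packages from the "mandatory python dependencies" section.
MANDATORY = {"flask", "waitress"}


def pinned_packages(dockerfile_text):
    """Collect name -> version for every name==version pin in RUN pip install lines."""
    # Join backslash-continued lines so each RUN instruction sits on one line.
    joined = re.sub(r"\\[ \t]*\n", " ", dockerfile_text)
    pins = {}
    for line in joined.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line.startswith("RUN pip install"):
            continue
        for name, version in re.findall(r"([A-Za-z0-9_.\-]+)==(\S+)", line):
            pins[name.lower()] = version
    return pins


def missing_mandatory(dockerfile_text):
    """Return the mandatory packages that are no longer pinned, sorted by name."""
    return sorted(MANDATORY - set(pinned_packages(dockerfile_text)))
```

An empty list from missing_mandatory(open("Dockerfile").read()) means the mandatory section is still intact.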

4. [Optional] Installing custom packages and including files in your Conda Environment image

Always use "/home/app" to make packages/files available in a Function. A Function is always deployed as the user "app" and it would then have access to "/home/app" (home directory of the user "app").

You have two possibilities for adding custom packages.

  • If your custom packages were created following this guide (for ONE LOGIC internals), or this guide (for ONE LOGIC externals) they can be installed in a Conda Environment image by adding the following at the end of your Conda Environment Dockerfile (presuming the .whl file is located in the same folder as the Dockerfile):

    COPY example_pkg_xxx-0.0.1-py3-none-any.whl /home/app
    RUN pip install /home/app/example_pkg_xxx-0.0.1-py3-none-any.whl
  • You can also include additional files in your Environment image by copying them to /home/app, which makes them accessible in a Function. This can be done by adding the following at the end of your Conda Environment Dockerfile (presuming the file is located in the same folder as the Dockerfile):

    COPY file.pkl /home/app

    This file will then be available in your Function as "file.pkl".
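As an illustration, a Function could read such a bundled file as sketched below. The sketch assumes that file.pkl is a pickle file and uses the absolute path under /home/app so that it does not depend on the working directory; load_bundled_file is a hypothetical helper, and what the unpickled object is depends entirely on your use case.

```python
import os
import pickle

APP_HOME = "/home/app"  # home directory of the user "app"


def load_bundled_file(name, base_dir=APP_HOME):
    """Unpickle a file that was copied into the image with COPY <name> /home/app."""
    with open(os.path.join(base_dir, name), "rb") as file_obj:
        return pickle.load(file_obj)


def handle(req):
    # Assumption: file.pkl holds whatever object your use case pickled.
    payload = load_bundled_file("file.pkl")
    return type(payload).__name__
```

Only unpickle files you built yourself, since loading a pickle can execute arbitrary code.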

5. Build and tag an image from your Dockerfile

Open the command line and navigate to the folder containing your Dockerfile. 

Execute the following command:

docker build . -f <name_of_dockerfile> -t <tag>
  • <name_of_dockerfile> must be replaced with the name of the Dockerfile you want to build.
  • <tag> must be replaced with a valid docker image tag (format: name:tag).

Example:

docker build . -f Dockerfile -t env:1.0.0

More information can be found in the official docker build documentation
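If the build is triggered from a script, the name:tag format can be checked before Docker is called. The following sketch uses a deliberately simplified pattern (lowercase repository name, no registry host or port), so treat it as an assumption rather than Docker's full reference grammar; build_command is an illustrative name.

```python
import re

# Simplified name:tag check; Docker itself accepts more (registry hosts, ports).
_NAME = r"[a-z0-9]+(?:[._-][a-z0-9]+)*(?:/[a-z0-9]+(?:[._-][a-z0-9]+)*)*"
_TAG = r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}"
_REFERENCE = re.compile(r"^(%s):(%s)$" % (_NAME, _TAG))


def build_command(dockerfile, reference, context="."):
    """Return the docker build call for the given Dockerfile and name:tag."""
    if not _REFERENCE.match(reference):
        raise ValueError("expected name:tag, got %r" % reference)
    return ["docker", "build", context, "-f", dockerfile, "-t", reference]


# Usage (runs the build, requires a started Docker CLI):
#   import subprocess
#   subprocess.run(build_command("Dockerfile", "env:1.0.0"), check=True)
```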

6. Save your custom Environment as a tarball

docker save -o myCustomEnvironment.tar <repo>:<tag>

The file myCustomEnvironment.tar now contains your custom Environment and is ready to be uploaded.
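Before uploading, you may want to confirm which <repo>:<tag> the archive actually carries. Archives written by "docker save" normally contain a manifest.json file whose entries list the image's RepoTags; the sketch below reads them with the Python standard library. The function name repo_tags is illustrative.

```python
import json
import tarfile


def repo_tags(tar_path):
    """Return the <repo>:<tag> entries recorded in a docker save archive."""
    with tarfile.open(tar_path) as archive:
        manifest = json.load(archive.extractfile("manifest.json"))
    tags = []
    for image in manifest:
        # RepoTags can be null for untagged images.
        tags.extend(image.get("RepoTags") or [])
    return tags
```

For the example above, repo_tags("myCustomEnvironment.tar") should list the tag you passed to "docker build".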

Without Access to harbor.onedata.de

1. Copy your Dockerfile to a file on your local machine and give it a custom name. In this example, we use the name "DockerfileExample".

2. Add the packages needed for your specific use case. Possible candidates are requests and numpy.

  Then open a command-line interface and navigate to the folder containing your Dockerfile. Some examples of command-line interfaces are:
  • Built-in Windows command line
  • Git Bash
  • Linux Bash
  • Windows PowerShell

3. Build the image by running the "docker build" command with suitable parameters.

4. Verify that the build succeeded. The messages should look something like this.

5. Verify that the image is saved locally with the "docker images" command.

6. Verify that the created image contains the desired packages by running "docker run -it <name>:<tag> python" and trying to use those packages. (Prefixing the command with "winpty" is necessary for Git Bash.)
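A short sketch of such a check, which can be pasted into the interactive Python session started by the command above: it tries to import each module and reports either its version or the import error. The module names you pass in are your own; note that the import name can differ from the pip package name.

```python
import importlib


def check_imports(module_names):
    """Try to import each module; map name -> version string or error text."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError as error:
            report[name] = "MISSING (%s)" % error
        else:
            # Not every module exposes __version__, so fall back to a marker.
            report[name] = getattr(module, "__version__", "installed")
    return report


# Example: check_imports(["flask", "waitress", "requests", "numpy"])
```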

7. Save the custom environment to a tar file.

Used CLI Commands

docker save -o filename.tar <repo>:<tag> 

  • Writes the image to a tar file which can be uploaded to ONE DATA
  • "-o": Write to a file, instead of STDOUT

docker load -i filename.tar

  • Load an image from a tar archive or STDIN

  • "-i": Read from tar archive file, instead of STDIN

docker images

  • Lists local docker registry

docker rmi <repo:tag | imageId>

  • Removes the given image from the local docker registry

docker pull <imageURL>:<tag>

  • Pulls given image to local docker registry

docker build . -f <name_of_dockerfile> -t <tag>

  • Builds an image from a Dockerfile
  • "-f": Name of the Dockerfile (Default is 'PATH/Dockerfile')
  • "-t": Name and optionally a tag in the 'name:tag' format


Upload a User Environment

You can upload User Environments to ONE DATA through its UI, or by REST Request.

Using the ONE DATA UI

The "Upload Environment" functionality can be found together with the other management options in the Functions overview and the Function detail view, as seen in "Where to Find the Functionality". Clicking on the button opens a dialog with your operating system's file explorer, in which you can select the correct tar file to upload (open).


Please don't leave the page until the Environment is uploaded. You can see the progress via a loading bar on top of the page. The file upload can take some time depending on the image size and bandwidth. 


After the Environment is uploaded, you will receive a toast message in the bottom left corner of the page. The name of the Environment is the <name>:<tag> repo tag that was specified when the image was built (the "-t" parameter of "docker build").


Using a REST Request

Alternatively, you can perform the upload via REST request. Replace <faas-server>, <token>, <pathToTar>, and <domainId> in the following command with the necessary values.

curl --location --request POST '<faas-server>/api/v1/environment' \
--header 'Authorization: Bearer <token>' \
--form 'environmentImage=@"<pathToTar>"' \
--form 'domainId="<domainId>"'
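The same request can be sent from Python. The following is a sketch using the requests package (already pinned in the Dockerfile above); it mirrors the curl command field by field and makes no assumptions about the response body. The function names are illustrative. As far as we know, requests assembles the multipart body in memory, so curl may be the better choice for very large tar files.

```python
def build_upload_request(faas_server, token, domain_id):
    """Return URL, headers, and form fields matching the curl command."""
    url = faas_server.rstrip("/") + "/api/v1/environment"
    headers = {"Authorization": "Bearer %s" % token}
    form_fields = {"domainId": domain_id}
    return url, headers, form_fields


def upload_environment(faas_server, token, domain_id, path_to_tar):
    """POST the tar file as the environmentImage form field."""
    import requests  # imported lazily so the helper above has no dependencies

    url, headers, form_fields = build_upload_request(faas_server, token, domain_id)
    with open(path_to_tar, "rb") as tar_file:
        response = requests.post(
            url,
            headers=headers,
            data=form_fields,
            files={"environmentImage": tar_file},
        )
    response.raise_for_status()  # raise on 4xx/5xx answers
    return response
```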

Update/Overwrite a User Environment

Existing User Environments can be updated using the upload feature. An existing User Environment is updated when the uploaded tar file contains the same <name>:<tag> combination. This combination is specified when building the docker image, as described in the step "Build and tag an image from your Dockerfile" of the creation instructions.


For smaller changes, please reuse Environment tags to avoid cluttering an instance. The respective Environment will be overwritten. Only the uploader of an Environment, a Domain Admin, or a Super Admin can overwrite it.


Delete a User Environment

User Environments can also be deleted in the Functions overview page or in the Function detail view.


Deployed Functions using a deleted Environment can still be executed. However, they cannot be re-deployed until a new Environment is assigned to them.

If the User Environment of a Function was deleted, you will get a warning inside the Function.



Change the Environment of a Function

If a Function should use a different Environment, this can be changed in the Function detail view, using "Change Environment". A dialog will open, allowing you to select a different Environment. You need to be careful that the new Environment contains all necessary Python packages.



Boundaries and Current Constraints

  • No automatic check for package compatibility is performed. Whoever builds the Environment is responsible for checking whether the packages are compatible.
  • The same applies for usage licenses and security vulnerabilities.


Troubleshooting and Tips

Keep Track of User Environments

It is highly recommended to use version control (e.g. Git) to keep track of Environment versions, especially if multiple users are responsible for managing them.


Environment Upload Failed

Currently, there are two scenarios in which an Environment upload fails for reasons unrelated to technical issues.

An Environment with that name already exists in the current Domain

Possible solutions are:

  • If the existing Environment and the new Environment are the same, use the existing Environment.
  • If the two Environments have very different content, use a different name for the new Environment.
  • If one Environment contains all packages of the other, use the larger Environment: either keep the existing Environment or overwrite it with the new one. It is also possible to simply choose a different name for the new Environment. Keep in mind, however, not to clutter your instance with too many Environments.
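Which of these cases applies can be decided by comparing the package lists of the two Environments, for example lists in the "name==version" form returned by the Function shown under "Check Package Versions in Default Environment". The following sketch classifies two such lists; the function names and the returned labels are illustrative.

```python
def parse_pins(lines):
    """Turn ["flask==1.1.2", ...] into {"flask": "1.1.2", ...}."""
    pins = {}
    for line in lines:
        name, _, version = line.partition("==")
        pins[name.strip().lower()] = version.strip()
    return pins


def compare_environments(existing, new):
    """Classify two package lists as same, one containing the other, or different."""
    a, b = parse_pins(existing), parse_pins(new)
    if a == b:
        return "same"
    if set(b.items()) <= set(a.items()):
        return "existing contains new"
    if set(a.items()) <= set(b.items()):
        return "new contains existing"
    return "different"
```

A package counts as contained only if both name and version match, so two Environments that pin different versions of the same package are reported as "different".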

An Environment with that name already exists in another Domain

It's not possible to use the Environment of the other Domain, even if it's suitable.

  • Upload your Environment to your Domain with a different name.


Check Package Versions in Default Environment

Package versions of the default Environment can be checked in harbor.onedata.de or by using a Function.

Check in harbor

  • Open private/openfaas/runtime/conda in harbor, select the latest release, and look at the last step of its build history.
  • Example with Conda version 4.7.12.1-r10: the installed packages are listed there.

Use a Function

def handle(req):
    # retrieve installed packages and their versions
    import pkg_resources
    installed_packages = pkg_resources.working_set
 
    # formatting & sorting for better readability
    installed_packages_list = sorted(["%s==%s" % (i.key, i.version)
        for i in installed_packages])
    return installed_packages_list