Deployment options

This page serves as a guide, summarizing the pros and cons of each deployment option. With this information in mind, users can make the best decision on where to deploy their models.

Deployment options from the Dashboard

Deploy as serverless (model is loaded on demand)

✅ Pros

  • You are not consuming resources when you are not using the model.

  • Deployments can auto-scale to fit peaks in user queries.

  • Supports both sync and async calls.

  • Supports predictions via default UI (Oscar).

  • Zero configuration needed as the model is deployed internally.

  • Can be manually customized for specific needs (like historical data batch prediction).

❌ Cons

  • Predictions can have some latency, due to the AI model being loaded at each prediction.

Deploy persistently (model is always loaded)

✅ Pros

  • Low latency in predictions.

  • Supports sync calls.

  • Supports predictions via dedicated UI (Gradio).

  • Zero configuration needed as the model is deployed internally.

❌ Cons

  • You are consuming resources even when not actively making predictions.

Deploy in your own cloud resources

✅ Pros

  • You control where you deploy (no need to be a project member).

  • Supports sync calls.

❌ Cons

  • More work to configure the deployment.

  • You are consuming resources even when not actively making predictions (if not deployed as serverless).

Deploy in the EOSC node

✅ Pros

  • You control where you deploy (no need to be a project member). Free limited resources for any European researcher.

  • Supports sync calls.

❌ Cons

  • More work to configure the deployment.

  • You are consuming resources even when not actively making predictions (if not deployed as serverless).

Given the above specifications, we recommend the following typical workflows:

  • Deploy as serverless if your service does not need low latency and expects either (1) many concurrent user queries or (2) user queries that are spaced out in time. This is the default recommendation for all users.

  • Deploy persistently if your service needs to handle, with low latency, a small number of concurrent user queries that arrive close together in time.

  • Deploy in your own cloud resources if you do not belong to the project and have access to private cloud resources.

  • Deploy in the EOSC node if you are a European researcher who does not belong to the project and does not have access to private cloud resources.
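
The workflows above can be read as a small decision procedure. The sketch below is purely illustrative and is not part of the platform: the `ServiceProfile` class, its field names, the `recommend_deployment` function, and the returned labels are all hypothetical. It also assumes that project membership is checked first, and it returns `None` for cases the guide does not cover.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceProfile:
    """Hypothetical description of a user and their service (illustrative fields)."""
    is_project_member: bool
    needs_low_latency: bool = False
    has_private_cloud: bool = False
    is_european_researcher: bool = False


def recommend_deployment(profile: ServiceProfile) -> Optional[str]:
    """Map a profile to one of the four Dashboard options, following the guide."""
    if profile.is_project_member:
        # Persistent deployments keep the model loaded, so they suit
        # low-latency services; serverless is the default otherwise.
        if profile.needs_low_latency:
            return "persistent"
        return "serverless"
    # Users outside the project deploy on resources they can access themselves.
    if profile.has_private_cloud:
        return "own cloud"
    if profile.is_european_researcher:
        return "EOSC node"
    # Assumption: the guide gives no recommendation for any other case.
    return None


if __name__ == "__main__":
    print(recommend_deployment(ServiceProfile(is_project_member=True)))
    print(recommend_deployment(ServiceProfile(is_project_member=False,
                                              is_european_researcher=True)))
```

The order of the checks is a design choice of this sketch: the guide recommends the last two options only to users who do not belong to the project, so membership is tested before anything else.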

If you need to generate one-off predictions on a given dataset but not maintain a running service, you have two options:

In addition to the above deployment options from the Dashboard, there are several additional deployment methods: