Deployment options in AI4OS

This page serves as a guide, summarizing the pros and cons of each deployment option, so that users can make an informed decision about where to deploy their models.

Deployment options from the AI4OS Dashboard

Option: Deploy in AI4OS (serverless) — model is loaded on demand

✅ Pros:

  • You are not consuming resources when you are not using the model,

  • Deployments can auto-scale to fit peaks in user queries,

  • Zero configuration needed, as the model is deployed in the AI4OS stack.

❌ Cons:

  • Predictions can have some latency, because the AI model is loaded at each prediction.

Option: Deploy in AI4OS (dedicated resources) — model is always loaded

✅ Pros:

  • Low latency in predictions,

  • Zero configuration needed, as the model is deployed in the AI4OS stack.

❌ Cons:

  • You are consuming resources even when not actively making predictions.

Option: Deploy in your own cloud resources

✅ Pros:

  • You control where you deploy (no need to be an AI4OS member).

❌ Cons:

  • More work to configure the deployment,

  • You are consuming resources even when not actively making predictions (unless deployed as serverless).
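Whichever option you choose, a deployed model is typically queried over a REST interface. As a hedged illustration only, the sketch below builds the predict-endpoint URL of a DEEPaaS-style v2 API; the base URL and model name are hypothetical placeholders, not actual AI4OS values:

```python
# Illustrative sketch, assuming a DEEPaaS-style REST API.
# Base URL and model name below are hypothetical placeholders.

def predict_url(base_url: str, model: str) -> str:
    """Build the predict endpoint URL for a deployed model."""
    return f"{base_url.rstrip('/')}/v2/models/{model}/predict/"

url = predict_url("https://my-deployment.example.org", "demo-model")
print(url)
# → https://my-deployment.example.org/v2/models/demo-model/predict/

# A real one-off prediction would then POST data to that URL, e.g.:
#   import requests
#   resp = requests.post(url, files={"data": open("sample.jpg", "rb")})
```

The same request pattern applies to all three options above; what differs is only where the endpoint runs and whether the model behind it is loaded on demand or kept resident.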

Given the above trade-offs, we recommend the following typical workflows:

If you need to generate one-off predictions on a given dataset but do not need to maintain a running service, you have two options:

Beyond the deployment options offered by the AI4OS Dashboard, several other deployment methods are available: