Develop a model from scratch

This tutorial explains how to develop an AI4OS module from scratch.

If you are new to Machine Learning, you might want to check some useful Machine Learning resources we compiled to help you get started.

Requirements

  • For Step 1, to use the Module’s template webpage, you need at least basic authentication.

  • For Step 2, if you plan to use the AI4OS Development environment, you need full authentication to be able to access the Dashboard. Otherwise you can develop locally.

  • For Step 4, we recommend having Docker installed (though it’s not strictly mandatory).

1. Setting the framework

This first step relies on the AI4OS Modules Template to create a template for your new module:

  • Access and authenticate in the Template creation webpage.

  • Then select the minimal branch of the template and answer the questions.

  • Click on Generate and you will be able to download a .zip file with your project directory. Extract it locally.

2. Prepare your development environment

Although it is possible to develop your code locally, we also offer the possibility to develop from our AI4OS Development Environment.

This offers the benefits of:

  • developing on dedicated resources (including GPUs),

  • having direct access to your Nextcloud storage,

  • developing on a Docker image that is already packaged with your favorite Deep Learning framework (e.g. PyTorch, TensorFlow),

  • developing in your favorite IDE (JupyterLab or VS Code).

Check how to configure the AI4OS Development Environment. For example, this is what an AI4OS Development Environment with VS Code looks like out of the box:

[Figure: Launching a development environment]

3. Editing the module’s code

Unpack the zip file created in Step 1. Install your project as a Python module in editable mode (so that the changes you make to the codebase are picked up by Python).

$ cd <project-name>
$ pip install -e .

Now you can start writing your code.

Tip

Some users have reported issues on some systems when installing deepaas (which is always present in the requirements.txt of your project). Those issues were resolved as follows:

  • In PyTorch Docker images, by making sure gcc is installed (apt install gcc).

  • In other systems, sometimes python3-dev is needed (apt install python3-dev).

To be able to interface with DEEPaaS you have to define in api.py the functions you want to make accessible to the user. For this tutorial we are going to head to our official demo module and copy-paste its api.py file.

Once this is done, check that DEEPaaS is interfacing correctly by running:

$ deepaas-run --listen-ip 0.0.0.0

Your module should be visible in http://0.0.0.0:5000/ui . If you don’t see your module, you probably messed up the api.py file. Try running it with Python so you get a more detailed debug message.

$ python api.py

Remember to leave the predefined get_metadata() function untouched, as all modules should have proper metadata.
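As a quick sanity check before launching deepaas-run, the sketch below lists which entry points an api module is missing. Only get_metadata() and the predict method are named in this tutorial; the other names in EXPECTED_ENTRY_POINTS are assumptions based on the DEEPaaS V2 API, and missing_entry_points is a hypothetical helper, not part of DEEPaaS. A stand-in module is used so the example runs on its own.

```python
import types

# Assumed entry-point names; only get_metadata and predict appear in this
# tutorial, so adjust the tuple to match what your module actually exposes.
EXPECTED_ENTRY_POINTS = (
    "get_metadata",
    "get_predict_args",
    "predict",
    "get_train_args",
    "train",
)


def missing_entry_points(module, expected=EXPECTED_ENTRY_POINTS):
    """Return the expected names that `module` does not define as callables."""
    return [
        name for name in expected
        if not callable(getattr(module, name, None))
    ]


# Stand-in for your real api.py, so the sketch is self-contained.
fake_api = types.ModuleType("api")
fake_api.get_metadata = lambda: {"name": "demo"}
fake_api.predict = lambda **kwargs: {"result": "ok"}

print(missing_entry_points(fake_api))
# ['get_predict_args', 'get_train_args', 'train']
```

In your own project you would import the real api module and pass it to the helper instead of fake_api.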

You can also use port 6006 to expose some training monitoring tool, like Tensorboard.

To improve the readability of the code and the overall maintainability of your module, we enforce some quality standards in tox (including style, security, etc). Modules that fail the style tests won’t be able to build Docker images. You can check locally if your module passes the tests:

$ tox -e .

There you should see a detailed report of the offending lines (if any). You can always turn off flake8 testing in some parts of the code if long lines are really needed.
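For instance, flake8 skips a line that ends with a noqa comment, optionally restricted to specific error codes. A minimal illustration, where the constant name and the URL are hypothetical placeholders:

```python
# E501 is flake8's "line too long" check; the trailing comment silences it
# for this line only. The URL below is a made-up placeholder.
PRETRAINED_WEIGHTS_URL = "https://example.org/models/my-module/releases/v1.0.0/pretrained-weights-final.tar.gz"  # noqa: E501

# The string alone is already longer than the usual 79-character limit.
print(len(PRETRAINED_WEIGHTS_URL) > 79)
```

Restricting the comment to one error code keeps every other check active on that line.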

Tip

If your project has many offending lines, we recommend using a code formatter like Black. It also helps keep a consistent code style and minimizes git diffs. Black-formatted code is generally compliant with flake8, provided both tools agree on the maximum line length.

Once installed, you can check how Black would have reformatted your code:

$ black <code-folder> --diff

You can always turn off Black formatting if you want to keep some sections of your code untouched.

If you are happy with the changes, you can make them permanent using:

$ black <code-folder>

Remember to have a backup before reformatting, just in case!
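Regarding the option of turning Black off for selected sections: Black leaves everything between a "# fmt: off" and a "# fmt: on" comment untouched. A small illustration with a hypothetical constant whose hand-made layout Black would otherwise collapse onto a single line:

```python
# fmt: off
# Hypothetical 3x3 matrix stored row by row; the line breaks mirror the rows.
IDENTITY_3X3 = [
    1, 0, 0,
    0, 1, 0,
    0, 0, 1
]
# fmt: on

print(len(IDENTITY_3X3))
```

Code outside the two comments is still reformatted as usual.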

4. Editing the module’s Dockerfile

Your ./Dockerfile is in charge of creating a docker image that integrates your application, along with deepaas and any other dependency. You can modify that file according to your needs.

We recommend checking that the installation steps are correct. If your module needs additional Linux packages, add them to the Dockerfile. Check that your Dockerfile works correctly by building it locally (outside the AI4OS Development Environment) and running it:

$ docker build --no-cache -t your_project .
$ docker run -ti -p 5000:5000 -p 6006:6006 -p 8888:8888 your_project

Your module should be visible in http://0.0.0.0:5000/ui . You can make a POST request to the predict method to check everything is working as intended.
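The POST request mentioned above could be made from Python along these lines. This is only a sketch: the /v2/models/<name>/predict/ path, the model name your_project and the demo_str argument are assumptions, and whether arguments travel in the query string or in the request body depends on how they are declared in your api.py. The interface at http://0.0.0.0:5000/ui shows the actual endpoint and parameters.

```python
import urllib.parse
import urllib.request


def build_predict_request(base_url, model_name, params):
    """Build (but do not send) a POST request to a module's predict method.

    The URL layout and the query-string arguments are assumptions; check
    the web UI of your running module for the real ones.
    """
    query = urllib.parse.urlencode(params)
    url = f"{base_url}/v2/models/{model_name}/predict/?{query}"
    return urllib.request.Request(
        url,
        data=b"",
        headers={"accept": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the decoded response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# "your_project" and "demo_str" are placeholders for your module and argument.
request = build_predict_request(
    "http://0.0.0.0:5000", "your_project", {"demo_str": "hello"}
)
print(request.get_method(), request.full_url)
# With the container running, send it with: print(send(request))
```

Building the request separately from sending it makes it easy to inspect the final URL before the container is up.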

5. Update your project’s metadata

The module’s metadata is located in the ai4-metadata.yml file. This is the information that will be displayed in the Marketplace. The fields you need to edit to comply with our schemata are:

  • title (mandatory): short title,

  • summary (mandatory): one-line summary of your module,

  • description (optional): extended description of your module, like a README,

  • links (mostly optional): links to related info (training dataset, module citation, etc.),

  • tags (mandatory): relevant user-defined keywords (can be empty),

  • categories, tasks, libraries, data-type (mandatory): one or several keywords, to be chosen from a closed list (can be empty).

Libraries       Tasks                          Categories         Data Type
------------    ---------------------------    ---------------    -----------
TensorFlow      Computer Vision                AI4 pre trained    Image
PyTorch         Natural Language Processing    AI4 trainable      Text
Keras           Time Series                    AI4 inference      Time Series
Scikit-learn    Recommender Systems            AI4 tools          Tabular
XGBoost         Anomaly Detection                                 Graph
LightGBM        Regression                                        Audio
CatBoost        Classification                                    Video
Other           Clustering                                        Other
                Dimensionality Reduction
                Generative Models
                Graph Neural Networks
                Optimization
                Reinforcement Learning
                Transfer Learning
                Uncertainty Estimation
                Other

Some fields are pre-filled via the AI4OS Modules Template and usually do not need to be modified. Check you didn’t mess up the YAML definition by running our metadata validator:

$ pip install ai4-metadata
$ ai4-metadata validate ai4-metadata.yml
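Before running the validator, a rough local pre-check can catch the most obvious slips. The sketch below is not the official validator and does not reproduce its schema: check_metadata is a hypothetical helper, the metadata dict stands in for the parsed content of ai4-metadata.yml with made-up values, and only two of the closed lists above are checked.

```python
MANDATORY_FIELDS = (
    "title", "summary", "tags",
    "categories", "tasks", "libraries", "data-type",
)

# Allowed values copied from the closed lists above; extend the dict with
# tasks and libraries if you want to check them too.
ALLOWED = {
    "categories": {
        "AI4 pre trained", "AI4 trainable", "AI4 inference", "AI4 tools",
    },
    "data-type": {
        "Image", "Text", "Time Series", "Tabular",
        "Graph", "Audio", "Video", "Other",
    },
}


def check_metadata(metadata):
    """Return a list of human-readable problems found in `metadata`."""
    problems = [
        f"missing mandatory field: {field}"
        for field in MANDATORY_FIELDS
        if field not in metadata
    ]
    for field, allowed in ALLOWED.items():
        for value in metadata.get(field, []):
            if value not in allowed:
                problems.append(f"{field}: unknown value {value!r}")
    return problems


# Stand-in for the parsed content of ai4-metadata.yml (values are made up).
metadata = {
    "title": "My module",
    "summary": "Classifies images of plants.",
    "tags": [],
    "categories": ["AI4 trainable", "AI4 training"],
    "tasks": ["Computer Vision"],
    "data-type": ["Image"],
}

for problem in check_metadata(metadata):
    print(problem)
# missing mandatory field: libraries
# categories: unknown value 'AI4 training'
```

The ai4-metadata validate command remains the reference check; this helper only gives faster feedback while editing.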

6. Integrating the module in the Marketplace

Once your repo is set, it’s time to integrate it in the Marketplace!

For this the steps are:

  1. Open an issue in the AI4OS Catalog repo.

  2. An admin will create the GitHub repo for your module inside the ai4os-hub organization. You will be granted write permissions in that repo.

    Module repos follow this naming convention:

    • ai4os-hub/ai4-<project-name>: modules officially developed by the project

    • ai4os-hub/<project-name>: modules developed by external users

  3. Upload your code to that repo.

  4. An admin will review your code and add it to the AI4OS Catalog. Once a module is approved it will take roughly 6 hours to appear in the Dashboard’s Marketplace.