Develop a model from scratch
This tutorial explains how to develop an AI4OS module from scratch.
If you are new to Machine Learning, you might want to check some useful Machine Learning resources we compiled to help you get started.
Requirements
For Step 1, to use the Module’s template webpage, you need at least basic authentication.
For Step 2, if you plan to use the AI4OS Development environment, you need full authentication to be able to access the Dashboard. Otherwise you can develop locally.
For Step 4, we recommend having Docker installed (though it’s not strictly mandatory).
1. Setting the framework
This first step relies on the AI4OS Modules Template to create a template for your new module:
Access and authenticate in the Template creation webpage.
Then select the minimal branch of the template and answer the questions.
Click on Generate and you will be able to download a .zip file with your project directory. Extract it locally.
2. Prepare your development environment
Although it is possible to develop your code locally, we also offer the possibility to develop from our AI4OS Development Environment.
This offers the benefits of:
developing on dedicated resources (including GPUs),
having direct access to your Nextcloud storage,
developing on a Docker image that is already packaged with your favorite Deep Learning framework (e.g. PyTorch, TensorFlow),
developing in your favorite IDE (JupyterLab or VS Code).
Check how to configure the AI4OS Development Environment. For example, this is what an AI4OS Development Environment with VS Code looks like out of the box:
Launching a development environment
3. Editing the module’s code
Unpack the zip file created in Step 1. Install your project as a Python module in editable mode (so that the changes you make to the codebase are picked up by Python).
$ cd <project-name>
$ pip install -e .
Now you can start writing your code.
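As a quick sanity check after the editable install, the following minimal sketch asks Python whether it can locate your package. The package name `my_project` is hypothetical; replace it with the name of the package inside your project directory.

```python
import importlib.util


def is_importable(package_name: str) -> bool:
    """Return True if Python can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # "my_project" is a hypothetical name used only for illustration.
    print(is_importable("my_project"))
```

If this prints `False`, the editable install probably did not succeed in the Python environment you are currently using.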
Tip
Some users have reported issues on some systems when installing deepaas (which is always present in the requirements.txt of your project). Those issues have been resolved as follows:
In PyTorch Docker images, make sure gcc is installed (apt install gcc).
In other systems, python3-dev is sometimes needed (apt install python3-dev).
To be able to interface with DEEPaaS, you have to define in api.py the functions you want to make accessible to the user. For this tutorial, we are going to head to our official demo module and copy-paste its api.py file.
Once this is done, check that DEEPaaS is interfacing correctly by running:
$ deepaas-run --listen-ip 0.0.0.0
Your module should be visible at http://0.0.0.0:5000/ui. If you don’t see your module, there is probably an error in the api.py file. Try running it with Python to get a more detailed debug message.
$ python api.py
Remember to leave the get_metadata() function that comes predefined with your module untouched, as all modules should have proper metadata.
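When debugging api.py, it can help to list which entry points the module actually defines. The sketch below is only an illustration: the list of function names is an assumption about what DEEPaaS looks for (check the DEEPaaS documentation for the authoritative list), and the stand-in module is built in memory so the example runs anywhere.

```python
import types

# Entry points DEEPaaS is assumed to look for in api.py (assumption, not an
# authoritative list).
EXPECTED = (
    "get_metadata",
    "get_predict_args",
    "predict",
    "get_train_args",
    "train",
    "warm",
)


def report_entry_points(module: types.ModuleType) -> dict:
    """Map each expected name to whether the module defines it as a callable."""
    return {name: callable(getattr(module, name, None)) for name in EXPECTED}


# Stand-in module used only for this demonstration.
demo_api = types.ModuleType("demo_api")
demo_api.get_metadata = lambda: {}
demo_api.predict = lambda **kwargs: None

for name, defined in report_entry_points(demo_api).items():
    print(f"{name}: {'defined' if defined else 'missing'}")
```

For a real module, you would pass the result of `importlib.import_module("<your_package>.api")` instead of the stand-in.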
You can also use port 6006 to expose a training monitoring tool, such as TensorBoard.
To improve the readability of the code and the overall maintainability of your module, we enforce some quality standards with tox (including style, security, etc.). Modules that fail the style tests won’t be able to build Docker images. You can check locally whether your module passes the tests:
$ tox -e .
There you should see a detailed report of the offending lines (if any). You can always turn off flake8 testing in some parts of the code if long lines are really needed.
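As an illustration of turning off flake8 for a single line, the sketch below uses flake8's inline `# noqa` comment with the `E501` (line too long) code. The URL is a made-up placeholder.

```python
# The long literal below would normally trigger flake8's E501 check.
# The trailing "noqa" comment disables that check on this line only.
DATASET_URL = "https://example.org/a/deliberately/long/hypothetical/path/that/does/not/fit/in/79/characters.zip"  # noqa: E501


def describe() -> str:
    """Return a short message that uses the long constant."""
    return f"Data will be fetched from {DATASET_URL}"


print(describe())
```

Restricting the comment to a specific code keeps every other check active on that line.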
Tip
If your project has many offending lines, we recommend using a code formatter such as Black. It also helps keep a consistent code style and minimizes git diffs. Black-formatted code is largely compliant with flake8, provided both tools are configured with the same maximum line length.
Once installed, you can check how Black would have reformatted your code:
$ black <code-folder> --diff
You can always turn off Black formatting if you want to keep some sections of your code untouched.
If you are happy with the changes, you can make them permanent using:
$ black <code-folder>
Remember to have a backup before reformatting, just in case!
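To keep a section untouched by Black, you can wrap it in `# fmt: off` and `# fmt: on` comments. The hand-aligned matrix below is a made-up example; without the markers, Black would collapse the extra spaces.

```python
# fmt: off
# Black leaves everything between "fmt: off" and "fmt: on" as written,
# so the manual column alignment survives reformatting.
KERNEL = [
    [1,  0, -1],
    [2,  0, -2],
    [1,  0, -1],
]
# fmt: on

print(KERNEL[1])
```

Depending on your flake8 configuration, the extra spaces after the commas may still be reported, so use this sparingly.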
4. Editing the module’s Dockerfile
Your ./Dockerfile is in charge of creating a Docker image that integrates your application, along with deepaas and any other dependencies. You can modify that file according to your needs.
We recommend checking that the installation steps work. If your module needs additional Linux packages, add them to the Dockerfile. Check that your Dockerfile works correctly by building it locally (outside the AI4OS Development Environment) and running it:
$ docker build --no-cache -t your_project .
$ docker run -ti -p 5000:5000 -p 6006:6006 -p 8888:8888 your_project
Your module should be visible at http://0.0.0.0:5000/ui. You can make a POST request to the predict method to check everything is working as intended.
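A minimal sketch of such a POST request, using only the Python standard library, could look as follows. The route `/v2/models/<name>/predict/` is an assumption about the DEEPaaS API layout, and passing arguments as query parameters is also an assumption; confirm both in the /ui page of your running module. The parameter name in the usage note is hypothetical.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def build_predict_request(base_url: str, model_name: str, params: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for a module's predict method."""
    # Assumed route; check the exact path shown in the /ui page.
    url = f"{base_url.rstrip('/')}/v2/models/{model_name}/predict/"
    query = urllib.parse.urlencode(params)
    if query:
        url = f"{url}?{query}"
    return urllib.request.Request(url, method="POST", headers={"accept": "application/json"})


def call_predict(request: urllib.request.Request, timeout: float = 30.0):
    """Send the request and return the decoded JSON body, or None on failure."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError) as err:
        print(f"Request failed: {err}")
        return None
```

With the container running, `call_predict(build_predict_request("http://0.0.0.0:5000", "your_project", {"example_param": "value"}))` would send the request, where `example_param` stands for whatever arguments your predict method declares.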
5. Update your project’s metadata
The module’s metadata is located in the ai4-metadata.yml file. This is the information that will be displayed in the Marketplace. The fields you need to edit to comply with our schemata are:
title (mandatory): short title,
summary (mandatory): one-line summary of your module,
description (optional): extended description of your module, like a README,
links (mostly optional): links to related info (training dataset, module citation, etc.),
tags (mandatory): relevant user-defined keywords (can be empty),
categories, tasks, libraries, data-type (mandatory): one or several keywords, to be chosen from a closed list (can be empty).
| Libraries | Tasks | Categories | Data Type |
|---|---|---|---|
| TensorFlow | Computer Vision | AI4 pre trained | Image |
| PyTorch | Natural Language Processing | AI4 trainable | Text |
| Keras | Time Series | AI4 inference | Time Series |
| Scikit-learn | Recommender Systems | AI4 tools | Tabular |
| XGBoost | Anomaly Detection | | Graph |
| LightGBM | Regression | | Audio |
| CatBoost | Classification | | Video |
| Other | Clustering | | Other |
| | Dimensionality Reduction | | |
| | Generative Models | | |
| | Graph Neural Networks | | |
| | Optimization | | |
| | Reinforcement Learning | | |
| | Transfer Learning | | |
| | Uncertainty Estimation | | |
| | Other | | |
Some fields are pre-filled via the AI4OS Modules Template and usually do not need to be modified. Check that the YAML definition is still valid by running our metadata validator:
$ pip install ai4-metadata
$ ai4-metadata validate ai4-metadata.yml
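To make the rules above concrete, here is a rough pre-check written as a sketch. It is not the official validator and does not replace it. It assumes the YAML file has already been parsed into a Python dict and that the closed-list fields hold lists of strings; the allowed values are copied from the table in this section.

```python
MANDATORY = ("title", "summary", "tags", "categories", "tasks", "libraries", "data-type")

# Closed lists copied from the table in this section.
CLOSED_LISTS = {
    "libraries": {
        "TensorFlow", "PyTorch", "Keras", "Scikit-learn",
        "XGBoost", "LightGBM", "CatBoost", "Other",
    },
    "tasks": {
        "Computer Vision", "Natural Language Processing", "Time Series",
        "Recommender Systems", "Anomaly Detection", "Regression",
        "Classification", "Clustering", "Dimensionality Reduction",
        "Generative Models", "Graph Neural Networks", "Optimization",
        "Reinforcement Learning", "Transfer Learning",
        "Uncertainty Estimation", "Other",
    },
    "categories": {"AI4 pre trained", "AI4 trainable", "AI4 inference", "AI4 tools"},
    "data-type": {
        "Image", "Text", "Time Series", "Tabular",
        "Graph", "Audio", "Video", "Other",
    },
}


def check_metadata(metadata: dict) -> list:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    # Mandatory fields must be present, even if their value is empty.
    for field in MANDATORY:
        if field not in metadata:
            problems.append(f"missing mandatory field: {field}")
    # Closed-list fields may be empty, but every value must be allowed.
    for field, allowed in CLOSED_LISTS.items():
        for value in metadata.get(field) or []:
            if value not in allowed:
                problems.append(f"{field}: '{value}' is not in the closed list")
    return problems
```

The official `ai4-metadata validate` command remains the reference check, since the schema may contain rules this sketch does not cover.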
6. Integrating the module in the Marketplace
Once your repo is set, it’s time to integrate it in the Marketplace!
The steps are:
Open an issue in the AI4OS Catalog repo.
An admin will create the GitHub repo for your module inside the ai4os-hub organization. You will be granted write permissions in that repo. Module repos follow this convention:
ai4os-hub/ai4-<project-name>: modules officially developed by the project,
ai4os-hub/<project-name>: modules developed by external users.
Upload your code to that repo.
An admin will review your code and add it to the AI4OS Catalog. Once a module is approved, it will take roughly 6 hours to appear in the Dashboard’s Marketplace.
Next steps
If you want to go further, check our tutorials on how to: