Better analyze information, in all its forms

Implement your own Datashare tasks, written in Python¶

Most AI, Machine Learning, Data Engineering happens in Python. Datashare now lets you extend its backend with your own tasks implemented in Python.

Turning your own ML pipelines into Datashare tasks is very simple.

Actually, it's almost as simple as cloning our template repo:

git clone git@github.com:ICIJ/datashare-python.git

replacing existing app tasks with your own:

from icij_worker import AsyncApp

app = AsyncApp("some-app")


@app.task
def hello_world() -> str:
    return "Hello world"

installing uv to set up dependencies and running your async Datashare worker:

cd datashare-pythoncurl -LsSf https://astral.sh/uv/install.sh | shuv run ./scripts/worker_entrypoint.sh[INFO][icij_worker.backend.backend]: Loading worker configuration from env...
...
}
[INFO][icij_worker.backend.mp]: starting 1 worker for app datashare_python.app.app
...

you'll then be able to execute task by starting using our HTTP client (and soon using Datashare's UI).

Learn¶

Learn how to integrate Data Processing and Machine Learning pipelines to Datashare following our tutorial.

Get started¶

Follow our get started guide an learn how to clone the template repository and implement your own Datashare tasks !

Refine your knowledge¶

Follow our guides to learn how to implement complex tasks and deploy Datashare workers running your own tasks.