Data Collection and Annotation as a Service (DCaaS)

DCaaS is a microservice for collecting and annotating input video data. The collected or annotated data can then be used for AI model training, fine-tuning, and statistical analysis.

It supports three design patterns that cover different deployment scenarios:

| Design Pattern | Remarks |
| --- | --- |
| Simple Data Collection | Data ingest from source and save in local database |
| Simple Data Collection with Human Annotation | Data ingest from source and data annotation by a user along with review |
| Simple Data Collection with Auto-Annotation | Data ingest from source and data annotation by an (AI) algorithm |

The figure below outlines the architecture of DCaaS. Please click here for more details.

How to run DCaaS

Basic pre-requisites

  • Git

  • OS and Python

    | OS | Python |
    | --- | --- |
    | Ubuntu 20.04 | 3.8 |
    | Ubuntu 22.04 | 3.10 |
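As a quick sanity check of the prerequisites above, the following minimal sketch compares the running interpreter against the Python versions listed for the supported Ubuntu releases. The `SUPPORTED_PYTHON` mapping and the `is_supported_python` helper are illustrative names, not part of DCaaS, and the check assumes that only the major.minor version matters.

```python
import sys

# Python versions listed in the prerequisites, keyed by Ubuntu release.
SUPPORTED_PYTHON = {
    "20.04": (3, 8),
    "22.04": (3, 10),
}


def is_supported_python(version=None):
    """Return True if the major.minor version is listed in the prerequisites."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) in SUPPORTED_PYTHON.values()


if __name__ == "__main__":
    found = "{}.{}".format(*sys.version_info[:2])
    status = "supported" if is_supported_python() else "not listed as supported"
    print(f"Python {found} is {status}")
```

The helper accepts an explicit version tuple so that it can be exercised without changing interpreters, for example `is_supported_python((3, 9))`.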

Get Code base from GitHub

Note: See Setup for Development for details.

  1. Install the repo tool.

    curl https://storage.googleapis.com/git-repo-downloads/repo > repo
    sudo mv repo /bin/repo
    sudo chmod a+x /bin/repo

  2. Create a working directory and initialize the eii-manifest repo. For more details, read here.

    mkdir -p [WORKDIR]
    cd [WORKDIR]
    repo init -u "https://github.com/intel-innersource/applications.industrial.edge-insights.manifests.git" -b main
    repo sync

  3. Clone the DCaaS repository.

    cd IEdgeInsights
    git clone https://github.com/intel-innersource/frameworks.ai.edgecsp.data-collection-as-a-service DataCollectionMicroservice
    cd DataCollectionMicroservice
    git checkout develop

Install pre-requisites and proxy setup

  1. Install the Docker utilities and Python packages required during first installation.

    cd [WORKDIR]/IEdgeInsights/build
    sudo -E ./pre_requisites.sh
    ## run with proxy if needed
    # sudo -E ./pre_requisites.sh --proxy="<Proxy-DNS>:<port>"

  2. Create the proxy configuration for Docker. Create [HOMEDIR]/.docker/config.json and add the following with the appropriate <Proxy-DNS>:<port>:

    ```json
    {
      "proxies": {
        "default": {
          "httpProxy": "http://<Proxy-DNS>:<port>",
          "httpsProxy": "http://<Proxy-DNS>:<port>",
          "noProxy": "intel.com,*.intel.com,10.0.0.0/8,192.168.0.0/16,172.16.0.0/12"
        }
      }
    }
    ```
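The Docker proxy configuration from this section can also be generated rather than typed by hand. The sketch below builds the same JSON structure from a `<Proxy-DNS>:<port>` string. The `docker_proxy_config` helper and the `proxy.example.com:8080` address are hypothetical and used only for illustration; the sketch prints the JSON instead of writing the file, so any existing settings in [HOMEDIR]/.docker/config.json have to be merged by hand.

```python
import json


def docker_proxy_config(proxy, no_proxy):
    """Build the Docker client proxy settings for a '<Proxy-DNS>:<port>' string."""
    url = f"http://{proxy}"
    return {
        "proxies": {
            "default": {
                "httpProxy": url,
                "httpsProxy": url,
                # Docker expects a single comma-separated string here.
                "noProxy": ",".join(no_proxy),
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical proxy address; replace with your own <Proxy-DNS>:<port>.
    config = docker_proxy_config(
        "proxy.example.com:8080",
        ["intel.com", "*.intel.com", "10.0.0.0/8", "192.168.0.0/16", "172.16.0.0/12"],
    )
    # Print the JSON to paste into [HOMEDIR]/.docker/config.json.
    print(json.dumps(config, indent=2))
```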

Configure, Build and Run the Services

  1. Edit the .env file and add entries for the following:

    # DEB packages source location
    PKG_SRC=http://eii-nightly-devops.iind.intel.com/latest
    # Host ip address to be updated here
    HOST_IP=
    ETCD_HOST=
    # Service credentials
    ETCDROOT_PASSWORD=
    INFLUXDB_USERNAME=
    INFLUXDB_PASSWORD=
    MINIO_ACCESS_KEY=
    MINIO_SECRET_KEY=
    # These are required to be updated in PROD mode only
    WEBVISUALIZER_USERNAME=
    WEBVISUALIZER_PASSWORD=
    # Path where remote backup will happen
    DCAAS_STORAGE_DIR=/path/to/persistent/remote/storage
    # set the below variable if DEV_MODE is set to `true` in the current .env file
    DCAAS_UDF_DIR=/udfs

    Edit the .env file further as mentioned here.

    Add "DataCollectionMicroservice": "" to the "subscriber_list" in [WORKDIR]/IEdgeInsights/build/builder_config.json

    Note: Set DCAAS_STORAGE_DIR to the directory where you wish to remotely save the annotations and images (e.g. /path/to/persistent/remote/storage) and run the command below. Azure users need additional setup before they can upload data to Azure. Refer to Remote Storage for more details.

    sudo chmod 777 /path/to/persistent/remote/storage

  2. Run the builder script to generate deployment and configuration files. To learn more, click here

    (venv) python3 builder.py -f ../DataCollectionMicroservice/artifacts/sample-usecase.yml
    
  3. Build the OEI usecase. To learn more, click here.

    (venv) docker compose -f docker-compose.yml build

  4. Start License Manager

    The license manager agent must be started before DCaaS or any other service. DCaaS performs a license check at startup and periodically at runtime. The license manager agent allows DCaaS to check whether the user has a valid entitlement for usage. If this check fails, DCaaS exits gracefully.

    For first-time setup, please see the wiki.

    If you have already done the first-time setup, open a different terminal and navigate to the directory where the LM_SCP_Start.sh script is located (ideally the directory extracted during the first-time setup), then run:

    ./LM_SCP_Start.sh

  5. Start DCaaS and other services.

    (venv) ./run.sh

    Once the above script runs, you should see the prints below. To check the logs of the running DCaaS container and for more information, please refer to Viewing logs.

  6. Stop the services.

    (venv) docker compose down -v
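Because several .env entries from step 1 are shipped empty, it can help to confirm they are filled in before running the builder script. The following is a minimal sketch of such a check, assuming the .env file contains only simple `KEY=VALUE` lines and `#` comments. The `parse_env` and `missing_keys` helpers are illustrative and not part of DCaaS, the sample values are placeholders, and the PROD-only WEBVISUALIZER entries are deliberately left out of the required list.

```python
# Keys from step 1 that must have a non-empty value (PROD-only keys omitted).
REQUIRED_KEYS = [
    "HOST_IP",
    "ETCD_HOST",
    "ETCDROOT_PASSWORD",
    "INFLUXDB_USERNAME",
    "INFLUXDB_PASSWORD",
    "MINIO_ACCESS_KEY",
    "MINIO_SECRET_KEY",
    "DCAAS_STORAGE_DIR",
]


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in the .env text."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


# Placeholder .env content for illustration only.
sample = """
# Host ip address to be updated here
HOST_IP=192.0.2.10
ETCD_HOST=
"""
print(missing_keys(sample))
```

To check a real file, pass its contents instead of `sample`, for example `missing_keys(open(".env").read())` from the directory that holds the .env file.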

Using DCaaS in other modes

  • To run DCaaS in “Auto Annotation Mode” follow instructions here

  • To run DCaaS in “Simple Storage Mode” (stores the frames from “data_filter” in DataStore)

    • Remove "DataCollectionMicroservice": "" from the "subscriber_list" in [WORKDIR]/IEdgeInsights/build/builder_config.json if it already exists

    • Edit the default config.json as below

      • enabled

        • Ensure that you have set "enabled": true for “simple_storage”

        • Ensure that you have set "enabled": false for both “auto_annotation” and “human_annotation”
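The Simple Storage Mode edits above can be sketched as two small operations on the parsed JSON. This is a hedged illustration only: it assumes that "subscriber_list" is a top-level object in builder_config.json and that the "simple_storage", "auto_annotation", and "human_annotation" sections sit directly in the dictionary passed in, which may not match the actual nesting of config.json. The `remove_subscriber` and `enable_simple_storage` helpers are hypothetical names.

```python
import json


def remove_subscriber(builder_config, name="DataCollectionMicroservice"):
    """Drop `name` from "subscriber_list" if present; return True if removed."""
    # Assumption: "subscriber_list" is a top-level JSON object.
    subscribers = builder_config.get("subscriber_list", {})
    return subscribers.pop(name, None) is not None


def enable_simple_storage(config):
    """Enable "simple_storage" and disable both annotation modes in place."""
    # Assumption: the three mode sections live directly under `config`.
    for section, enabled in (
        ("simple_storage", True),
        ("auto_annotation", False),
        ("human_annotation", False),
    ):
        config.setdefault(section, {})["enabled"] = enabled
    return config


if __name__ == "__main__":
    # In-memory stand-ins for builder_config.json and config.json.
    builder_config = {"subscriber_list": {"DataCollectionMicroservice": ""}}
    remove_subscriber(builder_config)
    config = enable_simple_storage({"human_annotation": {"enabled": True}})
    print(json.dumps(builder_config), json.dumps(config))
```

Both helpers work on dictionaries rather than files, so they can be tried on a copy of the configuration before anything is written back to disk.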

License

License details can be found in License.pdf

More Details

Please refer to USAGE.md for more details.

Troubleshooting

If you run into any issues, please refer to TROUBLESHOOTING.md.