Inference config

Supervisely supports a few inference modes. The modes differ from each other in which image area is fed to the NN.

Full image

{
  "gpu_devices": [0],
  "model_classes": {
      "save_classes": ["lemon", "kiwi"],
      "add_suffix": "_u"
  },
  "existing_objects": {
      "save_classes": [],
      "add_suffix": ""
  },
  "mode": {
      "source": "full_image"
  }
}

gpu_devices - device to use for inference. Right now we support only a single GPU.

model_classes - defines which model classes will be used. For example, the NN produces 80 classes but you are going to use only a few of them and ignore the others. In that case set the save_classes field to the list of class names you are interested in. The add_suffix string will be appended to each new class name to prevent clashes with existing class names in the project. If you are going to use all model classes, just set "save_classes": "__all__".

existing_objects - defines which object classes (from the project) will be kept after inference. All original objects will be dropped if save_classes equals [].
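
The interplay between these two sections can be illustrated with a short Python sketch. This is not the Supervisely implementation, just a toy model of the filtering and renaming described above; the object representation is invented for the example.

def filter_and_rename(objects, save_classes, add_suffix):
    # Keep only objects whose class is listed in save_classes
    # ("__all__" keeps everything) and append add_suffix to the name.
    kept = []
    for class_name, geometry in objects:
        if save_classes == "__all__" or class_name in save_classes:
            kept.append((class_name + add_suffix, geometry))
    return kept

# NN predictions: keep only "person" and rename it to "person_unet".
predictions = [("person", "mask_a"), ("car", "mask_b")]
print(filter_and_rename(predictions, ["person"], "_unet"))
# -> [('person_unet', 'mask_a')]

# Existing project objects: save_classes == [] drops all of them.
existing = [("person", "mask_c")]
print(filter_and_rename(existing, [], ""))
# -> []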

mode - contains all mode settings

  • "source": "full_image" - apply NN to full image

Example

For example, suppose you have a project where each person is segmented (class 'person'), and you are going to apply a new model to this project that also segments humans, but you want to keep both sets of annotations: the original ones and the new predictions. Here is the appropriate config:

{
  "gpu_devices": [0],
  "model_classes": {
      "save_classes": ["person"],
      "add_suffix": "_unet"
  },
  "existing_objects": {
      "save_classes": ["person"],
      "add_suffix": "_original"
  },
  "mode": {
      "source": "full_image"
  }
}

All original annotations will get the class person_original, and all new annotations produced by NN inference will get the class person_unet.

Objects

You can apply the NN to image areas defined by object bounding boxes. Instead of applying the NN to the whole image, it will be applied to a few image parts. This may be very useful when you are going to apply a few NNs in a sequence.

For example, you are going to segment person instances, but you don't want to use Mask-RCNN (due to low segmentation quality near object edges). Instead you apply Faster-RCNN to detect all persons, and then apply your custom UNet model to segment each person.

{
  "gpu_devices": [0],
  "model_classes": {
      "save_classes": ["person"],
      "add_suffix": "_unet"
  },
  "existing_objects": {
      "save_classes": [],
      "add_suffix": ""
  },
  "mode": {
      "source": "bboxes",
      "from_classes": ["person_bbox"],
      "padding": {
            "left": "5%",
            "top": "5%",
            "right": "5%",
            "bottom": "5%"
        },
        "save": true,
        "add_suffix": "_input_bbox"
  }
}

Many of the fields were already described in the "Full image" chapter. Here is an explanation of the new ones.

mode - contains all mode settings

  • "source": "bboxes" - apply NN to bounding boxes of specified objects

  • from_classes - list of classes. All objects of these classes will be used for inference: 1. get the object bounding box, 2. apply padding to the bounding box (slightly increase its size), 3. feed the image area defined by the bounding box to the NN.

  • padding - how much to increase the input bounding box on each side. Possible values look like "5%" or "15px" (see the sketch after this list).

  • save - save the input bounding box as an object after inference if set to true

  • add_suffix - suffix for the class name of the saved input bounding box
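
Here is a minimal Python sketch of how a padding value could be applied to a bounding box. It assumes that "%" is relative to the corresponding side of the bounding box and that the padded box is clipped to the image borders; the helper names are invented for the example, this is not the Supervisely source.

def parse_padding(value, side_length):
    # Convert "5%" (of the given side length) or "15px" to pixels.
    if value.endswith("%"):
        return round(side_length * float(value[:-1]) / 100)
    if value.endswith("px"):
        return int(value[:-2])
    raise ValueError("padding must look like '5%' or '15px'")

def pad_bbox(left, top, right, bottom, padding, img_w, img_h):
    # Grow the box on each side, then clip it to the image borders.
    w, h = right - left, bottom - top
    new_left = max(0, left - parse_padding(padding["left"], w))
    new_right = min(img_w, right + parse_padding(padding["right"], w))
    new_top = max(0, top - parse_padding(padding["top"], h))
    new_bottom = min(img_h, bottom + parse_padding(padding["bottom"], h))
    return new_left, new_top, new_right, new_bottom

padding = {"left": "5%", "top": "5%", "right": "5%", "bottom": "5%"}
print(pad_bbox(100, 200, 300, 400, padding, img_w=1920, img_h=1080))
# -> (90, 190, 310, 410)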

ROI

This mode allows using only a part of the input image.

{
  "gpu_devices": [0],
  "model_classes": {
      "save_classes": "__all__",
      "add_suffix": "_unet"
  },
  "existing_objects": {
      "save_classes": [],
      "add_suffix": ""
  },
  "mode": {
      "source": "roi",
      "bounds": {
            "left": "10%",
            "top": "50%",
            "right": "25%",
            "bottom": "0%"
        },
        "save": false,
        "class_name": "inference_roi"
  }
}

Many of the fields were already described in the "Full image" chapter. Here is an explanation of the new ones.

mode - contains all mode settings

  • "source": "roi" - apply NN to image part defined by bounds

  • bounds - defines the part of the input image that will be used for inference; each value is an offset from the corresponding image edge (see the sketch after this list).

  • save - save the inference ROI rectangle as an object after inference if set to true

  • class_name - class name for the saved ROI rectangle
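
Here is a minimal Python sketch of one plausible interpretation of bounds: each value is an inset from the corresponding image edge, given in percent of the image size or in pixels. This is an assumption for illustration, and the helper names are invented for the example.

def parse_offset(value, side_length):
    # Convert "10%" (of the given image side) or "40px" to pixels.
    if value.endswith("%"):
        return round(side_length * float(value[:-1]) / 100)
    if value.endswith("px"):
        return int(value[:-2])
    raise ValueError("bound must look like '10%' or '40px'")

def roi_from_bounds(bounds, img_w, img_h):
    # Inset the image rectangle by the given amount on every side.
    left = parse_offset(bounds["left"], img_w)
    top = parse_offset(bounds["top"], img_h)
    right = img_w - parse_offset(bounds["right"], img_w)
    bottom = img_h - parse_offset(bounds["bottom"], img_h)
    return left, top, right, bottom

bounds = {"left": "10%", "top": "50%", "right": "25%", "bottom": "0%"}
print(roi_from_bounds(bounds, img_w=1000, img_h=800))
# -> (100, 400, 750, 800): the lower-middle part of the image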

Sliding window

This mode allows performing inference on (possibly overlapping) parts of an image. It is useful for getting inference results on large or relatively large images without splitting and stitching them manually.

A sliding window of fixed size moves over the source image and covers it entirely. At each position the model is applied to the part of the image bounded by the window. After the whole image has been processed, the results from different positions are aggregated to get the final prediction.

Currently we support this mode for image segmentation models only. Segmentation results are aggregated by averaging the output probabilities.

{
  "gpu_devices": [
    0
  ],
  "model_classes": {
    "save_classes": "__all__",
    "add_suffix": "_unet"
  },
  "existing_objects": {
    "save_classes": [],
    "add_suffix": ""
  },
  "mode": {
    "source": "sliding_window",
    "window": {
      "width": 1000,
      "height": 750
    },
    "min_overlap": {
      "x": 600,
      "y": 600
    },
    "save": true
  }
}

mode section contains the mode settings.

  • "source": "sliding_window" selects the sliding window mode

  • window defines the fixed size of the sliding window in pixels; a crop of this size will be passed to the network (note that, depending on the NN implementation, it may then be resized to the fixed network input size)

  • min_overlap defines the minimal overlap in pixels between adjacent positions of the sliding window; the actual overlap may be greater so that the source image is covered entirely (see the sketch after this list)

  • save - whether to save the bounds of each sliding window position as objects

  • class_name - class name for the saved sliding window bounds
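
Here is a minimal Python sketch of the whole procedure: window start positions are spread so that neighbouring windows share at least min_overlap pixels, and per-pixel probabilities are averaged over all windows that cover the pixel. This is an illustration of the idea under these assumptions, not the Supervisely implementation; the model here is a stand-in.

import numpy as np

def window_starts(img_side, win_side, min_overlap):
    # 1-D start positions: fixed window size, at least min_overlap
    # pixels shared between neighbours, full coverage of the image.
    # Assumes min_overlap < win_side.
    if win_side >= img_side:
        return [0]
    max_stride = win_side - min_overlap
    n = int(np.ceil((img_side - win_side) / max_stride)) + 1
    # Spread n windows evenly; the actual overlap may exceed min_overlap.
    return [round(i * (img_side - win_side) / (n - 1)) for i in range(n)]

def sliding_window_inference(image, model, win_w, win_h, ovl_x, ovl_y, n_classes):
    h, w = image.shape[:2]
    prob_sum = np.zeros((h, w, n_classes))
    counts = np.zeros((h, w, 1))
    for y in window_starts(h, win_h, ovl_y):
        for x in window_starts(w, win_w, ovl_x):
            crop = image[y:y + win_h, x:x + win_w]
            probs = model(crop)  # (win_h, win_w, n_classes) probabilities
            prob_sum[y:y + win_h, x:x + win_w] += probs
            counts[y:y + win_h, x:x + win_w] += 1
    return prob_sum / counts  # average over all windows covering a pixel

# Toy usage matching the config above: a stand-in "model" that returns
# uniform class probabilities.
image = np.zeros((1500, 2000, 3))
model = lambda crop: np.full(crop.shape[:2] + (2,), 0.5)
result = sliding_window_inference(image, model, 1000, 750, 600, 600, n_classes=2)
print(result.shape)  # -> (1500, 2000, 2)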