Object detection (Pro)

imgproxy can detect objects in an image and use them for smart cropping, blur the detections, or draw the detections. You can also fetch info about the detected objects.

For object detection, imgproxy uses the YOLO (You Only Look Once) model family. imgproxy supports models in DarkNet or ONNX format. We provide Docker images with a model trained for face detection, but you can use any YOLO model found on the internet or train your own model.

Configuration

tip

You don't need to configure object detection if you're using an imgproxy Pro Docker image with a tag suffixed with -ml and you want to use the face detection model. The model is already included in the image and the configuration is already set up.

info

DarkNet model format has priority over ONNX model format. If you define both, imgproxy will use the DarkNet model.

DarkNet model format

You need to define the following config variables to enable object detection with a DarkNet model:

  • IMGPROXY_OBJECT_DETECTION_CONFIG: a path to the neural network config in DarkNet format
  • IMGPROXY_OBJECT_DETECTION_WEIGHTS: a path to the neural network weights in DarkNet format
  • IMGPROXY_OBJECT_DETECTION_CLASSES: a path to the text file with the class names, one per line
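
For illustration, here's a minimal sketch of such a configuration set via environment variables; the file paths and names are hypothetical and should point at your own model files:

# Hypothetical paths; replace them with the locations of your own files
export IMGPROXY_OBJECT_DETECTION_CONFIG=/opt/models/yolov4-face.cfg
export IMGPROXY_OBJECT_DETECTION_WEIGHTS=/opt/models/yolov4-face.weights
export IMGPROXY_OBJECT_DETECTION_CLASSES=/opt/models/face.names

# The classes file lists one class name per line, e.g. a single-class
# face detection model would use a file containing just the line "face"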

ONNX model format

You need to define the following config variables to enable object detection with an ONNX model:

  • IMGPROXY_OBJECT_DETECTION_NET: a path to the neural network model in ONNX format
  • IMGPROXY_OBJECT_DETECTION_NET_TYPE: the type of the neural network model. Possible values:
    • yolox: (default) YOLOX model

      Export YOLOX to ONNX
      python tools/export_onnx.py \
      -f /path/to/experiment.py \
      -c /path/to/checkpoint.pth \
      --output-name /path/to/output.onnx \
      --decode_in_inference
    • yolov4: YOLOv4 model

      Export YOLOv4 to ONNX
      pip install onnxruntime

      python demo_pytorch2onnx.py <weight_file> <image_path> <batch_size> <n_classes> <input_width> <input_height>

      # Example
      python demo_pytorch2onnx.py yolov4.pth dog.jpg 1 80 416 416
    • yolov5: YOLOv5 model

      Export YOLOv5 to ONNX
      # Export with FP32 precision
      python export.py \
      --weights yolov5s.pt \
      --include onnx \
      --simplify

      # Export with FP16 precision (CUDA-compatible GPU is required)
      python export.py \
      --weights yolov5s.pt \
      --include onnx \
      --simplify \
      --half
    • yolov6: YOLOv6 model

      Export YOLOv6 to ONNX
      # Export with FP32 precision
      python deploy/ONNX/export_onnx.py \
      --weights yolov6s.pt \
      --img 640 \
      --batch 1 \
      --simplify

      # Export with FP16 precision (CUDA-compatible GPU is required)
      python deploy/ONNX/export_onnx.py \
      --weights yolov6s.pt \
      --img 640 \
      --batch 1 \
      --simplify \
      --half
    • yolov7: YOLOv7 model

      Export YOLOv7 to ONNX
      # Export with FP32 precision
      python export.py \
      --weights yolov7-tiny.pt \
      --grid \
      --simplify \
      --img-size 640 640 \
      --max-wh 640

      # Export with FP16 precision (CUDA-compatible GPU is required)
      python export.py \
      --weights yolov7-tiny.pt \
      --grid \
      --simplify \
      --img-size 640 640 \
      --max-wh 640 \
      --fp16
    • yolov8: YOLOv8 model

      Export YOLOv8 to ONNX
      pip install ultralytics

      # Export with FP32 precision
      yolo export \
      model=yolov8n.pt \
      format=onnx \
      simplify=True

      # Export with FP16 precision (CUDA-compatible GPU is required)
      yolo export \
      model=yolov8n.pt \
      format=onnx \
      simplify=True \
      half=True
    • yolov9: YOLOv9 model

      Export YOLOv9 to ONNX
      # Export with FP32 precision
      python export.py \
      --weights yolov9-s.pt \
      --include onnx \
      --simplify

      # Export with FP16 precision (CUDA-compatible GPU is required)
      python export.py \
      --weights yolov9-s.pt \
      --include onnx \
      --simplify \
      --half
    • yolov10: YOLOv10 model

      Export YOLOv10 to ONNX

      Unfortunately, the export script from the original YOLOv10 repository adds NMS and other postprocessing operations to the model and doesn't allow you to disable them. You can apply a patch to the YOLOv10 code to fix this issue:

      curl -Ls https://gist.githubusercontent.com/DarthSim/216551dfd58e5628290e90c1d358704b/raw/27a828a48c84f93e0e70b14923bf697541ebe5a1/yolov10.patch | git apply

      ...and then export the model:

      # Export with FP32 precision
      python export_opencv.py \
      --weights yolov10s.pt \
      --imgsz 640 640

      # Export with FP16 precision (CUDA-compatible GPU is required)
      python export_opencv.py \
      --weights yolov10s.pt \
      --imgsz 640 640 \
      --half
    • yolo-nas: YOLO-NAS model

      Export YOLO-NAS to ONNX
      from super_gradients.training import models
      from super_gradients.common.object_names import Models
      from super_gradients.conversion import DetectionOutputFormatMode
      from super_gradients.conversion.conversion_enums import ExportQuantizationMode

      # Load the model from the SuperGradients model zoo
      model = models.get(
          Models.YOLO_NAS_S,
          pretrained_weights="coco"
      )
      # Or load the model from a checkpoint
      model = models.get(
          Models.YOLO_NAS_S,
          num_classes=80,
          checkpoint_path="neural-yolo_nas_s.pth"
      )

      model.eval()
      model.prep_model_for_conversion(input_size=[1, 3, 640, 640])

      # Disable preprocessing and postprocessing since imgproxy will handle it
      model.export(
          "/content/yolo_nas_s.onnx",
          preprocessing=False,
          postprocessing=False,
          output_predictions_format=DetectionOutputFormatMode.FLAT_FORMAT,
          input_image_shape=[640, 640],
          quantization_mode=ExportQuantizationMode.FP16,
      )
  • IMGPROXY_OBJECT_DETECTION_CLASSES: a path to the text file with the class names, one per line
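
For example, to use a YOLOv8 model exported as shown above, the configuration might look like this (the file paths are hypothetical):

export IMGPROXY_OBJECT_DETECTION_NET=/opt/models/yolov8n.onnx
export IMGPROXY_OBJECT_DETECTION_NET_TYPE=yolov8
export IMGPROXY_OBJECT_DETECTION_CLASSES=/opt/models/coco.names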

Common config options

  • IMGPROXY_OBJECT_DETECTION_NET_SIZE: the size of the neural network input. The input's width and height should be equal, so this config value is a single number. Default: 416
  • IMGPROXY_OBJECT_DETECTION_CONFIDENCE_THRESHOLD: detections with confidence below this value will be discarded. Default: 0.2
  • IMGPROXY_OBJECT_DETECTION_NMS_THRESHOLD: the non-maximum suppression threshold. Don't change this unless you know what you're doing. Default: 0.4
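
As an illustration, a setup that feeds the network a 640×640 input and discards low-confidence detections might look like this (the values are examples, not recommendations):

export IMGPROXY_OBJECT_DETECTION_NET_SIZE=640
export IMGPROXY_OBJECT_DETECTION_CONFIDENCE_THRESHOLD=0.5
export IMGPROXY_OBJECT_DETECTION_NMS_THRESHOLD=0.4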

Usage examples

Object-oriented crop

You can crop your images while keeping objects of the desired classes in the frame:

.../crop:256:256/g:obj:face/...
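
For reference, a complete unsigned URL using these options might look like the following, assuming signature checks are disabled and using placeholder hostnames:

http://imgproxy.example.com/unsafe/crop:256:256/g:obj:face/plain/https://example.com/photo.jpg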

Blurring detections

You can blur objects of the desired classes, making it possible to anonymize images or hide NSFW content:

.../blur_detections:7:face/...
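
The first argument is the blur sigma, followed by one or more class names; for example, assuming your model also detects a hypothetical license_plate class:

.../blur_detections:10:face:license_plate/...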

Draw detections

You can make imgproxy draw bounding boxes for the detected objects of the desired classes (this is handy for testing your models):

.../draw_detections:1:face/...
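
Like any other processing option, it can be combined with the rest of the pipeline; for example, to resize the result as well (the resize parameters here are arbitrary):

.../resize:fit:800:0/draw_detections:1:face/...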

Fetch the detected objects' info

You can fetch the detected objects' info using the /info endpoint:

.../info/detect_objects:1/...
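
For reference, a complete unsigned info URL might look like this, again assuming signature checks are disabled and using placeholder hostnames:

http://imgproxy.example.com/info/unsafe/detect_objects:1/plain/https://example.com/photo.jpg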