Object detection pro
imgproxy can detect objects in an image and use them for smart cropping, blurring the detections, or drawing the detections. You can also fetch information about the detected objects.
For object detection, imgproxy uses the YOLO (You Only Look Once) model family. imgproxy supports models in DarkNet or ONNX format. We provide Docker images with a model trained for face detection, but you can use any YOLO model found on the internet or train your own model.
Configuration
You don't need to configure object detection if you're using an imgproxy Pro Docker image with a tag suffixed with -ml and you want to use the face detection model. The model is already included in the image and the configuration is already set up.
DarkNet model format has priority over ONNX model format. If you define both, imgproxy will use the DarkNet model.
DarkNet model format
You need to define the following config variables to enable object detection with a DarkNet model:
IMGPROXY_OBJECT_DETECTION_CONFIG: a path to the neural network config in DarkNet format
IMGPROXY_OBJECT_DETECTION_WEIGHTS: a path to the neural network weights in DarkNet format
IMGPROXY_OBJECT_DETECTION_CLASSES: a path to the class names file
ONNX model format
You need to define the following config variables to enable object detection with an ONNX model:
IMGPROXY_OBJECT_DETECTION_NET: a path to the neural network model in ONNX format
IMGPROXY_OBJECT_DETECTION_NET_TYPE: the type of the neural network model. Possible values:
- yolox: (default) YOLOX model
Export YOLOX to ONNX
python tools/export_onnx.py \
-f /path/to/experiment.py \
-c /path/to/checkpoint.pth \
--output-name /path/to/output.onnx \
--decode_in_inference
- yolov4: YOLOv4 model
Export YOLOv4 to ONNX
pip install onnxruntime
python demo_pytorch2onnx.py <weight_file> <image_path> <batch_size> <n_classes> <input_width> <input_height>
# Example
python demo_pytorch2onnx.py yolov4.pth dog.jpg 1 80 416 416
- yolov5: YOLOv5 model
Export YOLOv5 to ONNX
# Export with FP32 precision
python export.py \
--weights yolov5s.pt \
--include onnx \
--simplify
# Export with FP16 precision (CUDA-compatible GPU is required)
python export.py \
--weights yolov5s.pt \
--include onnx \
--simplify \
--half
- yolov6: YOLOv6 model
Export YOLOv6 to ONNX
# Export with FP32 precision
python deploy/ONNX/export_onnx.py \
--weights yolov6s.pt \
--img 640 \
--batch 1 \
--simplify
# Export with FP16 precision (CUDA-compatible GPU is required)
python deploy/ONNX/export_onnx.py \
--weights yolov6s.pt \
--img 640 \
--batch 1 \
--simplify \
--half
- yolov7: YOLOv7 model
Export YOLOv7 to ONNX
# Export with FP32 precision
python export.py \
--weights yolov7-tiny.pt \
--grid \
--simplify \
--img-size 640 640 \
--max-wh 640
# Export with FP16 precision (CUDA-compatible GPU is required)
python export.py \
--weights yolov7-tiny.pt \
--grid \
--simplify \
--img-size 640 640 \
--max-wh 640 \
--fp16
- yolov8: YOLOv8 model
Export YOLOv8 to ONNX
pip install ultralytics
# Export with FP32 precision
yolo export \
model=yolov8n.pt \
format=onnx \
simplify=True
# Export with FP16 precision using CUDA-compatible GPU
yolo export \
model=yolov8n.pt \
format=onnx \
simplify=True \
half=True \
device=0
# Export with FP16 precision using Apple Silicon GPU
yolo export \
model=yolov8n.pt \
format=onnx \
simplify=True \
half=True \
device=mps
- yolov9: YOLOv9 model
Export YOLOv9 to ONNX
# Export with FP32 precision
python export.py \
--weights yolov9-s.pt \
--include onnx \
--simplify
# Export with FP16 precision (CUDA-compatible GPU is required)
python export.py \
--weights yolov9-s.pt \
--include onnx \
--simplify \
--half
- yolov10: YOLOv10 model
Export YOLOv10 to ONNX
Unfortunately, the export script from the original YOLOv10 repository adds NMS and other postprocessing operations to the model and doesn't provide a way to disable them. You can apply a patch to the YOLOv10 code to fix this issue:
curl -Ls https://gist.githubusercontent.com/DarthSim/216551dfd58e5628290e90c1d358704b/raw/27a828a48c84f93e0e70b14923bf697541ebe5a1/yolov10.patch | git apply
...and then export the model:
# Export with FP32 precision
python export_opencv.py \
--weights yolov10s.pt \
--imgsz 640 640
# Export with FP16 precision (CUDA-compatible GPU is required)
python export_opencv.py \
--weights yolov10s.pt \
--imgsz 640 640 \
--half
- yolo-nas: YOLO-NAS model
Export YOLO-NAS to ONNX
from super_gradients.training import models
from super_gradients.common.object_names import Models
from super_gradients.conversion import DetectionOutputFormatMode
from super_gradients.conversion.conversion_enums import ExportQuantizationMode

# Load the model from the SuperGradients model zoo
model = models.get(
    Models.YOLO_NAS_S,
    pretrained_weights="coco"
)

# Or load the model from a checkpoint
model = models.get(
    Models.YOLO_NAS_S,
    num_classes=80,
    checkpoint_path="neural-yolo_nas_s.pth"
)

model.eval()
model.prep_model_for_conversion(input_size=[1, 3, 640, 640])

# Disable preprocessing and postprocessing since imgproxy will handle it
model.export(
    "/content/yolo_nas_s.onnx",
    preprocessing=False,
    postprocessing=False,
    output_predictions_format=DetectionOutputFormatMode.FLAT_FORMAT,
    input_image_shape=[640, 640],
    quantization_mode=ExportQuantizationMode.FP16,
)
IMGPROXY_OBJECT_DETECTION_CLASSES: a path to the class names file
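The priority rule between the two formats can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not imgproxy's code: the function name select_model_format is hypothetical, and treating a format as "defined" when its model path variables are non-empty is an assumption.

```python
def select_model_format(env):
    """Return "darknet", "onnx", or None for a mapping of config variables."""
    darknet_vars = (
        "IMGPROXY_OBJECT_DETECTION_CONFIG",
        "IMGPROXY_OBJECT_DETECTION_WEIGHTS",
    )
    # Assumption: a format counts as defined when its path variables are non-empty.
    if all(env.get(name) for name in darknet_vars):
        # DarkNet has priority over ONNX when both are defined.
        return "darknet"
    if env.get("IMGPROXY_OBJECT_DETECTION_NET"):
        return "onnx"
    return None
```

Calling select_model_format(os.environ) before starting the container is one way to check which model a given environment would be expected to select.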
Common config options
IMGPROXY_OBJECT_DETECTION_NET_SIZE: the size of the neural network input. The input width and height must be equal, so this config value should be a single number. Default: 416
IMGPROXY_OBJECT_DETECTION_CONFIDENCE_THRESHOLD: detections with confidence below this value will be discarded. Default: 0.2
IMGPROXY_OBJECT_DETECTION_NMS_THRESHOLD: non-max suppression threshold. Don't change this if you don't know what you're doing. Default: 0.4
IMGPROXY_OBJECT_DETECTION_SWAP_RB: when set to true, imgproxy will swap the R and B channels in the input image. Some models are trained on BGR images and perform incorrectly with RGB inputs. This option allows you to fix this issue. Default: false
IMGPROXY_OBJECT_DETECTION_FALLBACK_TO_SMART_CROP: (pro) defines imgproxy's behavior when object-oriented crop gravity is used but no objects are detected. When set to true, imgproxy will fall back to smart crop. When set to false, imgproxy will fall back to the center gravity. Default: true
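To give a feel for what the confidence and NMS thresholds control, here is a minimal pure-Python sketch of confidence filtering followed by greedy non-max suppression. It is a generic, class-agnostic textbook version and not imgproxy's implementation; the function names and the (left, top, right, bottom) box layout are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, confidence_threshold=0.2, nms_threshold=0.4):
    """Filter a list of (box, confidence) tuples."""
    # Step 1: discard detections with confidence below the threshold.
    candidates = [d for d in detections if d[1] >= confidence_threshold]
    # Step 2: greedy NMS, visiting the most confident detections first.
    candidates.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, confidence in candidates:
        # Keep a box only if it overlaps no already kept box by more than nms_threshold.
        if all(iou(box, other) <= nms_threshold for other, _ in kept):
            kept.append((box, confidence))
    return kept
```

In this sketch, a lower NMS threshold removes more overlapping boxes, while a higher confidence threshold removes more weak detections.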
Class names file
The class names file is used to map the class indexes from the neural network output to human-readable class names. The path to the class names file should be defined in the IMGPROXY_OBJECT_DETECTION_CLASSES config variable.
The class names file should contain one class name per line. The class names should be in the same order as the classes in the neural network output. Example:
person
dog
cat
By default, during the object-oriented crop, all the detected objects have the default weight of 1. You can change the weight of the detected objects by adding the =%weight suffix to the object class name. Example:
person=2
dog
cat=3
In this example, the detected person and cat objects will have the weight of 2 and 3 respectively. The dog object will have the default weight of 1.
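The file format described above is simple enough to parse in a few lines, which can be handy for validating a class names file before deploying it. The following is a sketch only: parse_class_names is a hypothetical helper, and skipping blank lines and parsing weights as floats are assumptions rather than documented imgproxy behavior.

```python
def parse_class_names(text, default_weight=1.0):
    """Return a list of (class_name, weight) tuples in file order."""
    classes = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        # Split an optional "=%weight" suffix off the class name.
        name, sep, weight = line.partition("=")
        classes.append((name, float(weight) if sep else default_weight))
    return classes
```

The position of each tuple in the returned list corresponds to the class index in the neural network output.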
Usage examples
Object-oriented crop
You can crop your images and keep objects of desired classes in frame:
.../crop:256:256/gravity:obj:face:cat:dog/...
Also, you can use the objw gravity type to redefine the weights of the detected objects:
.../crop:256:256/gravity:objw:face:2:cat:3:dog:4/...
The all pseudo-class matches all the detected objects. You can use it to set the weight for all the detected objects:
.../crop:256:256/gravity:objw:all:2:face:10/...
In this example, all the detected objects will have the weight of 2, except for the face objects, which will have the weight of 10.
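If you generate imgproxy URLs programmatically, the gravity options shown above can be assembled with small helpers like the ones below. The helper names are hypothetical, and effective_weight only mirrors the rule from the example above (a class-specific weight wins over the all pseudo-class). It returns None when neither is given, because this sketch makes no claim about that case.

```python
def obj_gravity(*class_names):
    """Build a gravity:obj option for the given class names."""
    return ":".join(("gravity", "obj") + class_names)


def objw_gravity(weights):
    """Build a gravity:objw option from a {class_name: weight} mapping."""
    parts = ["gravity", "objw"]
    for name, weight in weights.items():
        parts += [name, str(weight)]
    return ":".join(parts)


def effective_weight(class_name, weights):
    """Weight for a class: its own entry first, then the "all" pseudo-class."""
    return weights.get(class_name, weights.get("all"))
```

For example, objw_gravity({"all": 2, "face": 10}) produces the gravity:objw:all:2:face:10 option used above.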
Blurring detections
You can blur objects of the desired classes, which makes it possible to anonymize images or hide NSFW content:
.../blur_detections:7:face/...
Draw detections
You can make imgproxy draw bounding boxes for the detected objects of the desired classes (this is handy for testing your models):
.../draw_detections:1:face/...
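The blurring and drawing options follow the same colon-separated pattern, so they can be built the same way. This sketch is based only on the two examples above: the helper names are hypothetical, calling the first blur_detections argument sigma is an assumption about its meaning, and draw_detections always emits 1 because that is the only value shown here.

```python
def blur_detections(sigma, *class_names):
    """Build a blur_detections option; "sigma" is an assumed name for the first argument."""
    return ":".join(["blur_detections", str(sigma), *class_names])


def draw_detections(*class_names):
    """Build a draw_detections option that enables drawing for the given classes."""
    return ":".join(["draw_detections", "1", *class_names])
```

For example, blur_detections(7, "face") yields blur_detections:7:face, matching the option in the blurring example.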
Fetch the detected objects' info
You can fetch the detected objects' info using the /info endpoint:
.../info/detect_objects:1/...