Tiling

Boost small-object detection using tiling. Learn four strategies to tile, detect, and merge results effectively in PySDK.

High-resolution scenes with many small objects often benefit from tiling: you split the image into overlapping tiles, run detection per tile, then merge the results. Tiling typically improves small-object recall, but can introduce duplicates near tile borders or reduce large-object accuracy.
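The splitting step can be sketched in a few lines of plain Python. This is a minimal illustration and not the degirum_tools implementation: the helper names tile_grid and to_image_coords are hypothetical, and the sketch assumes the overlap is given as a percentage of the tile size.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def tile_grid(width: int, height: int, cols: int, rows: int,
              overlap_percent: float) -> List[Box]:
    """Return cols * rows overlapping tiles that together cover the image.

    Hypothetical helper: assumes overlap is a fraction of the tile size.
    """
    overlap = overlap_percent / 100.0
    # Choose the tile size so that `cols` tiles, each overlapping its
    # neighbor by `overlap`, span the full width (same for the height).
    tile_w = width / (cols - (cols - 1) * overlap)
    tile_h = height / (rows - (rows - 1) * overlap)
    step_x = tile_w * (1 - overlap)
    step_y = tile_h * (1 - overlap)

    tiles = []
    for r in range(rows):
        for c in range(cols):
            x1 = round(c * step_x)
            y1 = round(r * step_y)
            x2 = min(width, round(c * step_x + tile_w))
            y2 = min(height, round(r * step_y + tile_h))
            tiles.append((x1, y1, x2, y2))
    return tiles


def to_image_coords(box: Box, tile: Box) -> Box:
    """Shift a box from tile-local pixels back into full-image pixels."""
    return (box[0] + tile[0], box[1] + tile[1],
            box[2] + tile[0], box[3] + tile[1])


if __name__ == "__main__":
    for tile in tile_grid(1920, 1080, cols=3, rows=2, overlap_percent=10):
        print(tile)
```

Each detection found inside a tile has to be shifted by the tile origin before results from different tiles can be merged, which is what to_image_coords does.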

degirum_tools provides four ready-made strategies: TileModel, LocalGlobalTileModel, BoxFusionTileModel, and BoxFusionLocalGlobalTileModel.

Think of the four modes as incremental layers:

  • TileModel is the baseline: only tile inference plus optional NMS. Small objects pop, but large ones can fracture or vanish at tile seams.

  • LocalGlobalTileModel adds a global pass. After the tile run, any object whose area exceeds large_object_threshold is replaced with the global detection. It is an error-correction sweep that restores large objects without changing the grid.

  • BoxFusionTileModel keeps tile-only inference, but cleans up seam artifacts by performing a 1-D IoU fusion inside an edge band (edge_threshold). Boxes that overlap across tile borders are merged instead of duplicated.

  • BoxFusionLocalGlobalTileModel combines both upgrades: seam fusion and global rescue. Use it when you need the most faithful merged view—large and small targets, minimal duplicates.
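The local/global idea can be illustrated with a small, library-independent sketch. It assumes that large_object_threshold is a fraction of the full image area and that detections are plain dictionaries with a "bbox" key; both are assumptions made for illustration, and merge_local_global is a hypothetical helper, not a degirum_tools function.

```python
from typing import Dict, List

Detection = Dict[str, object]  # {"bbox": (x1, y1, x2, y2), "label": str, "score": float}


def area_fraction(det: Detection, img_w: int, img_h: int) -> float:
    """Area of the detection box as a fraction of the full image area."""
    x1, y1, x2, y2 = det["bbox"]
    return max(0, x2 - x1) * max(0, y2 - y1) / float(img_w * img_h)


def merge_local_global(tile_dets: List[Detection],
                       global_dets: List[Detection],
                       img_w: int, img_h: int,
                       large_object_threshold: float) -> List[Detection]:
    """Keep small objects from the tile pass and large objects from the
    whole-image pass (assumed rule, for illustration only)."""
    small = [d for d in tile_dets
             if area_fraction(d, img_w, img_h) < large_object_threshold]
    large = [d for d in global_dets
             if area_fraction(d, img_w, img_h) >= large_object_threshold]
    return small + large


if __name__ == "__main__":
    tile_dets = [
        {"bbox": (100, 100, 130, 140), "label": "person", "score": 0.8},
        # A truck fractured by a tile seam: only part of it was detected.
        {"bbox": (600, 300, 686, 700), "label": "truck", "score": 0.5},
    ]
    global_dets = [
        {"bbox": (600, 300, 1200, 700), "label": "truck", "score": 0.9},
    ]
    for det in merge_local_global(tile_dets, global_dets, 1920, 1080, 0.01):
        print(det)
```

In this example the fractured truck fragment from the tile pass is dropped because its area exceeds the threshold, and the complete truck box from the global pass takes its place.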

TileModel highlights each tile and overlays merged detections.

The white square shows the current tile being processed. The final yellow and green boxes are the detections produced by the tiling strategy.

The caption under the GIF indicates the model, tile grid and overlap, mode, and runtime.

Every command and JSON output in the demo includes these thresholds so you can tell at a glance which pipeline produced the result.

Example (ModelSpec + remote_assets)

  • When to tile: Use tiling for crowded scenes, small targets, or very high-res inputs. Expect improved recall on small objects—but always check that large-object accuracy doesn't regress.

  • Grid & overlap: Start with a 3×2 grid and ~10% overlap. More tiles may improve recall but increase compute; too little overlap can cause border splits.

  • Local vs Global: large_object_threshold controls when to trust whole-image detections (helps big objects).

  • Box fusion: Use edge_threshold to mark boxes near tile edges, and fusion_threshold (IoU) to fuse duplicates across tile seams.

  • NMS policy: Tune nms_options to your model and object density. MOST_PROBABLE is a good default.

  • Class filtering: Use output_class_set (as shown) to focus on relevant classes for your application.

  • Video: predict_stream handles capture and looping for files, RTSP, or webcams. You can use it with any of the tile models—they behave like standard models.
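To make the box-fusion tip concrete, here is a rough sketch of how an edge band and a 1-D IoU test could decide whether two boxes from neighboring tiles belong to the same object. It is an assumption-based illustration, not the library's algorithm: the helper names are hypothetical, the edge band is assumed to be a fraction of the tile width, and the values 0.02 and 0.8 are illustrative only.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in full-image pixels


def iou_1d(a1: float, a2: float, b1: float, b2: float) -> float:
    """Intersection over union of two 1-D intervals [a1, a2] and [b1, b2]."""
    inter = max(0.0, min(a2, b2) - max(a1, b1))
    union = (a2 - a1) + (b2 - b1) - inter
    return inter / union if union > 0 else 0.0


def near_right_edge(box: Box, tile: Box, edge_threshold: float) -> bool:
    """True if the box ends within the edge band at the tile's right side."""
    band = edge_threshold * (tile[2] - tile[0])
    return tile[2] - box[2] <= band


def near_left_edge(box: Box, tile: Box, edge_threshold: float) -> bool:
    """True if the box starts within the edge band at the tile's left side."""
    band = edge_threshold * (tile[2] - tile[0])
    return box[0] - tile[0] <= band


def fuse_across_vertical_seam(left_box: Box, left_tile: Box,
                              right_box: Box, right_tile: Box,
                              edge_threshold: float,
                              fusion_threshold: float) -> Optional[Box]:
    """Fuse two boxes that meet at a vertical seam, or return None.

    Both boxes must sit in the edge band of their own tile, and their
    vertical extents must overlap with a 1-D IoU >= fusion_threshold.
    """
    if not (near_right_edge(left_box, left_tile, edge_threshold)
            and near_left_edge(right_box, right_tile, edge_threshold)):
        return None
    if iou_1d(left_box[1], left_box[3], right_box[1], right_box[3]) < fusion_threshold:
        return None
    # The fused box is the smallest box that encloses both fragments.
    return (min(left_box[0], right_box[0]), min(left_box[1], right_box[1]),
            max(left_box[2], right_box[2]), max(left_box[3], right_box[3]))


if __name__ == "__main__":
    left_tile = (0, 0, 686, 568)
    right_tile = (617, 0, 1303, 568)
    fused = fuse_across_vertical_seam(
        (600, 100, 686, 200), left_tile,
        (617, 102, 700, 198), right_tile,
        edge_threshold=0.02, fusion_threshold=0.8)
    print(fused)
```

A real pipeline would also compare class labels and handle horizontal seams the same way along the other axis; both are omitted here to keep the sketch short.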
