dgs.utils.image.CustomToAspect
- class dgs.utils.image.CustomToAspect(*args: Any, **kwargs: Any)[source]
Custom torchvision Transform that modifies the image, bboxes, and coordinates simultaneously to match a target aspect ratio.
Notes
It is expected that Resize() is called after this transform, so that the image matches not only the target aspect ratio but also the target overall size.
This transform's default mode is zero-padding.
The following modes are available for resizing:
- distort
Skips CustomToAspect entirely, leaving the original aspect ratio unchanged. This will result in a distorted image when Resize() is applied, iff the old and new aspect ratios aren't close.
- edge-pad
Uses Pad() to extend the image to the correct aspect ratio. The value used for padding_mode of Pad() is edge.
- inside-crop
Uses the target aspect ratio to extract a sub-image out of the original. It is essentially a center crop with one dimension kept as large as possible while maintaining the target aspect ratio.
- outside-crop
Only available for the CustomCrop() model, but is passed through by this transform. Instead of cropping at the exact bounding box, the aspect ratio is matched by widening one of the bounding-box dimensions.
- fill-pad
Uses Pad() to extend the image to the correct aspect ratio. The value used for padding_mode of Pad() is constant and the fill value has to be provided within the kwargs.
- mean-pad
Uses Pad() to extend the image to the correct aspect ratio. The value used for padding_mode of Pad() is constant with a fill value as the RGB mean of the image.
- reflect-pad
Uses Pad() to extend the image to the correct aspect ratio. The value used for padding_mode of Pad() is reflect.
- symmetric-pad
Uses Pad() to extend the image to the correct aspect ratio. The value used for padding_mode of Pad() is symmetric.
- zero-pad
Uses Pad() to extend the image to the correct aspect ratio. The value used for padding_mode of Pad() is constant with a value of zero.
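The geometry behind the pad and crop modes can be sketched in plain Python. This is an illustrative helper, not the library's internal implementation; the names `pad_to_aspect` and `crop_to_aspect` are hypothetical:

```python
def pad_to_aspect(h: int, w: int, target_w_over_h: float) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) padding that brings w/h to the target ratio,
    as the *-pad modes do (they differ only in how the padded pixels are filled)."""
    if w / h < target_w_over_h:  # image too narrow: pad left/right
        extra = round(h * target_w_over_h) - w
        return extra // 2, 0, extra - extra // 2, 0
    # image too wide (or already matching): pad top/bottom
    extra = round(w / target_w_over_h) - h
    return 0, extra // 2, 0, extra - extra // 2


def crop_to_aspect(h: int, w: int, target_w_over_h: float) -> tuple[int, int]:
    """Return (crop_h, crop_w) of the largest centered sub-image with the target
    ratio, as in the inside-crop mode."""
    if w / h > target_w_over_h:  # image too wide: shrink width
        return h, round(h * target_w_over_h)
    return round(w / target_w_over_h), w


# A 1000 x 500 image padded to a square gets 250 px on top and bottom.
print(pad_to_aspect(500, 1000, 1.0))   # (0, 250, 0, 250)
# The same image center-cropped to a square keeps a 500 x 500 region.
print(crop_to_aspect(500, 1000, 1.0))  # (500, 500)
```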
Methods
- __init__(*args: Any, **kwargs: Any) → None
- forward(*args, **kwargs) → dict[str, any] [source]
Modify the image, bboxes, and coordinates to have a given aspect ratio (shape).
Use this module within Compose and pass a structured dict as the argument; forward() then receives that dictionary as its first (and most likely only) argument.
- Keyword Arguments:
  - image – One single image as tv_tensors.Image of shape [B x C x H x W].
  - box – Zero, one, or multiple bounding boxes per image. With N detections and a batch size of B, the bounding boxes have a shape of [B*N x 4]. Keep in mind that the bboxes have to be a two-dimensional tensor, because every image in this batch can have a different number of detections. The ordering of the bounding boxes stays the same.
  - keypoints – Joint coordinates as key points, given in relation to the original image. With N detections per image and a batch size of B, the coordinates have a maximum shape of [B*N x J x 2|3]. Either batch and detections are stacked in one dimension, because every image in this batch can have a different number of detections, or there is no batch dimension at all. The ordering of the coordinates stays the same.
  - output_size – (h, w) as the target height and width of the image.
  - mode – See the class description.
  - aspect_round_decimals – (int, optional) Round the aspect ratios to this number of decimals before comparing them. Default: DEF_VAL.images.aspect_round_decimals.
  - fill – (Union[int, float, List[float]], optional) See the fill parameter of torchvision.transforms.v2.Pad(). Only applicable if mode is 'fill-pad'; in that case, fill has to be set and is no longer optional / ignored.
- Returns:
Structured dict with the updated and overwritten image(s), bboxes, and coordinates. All additional input values are passed through unchanged.
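The aspect_round_decimals comparison can be illustrated with a short sketch. The helper name is hypothetical, and the default of 4 decimals is an assumption standing in for DEF_VAL.images.aspect_round_decimals:

```python
def aspects_match(old_hw: tuple[int, int], new_hw: tuple[int, int], decimals: int = 4) -> bool:
    """True when the rounded width/height ratios agree, i.e. the transform
    can skip padding/cropping.  decimals=4 is an assumed default."""
    (old_h, old_w), (new_h, new_w) = old_hw, new_hw
    return round(old_w / old_h, decimals) == round(new_w / new_h, decimals)


# 1920x1080 and 1280x720 are both 16:9, so no padding or cropping is needed.
print(aspects_match((1080, 1920), (720, 1280)))  # True
# A 4:3 source going to a 16:9 target does not match and must be padded or cropped.
print(aspects_match((480, 640), (720, 1280)))    # False
```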
Attributes
- Original height of the image
- Original width of the image
- Original aspect ratio of the image as width / height
- New height of the image
- New width of the image
- Target aspect ratio of the image as width / height