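[Dissecting the wheel] The NormalizeImage preprocessing operator in PaddleDetection
Its relative path is ppdet/data/transform/operators.py
The previous post, https://blog.csdn.net/HaoZiHuang/article/details/128398000, briefly covered its base class, BaseOperator.
The base class's __init__ initializes self._id. For example, after instantiating the class below, printing this attribute gives: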
>>> self._id
'NormalizeImage_d78ed6'
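To make it concrete where a string like this could come from, here is a rough sketch. It assumes the id is the class name plus the last 6 characters of a random UUID, and that __str__ returns the id (which is why the error messages below can use format(self)). SketchOperator is a hypothetical stand-in I wrote for illustration, not the real BaseOperator, and the actual code may differ between versions.

```python
import uuid


class SketchOperator(object):
    """Hypothetical stand-in for BaseOperator, for illustration only."""

    def __init__(self, name=None):
        # Assumption: the default name is the class name
        if name is None:
            name = self.__class__.__name__
        # Assumption: the suffix is the last 6 characters of a random UUID
        self._id = name + '_' + str(uuid.uuid4())[-6:]

    def __str__(self):
        # Assumption: printing the operator shows its id
        return str(self._id)


class NormalizeImage(SketchOperator):
    pass


op = NormalizeImage()
print(op._id)  # e.g. NormalizeImage_d78ed6 (the suffix is random)
```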
class NormalizeImage(BaseOperator):
    def __init__(self,
                 mean=[0.485, 0.456, 0.406],
                 std=[0.229, 0.224, 0.225],
                 is_scale=True,
                 norm_type='mean_std'):
        """
        Args:
            mean (list): the pixel mean
            std (list): the pixel variance
            is_scale (bool): scale the pixel to [0,1]
            norm_type (str): type in ['mean_std', 'none']
        """
        super(NormalizeImage, self).__init__()
        self.mean = mean
        self.std = std
        self.is_scale = is_scale
        self.norm_type = norm_type
        if not (isinstance(self.mean, list) and isinstance(self.std, list) and
                isinstance(self.is_scale, bool) and
                self.norm_type in ['mean_std', 'none']):
            raise TypeError("{}: input type is invalid.".format(self))
        from functools import reduce
        if reduce(lambda x, y: x * y, self.std) == 0:
            raise ValueError('{}: std is invalid!'.format(self))

    def apply(self, sample, context=None):
        """Normalize the image.
        Operators:
            1.(optional) Scale the pixel to [0,1]
            2.(optional) Each pixel minus mean and is divided by std
        """
        im = sample['image']
        im = im.astype(np.float32, copy=False)
        if self.is_scale:
            scale = 1.0 / 255.0
            im *= scale

        if self.norm_type == 'mean_std':
            mean = np.array(self.mean)[np.newaxis, np.newaxis, :]
            std = np.array(self.std)[np.newaxis, np.newaxis, :]
            im -= mean
            im /= std

        sample['image'] = im
        return sample
self.mean and self.std are the per-channel parameters used to normalize the image. Their defaults are [0.485, 0.456, 0.406] and [0.229, 0.224, 0.225] respectively.
If self.is_scale is True, the image is first multiplied by 1/255 so that pixel values fall in [0, 1].
If self.norm_type is 'none', no mean/std normalization is applied. If it is 'mean_std', each pixel has self.mean subtracted and is then divided by self.std.
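The two steps above can be restated as a standalone NumPy function. This is only a sketch of the same arithmetic for an HWC image, not the library code: normalize is a name I made up, and unlike apply it always works on a fresh copy.

```python
import numpy as np


def normalize(im, mean, std, is_scale=True, norm_type='mean_std'):
    """Standalone restatement of the steps in apply(), for an HWC image."""
    # Work on a fresh float32 copy so the caller's array is left untouched
    im = im.astype(np.float32)
    if is_scale:
        im *= 1.0 / 255.0  # map [0, 255] to [0, 1]
    if norm_type == 'mean_std':
        # Shape (1, 1, 3) broadcasts over height and width
        mean = np.array(mean, dtype=np.float32)[np.newaxis, np.newaxis, :]
        std = np.array(std, dtype=np.float32)[np.newaxis, np.newaxis, :]
        im -= mean
        im /= std
    return im


# A 2x2 "image" in which every pixel is pure white
white = np.full((2, 2, 3), 255, dtype=np.uint8)
out = normalize(white, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
print(out[0, 0])  # per channel: (1 - mean) / std
```

With is_scale=False and norm_type='none', the function only converts the dtype, which matches the two "optional" steps in the docstring.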
The NormalizeImage class only processes the image: the other fields of sample are left unchanged.
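A small check of that claim, plus one detail of astype(np.float32, copy=False) that is easy to miss. The toy sample dict below is made up for illustration. For a uint8 input a new array is allocated, but if the image is already float32, NumPy returns the same array, so the in-place *=, -= and /= would also modify the array the caller still holds.

```python
import numpy as np

# Toy sample with only two fields (hypothetical values)
sample = {
    'image': np.full((2, 2, 3), 255, dtype=np.uint8),
    'gt_bbox': np.array([[0., 0., 1., 1.]], dtype=np.float32),
}
bbox_before = sample['gt_bbox'].copy()

im = sample['image']
im_f32 = im.astype(np.float32, copy=False)  # uint8 input: a new array is allocated
im_f32 *= 1.0 / 255.0
sample['image'] = im_f32

print(np.array_equal(sample['gt_bbox'], bbox_before))  # boxes are untouched
print(im[0, 0, 0])  # the original uint8 array was not modified

# Already-float32 input: copy=False hands back the very same array object
already_f32 = np.ones((2, 2, 3), dtype=np.float32)
same = already_f32.astype(np.float32, copy=False)
print(same is already_f32)
```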
>>> pprint(sample)
{'curr_iter': 0,
'flipped': True,
'gt_bbox': array([[ 639.524 , 241.79735 , 683.641 , 366.2275 ],
[ 827.6553 , 287.004 , 1065. , 456.85568 ],
[ 0. , 361.1787 , 111.67373 , 502.13394 ],
[ 308.9322 , 400.6204 , 533.1966 , 559.8373 ]],
dtype=float32),
'gt_class': array([[58],
......
[60]], dtype=int32),
'h': 426.0,
'im_id': array([139]),
'im_shape': array([ 736., 1065.], dtype=float32),
'image': array([[[-0.7650483 , -0.757703 , -1.0724183 ],
......
[ 0.8618033 , -0.23249283, -0.7238344 ]]], dtype=float32),
'is_crowd': array([[0],
......
[0]], dtype=int32),
'scale_factor': array([1.7903621, 1.7861136], dtype=float32),
'w': 640.0}
Note that, unlike the output of Decode, there is an extra 'flipped': True here, because this sample had already passed through RandomFlip.
You may run into problems here, so check whether your boxes are annotated as $x_1y_1x_2y_2$, $x_cy_cx_2y_2$, or $x_cy_cwh$. A quick way to check is to draw one box on the image:
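import cv2

# im: the image array to draw on; the box is converted to plain Python ints for OpenCV
x1, y1, x2, y2 = sample['gt_bbox'][1].astype(int).tolist()
xx = cv2.rectangle(im, (x1, y1), (x2, y2), 255, thickness=2, lineType=8)
cv2.imwrite("xxx.png", xx)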
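One thing to watch if im is the image after NormalizeImage: its values are small floats centered around 0, so the saved picture will look almost black. Below is a sketch that undoes the normalization first, assuming the default mean/std and is_scale=True. denormalize is my own helper, not part of PaddleDetection. If your pipeline decoded the image as RGB, you would probably also want to swap the channels before cv2.imwrite, which expects BGR.

```python
import numpy as np

MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def denormalize(im, mean=MEAN, std=STD):
    """Invert mean/std normalization and the 1/255 scaling; returns uint8 HWC."""
    out = im * std + mean  # undo (x - mean) / std
    out = out * 255.0      # undo the 1/255 scaling
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


# Round trip on a small random image
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
normalized = (original.astype(np.float32) / 255.0 - MEAN) / STD
restored = denormalize(normalized)
print(np.array_equal(restored, original))
```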
Functions for converting between the formats above are collected here:
https://blog.csdn.net/HaoZiHuang/article/details/128213305
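For quick reference, here is a minimal sketch of the most common pair of conversions, between corner format (x1, y1, x2, y2) and center format (xc, yc, w, h). The function names are my own and are not taken from the linked post or from PaddleDetection.

```python
import numpy as np


def xyxy_to_cxcywh(boxes):
    """Convert (..., 4) boxes from x1, y1, x2, y2 to xc, yc, w, h."""
    boxes = np.asarray(boxes, dtype=np.float32)
    x1, y1, x2, y2 = (boxes[..., i] for i in range(4))
    return np.stack([(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1], axis=-1)


def cxcywh_to_xyxy(boxes):
    """Convert (..., 4) boxes from xc, yc, w, h to x1, y1, x2, y2."""
    boxes = np.asarray(boxes, dtype=np.float32)
    cx, cy, w, h = (boxes[..., i] for i in range(4))
    return np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=-1)


box = np.array([[10., 20., 50., 80.]])
print(xyxy_to_cxcywh(box))  # [[30. 50. 40. 60.]]
```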