OpenCV Chapter 1: Basic Operations and Thresholding


OpenCV

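Basic Image Operations

How a Computer Sees an Image

[Figure: an image as the computer sees it (/img/Opencv/chapter1-1.png)]

An image is made up of individual pixels.

Pixel: a value in the range $[0,255]$ that gives the brightness at that point, with $0\rightarrow black$ and $255\rightarrow white$.

In a color image, each pixel holds three values, $R,G,B$, which are the image's color channels.

RGB color channels (the three primary colors of light): the per-channel pixel values together describe a color image. A grayscale (black-and-white) image has only a single channel.

Matrix size $\rightarrow$ image size.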
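
As a minimal sketch of the representation described above, the following NumPy snippet builds a tiny grayscale image and a tiny color image by hand. The array names and pixel values are made up for illustration; real images would come from `cv2.imread`.

```python
import numpy as np

# A 2x3 grayscale image: one brightness value per pixel, 0 = black, 255 = white
gray = np.array([[0, 128, 255],
                 [255, 128, 0]], dtype=np.uint8)

# A 2x3 color image: three channel values per pixel (all zeros = black)
color = np.zeros((2, 3, 3), dtype=np.uint8)
color[0, 0] = (255, 255, 255)  # set the top-left pixel to white in every channel

print(gray.shape)   # (2, 3): rows x columns, a single channel
print(color.shape)  # (2, 3, 3): rows x columns x channels
print(color[0, 0])  # [255 255 255]
```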

Reading Data: Images

import cv2  # OpenCV loads images in BGR order (check the channel order when mixing with other libraries)

img = cv2.imread("img path")
# img is a numpy.ndarray with dtype=uint8, so values lie in [0, 255];
# cv2.imread returns None if the file cannot be read

# Show the image in a window named 'image'
cv2.imshow('image', img)

# Wait time in milliseconds; 0 waits indefinitely for a key press
cv2.waitKey(0)

# Destroy the window
cv2.destroyWindow('image')  # cv2.destroyAllWindows() also works

Accessible Attributes

print(img.shape)  # e.g. (414, 500, 3): height, width, and 3 channels (BGR)
print(img.size)  # total number of array elements (height * width * channels)
print(img.dtype)  # usually uint8
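
To make the meaning of these attributes concrete, here is a small sketch on a synthetic array with the same shape as the example in the comment above. The zero-filled array is only a stand-in for a loaded image. Note that `size` counts array elements, so for a 3-channel image it is three times the number of pixels.

```python
import numpy as np

# Stand-in for a loaded color image: 414 rows, 500 columns, 3 channels
img = np.zeros((414, 500, 3), dtype=np.uint8)

height, width, channels = img.shape
pixel_count = height * width  # number of pixel positions
element_count = img.size      # number of stored values

print(pixel_count)    # 207000
print(element_count)  # 621000
print(img.dtype)      # uint8
```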

Read Mode Flags

  • cv2.IMREAD_COLOR: color image
  • cv2.IMREAD_GRAYSCALE: grayscale image
import cv2 

img = cv2.imread("img path", cv2.IMREAD_GRAYSCALE)
# With cv2.IMREAD_GRAYSCALE, img is a single-channel 2-D array

print(img)

"""
[[ 48  48  48 ...  48  48  48]
[ 48  48  48 ...  48  48  48]
[ 48  48  48 ...  49  49  49]
...
[ 72  69  65 ... 210 210 210]
[ 65  63  60 ... 208 207 207]
[ 61  59  56 ... 203 201 200]]
"""

print(img.shape)  # (940, 940)

Saving

cv2.imwrite("../LearnOpenCV/temp.jpg", img)
# The first argument is the output path and file name; the second is the image (ndarray)

Reading Data: Video

  • cv2.VideoCapture can capture from a camera; an integer index such as 0 or 1 selects the device
  • For a video file, pass the file path directly
import cv2

video = cv2.VideoCapture("video path")
print(type(video))  # <class 'cv2.VideoCapture'>

# Check that the capture was opened correctly
opened = video.isOpened()

# Iterate over the frames
while opened:
    # ret -> whether the frame was read successfully, frame -> the image for this frame
    ret, frame = video.read()
    if not ret or frame is None:  # no more frames (or a read error): exit
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # convert to grayscale
    cv2.imshow('result', gray)
    if cv2.waitKey(100) & 0xFF == 27:  # wait 100 ms per frame; press Esc to exit
        break

# Release the capture. Per the OpenCV docstring, release() is also called
# automatically by subsequent VideoCapture::open calls and by the
# VideoCapture destructor.
video.release()
cv2.destroyAllWindows()
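
The loop above waits a fixed 100 ms between frames. As a rough sketch, the delay could instead be derived from the video's frame rate. `frame_delay_ms` is a hypothetical helper, and the assumption is that the frame rate would be read with `video.get(cv2.CAP_PROP_FPS)`, which may report 0 for some sources.

```python
def frame_delay_ms(fps, fallback_ms=100):
    """Return a per-frame wait time in milliseconds for cv2.waitKey.

    Hypothetical helper: falls back to `fallback_ms` when the frame rate is
    missing or non-positive, and never returns 0, because waitKey(0) blocks
    until a key is pressed.
    """
    if not fps or fps <= 0:
        return fallback_ms
    return max(1, round(1000 / fps))


print(frame_delay_ms(25))  # 40
print(frame_delay_ms(0))   # 100
```

In the loop, the result would replace the literal 100 passed to `cv2.waitKey`.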

ROI (Region of Interest)

In essence, an ROI is cut out of the image by slicing the array.

import cv2

img = cv2.imread("img path")
roi = img[0:300, 0:200]  # NumPy slice: rows 0-299, columns 0-199
cv2.imshow('result', roi)
cv2.waitKey()
cv2.destroyAllWindows()
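
One detail worth keeping in mind with slice-based ROIs: basic NumPy slicing returns a view, so writing into the ROI also changes the source image. The sketch below uses a synthetic black image as a stand-in for a loaded one; the variable names are illustrative.

```python
import numpy as np

# Stand-in for a loaded image: 400 rows, 300 columns, 3 channels, all black
img = np.zeros((400, 300, 3), dtype=np.uint8)

roi_copy = img[0:300, 0:200].copy()  # an independent copy of the region
roi_view = img[0:300, 0:200]         # basic slicing returns a view, not a copy

roi_view[:] = 255      # writes through to img
print(roi_view.shape)  # (300, 200, 3)
print(img[0, 0])       # [255 255 255]
print(roi_copy[0, 0])  # [0 0 0]
```

Calling `.copy()` is the usual way to get an ROI that can be edited without touching the original.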

Extracting Color Channels

import cv2

img = cv2.imread("img path")
b, g, r = cv2.split(img)  # note the channel order: B, G, R
print(b)
"""
[[243 241 243 ... 239 240 241]
 [239 241 239 ... 239 238 239]
 [241 242 238 ... 241 238 237]
 ...
 [238 234 236 ... 240 239 239]
 [237 236 237 ... 240 241 237]
 [239 237 237 ... 240 240 237]]
"""
print(b.shape)  # (512, 512)

temp = cv2.merge((b, g, r))  # restore

cv2.imshow('result', temp)
cv2.waitKey()
cv2.destroyAllWindows()
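
The same split and merge can be sketched with plain NumPy indexing, which makes the BGR channel order explicit. The 2x2 image and its pixel values here are made up for illustration.

```python
import numpy as np

# Stand-in BGR image: every pixel is (B=10, G=20, R=30)
img = np.empty((2, 2, 3), dtype=np.uint8)
img[:] = (10, 20, 30)

# Index the last axis to pull out each channel as a 2-D array
b, g, r = img[:, :, 0], img[:, :, 1], img[:, :, 2]

merged = np.dstack((b, g, r))  # stack the channels back along the last axis
rgb = img[:, :, ::-1]          # reverse the channel axis: BGR -> RGB

print(b.shape)                      # (2, 2)
print(np.array_equal(merged, img))  # True
print(rgb[0, 0])                    # [30 20 10]
```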

Keeping a Single Channel

import cv2


def show(img):
    cv2.imshow('result', img)
    cv2.waitKey()
    cv2.destroyAllWindows()


img = cv2.imread("img path")

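# R: zero the B (index 0) and G (index 1) channels
cur_img = img.copy()
cur_img[:, :, 0] = 0
cur_img[:, :, 1] = 0
show(cur_img)

# G: zero the B (index 0) and R (index 2) channels
cur_img = img.copy()
cur_img[:, :, 0] = 0
cur_img[:, :, 2] = 0
show(cur_img)

# B: zero the G (index 1) and R (index 2) channels
cur_img = img.copy()
cur_img[:, :, 1] = 0
cur_img[:, :, 2] = 0
show(cur_img)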
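
The three near-identical blocks above could be folded into one function. The following is a sketch only; `keep_channel` is a hypothetical helper name, and the constant-valued array stands in for a loaded image.

```python
import numpy as np


def keep_channel(img, index):
    """Return a copy of a BGR image with every channel except `index` zeroed."""
    out = np.zeros_like(img)
    out[:, :, index] = img[:, :, index]
    return out


img = np.full((2, 2, 3), 200, dtype=np.uint8)  # stand-in image
red_only = keep_channel(img, 2)                # index 2 is R in BGR order
print(red_only[0, 0].tolist())                 # [0, 0, 200]
```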

Border Padding

import cv2


def show(img):
    cv2.imshow('result', img)
    cv2.waitKey()
    cv2.destroyAllWindows()


img = cv2.imread("img path")

top_size, bottom_size, left_size, right_size = 50, 50, 50, 50  # padding widths for top, bottom, left, right

replicate = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, borderType=cv2.BORDER_REPLICATE)
reflect = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_REFLECT)
reflect101 = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_REFLECT_101)
wrap = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_WRAP)
constant = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, cv2.BORDER_CONSTANT, value=0)

show(replicate)
show(reflect)
show(reflect101)
show(wrap)
show(constant)
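
  • BORDER_REPLICATE: replication; the outermost edge pixel is repeated
  • BORDER_REFLECT: reflection; the image is mirrored outward on both sides with the edge pixel included, e.g. (left) fedcba|abcdefgh (image)|hgfedcb (right)
  • BORDER_REFLECT_101: reflection about the outermost pixel, which is not repeated: gfedcb|abcdefgh|gfedcba (a and h are the axes of symmetry)
  • BORDER_WRAP: wrap-around; the image repeats in its original order: cdefgh|abcdefgh|abcdefg
  • BORDER_CONSTANT: constant fill; the border is filled with a fixed value (specified via $value$)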
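
The letter patterns in the list above can be reproduced on a single row with `np.pad`. This is a sketch under the assumption that the NumPy modes `edge`, `symmetric`, `reflect`, and `wrap` correspond to the four non-constant OpenCV border types in the way shown; `padded_pattern` is a hypothetical helper written for this illustration.

```python
import numpy as np

letters = "abcdefgh"
idx = np.arange(len(letters))  # positions 0..7 stand in for pixels a..h

# Assumed correspondence between OpenCV border types and np.pad modes
modes = {
    "BORDER_REPLICATE": "edge",
    "BORDER_REFLECT": "symmetric",
    "BORDER_REFLECT_101": "reflect",
    "BORDER_WRAP": "wrap",
}


def padded_pattern(mode, left=6, right=7):
    """Pad the index row and render it as left|image|right."""
    padded = np.pad(idx, (left, right), mode=mode)
    text = "".join(letters[i] for i in padded)
    end = left + len(letters)
    return f"{text[:left]}|{text[left:end]}|{text[end:]}"


patterns = {name: padded_pattern(mode) for name, mode in modes.items()}
for name, pattern in patterns.items():
    print(f"{name}: {pattern}")
# BORDER_REPLICATE: aaaaaa|abcdefgh|hhhhhhh
# BORDER_REFLECT: fedcba|abcdefgh|hgfedcb
# BORDER_REFLECT_101: gfedcb|abcdefgh|gfedcba
# BORDER_WRAP: cdefgh|abcdefgh|abcdefg
```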

Numerical Operations

Adding and Subtracting Pixel Values

import cv2

img = cv2.imread("img path")

print(img)

"""
[[[243 249 254]
  [241 249 255]
  [243 249 255]
  ...
  [239 247 254]
  [240 248 255]
  [241 247 254]]

 [[239 247 254]
  [241 247 254]
  [239 247 254]
  ...
"""

img1 = img + 10  # adds 10 to every element; with dtype=uint8 the result wraps around modulo 256
print(img1)

"""
[[[253   3   8]
  [251   3   9]
  [253   3   9]
  ...
  [249   1   8]
  [250   2   9]
  [251   1   8]]

 [[249   1   8]
  [251   1   8]
  [249   1   8]
  ...
"""

print(cv2.add(img, img1)[:5, :, 0])  # cv2.add() saturates: sums above 255 are clipped to 255
"""
[[255 255 255 ... 255 255 255]
 [255 255 255 ... 255 255 255]
 [255 255 255 ... 255 255 255]
 [255 255 255 ... 255 255 255]
 [255 255 255 ... 255 255 255]]
 """

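
The difference between wrapping and saturating addition can be reproduced on a single pixel with NumPy alone. The sketch below reuses the first pixel from the printed output above; the clip-based version is an illustration of saturation, not OpenCV's actual implementation.

```python
import numpy as np

pixel = np.array([243, 249, 254], dtype=np.uint8)  # first pixel of the printed image

# Plain NumPy addition on uint8 wraps around modulo 256
wrapped = pixel + 10

# Saturating addition: compute in a wider type, then clip to [0, 255]
saturated = np.clip(pixel.astype(np.int32) + 10, 0, 255).astype(np.uint8)

print(wrapped)    # [253   3   8]
print(saturated)  # [253 255 255]
```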
Image Blending

import cv2

img1 = cv2.imread("img1 path")
img2 = cv2.imread("img2 path")

img2 = cv2.resize(img2, (img1.shape[1], img1.shape[0]))  # match img1's size; dsize is (width, height)

print(img1.shape)  # (512, 512, 3)
print(img2.shape)  # (512, 512, 3)

res = cv2.resize(img2, (0, 0), fx=3, fy=1)  # scale by factors: 3x horizontally, 1x vertically
print(res.shape)  # (512, 1536, 3)

# R = w1 * x1 + w2 * x2 + b (two weights and an additive bias)
res = cv2.addWeighted(img1, 0.4, img2, 0.6, 0)
cv2.imshow('result', res)
cv2.waitKey()
cv2.destroyAllWindows()
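
The weighted-sum formula in the comment above can be sketched in NumPy as follows. `blend` is a hypothetical stand-in written for this illustration; it assumes results are rounded to the nearest integer and clipped to $[0,255]$, and it is not OpenCV's implementation.

```python
import numpy as np


def blend(img1, alpha, img2, beta, gamma=0.0):
    """Weighted sum alpha*img1 + beta*img2 + gamma, rounded and clipped to uint8."""
    mixed = alpha * img1.astype(np.float64) + beta * img2.astype(np.float64) + gamma
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)


a = np.array([[100, 200]], dtype=np.uint8)
b = np.array([[50, 250]], dtype=np.uint8)

print(blend(a, 0.4, b, 0.6))        # [[ 70 230]]
print(blend(a, 1.0, b, 1.0, 10.0))  # [[160 255]]
```

Both inputs must have the same shape, which is why the example above resizes the second image first.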

Image Thresholding

def threshold(src, thresh, maxval, type, dst=None): # real signature unknown; restored from __doc__
    """
    threshold(src, thresh, maxval, type[, dst]) -> retval, dst
    .   @brief Applies a fixed-level threshold to each array element.
    .   
    .   The function applies fixed-level thresholding to a multiple-channel array. The function is typically
    .   used to get a bi-level (binary) image out of a grayscale image ( #compare could be also used for
    .   this purpose) or for removing a noise, that is, filtering out pixels with too small or too large
    .   values. There are several types of thresholding supported by the function. They are determined by
    .   type parameter.
    .   
    .   Also, the special values #THRESH_OTSU or #THRESH_TRIANGLE may be combined with one of the
    .   above values. In these cases, the function determines the optimal threshold value using the Otsu's
    .   or Triangle algorithm and uses it instead of the specified thresh.
    .   
    .   @note Currently, the Otsu's and Triangle methods are implemented only for 8-bit single-channel images.
    .   
    .   @param src input array (multiple-channel, 8-bit or 32-bit floating point).
    .   @param dst output array of the same size  and type and the same number of channels as src.
    .   @param thresh threshold value.
    .   @param maxval maximum value to use with the #THRESH_BINARY and #THRESH_BINARY_INV thresholding
    .   types.
    .   @param type thresholding type (see #ThresholdTypes).
    .   @return the computed threshold value if Otsu's or Triangle methods used.
    .   
    .   @sa  adaptiveThreshold, findContours, compare, min, max
    """
    pass

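Parameters:

  • src: the input image
  • thresh: the threshold value
  • maxval: the value assigned to pixels above the threshold (or below it, depending on type); used only with THRESH_BINARY and THRESH_BINARY_INV
  • type: the thresholding type, one of:
    • cv2.THRESH_BINARY: pixels above the threshold become maxval, all others become 0
    • cv2.THRESH_BINARY_INV: the inverse of cv2.THRESH_BINARY
    • cv2.THRESH_TRUNC: pixels above the threshold are set to the threshold, all others are unchanged
    • cv2.THRESH_TOZERO: pixels above the threshold are unchanged, all others become 0
    • cv2.THRESH_TOZERO_INV: the inverse of cv2.THRESH_TOZERO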
ret, dst = cv2.threshold(src, thresh, maxval, type)

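Return values:

  • ret: the threshold actually used; equal to thresh unless THRESH_OTSU or THRESH_TRIANGLE is set, in which case it is the computed threshold
  • dst: the thresholded output image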
import cv2


def show(image):
    cv2.imshow('result', image)
    cv2.waitKey()
    cv2.destroyAllWindows()


img1 = cv2.imread("img path")

arr = []

ret1, thresh1 = cv2.threshold(img1, 127, 255, cv2.THRESH_BINARY)
arr.append(thresh1)
ret2, thresh2 = cv2.threshold(img1, 127, 255, cv2.THRESH_BINARY_INV)
arr.append(thresh2)
ret3, thresh3 = cv2.threshold(img1, 127, 255, cv2.THRESH_TRUNC)
arr.append(thresh3)
ret4, thresh4 = cv2.threshold(img1, 127, 255, cv2.THRESH_TOZERO)
arr.append(thresh4)
ret5, thresh5 = cv2.threshold(img1, 127, 255, cv2.THRESH_TOZERO_INV)
arr.append(thresh5)

for i in arr:
    show(i)
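
To see what each type does to individual values, the five rules from the parameter list can be sketched in NumPy on a short row of pixels. The `threshold` function and its string `kind` argument are hypothetical stand-ins for this illustration, not the cv2 API; the comparison is a strict "greater than".

```python
import numpy as np


def threshold(src, thresh, maxval, kind):
    """NumPy sketch of the five basic threshold types (strict '>' comparison)."""
    above = src > thresh
    if kind == "BINARY":
        out = np.where(above, maxval, 0)
    elif kind == "BINARY_INV":
        out = np.where(above, 0, maxval)
    elif kind == "TRUNC":
        out = np.where(above, thresh, src)
    elif kind == "TOZERO":
        out = np.where(above, src, 0)
    elif kind == "TOZERO_INV":
        out = np.where(above, 0, src)
    else:
        raise ValueError(f"unknown threshold type: {kind}")
    return out.astype(src.dtype)


src = np.array([0, 100, 127, 128, 255], dtype=np.uint8)
for kind in ("BINARY", "BINARY_INV", "TRUNC", "TOZERO", "TOZERO_INV"):
    print(kind, threshold(src, 127, 255, kind).tolist())
# BINARY [0, 0, 0, 255, 255]
# BINARY_INV [255, 255, 255, 0, 0]
# TRUNC [0, 100, 127, 127, 127]
# TOZERO [0, 0, 0, 128, 255]
# TOZERO_INV [0, 100, 127, 0, 0]
```

A pixel exactly equal to the threshold (127 here) counts as not above it, so it falls on the "otherwise" side of every rule.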