OpenCV Chapter 2: Convolution-Based Operations


OpenCV

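Image Smoothing

Prerequisite: convolution

Discrete convolution

Main application: digital image processing

Warm-up: when two dice are each rolled once, what is the probability of each possible sum?

How do we compute the distribution of the sum?

Reverse one of the arrays, slide it from left to right across the other, and at each alignment multiply the overlapping pairs and add up the products.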

/img/Opencv/chapter2-1.png
Convolution (how it is computed)
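The reverse-and-slide procedure can be written out directly. The following is a minimal sketch, assuming two fair six-sided dice; the helper name `slide_and_sum` is made up for this illustration.

```python
import numpy as np


def slide_and_sum(a, b):
    """Convolve a and b by reversing b and sliding it across a."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    flipped = b[::-1]
    # Pad a with zeros so the reversed array can hang off both ends.
    pad = np.zeros(len(b) - 1)
    padded = np.concatenate([pad, a, pad])
    out = np.empty(len(a) + len(b) - 1)
    for shift in range(len(out)):
        window = padded[shift:shift + len(b)]
        # Multiply the aligned pairs and add up the products.
        out[shift] = np.dot(window, flipped)
    return out


fair = np.full(6, 1 / 6)               # probability of each face 1..6
sum_probs = slide_and_sum(fair, fair)  # index 0 is a sum of 2, index 10 a sum of 12
for total, p in zip(range(2, 13), sum_probs):
    print(f"P(sum = {total:2d}) = {p:.4f}")
```

Each alignment of the two arrays corresponds to one value of the sum, which is why the output has 6 + 6 - 1 = 11 entries.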

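But what if the faces are not equally likely, rather than each having probability $\frac{1}{6}$ as before?

/img/Opencv/chapter2-2.png
Convolution (non-uniform probabilities)

The probability of a sum is then no longer $\frac{\text{number of matching outcomes}}{\text{total number of combinations}}$; instead, multiply the probabilities of the two faces in each matching pair and add up the products.

/img/Opencv/chapter2-3.png
Convolution (computing directly with probabilities)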
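As a small illustration of the probability-based computation, here is a sketch with two loaded dice. The probability values are invented for the example; only the requirement that each list sums to 1 matters.

```python
import numpy as np

# Hypothetical loaded dice: probabilities of faces 1..6 (each array sums to 1).
p_a = np.array([0.10, 0.10, 0.20, 0.20, 0.20, 0.20])
p_b = np.array([0.30, 0.10, 0.10, 0.10, 0.10, 0.30])

# P(sum = s) adds up p_a[i] * p_b[j] over every pair of faces with i + j = s.
sum_probs = {s: 0.0 for s in range(2, 13)}
for face_a, pa in enumerate(p_a, start=1):
    for face_b, pb in enumerate(p_b, start=1):
        sum_probs[face_a + face_b] += pa * pb

for total, p in sum_probs.items():
    print(f"P(sum = {total:2d}) = {p:.4f}")
```

The double loop visits all 36 face pairs once, and the resulting probabilities still add up to 1.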

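As a formula

/img/Opencv/chapter2-4.png
Convolution (formula form)

As a table: summing the numbers along each anti-diagonal gives a new list of numbers (the same principle as above, just visualized as a table).

/img/Opencv/chapter2-5.png
Convolution (computed as a table)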
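The table view maps naturally onto an outer product. This is a minimal sketch using the arrays (1, 2, 3) and (4, 5, 6); the use of `np.outer`, `np.fliplr`, and `np.trace` is one possible way to sum the anti-diagonals, not the only one.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# table[i, j] = a[i] * b[j]: every pairwise product laid out as a grid.
table = np.outer(a, b)

# Entries with the same i + j lie on one anti-diagonal.
# Flipping left-right turns anti-diagonals into ordinary diagonals for np.trace.
flipped = np.fliplr(table)
n_rows, n_cols = table.shape
result = np.array([
    np.trace(flipped, offset=k)
    for k in range(n_cols - 1, -n_rows, -1)
])

print(table)
print(result)
```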

$$ \begin{align} &(a * b)\ \text{is a new array, and}\newline &(a * b)[n]=\sum_{i + j = n}a_i\ b_j\newline \end{align} $$

Example:

$$ \begin{align} &(1,2,3) * (4, 5, 6)\newline &=(1 \cdot 4,\ 1 \cdot 5 + 2 \cdot 4,\ 1 \cdot 6 + 2 \cdot 5 + 3 \cdot 4,\ 2 \cdot 6 + 3 \cdot 5,\ 3 \cdot 6)\newline &=(4, 13, 28, 27, 18)\newline \end{align} $$

Usage in NumPy, Python's scientific computing library

import numpy as np

a, b = np.array([1, 2, 3]), np.array([4, 5, 6])
print(np.convolve(a, b))

Output

[ 4 13 28 27 18]

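Combine a long list of data with a short array of weights (the weights may differ from each other, but they must sum to 1). Slide the weight array from left to right and sum the products at each position $\rightarrow$ this computes a weighted average of the data inside a small window.

/img/Opencv/chapter2-6.png
Convolution (a short array with a long array)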
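A short sketch of this windowed average, with made-up data and two example weight arrays (one uniform, one weighted toward the middle):

```python
import numpy as np

data = np.array([1.0, 2.0, 6.0, 2.0, 1.0, 5.0, 9.0])

# Equal weights that sum to 1: a plain moving average over 3 samples.
uniform = np.ones(3) / 3
# Unequal weights that still sum to 1: the middle sample counts the most.
weighted = np.array([0.25, 0.5, 0.25])

# mode="valid" keeps only positions where the window fits entirely inside data.
smooth_uniform = np.convolve(data, uniform, mode="valid")
smooth_weighted = np.convolve(data, weighted, mode="valid")

print(smooth_uniform)
print(smooth_weighted)
```

Because the weights sum to 1, every output value stays within the range of the data in its window.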

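Application to image processing

Scan the image with a $3 \times 3$ matrix and compute the weighted average of the pixels under it (separately for each of the R, G, and B channels); this blurs the image.

/img/Opencv/chapter2-7.png
Convolution (each of the three RGB channels is computed separately)

Visual illustration

/img/Opencv/chapter2-8.png
Convolution (image blur)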
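The per-channel scan can be sketched without loading a real picture. The example below assumes a tiny synthetic RGB image and uses `scipy.signal.convolve2d` as one convenient way to apply the kernel; it is an illustration, not how OpenCV implements blurring internally.

```python
import numpy as np
from scipy.signal import convolve2d

# A small synthetic RGB image (5x5): black with one bright pixel in the middle.
image = np.zeros((5, 5, 3), dtype=float)
image[2, 2] = [255.0, 128.0, 0.0]

# 3x3 kernel of equal weights that sum to 1: each output pixel is the mean of its neighborhood.
kernel = np.ones((3, 3)) / 9

# Blur every channel independently; mode="same" keeps the original height and width.
blurred = np.stack(
    [convolve2d(image[:, :, c], kernel, mode="same") for c in range(3)],
    axis=-1,
)

print(np.round(blurred[:, :, 0], 2))
```

The single bright pixel is spread evenly over its 3x3 neighborhood, which is the blur effect in miniature.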

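Gaussian distribution

Choosing the weights so that they decrease gradually from the center toward the edges produces a blur that looks better.

Wikipedia

The normal distribution, also known as the Gaussian distribution, is a very common continuous probability distribution. It is very important in statistics and is often used in the natural and social sciences to represent a random variable whose distribution is unknown.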

If a random variable $X$ follows a normal distribution with location parameter $\mu$ and scale parameter $\sigma$, we write:

$X \sim N(\mu, \sigma^2)$

Its probability density function is then $f(x)=\frac {1}{\sigma \sqrt{2\pi}}e^{-\frac {(x-\mu)^2}{2\sigma^2}}$

The expected value $\mu$ of the normal distribution equals the location parameter and determines where the distribution is centered; the square root of its variance $\sigma^2$, that is, the standard deviation $\sigma$, equals the scale parameter and determines the spread of the distribution.

The central limit theorem states that, under certain conditions, the average of many samples (observations) of a random variable with finite mean and variance is itself a random variable whose distribution converges to a normal distribution as the number of samples increases. Therefore, many physical quantities that are sums of independent processes, such as measurement errors, can usually be approximated by a normal distribution.

The probability density curve of the normal distribution is bell-shaped, so it is often called the bell curve. The standard normal distribution is the normal distribution with location parameter $\mu = 0$ and scale parameter $\sigma^{2}=1$.
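To connect the density function to blur weights, here is a minimal sketch that samples the Gaussian at integer offsets from the center and normalizes the samples to sum to 1. The helper name `gaussian_kernel_1d` and the choice of size 5 with $\sigma = 1$ are assumptions for the example. Since the weights are normalized afterwards, the constant factor $\frac{1}{\sigma\sqrt{2\pi}}$ can be dropped.

```python
import numpy as np


def gaussian_kernel_1d(size, sigma):
    """Sample the Gaussian at integer offsets from the center and normalize."""
    offsets = np.arange(size) - (size - 1) / 2  # e.g. [-2, -1, 0, 1, 2] for size 5
    weights = np.exp(-offsets**2 / (2 * sigma**2))
    return weights / weights.sum()


k1d = gaussian_kernel_1d(5, sigma=1.0)
# A 2D Gaussian is the product of two 1D Gaussians, so the 2D kernel is an outer product.
k2d = np.outer(k1d, k1d)

print(np.round(k1d, 4))
print(np.round(k2d, 4))
```

The largest weight sits at the center and the weights fall off symmetrically toward the edges.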

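When the weights do not sum to 1 and some of them are negative

/img/Opencv/chapter2-9.png
Convolution (different kernels produce different effects)

This kernel picks out all the vertical edges in the image.

By using different convolution kernels (that is, different matrices), we can process the image in different ways.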
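The following sketch shows a kernel with negative weights responding to a vertical edge. The kernel values are a common illustrative choice and not necessarily the ones in the figure; the image is synthetic.

```python
import numpy as np
from scipy.signal import convolve2d

# Synthetic grayscale image: dark left half, bright right half (one vertical edge).
image = np.zeros((5, 6))
image[:, 3:] = 1.0

# Hypothetical kernel with negative weights; its entries sum to 0, not 1.
kernel = np.array([
    [1.0, 0.0, -1.0],
    [1.0, 0.0, -1.0],
    [1.0, 0.0, -1.0],
])

# mode="valid" only evaluates positions where the kernel fits inside the image.
response = convolve2d(image, kernel, mode="valid")
print(response)
```

The response is zero over flat regions and non-zero only where the window straddles the edge.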

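Convolutional neural networks $\rightarrow$ use data to work out which kernels should be chosen (depending on what is to be detected).

Note: the convolution as defined mathematically is always longer than the input array (length $n + m - 1$ for inputs of length $n$ and $m$), so in computing the extra values are usually trimmed off so that the output matches the input size.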
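NumPy exposes this trimming through the `mode` argument of `np.convolve`. A short sketch with made-up arrays:

```python
import numpy as np

signal = np.array([1, 2, 3, 4, 5])
kernel = np.array([1, 1, 1])

full = np.convolve(signal, kernel, mode="full")    # length 5 + 3 - 1 = 7
same = np.convolve(signal, kernel, mode="same")    # trimmed to the length of the longer input
valid = np.convolve(signal, kernel, mode="valid")  # only complete overlaps, length 5 - 3 + 1 = 3

print(full)
print(same)
print(valid)
```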

Algorithm optimization

import time

import numpy as np

arr1 = np.random.random(100000)
arr2 = np.random.random(100000)

start = time.time()
arr3 = np.convolve(arr1, arr2)
end = time.time()
print(end - start)  # 10.330177783966064

The same measurement, extended with scipy.signal.fftconvolve for comparison:

import time

import numpy as np
import scipy.signal

arr1 = np.random.random(100000)
arr2 = np.random.random(100000)

# Direct convolution: every pairwise product is computed, O(n^2)
start = time.time()
arr3 = np.convolve(arr1, arr2)
end = time.time()
print(end - start)

# FFT-based convolution, O(n log n)
start = time.time()
arr4 = scipy.signal.fftconvolve(arr1, arr2)
end = time.time()
print(end - start)
'''
9.904504299163818
0.009002208709716797
'''

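The computation is dramatically faster: about a thousand times in this run.

What is fftconvolve?
/img/Opencv/chapter2-10.png
Convolution (fftconvolve)

The direct algorithm computes every pairwise product, which is the full table from before, and then sums along the anti-diagonals, so its cost grows as $O(n^2)$. fftconvolve avoids building that table: it takes the fast Fourier transform (FFT) of both arrays, multiplies the two transforms element by element, and applies the inverse transform, which costs only $O(n \log n)$.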
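The idea behind FFT-based convolution can be sketched with NumPy's own FFT routines. This is a simplified illustration of the principle, not SciPy's actual implementation; the function name `fft_convolve` is made up here.

```python
import numpy as np


def fft_convolve(a, b):
    """Full linear convolution of two real 1D arrays via the FFT."""
    n = len(a) + len(b) - 1  # length of the full convolution
    # Zero-pad both inputs to length n so that the circular convolution
    # computed through the FFT equals the linear one.
    fa = np.fft.rfft(a, n)
    fb = np.fft.rfft(b, n)
    # Multiply element by element in the frequency domain, then transform back.
    return np.fft.irfft(fa * fb, n)


a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
print(np.round(fft_convolve(a, b), 6))
```

The result agrees with `np.convolve` up to floating-point rounding, which is why the output is rounded before printing.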

Another application $\rightarrow$ polynomial multiplication

/img/Opencv/chapter2-11.png
Convolution (polynomial multiplication)
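A quick check of the polynomial view, using $1 + 2x + 3x^2$ and $4 + 5x + 6x^2$ as example polynomials:

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Coefficients in order of increasing power: 1 + 2x + 3x^2 and 4 + 5x + 6x^2.
p = np.array([1, 2, 3])
q = np.array([4, 5, 6])

# The coefficient of x^n in the product collects every p[i] * q[j] with i + j = n,
# which is exactly the convolution of the two coefficient arrays.
product = np.convolve(p, q)
print(product)

# Cross-check with NumPy's polynomial helper (same coefficient order).
print(P.polymul(p, q))
```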

Code

Mean filter and box filter

import cv2


def show(image):
    cv2.imshow('result', image)
    cv2.waitKey()
    cv2.destroyAllWindows()


img1 = cv2.imread("img path")
show(img1)

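# Mean filter
# A simple averaging convolution; the tuple in the second argument sets the kernel size, and the dimensions are usually odd

img2 = cv2.blur(img1, (5, 5))
show(img2)

# Box filter
# Essentially the same as the mean filter, but normalization is optional; the second argument is the output image depth, usually -1 (same depth as the source image)
# The third argument is the kernel size

img3 = cv2.boxFilter(img1, -1, (3, 3), normalize=True)
show(img3)

# Without normalization the window sum is not divided by the kernel area; any value above 255 is clipped (saturated) to 255

img4 = cv2.boxFilter(img1, -1, (3, 3), normalize=False)
show(img4)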

About normalize ($K$ is the convolution kernel):

$$ K=\alpha \begin{bmatrix} 1 & 1 & \cdots & 1\newline 1 & 1 & \cdots & 1\newline \vdots & \vdots & \ddots & \vdots\newline 1 & 1 & \cdots & 1 \end{bmatrix} $$

$$ \alpha=\begin{cases} \frac {1}{\text{ksize.width} \times \text{ksize.height}} & \text{when normalize=True (the same } K \text{ as blur)}\newline 1 & \text{otherwise} \end{cases} $$

Gaussian filter

Wikipedia

Mathematically, applying a Gaussian blur to an image is the same as convolving the image with a Gaussian function. This is also known as a two-dimensional Weierstrass transform. By contrast, convolving by a circle (i.e., a circular box blur) would more accurately reproduce the bokeh effect.

Since the Fourier transform of a Gaussian is another Gaussian, applying a Gaussian blur has the effect of reducing the image’s high-frequency components; a Gaussian blur is thus a low-pass filter.

The Gaussian blur is a type of image-blurring filter that uses a Gaussian function (which also expresses the normal distribution in statistics) for calculating the transformation to apply to each pixel in the image. The formula of a Gaussian function in one dimension is

$G(x)=\frac {1}{\sqrt {2\pi \sigma^2}}e^{-\frac {x^2}{2\sigma^2}}\newline$

In two dimensions, it is the product of two such Gaussian functions, one in each dimension:

$G(x, y)=\frac {1}{2\pi \sigma^2}e^{-\frac {x^2+y^2}{2\sigma^2}}\newline$

where x is the distance from the origin in the horizontal axis, y is the distance from the origin in the vertical axis, and σ is the standard deviation of the Gaussian distribution. It is important to note that the origin on these axes are at the center (0, 0). When applied in two dimensions, this formula produces a surface whose contours are concentric circles with a Gaussian distribution from the center point.

Values from this distribution are used to build a convolution matrix which is applied to the original image. This convolution process is illustrated visually in the figure on the right. Each pixel’s new value is set to a weighted average of that pixel’s neighborhood. The original pixel’s value receives the heaviest weight (having the highest Gaussian value) and neighboring pixels receive smaller weights as their distance to the original pixel increases. This results in a blur that preserves boundaries and edges better than other, more uniform blurring filters; see also scale space implementation.

“The closer a pixel is to the center, the larger its weight; the farther away, the smaller.”

import cv2


def show(image):
    cv2.imshow('result', image)
    cv2.waitKey()
    cv2.destroyAllWindows()


img1 = cv2.imread("img path")
show(img1)

img2 = cv2.GaussianBlur(img1, (5, 5), 1)  # src, ksize, sigmaX
show(img2)

Parameter description

Parameters

cv2.GaussianBlur(src, ksize, sigmaX, sigmaY, borderType)
  • src: input image; the image can have any number of channels, which are processed independently, but the depth should be CV_8U, CV_16U, CV_16S, CV_32F or CV_64F.
  • dst: output image of the same size and type as src.
  • ksize: Gaussian kernel size. ksize.width and ksize.height can differ but they both must be positive and odd. Or, they can be zero’s and then they are computed from sigma.
  • sigmaX: Gaussian kernel standard deviation in X direction.
  • sigmaY: Gaussian kernel standard deviation in Y direction; if sigmaY is zero, it is set to be equal to sigmaX, if both sigmas are zeros, they are computed from ksize.width and ksize.height, respectively (see getGaussianKernel for details); to fully control the result regardless of possible future modifications of all this semantics, it is recommended to specify all of ksize, sigmaX, and sigmaY.
  • borderType: pixel extrapolation method, see BorderTypes. BORDER_WRAP is not supported.
Median filter

Sort the pixels covered by the kernel and use the median as the new pixel value.

import cv2


def show(image):
    cv2.imshow('result', image)
    cv2.waitKey()
    cv2.destroyAllWindows()


img1 = cv2.imread("img path")
show(img1)

img2 = cv2.medianBlur(img1, 5)
show(img2)

Note that the second parameter is different here: it is a single integer rather than a tuple.

@param ksize aperture linear size; it must be odd and greater than 1, for example: 3, 5, 7 …

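Of the filters above it removes noise best, particularly isolated salt-and-pepper noise, but at the cost of losing some fine detail.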

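Concatenating images

import numpy as np
np.hstack((img1, img2, img3, ...)) # concatenate horizontally (side by side); the images must have the same height
np.vstack((img1, img2, img3, ...)) # concatenate vertically (stacked); the images must have the same width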

Example code

import cv2
import numpy as np


def show(image):
    cv2.imshow('result', image)
    cv2.waitKey()
    cv2.destroyAllWindows()


img1 = cv2.imread("img path")

# Median-blurred copies with kernel sizes 3, 5, 7, 9 (ksize must be odd and greater than 1)
x = [cv2.medianBlur(img1, ksize) for ksize in range(3, 11, 2)]

res1 = np.hstack((img1, x[0], x[1], x[2]))
res2 = np.vstack((img1, x[0], x[1], x[2]))
show(res1)
show(res2)

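Erosion and dilation

Erosion and dilation are the two classic algorithms of mathematical morphology. They are mostly applied to binary images and, like convolution, work by sliding a kernel over the image; instead of a weighted sum, they take the minimum (erosion) or maximum (dilation) of the pixels under the kernel.

Erosion

Erosion also scans the image with a kernel, usually one filled with 1s. If every pixel under the kernel is white, the center pixel stays white; otherwise it becomes black. The larger the kernel, the stronger the erosion.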

import cv2
import numpy as np


def show(image):
    cv2.imshow('result', image)
    cv2.waitKey()
    cv2.destroyAllWindows()


img1 = cv2.imread("img path", 0)
show(img1)

for i in range(1, 10):
    img2 = cv2.erode(img1, np.ones((3, 3), np.uint8), iterations=i)  # image, kernel, number of iterations
    show(img2)

Getting a structuring element

def getStructuringElement(shape, ksize, anchor=None)

shape:

  • MORPH_RECT: rectangular element
  • MORPH_CROSS: cross-shaped element
  • MORPH_ELLIPSE: elliptical element

ksize: size of the structuring element (kernel matrix)

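Dilation

Dilation is the opposite of erosion: if any pixel under the kernel is non-zero, the center pixel becomes non-zero, so white regions grow outward.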

import cv2


def show(image):
    cv2.imshow('result', image)
    cv2.waitKey()
    cv2.destroyAllWindows()


img1 = cv2.imread("img path", 0)
show(img1)

k = cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))

for i in range(1, 10):
    img2 = cv2.dilate(img1, k, iterations=i)  # image, kernel, number of iterations
    show(img2)

Opening and closing

Opening

Erode first, then dilate

import cv2


def show(image):
    cv2.imshow('result', image)
    cv2.waitKey()
    cv2.destroyAllWindows()


img1 = cv2.imread("img path", 0)
show(img1)

kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))

img2 = cv2.morphologyEx(img1, cv2.MORPH_OPEN, kernel)  # cv2.MORPH_OPEN selects opening
show(img2)

Closing

Dilate first, then erode

import cv2


def show(image):
    cv2.imshow('result', image)
    cv2.waitKey()
    cv2.destroyAllWindows()


img1 = cv2.imread("img path", 0)
show(img1)

kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))

img2 = cv2.morphologyEx(img1, cv2.MORPH_CLOSE, kernel)  # cv2.MORPH_CLOSE selects closing
show(img2)

Morphological gradient

Principle: dilation - erosion = boundary information

Usage example

import cv2


def show(image):
    cv2.imshow('result', image)
    cv2.waitKey()
    cv2.destroyAllWindows()


img1 = cv2.imread("img path", 0)
show(img1)

kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))

img2 = cv2.morphologyEx(img1, cv2.MORPH_GRADIENT, kernel)  # cv2.MORPH_GRADIENT selects the morphological gradient
show(img2)

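Top hat and black hat

Principle:

  • Top hat = original input - opening result $\rightarrow$ the small bright parts that opening removed
  • Black hat = closing result - original input $\rightarrow$ the small dark parts (holes and gaps) that closing filled in; they could not be eroded away again because the dilation came first, so they are all that remains after the subtraction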
import cv2


def show(image):
    cv2.imshow('result', image)
    cv2.waitKey()
    cv2.destroyAllWindows()


img1 = cv2.imread("img path", 0)
show(img1)

kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))

# Top hat
img2 = cv2.morphologyEx(img1, cv2.MORPH_TOPHAT, kernel)
show(img2)

# Black hat
img3 = cv2.morphologyEx(img1, cv2.MORPH_BLACKHAT, kernel)
show(img3)