
2020-07-18


Original post

1. A new problem appears

[The previous situation]:

Ruhua	Xiaoqian	Xiaoming	Xiaoqiang
0	0	1	0
1	1	1	1
1	0	1	1
0	1	1	0
1	1	0	?

[The current situation]:

Ruhua	Damei	Xiaoming	Xiaoqiang
0	0	1	0
0	1	1	1
1	0	1	1
1	1	1	0
1	0	0	?

[Train the previous model on the new dataset and make predictions]:

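import numpy as np

# Forward propagation
def fp(input):
    # Use a matrix dot product to compute all 4 temp values at once
    temp = np.dot(input, weights)
    # Apply the sigmoid function (the activation function) to get the final output
    return 1 / (1 + np.exp(-temp))

# Backpropagation
def bp(Y, output):
    # Measure how far our computed output is from what actually happened
    error = Y - output
    # Compute the slope (the derivative of the sigmoid at the output)
    slope = output * (1 - output)
    # Compute the delta (the increment)
    return error * slope

X = np.array([[0,0,1], [0,1,1], [1,0,1], [1,1,1]])  # the new dataset
Y = np.array([[0,1,1,0]]).T

np.random.seed(1)
weights = 2 * np.random.random((3,1)) - 1
for it in range(10000):
    output = fp(X)
    delta = bp(Y, output)
    # Update the weights
    weights += np.dot(X.T, delta)
print(fp([[1,0,0]]))    # predict the result for row 5
print(fp([[1,1,1]]))    # predict a row the model has already seen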

[Output]:

[[0.5]]   # result for row 5
[[0.5]]   # a row the model has already seen

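[Analysis]: Why can the same model, trained the same way, no longer predict the result on the new dataset, and not even classify rows it has already seen? A closer look explains it. In the previous dataset, Xiaoqiang had a linear relationship with Ruhua (the two columns were identical). Now Xiaoqiang is the XOR (exclusive or) of Ruhua and Damei: false when the two are the same, true when they differ. XOR is a non-linear relationship, and no single straight line separates its 0 cases from its 1 cases. A neural network with only 1 neuron therefore cannot handle this situation.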
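
To make the XOR relationship concrete, here is a minimal sketch that recomputes the last column of the new table from its first two columns using NumPy's `np.logical_xor`. The arrays mirror the table and the training data above; the name `xor_target` is my own and only for illustration.

```python
import numpy as np

# The four training rows from the new table (first three columns)
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
# The last column of the table (the value we want to predict)
Y = np.array([[0, 1, 1, 0]]).T

# XOR of the first two columns: 1 when they differ, 0 when they match
xor_target = np.logical_xor(X[:, 0], X[:, 1]).astype(int)

print(xor_target)                           # [0 1 1 0]
print(np.array_equal(xor_target, Y[:, 0]))  # True
```

The third column is 1 in every training row, so it carries no information about the target on its own; it acts like a constant bias input.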

2. Multi-layer neural networks

[The previous neural network]: 3 weights

[The current neural network]: 3*4+4=16 weights (3 inputs to each of 4 hidden neurons, plus 4 hidden neurons to 1 output)


[Analysis]: Weights arranged in layers like this can handle non-linear relationships.

[Classic example]: handwritten digit recognition, where each small image is 28*28 = 784 numbers (pixel values)


The handwritten-digit network has two hidden layers with 16 neurons each.

[Counting the weights]: 784*16+16*16+16*10=12960 (weights only; biases are not counted)
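
The weight counts in this section can be checked with a few lines of Python. This is only a sketch: `count_weights` is a hypothetical helper of my own, and it assumes fully connected layers with no bias terms, which is how the counts above are computed.

```python
def count_weights(layer_sizes):
    """Count connection weights between consecutive fully connected layers (biases excluded)."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

print(count_weights([3, 1]))             # 3     (the previous single-neuron network)
print(count_weights([3, 4, 1]))          # 16    (the current network)
print(count_weights([784, 16, 16, 10]))  # 12960 (the handwritten-digit network)
```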

3. Initializing the weights and the FP (forward propagation) process

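[Before]:

# Initialize the weights
np.random.seed(1)
weights = 2 * np.random.random((3,1)) - 1

# Forward propagation
def fp(input):
    # Use a matrix dot product to compute all 4 temp values at once
    temp = np.dot(input, weights)
    # Apply the sigmoid function (the activation function) to get the final output
    return 1 / (1 + np.exp(-temp))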

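[Now]:

np.random.seed(1)
# Hidden layer L1: 4 neurons
w0 = 2 * np.random.random((3,4)) - 1
# Output layer L2: 1 neuron
w1 = 2 * np.random.random((4,1)) - 1

# Forward propagation
def fp(input):
    # Hidden layer activations: sigmoid(input · w0)
    l1 = 1/(1 + np.exp(-np.dot(input, w0)))
    # Output layer activations: sigmoid(l1 · w1)
    l2 = 1/(1 + np.exp(-np.dot(l1, w1)))
    return l1, l2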
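
It can help to trace the array shapes through the two-layer forward pass. The sketch below repeats the initialization and `fp` from the [Now] version so that it runs on its own, and feeds in the four training rows used earlier; the shape printing is my addition for illustration.

```python
import numpy as np

np.random.seed(1)
w0 = 2 * np.random.random((3, 4)) - 1   # input (3 values) -> hidden layer (4 neurons)
w1 = 2 * np.random.random((4, 1)) - 1   # hidden layer (4 neurons) -> output (1 neuron)

def fp(input):
    l1 = 1 / (1 + np.exp(-np.dot(input, w0)))
    l2 = 1 / (1 + np.exp(-np.dot(l1, w1)))
    return l1, l2

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
l1, l2 = fp(X)

print(X.shape, w0.shape, l1.shape)   # (4, 3) (3, 4) (4, 4)
print(l1.shape, w1.shape, l2.shape)  # (4, 4) (4, 1) (4, 1)
```

Each of the 4 rows gets 4 hidden activations and then 1 output, so the inner dimensions of every dot product have to match: 3 with 3, then 4 with 4.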

4. The BP (backpropagation) process for a multi-layer network

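[Before]:

# Backpropagation
def bp(Y, output):
    # Measure how far our computed output is from what actually happened
    error = Y - output
    # Compute the slope (the derivative of the sigmoid at the output)
    slope = output * (1 - output)
    # Compute the delta (the increment)
    return error * slope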

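[Now]:

# Backpropagation
def bp(l1, l2, Y):
    # Output layer: error times the sigmoid slope at l2
    error = Y - l2
    slope = l2 * (1 - l2)
    l1_delta = error * slope

    # Hidden layer: pass the output delta back through w1,
    # then scale by the sigmoid slope at l1
    l0_slope = l1 * (1 - l1)
    l0_error = np.dot(l1_delta, w1.T)
    l0_delta = l0_slope * l0_error
    return l0_delta, l1_delta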
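
One way to sanity-check these delta formulas is a numerical gradient check. The sketch below rests on an assumption the post does not state: that the network is minimizing half the sum of squared errors, `0.5 * sum((Y - l2)**2)`. Under that assumption, the weight increments built from the two deltas should equal the negative gradient of the loss. The helper names (`forward`, `loss`, `bp_updates`, `numeric_gradient`) are hypothetical and exist only for this illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward(X, w0, w1):
    l1 = sigmoid(np.dot(X, w0))
    l2 = sigmoid(np.dot(l1, w1))
    return l1, l2

def loss(X, Y, w0, w1):
    # Assumed loss (not stated in the post): half the sum of squared errors
    _, l2 = forward(X, w0, w1)
    return 0.5 * np.sum((Y - l2) ** 2)

def bp_updates(X, Y, w0, w1):
    # Same delta formulas as bp above, followed by the weight increments
    l1, l2 = forward(X, w0, w1)
    l1_delta = (Y - l2) * l2 * (1 - l2)
    l0_delta = np.dot(l1_delta, w1.T) * l1 * (1 - l1)
    return np.dot(X.T, l0_delta), np.dot(l1.T, l1_delta)

def numeric_gradient(f, w, eps=1e-6):
    # Central finite differences, one weight at a time (w is restored afterwards)
    grad = np.zeros_like(w)
    for idx in np.ndindex(*w.shape):
        old = w[idx]
        w[idx] = old + eps
        up = f()
        w[idx] = old - eps
        down = f()
        w[idx] = old
        grad[idx] = (up - down) / (2 * eps)
    return grad

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
Y = np.array([[0, 1, 1, 0]]).T
np.random.seed(1)
w0 = 2 * np.random.random((3, 4)) - 1
w1 = 2 * np.random.random((4, 1)) - 1

dw0, dw1 = bp_updates(X, Y, w0, w1)
f = lambda: loss(X, Y, w0, w1)
g0 = numeric_gradient(f, w0)
g1 = numeric_gradient(f, w1)

# The increments should match the negative gradient of the assumed loss
print(np.allclose(dw0, -g0, atol=1e-6))  # True
print(np.allclose(dw1, -g1, atol=1e-6))  # True
```

Adding the increments to the weights is then a gradient descent step with a learning rate of 1.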

5. Refactoring the code

import numpy as np

# Forward propagation
def fp(input):
    l1 = 1/(1 + np.exp(-np.dot(input, w0)))
    l2 = 1/(1 + np.exp(-np.dot(l1, w1)))
    return l1, l2

# Backpropagation
def bp(l1, l2, Y):
    error = Y - l2
    slope = l2 * (1 - l2)
    l1_delta = error * slope

    l0_slope = l1 * (1 - l1)
    l0_error = np.dot(l1_delta, w1.T)
    l0_delta = l0_slope * l0_error
    # Return the deltas (increments) for both layers
    return l0_delta, l1_delta

X = np.array([[0,0,1], [0,1,1], [1,0,1], [1,1,1]])
Y = np.array([[0,1,1,0]]).T

np.random.seed(1)
# Hidden layer L1: 4 neurons
w0 = 2 * np.random.random((3,4)) - 1
# Output layer L2: 1 neuron
w1 = 2 * np.random.random((4,1)) - 1

# # Use 10 hidden neurons instead
# w0 = 2 * np.random.random((3,10)) - 1
# w1 = 2 * np.random.random((10,1)) - 1

for it in range(10000):
    l0 = X
    l1, l2 = fp(l0)
    l0_delta, l1_delta = bp(l1, l2, Y)
    # Update the weights
    w1 += np.dot(l1.T, l1_delta)
    w0 += np.dot(l0.T, l0_delta)

print(w0)
print(w1)
print(fp([[0,0,1]])[1])  # predict a row the model has learned
print(fp([[0,1,1]])[1])  # predict a row the model has learned
print(fp([[1,0,0]])[1])  # predict a new, unseen row

[Output]:

[[ 4.05751199  3.77592492 -6.07774883 -3.91512755]
 [-1.92447764 -5.52099478 -6.39028245 -3.19429746]
 [ 0.57447316 -1.83960582  2.41541223  5.23881505]]
 
[[-6.01822566]
 [ 6.39184307]
 [-9.63829175]
 [ 6.90678217]]
 
[[0.00702172]]
[[0.99101005]]
[[0.6096127]]

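[Conclusion]: For the rows the network has already learned, [0,0,1] and [0,1,1], the predictions are 0.00702172 and 0.99101005, which are correct (close to 0 and 1 respectively). For the unseen row [1,0,0], the prediction is 0.6096127; since we generally treat any value above 0.5 as 1, this is also correct. Because our dataset is so small, the result of a harder prediction like this will not always be close to 1. Real applications use far larger datasets, and the results are usually much better. As for the number of neurons, the code above includes a commented-out variant with 10 neurons; the best number can only be found by trying and training several times.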
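
The two ideas in this conclusion, thresholding at 0.5 and trying different numbers of hidden neurons, can be sketched together. The `train` helper below is hypothetical: it wraps the same training loop as the refactored code so that the hidden layer size becomes a parameter, and it reports the thresholded predictions and the mean absolute error on the four training rows. The sizes 4 and 10 come from the post; everything else is an assumption made for illustration.

```python
import numpy as np

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
Y = np.array([[0, 1, 1, 0]]).T

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train(hidden, iterations=10000, seed=1):
    """Train a 3 -> hidden -> 1 network on X, Y and return its outputs for X."""
    np.random.seed(seed)
    w0 = 2 * np.random.random((3, hidden)) - 1
    w1 = 2 * np.random.random((hidden, 1)) - 1
    for _ in range(iterations):
        l1 = sigmoid(np.dot(X, w0))
        l2 = sigmoid(np.dot(l1, w1))
        l1_delta = (Y - l2) * l2 * (1 - l2)
        l0_delta = np.dot(l1_delta, w1.T) * l1 * (1 - l1)
        w1 += np.dot(l1.T, l1_delta)
        w0 += np.dot(X.T, l0_delta)
    l1 = sigmoid(np.dot(X, w0))
    return sigmoid(np.dot(l1, w1))

results = {}
for hidden in (4, 10):
    output = train(hidden)
    labels = (output > 0.5).astype(int)            # treat anything above 0.5 as 1
    mean_error = float(np.mean(np.abs(Y - output)))
    results[hidden] = (labels, mean_error)
    print(hidden, labels.ravel(), mean_error)
```

With only four training rows this comparison says little about which size generalizes better; it only shows how such an experiment could be set up.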

6. Closing remarks

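These two posts are only a brief introduction to what a neural network is. To really understand how neural networks are applied in different areas, you should read more books and study the topic in depth. In future posts I will share more advanced applications of neural networks so that we can explore and learn together. Thank you!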

Python

Deep learning

Neural networks