DL之NN: Advanced optimization of the NN algorithm (local dataset of 50,000 training images): three parameter improvements to further raise handwritten-digit recognition accuracy
Overview
The previous article compared three algorithms for handwritten-digit recognition. Among them, SVM and the neural network performed very well, both reaching accuracies above 90%. This article explores further optimizations of the neural network algorithm to push the accuracy higher; testing shows a clear improvement.
Related articles
CNN: Advanced optimization of neural network algorithms in artificial intelligence, six different optimization algorithms for handwritten-digit recognition with step-by-step improvement; application case in autonomous driving: capturing and recognizing surrounding license plate numbers
Design approach
Change 1:
First, in the weight-initialization step, adopt a better random initialization: keep the mean of the normal distribution unchanged and modify only its standard deviation. Scaling each weight by 1/√x (where x is the number of inputs feeding a neuron) keeps the weighted input z from growing with the layer width, so hidden neurons are much less likely to start out saturated.
Weight initialization before the change:
def large_weight_initializer(self):
    # Biases and weights are both drawn from a standard normal N(0, 1)
    self.biases = [np.random.randn(y, 1) for y in self.sizes[1:]]
    self.weights = [np.random.randn(y, x)
                    for x, y in zip(self.sizes[:-1], self.sizes[1:])]
Weight initialization after the change:
def default_weight_initializer(self):
    # Each weight is divided by sqrt(x), x = number of inputs to the neuron,
    # so weights are drawn from N(0, 1/x) while biases stay N(0, 1)
    self.biases = [np.random.randn(y, 1) for y in self.sizes[1:]]
    self.weights = [np.random.randn(y, x)/np.sqrt(x)
                    for x, y in zip(self.sizes[:-1], self.sizes[1:])]
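To see the effect of the new standard deviation, here is a minimal standalone numpy sketch (illustrative only; the layer width of 1000 and the all-ones input are arbitrary assumptions, not part of the article's network):

import numpy as np

np.random.seed(0)
n_in = 1000               # hypothetical number of inputs to one neuron
x = np.ones(n_in)         # fixed input vector, chosen for illustration

# Old scheme: weights ~ N(0, 1), so z = w.x has std ~ sqrt(n_in) ~ 31.6
# and a sigmoid neuron starts out almost always saturated
z_old = np.array([np.random.randn(n_in) @ x for _ in range(10000)])

# New scheme: weights ~ N(0, 1/n_in), so z has std ~ 1 and the neuron
# stays in the sensitive region of the sigmoid
z_new = np.array([(np.random.randn(n_in) / np.sqrt(n_in)) @ x
                  for _ in range(10000)])

print(z_old.std())        # roughly 31.6
print(z_new.std())        # roughly 1.0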
Change 2:
To reduce overfitting and lessen the influence of local noise in the data, replace the original quadratic cost objective with the cross-entropy cost. With cross-entropy, the output-layer error reduces to a − y (see delta below), so learning no longer slows down when a saturated output neuron is badly wrong.
class CrossEntropyCost(object):
    @staticmethod
    def fn(a, y):
        # np.nan_to_num turns the nan from 0*log(0) into the correct 0.0
        return np.sum(np.nan_to_num(-y*np.log(a)-(1-y)*np.log(1-a)))
    @staticmethod
    def delta(z, a, y):
        # Output-layer error; z is unused but kept for interface consistency
        return (a-y)
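The gain from this change is easy to check numerically. The sketch below (standalone, illustrative values only, not the article's network code) compares the output-layer error of the quadratic and cross-entropy costs for a single saturated sigmoid neuron:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    return sigmoid(z) * (1.0 - sigmoid(z))

z = 5.0            # large weighted input: the neuron is saturated
a = sigmoid(z)     # output close to 1.0
y = 0.0            # desired output: the neuron is badly wrong

# The quadratic cost multiplies the error by sigmoid'(z), which is tiny
# here, so gradient descent barely moves the weights
delta_quadratic = (a - y) * sigmoid_prime(z)

# The cross-entropy cost cancels the sigmoid'(z) factor: the error stays large
delta_cross_entropy = a - y

print(delta_quadratic)       # about 0.0066
print(delta_cross_entropy)   # about 0.9933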
Change 3:
Replace the sigmoid (S-shaped) activation of the output layer with a softmax function: each output a_j = e^(z_j) / Σ_k e^(z_k), so the ten outputs are positive and sum to 1, forming a probability distribution over the digit classes.
# The softmax output layer below relies on Theano; imports added for
# completeness. dropout_layer is a helper defined elsewhere in the full source.
import numpy as np
import theano
import theano.tensor as T
from theano.tensor.nnet import softmax

class SoftmaxLayer(object):

    def __init__(self, n_in, n_out, p_dropout=0.0):
        self.n_in = n_in
        self.n_out = n_out
        self.p_dropout = p_dropout
        # Weights and biases start at zero and live in shared variables so
        # Theano can update them in place during training
        self.w = theano.shared(
            np.zeros((n_in, n_out), dtype=theano.config.floatX),
            name='w', borrow=True)
        self.b = theano.shared(
            np.zeros((n_out,), dtype=theano.config.floatX),
            name='b', borrow=True)
        self.params = [self.w, self.b]

    def set_inpt(self, inpt, inpt_dropout, mini_batch_size):
        self.inpt = inpt.reshape((mini_batch_size, self.n_in))
        # At inference time, scale by (1 - p_dropout) to compensate for the
        # units that were dropped during training
        self.output = softmax((1-self.p_dropout)*T.dot(self.inpt, self.w) + self.b)
        self.y_out = T.argmax(self.output, axis=1)
        self.inpt_dropout = dropout_layer(
            inpt_dropout.reshape((mini_batch_size, self.n_in)), self.p_dropout)
        self.output_dropout = softmax(T.dot(self.inpt_dropout, self.w) + self.b)

    def cost(self, net):
        "Return the log-likelihood cost."
        return -T.mean(T.log(self.output_dropout)[T.arange(net.y.shape[0]), net.y])

    def accuracy(self, y):
        "Return the accuracy for the mini-batch."
        return T.mean(T.eq(y, self.y_out))
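For intuition independent of Theano, here is a minimal numpy sketch (illustrative only, not the article's code) showing why softmax suits the log-likelihood cost above: it turns arbitrary output scores into a probability distribution:

import numpy as np

def softmax_np(z):
    # Subtract the max before exponentiating for numerical stability
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])   # arbitrary output-layer scores
a = softmax_np(z)

print(a)              # approximately [0.659 0.242 0.099]
print(a.sum())        # 1.0 -- a valid probability distribution
print(-np.log(a[0]))  # the log-likelihood cost if class 0 is the true label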