

[Artificial Intelligence Elective Course] Lab Report

MNIST Handwritten Digit Recognition Based on a Neural Network

This is just the programming assignment for the "Artificial Intelligence" elective course.

Table of Contents

- MNIST Handwritten Digit Recognition Based on a Neural Network
- I. Experiment Objectives
- II. Experiment Content
  - MNIST Dataset
  - Stratified Sampling
  - Evaluation Methods for Neural Network Models
- III. Experiment Method Design
  - 1) Model construction
  - 2) Iterative model training
  - 3) Periodic testing during training
  - 4) Stratified sampling
  - 5) k-fold cross validation
- IV. Experiment Results
  - 1) Accuracy on the validation set
  - 2) Effect of model parameters (number of hidden layers, hidden nodes)
  - 3) Effect of training parameters (batch size, epoch num, learning rate)
  - 4) Effect of different hold-out ratios
  - 5) Effect of different k values in k-fold cross validation
- V. Summary and Reflections
- References

I. Experiment Objectives

- Master the use of neural network models to solve supervised learning problems
- Master the model training and testing methods commonly used in machine learning
- Understand how the choice of training method affects test results

II. Experiment Content

MNIST Dataset

The dataset used in this experiment is MNIST, a collection of handwritten digit images together with their labels. All images are 28x28 pixels, and each has been preprocessed so that the digit sits at the center of the image. MNIST is stored in binary format. Each image is a one-dimensional vector of length 784 (28x28x1, i.e., a single-channel 28x28 grayscale image), and each label is a one-dimensional (one-hot) vector of length 10.
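For reference, one common way to load this dataset in TensorFlow 1.x is the bundled tutorial helper; the report does not show its loading code, so the snippet below (including the `MNIST_data/` path) is an illustrative assumption:

```python
# Illustrative loading sketch using the TF 1.x tutorial helper (path is an assumption)
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
print(mnist.train.images.shape)  # (55000, 784): flattened 28x28 grayscale images
print(mnist.train.labels.shape)  # (55000, 10): one-hot label vectors
```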

Stratified Sampling

Stratified sampling (also called typed sampling) divides the overall population into several classes and then samples within each class separately. Because of this per-class division, the sampled subset has a class distribution similar to the population's and is therefore more representative. In this experiment, MNIST contains the ten digits 0-9, so stratified sampling first splits the dataset into 10 classes by digit and then samples each class in the same way.
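To make the procedure concrete, here is a minimal sketch (not the report's actual code) that groups MNIST samples by digit and draws the same fraction from each class; `stratified_sample` is a hypothetical helper name:

```python
import numpy as np

def stratified_sample(images, labels, fraction, seed=0):
    """Draw the same fraction of samples from each digit class (labels are one-hot)."""
    rng = np.random.default_rng(seed)
    digits = np.argmax(labels, axis=1)  # one-hot vectors -> digit ids 0..9
    picked = []
    for d in range(10):
        idx = np.where(digits == d)[0]  # indices belonging to class d
        rng.shuffle(idx)
        picked.append(idx[:int(len(idx) * fraction)])  # same fraction per class
    picked = np.concatenate(picked)
    return images[picked], labels[picked]
```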

Evaluation Methods for Neural Network Models

Usually, a neural network model's error is estimated experimentally: a test set measures the model's ability to classify new samples, and the test error on this set serves as an approximation of the true error. Two common ways to split a dataset into training and test sets are:

Hold-out: directly split the dataset into two disjoint sets at a chosen ratio. To keep the data distribution as consistent as possible, the split can use stratified sampling, so that the class proportions in the training and test sets are similar. Note that if the test set is not distributed evenly over the whole dataset, additional bias can be introduced, so the estimate from a single hold-out split is often not stable or reliable. In practice, the hold-out method performs several random splits, repeats the evaluation, and reports the average as the final estimate.

k-fold cross validation: first partition the dataset into k disjoint subsets of similar size, each preserving the data distribution as much as possible (i.e., again using stratified sampling). In each round, the union of k-1 subsets is used as the training set and the remaining subset as the test set, giving k training/test pairs and hence k rounds of training and testing. The final result is the mean of the k test results. Clearly, the stability and fidelity of the k-fold estimate depend largely on the value of k; k = 10 is the most common choice, with 5 and 20 also frequently used.
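As a concrete illustration, the stratified k-fold splitting described here can be written with sklearn's StratifiedKFold. This is a sketch, assuming `evaluate` is a caller-supplied train-and-test callback; the report's own implementation in Section III uses plain KFold instead:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def stratified_kfold_accuracy(images, labels, k, evaluate):
    """Average test accuracy over k stratified folds; labels are one-hot."""
    digits = np.argmax(labels, axis=1)  # StratifiedKFold needs class ids, not one-hot
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
    accs = [evaluate(images[tr], labels[tr], images[te], labels[te])
            for tr, te in skf.split(images, digits)]
    return np.mean(accs)  # the k-fold estimate is the mean of the k test results
```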

III. Experiment Method Design

This section describes the program's overall design and the approach to the key steps, including:

1) Program design for model construction (pseudocode or source code screenshot) and explanation (10 points)

A fully connected neural network is built, with layer sizes 784 -> 128 -> 128 -> 10.

The Adam optimizer is used, and the loss is the softmax cross-entropy.
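Concretely, for logits $z$ and a one-hot label $y$, the loss minimized here is the standard softmax cross-entropy (this formula is background, not quoted from the report):

$$
\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{10} e^{z_j}}, \qquad
L = -\sum_{i=1}^{10} y_i \log \mathrm{softmax}(z)_i
$$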

See the code comments for details.

```python
# Build and train the model
def train_and_test(images_train, labels_train, images_test, labels_test,
                   images_validation, labels_validation):
    x = tf.placeholder(tf.float32, [None, 784], name="X")
    y = tf.placeholder(tf.float32, [None, 10], name="Y")
    h1 = fcn_layer(inputs=x, input_dim=784, output_dim=128, activation=tf.nn.relu)
    h2 = fcn_layer(inputs=h1, input_dim=128, output_dim=128, activation=tf.nn.relu)
    forward = fcn_layer(inputs=h2, input_dim=128, output_dim=10, activation=None)
    pred = tf.nn.softmax(forward)
    loss_function = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits=forward, labels=y))
    optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss_function)  # optimizer
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))  # compare predictions with ground truth
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
```

where the fcn_layer function is:

```python
def fcn_layer(inputs,            # input data
              input_dim,         # number of input neurons
              output_dim,        # number of output neurons
              activation=None):  # activation function
    W = tf.Variable(tf.truncated_normal([input_dim, output_dim], stddev=0.1))  # initialize weights
    b = tf.Variable(tf.zeros([output_dim]))  # initialize biases to zero
    XWb = tf.matmul(inputs, W) + b
    return XWb if activation is None else activation(XWb)
```

2) Program design for iterative model training (pseudocode or source code screenshot) and explanation (10 points)

See the code comments for details.
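Note that the training loop below relies on a `batch_iter` helper that is not listed in the report; the following is a minimal sketch of what such a generator might look like, with its signature inferred from the call site:

```python
import numpy as np

def batch_iter(images, labels, batch_size, num_epochs, shuffle=True):
    """Yield (batch_images, batch_labels) mini-batches for num_epochs passes over the data."""
    n = len(images)
    for _ in range(num_epochs):
        order = np.random.permutation(n) if shuffle else np.arange(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            yield images[idx], labels[idx]
```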

```python
train_epochs = 32      # number of training epochs
batch_size = 64        # number of samples per training step (batch size)
display_step = 4096    # display granularity
learning_rate = 0.001  # learning rate

optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss_function)  # optimizer
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))  # compare predictions with ground truth
# accuracy: cast the booleans to floats and take the mean
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    init = tf.global_variables_initializer()  # initialize variables
    sess.run(init)
    step = 0
    for (batchImages, batchLabels) in batch_iter(images_train, labels_train,
                                                 batch_size, train_epochs, shuffle=True):
        sess.run(optimizer, feed_dict={x: batchImages, y: batchLabels})
```

3) Program design for periodic testing during training (pseudocode or source code screenshot) and explanation (periodic testing means testing the model once every n training steps to obtain its accuracy and loss) (10 points)

See the code comments for details.

```python
display_step = 4096  # display granularity

with tf.Session() as sess:
    init = tf.global_variables_initializer()  # initialize variables
    sess.run(init)
    step = 0
    for (batchImages, batchLabels) in batch_iter(images_train, labels_train,
                                                 batch_size, train_epochs, shuffle=True):
        sess.run(optimizer, feed_dict={x: batchImages, y: batchLabels})
        if step % display_step == 0:
            # periodic test on the validation set
            loss, acc = sess.run([loss_function, accuracy],
                                 feed_dict={x: images_validation, y: labels_validation})
            print(f"step: {step+1} Loss= {loss} accuracy= {acc}")
        step += 1
```

Output:

```
step: 1 Loss= 2.238192558288574 accuracy= 0.17159998416900635
step: 4097 Loss= 0.09725397080183029 accuracy= 0.9717997312545776
step: 8193 Loss= 0.10235630720853806 accuracy= 0.9781997203826904
step: 12289 Loss= 0.13071678578853607 accuracy= 0.9735997915267944
step: 16385 Loss= 0.12960655987262726 accuracy= 0.9757996797561646
step: 20481 Loss= 0.14140461385250092 accuracy= 0.9765996932983398
step: 24577 Loss= 0.16358020901679993 accuracy= 0.9759997129440308
=== test accuracy: 0.97 ===
```

4) Program design for stratified sampling (pseudocode or source code screenshot) and explanation (10 points)

Implemented with train_test_split from sklearn. Training and test sets are randomly drawn ten times, and the results are averaged.

See the code comments for details.

```python
# Hold-out method
from sklearn.model_selection import train_test_split

def hold_out(images, labels, train_percentage):
    accu = []
    # randomly draw training and test sets ten times, then average
    for _ in range(10):
        train_images, test_images, train_labels, test_labels = train_test_split(
            images, labels,
            train_size=train_percentage,  # proportion used for training
            stratify=labels               # preserve the class distribution
        )
        accu.append(train_and_test(train_images, train_labels,
                                   test_images, test_labels,
                                   test_images, test_labels))
    print("hold-out accuracy:", accu)
```

5) Program design for k-fold cross validation (pseudocode or source code screenshot) and explanation (10 points)

Implemented with KFold from sklearn; the accuracies over the k different splits are averaged. (Note that sklearn's plain KFold does not stratify; StratifiedKFold would match the stratified-sampling requirement described in Section II more closely.)

See the code comments for details.

```python
# k-fold cross validation
import numpy as np
from sklearn.model_selection import KFold

def cross_validation(images, labels, k):
    accu = []
    kf = KFold(n_splits=k, shuffle=True)
    for train_index, test_index in kf.split(images):
        images_train, images_test = images[train_index], images[test_index]
        labels_train, labels_test = labels[train_index], labels[test_index]
        accu.append(train_and_test(images_train, labels_train,
                                   images_test, labels_test,
                                   images_test, labels_test))
    print("cross-validation accuracy:", np.mean(accu))
```

IV. Experiment Results

This section presents the program's output, running results, and related analysis, including:

1) Model accuracy on the validation set (output and screenshot) (10 points)

```
step: 1 Loss= 2.238192558288574 accuracy= 0.17159998416900635
step: 4097 Loss= 0.09725397080183029 accuracy= 0.9717997312545776
step: 8193 Loss= 0.10235630720853806 accuracy= 0.9781997203826904
step: 12289 Loss= 0.13071678578853607 accuracy= 0.9735997915267944
step: 16385 Loss= 0.12960655987262726 accuracy= 0.9757996797561646
step: 20481 Loss= 0.14140461385250092 accuracy= 0.9765996932983398
step: 24577 Loss= 0.16358020901679993 accuracy= 0.9759997129440308
=== test accuracy: 0.97 ===
```

2) Effect of different model parameters (number of hidden layers, number of hidden nodes) on accuracy, with analysis (10 points)

Different numbers of hidden layers

With 0 hidden layers:

```
step: 1 Loss= 2.5073938369750977 accuracy= 0.0729999914765358
step: 4097 Loss= 0.27769413590431213 accuracy= 0.9217997789382935
step: 8193 Loss= 0.26662880182266235 accuracy= 0.9259997010231018
step: 12289 Loss= 0.263393372297287 accuracy= 0.9231997728347778
step: 16385 Loss= 0.26742368936538696 accuracy= 0.9237997531890869
step: 20481 Loss= 0.26651620864868164 accuracy= 0.9251997470855713
step: 24577 Loss= 0.26798802614212036 accuracy= 0.9247996807098389
=== test accuracy: 0.9248 ===
0.92479974
```

With 1 hidden layer:

```
step: 1 Loss= 2.4127447605133057 accuracy= 0.09719999134540558
step: 4097 Loss= 0.08607088774442673 accuracy= 0.9745997190475464
step: 8193 Loss= 0.07784661650657654 accuracy= 0.9785997271537781
step: 12289 Loss= 0.095745749771595 accuracy= 0.9759998321533203
step: 16385 Loss= 0.09472983330488205 accuracy= 0.9799997210502625
step: 20481 Loss= 0.09713517129421234 accuracy= 0.9787996411323547
step: 24577 Loss= 0.0993366464972496 accuracy= 0.9801996946334839
=== test accuracy: 0.9802 ===
0.98019964
```

With 2 hidden layers:

```
step: 1 Loss= 2.238192558288574 accuracy= 0.17159998416900635
step: 4097 Loss= 0.09725397080183029 accuracy= 0.9717997312545776
step: 8193 Loss= 0.10235630720853806 accuracy= 0.9781997203826904
step: 12289 Loss= 0.13071678578853607 accuracy= 0.9735997915267944
step: 16385 Loss= 0.12960655987262726 accuracy= 0.9757996797561646
step: 20481 Loss= 0.14140461385250092 accuracy= 0.9765996932983398
step: 24577 Loss= 0.16358020901679993 accuracy= 0.9759997129440308
=== test accuracy: 0.97 ===
```

In summary: adding hidden layers lengthens training and improves accuracy over a model with no hidden layer, but the marginal gain shrinks quickly as more layers are added (here the two-layer model actually scored slightly lower than the one-layer model).

Different numbers of hidden nodes

With hidden layer widths 10 x 10:

```
step: 1 Loss= 2.300844669342041 accuracy= 0.12519998848438263
step: 4097 Loss= 0.2754775583744049 accuracy= 0.9239997863769531
step: 8193 Loss= 0.24036210775375366 accuracy= 0.9319997429847717
step: 12289 Loss= 0.22833241522312164 accuracy= 0.9349997639656067
step: 16385 Loss= 0.22694511711597443 accuracy= 0.9351996779441833
step: 20481 Loss= 0.2160138636827469 accuracy= 0.9395997524261475
step: 24577 Loss= 0.20927678048610687 accuracy= 0.9417997598648071
=== test accuracy: 0.9392 ===
0.93919969
```

With hidden layer widths 16 x 16:

```
step: 1 Loss= 2.302095890045166 accuracy= 0.10459998995065689
step: 4097 Loss= 0.24206139147281647 accuracy= 0.9285997152328491
step: 8193 Loss= 0.19353719055652618 accuracy= 0.9429997801780701
step: 12289 Loss= 0.18354550004005432 accuracy= 0.9491997361183167
step: 16385 Loss= 0.18149533867835999 accuracy= 0.9485996961593628
step: 20481 Loss= 0.1877274215221405 accuracy= 0.9493997097015381
step: 24577 Loss= 0.1913667917251587 accuracy= 0.951799750328064
=== test accuracy: 0.9548 ===
0.95479971
```

With hidden layer widths 128 x 128:

```
step: 1 Loss= 2.238192558288574 accuracy= 0.17159998416900635
step: 4097 Loss= 0.09725397080183029 accuracy= 0.9717997312545776
step: 8193 Loss= 0.10235630720853806 accuracy= 0.9781997203826904
step: 12289 Loss= 0.13071678578853607 accuracy= 0.9735997915267944
step: 16385 Loss= 0.12960655987262726 accuracy= 0.9757996797561646
step: 20481 Loss= 0.14140461385250092 accuracy= 0.9765996932983398
step: 24577 Loss= 0.16358020901679993 accuracy= 0.9759997129440308
=== test accuracy: 0.97 ===
```

In summary:

More hidden nodes make the parameter count rise steeply (roughly quadratically in the layer width for this two-hidden-layer architecture), so training takes longer, but accuracy improves.

(However, once the parameter count reaches a certain level, the accuracy gain levels off, and overfitting can even set in.)
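To make the parameter-count claim concrete, a quick computation for the 784 -> n -> n -> 10 architecture used here (weights plus biases; the growth is roughly quadratic in the layer width n):

```python
def num_params(n):
    """Weights + biases of a fully connected 784 -> n -> n -> 10 network."""
    return (784 * n + n) + (n * n + n) + (n * 10 + 10)

for n in (10, 16, 128):
    print(n, num_params(n))  # 10: 8070, 16: 13002, 128: 118282
```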

3) Effect of different training parameters (batch size, epoch num, learning rate) on accuracy, with analysis (10 points)

Different batch sizes

batch size = 64:

```
step: 1 Loss= 2.238192558288574 accuracy= 0.17159998416900635
step: 4097 Loss= 0.09725397080183029 accuracy= 0.9717997312545776
step: 8193 Loss= 0.10235630720853806 accuracy= 0.9781997203826904
step: 12289 Loss= 0.13071678578853607 accuracy= 0.9735997915267944
step: 16385 Loss= 0.12960655987262726 accuracy= 0.9757996797561646
step: 20481 Loss= 0.14140461385250092 accuracy= 0.9765996932983398
step: 24577 Loss= 0.16358020901679993 accuracy= 0.9759997129440308
=== test accuracy: 0.97 ===
```

batch size = 4096:

```
step: 1 Loss= 2.2310731410980225 accuracy= 0.16619999706745148
=== test accuracy: 0.9718 ===
0.97179973
```

batch size = 32786:

```
step: 1 Loss= 2.2919344902038574 accuracy= 0.15059998631477356
=== test accuracy: 0.8782 ===
0.87819982
```

In summary: the larger the batch size, the faster training runs, since far fewer optimizer steps are needed per epoch. (As a sanity check, 32 epochs at batch size 64 produced about 24,577 steps, roughly 768 steps per epoch, which matches a training set of about 49,000 images.)

However, an overly large batch size consumes too much GPU memory and may even overflow it; it is also detrimental to stochastic gradient descent, which is consistent with the accuracy drop at batch size 32786 above.

Different epoch numbers

epoch num = 1:

```
step: 1 Loss= 2.343087673187256 accuracy= 0.11779998987913132
step: 129 Loss= 0.330795556306839 accuracy= 0.9037997722625732
step: 257 Loss= 0.24847447872161865 accuracy= 0.925399661064148
step: 385 Loss= 0.198109969496727 accuracy= 0.9409997463226318
step: 513 Loss= 0.17987975478172302 accuracy= 0.9479997754096985
step: 641 Loss= 0.1628917008638382 accuracy= 0.9541996717453003
step: 769 Loss= 0.14910820126533508 accuracy= 0.9547997117042542
=== test accuracy: 0.9526 ===
0.95259976
```

epoch num = 32:

```
step: 1 Loss= 2.238192558288574 accuracy= 0.17159998416900635
step: 4097 Loss= 0.09725397080183029 accuracy= 0.9717997312545776
step: 8193 Loss= 0.10235630720853806 accuracy= 0.9781997203826904
step: 12289 Loss= 0.13071678578853607 accuracy= 0.9735997915267944
step: 16385 Loss= 0.12960655987262726 accuracy= 0.9757996797561646
step: 20481 Loss= 0.14140461385250092 accuracy= 0.9765996932983398
step: 24577 Loss= 0.16358020901679993 accuracy= 0.9759997129440308
=== test accuracy: 0.97 ===
```

In summary: the epoch number determines how much data the model trains on before stopping. Ideally, training should stop only after the model's accuracy has stabilized; stopping earlier leaves the accuracy short of what the model can reach.
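One standard way to "stop after accuracy stabilizes" is early stopping on the validation accuracy. The report does not implement this; the sketch below assumes two caller-supplied callbacks (`train_one_epoch`, `validate`) and an illustrative `patience` parameter:

```python
def train_with_early_stopping(train_one_epoch, validate, max_epochs=100, patience=5):
    """Stop once validation accuracy has not improved for `patience` consecutive epochs."""
    best_acc, bad_epochs = 0.0, 0
    for _ in range(max_epochs):
        train_one_epoch()       # run one pass over the training data
        acc = validate()        # periodic test: returns validation accuracy
        if acc > best_acc:
            best_acc, bad_epochs = acc, 0   # improved: reset the counter
        else:
            bad_epochs += 1
            if bad_epochs >= patience:      # plateaued long enough: stop
                break
    return best_acc
```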

Different learning rates

learning rate = 0.0001:

```
step: 1 Loss= 2.328317642211914 accuracy= 0.09679999947547913
step: 129 Loss= 1.5215153694152832 accuracy= 0.7011998891830444
step: 257 Loss= 0.8109210133552551 accuracy= 0.8205997943878174
step: 385 Loss= 0.5582057237625122 accuracy= 0.863199770450592
step: 513 Loss= 0.4527219235897064 accuracy= 0.8837997317314148
step: 641 Loss= 0.39591166377067566 accuracy= 0.8925997614860535
step: 769 Loss= 0.3588014245033264 accuracy= 0.8997997641563416
=== test accuracy: 0.9004 ===
0.9003998
```

learning rate = 0.001:

```
step: 1 Loss= 2.343087673187256 accuracy= 0.11779998987913132
step: 129 Loss= 0.330795556306839 accuracy= 0.9037997722625732
step: 257 Loss= 0.24847447872161865 accuracy= 0.925399661064148
step: 385 Loss= 0.198109969496727 accuracy= 0.9409997463226318
step: 513 Loss= 0.17987975478172302 accuracy= 0.9479997754096985
step: 641 Loss= 0.1628917008638382 accuracy= 0.9541996717453003
step: 769 Loss= 0.14910820126533508 accuracy= 0.9547997117042542
=== test accuracy: 0.9526 ===
0.95259976
```

learning rate = 0.01:

```
step: 1 Loss= 2.3110170364379883 accuracy= 0.2709999680519104
step: 129 Loss= 0.2452460527420044 accuracy= 0.9277997016906738
step: 257 Loss= 0.215981587767601 accuracy= 0.9361997246742249
step: 385 Loss= 0.21104326844215393 accuracy= 0.9363997578620911
step: 513 Loss= 0.172766774892807 accuracy= 0.9469997882843018
step: 641 Loss= 0.14438582956790924 accuracy= 0.9573997855186462
step: 769 Loss= 0.15849816799163818 accuracy= 0.9527996778488159
=== test accuracy: 0.9558 ===
0.95579976
```

learning rate = 0.1:

```
step: 1 Loss= 43.770484924316406 accuracy= 0.10619999468326569
step: 129 Loss= 1.7850791215896606 accuracy= 0.2809999883174896
step: 257 Loss= 1.7752128839492798 accuracy= 0.3105999827384949
step: 385 Loss= 1.719871997833252 accuracy= 0.3147999942302704
step: 513 Loss= 1.6704318523406982 accuracy= 0.3511999845504761
step: 641 Loss= 1.6277217864990234 accuracy= 0.34059998393058777
step: 769 Loss= 1.8401107788085938 accuracy= 0.2733999788761139
=== test accuracy: 0.2738 ===
0.27379999
```

In summary:

A smaller learning rate yields higher accuracy but slower convergence.

A larger learning rate speeds up learning but is prone to oscillation; at a learning rate of 0.1, training above failed to converge at all.

4) Effect of different hold-out ratios on the results, with analysis (10 points)

```python
print("===== hold-out =====")
print("train_percentage: 0.8: ", end='')
hold_out(total_images, total_labels, 0.8)
print("train_percentage: 0.9: ", end='')
hold_out(total_images, total_labels, 0.9)
print("train_percentage: 0.5: ", end='')
hold_out(total_images, total_labels, 0.5)
print("train_percentage: 0.2: ", end='')
hold_out(total_images, total_labels, 0.2)
```

Results:

```
===== hold-out =====
train_percentage: 0.8: hold-out accuracy: [0.97072774, 0.974455, 0.97690958, 0.97781873, 0.97545499, 0.96990955, 0.97654593, 0.97209132, 0.97390956, 0.97772777]
train_percentage: 0.9: hold-out accuracy: [0.97945446, 0.97327256, 0.97381806, 0.97654533, 0.97981811, 0.97472721, 0.97472715, 0.97327256, 0.97436351, 0.97163624]
train_percentage: 0.5: hold-out accuracy: [0.97501898, 0.97083724, 0.97414637, 0.97454625, 0.97087359, 0.97469169, 0.97520077, 0.96894628, 0.97367346, 0.97367346]
train_percentage: 0.2: hold-out accuracy: [0.96202344, 0.95893252, 0.95929617, 0.95911437, 0.95968258, 0.95965987, 0.96059167, 0.96009171, 0.96056885, 0.95806891]
```

In summary: an extreme train_percentage makes the evaluation less convincing (with only 20% of the data used for training, accuracy drops noticeably), so a value around 0.8 is best.
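Since hold_out prints the ten individual accuracies, averaging them (as the hold-out method prescribes) takes one extra line; for the train_percentage = 0.8 run above, the mean works out to roughly 0.9746:

```python
import numpy as np

acc_08 = [0.97072774, 0.974455, 0.97690958, 0.97781873, 0.97545499,
          0.96990955, 0.97654593, 0.97209132, 0.97390956, 0.97772777]
print(np.mean(acc_08))  # ~0.97456, the hold-out estimate for train_percentage = 0.8
```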

5) Effect of different k values in k-fold cross validation, with analysis (10 points)

```python
print("===== cross-validation =====")
print("k=5: ", end='')
cross_validation(total_images, total_labels, 5)
print("k=10: ", end='')
cross_validation(total_images, total_labels, 10)
print("k=20: ", end='')
cross_validation(total_images, total_labels, 20)
print("k=2: ", end='')
cross_validation(total_images, total_labels, 2)
```

Results:

```
===== cross-validation =====
k=5: cross-validation accuracy: 0.975146
k=10: cross-validation accuracy: 0.976927
k=20: cross-validation accuracy: 0.977491
k=2: cross-validation accuracy: 0.973474
```

In summary: k-fold cross validation is somewhat more robust to the choice of k than the hold-out method is to its split ratio; the accuracy varies only slightly across k = 2, 5, 10, and 20.

V. Summary and Reflections

By training a simple handwritten-digit-recognition neural network on the MNIST dataset, I learned techniques for training fully connected networks with TensorFlow and explored how the network's various parameters affect both the training process and the results.

I also tried the two model evaluation methods, hold-out and k-fold cross validation, and explored how their parameters affect the evaluation results.

References

Installing TensorFlow cleanly with Anaconda (no manual installation of CUDA, cuDNN, etc.):

```
conda create --name tf_gpu_env python=3.6 anaconda tensorflow-gpu
```

不踩坑:Ubuntu下安装TensorFlow的最简单方法(无需手动安装CUDA和cuDNN) - 知乎 (zhihu.com)

Fix for an issue encountered when running Jupyter:

彻底解决: AttributeError: type object 'IOLoop' has no attribute 'initialized' - Joyyang_c的博客 - CSDN博客

