

Building a Modern Objective Exam-Grading System by Reading Handwritten Examinee Card Codes (The Pattern of Handwritten Digit Recognition for an Entrance System)

Published by s.thoummaly, 2022-05-13 07:14:30

Description:
Written by: Mr. Phet Sonvilay
Advisor: Dr. Somsack Inthasone
Academic year: 2021-2022


Figure 4.16 Misclassification results of the CNN

4.1.5 Evaluation results
Evaluating and comparing the KNN, SVM and CNN models on digit classification shows that KNN classifies with 96.98% accuracy (3.02% error) during training, and reaches 96.66% accuracy when the model is applied to the 20% held-out data (test set). SVM classifies with 97.63% accuracy (2.37% error) during training and reaches 97.51% on the 20% test set. CNN classifies with 98.45% accuracy (1.55% error) during training and reaches 97.29% on the 20% test set. Comparing the three models, the CNN model learns to classify best, with 98.45% accuracy, as shown in Table 4.1 and Figure 4.17.

Table 4.1 Comparison of the KNN, SVM and CNN models

Confusion Matrix    KNN       SVM       CNN
Accuracy            96.98%    97.63%    98.45%
Precision           0.97      0.98      0.98
Recall              0.97      0.98      0.98
F-Measure           0.97      0.98      0.98

The performance comparison of the three trained models is shown in Figure 4.17.

Figure 4.17 Model performance comparison

4.1.6 Applying the model in program development

Figure 4.18 The model in use
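All four measures in Table 4.1 can be derived directly from a confusion matrix. A minimal sketch of that computation, using a small synthetic 2-class matrix rather than the thesis data:

```python
import numpy as np

def metrics_from_confusion(mtx):
    """Accuracy and macro-averaged precision/recall/F1 from a confusion matrix
    whose rows are true labels and columns are predicted labels."""
    mtx = np.asarray(mtx, dtype=float)
    tp = np.diag(mtx)                 # correctly classified count per class
    precision = tp / mtx.sum(axis=0)  # column sums = predicted counts per class
    recall = tp / mtx.sum(axis=1)     # row sums = true counts per class
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / mtx.sum()
    return accuracy, precision.mean(), recall.mean(), f1.mean()

# Synthetic example: 90 + 95 correct out of 200 samples
acc, prec, rec, f1 = metrics_from_confusion([[90, 10],
                                             [5, 95]])
print(round(acc, 4))  # 0.925
```

With a 10x10 matrix, the same function reproduces the kind of per-model summary reported in Table 4.1.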

Figure 4.18 shows the result of integrating the model into the objective exam-grading program, which reads the examinee code and the exam-paper code. The model was built in Python, exported as a finished model in a .onnx file, and run with C# and EmguCV on the .NET Framework.

4.2 Discussion
Based on the results of this study, several technical factors affect the experimental process, as detailed below. The dataset contains a large number of samples (records/rows) and a large number of dimensions (fields/columns), which affects model building: the KNN and SVM algorithms process such data slowly, so the dimensionality of the dataset had to be reduced to a suitable level. Reducing the dimensionality speeds up processing, but it also lowers the quality of the dataset, which in turn affects what the models actually learn: classification accuracy drops and errors increase. For example, KNN classified with 96.98% accuracy and SVM with only 97.63%; applied to the 20% held-out data (test set), KNN reached 96.66% and SVM 97.51%.

The CNN algorithm, by contrast, works well with large, high-dimensional datasets, so the trained model achieves high accuracy with fewer errors: the CNN reached 98.45% accuracy (Accuracy), although only 97.29% on the 20% test set. Comparing the training results, the CNN is the most accurate, but it still makes some errors. Deployment of the model is another important aspect, and integrating a model into an application program is a challenge for developers; in this experiment the model built in Python was used together with C# and EmguCV running on the .NET Framework.

Therefore, the CNN is the most suitable model for this dataset, and the CNN model was integrated into the exam-grading system effectively, making the system considerably more convenient for its users than the previous system.

Chapter 5: Conclusions, Limitations and Recommendations

5.1 Conclusions
Through collecting, preparing and organizing the data, building the models, and integrating the handwritten examinee-code digit-recognition model into the objective exam-grading program of the National University of Laos, models were built using machine learning and deep learning algorithms, their accuracy was evaluated with a confusion matrix, the models were compared, the most suitable one was selected, and it was deployed into the exam-grading system. This study drew on data-analysis theory (Tukey, 1977; Clark, 2021; Diaconis, 2011) and related research, such as Liu, Wei, & Meng (2020), who compared the KNN, SVM, BP neural network and CNN algorithms for handwritten digit recognition. A total of 70,361 samples were used, drawn from the National University of Laos entrance-exam image database and the MNIST database. Python was used to build the KNN, SVM and CNN models, and EmguCV, the .NET Framework and the C# language were used for development and data processing.

The analysis of the 70,361 samples found that the KNN model classified digits with 96.98% accuracy on the training data (80% of the 70,361 samples) and 96.66% on the 20% test set; SVM reached 97.63% and 97.51% respectively; and CNN reached 98.45% and 97.29% respectively. Comparing the performance of the three models, the CNN model classifies digit data best. That model was therefore integrated into the objective exam-grading system and, in testing, was able to read the examinee card codes, proving convenient and reducing the errors of the existing system.

5.2 Limitations of the study
This study encountered several limitations: the theory was not explored in depth; data collection was insufficient given the limited time available for gathering data; the development and processing tools were limited; and model building did not tune all the parameters that affect each algorithm, which could have improved the models and reduced classification errors. Dimensionality reduction was used so that the algorithms could process the data faster, but no comparison was made of how reducing the dimensions further would affect data quality and the results of each model. For deployment, only the ONNX format was used, and development used only the C# language; processing speed and display time were not measured, and the system was not tested on other operating systems (OS).

5.3 Recommendations
For further study: the relevant theories, principles, methods and tools should be studied in more depth, and more data should be collected; more research related to deep learning should be studied for comparison with CNN, in order to develop better-performing models. The image dataset should be larger and should cover more subjects. For deployment (Deployment), other model formats besides ONNX should be compared, and suitable development tools must be chosen so that the developed system works well and efficiently.

References

Sonvilay, P., Thipphavong, B., & Inthasone, S. (2020). A handwritten digit recognition model using machine learning techniques [in Lao].
Inthasone, S., Sonvilay, P., Sengmanotham, S., Phanthavong, B., Phondala, P., & Phanthavong, S. (2015). Development of an exam-grading system using OMR (Optical Mark Recognition) [in Lao]. Journal of Science, National University of Laos.
Inthasone, S., Sengmanotham, S., Phondala, P., Sonvilay, P., Phommasone, S., & Khounsavath, E. (2010). Building software to evaluate the admission of out-of-quota students to colleges and the National University of Laos using a barcode system [in Lao]. Journal of Science, National University of Laos.
Ramesh, G., Prasanna, G. B., Bhat, S. V., Naik, C., & Champa, H. N. (2021). An efficient method for handwritten Kannada digit recognition based on PCA and SVM classifier. Journal of Information Systems and Telecommunication, 9(35). https://doi.org/10.52547/jist.9.35.169
Ali, S., Sun, H., & Zhao, Y. (2021). Model learning: A survey of foundations, tools and applications. Frontiers of Computer Science, 15(5). https://doi.org/10.1007/s11704-019-9212-z
Wong, T. T., & Yeh, P. Y. (2020). Reliable accuracy estimates from k-fold cross validation. IEEE Transactions on Knowledge and Data Engineering, 32(8). https://doi.org/10.1109/TKDE.2019.2912815
Liu, W., Wei, J., & Meng, Q. (2020). Comparisons on KNN, SVM, BP and the CNN for handwritten digit recognition. Proceedings of 2020 IEEE International Conference on Advances in Electrical Engineering and Computer Applications (AEECA 2020), 587-590. https://doi.org/10.1109/AEECA49918.2020.9213482
Pedrycz, W., & Chen, S.-M. (2020). Deep learning: Concepts and architectures. Studies in Computational Intelligence (Vol. 866).
Kupssinskü, L. S., Guimarães, T. T., de Souza, E. M., Zanotta, D. C., Veronez, M. R., Gonzaga, L., & Mauad, F. F. (2020). A method for chlorophyll-a and suspended solids prediction through remote sensing and machine learning. Sensors (Switzerland), 20(7). https://doi.org/10.3390/s2007212
Fadhil, A. T., Abbar, K. A., & Qusay, A. M. (2020). Computer vision-based system for classification and sorting color objects. IOP Conference Series: Materials Science and Engineering, 745(1). https://doi.org/10.1088/1757-899X/745/1/012030
Ahamed, H., Alam, I., & Islam, M. M. (2019). SVM based real time hand-written digit recognition system. International Conference on Engineering Research and Education, School of Applied Sciences & Technology, SUST, Sylhet.
Surjandari, I., Wayasti, R. A., Laoh, E., Zulkarnain, Rus, A. M. M., & Prawiradinata, I. (2019). Mining public opinion on ride-hailing service providers using aspect-based sentiment analysis. International Journal of Technology, 10(4). https://doi.org/10.14716/ijtech.v10i4.2860
Bai, J., Lu, F., & Zhang, K. (2019). ONNX: Open Neural Network Exchange. GitHub repository.
Aik, L. E., Hong, T. W., & Junoh, A. K. (2019). A new formula to determine the optimal dataset size for training neural networks. ARPN Journal of Engineering and Applied Sciences, 14(1).
Khan, S., Ali, H., Ullah, Z., Minallah, N., Maqsood, S., & Hafeez, A. (2018). KNN and ANN-based recognition of handwritten Pashto letters using zoning features. International Journal of Advanced Computer Science and Applications, 9(10), 570-577. https://doi.org/10.14569/IJACSA.2018.091069
Hamid, N. A., & Sjarif, N. N. A. (2017, February 1). Handwritten recognition using SVM, KNN and neural network. arXiv.
Sammut, C., & Webb, G. I. (Eds.). (2017). Holdout evaluation. In Encyclopedia of Machine Learning and Data Mining (p. 624). Boston, MA: Springer US.
Cady, F. (2017). Machine learning overview. In The Data Science Handbook (pp. 87-97). John Wiley & Sons, Ltd. https://doi.org/10.1002/9781119092919
AL-Behadili, H. (2016). Classification algorithms for determining handwritten digit. Iraqi Journal for Electrical and Electronic Engineering, 12(1). https://doi.org/10.37917/ijeee.12.1.10
Awad, M., & Khanna, R. (2015). Support vector machines for classification. In Efficient Learning Machines (pp. 39-66). https://doi.org/10.1007/978-1-4302-5990-9_3
OECD. (2015). Frascati Manual: Guidelines for collecting and reporting data on research and experimental development. Paris, France: OECD. https://www.oecd.org/publications/frascati-manual-2015-9789264239012-en.htm
LeCun, Y., Cortes, C., & Burges, C. J. C. (2014). The MNIST database. Courant Institute, NYU. http://yann.lecun.com/exdb/mnist
Han, J., Kamber, M., & Pei, J. (2012). Data mining: Concepts and techniques (3rd ed.). Morgan Kaufmann.
Diaconis, P. (2011). Theories of data analysis: From magical thinking through classical statistics. In Exploring Data Tables, Trends, and Shapes (pp. 1-36). John Wiley & Sons, Ltd. https://doi.org/10.1002/9781118150702.ch1
Wu, X., & Kumar, V. (2009). The top ten algorithms in data mining (pp. 151-155). Chapman and Hall/CRC. https://doi.org/10.1201/9781420089653
Haykin, S. S. (2009). Neural networks and learning machines (3rd ed.). Harlow; London: Pearson Education.
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., & Witten, I. H. (2009). The WEKA data mining software: An update. ACM SIGKDD Explorations Newsletter, 11(1), 10-18.
Hall, B. H. (2006). Research and Development (R&D). Contribution to the International Encyclopedia of the Social Sciences (2nd ed.). https://eml.berkeley.edu//~bhhall/papers/BHH06_IESS_R%26D.pdf
Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786). https://doi.org/10.1126/science.1127647
Haykin, S. (1994). Neural networks: A comprehensive foundation. Prentice Hall PTR.

Appendix: Program Installation and System Configuration

1.1 The Weka program
WEKA is an open-source program for data mining and machine learning, covering classification, clustering and association rules. It can be used through a graphical user interface (GUI) or as a Java API (standard terminal applications) (Hall et al., 2009). Installation and configuration proceed as follows:
1. Download it from https://waikato.github.io/weka-wiki/downloading_weka/

Figure 1 The WEKA program

Figure 2 The WEKA GUI
Figure 3 The WEKA program for data management

Figure 4 WEKA as a Java API on the command line
2. Configure Weka's memory so that the program can process large datasets.

Figure 5 WEKA environment information
3. Use a command at the DOS prompt to set the memory to 5 GB (5120 MB) so that large amounts of data can be processed.
Figure 6 Configuring the memory

1.2 Python and Keras libraries

Installing Miniconda and Python
1. Download the program from https://docs.conda.io/projects/conda/en/latest/user-guide/install/windows.html, which comes bundled with Python, and complete the installation.

Figure 7 The Miniconda download page

2. After installation, configure it and install the required tools and libraries as follows:

Figure 8 The Miniconda command line

# conda create -n envs python=3.7
# conda create --name tensorflow
# activate tensorflow

# pip install tensorflow==2.0.0-alpha0
# python -m ipykernel install --user --name tensorflow --display-name "TensorFlow"
# conda install jupyter

For Windows 10, if Jupyter has problems:
conda install -c conda-forge ipython=7.19.0

In case of a TensorFlow error in the file:
C:\Users\PC\Anaconda3\envs\tut\lib\site-packages\tensorflow\python\framework\dtypes.py
replace the statements at line 516:

_np_qint8 = np.dtype([("qint8", np.int8, 1)])
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
# _np_bfloat16 is defined by a module import.
# Custom struct dtype for directly-fed ResourceHandles of supported type(s).
np_resource = np.dtype([("resource", np.ubyte, 1)])

with the following statements:

_np_qint8 = np.dtype([("qint8", np.int8, (1,))])
_np_quint8 = np.dtype([("quint8", np.uint8, (1,))])
_np_qint16 = np.dtype([("qint16", np.int16, (1,))])
_np_quint16 = np.dtype([("quint16", np.uint16, (1,))])
_np_qint32 = np.dtype([("qint32", np.int32, (1,))])
# _np_bfloat16 is defined by a module import.
# Custom struct dtype for directly-fed ResourceHandles of supported type(s).
np_resource = np.dtype([("resource", np.ubyte, (1,))])

and do the same with the following file:

C:\Users\PC\Anaconda3\envs\tut\lib\site-packages\tensorboard\compat\tensorflow_stub\dtypes.py

Start TensorBoard:

Figure 9 Starting TensorBoard

Open a web browser at this URL: http://localhost:6006

Figure 10 The TensorBoard page

1.3 C# and the .NET Framework
1. Downloading and installing Visual Studio 2019 and the .NET Framework 4.5 provides the various programming languages, such as C# and others. Choose the Community edition, which is convenient to use because Microsoft provides it as open source, free of license fees.

Figure 11 Downloading Visual Studio 2019

2. After downloading, complete the installation and sign in.

Figure 12 The Visual Studio 2019 window
3. Creating a project in Visual Studio 2019
Figure 13 Creating a project in Visual Studio 2019

4. The project has been created.

Figure 14 The completed project

Dataset Management

2.1 Image processing
The Paint program was used to crop the images down to just the examinee-code and paper-code regions, to be used as the dataset for training the models.

Figure 15 Using Paint to crop the images

Figure 16 The cropping results

Convert the images to grayscale and resize them to 28x28 pixels.

Figure 17 The separated digit images

Use Python commands to convert the image data into a .csv file as follows:

import numpy as np
import cv2
import os

IMG_DIR = 'data1'
for img in os.listdir(IMG_DIR):
    img_array = cv2.imread(os.path.join(IMG_DIR, img), cv2.IMREAD_GRAYSCALE)
    img_array = img_array.flatten()
    img_array = img_array.reshape(-1, 1).T
    print(img_array)
    with open('output9.csv', 'ab') as f:
        np.savetxt(f, img_array, header=str(), delimiter=",")

import pandas as pd
df = pd.read_csv('output9.csv')
df.shape

2.2 Splitting the dataset
The Weka program was used to split the dataset into a training set (Train) and a test set (Test).
1. Open the Weka program.

Figure 18 The Weka window
2. Choose the Preprocess tab > Open file to load the dataset into the program.
3. In the Filter box, choose Choose > filters > unsupervised > instance > Resample.
Figure 19 Selecting the Resample filter

4. Set the split proportions to 80% and 20% for Train and Test respectively by setting sampleSizePercent to 80.0 and noReplacement to True, then choose OK.
Figure 20 Setting the split percentage
5. Click Apply and the program will split the data.

Figure 21 Splitting the data
6. Save the split data as .csv files.
Figure 22 Saving the split datasets
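The same 80/20 split that Weka's Resample filter produces here can also be reproduced in Python with scikit-learn. A minimal sketch on toy data (the column sizes and "class" label are illustrative stand-ins for the thesis dataset, not its real files):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for the image dataset: 100 rows of 784 pixel columns plus a "class" label
rng = np.random.default_rng(42)
df = pd.DataFrame(rng.integers(0, 256, size=(100, 784)))
df["class"] = rng.integers(0, 10, size=100)

# 80% train / 20% test, sampled without replacement (stratification omitted for brevity)
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)
print(len(train_df), len(test_df))
```

Fixing `random_state` makes the split reproducible, which matters when the same train/test files are reused across the KNN, SVM and CNN experiments.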

2.3 Dimensionality reduction of the dataset
Dimensionality reduction lowers the number of variables, i.e. the number of columns of the data, to a suitable level so that the algorithms can process the data and build the models faster. The steps are as follows:
1. Select the Select attributes tab > Choose in the Attribute Evaluator and Search Method boxes, and choose Yes.
Figure 23 The Select attributes page
2. In the Attribute Evaluator box, choose PrincipalComponents > Close.

Figure 24 The PrincipalComponents page

3. In the Search Method box, choose Ranker.
Figure 25 The Ranker page
4. Click the Apply button and save the data as csv.
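The PrincipalComponents evaluator with the Ranker search method performs PCA-based dimensionality reduction. The equivalent step can be sketched in Python with scikit-learn (the data and the number of retained components here are illustrative, not the thesis settings):

```python
import numpy as np
from sklearn.decomposition import PCA

# Toy data standing in for rows of 784 pixel values; 50 components kept for illustration
rng = np.random.default_rng(0)
X = rng.random((200, 784))

# Project each 784-dimensional row onto its top 50 principal components
pca = PCA(n_components=50)
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)
```

As the discussion in Chapter 4 notes, fewer components means faster KNN/SVM training but some information loss; `pca.explained_variance_ratio_` shows how much variance each retained component preserves.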

Figure 26 Dimensionality reduction completed

Model Code

3.1 The KNN model

import numpy as np
import pandas as pd
import random as rn
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import seaborn as sns
%matplotlib inline
# plotly library
from chart_studio import plotly
import plotly.graph_objs as go
from plotly import tools
from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True)
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score, KFold
from sklearn.model_selection import RandomizedSearchCV, GridSearchCV
from sklearn.pipeline import Pipeline
from scipy.stats import uniform
import itertools
import warnings
from sklearn.exceptions import ConvergenceWarning
warnings.filterwarnings("ignore", category=ConvergenceWarning)
from keras.utils.np_utils import to_categorical
from keras.utils import np_utils
import math
import tensorflow as tf

def get_best_score(model):
    print(model.best_score_)
    print(model.best_params_)
    print(model.best_estimator_)
    return model.best_score_

def print_validation_report(y_true, y_pred):
    print("Classification Report")
    print(classification_report(y_true, y_pred))
    acc_sc = accuracy_score(y_true, y_pred)
    print("Accuracy : " + str(acc_sc))
    return acc_sc

def plot_confusion_matrix(y_true, y_pred):
    mtx = confusion_matrix(y_true, y_pred)
    fig, ax = plt.subplots(figsize=(9, 7))
    sns.heatmap(mtx, annot=True, fmt='d', linewidths=.5, cbar=True, square=True, ax=ax)
    plt.title("Confusion Matrix")
    plt.ylabel('true label')
    plt.xlabel('predicted label')
    plt.show()

# read train
train = pd.read_csv("full-mnist_NUoL70kPlus80train.csv")
print(train.shape)
train.head()

# read test
test = pd.read_csv("full-mnist_NUoL70kPlus80test.csv")
print(test.shape)
test.head()

# put labels into y_train variable
Y_train = train["class"]
# Drop 'label' column
X_train = train.drop(labels=["class"], axis=1)

# visualize number of digit classes
plt.figure(figsize=(9, 7))
g = sns.countplot(Y_train, palette="icefire")
plt.title("Number of digit classes")
plt.show()
Y_train.value_counts()

# plot some samples
img = X_train.iloc[0].to_numpy()
img = img.reshape((28, 28))
plt.imshow(img, cmap='gray')
plt.title(train.iloc[0, 0])
plt.axis("off")
plt.show()

# Train/test split for KNN
X_train_knn, X_test_knn, Y_train_knn, Y_test_knn = train_test_split(X_train, Y_train, random_state=42)

# KNN with GridSearchCV cross-validation
from sklearn.neighbors import KNeighborsClassifier
knn_cv = KNeighborsClassifier()
k_range = list(range(1, 31))
param_grid = dict(n_neighbors=k_range)  # defining parameter range
GridCV_grid_knn = GridSearchCV(knn_cv, param_grid, cv=3, scoring='accuracy', return_train_score=False, verbose=1)
# fitting the model for grid search
GridCV_knn = GridCV_grid_knn.fit(X_train_knn, Y_train_knn)
score_grid_knn = get_best_score(GridCV_knn)

# KNN without grid search
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score
import pickle
# create a new KNN model
knn_cv = KNeighborsClassifier(n_neighbors=9)
# train the model with 5-fold cross-validation
cv_scores = cross_val_score(knn_cv, X_train_knn, Y_train_knn, cv=5)
# print each cv score (accuracy) and their mean
print(cv_scores)
print('cv_scores mean (Accuracy): {}'.format(np.mean(cv_scores)))

# save the model
pickle.dump(GridCV_knn, open("finalized_knn_model.pkl", "wb"))
# save the model to disk
filename = 'finalized_knn_model.sav'
pickle.dump(GridCV_knn, open(filename, 'wb'))

# load the model
knn_model = pickle.load(open("finalized_knn_model.pkl", "rb"))
pred_val_knn = knn_model.predict(X_test_knn)
acc_knn = print_validation_report(Y_test_knn, pred_val_knn)

3.2 The SVM model

import numpy as np
import pandas as pd
import random as rn
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import seaborn as sns
%matplotlib inline
# plotly library

from chart_studio import plotly
import plotly.graph_objs as go
from plotly import tools
from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True)
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score, KFold
from sklearn.model_selection import RandomizedSearchCV, GridSearchCV
from sklearn.pipeline import Pipeline
from scipy.stats import uniform
import itertools
import warnings
from sklearn.exceptions import ConvergenceWarning
warnings.filterwarnings("ignore", category=ConvergenceWarning)
from keras.utils.np_utils import to_categorical
from keras.utils import np_utils
import math
import tensorflow as tf

def get_best_score(model):
    print(model.best_score_)
    print(model.best_params_)
    print(model.best_estimator_)
    return model.best_score_

def print_validation_report(y_true, y_pred):
    print("Classification Report")
    print(classification_report(y_true, y_pred))
    acc_sc = accuracy_score(y_true, y_pred)
    print("Accuracy : " + str(acc_sc))
    return acc_sc

def plot_confusion_matrix(y_true, y_pred):
    mtx = confusion_matrix(y_true, y_pred)
    fig, ax = plt.subplots(figsize=(9, 7))
    sns.heatmap(mtx, annot=True, fmt='d', linewidths=.5, cbar=True, square=True, ax=ax)
    plt.title("Confusion Matrix")
    plt.ylabel('true label')
    plt.xlabel('predicted label')
    plt.show()

# read train
train = pd.read_csv("full-mnist_NUoL70kPlus80train.csv")
print(train.shape)
train.head()

# read test
test = pd.read_csv("full-mnist_NUoL70kPlus80test.csv")
print(test.shape)
test.head()

# put labels into y_train variable
Y_train = train["class"]
# Drop 'label' column
X_train = train.drop(labels=["class"], axis=1)

# visualize number of digit classes
plt.figure(figsize=(9, 7))
g = sns.countplot(Y_train, palette="icefire")
plt.title("Number of digit classes")
plt.show()
Y_train.value_counts()

# plot some samples
img = X_train.iloc[0].to_numpy()
img = img.reshape((28, 28))
plt.imshow(img, cmap='gray')
plt.title(train.iloc[0, 0])
plt.axis("off")
plt.show()

# Train/test split for SVM
X_train_svm, X_test_svm, Y_train_svm, Y_test_svm = train_test_split(X_train, Y_train, random_state=42)

# SVM

#Support Vector Machine Classifier cv=3, from sklearn.model_selection import GridSearchCV from sklearn.svm import SVC #knn for GridSearchCV crossvalidation param_grid_svm={'C':[1],'kernel':['poly'],'degree':[3],'gamma':[2]} GridCV_grid_svm = GridSearchCV(SVC(),param_grid_svm, return_train_score=False,verbose=1) # fitting the model for grid search GridCV_svm=GridCV_grid_svm.fit(X_train_svm,Y_train_svm) score_grid_svm = get_best_score(GridCV_svm) #model no GridCv clf_svm = SVC(C=1.0, kernel='poly', degree=3, gamma=2) clf_svm.fit(X_train_svm,Y_train_svm) # save the model import pickle pickle.dump(GridCV_svm, open(\"finalized_svm_model.pkl\", \"wb\")) # save the model to disk filename = 'finalized_svm_model.sav' pickle.dump(GridCV_svm, open(filename, 'wb')) # load the model svm_model = pickle.load(open(\"finalized_svm_model.pkl\", \"rb\")) pred_val_svm = svm_model.predict(X_test_svm) acc_svm = print_validation_report(Y_test_svm, pred_val_svm) plot_confusion_matrix(Y_test_svm, pred_val_svm) 3.3 ແບບຈາໍ ລອງ CNN import numpy as np import pandas as pd import random as rn import matplotlib.pyplot as plt import matplotlib.image as mpimg import seaborn as sns %matplotlib inline # plotly library from chart_studio import plotly 71

import plotly.graph_objs as go from plotly import tools from plotly.offline import init_notebook_mode, iplot init_notebook_mode(connected=True) from sklearn.preprocessing import LabelEncoder from sklearn.metrics import confusion_matrix, classification_report, accuracy_score from sklearn.model_selection import train_test_split from sklearn.model_selection import cross_val_score, KFold from sklearn.model_selection import RandomizedSearchCV, GridSearchCV from sklearn.pipeline import Pipeline from scipy.stats import uniform import itertools import warnings from sklearn.exceptions import ConvergenceWarning warnings.filterwarnings(\"ignore\", category=ConvergenceWarning) from keras.utils.np_utils import to_categorical from keras.utils import np_utils from keras.models import Sequential from keras.layers import Dense, Dropout, Flatten from keras.layers import Conv2D, MaxPooling2D, MaxPool2D from keras.layers import AvgPool2D, BatchNormalization, Reshape from tensorflow.keras.optimizers import Adadelta, RMSprop, Adam from keras.losses import categorical_crossentropy from keras.wrappers.scikit_learn import KerasClassifier from keras.models import load_model import tensorflow as tf import os print(os.listdir(\"../NUoL\")) img_rows, img_cols = 28, 28 np.random.seed(5) #get_best_score for GridSearchCV def get_best_score(model): print(model.best_score_) print(model.best_params_) print(model.best_estimator_) 72

return model.best_score_ #print Classification Report and Accuracy def print_validation_report(y_true, y_pred): print(\"Classification Report\") print(classification_report(y_true, y_pred)) acc_sc = accuracy_score(y_true, y_pred) print(\"Accuracy : \",acc_sc) return acc_sc cbar=True, square=True, #plot_confusion_matrix def plot_confusion_matrix(y_true, y_pred): mtx = confusion_matrix(y_true, y_pred) fig, ax = plt.subplots(figsize=(9,7)) sns.heatmap(mtx, annot=True, fmt='d', linewidths=.5, ax=ax) # square=True, plt.title(\"Confusion Matrix\") plt.ylabel('true label') plt.xlabel('predicted label') plt.show() #plot_history_loss_and_acc def plot_history_loss_and_acc(history_keras_nn): fig, axs = plt.subplots(1,2, figsize=(13,4)) axs[0].plot(history_keras_nn.history['loss']) axs[0].plot(history_keras_nn.history['val_loss']) axs[0].set_title('Model loss') axs[0].set_ylabel('loss') axs[0].set_xlabel('epoch') axs[0].legend(['train', 'validation'], loc='upper right') axs[1].plot(history_keras_nn.history['accuracy']) axs[1].plot(history_keras_nn.history['val_accuracy']) axs[1].set_title('Model accuracy') axs[1].set_ylabel('accuracy') axs[1].set_xlabel('epoch') axs[1].legend(['train', 'validation'], loc='upper left') plt.show() # read train train = pd.read_csv(\"full-mnist_NUoL70kPlus80train.csv\") print(train.shape) 73

train.head() # read test test= pd.read_csv(\"full-mnist_NUoL70kPlus80test.csv\") print(test.shape) test.head() # put labels into y_train variable Y_train = train[\"class\"] # Drop 'label' column X_train = train.drop(labels = [\"class\"],axis = 1) #Normalization, Reshape and Label Encoding # Normalize the data X_train = X_train / 255.0 test = test / 255.0 print(\"x_train shape: \",X_train.shape) print(\"test shape: \",test.shape) X_train = X_train.values.reshape((X_train.shape[0], 28, 28, 1)) X_test = test.values.reshape((test.shape[0], 28, 28, 1)) # Label Encoding from keras.utils.np_utils import to_categorical # convert to one-hot-encoding Y_train = to_categorical(Y_train, num_classes = 10) #Train Test Split #90:10% # Split the train and the validation set for the fitting from sklearn.model_selection import train_test_split X_train, X_val, Y_train, Y_val = train_test_split(X_train, Y_train, test_size = 0.1, random_state=2) print(\"x_train shape\",X_train.shape) print(\"x_test shape\",X_val.shape) print(\"y_train shape\",Y_train.shape) print(\"y_test shape\",Y_val.shape) batchsize = 128 epochs = 10 activation = 'relu' adadelta = Adadelta() loss = categorical_crossentropy def cnn_model(activation): model = Sequential() ## Declare the layers layer_1 = Conv2D(32, kernel_size=3, activation=activation, input_shape=(28, 28, 1)) 74

layer_2 = Conv2D(64, kernel_size=3, activation=activation) layer_3 = Flatten() layer_4 = Dense(10, activation='softmax') ## Add the layers to the model model.add(layer_1) model.add(layer_2) model.add(layer_3) model.add(layer_4) model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy']) return model model_cnn = cnn_model(activation) model_cnn.summary() #model_cnn_1.fit(X_train, y_train, batch_size=batchsize, epochs=epochs, verbose=1) history_cnn = model_cnn.fit(X_train, Y_train, validation_data=(X_val,Y_val), epochs=epochs, batch_size=batchsize, verbose=1) # save the model import pickle #keras_model.save(\"model.h5\") model_cnn.save(\"finalized_cnn_model-v1.h5\") # It can be used to reconstruct the model identically. cnn_model = tf.keras.models.load_model(\"finalized_cnn_model-v1.h5\") #Deploy to onnx import onnxmltools import tf2onnx import onnxmltools.utils cnn_model.compile(\"adam\",\"mse\") onnx_cnn_model = onnxmltools.convert_keras(cnn_model, target_opset=7) # Save as protobuf onnxmltools.utils.save_model(onnx_cnn_model, 'finalized_cnn_model-v1.onnx') import onnx #load onnx cnn_onnx_model = onnx.load('finalized_cnn_model.onnx') plot_history_loss_and_acc(history_cnn) # Predict the values from the validation dataset Y_pred = model_cnn.predict(X_val) # Convert predictions classes to one hot vectors 75

Y_pred_classes = np.argmax(Y_pred, axis=1)
# Convert validation observations from one-hot vectors to class indices
Y_true = np.argmax(Y_val, axis=1)

# Display some error results
# Errors are differences between predicted labels and true labels
errors = (Y_pred_classes - Y_true != 0)
Y_pred_classes_errors = Y_pred_classes[errors]
Y_pred_errors = Y_pred[errors]
Y_true_errors = Y_true[errors]
X_val_errors = X_val[errors]

def display_errors(errors_index, img_errors, pred_errors, obs_errors):
    """This function shows 6 images with their predicted and real labels"""
    n = 0
    nrows = 2
    ncols = 3
    fig, ax = plt.subplots(nrows, ncols, figsize=(8, 8), sharex=True, sharey=True)
    for row in range(nrows):
        for col in range(ncols):
            error = errors_index[n]
            ax[row, col].imshow((img_errors[error]).reshape((28, 28)), cmap='binary')
            ax[row, col].set_title("Predicted label :{}\nTrue label :{}".format(pred_errors[error], obs_errors[error]))
            n += 1
    plt.show()

# Probabilities of the wrongly predicted numbers
Y_pred_errors_prob = np.max(Y_pred_errors, axis=1)
# Predicted probabilities of the true values in the error set
true_prob_errors = np.diagonal(np.take(Y_pred_errors, Y_true_errors, axis=1))
# Difference between the probability of the predicted label and the true label
delta_pred_true_errors = Y_pred_errors_prob - true_prob_errors
# Sorted list of the delta prob errors
sorted_delta_errors = np.argsort(delta_pred_true_errors)
# Top 6 errors
most_important_errors = sorted_delta_errors[-6:]
# Show the top 6 errors
display_errors(most_important_errors, X_val_errors, Y_pred_classes_errors, Y_true_errors)
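The error-ranking step above rests on a simple idea: for each misclassified sample, the gap between the probability the model assigned to its (wrong) predicted label and the probability it assigned to the true label measures how "confidently wrong" the prediction was. A minimal NumPy sketch of that computation (the array values here are illustrative, not taken from the experiment):

```python
import numpy as np

# Illustrative probability rows for three misclassified samples (3 classes)
Y_pred_errors = np.array([
    [0.10, 0.80, 0.10],   # predicted class 1, true class 0
    [0.45, 0.50, 0.05],   # predicted class 1, true class 0 (a near miss)
    [0.01, 0.01, 0.98],   # predicted class 2, true class 0 (confidently wrong)
])
Y_true_errors = np.array([0, 0, 0])

# Probability of the (wrong) predicted label for each error
pred_prob = np.max(Y_pred_errors, axis=1)
# Probability the model gave to the true label
true_prob = np.diagonal(np.take(Y_pred_errors, Y_true_errors, axis=1))
# Large delta = confidently wrong; argsort is ascending, so take the tail
delta = pred_prob - true_prob
most_important = np.argsort(delta)[-2:]   # indices of the two worst errors
print(most_important)  # [0 2] — the near miss (index 1) is excluded
```

This is why the thesis code displays `sorted_delta_errors[-6:]`: those are the six validation digits the CNN was most certain about while still being wrong, which are the most informative failures to inspect visually.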

3.4 Evaluation: Confusion Matrix

y_test_arg = np.argmax(Y_val, axis=1)
pred_val_cnn = np.argmax(model_cnn.predict(X_val), axis=1)
plot_confusion_matrix(y_test_arg, pred_val_cnn)
print(classification_report(y_test_arg, pred_val_cnn))
acc_cnn = accuracy_score(y_test_arg, pred_val_cnn)
print("Accuracy: ", acc_cnn)

# Compare our results
fig, axis = plt.subplots(3, 4, figsize=(8, 10))
for i, ax in enumerate(axis.flat):
    ax.imshow(X_val[i].reshape(28, 28), cmap='binary')
    ax.set(title="Predicted digit {0}\n True digit {1}".format(pred_val_cnn[i], y_test_arg[i]))
plt.show()

Development Code

4.1 Setting up EmguCV
1. Download EmguCV, a computer-vision library used for image processing. It can be downloaded from this URL: https://www.emgu.com/wiki/index.php/Download_And_Installation
2. Configure EmguCV for use with C#. After downloading and installing (or extracting the compressed file), the result appears as follows:

Figure 27. The EmguCV library files

3. Create a project and configure C# to recognize EmguCV in Visual Studio 2019: right-click References > Add >

Figure 28. Adding the EmguCV library

Figure 29. Adding the EmguCV library

Figure 30. Adding the library to the project

Figure 31. Configuring the library to be copied into the project

4.2 Connecting the Model, C# and ONNX

Open Neural Network Exchange (ONNX) is a format for exporting deep-learning models for deployment. It is a technique for exchanging models across different platforms, so that applications on any platform can connect to the model. ONNX makes it convenient for AI developers to bring models built with many different frameworks together under a single common standard (Bai et al., 2019). C# can connect to ONNX as follows:
1. Install the Microsoft.ML, Microsoft.ML.ImageAnalytics and Microsoft.ML.OnnxTransformer libraries via the NuGet package manager.

Figure 32. Installing the libraries via NuGet Packages

Figure 33. Testing the model connection
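One practical point when moving a model across platforms via ONNX: the C# pipeline (resize, then pixel extraction) must feed the network exactly the same tensor shape and value range it saw during training in Python, otherwise predictions silently degrade. A minimal NumPy sketch of that input contract for this model (the pixel values are illustrative; the shape and the /255 scaling come from the training code above):

```python
import numpy as np

def preprocess(gray_image):
    """Scale 0-255 grayscale pixels to [0, 1] and add batch/channel axes,
    matching the normalization used when the Keras model was trained."""
    x = np.asarray(gray_image, dtype=np.float32) / 255.0
    return x.reshape(1, 28, 28, 1)

# A fully white 28x28 patch becomes an all-ones tensor of shape (1, 28, 28, 1)
batch = preprocess(np.full((28, 28), 255))
print(batch.shape, float(batch.max()))  # (1, 28, 28, 1) 1.0
```

Any consuming runtime, whether onnxruntime in Python or ML.NET's `ApplyOnnxModel` in C#, should reproduce this preprocessing before inference.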

using Microsoft.ML;
using Microsoft.ML.Transforms.Image;
using System;
using System.Collections.Generic;
using System.Drawing;
using System.IO;
using System.Linq;

namespace CustomVisionOnnx
{
    class Program
    {
        public const int rowCount = 13, columnCount = 13;
        public const int featuresPerBox = 5;
        private static readonly (float x, float y)[] boxAnchors =
            { (0.573f, 0.677f), (1.87f, 2.06f), (3.34f, 5.47f), (7.88f, 3.53f), (9.77f, 9.17f) };

        private static string[] testFiles = new[] { @"C:\Users\Pheth\source\repos\CustomVisionOnnx\test\digit-test.jpg" };

        static void Main()
        {
            Bitmap testImage;
            var context = new MLContext();
            var emptyData = new List<DigitInput>();
            var data = context.Data.LoadFromEnumerable(emptyData);
            var pipeline = context.Transforms.ResizeImages(resizing: ImageResizingEstimator.ResizingKind.Fill,
                    outputColumnName: "data", imageWidth: ImageSettings.imageWidth, imageHeight: ImageSettings.imageHeight,
                    inputColumnName: nameof(DigitInput.Image))
                .Append(context.Transforms.ExtractPixels(outputColumnName: "data"))
                .Append(context.Transforms.ApplyOnnxModel(
                    modelFile: @"C:\Users\Pheth\source\repos\CustomVisionOnnx\Model\model.onnx",
                    outputColumnName: "model_outputs0", inputColumnName: "data"));

            var model = pipeline.Fit(data);
            var predictionEngine = context.Model.CreatePredictionEngine<DigitInput, DigitPredictions>(model);
            var labels = File.ReadAllLines(@"C:\Users\Pheth\source\repos\CustomVisionOnnx\Model\labels.txt");

            foreach (var image in testFiles)
            {
                using (var stream = new FileStream(image, FileMode.Open))
                {
                    testImage = (Bitmap)Image.FromStream(stream);
                }
                var prediction = predictionEngine.Predict(new DigitInput { Image = testImage });
                var boundingBoxes = ParseOutputs(prediction.PredictedLabels, labels);
                var originalWidth = testImage.Width;
                var originalHeight = testImage.Height;

                if (boundingBoxes.Count > 1)
                {
                    var maxConfidence = boundingBoxes.Max(b => b.Confidence);
                    var topBoundingBox = boundingBoxes.FirstOrDefault(b => b.Confidence == maxConfidence);
                    boundingBoxes.Clear();
                    boundingBoxes.Add(topBoundingBox);
                }

                foreach (var boundingBox in boundingBoxes)
                {
                    float x = Math.Max(boundingBox.Dimensions.X, 0);
                    float y = Math.Max(boundingBox.Dimensions.Y, 0);
                    float width = Math.Min(originalWidth - x, boundingBox.Dimensions.Width);
                    float height = Math.Min(originalHeight - y, boundingBox.Dimensions.Height);

                    // fit to current image size
                    x = originalWidth * x / ImageSettings.imageWidth;
                    y = originalHeight * y / ImageSettings.imageHeight;

                    width = originalWidth * width / ImageSettings.imageWidth;
                    height = originalHeight * height / ImageSettings.imageHeight;

                    using (var graphics = Graphics.FromImage(testImage))
                    {
                        graphics.DrawRectangle(new Pen(Color.Red, 3), x, y, width, height);
                        graphics.DrawString(boundingBox.Description, new Font(FontFamily.Families[0], 30f), Brushes.Red, x + 5, y + 5);
                    }

                    testImage.Save($"{image}-predicted.jpg");
                }
            }
        }

        public static List<BoundingBox> ParseOutputs(float[] modelOutput, string[] labels, float probabilityThreshold = .5f)
        {
            var boxes = new List<BoundingBox>();
            for (int row = 0; row < rowCount; row++)
            {
                for (int column = 0; column < columnCount; column++)
                {
                    for (int box = 0; box < boxAnchors.Length; box++)
                    {
                        var channel = box * (labels.Length + featuresPerBox);
                        var boundingBoxPrediction = ExtractBoundingBoxPrediction(modelOutput, row, column, channel);
                        var mappedBoundingBox = MapBoundingBoxToCell(row, column, box, boundingBoxPrediction);
                        if (boundingBoxPrediction.Confidence < probabilityThreshold)
                            continue;

                        float[] classProbabilities = ExtractClassProbabilities(modelOutput, row, column, channel, boundingBoxPrediction.Confidence, labels);
                        var (topProbability, topIndex) = classProbabilities.Select((probability, index) => (Score: probability, Index: index)).Max();
                        if (topProbability < probabilityThreshold)

                            continue;

                        boxes.Add(new BoundingBox
                        {
                            Dimensions = mappedBoundingBox,
                            Confidence = topProbability,
                            Label = labels[topIndex]
                        });
                    }
                }
            }
            return boxes;
        }

        private static BoundingBoxDimensions MapBoundingBoxToCell(int row, int column, int box, BoundingBoxPrediction boxDimensions)
        {
            const float cellWidth = ImageSettings.imageWidth / columnCount;
            const float cellHeight = ImageSettings.imageHeight / rowCount;

            var mappedBox = new BoundingBoxDimensions
            {
                X = (row + Sigmoid(boxDimensions.X)) * cellWidth,
                Y = (column + Sigmoid(boxDimensions.Y)) * cellHeight,
                Width = MathF.Exp(boxDimensions.Width) * cellWidth * boxAnchors[box].x,
                Height = MathF.Exp(boxDimensions.Height) * cellHeight * boxAnchors[box].y,
            };

            // The x,y coordinates from the (mapped) bounding box prediction represent the center
            // of the bounding box. We adjust them here to represent the top left corner.
            mappedBox.X -= mappedBox.Width / 2;
            mappedBox.Y -= mappedBox.Height / 2;

            return mappedBox;
        }

        private static BoundingBoxPrediction ExtractBoundingBoxPrediction(float[] modelOutput, int row, int column, int channel)
        {
            return new BoundingBoxPrediction

            {
                X = modelOutput[GetOffset(row, column, channel++)],
                Y = modelOutput[GetOffset(row, column, channel++)],
                Width = modelOutput[GetOffset(row, column, channel++)],
                Height = modelOutput[GetOffset(row, column, channel++)],
                Confidence = Sigmoid(modelOutput[GetOffset(row, column, channel++)])
            };
        }

        public static float[] ExtractClassProbabilities(float[] modelOutput, int row, int column, int channel, float confidence, string[] labels)
        {
            var classProbabilitiesOffset = channel + featuresPerBox;
            float[] classProbabilities = new float[labels.Length];
            for (int classProbability = 0; classProbability < labels.Length; classProbability++)
                classProbabilities[classProbability] = modelOutput[GetOffset(row, column, classProbability + classProbabilitiesOffset)];
            return Softmax(classProbabilities).Select(p => p * confidence).ToArray();
        }

        private static float Sigmoid(float value)
        {
            var k = MathF.Exp(value);
            return k / (1.0f + k);
        }

        private static float[] Softmax(float[] classProbabilities)
        {
            var max = classProbabilities.Max();
            var exp = classProbabilities.Select(v => MathF.Exp(v - max));
            var sum = exp.Sum();
            return exp.Select(v => v / sum).ToArray();
        }

        private static int GetOffset(int row, int column, int channel)
        {
            const int channelStride = rowCount * columnCount;
            return (channel * channelStride) + (column * columnCount) + row;
        }
    }

    class BoundingBoxPrediction : BoundingBoxDimensions

    {
        public float Confidence { get; set; }
    }
}

Project Implementation Plan and Timeline

Figure 34. The project implementation timeline
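The output-tensor indexing in the C# program's `ParseOutputs`/`GetOffset` can be cross-checked with a short Python sketch: the model output is a flat array laid out channel-major over the detection grid, and each anchor box occupies (5 + number-of-classes) consecutive channels. The sketch below uses a hypothetical toy 2x2 grid with one anchor and hand-picked values, rather than the real 13x13 grid, purely to illustrate the arithmetic:

```python
import numpy as np

ROWS, COLS = 2, 2          # toy grid; the C# listing uses a 13x13 grid
FEATURES_PER_BOX = 5       # x, y, w, h, objectness

def get_offset(row, col, channel):
    # Channel-major layout: all grid cells of channel 0, then channel 1, ...
    # Mirrors the C# GetOffset: channel*stride + column*columnCount + row.
    stride = ROWS * COLS
    return channel * stride + col * COLS + row

def sigmoid(v):
    # Numerically stable form of the C# Sigmoid helper
    return 1.0 / (1.0 + np.exp(-v))

def decode_box(output, row, col, channel):
    """Read the 5 box features (x, y, w, h, objectness) for one anchor,
    applying sigmoid to the objectness logit as the C# code does."""
    x, y, w, h, conf = (output[get_offset(row, col, channel + i)] for i in range(5))
    return x, y, w, h, sigmoid(conf)

# One anchor, 5 features, on a 2x2 grid -> flat tensor of 5 * 4 values
output = np.zeros(FEATURES_PER_BOX * ROWS * COLS)
output[get_offset(1, 0, 4)] = 100.0   # a huge objectness logit at cell (1, 0)
*_, conf = decode_box(output, 1, 0, 0)
print(round(conf, 3))  # 1.0
```

The same layout explains why `ExtractClassProbabilities` starts reading at `channel + featuresPerBox`: the class logits for an anchor sit immediately after its five box features in channel order.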