deployed at each test location, a technical staff member is necessary to conduct the deployment. The technical staff member must hold a qualification from a computer institution or have experience in computer networking.

G. Internal Test and Production and Load Test

The test process consists of an internal test and a production and load test. Both tests are intended to exercise the operation in an environment similar to the real-world operation. The internal test is performed internally involving real users, while the production and load test is performed using the same computer system used for the real test but with automated examinees. For the automated examinees, the client application is modified to run the test automatically and simultaneously with the other client applications. It simulates the real operation, where the examinees are instructed to start the test simultaneously. Table III shows the tests, the results, and the counteractions.

TABLE III. INTERNAL AND PRODUCTION AND LOAD TESTS
Test | Method | Result | Response/Action | Server Specification
Internal 1 | Test 80 simultaneous users using sample test items | Some functional bugs but no performance problem | Fix the bugs | VirtualBox, Windows 7, 2 cores, 2 GB RAM
Internal 2 | Test 160 simultaneous users via Internet | Some login and performance problems, but still usable for most users | Tweak the problem and determine the maximum number of PC clients for one server | VirtualBox, Windows 7, 2 cores, 2 GB RAM
Internal 3 | Test 380 simultaneous users via Internet | Some login and performance problems; not usable for most users | It seems that the number of users exceeds the number that the server can handle normally | VirtualBox, Windows 7, 2 cores, 2 GB RAM
Production / Load Test 1 | 80 virtual/automated simultaneous examinees | Automated test starts and finishes normally without performance problems | - | VirtualBox, Linux, 10 cores, 10 GB RAM
Production / Load Test 2 | 480 virtual/automated simultaneous examinees | Automated test stops at the step of generating test items | Lower the number of simultaneous examinees | VirtualBox, Linux, 10 cores, 10 GB RAM
Production / Load Test 3 | 160 virtual/automated simultaneous examinees | Automated test starts and finishes but with a performance problem at the step of generating test items | Lower a little more; should be between 80 and 120 (estimated) | VirtualBox, Linux, 10 cores, 10 GB RAM
Production / Load Test 4 | 160 virtual/automated simultaneous examinees | Automated test starts and finishes normally without performance problems | - | VMware vSphere, 8 cores, 8 GB RAM

From the production and load tests, we found that the TCExam-based system seems to have a performance problem when the number of examinees that start the test simultaneously exceeds a particular number. From the test, we found that if the host uses VirtualBox, the threshold should be around 100 to 120 examinees in order to get fair performance, so the countermeasure was to establish several VMs according to the number of examinees attending the test. As for the VMware host, the number was higher: with 160 examinees we could still get fair performance.
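As a rough illustration of how such automated examinees can be scripted, the following sketch launches many simulated clients that all request a generated test at the same instant and record how long the server takes to respond. It is only an illustration, not the project's actual test tool: the endpoint URL, the query parameter, and the client count are hypothetical placeholders rather than TCExam's real interface.

# Illustrative sketch of "automated examinees" for load testing.
# The server address and request format below are hypothetical placeholders.
import threading
import time
import urllib.request

SERVER = "http://192.168.0.10/start_test"   # hypothetical endpoint
N_CLIENTS = 80                              # number of simulated examinees

def automated_examinee(client_id, start_at, results):
    # wait so that every simulated examinee starts at the same moment
    time.sleep(max(0.0, start_at - time.time()))
    t0 = time.time()
    try:
        with urllib.request.urlopen(f"{SERVER}?examinee={client_id}", timeout=60) as resp:
            resp.read()
        results[client_id] = time.time() - t0      # seconds until the test was generated
    except Exception as exc:
        results[client_id] = exc                   # record failures or timeouts

results = {}
start_at = time.time() + 5.0
threads = [threading.Thread(target=automated_examinee, args=(i, start_at, results))
           for i in range(N_CLIENTS)]
for t in threads: t.start()
for t in threads: t.join()

Raising N_CLIENTS step by step is one way to locate the threshold at which test-item generation starts to stall.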
H. Training

The training process gathers all technical personnel and gives them training and instruction on how to operate the system. We conducted a half-day hands-on training. We prepared a special VM for the training and provided it to the trainees. The trainees could then try by themselves again at their own office or home to ensure they understand and know what they have to do.

I. Deployment

The deployment process involves all necessary tasks to make the system run at every test location. The process involves the tasks for preparing the VM and all necessary instruments for the examinees to attend the test. As for the VM, the system within the VM should be differentiated into three, with each having the time zone configured to the one used at the designated location.

J. Operation and Maintenance

The system has been designed and developed to run with minimum, if not zero, failure. Tasks to ensure the quality of the software system have been performed at multiple levels. However, something might not work as expected. To increase the security level, the system has been designed as a closed system: no one has direct access to the VM or the client-server application. So, when the system has to be maintained, for example in case the server application needs to be fixed for an error or bug, a method to patch the system without compromising the security has been developed. The patch, if necessary, will be delivered to the technical personnel, and then the personnel can run the patch without knowing the root's password and with minimum impact on the overall system.

IV. EVALUATION

Overall, the test has been operated smoothly. Some problems have been identified at several locations; some could be addressed, but some still could not be addressed. This section describes the result of the implementation, the problems identified, and the counteractions that have been performed to address the problems. This section also reports the result of a survey we conducted after the test, asking all technical personnel about their experiences during the test operation.

The most identified problem was the performance problem at the beginning of the test. The test items are generated at the time the examinee starts the test: the items are selected from the items collection and stored in the test items table. It seems that if a number of examinees are doing this process simultaneously, a transaction waiting problem happens. We have determined the threshold for the number of examinees for which the system runs normally, but in a particular environment that number needs to be adjusted. The problem was successfully addressed by splitting the server from one into two VMs. However, at the locations where splitting the server was not possible, the solution was to differentiate the start time of the examinees by a few minutes.

Still related to the performance problem, we found that at one location the server performance was very bad. After investigating the system and its configuration, we found that the reason was nested virtualization: unintentionally, the VM had been installed in a nested virtualization configuration. Once the reason for the problem was identified, a dedicated host machine was soon prepared and the problem was solved.

From the survey we have conducted, 28 of the overall 35 technical personnel answered the survey. The result is as follows:
• None of the personnel reported a problem during installation of the server.
• For the installation of the client application, one respondent reported a problem during the distribution process of the client application due to the local staff's lack of knowledge of operating the computer system.
• Four respondents experienced some client PCs having a blue screen or becoming unresponsive. The solution was to restart the PC or replace it with the reserved one.
• Four respondents experienced performance problems due to the simultaneous access. Two of them experienced the server becoming unresponsive. The problem was solved by differentiating the start time.

Ninety-three percent of the personnel said that the installation processes were very easy or fairly easy, with 82% for the former and 11% for the latter. As for the distribution of the client application, some personnel said they had some difficulties due to the number of client PCs and the lack of a centralized application for deploying files or applications. As for the test performance, one third experienced low test performance, especially at the beginning of the test.

From the evaluation, there are some findings which need to be considered when implementing a CBT system, especially when building a system on top of an open source system.

• From the system architecture point of view, there are several types of CBT system. We need to choose one which meets our requirements.
• For offline, or semi-online, operation we can use virtualization technology to simplify the deployment process.
• Regardless of the hardware specification or the VM guest's configuration, different virtualization platforms can give different performance. To ensure the server capability, a real test needs to be conducted. An automated test (a test robot application) can be introduced.
• Building a system on top of an open source system is economically beneficial and also reduces the time and effort for development. However, in our case we have learned that the OSS we used as the base system did not perfectly fit our environment. The load of the system at the start, where all examinees start the test at the same time, was very high. The same problem occurs at the end of the test, when all examinees finish the test and the system calculates the final score of each examinee. The cause was that the test is generated when the examinee starts the test. To address the problem of the high server load when a large number of examinees start the test, we pre-generate the test for each examinee before the test starts.
• Regarding the cost and delivery, with CBT we eliminate the cost of printing the test materials as well as the cost of logistics. Accordingly, the delivery time for test materials has been reduced.
• Although there is nothing new in the technology being used, since the solution uses the offline architecture (see Fig. 2), the challenge is rather in the non-technical matter, which is how to control the system that involves the people in charge at the test location, including the technical and non-technical persons. To enable that, a standard operating manual and real-time communication have been established.

V. CONCLUDING REMARKS

In this paper, we have reported our implementation of a computer-based test (CBT) in the new student recruitment process. The CBT has been implemented to replace the paper-based test (PPT). To implement the CBT, we described how software engineering practices have been introduced to develop a high-quality system based on open source software and to deliver the system within the designated schedule. The result was quite successful; although some performance problems were identified, they could still be addressed by troubleshooting. The main cause of the performance problem remains untouched and needs to be addressed.

Other future work is to develop a feature to deliver test data to the central server from the virtual machine (VM) directly from the test location. The feature addresses the problem of recovering the VM after the test in the case where the VM file becomes corrupted or lost during the delivery.

ACKNOWLEDGMENT

A part of this work was supported by JSPS KAKENHI Grant Number 19K04920.
Physical Activity Recognition Using Streaming Data from Wrist-worn Sensors

Katika Kongsil, Jakkarin Suksawatchon, Ureerat Suksawatchon
Mobile Application Developers Incubation Laboratory, Faculty of Informatics, Burapha University, Chonburi, Thailand
[email protected], [email protected], [email protected]

Abstract—Most of the existing research on smartwatch-based activity recognition has focused on developing a subject (user) specific approach, or personal model, in which the subject must collect labeled data for training the model. This is inconvenient because the users may be unable to perform all activities during the specified times. In this paper, we introduce a cross-subject approach, or impersonal activity recognition model, based on the fusion of two sensors embedded on smartwatches, called S-PAR. It stands for Smartwatch-based Physical Activity Recognition. Therefore, users who utilize the model do not need to gather initial labeled data. Experiments were carried out to examine the performance of the S-PAR model against state-of-the-art methods using two public databases collected under realistic conditions. From the results, the S-PAR model provides good overall performance in detecting and predicting activity types. Therefore, our proposed model can be used in a real-life environment.

Index Terms—Activity recognition, Smartwatches, Machine learning, Impersonal model, Wrist-worn sensors

I. INTRODUCTION

In the current situation, non-communicable diseases (NCDs) are the world's number one health problem, and they are increasing in low- and mid-income countries. The greatest disease burden of NCDs is from diabetes, cardiovascular diseases, cancers, hypertension, and obesity. This is due to five main NCD risk factors, including unhealthy diet, tobacco use, air pollution, and physical inactivity or unhealthy habits [1]. In particular, physical inactivity is considered one of the biggest public health problems because of sedentary lifestyles such as hypersomnia, office work, sitting in front of a computer all day, or playing games for long hours. Strategies for health promotion should include efficient applications or mechanisms to track, monitor, and quantify physical activities. To quantify physical activities, the automated detection of the physical activity types (for example, walking, running, sitting) performed during the day should first be analyzed by an activity recognition method [2].

With the continuous growth of technology, most smartphones and smartwatches are being integrated with multiple sensors, i.e., accelerometers, magnetometers, gyrometers, etc., and wireless communication [3]. These advancements can be utilized to identify the physical activity type by analyzing the sensor data. Most research has focused on smartphone-based physical activity recognition, for example, the approaches proposed in [4]-[6]. Although much research on smartphone-based activity recognition using triaxial accelerometer sensors or multiple sensors has achieved more than 90% accuracy, those sensors were attached to different positions on the subjects' bodies, like the arms, chest and waist, for gathering the sensor data. These approaches may not be of practical use in real living situations because sensors over the body may obstruct the performance of the activities. Some research has studied the use of a single triaxial accelerometer attached only to the waist or chest, or a smartphone kept in the pocket, for smartphone-based activity recognition [7]; however, placing the sensors in those locations still causes obstruction and discomfort in doing daily activities, especially for elderly persons, and is limiting for women, who usually keep the smartphone in their handbag, not the pocket.

As wrist-worn devices such as smartwatches and fitness wristbands become widespread, the multiple sensors they integrate make it easier to detect physical activities. There exist many approaches that aim to detect and identify human activity types. Most existing works built the activity recognition model based on a subject-specific approach or personal model. That is, the target users must gather and annotate the types of activities by performing all activities for a definite time to obtain enough training data. In addition, the subject has to perform the activities with no movement or limited movement, i.e., standing still, sitting on a chair, walking without swinging the arms, etc. This is often not practicable and is uncomfortable, especially for elderly people or patients who cannot perform all activities.

Therefore, in this study we introduce a new model called Smartwatch-based Physical Activity Recognition, or S-PAR, using data collected from a triaxial accelerometer and a triaxial gyrometer built into smartwatches. We focus on common physical activities usually performed in daily life, such as standing, sitting, lying, walking, stairs up, stairs down, and running. S-PAR is composed of two components: a modeling component and a recognition component. The modeling component is an offline process which aims to build two activity recognition models: one model is used to detect and predict the dormant activity types and the other is used for the energetic activity types. The recognition component is an online process which uses those two models to detect and predict the current activity type from streaming sensor data in a real-time manner.
The main contributions of our research are as follows:

• We present a new model which is a cross-subject activity recognition model, or impersonal model. Therefore, the target users who utilize the model do not have to prepare training data, because our proposed model can be built once using training data from other users.
• We used training data obtained from public databases. These datasets were collected under realistic constraints, i.e., walking in the city with swinging arms, running in a forest, or climbing up the stairs of an old castle [8]. Therefore, our proposed model can be used in a real-life environment.

II. RELATED WORK

There have been several studies which have focused on wrist-worn devices like smartwatches or fitness wristbands for activity recognition.

For example, Da Silva et al. [9] proposed two activity recognition systems with a wrist-worn accelerometer sensor to recognize eight physical activities such as standing, sitting, lying, running, walking, stairs up, stairs down, and working on a computer. The first system has a single classifier to identify all activities, and the second system added a pre-classifier to separate the types of activities: movement or postures. The two architectures were evaluated with MLP neural networks, k-NN, and support vector machine (SVM) algorithms. The result shows the best performance, with an accuracy of 93.47%, using SVM. From studying those two pieces of research, it was found that the best accuracy of more than 90% was achieved by using only personal data, i.e., a subject-specific approach in which the target user has to collect and label data. Moreover, the subject has to perform activities under movement constraints and discomfort, i.e., walking at normal speed or standing still. This situation is not feasible for elders or patients who are unable to perform all activities under the constraints.

Shoaib et al. [7] focused on the fusion of wrist-worn devices (smartwatches) and smartphones for physical activity recognition. Seven subjects had to carry smartphones in the right jeans pocket and smartwatches on their right wrist to collect the training data. They performed 7 basic activities, like walking, jogging, and biking, and 6 complex activities such as typing, drinking coffee, giving a talk, and smoking. The average and standard deviation were extracted from the accelerometer and gyroscope sensors to recognize the 13 activities. For the efficiency analysis, three classification algorithms, including decision tree, SVM, and k-NN, were selected. The result shows that the fusion of the two devices could achieve high accuracy. However, cases where the subjects did not carry both devices, or the position of the two devices on the subject's body, can impact the recognition accuracy. Besides, Shahmohammadi et al. [10] proposed smartwatch-based activity recognition with active learning to provide a personalized model. Each subject who utilized the model had to prepare the training data by performing 5 common daily activities, including standing, sitting, laying down, walking, and running, for a certain amount of time. In addition, Shahmohammadi and the team introduced retraining the model by asking the subjects for the true activity type when unknown activities occurred. The authors claimed that their model can reduce the amount of training data by about 46% with an average accuracy of 92%. However, there are some limitations of this model. For example, the active learning may issue queries to confirm the current activity or an unknown activity, but the users may not be able to respond to the system immediately; the retraining then cannot be performed and the performance will decrease.

III. THE PROPOSED ACTIVITY RECOGNITION

In this section, we describe our proposed Smartwatch-based Physical Activity Recognition, called S-PAR for short, as shown in Fig. 1. Our proposed framework is composed of two components: the modeling component and the recognition component. The modeling component builds the dormant activity model and the energetic activity model. The recognition component is used to identify the physical activity type from the streaming sensory data produced by the sensors in real time.

A. Sensory data understanding

This study used streaming sensory data obtained from two sensors of a wrist-worn device like a smartwatch, which are widely used in physical activity recognition. The first sensor, the single accelerometer, is used to measure the acceleration signals along the X-axis, Y-axis and Z-axis, named the Ax, Ay and Az values respectively. In general, each acceleration signal is a combination of the body acceleration due to the movement of the user's body and the constant (gravitational) acceleration due to the gravity force [4]. The second sensor, the single gyroscope, is used to measure the angular velocity along the three axes, named the Gx, Gy and Gz values. Typically, the gyroscope sensor is mostly used together with the accelerometer sensor to detect movement activities like walking, running, etc. [11]. Samples of the streaming dataset are depicted in Fig. 2.

For further use, some notations are defined here. Let ACT be the set of activities in the labeled training data. Each Act consists of a set of samples S, where S = {s_1, s_2, ..., s_i, ..., s_N}, and N is the number of samples in each activity. Each s_i is defined as a 7-tuple (Ax_i, Ay_i, Az_i, Gx_i, Gy_i, Gz_i, t_i), where t_i is the activity type of the sample s_i. In this research, the basic movements selected are standing, sitting, lying, walking, stairs up, stairs down, and running.

B. Data Preprocessing

The streaming data collected from smartwatches were preprocessed for noise reduction. In this work, the acceleration values due to gravity (constant acceleration) were considered as noise. We segment the signal data based on a 2-second non-overlapping sliding window. In each window j, we applied a low-pass filter to each axis of the acceleration in order to separate the body acceleration and the constant acceleration due to the gravity force. A 3rd-order Butterworth low-pass filter with a cut-off of 0.3 Hz was used [12] to obtain the constant acceleration, and then the raw signals in window j were subtracted by the constant acceleration to extract the body acceleration [13]. After performing this step, the gravitational acceleration has been filtered out, so we obtained the body accelerations of the signals for use in the next step. The results after applying the data preprocessing step are shown in Fig. 3.
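A minimal sketch of this preprocessing step is shown below, assuming the 50 Hz sampling rate of the datasets used later and SciPy's Butterworth filter; the zero-phase filtfilt call and the function names are our own illustrative choices, not the authors' implementation.

# Illustrative sketch: separate gravity and body acceleration with a
# 3rd-order Butterworth low-pass filter at 0.3 Hz, as described above.
import numpy as np
from scipy.signal import butter, filtfilt

FS = 50.0        # assumed sampling rate in Hz
CUTOFF = 0.3     # cut-off frequency in Hz
ORDER = 3        # 3rd-order Butterworth

def split_gravity_body(acc_window):
    """acc_window: (n, 3) array of raw Ax, Ay, Az samples in one 2-second window."""
    b, a = butter(ORDER, CUTOFF / (FS / 2.0), btype="low")
    gravity = filtfilt(b, a, acc_window, axis=0)   # constant (gravity) component
    body = acc_window - gravity                    # body acceleration
    return gravity, body

# Example: one 2-second window at 50 Hz -> 100 samples per axis
window = np.random.randn(100, 3) + np.array([0.0, 0.0, 9.8])
gravity, body = split_gravity_body(window)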
Fig. 1: The proposed physical activity recognition system (S-PAR).

Fig. 2: The samples of streaming dataset.

Fig. 3: The result of extracting the acceleration signals with the low-pass filter: (a) acceleration signal, (b) body acceleration, (c) constant acceleration.

C. Modeling Component

The modeling component is an offline processing which aims (i) to find the Threshold value for separating the categories of activities, and (ii) to build a classifier model for each category of activities. One model is for dormant activities and the other is for energetic activities.

1) Finding Threshold: We began with considering the pattern of the acceleration signal. It was found that the acceleration of the three axes of walking and running is highly vibrant. These signals are the energetic activities. On the other hand, the acceleration of the standing, sitting and lying activities is influenced by the gravity force, and all the acceleration values are almost still [5]. These signals are called the dormant activities. This work applied the Threshold finding from [5] for separating the incoming signal into dormant or energetic activities. This Threshold value is used in the recognition component. The steps of finding this Threshold are as follows:

Step 1: For each activity in Act and for each window j, the magnitude of a sample s_i (M_i) is computed using Eq. 1. Then we calculate the standard deviation of window j (SD_j) by using Eq. 2.

$M_i = \sqrt{Ax_i^2 + Ay_i^2 + Az_i^2}$   (1)

where i = 1, 2, 3, ..., n, and n is the number of samples within window j. In this paper, n is predefined as the sampling rate.

$SD_j^{Act} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(M_i - \bar{M}_j)^2}$   (2)

where $\bar{M}_j$ is the average of the magnitude in window j.

Step 2: The average of the standard deviation for each activity ($SD_{Avg}^{Act}$) is computed by using Eq. 3.

$SD_{Avg}^{Act} = \frac{1}{J}\sum_{j=1}^{J} SD_j^{Act}$   (3)

where J is the number of windows for each activity.

Step 3: We compute the minimum of the average standard deviation over only the energetic activities ($SD_{min}^{ener}$), and the maximum of the average standard deviation over the dormant activities ($SD_{max}^{dor}$). Then, the Threshold is computed from the average of these values by using Eq. 4.

$Threshold = \frac{SD_{max}^{dor} + SD_{min}^{ener}}{2}$   (4)

Finally, this Threshold can be used to separate the categories of activity into dormant activities and energetic activities as shown in Fig. 4.

Fig. 4: The Threshold value represented by the green line.
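The threshold computation of Eqs. (1)-(4) can be sketched as follows; the function and variable names are illustrative only, and the per-activity windows are assumed to be grouped beforehand into dormant and energetic dictionaries.

# Illustrative sketch of the Threshold of Eqs. (1)-(4); not the authors' code.
import numpy as np

def window_sd(acc_window):
    """Standard deviation of the acceleration magnitude in one window (Eqs. 1-2)."""
    mag = np.sqrt((acc_window ** 2).sum(axis=1))   # M_i
    return mag.std(ddof=1)                         # SD_j with the 1/(n-1) factor

def activity_avg_sd(windows):
    """Average of the per-window SDs for one activity (Eq. 3)."""
    return float(np.mean([window_sd(w) for w in windows]))

def find_threshold(dormant_windows_by_act, energetic_windows_by_act):
    """Threshold = (max dormant average + min energetic average) / 2 (Eq. 4)."""
    sd_dor_max = max(activity_avg_sd(w) for w in dormant_windows_by_act.values())
    sd_ener_min = min(activity_avg_sd(w) for w in energetic_windows_by_act.values())
    return (sd_dor_max + sd_ener_min) / 2.0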
2) Building The Dormant Activities Recognition Model: Only the constant acceleration signals along the X-axis, Y-axis and Z-axis were used, and the process of building the proposed model is described as follows:

Step 1: For each activity in Act ∈ {sitting, standing, lying}, the constant acceleration along the three axes, defined as (graX_i, graY_i, graZ_i), is transformed to the average of the constant acceleration for each axis in each window j, denoted as ($Mean_j^{Act}(graX)$, $Mean_j^{Act}(graY)$, $Mean_j^{Act}(graZ)$), by using Eq. 5.

$Mean_j^{Act}(graX) = \frac{\sum_{i=1}^{n} graX_i}{n}, \quad Mean_j^{Act}(graY) = \frac{\sum_{i=1}^{n} graY_i}{n}, \quad Mean_j^{Act}(graZ) = \frac{\sum_{i=1}^{n} graZ_i}{n}$   (5)

Step 2: In order to search for the best classifier model, five different techniques were evaluated: the decision tree, k-Nearest Neighbors (kNN), random forest, Support Vector Machine (SVM) with RBF kernel, and Linear Discriminant Analysis (LDA). The experimental results are shown in Section IV. After the evaluation with the averages of the constant acceleration for all dormant activities, LDA presented the best performance. Therefore, LDA was used for the dormant activities recognition model.

3) Building The Energetic Activities Recognition Model: The body acceleration signals and the angular velocity signals along the three axes were used. For easier understanding, we define new notations: these two signals along the X-axis, Y-axis and Z-axis are denoted as x_i, y_i, and z_i respectively. The process of the proposed model is described as follows:

Step 1: For each activity in Act ∈ {walking, stairs up, stairs down, running}, each signal along the three axes is transformed to the root mean square (RMS), skewness (Skew), and interquartile range (IQR) in each window j by using Eqs. 6, 7, and 8. After performing this step, there are 18 features used in the next step.

$RMS_j^{Act}(X) = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}, \quad RMS_j^{Act}(Y) = \sqrt{\frac{1}{n}\sum_{i=1}^{n} y_i^2}, \quad RMS_j^{Act}(Z) = \sqrt{\frac{1}{n}\sum_{i=1}^{n} z_i^2}$   (6)

$Skew_j^{Act}(X) = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x}_j)^3}{\left(\sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x}_j)^2}\right)^3}, \quad Skew_j^{Act}(Y) = \frac{\frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y}_j)^3}{\left(\sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y}_j)^2}\right)^3}, \quad Skew_j^{Act}(Z) = \frac{\frac{1}{n}\sum_{i=1}^{n}(z_i - \bar{z}_j)^3}{\left(\sqrt{\frac{1}{n}\sum_{i=1}^{n}(z_i - \bar{z}_j)^2}\right)^3}$   (7)

$IQR_j^{Act}(X) = Q3_j^{Act}(X) - Q1_j^{Act}(X), \quad IQR_j^{Act}(Y) = Q3_j^{Act}(Y) - Q1_j^{Act}(Y), \quad IQR_j^{Act}(Z) = Q3_j^{Act}(Z) - Q1_j^{Act}(Z)$   (8)

Step 2: In this study, we compared the performance of five classifiers, including the decision tree, k-Nearest Neighbors (kNN), random forest, Support Vector Machine (SVM) with RBF kernel, and Naïve Bayes. The experimental results are shown in Section IV, and SVM with RBF kernel gave the best performance. Therefore, SVM was used for the energetic activities recognition model.
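The 18-feature window representation of Eqs. (6)-(8) can be sketched as below, using NumPy and SciPy's skew; the helper name and the percentile-based IQR are our own illustrative choices rather than the authors' code.

# Illustrative sketch: RMS, skewness and IQR of each axis of the body
# acceleration and the angular velocity in one 2-second window.
import numpy as np
from scipy.stats import skew

def energetic_features(body_acc, gyro):
    """body_acc, gyro: (n, 3) arrays for one window -> 18 features."""
    feats = []
    for signal in (body_acc, gyro):
        for axis in range(3):
            x = signal[:, axis]
            feats.append(np.sqrt(np.mean(x ** 2)))                     # RMS (Eq. 6)
            feats.append(skew(x))                                      # skewness (Eq. 7)
            feats.append(np.percentile(x, 75) - np.percentile(x, 25))  # IQR (Eq. 8)
    return np.asarray(feats)   # 2 sensors x 3 axes x 3 statistics = 18 values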
D. Recognition component

This section explains how to apply the S-PAR model to detect and predict the current activity type from streaming sensory data. Fig. 1 shows the recognition component processing. We simulated the real-time environment for using the S-PAR model by feeding the continuous streaming data from the testing data. A 2-second non-overlapping sliding window was applied to split the stream of data into small data chunks. Each small data chunk was then processed in the following steps.

Step 1: As in the Data Preprocessing step, only the acceleration signals along the three axes were separated into the body acceleration and the constant acceleration due to gravity by using the 3rd-order Butterworth low-pass filter. After that, we calculated the magnitude (M_i) of each raw acceleration sample and then computed the standard deviation (SD_w) of all the magnitudes (M_i) in that data chunk.

Step 2: If SD_w is less than the Threshold, the incoming signals are treated as dormant activities; otherwise, the incoming signals are treated as energetic activities.

Step 3: If the incoming signals are dormant activities, then only the constant acceleration signal along the three axes is used to compute the average of each axis by using Eq. 5. We then apply these average values to the Dormant Activities Model (LDA) to obtain the predicted activity type. Otherwise, if the incoming signals are energetic activities, then both the body acceleration signals and the angular velocity signals along the three axes are used to compute the root mean square (RMS), skewness (Skew) and interquartile range (IQR) of each axis. After that, we apply the transformed signals to the Energetic Activities Model (SVM) to obtain the predicted activity type.
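Putting the pieces together, a per-window decision in the recognition component might look like the following sketch. It reuses the helper functions from the earlier sketches and assumes that a fitted LDA model, a fitted SVM model and the Threshold value are already available; the names are ours, not the paper's.

# Illustrative sketch of the per-window decision (Steps 1-3 above).
import numpy as np

def recognize_window(raw_acc, gyro, threshold, dormant_lda, energetic_svm):
    gravity, body = split_gravity_body(raw_acc)            # earlier preprocessing sketch
    sd_w = window_sd(raw_acc)                               # SD of the raw-magnitude
    if sd_w < threshold:                                    # dormant branch
        means = gravity.mean(axis=0).reshape(1, -1)         # Eq. (5) averages
        return dormant_lda.predict(means)[0]
    feats = energetic_features(body, gyro).reshape(1, -1)   # energetic branch
    return energetic_svm.predict(feats)[0]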
IV. RESULTS AND DISCUSSION

In this section, we present how the S-PAR model was evaluated and display the experimental results on two public datasets.

A. Experiments Setup

The datasets for this research were collected from public databases. The first dataset (DS1) is the Complex Human Activities Dataset from the Pervasive Systems research datasets [14]. This dataset contains linear acceleration and angular velocity values of 10 participants. The participants performed the 6 activities listed in Table I. The DS1 dataset was collected from a Samsung Galaxy S2 smartphone attached to the right wrist with a sampling frequency of 50 Hz. The second dataset (DS2) was acquired from [8]. This dataset contains linear acceleration data and triaxial angular velocity values collected from 15 subjects. Each subject wore an LG G Watch R and performed 7 physical activities under realistic conditions; for instance, the subjects walked through the city or jogged in a forest [8]. The details of the DS2 dataset are listed in Table I. The sensor data were recorded with a sampling rate of 50 Hz.

TABLE I: The number of samples for each activity in the two datasets.
Activity Types | DS1 | DS2
Standing | 90,000 | 155,788
Sitting | 90,000 | 138,063
Lying | - | 142,316
Walking | 90,000 | 138,409
Stairs Up | 90,000 | 114,703
Stairs Down | 90,000 | 116,806
Running | 90,000 | 153,089

To evaluate the performance of the proposed model, we used leave-one-subject-out (LOSO) cross validation [10]. The data of one subject were used as the test dataset and the data of the other subjects were used as the training dataset; this process was repeated with every subject as the test dataset. In addition, the proposed model (S-PAR) used the F-score [15] to measure the efficiency compared with the state-of-the-art models, which are the two architectures proposed in [9] and the research in [7].

F-score = (2 × Precision × Recall) / (Precision + Recall)   (9)
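A minimal sketch of this LOSO evaluation with scikit-learn is given below; the macro-averaged F-score and the SVM default are illustrative assumptions, since the paper reports per-activity F-scores for several classifiers.

# Illustrative sketch of leave-one-subject-out evaluation with the F-score.
# X: feature matrix, y: activity labels, subjects: subject id of every window.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.svm import SVC
from sklearn.metrics import f1_score

def loso_f1(X, y, subjects, make_model=lambda: SVC(kernel="rbf")):
    scores = []
    logo = LeaveOneGroupOut()
    for train_idx, test_idx in logo.split(X, y, groups=subjects):
        model = make_model()                       # fresh model per held-out subject
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        scores.append(f1_score(y[test_idx], pred, average="macro"))
    return np.mean(scores), np.std(scores)         # mean and dispersion over subjects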
B. Evaluation of the Activity Recognition

In order to show the performance of S-PAR, we compared it to the state-of-the-art methods. Table II shows the average F-score over all activities on the two public datasets. From these results, S-PAR obtains the best performance, achieving a rate of 88.62% on the DS1 dataset with the lowest standard deviation. In the same way, S-PAR presents the best results for DS2, at a rate of 71.52%, with the lowest standard deviation. Why did all of the algorithms obtain results lower than 80% on DS2? Because the DS2 dataset was collected under realistic constraints in a real-life environment, as mentioned before. Therefore, we can conclude that the S-PAR model achieves the best performance with the lowest dispersion on all datasets compared with the 3 other models.

TABLE II: The average of F-score measurement on public datasets.
Algorithms | DS1 | DS2
Architecture 1 (2013) [9] | 79.14±22.65 % | 68.67±13.26 %
Architecture 2 (2013) [9] | 85.70±19.53 % | 66.84±11.72 %
Research (2015) [7] | 86.28±16.65 % | 62.75±15.35 %
S-PAR | 88.62±11.71 % | 71.52±11.34 %

C. Evaluation of the Classifier Models

As mentioned before in subsection III-C, the S-PAR model uses LDA and SVM as the classifiers for detecting and predicting the current activity type. We compared their performance to other common classification techniques and evaluated them with LOSO cross validation. Each classification algorithm was implemented with Scikit-learn, which is a free software machine learning library for the Python programming language, and the parameters of each algorithm were set to the defaults. Tables III and IV demonstrate the efficiency of the different classification techniques on the two datasets using the F-score measurement. Table III shows that LDA achieves high performance for dormant activities on both datasets. For energetic activities, LDA cannot deal with the non-linear and overlapping data, so we decided to use Naïve Bayes instead of LDA. Table IV demonstrates that SVM obtains the best efficiency overall.

V. CONCLUSIONS

In this research, we present a new activity recognition model using the streaming sensor data produced by the accelerometer and gyroscope embedded on smartwatches. The proposed model is a cross-subject approach, or an impersonal activity recognition model. That means the model is built once, and new users can utilize it without being required to provide training data. For building the activity recognition, we examined a variety of machine learning algorithms. LDA and SVM presented the best results and were applied for the dormant activities model and the energetic activities model, respectively. In addition, the proposed model was examined to compare its performance with the other models. From the results, the proposed model provides good overall performance in detecting and predicting activity types. For future research, we have planned to enhance the S-PAR model to detect and predict the activity types more efficiently.

ACKNOWLEDGMENT

This work was financially supported by a Research Grant of Burapha University through the National Research Council of Thailand in Grant no. 141/2560 and no. 113/2561.
TABLE III: F-score measure for each classifier for the dormant activities.
Dataset | Activity Types | Decision Tree | k-NN | Random forest | SVM | LDA
DS1 | Standing | 98.35±2.55% | 98.92±2.00% | 98.50±2.25% | 98.92±1.95% | 98.92±2.00%
DS1 | Sitting | 98.61±2.06% | 99.26±1.00% | 98.79±1.70% | 99.27±0.93% | 99.27±1.00%
DS2 | Standing | 71.30±12.90% | 74.57±12.86% | 74.60±13.02% | 74.68±13.51% | 75.43±15.18%
DS2 | Sitting | 55.63±22.28% | 58.80±25.40% | 52.48±23.00% | 53.19±23.66% | 69.51±27.57%
DS2 | Lying | 37.01±37.05% | 43.01±62.13% | 43.09±41.18% | 46.26±37.71% | 70.56±35.78%

TABLE IV: F-score measure for each classifier for the energetic activities.
Dataset | Activity Types | Decision Tree | k-NN | Random forest | SVM | Naïve Bayes
DS1 | Walking | 52.79±22.6% | 66.74±20.44% | 66.48±27.21% | 75.12±22.128% | 66.40±33.54%
DS1 | Stairs Up | 55.41±9.89% | 63.61±13.70% | 68.26±11.05% | 75.31±9.44% | 61.15±22.92%
DS1 | Stairs Down | 71.66±11.56% | 76.96±14.00% | 81.71±9.11% | 81.37±11.07% | 68.81±20.38%
DS1 | Running | 98.62±1.51% | 99.17±1.17% | 99.17±0.94% | 99.24±1.57% | 97.71±4.05%
DS2 | Walking | 54.20±10.61% | 62.13±7.75% | 64.00±9.22% | 68.37±10.21% | 63.22±12.59%
DS2 | Stairs Up | 40.75±12.16% | 41.78±12.43% | 43.96±13.82% | 50.02±14.88% | 29.30±17.15%
DS2 | Stairs Down | 62.74±9.82% | 66.52±7.74% | 68.68±10.41% | 74.21±8.18% | 54.58±16.95%
DS2 | Running | 89.04±9.86% | 90.00±10.45% | 89.91±10.00% | 87.99±9.58% | 87.65±11.03%

REFERENCES

[1] I. Vuori, "World health organization and physical activity," Progress in Preventive Medicine, 2018.
[2] A. Bonomi and K. Westerterp, "Advances in physical activity monitoring and lifestyle interventions in obesity: A review," International Journal of Obesity, pp. 167–177, 2005.
[3] M. Kheirkhahan, S. Nair, A. Davoudi, P. Rashidi, A. A. Wanigatunga, D. B. Corbett, T. Mendoza, T. M. Manini, and S. Ranka, "A smartwatch-based framework for real-time and online assessment and mobility monitoring," Journal of Biomedical Informatics, pp. 29–40, 2019.
[4] A. Bayat, M. Pomplun, and D. A. Tran, "A study on human activity recognition using accelerometer data from smartphones," Procedia Computer Science, pp. 450–457, 2014.
[5] T. Dungkaew, J. Suksawatchon, and U. Suksawatchon, "Impersonal smartphone-based activity recognition using the accelerometer sensory data," in 2017 2nd International Conference on Information Technology (INCIT), 2017, pp. 1–6.
[6] Y. Lu, Y. Wei, L. Liu, J. Zhong, L. Sun, and Y. Liu, "Towards unsupervised physical activity recognition using smartphone accelerometers," Multimedia Tools and Applications, pp. 10701–10719, 2017.
[7] M. Shoaib, S. Bosch, H. Scholten, P. J. M. Havinga, and O. D. Incel, "Towards detection of bad habits by fusing smartphone and smartwatch sensors," in 2015 IEEE International Conference on Pervasive Computing and Communication Workshops (PerCom Workshops), 2015, pp. 591–596.
[8] T. Sztyler, H. Stuckenschmidt, and W. Petrich, "Position-aware activity recognition with wearable devices," Pervasive and Mobile Computing, pp. 281–295, 2017.
[9] F. G. da Silva and E. Galeazzo, "Accelerometer based intelligent system for human movement recognition," in 5th IEEE International Workshop on Advances in Sensors and Interfaces (IWASI), 2013, pp. 20–24.
[10] F. Shahmohammadi, A. Hosseini, C. E. King, and M. Sarrafzadeh, "Smartwatch based activity recognition using active learning," in Proceedings of the Second IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies. IEEE Press, 2017, pp. 321–329.
[11] Y. Kwon, K. Kang, and C. Bae, "Unsupervised learning for human activity recognition using smartphone sensors," Expert Systems with Applications, pp. 6067–6074, 2014.
[12] M. T. Uddin, M. M. Billah, and M. F. Hossain, "Random forests based recognition of human activities and postural transitions on smartphone," in 2016 5th International Conference on Informatics, 2016, pp. 250–255.
[13] A. M. Khan, A. Tufail, A. M. Khattak, and T. H. Laine, "Activity recognition on smartphones via sensor-fusion and KDA-based SVMs," International Journal of Distributed Sensor Networks, 2014.
[14] M. Shoaib, S. Bosch, O. Incel, H. Scholten, and P. Havinga, "Complex human activity recognition using smartphone and wrist-worn motion sensors," Sensors, p. 426, 2016.
[15] M.-C. Kwon and S. Choi, "Recognition of daily human activity using an artificial neural network and smartwatch," Wireless Communications and Mobile Computing, 2018.
AppDOSI: An Application for Analyzing and Monitoring the Personal Software Process

Nuansri Denwattana, Software Engineering Program, Faculty of Informatics, Burapha University, Chonburi, Thailand, [email protected]
Apisit Saengsai, Information System Engineering Research Laboratory, Faculty of Informatics, Burapha University, Chonburi, Thailand, [email protected]
Eakachai Charoenchaimonkon, Centre for Educational Technology, Ministry of Education, Bangkok, Thailand, [email protected]

Abstract—Software quality can be achieved by process improvement; software engineers aim to analyze and monitor the whole process of the development. This paper presents 'AppDOSI', a web application that allows software engineers to manage and keep track of their work performance. The design of the application is in line with the principles of the Personal Software Process (PSP). The application has been used extensively in the Software Developing Camp and gained positive feedback from software engineering undergraduates as well as their course instructors.

Keywords—personal software process, PROxy-Based estimating, tracking individual performance, software architectural design, PSP

I. INTRODUCTION

Performance testing and monitoring methods are key factors in product success; this is the reason why a vast number of software developers pay attention to the process of development rather than the choice of technology. Since analyzing and measuring performance is not a trivial task, the recommendation has been made for software engineers to focus on improving their processes. Improved software processes will lead to improved product quality [1].

Enhancing learners' skills in software development and externalizing those skills to a team are also challenging issues for software engineering undergraduates. Collecting project records can help learners to identify their work performance, e.g., software size, time management and defect handling. The researchers adopted the theory of the Personal Software Process (PSP) and PROxy-Based Estimating (PROBE), proposed by Humphrey [1], and investigated tools that can organize and support software project management teams. We conclude by developing and testing a tool called 'AppDOSI' ("Taking a look"), a web-based application that software engineering students and their instructors can use to estimate and plan their projects, measure and track their work, and improve the quality of their products. The rationale behind the application is in line with Humphrey [1].

The structure of the paper is as follows. Section 2 reviews the concepts of PSP and PROBE, and gathers and presents existing tools that support PSP and PROBE. Section 3 depicts the software architectural design and the development of 'AppDOSI'. Section 4 presents the evaluation and a discussion of further work. Section 5 gives the summary and conclusions.

II. RELATED WORKS

This section can be divided into two parts. The first part reviews the theory of PSP and PROBE, while the latter gives a comparison of existing tools that support PSP and PROBE.

A. PSP and PROBE

The Personal Software Process (PSP) is a framework of techniques to help engineers improve their performance, and that of their organizations, through a step-by-step disciplined approach to measuring and analyzing their work [1]. The goal of PSP is to develop a high-quality product on time and in budget [2]. PSP can be taught and applied in different software engineering tasks [3]. The process consists of a set of methods, forms, and scripts that show software engineers how to plan, measure and manage their work. It was designed to be used with any programming language or design methodology, and it can be used for most aspects of software work, including writing requirements, running tests, defining processes, and repairing defects [4].

PSP helps software engineers to (1) improve their estimating and planning skills, (2) make commitments they can keep, (3) manage the quality of their projects and (4) reduce the number of defects in their work; in other words, PSP explains:

- How to accurately estimate, plan, track, and preplan the time required for individual software development efforts
- How to work according to a well-defined process
- How to define and refine the process
- How to use reviews effectively and efficiently for improving software quality and productivity (by finding defects early)
- How to avoid defects
- How to analyze measurement data for improving estimation, defect removal, and defect prevention
- How to identify and tackle other kinds of process deficiencies [5].

In terms of PROBE, it uses proxies, or objects, as the basis for estimating the likely size of a product. Software engineers have to determine the objects required to build the product described by the conceptual design. They then
determine the likely type and number of methods for each object. They refer to historical data on the sizes of similar objects they have previously developed and use linear regression to determine the likely overall size of the finished product. It is worth mentioning, at this point, that to use linear regression, the engineers must have historical data on estimated versus actual program size for at least three prior programs [4].
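As a small illustration of the PROBE idea described above (not Humphrey's full procedure), the sketch below fits a linear regression of actual size on proxy-based estimated size over a few prior programs and projects the size of a new one; the numbers are invented for the example.

# Illustrative sketch of PROBE-style size projection via linear regression.
import numpy as np

estimated = np.array([120.0, 250.0, 400.0, 310.0])  # proxy-based size estimates (LOC)
actual = np.array([150.0, 280.0, 460.0, 350.0])     # measured sizes of the same programs

beta1, beta0 = np.polyfit(estimated, actual, deg=1)  # slope and intercept (needs >= 3 points)

def probe_size(new_estimate):
    """Projected size for a new program given its proxy-based estimate."""
    return beta0 + beta1 * new_estimate

print(round(probe_size(200.0), 1))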
B. Related Tools

To the best of our knowledge and from what we explored, we elicited four PSP-related tools that provide features that help and support PSP learning and the development process. First is 'PSP Studio' [6]. This tool provides a PSP official workbook and forms; however, it does not come with a LOC (Lines Of Code) counter. Second is the 'Software Process Dashboard' [7]. This software comes with several features of PSP, i.e., planning, monitoring, analyzing, and reporting; however, it requires the Java Runtime Environment (JRE) to be installed and is therefore restricted to some operating system environments. Next is 'Hackystat' [8]. This tool puts the emphasis on the features of analyzing and reporting data in several views; however, it cannot handle many PSP processes. Last is 'PSP DROPS' [9]. This is a web application service that cannot keep the defect recording log. To conclude, these tools offer a limited set of features, and thus they cannot support the whole PSP development process, as illustrated in Table I.

TABLE I. THE COMPARISON OF PSP TOOLS
The tools compared are PSP Studio [6], Process Dashboard (PDash) [7], Hackystat [8], and PSP DROPS [9], against the following functions/features:
- Time recording: record all of the time you spend on the project.
- Defect recording: record each defect separately and completely.
- Size accounting: line-of-code counting.
- Print function: exporting your report files.
- PROBE estimate: PROxy-Based Estimating.
- Development profile: graphical analysis report (time, defect, size, etc.).
- Online: available on the internet.
- Coding standards: naming convention check.

III. SYSTEM DESIGN AND DEVELOPMENT

AppDOSI has been designed and developed by the Department of Software Engineering, Burapha University, since 2013. It is open source software, developed with PHP 7 and MySQL 5.6. This section depicts the system architectural design and the development framework.

A. Architectural Design

AppDOSI is comprised of (1) the Software Development Repository, where the personal data, team data, and organizational data are stored, (2) Software Development Gathering & Monitoring, where the managing and monitoring data are kept, and (3) Software Development Expertise, where the analytical data is depicted in several formats (see Fig. 1).

Fig. 1. Framework Design

B. System Development Framework

As mentioned earlier, AppDOSI is web-based open source software and consists of 6 modules. According to the use-case diagram (see Fig. 2), there are three actors, i.e., students, staff, and course instructors.

Fig. 2. Use-case diagram

• Software Project Management: The system allows users to set up project characteristics such as a project's name, subject id & name, and team members. Users can identify task specifications, that is, task numbers together with the name, and the PSP levels, i.e., PSP0, PSP0.1, PSP1, PSP1.1, PSP2, and PSP2.1.
• Personal Software Process Management: This module provides time and defect recording tools. Once the project is launched, the user can select the timer button. In addition, when a defect is found, the user can click the defect/bug button. These tools are available for all software process phases, including plan, design, design review, coding, code review, testing, and postmortem. The system also allows users to upload their source code files in preparation for the competency analysis in another module.
• Software Development Expertise/Competency Analysis: This module contains two parts, i.e., data processing and data visualization. In the first part, the following data are evaluated in order to analyze user competency: time, size, defects in phases, productivity (yields), and time and size estimates versus actuals. In this module, we use PROxy-Based Estimating, called PROBE, in the data processing phase. The output is then presented in the format of a personal dashboard. It contains user expertise/competency measures such as LOC/hr., time in phase, defects in phases and defect removal time (a simple line-counting sketch is given after this list). Relevant data is illustrated in an appropriate format, for example, chart, tabular, and card.
• Software Common Data Setting: This module consists of system common data, i.e., defect types, phases in the software process, part additional type, and complexity weight. These data are later used by the other modules as common data.
• Team Management: The system not only offers personal competency monitoring, it also provides team competency monitoring, so the system provides a team setting tool. An instructor or staff member will use this tool to identify the group's name, the student id and name in each group, and the starting and end dates for each project. When an assignment is offered, the relevant users can then automatically use this team data.
• Overall Competency Control & Monitoring: This module performs in a similar way to the module called Software Development Competency Analysis. The previous module performs the competency evaluation for each person, but this module accomplishes the team evaluation. The system can be viewed from 3 different perspectives, that is, personal performance, team performance, and project performance. There are two actors authorized to use this module, i.e., the team leader and the instructor or coach. The data, i.e., development time, project size, and defects found, are displayed in a dashboard format. Moreover, it allows users to export some parts of the dashboard in PDF format.
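The line counting behind the size-accounting and LOC/hr. measures mentioned above can be illustrated with the following sketch; it is not AppDOSI's implementation, and the comment prefixes are an assumption for PHP- or Python-style sources.

# Illustrative sketch of simple LOC counting: count non-blank lines and
# skip single-line comments (comment prefixes are an assumption).
def count_loc(source_text, comment_prefixes=("//", "#", "*")):
    loc = 0
    for line in source_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue                      # skip blank lines
        if stripped.startswith(comment_prefixes):
            continue                      # skip single-line comments
        loc += 1
    return loc

example = "<?php\n// add two numbers\nfunction add($a, $b) {\n    return $a + $b;\n}\n"
print(count_loc(example))   # 4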
IV. EVALUATION AND DISCUSSION

AppDOSI was deployed at the Software Developing Training Camp for sixteen weeks. After the camp, the seventy-six enrolled students and three course instructors were asked to evaluate the tool. The evaluation was carried out by a questionnaire, which comprised five activities, namely project creation, project scheduling, defect handling, project size, and a custom analytical report.

TABLE II. A COMPARISON OF PRACTITIONERS' AND INSTRUCTORS' SATISFACTION
Activity | Practitioners (N=76) | Instructors (N=3)
Create projects | 2.93 (neutral) | 3.24 (neutral)
Stopwatch | 3.01 (neutral) | 4.11 (satisfied)
Defect log | 3.62 (satisfied) | 4.51 (highly satisfied)
Count LOC | 2.77 (neutral) | 4.68 (highly satisfied)
Proficiency Reports | 3.41 (neutral) | 4.09 (satisfied)

The findings reveal that both the enrolled students and the course instructors were satisfied with the software. They found several system functionalities useful, such as project creation, project scheduling, and the custom report (see Table 2). In addition, they recommended that the system should be able to provide a more comprehensive summary, support the Agile methodology, and be presented as a 'Kanban board' (To do, Doing, Done). They also recommended that the count-lines-of-code feature should be able to automatically check the naming convention of variables.

V. SUMMARY AND CONCLUSION

Managing and monitoring the entire process of the development life cycle is extremely important to software engineers, while novice developers seek to systematically analyze and evaluate their performance [3]. We have developed 'AppDOSI', a web application that allows developers to manage and monitor their work performance. The design and development of AppDOSI is in line with PSP, and its features allow software engineering undergraduates to understand the whole process of PSP.

Although there are many tools on the market that provide the basic functionalities and support the learning of PSP, AppDOSI has proven to be one of the successful open-source systems. Feedback on AppDOSI has been gathered, and this will assist in guiding the developers to make AppDOSI an optimal choice for software engineering experts.

APPENDIX

Examples of the AppDOSI screenshots are shown in Figure 3 – Figure 7.

Fig. 3. Stopwatch
Fig. 4. Upload source code file
Fig. 5. Time estimation
Fig. 6. Size (LOC) estimation
Fig. 7. PROBE

ACKNOWLEDGMENT

Finally, we wish to thank all staff in the following departments: the Information Systems Engineering Research Laboratory, Faculty of Informatics, together with the Eastern Software Park, Burapha University.

REFERENCES

[1] W. S. Humphrey, "Using a defined and measured Personal Software Process," IEEE Software, vol. 13, no. 3, pp. 77–88, May 1996.
[2] S. Mansoor, A. Bhutto, N. Bhatti, N. A. Patoli and M. Ahmed, "Improvement of students abilities for quality of software through personal software process," in 2017 International Conference on Innovations in Electrical Engineering and Computational Technologies (ICIEECT), Karachi, 2017, pp. 1–4.
[3] M. Raza and J. P. Faria, "ProcessPAIR: A tool for automated performance analysis and improvement recommendation in software development," in 2016 31st IEEE/ACM International Conference on Automated Software Engineering (ASE), Singapore, 2016, pp. 798–803.
[4] W. S. Humphrey, The Personal Software Process Handbook. Pittsburgh, PA: Carnegie Mellon University, 2000.
[5] L. Prechelt and B. Unger, "An experiment measuring the effects of personal software process (PSP) training," IEEE Transactions on Software Engineering, vol. 27, no. 5, pp. 465–472, May 2001.
[6] 97 Design Studio Team, "PSP Studio," Available: http://www-cs.etsu.edu/psp/, Aug. 15, 2005.
[7] Tuma Solutions LLC, Process Dashboard [online], Available: http://www.processdash.com, Dec. 20, 2013.
[8] P. M. Johnson et al., "Beyond the Personal Software Process: Metrics collection and analysis for the differently disciplined," in 25th International Conference on Software Engineering, 2003. Proceedings, Portland, OR, USA, 2003, pp. 641–646.
[9] I. Syu, A. Salimi, M. Towhidnejad and T. Hilburn, "A web-based system for automating a disciplined personal software process (PSP)," in Proceedings Tenth Conference on Software Engineering Education and Training, Virginia Beach, VA, USA, 1997, pp. 86–96.
2019 4th International Conference on Information Technology (InCIT2019) Automatic Celebrity Weight Estimation Jian Qu Chinorot Wangtragulsang Datchakorn Tancharoen School of Information and Technology School of Information and Technology School of Information and Technology Department of Engineering and Department of Engineering and Department of Engineering and Technology Technology Technology Panyapiwat Institute of Management Panyapiwat Institute of Management Panyapiwat Institute of Management Nonthaburi, Thailand Nonthaburi, Thailand Nonthaburi, Thailand [email protected] [email protected] [email protected] Abstract—Information of celebrity available on the unstructured information as website articles or paragraphs Internet includes nickname, date of birth, height, weight, might not be able to be segmented [2]. Gaizauskas et al. works, social media contact and relationship. However, many (2002) developed PASTA system that extracted protein celebrities do not have information such as weight online. We structures and other information from biological texts. The propose a novel method of celebrity weight estimation by system used 1,513 Medline abstracts from 20 major using the celebrity’s self-reported height and BMI. It is found scientific journals on macromolecular structures, which that our method is accurate for female celebrities, coming were annotated manually and determined that if within 3-4 kilograms. Height estimation of male celebrities is biochemical affixes that indicate candidate enzyme or less accurate, with errors ranged from 1 to 10 kilograms. protein names were found. The candidate terms were then Picture-based height estimation could be erroneous up to 10 compared against databases and parsed. After that the centimeters due to perspective issues. parsed data was processed. The system performed well for manually-annotated data, but there was no experiment on Keywords—information extraction, weight estimation, free-text, non-annotated data [3]. personal information, BMI, height information extraction The next group of information extraction work is from I. INTRODUCTION free-text or social media posts, and can be divided further into lexicon-based and non-lexicon-based groups. Kim et Information of a celebrity available on the Internet can al. (2013) developed a patient information extraction which range from nickname, date of birth, height, weight, works, captured texts from Tele-Health system in Canada. The social media contacts and who they are dating with. Unified Medical Language System (UMLS) was used as a However, some information such as weight of the celebrity lexicon for disease, while regular expression was used to can be difficult to obtain as weight can rapidly change over extract other information such as age, gender and time and celebrities, especially female, are understandably temporality [4]. Ku et al. (2008) developed a crime reluctant to share their information weight online. information extraction system that get information from police and witness narrative reports. The Uniform Crime This study aims to develop a system that can Reports, Wikipedia, MSN Encarta, FrameNet, Serious automatically extract celebrities’ height information from Wheels and some other thesauri were used. GATE was also unstructured webpages, determine typical BMI for used to tokenize, split sentences and tag part-of-speech [5]. 
celebrities, and then estimate the celebrities’ weight using As the systems still used a lexicon to get disease the extracted height information and the estimated BMI. information, this approach would not be useful for contexts with little infrastructure. This work has 6 sections. Section 2 is review on similar works. Section 3 describes the method we use in this work. Imran et al. (2013) used conditional random fields and Section 4 has result of the experiment. Section 5 has a dedicated Twitter tokenizer ARKNLP to get disaster discussions on the result and Section 6 has conclusion of information from Twitter. Twitter posts would be this study. categorized into groups (personal, informative, other) and conditional random fields was used to learn how to tag each II. LITERATURE REVIEW word in the post as related or not related with disaster or infrastructure. Due to the use of a dedicated tokenizer, this Information extraction works can be separated into 2 method was not really suitable for Thai texts outside of major groups: works on structured and unstructured data. Twitter [6]. Chen et al. (2009) developed PolyUHK to Extraction from structured text is done on resumes and extract personal name from web pages. This work used books for example. Kopparapu (2010) developed a rule- Beautiful Soup and MXTERMINATOR to prepare a based/machine-learning system to extract personal webpage for extraction. This approach applies only to information from free-structure resume that had precision English-language websites as MXTERMINATOR is and recall of 91% and 88% respectively. However, this limited to that language [7]. Li et al. (2014) suggested a approach is limited to resume [1]. Yu et al. (2005) user profile extraction (Education, job and spouse) from developed a cascaded system to extract information from Twitter with weak supervision. This approach was limited structured resume. The resume would be separated into blocks, and texts in each block would be segmented and labeled. However, this approach is not very suited for 289
to Twitter and did not work really well for jobs, as on Twitter people could mention companies in various contexts other than employment [8]. Emami et al. (2016) used pattern matching to obtain personal information from Farsi-language websites. This work used a manually created keyword list to extract various personal information, including e-mail address, place of birth (or death), relatives, and occupation. Because only keywords were used, this approach could not extract information written in many variations [9]. Qu and Lu (2016) developed a system to translate out-of-vocabulary words from English to Chinese using pattern matching and association rules. In this work, Chinese translation "candidates" were extracted from website snippets on Google, and association rules were then used to determine the winning "candidate" for an English word [10].

Many Thai-language information extraction systems are developed for Twitter, possibly due to the availability of Thai-language tokenizers such as Lexto. One example is Klaithin and Haruechaiyasak (2016), who proposed an information extraction system that obtained traffic information from Twitter [11].

Human weight estimation from other known information, such as height, has been used in the forensic and medical domains for personal identification and treatment purposes. Young and Korotzer (2016) reviewed various methods used in body weight estimation and suggested that parent estimation and length-based methods are the most accurate for predicting children's total body weight. Such methods would require direct measurement, which is impractical with celebrities [12].

Typically, human height becomes stable after a certain age: Roser et al. (2013) stated that boys reach their peak height at 18 years old and girls at 16 years old [13]. People do lose height as they grow older. Sorkin et al. (1999) stated that height loss starts at 30 years of age; men lose 3 centimeters cumulatively once they reach 70 years of age, while women could lose up to 5 centimeters once they are 80 years old. By those ages, height loss could reach 5 and 8 centimeters for men and women respectively [14].

III. RESEARCH METHOD

Our method consists of 3 steps: 1) website snippet extraction using the Google API and manual picture collection, 2) extraction of height information, and 3) weight estimation using height information and BMI. For comparison, a master student was employed to manually gather height and weight information of the celebrities. The flowchart of our system is shown in Fig. 1.

Fig. 1. Flowchart of weight estimation

A. Website snippet extraction using Google API and manual picture collection

317 celebrity names listed on MThai.com are used as input to the Google Custom Search API to collect website snippets related to them. The number of snippets for each celebrity is set to 100 to ensure the data inclusion rate, as mentioned in Qu et al. (2016) [10]. Also, 3 pictures of each celebrity are manually collected for picture-based height estimation. Pictures older than 5 years are not collected.

The date of a picture is taken from the date of the source article the picture appears in, along with the age of the celebrity at the time the picture was taken, in case the celebrity is a child actor/actress or is going through puberty (basically any celebrity under 18 years old for males and 16 years old for females).

B. Extraction of height information

Height information of celebrities is extracted by pattern matching. If one celebrity has more than one height information "candidate", the average of the height candidates is selected.

If no height information is available from our system, the height is estimated from the 3 pictures collected in the previous process using the following height formula:

height of b = known height of a x (height of b in picture / height of a in picture)   (1)

where a is an object of known height and b is the celebrity whose height is being estimated.

Known objects can range from fellow celebrities with known height (in the case of group pictures) to objects such as:

- Weapon props
- Chairs
- Cars
- Musical instruments

Heights of the aforementioned objects are obtained from Wikipedia and/or direct measurement (if applicable). The object or celebrity inside the picture is measured using a photo editing program's measuring tool, and the in-picture height of that object or celebrity is extracted. After that, the actual height of the celebrity is calculated with the height formula. An example of height estimation based on the height of a known object is shown in Fig. 2.

Fig. 2. Example of height information extraction based on the known height of another object. (In the example, a reference celebrity of known height 160 cm appears in the same picture as the target celebrity; every centimetre measured in the picture therefore corresponds to 40 cm in real life, so the target's 4.5 cm picture height yields an estimated real height of 180 cm.)

C. Weight estimation using height information and BMI value

In this step, height information and BMI are used to estimate weight. The formula is as follows:

BMI = body weight (kg) / body height (m)^2   (2)

This is the standard BMI formula applied to celebrity weight estimation. However, while the height is either obtained or estimated in the previous steps, the BMI must still be known in order to obtain the weight.
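The first two steps of the method, snippet collection through the Google Custom Search API (Section III.A) and pattern-based height extraction with the picture fallback of Eq. (1) (Section III.B), can be sketched as follows. This is only an illustration: the paper does not publish its code, query parameters, or matching patterns, so the API credentials, the regular expressions, the plausibility bounds, and the function names below (collect_snippets, extract_height_cm, height_from_picture) are our own assumptions.

```python
import re
import statistics
import requests

API_KEY = "YOUR_API_KEY"          # hypothetical credentials for the
SEARCH_ENGINE_ID = "YOUR_CX_ID"   # Custom Search JSON API

def collect_snippets(celebrity_name, pages=10):
    """Collect up to ~100 snippets (10 per request) for one celebrity name."""
    snippets = []
    for page in range(pages):
        params = {
            "key": API_KEY,
            "cx": SEARCH_ENGINE_ID,
            "q": celebrity_name,
            "start": page * 10 + 1,   # the API returns at most 10 results per call
        }
        resp = requests.get("https://www.googleapis.com/customsearch/v1", params=params)
        for item in resp.json().get("items", []):
            snippets.append(item.get("snippet", ""))
    return snippets

# Illustrative patterns for heights written as "172 cm" or "1.72 m";
# the patterns actually used in the paper are not published.
HEIGHT_PATTERNS = [
    (re.compile(r"(\d{3})\s*(?:cm|เซนติเมตร)"), 1.0),      # centimetres
    (re.compile(r"(\d\.\d{1,2})\s*(?:m|เมตร)\b"), 100.0),  # metres converted to cm
]

def extract_height_cm(snippets):
    """Pattern-match height candidates and return their average (Section III.B)."""
    candidates = []
    for text in snippets:
        for pattern, factor in HEIGHT_PATTERNS:
            for match in pattern.findall(text):
                value = float(match) * factor
                if 140 <= value <= 210:        # discard implausible candidates
                    candidates.append(value)
    return statistics.mean(candidates) if candidates else None

def height_from_picture(known_height_cm, known_cm_in_picture, target_cm_in_picture):
    """Fallback of Eq. (1): scale a reference object's real height by the picture ratio."""
    return known_height_cm * target_cm_in_picture / known_cm_in_picture

if __name__ == "__main__":
    snippets = collect_snippets("celebrity name here")
    height = extract_height_cm(snippets)
    if height is None:
        # e.g. a 160 cm reference measuring 4 cm while the celebrity measures 4.5 cm
        height = height_from_picture(160, 4.0, 4.5)   # 180 cm, as in Fig. 2
    print(height)
```

The Custom Search JSON API returns at most 10 results per request, which is why roughly ten requests per name are needed to approach the 100-snippet target mentioned above.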
Average BMI values are used for weight estimation of celebrities with known or estimated height. 10 male and 10 female celebrities with self-reported weight and height information are selected by random sampling, and two average BMI values are calculated, one for male and one for female celebrities. The average BMI values are then used to estimate the weight of the remaining celebrities.

The average BMI values obtained were 16.7 for female and 20.2 for male celebrities. Thus, the modified BMI formula is as follows:

BMI_f,m = body weight (kg) / body height (m)^2   (3)

where BMI_f = 16.7 and BMI_m = 20.2, so the estimated weight is simply BMI_f,m x body height (m)^2.

After that, the results of the experiment are compared with weight estimation using the "optimally healthy" BMI of 21.7 (the middle of the healthy BMI range of 18.5-24.9) for performance comparison.

IV. EXPERIMENTS

A. Experiment setup

A master student was hired to manually collect height and weight information of the celebrities from the Internet, which is used as the baseline to compare with the results of the proposed method.

BMI-based estimation was done twice: the first time using the proposed method, and the second time using the optimally healthy BMI value (21.7, the middle of the healthy BMI range of 18.5-24.9).

Regarding accuracy, if the estimated weight is within 7% of the celebrity's reported body weight, the estimate is deemed correct, since human weight can fluctuate by up to 5-6 pounds in a single day.

B. Website snippet extraction using Google API and manual picture collection

In all, 22,484 snippets on 317 celebrities were collected, with the average number of snippets for one celebrity being 71. 3 pictures for each of the 317 celebrities were also acquired for picture-based height estimation. Fig. 3 shows part of the snippets as collected and organized into a table.

Fig. 3. Example of dataset or snippets as stored in the database. Title, snippet, web URL and name of the celebrity in question are shown.

C. Extraction of height information

From the collected snippets, only 94 celebrities had height information candidates. The 119 female and 104 male celebrities who did not have height candidates had their pictures examined for height. Fig. 4 shows the celebrities' heights manually obtained from the Internet compared with the heights obtained by picture estimation.

Fig. 4. Example of picture-based estimated height for celebrities with known (second column from right) and unknown height (first column from right).

D. Weight estimation using height information and BMI

Fig. 5 shows female celebrity weight estimation based on automatically extracted height compared with the celebrities' weight information manually obtained from the Internet.

Fig. 5. Example of female celebrity weight estimation result (BMIEstMod) compared with their body weight information available online (WeightBaseline).

The experiment with female celebrities whose height and weight information is available online showed that 12 female celebrities with automatically obtained height information had estimated weights within less than 1 kilogram of their actual weights. The average error for female celebrities is 2.370586 kilograms.
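Before turning to the male-celebrity results, the weight-estimation step of Eq. (3) and the 7% accuracy criterion described above can be condensed into a short sketch. The average BMI values (16.7 and 20.2), the 21.7 baseline, and the 7% tolerance come from the text; the function names and the example figures at the bottom are ours.

```python
# Illustrative sketch of the weight-estimation step (Sections III.C and IV.A);
# the function and variable names are ours, not from the paper.
AVG_BMI = {"female": 16.7, "male": 20.2}   # sample averages reported in the paper
OPTIMALLY_HEALTHY_BMI = 21.7               # baseline: middle of the 18.5-24.9 range

def estimate_weight_kg(height_cm, gender, bmi=None):
    """Estimate weight as BMI x height^2, with height converted from cm to m."""
    bmi = AVG_BMI[gender] if bmi is None else bmi
    height_m = height_cm / 100.0
    return bmi * height_m ** 2

def is_correct(estimated_kg, reported_kg, tolerance=0.07):
    """Accuracy criterion: within 7% of the self-reported weight."""
    return abs(estimated_kg - reported_kg) <= tolerance * reported_kg

# Hypothetical example: a 165 cm female celebrity who reports 46 kg
proposed = estimate_weight_kg(165, "female")                         # about 45.5 kg
baseline = estimate_weight_kg(165, "female", OPTIMALLY_HEALTHY_BMI)  # about 59.1 kg
print(proposed, baseline, is_correct(proposed, reported_kg=46.0))
```

Running this over the full celebrity list, once with the gender-specific average BMI and once with 21.7, gives exactly the two sets of estimates compared in the remainder of this section.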
Male celebrities' estimated weights compared with the weight information obtained manually from the Internet are shown in Fig. 6.

Fig. 6. Example of male celebrity weight estimation result. The estimated weight column is the first from the right (BMIEstModNew), compared with the weight information available on the Internet (WeightBaseline).

According to Fig. 6, only 2 male celebrities have an estimated weight close to their actual weight. All but 6 male celebrities actually weigh more than their estimated weight. The average error for male celebrities is 4.878979688 kilograms, almost three times the error of the female counterpart.

A performance test was done by comparing the weight estimated by our system with the estimation using the "optimally healthy" BMI value (21.7). In Figs. 7 and 8, the column "WeightBaseline" is the weight manually obtained from the Internet, the column "BMIEstModNew" is the weight estimated by our system, and the column "WeightEstBase" is the weight estimated using the optimally healthy BMI.

Fig. 7. Example of baseline weight, weight estimated by the proposed system and weight estimated by the optimally healthy BMI value for female celebrities.

As seen in Fig. 7, the average error of optimally-healthy-BMI-based weight estimation for female celebrities is 13.12431 kilograms, far outstripping the 2.370586-kilogram error of the proposed system.

For male celebrities, it was found that the estimation with the optimally healthy BMI actually had less error than our system, as shown in Fig. 8.

Fig. 8. Example of baseline weight, weight estimated by the proposed system and weight estimated by the optimally healthy BMI value for male celebrities.

Fig. 9. Accuracy chart for celebrity weight estimation using the proposed method and the optimally healthy BMI-based method.

The average error of optimally-healthy-BMI-based weight estimation for male celebrities is 4.76444 kilograms, which is slightly less than the 4.87897 kilograms of the proposed system.

It can be seen in Fig. 9 that for female celebrities the proposed method is much more accurate than optimally-healthy-BMI-based estimation (75.59% vs 3.46%), while for male celebrities the result is the opposite (60.00% vs. 61.53%).

V. DISCUSSIONS

A. Website snippet extraction using Google API and manual picture collection

It was found during the experiment that many celebrities had fewer than 100 snippets collected. This was likely due to their obscurity compared to the more famous celebrities. Older celebrities who have not been in the spotlight for some time also had fewer than 100 snippets returned by the Google API. Some of the collected pictures did not cover the celebrities' entire bodies but only from the waist or chest up, which made estimation more difficult.

B. Extraction of height information

We found that many celebrities who had height information available online were former models or modeling contestants. In such cases, their heights were reported by their modeling websites/agencies with
2019 4th International Conference on Information Technology (InCIT2019) relatively high accuracy due to modeling standards. On the AUTHOR’S CONTRIBUTION other hand, singers did not have much height information. The first and second authors conducted the It was also found that despite the snippets collected, experiment, drafted the manuscript and contributed 90% to inclusion rate of height information was rather low as only this research paper. The third author reviewed and made 98 celebrities had height candidates extracted comments on the paper. automatically by our system. On the other hand, we could obtain the height of up to 260 celebrities manually on the REFERENCES Internet. It is probable that as only the name of the celebrities was used as input, some information was not [1] S. K. Kopparapu, “Automatic extraction of usable information from included in the snippets. unstructured resumes to aid search,” in 20310 IEEE International Conference on Progress in Informatics and Computing, vol. 1, 2010, Comparison of celebrities’ heights in the picture also pp. 99–103. found that many celebrities with known height were included in group pictures and could have been used as [2] R. Gaizauskas, G. Demetriou, P. J. Artymiuk and P. Willett, \"Protein reference height, but as they were not one of the 317 structures and information extraction from biological texts: the celebrities used in this study this information could not be PASTA system,\" Bioinformatics, vol. 19, 2003, pp. 135-143. used. [3] K. Yu, G. Guan, and M. Zhou, “Resume information extraction with It was found that the height estimated from pictures cascaded hybrid model,” in Proceedings of the 43rd annual meeting could be erroneous due to picture perspective. on association for computational linguistics, 2005, pp. 499–506. C. Weight estimation using height information and BMI [4] M.-Y. Kim, Y. Xu, O. Zaiane, and R. Goebel, “Patient information value extraction in noisy tele-health texts,” in 2013 IEEE International Conference on Bioinformatics and Biomedicine, 2013, pp. 326–329. Higher weight prediction accuracy for female celebrities compared to male celebrities suggested that [5] C. H. Ku, A. Iriberri, and G. Leroy, “Crime information extraction female celebrities tended to had one single body type and from police and witness narrative reports,” in 2008 IEEE Conference be slightly underweight for their heights. On the other hand, on Technologies for Homeland Security, 2008, pp. 193–198. it was found from the experiment that the sampled male celebrities had distinct body types: muscular and slim. As [6] Y. Chen, S. Y. M. Lee, and C.-R. Huang, “PolyUHK: A Robust a result, the use of average BMI had more error compared Information Extraction System for Web Personal Names,” in 2nd to the female counterpart. One example is Rusameekae Web People Search Evaluation Workshop (WePS 2009), 18th Fagerlund whose estimated weight was only 68.38 WWW Conference, Madrid, Spain, 2009. kilograms compared to his actual weight of 89. The Thai- American Fagerlund is known for having a very large, [7] M. Imran, S. Elbassuoni, C. Castillo, F. Diaz, and P. Meier, muscular body frame. Conversely, Golf-Pichaya (of Golf- “Practical extraction of disaster-relevant information from social Mike duo) weighs only 53 kilograms compared to the media,” in Proceedings of the 22nd International Conference on estimated weight of 58 kilograms as his body frame World Wide Web, 2013, pp. 1021–1024. resembles more of a slim male K-pop singer. [8] J. Li, A. Ritter, and E. 
Hovy, “Weakly supervised user profile Also, comparison of performance between the proposed extraction from twitter,” in Proceedings of the 52nd Annual Meeting system and optimally healthy BMI-based weight of the Association for Computational Linguistics (Volume 1: Long estimation showed that the proposed system worked much Papers), vol. 1, 2014, pp. 165–174. better for female celebrity weight estimation compared to male. This might be attributable to weight distribution of [9] H. Emami, H. Shirazi, A. Abdollahzadeh, and M. Hourali, “A female celebrities as they were all underweight, while in Pattern-Matching Method for Extracting Personal Information in male celebrities, the optimally healthy BMI-based Farsi Content,” University Politehnica of Bucharest-Scientific estimation worked slightly better, likely because the Bulletin, Series C: Electrical Engineering and Computer Science, proposed method’s BMI is slanted towards slimmer vol. 78, 2016, pp. 125–139. celebrities. [10] J. Qu and Y. Lu, “Automatic identification and multi-translatable VI. CONCLUSION translation of vocabulary terms with a combined approach,” in 2016 Eighth International Conference on Advanced Computational Our weight estimation system based on height and Intelligence (ICACI), 2016, pp. 342–348. average BMI can be applied automatically and is accurate for female celebrities as long as their height information is [11] S. Klaithin and C. Haruechaiyasak, “Traffic information extraction available in the snippets. However, it is less accurate for and classification from Thai Twitter,” in 2016 13th International male celebrities due to more diverse body types compared Joint Conference on Computer Science and Software Engineering to female counterpart. Also, low inclusion of height (JCSSE), 2016, pp. 1–6. information severely impacts height candidate extraction and thus specific information keywords might have to be [12] K. D. Young and N. C. Korotzer, “Weight estimation methods in added to the input. children: a systematic review,” Annals of emergency medicine, vol. 68, no. 4, 2016, pp. 441–451. Picture-based height estimation had errors up to 10 centimeters due to perspective of the pictures. [13] M. Roser, C. Appel and H. Ritchie, \"Human Height\", Our World in Data, 2019. [Online]. Available: https://ourworldindata.org/human- In future works we might implement machine learning- height. [Accessed: 03- Aug- 2019]. based regular expression generation to improve recall, and increase the number of celebrities. [14] J. D. Sorkin, D. C. Muller, and R. Andres, “Longitudinal change in height of men and women: implications for interpretation of the body mass index: the Baltimore Longitudinal Study of Aging,” American journal of epidemiology, vol. 150, no. 9, 1999, pp. 969– 977. 293
2019 4th International Conference on Information Technology (InCIT2019) Key Factors of Usability of Science and Technology Faculties’ Website: Marketing Purpose Prajaks Jitngernmadan Prawit Boonmee Faculty of Informatics Faculty of Informatics Burapha University Burapha University Chon Buri, Thailand Chon Buri, Thailand [email protected] [email protected] Abstract—A website of a science and technology faculty is a II. RELATED WORKS platform offering decisive information acquired by high-school Zaphiris and Ellis [2] compared top fifty USA university students for further education. The representative of this in term of their accessibility and usability as well as website is also important in terms of marketing purpose and has investigated whether these two measures are correlated. They to be designed according to usability rules properly. As a collected the websites of the top fifty universities (based on platform, it can be used to implement marketing strategies of the 2001 college ranking of US-News (2001) and evaluated the organization in terms of new student recruitment. This work their accessibility and usability by using two automatic tools aimed to extract the key factors high-school students require for (Bobby and Lift). The results revealed that only 30% of the study-related information acquiring. The key factors can be websites studied are Bobby approved. The analysis also then used as the recommendations for science and technology showed a low usability rating for most of the university faculties’ website design. The websites are also stated based on websites. There was a significant correlation between the high-school students’ needs and usability requirements in terms accessibility approval and overall usability ratings for the of marketing purpose. university web sites. Keywords—usability, website, marketing, faculty, university Advanced statistical analysis can be used to further explore the underlying relationship between different I. INTRODUCTION measures of usability and accessibility evaluation. In addition, a thorough website evaluation such as formal usability Thai University nowadays may have a hard time for evaluation involving user testing can improve the automatic recruiting new students for the upcoming semester due to evaluation tools. various reasons. The 2 interesting possibilities are 1) the low birth rate in Thai society and 2) the aging society. According Aziz, Isa, and Nordin [3] investigated the accessibility and to the Council of University Presidents of Thailand, which is usability level of Malaysia Higher Education Website. The an organization arranging the standardized test for university websites came from 120 samples of higher education entrance in Thailand, more than 45% of the overall university institution websites from the online portal of the Ministry of seats are not taken [1]. As in every business sector, when the Higher Education (http://wwww.mohe.gov.my). The websites supply side is greater than the demand side, the suppliers have chosen comprised of public universities, private institutions, to try harder to stay competitive with their concurrences. One polytechnics and community college. of the possible solution is to improve marketing strategy on organization’s website. An automatic evaluation tool, EvalAccess 2.0 was used to evaluate the accessibility level according to WCAG 1.0 When it comes to the useful information and facts about guideline. 
The evaluation could be improved by employing study fields for an interested high-school student, the faculty’s automated evaluation tools that support WCAG 2.0. Usability website is the most reliable data source that he/she can trust. evaluation looked at page size, speed and broken links. The Therefore, interesting and important information have to be results showed that the private institution has the most number identified, prepared, and presented in the way that an of websites that exceed the size of 37 kb with 35 websites, interested high-school student can acquire easily. With these followed by Community College with 34 websites, reasons, a website of a faculty is important and has to be Polytechnic 21 websites and public university with 14 designed to suit high-school students’ needs. websites. The broken link analysis revealed that Polytechnic and Community College had the same numbers of broken Some works tried to evaluate universities’ websites in links in their website with 34, followed by private institution terms of usability and accessibility. However, none of them (32) and public university (14). Speed Analysis of 56k modem did the investigation of the faculty’s websites in depth and showed that private institution was in the first ranking with 35 analyzed them in respect of marketing purpose. of website, followed by Community College with 33 of website, followed by Polytechnic (21) and public university In this work, we aimed to find out what affects the decision (14). making of high-school students for choosing a faculty or study field for studying in higher education. Then we took a closer Mentes and Turan [4] evaluated and explored the usability look at content structure of faculties’ websites. Finally, we level of Namik Kemal University (NKU) website to provide analyzed the important factors that suit students’ needs, guidance for developing better and more usable web sites. matching with the existing website structure and elements. We The research measured the usability of the Namik Kemal then suggested what a faculty should have on its website in University (NKU) web site via the five assumed factors of terms of marketing strategy improvement. usability defined by WAMMI (Website Analysis and 294
2019 4th International Conference on Information Technology (InCIT2019) Measurement Inventory): attractiveness, controllability, main problem themes related to: Navigation, design, content helpfulness, efficiency and learnability. The research was and ease of use and communication. done by two different methods simultaneously. First, questionnaire was posted to the NKU website and some The results showed that all the tested websites had a large internal stakeholders (students, faculty members and the number of usability problems. The highest percentage of the administrative staff) were asked to respond to it. Second, the usability problems that were identified on each site is related link to access the questionnaire was passed to all internal to the design area, whereas the lowest percentage of the stakeholders via NKU email system. The results revealed that usability problems that were found on each site is related to gender and web experience had significant impacts on the ease of use and communication area. These usability usability perceptions of individual users. Male participants problems could be used as guidelines for universities in found NKU website less usable and participants with 5 years general to improve their universities’ websites. of more experience seemed to be less satisfied with the usability of NKU website. To gain continuous feedback from Wangpipatwong, Churimaskul and Papasratorn [7] the users, a site intercept survey to collect survey data could explored which factors influence the adoption of eGovenment be deployed. This would give administrators extensive websites. The explored factors are based on information and opportunities to improve the website. system-quality aspects. The characteristics of information quality include accuracy, timeliness, relevance, precision, and Maisak and Brown [5] examined level of web completeness. The characteristics of system quality which accessibility Thai universities against WCAG 2.0 guidelines. significantly influence the adoption of eGovernment websites They selected nine higher education websites in Thailand were discovered by a standard software quality model named from the top five ranked Thai universities, two open ISO/IED 9126. This study explored only functionality, universities, one special college for students with disabilities reliability usability, and efficiency excluding maintainability and one online institution. Seven representative webpages and portability. Questionnaires were given to Thai members were tested from within each of the university websites, of the faculty at an accredited private university in Bangkok. including the university homepage, library, webmail login There were 5 items and 10 items used to determine significant page, contact us, e-learning portal homepage, e-learning information quality and system quality factors respectively. forums and publically available e-learning content. The Each item is rated on a scale of 1 (least significant) to 5 (most webpages were evaluated by automated (SortSite and WAVE significant). The results revealed that information quality and )and manual testing based on WCAG2.0 guideline at level A system quality are significant factors that influence the and AA. The results were then categorized by POUR adoption of eGovernment websites. Accuracy, relevancy, and principles. completeness were more significant than timeliness and precision. Efficiency was the most significant factor. 
Other The analysis focused on the errors found overall and the aspects besides information quality and system quality aspects distribution of errors across the POUR principles. The results should be further investigated as well. indicate that most Thai institution websites have common accessibility problems related to providing information in III. EXPERIMENT DESIGN multiple formats and lack awareness of control over the web interface. A. Overview of the Experiment For finding the key factors of usability that affect the Hasan [6] applied the heuristic evaluation method to comprehensively evaluate the usability of three large public websites of science and technology faculties, we designed our university websites in Jordan (Hashemite University, the experiment in 3 parts. The first part is to find out what high- University of Jordan, and Yarmouk University). All pages school students who are intending to enroll a program would related to the selected universities’ faculties and their do with a website regarding information acquirement. The corresponding department were tested and a list of 34 specific second part is about analyzing the design, the content types of usability problems was identified. Two documents representation, and the element structure of existing websites were developed in order to evaluate the usability of the by a group of experts regarding information offering. This selected university websites: Heuristic guidelines, and a list of group of experts consist of 3 university lecturers who have tasks. The adopted heuristics were organized into five major expertise in Computer Science, Human Computer Interaction, categories. The list of tasks document included ten tasks, and Information Technology. Finally, the last part is to find which represented pages students visit usually on a university out how the behaviors and the needs of high-school students website. A questionnaire aimed to investigate the types of are served by the design of the websites. The key factors of pages visited by students on a university website was given to designing of a science and technology website in terms of undergraduate students from various faculties at one of the marketing purpose will then be analyzed and acquired. Figure universities in Jordan. The 237 students listed a total of 540 1. shows the overview of this experiment design. pages. For example, the available course page was the most frequently visited page listed by the students (21.11%). Five B. Conducting of the Experiment evaluators were asked to visit all pages included in the list of This work is conducted at Burapha University, Thailand tasks, and to use the heuristic guidelines while evaluating each website. The heuristic evaluators’ comments on the during an entrance interview for recruiting new students. The compliance of each site to each heuristic principle were 100 participants were high-school students who were grouped together for each site, and categorized under the intending to apply to science and technology faculties. They categories and sub-categories of the heuristic guidelines. were surveyed and interviewed with a questionnaire, which These problems were classified and grouped into 34 common contains 3 parts namely, 1) general information, 2) behavior areas of usability problems. These 34 problems suggested four of using the Internet, and 3) requirements in terms of faculty’s information. The general information part is used to acquire information about a high-school student e.g. 
Grade Point Average: GPA, gender, age, etc. The purpose is to estimate the 295
2019 4th International Conference on Information Technology (InCIT2019) general behaviors of the high-school students. The behavior of Typically, most of the participants own smartphones with using the Internet part is used to acquire information e.g. Internet access functionality. This emphasizes the assumption ownership of a computer or a smartphone, time spent on using that nowadays high-school students merely use the Internet the Internet, where they use the Internet, etc. Finally, the last for information searching. The closer look at the time they part is to acquire their needs for information in terms of further spent on the Internet reveals that most of them (42.86 %) education. spent more than 6 hours a day using the Internet. It is also important for a faculty considering using the Internet, A group of experts in accessibility and usability of especially the website, as a platform for information websites analyzed the science and technology faculties’ providing and marketing purpose. The social media platforms websites. The design of websites is exposed in terms of e.g. Facebook, Twitter, etc., can be used as a Question and content and information representation, and usability for high- Answer platform or Customer Relationship platform. Table school students to acquire useful information. Furthermore, III. shows the figure of used devices and number of hours the sitemap of a website, which is used to represent the they spent time on the Internet. element structure of a website is constructed and analyzed. The intended analysis is conducted in respect of the website TABLE III. DEVICE AND HOURS OF THE USE OF THE INTERNET. design with marketing strategy purpose. Devices (%) Use of Internet (hrs.) 1-3 4-6 >6 Smartphone Notebook/PC Tablet 17.14 40.00 42.86 5.71 High- - Behaviors 97.14 40.00 school - Requirements students Table IV. shows the required information, which a science and technology faculties should prepare and present inquiring - Key factors on their websites in the right way. The most of surveyed - Recommen- students want to know which subjects will be taught in the matching program (82.86 %), then the possible job opportunity (65.71 dations %), and the number of credits to be enrolled (62.86 %) respectively. The faculties have to keep this order in mind Experts - Website structure when designing a website in terms of information analyzing - Content signification and placement. representation - Information Design - Usability Fig. 1. Overview of the experiment design. TABLE IV. REQUIRED INFORMATION BY HIGH-SCHOOL STUDENTS. IV. RESULTS Information (%) A. High-school Students’ Requirements Subjects to be studied 82.86 Possible job opportunity 65.71 The general information reveals that the gender of high- Number of credits to be enrolled 62.86 school students does not play an important role. The Tuition fee 60.00 difference between the number of male and female Possible income after graduate 40.00 participants is not significant. However, the number of male List of lecturers 31.43 participant is slightly higher due to Thai culture where male Accommodation/Dormitory 28.57 high-school students typically choose science and technology education. The age range shows that the participants tend to Furthermore, the high-school students are asked to find be familiar with using the Internet as a handy tool for relevant information on the websites. More than 50 % of them information acquirement (Table I.). 
could not find that specific information and many of them did not know where to begin. Therefore, they could not complete TABLE I. GENDER AND AGE RANGE OF PARTICIPANTS. the task. In addition, they captured the irrelevant information for the further education from the websites e.g. pictures of Gender Age Range buildings, specific color theme, logo, pictures of activities, 15 - 17 Y/O (%) 18 - 20 Y/O (%) etc. M (%) F (%) 5.7 94.3 B. Faculties’ Website Analysis 54 46 The first analysis is the sitemap of a website. A sitemap The Table II. shows the percentage of the participants in represents the content structure of a website, and therefore its terms of the GPA range. Most of them have the high GPA logical arrangement of the information on the website. A (between 3.01 – 4.00), which preliminary indicates their sitemap should be designed to suit the requirements of the basics of logical thinking and enthusiasm. target group regarding usability and customer care. Table V. shows the level 1 of the sitemaps of 3 science and technology TABLE II. GPA RANGE OF THE PARTICIPANTS. websites. GPA Students (%) TABLE V. SITEMAPS LEVEL 1 OF FACULTIES’ WEBSITES. 1.00 - 1.50 3 1.51 - 2.00 3 Engineering Science Informatics 2.01 - 2.50 6 Home Home Home 2.51 - 3.00 20 About Faculty About Faculty Programs 3.01 - 3.50 40 3.51 - 4.00 29 296
2019 4th International Conference on Information Technology (InCIT2019) Engineering Science Informatics TABLE VII. DESIGN RECOMMENDATION FOR FACULTIES’ Study Interested Faculty Management International Study WEBSITES. Person Research Departments Student Services Key Factors Design Position Academic Services Human Resource Research/MoU Menu/Content Menu Level 1 Information Systems Programs/Admission About Faculty Subjects to be studied Menu/Content Menu Level 1 Menu Level 1 Dean Hotline Research/Academic Alumni Possible job Menu/Content Menu Level 1 Services opportunity Menu Level 2 Current Students Knowledge Number of credits to Menu/Content Menu Level 2 Management be enrolled Menu/Content Menu Level 2 Choose Language Tuition fee Menu/Content Top Left/Menu Level From Table V, these 3 websites have similar sitemap Possible income after Menu/Content 1 structure beginning from “Home”. Then, the websites tries to graduate represents the relevant information through, according to List of lecturers Use CSS for them, the logical information structure. Two of the websites Meaningful (Engineering and Science) place the “About Faculty” menu Accommodation/Dor Responsive Design right after the “Home” menu. Only the Informatics places the mitory Prioritization and “Programs” right next to the “Home” menu. The logic of the Responsive Design Sorting Information menu placement as a sitemap structure regarding requirements of the target group will be discussed in the section V. Relevant Information Furthermore, the usability of these websites is evaluated In addition, due to the high number of high-school according to Nielsen’s heuristic evaluation method [8]. students using smartphones as a device for information retrieving from the Internet, the design of meaningful TABLE VI. HEURISTIC EVALUATION OF WEBSITES. responsive websites is also recommended. This can be done using CSS technology. Criteria 543 2 1 Visibility of system status 9 21 5 0 0 V. DISCUSSION, CONCLUSION AND FUTURE WORK 0 0 Match between system and the 7 22 6 High-school students’ behavior changed disruptively real world 10 18 6 1 0 about the way to acquire necessary information for their 0 1 decision-making. The Internet becomes more important than User control and freedom 2 0 ever as the source of data that is available 24/7. Although the 1 0 search engines and different forums are preliminary tools for Consistency and standards 6 24 5 0 0 gathering information, the organization website is also the 1 0 reliable place for facts and information about studies. Error prevention 4 16 13 0 0 Furthermore, a website is a representative of an organization, and this could be used as a marketing tool with the right Recognition rather than recall 7 19 7 0 0 design. In a good website design, the following 3 design concepts should be implemented: Flexibility and efficiency of use 11 18 6 - interactive website design Aesthetic and minimalist design 21 9 4 - intuitive navigation design - informative content design Help users recognize, diagnose, 4 21 10 The combination of these will support decision making of and recover from errors high-school students. Help and documentation 11 15 9 The key findings of our work are the facts that science and technology faculties’ are not designed for suiting high-school Table VI. shows the heuristic evaluation of the websites. students’ needs. 
However, this target group is one of the most The evaluated websites are good in terms of aesthetic and important stakeholders who are the potential customers if design, user control and freedom, flexibility and efficiency of they get the right information. Designing faculties’ website to use, and help and documentation. The other criteria are to be suit their needs can be seen as a survival marketing strategy improved. The most interesting criterion for our purpose in aging society. Several works analyzed and evaluated seems to be “Match between system and real world”, which university websites. However, they do not study a website in approve whether the website can communication with the terms of marketing strategy tool. high-school students effectively. Furthermore, the Visibility of system status and the Consistency and standards are the Unfortunately, this work did not cover the website helping criteria for the users to complete the task easily. redesign of a faculty and its evaluation. Therefore, the future work will be website redesigning and evaluation. C. Recommendations for Faculties’ Website Design Furthermore, other aspects e.g. accessibility, designing concepts, etc. can be added as criteria for website evaluation. Faculties’ website design that can be used as a marketing strategy has to be developed based on high-school students’ The marketing strategies depend on how one designs the requirements. The information extracted from Table IV. websites to suit the high-school students needs the best in indicates that the most required information has to be placed in the most important area of a website. Assuming the language is read from Left to Right, the most important areas are Top Left, Top Right, Left Sidebar, Content, Right Sidebar, and Footer respectively. The recommendations for the science and technology faculties’ website design can be acquired from the Table VII. 297
2019 4th International Conference on Information Technology (InCIT2019) terms of visual design and content design. The logical layout and information structure are the key for placing the websites in the proper position. For the marketing purpose, the websites should also be redesigned accordingly. ACKNOWLEDGMENT This work was done at Digital Media and Interaction Laboratory, Faculty of Informatics, Burapha University, Thailand. We want to thank you for its grant and support that helped doing this work successfully. REFERENCES [1] Council of University Presidents of Thailand, Thai University Central Admission System. [2] P. Zaphiris and R. D. Ellis, “Website usability and content accessibility of the top USA universities,” 2001. [3] M. Abdul Aziz, W. A. R. Wan Mohd Isa, and N. Nordin, “Assessing the accessibility and usability of Malaysia Higher Education Website,” in 2010 International Conference on User Science and Engineering, Shah Alam, 2010, pp. 203–208. [4] S. A. Mentes and A. H. Turan, “Assessing the usability of university websites: An empirical study on Namik Kemal University,” Turkish Online Journal of Educational Technology-TOJET, vol. 11, no. 3, pp. 61–69, 2012. [5] R. Maisak and J. Brown, “Web accessibility on Thai higher education Websites,” in Proceedings of the Ninth International Conference on Software Engineering Advances, France, 2014. [6] L. Hasan, “Heuristic evaluation of three Jordanian university websites,” Informatics in Education-An International Journal, vol. 12, no. 2, pp. 231–251, 2013. [7] S. Wangpipatwong, W. Chutimaskul, and B. Papasratorn, “Factors influencing the adoption of Thai eGovernment websites: information quality and system quality approach,” in Proceedings of the Fourth International Conference on eBusiness, 2005, pp. 19–20. [8] J. Nielsen, 10 Heuristics for User Interface Design. [Online] Available: https://www.nngroup.com/articles/ten-usability-heuristics/. Accessed on: Jun. 23 2019. 298
2019 4th International Conference on Information Technology (InCIT2019) A Novel Hierarchical Edge Computing Solution Based on Deep Learning for Distributed Image Recognition in IoT Systems Nitis Monburinon Salahuddin Muhammad Salim Zabir Natthasak Vechprasit Faculty of Information Technology Department of Creative Engineering Faculty of Information Technology Thai-Nichi Institute of Technology National Institute of Technology, Tsuruoka College Thai-Nichi Institute of Technology Bangkok, Thailand Tsuruoka, Japan Bangkok, Thailand mo.nitis [email protected] [email protected] ve.vejprasit [email protected] Satoshi Utsumi Norio Shiratori Faculty of Symbiotic Systems Science Research and Development Initiative Fukushima University Chuo University Fukushima, Japan Tokyo, Japan [email protected] [email protected] Abstract—Traditionally, IoT systems utilize cloud computing information and communicate with each other or the central platforms for ease of deployment. However, such platforms server without any human intervention. [1] This concept of are resource-intensive, relatively expensive, and lead to longer ”machine talking to machine without direct human action” response time due to network latency. Hence, cloud computing [2] has inspired countless IoT applications that proved its may not always be suitable for unusually large scale deploy- usefulness in many fields such as household [3], transportation ment. Various edge computing solutions have been proposed to [4], or agriculture [5]. For example, a security camera could overcome the above issues. However, most of such solutions rely detect the facial features of a criminal in an area, and alert upon expensive edge servers, which make them unsuitable for nearby police officers or a smart refrigerator could notify distributed applications like image recognition in agricultural family members when they need to do some grocery. With fields. In this paper, we propose a hierarchical edge computing- an estimated 50 billion of IoT devices deployed worldwide based image recognition system in which the major processing by 2020 [6], we might be anticipating the Internet of things is carried out at low-cost gateway devices like Raspberry Pi. As revolution which affects everything from industry, economy, an example case, we address the issue of recognition of animals science, to our daily life [7]. intruding in agricultural fields. We implement a dynamic learning method, in which a convolutional neural network is dynamically Because of the ease of deployment and handling of the gen- trained to recognize potential target classes based on a specific erated data, cloud computing has so far widely been used for deployment environment. The AI detection module is then loaded IoT. However, many experts pointed out that cloud alone might on to the lowest level of edge servers on gateway devices for not be an ideal solution for IoT. [8] At the moment, billions detection of animals and providing feedback. Experiments show of things in IoT application generate a considerable amount of that our proposed recognition system can perform offline image data simultaneously. Those data are usually sent to process on classification tasks with up to seven times higher accuracy and the cloud, which is also responsible for most decision making more than two times faster evaluation time in comparison with afterward. This approach proved to be efficient until now. general-purpose cloud recognition systems. 
Besides, it consumes However, data produced from things are expected to reach less than 6% of the network bandwidth and only a fraction of 10.4 zettabytes by 2019 [1]. Transferring that massive amount the energy as well as other computational resources compared of data to the cloud would cause a massive bottleneck in that with existing approaches. particular network and consume overwhelmingly enormous computing resources on the cloud itself. Additionally, data Index Terms—Edge Computing, Image Recognition, Internet collected by IoT devices in households or work environment of Things might contain people private information (e.g., individual pic- ture or record of private conversation). Thus, raising security I. INTRODUCTION concerns in IoT applications. Corresponding to the advance in information and opera- Various edge computing solutions have been proposed to tional technologies, the Internet of things (IoT) has emerged overcome the above issues. Edge computing introduces the as one of the most prominent technologies in this era. In the concept of processing data at the edge of the network. Gener- current Internet of things paradigm, everything has computing capability and connects to the Internet. Thus, enabling conven- tional electronic devices such as a coffee maker, security cam- era, refrigerator or even toilet to generate sensor data, process 299
2019 4th International Conference on Information Technology (InCIT2019) ally, the ”Edge” could refer to anything capable of computation Arduino. As edge servers close to the locations where data is between the data source and the central cloud. [1] In edge being generated. Usually, some simple data handling or storage computing paradigm, the edge, which is operating close to operations are being targeted on such edge servers. the data source, is responsible for most data processing and decision-making. Thus, reducing both network traffic With In this paper, we addressed the issue of distributed image advance in software and hardware, Internet of things (IoT) recognition with gateway level edge servers at the core of has emerged as one of the most prominent concept. In the the process. As an example case, we address the problem of current Internet of things paradigm, everything has computing detection of animals intruding into agricultural fields. Since capability and connects to any existing network infrastructure animal intrusion leads to considerable damage to the agri- or the Internet. Thus, enabling common electronic devices cultural produce, it is essential to be able to detect animals such as a coffee maker or security camera to generate sensor as they enter the fields and drive them away. A traditional data, process information and communicate with each other approach is to deploy a scarecrow (or series of scarecrows) or or the central server without any human intervention. [1] human labor. While animals get used to traditional scarecrows, This concept of” machine talking to machine without human the use of human labor is tiresome or expensive. Hence we direct action” [2] has inspired countless IoT applications that have developed a low-cost IoT based system that would do proved its usefulness in many fields such as household [3], the task automatically. At the core of the system lies the transportation [4], or agriculture [5]. For example, a security detection of the animals. For the purpose, we design and camera could detect the facial features of a criminal in an develop an Internet of things (IoT) application framework with area, and alert nearby police officers or a smart refrigerator high resource efficiency and dynamic platform scalability. We could notify family members when they need to do some implement a highly optimized convolutional neural network grocery. With an estimated 50 billion of IoT devices deployed (CNN) to perform specific image recognition or classification worldwide by 2020 [6], we might be anticipating the Internet tasks on the edge of the network. In our framework, the of things revolution which affects everything from industry, recognition system is optimized only for a specific purpose economy, science, to our daily life [7]. and can dynamically adapt to the deployment environment. We name our framework as Deployment Environment Aware Because of the ease of deployment and handling of the gen- Learning (DEAL). To the best of our knowledge, no other erated data, cloud computing has so far widely been used for literature reports implementing optimized neural networks on IoT. However, many experts pointed out that cloud alone might low-end edge devices like Raspberry Pi for image recognition not be an ideal solution for IoT. [8] At the moment, billions applications. of things in IoT application generate a considerable amount of data simultaneously. 
Our significant contributions in this manuscript are (i) the framework of a dynamically adaptable convolutional neural network that is capable of adjusting the recognition module dynamically depending on the target deployment environment; (ii) the gateway-level edge server implemented on a Raspberry Pi, which we believe is one of the pioneering implementations of an edge server using Android Things OS after its official release in May 2018; (iii) the mechanism for detecting animals at the edge server and the corresponding feedback to the higher levels of the edge hierarchy, e.g., the cloud; and (iv) the optimization technologies for saving resources such as power consumption and network bandwidth.

Experiments show that our proposed recognition system can perform offline image classification tasks with up to seven times higher accuracy and more than two times faster evaluation time in comparison with general-purpose cloud recognition systems. Besides, it consumes less than 6% of the network bandwidth and only a fraction of the energy as well as other computational resources compared with existing approaches.

The rest of the paper is organized as follows. In Section II, we review related work. In Section III, we describe the design and development of our proposed recognition system. In Section IV, we evaluate the proposed scheme. We conclude in Section V.
II. RELATED WORKS

There are three major research areas related to our proposed solution.

The first is the development of animal image recognition systems. The primary challenge in this research area is to improve the efficiency and effectiveness of the recognition engine. Up until now, there have been several attempts to develop animal recognition systems using computer vision and machine learning techniques, each with its own objective and solution to the problem. In one research paper, the authors implement a pattern matching method using normalized cross-correlation to detect animal presence in a video frame [11], while the authors of [12] proposed a method for top-view animal detection using thermal cameras on UAVs, which implemented the discrete cosine transform for feature extraction and k-nearest neighbors to classify the images. Another study is dedicated to the recognition of different types of cows by employing a Content-Based Image Retrieval method [13]. Each study herein has its own specific objective and requirements, which makes the criteria of functional efficiency and effectiveness vary from solution to solution. Our proposed method aims to improve detection reliability, response time, and resource efficiency, which is crucial for an on-device animal detection system that operates mainly in agricultural fields.

Another related research field is the study of deep neural networks for image classification. Since the pioneering publication of LeNet-5 [14] and AlexNet's [15] outstanding performance in the ImageNet [16] Large Scale Visual Recognition Challenge in 2012, convolutional neural networks (CNNs) have been ubiquitous in solving complex computer vision challenges. The major efforts in this area include attempts to design networks that can solve more complex image recognition tasks with higher accuracy, and studies that aim to improve network efficiency with various algorithms. Several CNN-based models have created important milestones in this research area, such as ZFNet [17], VGGNet [18], GoogLeNet [19], Microsoft ResNet [20], R-CNN [21], Fast R-CNN [22], and Faster R-CNN [23]. Our proposed model is inspired by and based on MobileNets [24], which aims to provide efficient CNNs for mobile and embedded devices.
The last related area of study is the application of edge computing in Internet of Things applications. Edge computing [25], [26], also known as Fog Computing [27] or Cloudlet Computing [28], is still a relatively new area of research that aims to distribute computing tasks to the edge of the network. As the cloud-based computing paradigm becomes less practical for large-scale IoT applications, edge computing has risen to resolve the challenge of IoT application deployment. The authors of [1] refer to the "edge" as any computing-capable device between the data source and the cloud server. Conventional edge devices found in relevant studies include smartphones [10], smart wearable devices, general computer servers, and embedded devices. Our proposed method deploys the recognition engine on embedded edge devices such as the Raspberry Pi to decentralize the computing process away from the cloud server.

III. IOT KAKASHI: A HIERARCHICAL EDGE COMPUTING SOLUTION FOR ANIMAL IMAGE RECOGNITION

Fig. 1. IoT Kakashi hierarchical system architecture

Traditionally, cloud computing platforms with high computational capability handle most complex computing tasks, including image recognition. Nowadays, however, the traditional approach is no longer suitable, considering the massive number of IoT devices that are to be connected to the cloud. In our proposed system, the recognition engine is deployed on the edge device to optimize the execution time. Using a combination of Internet of Things, deep learning, and edge computing techniques, we develop an edge-based distributed animal recognition system that can detect the presence of animals and determine the types of animals.

A. System Architecture

Inspired by an edge computing framework called "Dynamic Hierarchical Edge Architecture" (DHEA) [9], our animal recognition system consists of three operational layers called the Physical Interaction Layer, the Edge Computing Layer, and the Cloud Computing Layer. Fig. 1 illustrates our system architecture. The Physical Interaction Layer handles tasks related to real-world data, including data acquisition, motion detection, capturing animal images, and physical feedback. Both sensors and actuators belong to this layer. The Edge Computing Layer is where the core computation processes take place. This layer handles high-frequency computing tasks, including image processing, animal recognition, data processing, and temporary data storage, and also serves as a communication channel between the Physical Interaction Layer and the Cloud Computing Layer. The IoT devices in our system belong to this layer and are thus called edge devices. For animal detection in agricultural fields, such edge servers are deployed at different locations in the fields. Each edge server can function independently and may have a different set of animals to detect. Hence, the task of detecting animals is distributed among the edge servers. The Cloud Computing Layer handles occasional but complex tasks such as training the CNN and data analytics. This layer also serves as storage for the large amount of training data that requires preprocessing. High-performance cloud servers belong to this layer. The Edge Computing Layer and the Cloud Computing Layer often exchange information; however, only the information that can be used in a machine learning algorithm or data analysis is transmitted to the Cloud Computing Layer.
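As a rough illustration of how the detection task could be split across locations in this layered design, the sketch below (in Python) models per-location edge server configurations held by the Cloud Computing Layer; the class names, fields, and example locations are our own illustration and are not taken from the paper.

# Minimal sketch (not from the paper) of per-location detection tasks in the
# three-layer architecture; all names here are illustrative.
from dataclasses import dataclass, field
from typing import List, Dict

@dataclass
class EdgeServerConfig:
    """One gateway-level edge server in the Edge Computing Layer."""
    location: str                      # e.g. a field identifier or GPS tag
    target_animals: List[str]          # classes this server's CNN should detect
    model_path: str = "model.tflite"   # optimized model pushed from the cloud

@dataclass
class CloudRegistry:
    """Cloud Computing Layer bookkeeping: which edge server detects what."""
    servers: Dict[str, EdgeServerConfig] = field(default_factory=dict)

    def register(self, cfg: EdgeServerConfig) -> None:
        self.servers[cfg.location] = cfg

# Example: two field locations with different animal sets.
registry = CloudRegistry()
registry.register(EdgeServerConfig("field-north", ["monkey", "bird"]))
registry.register(EdgeServerConfig("field-south", ["bear", "wild_boar"]))

In a deployment, such a registry would be the cloud-side input to the training flow described in the next subsection, which builds one specialized model per location.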
Fig. 2. Deployment Environment Aware Learning process for the CNN-based model

TABLE I
EDGE DEVICE INFORMATION

Raspberry Pi 3 Model B
CPU       1.2 GHz quad-core ARM Cortex-A53
GPU       Broadcom VideoCore IV @ 400 MHz
Memory    1 GB LPDDR2-900 SDRAM
Network   10/100 Mbps Ethernet, 802.11n Wireless LAN
OS        Android Things 1.0

B. Deployment Environment Aware Learning (DEAL)

In this section, we describe our proposed "Deployment Environment Aware Learning (DEAL)" framework. Our recognition engine is based on a MobileNets convolutional neural network [24]. We use the TensorFlow open-source library to train the CNN, and the results are in TensorFlow Lite format.

On the Cloud Computing Layer, we train a new optimized CNN based on pre-trained MobileNets using a transfer learning technique. This approach minimizes the training time and the amount of training data required, and encourages rapid continuous improvement of the recognition engine after deployment. As shown in Fig. 2, the overall training process in the DEAL framework is the following.

1) Deployed edge devices generate location data using GPS on the Physical Interaction Layer and transmit that data to the Cloud Computing Layer via the Internet.
2) On the Cloud Computing Layer, the transmitted location data are processed and compared with an animal database. The result is a list of potential animals that can be found at the target location.
3) The list of potential animals is used to generate unique training data by pulling image data from an animal image database and fetching more, if required, from the Internet with web crawlers. The result is a training set consisting of images of the animals that frequent the area.
4) The generated training data is used, together with the base CNN model, in the transfer learning algorithm. The result is a unique model built specifically for detecting animals in the target environment.
5) The trained model is optimized by a compression algorithm to make it suitable for deployment on edge devices over the Internet. The result is a relatively small and fast CNN optimized for the particular edge devices.
6) The optimized model is distributed to all edge devices in the target location. This concludes the training process in the DEAL framework.

Cloud-based image recognition services, e.g., Google Cloud Vision, Amazon Rekognition, and Clarifai, usually aim to provide a universal general-purpose image recognition engine trained to recognize countless classes spanning a vast number of categories. Such recognition engines are suitable for general image recognition over a wide range of categories but tend to become confused and unreliable when dealing with complicated images. Our recognition engine takes an entirely different approach by minimizing the number of target classes based on the requirements of the location (or environment). Using our training method, Deployment Environment Aware Learning, we train our recognition system with the classes of animals commonly spotted around the particular deployment environment.
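To make the cloud-side training step concrete, the following is a minimal sketch of DEAL steps 4 and 5 assuming a tf.keras workflow. The paper states only that a pre-trained MobileNets model, TensorFlow, and the TensorFlow Lite format are used, so the exact API calls, the class count, and the hyperparameters below are our assumptions rather than the authors' implementation.

# Hedged sketch of DEAL step 4 (transfer learning from a pre-trained MobileNet)
# and step 5 (compression to TensorFlow Lite). Class list and settings are illustrative.
import tensorflow as tf

def build_deal_model(num_classes: int, input_size: int = 224) -> tf.keras.Model:
    # Pre-trained MobileNet backbone with its ImageNet classifier removed.
    base = tf.keras.applications.MobileNet(
        input_shape=(input_size, input_size, 3),
        include_top=False, weights="imagenet", pooling="avg")
    base.trainable = False  # transfer learning: reuse features, train only the head
    head = tf.keras.layers.Dense(num_classes, activation="softmax")(base.output)
    model = tf.keras.Model(base.input, head)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

def export_tflite(model: tf.keras.Model, path: str = "deal_model.tflite") -> None:
    # Step 5 (sketch): convert and compress the trained model for the edge device.
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    with open(path, "wb") as f:
        f.write(converter.convert())

# Example: a model for animals common around one deployment location.
model = build_deal_model(num_classes=4)   # e.g. monkey, bird, bear, wild boar

Freezing the backbone keeps training fast on a small, location-specific dataset, which is the rationale behind the transfer-learning choice described above.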
C. Animal Recognition Process

Fig. 3. Animal Recognition Process Overview

Fig. 3 illustrates the flow of our recognition process, which encompasses every layer in our system architecture. The overall animal recognition process is described as follows.

1) The recognition process is triggered by a motion-sensing module on the Physical Interaction Layer, which sends a signal to the edge device on the Edge Computing Layer.
2) The edge device then initiates a camera-handling session in collaboration with a camera module on the Physical Interaction Layer to capture animal images.
3) The edge device pre-processes the captured image, which is initially a bitmap, by re-scaling it to 224x224 resolution and saving it in PNG format.
4) The recognition engine embedded in the edge device then performs the image recognition task locally, using the saved PNG image as input. Reducing the size of the input image beforehand reduces both the computing resources consumed by the CNN and the recognition process time.
5) The results generated by the CNN are processed into more insightful and portable formats such as JSON and XML.
6) The converted results are sent to the cloud server on the Cloud Computing Layer for further analysis. When accumulated, these results can be used to improve our model in the future.
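The sketch below walks through steps 3 to 5 of this pipeline in Python. The actual edge application runs on Android Things, so the libraries used here (tflite_runtime, Pillow, NumPy), the label list, and the input normalization are stand-ins for illustration, not the authors' code.

# Illustrative sketch of the edge-side recognition flow (Fig. 3); library choices,
# labels, and preprocessing are assumptions and would depend on the deployed model.
import json
import numpy as np
from PIL import Image
from tflite_runtime.interpreter import Interpreter

LABELS = ["monkey", "bird", "bear", "wild_boar"]   # assumed per-location classes

def classify(image_path: str, model_path: str = "deal_model.tflite") -> dict:
    # Step 3: rescale the captured bitmap to the 224x224 input expected by the CNN.
    img = Image.open(image_path).convert("RGB").resize((224, 224))
    x = np.expand_dims(np.asarray(img, dtype=np.float32) / 255.0, axis=0)

    # Step 4: run the optimized CNN locally on the edge device.
    interp = Interpreter(model_path=model_path)
    interp.allocate_tensors()
    interp.set_tensor(interp.get_input_details()[0]["index"], x)
    interp.invoke()
    probs = interp.get_tensor(interp.get_output_details()[0]["index"])[0]

    # Step 5: convert the raw scores into a portable result (here JSON).
    top = int(np.argmax(probs))
    return {"label": LABELS[top], "confidence": float(probs[top])}

# Step 6 would send json.dumps(classify("capture.png")) to the Cloud Computing Layer.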
IV. PERFORMANCE EVALUATION

We developed a proof-of-concept system to evaluate the quality of an IoT application built with our proposed framework. For this experiment, we deployed our animal recognition system in an agricultural region of Tsuruoka city in Yamagata prefecture, Japan. In recent years, several farmers in Tsuruoka have reported losing their crops to raiding wild animals. From our observation, the common animal types spotted in this area are monkeys, birds, bears, and wild boars. As such, these types of animals were used as the target animal classes for the DEAL-enhanced training process. For this research, our primary objectives are to provide fast response times and reliable detection results. Accordingly, if there is a monkey in the area, our recognition system should report the monkey's presence as soon as it enters the motion detection range.

Table I describes the edge device system information. Due to its affordable price and high computing capability, we chose the Raspberry Pi 3 Model B as our prototype development platform. For motion detection, we employed a PIR sensor that detects infrared (IR) light radiating from objects in its field of view. A Pi Camera is used to capture animal images. Lastly, we use Android Things OS as the operating system. Fig. 4 shows an overview of our developed system.

Fig. 4. System overview

Finally, we conducted a long-term experiment to estimate our system's energy consumption. The experimental setup consists of our prototype system (a Raspberry Pi 3, a PIR sensor module, and a Pi Camera) and a portable 3G router, both powered by a 12 V 45 Ah battery. The recognition engine is configured to simulate a recognition process, as shown in Fig. 3, every 5 minutes. As a result, our prototype lasts 7 days on average before the battery runs out of power. A further experiment with a solar panel attached to the prototype will be carried out in the future.

We compare the performance of our system with that of cloud-based image recognition services, including Google Cloud Vision, Amazon Rekognition, and Clarifai. To evaluate the performance of each of these engines, we run the same recognition tasks on every recognition system. Afterward, we evaluate each model's performance using Top-1 and Top-3 accuracy [15]. In an animal recognition system, high accuracy and fast response time are crucial because users always expect highly accurate results and also want to respond to an animal intrusion as soon as possible.
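For reference, the Top-1 and Top-3 accuracy figures reported below can be computed as in the short sketch that follows; it is our own illustration rather than the evaluation script used in the paper.

# Sketch of the Top-1 / Top-3 accuracy metrics used in the comparison (our illustration).
import numpy as np

def top_k_accuracy(probs: np.ndarray, labels: np.ndarray, k: int) -> float:
    """probs: (N, C) class scores per image; labels: (N,) ground-truth indices."""
    topk = np.argsort(probs, axis=1)[:, -k:]            # k highest-scoring classes
    hits = [labels[i] in topk[i] for i in range(len(labels))]
    return float(np.mean(hits))

# Example with 3 test images and 4 classes:
p = np.array([[0.7, 0.1, 0.1, 0.1],
              [0.2, 0.5, 0.2, 0.1],
              [0.1, 0.2, 0.3, 0.4]])
y = np.array([0, 2, 3])
print(top_k_accuracy(p, y, k=1), top_k_accuracy(p, y, k=3))  # 0.666..., 1.0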
TABLE II
ACCURACY RATE OF OUR PROPOSED SYSTEM COMPARED TO CLOUD-BASED SOLUTIONS

Model                 Top-1 accuracy   Top-3 accuracy
Our Proposed System   0.9              0.96
Google Cloud Vision   0.57             0.77
Amazon Rekognition    0                0.7
Clarifai              0.13             0.2

As observed in Table II, our proposed recognition engine, which has been trained using Deployment Environment Aware Learning (DEAL), shows up to seven times higher accuracy than the other mechanisms in the Top-1 accuracy category. It should be noted that, since a correct detection result would allow the system to develop a high-precision animal repelling technique, Top-1 accuracy is essential for this animal recognition use case. Also, our proposed scheme scores higher in Top-3 accuracy in comparison with the other general-purpose cloud-based solutions for animal image recognition.

TABLE III
EVALUATION TIME OF OUR PROPOSED SYSTEM COMPARED TO CLOUD-BASED SYSTEMS

Model                Min Eval. Time   Max Eval. Time
Our System           0.3190           1.6400
Google Vision        0.7233           1.3768
Amazon Rekognition   1.2835           2.5770
Clarifai             0.9491           1.8757

TABLE IV
EXECUTION TIME OF OUR PROPOSED SYSTEM COMPARED TO CLOUD-BASED SYSTEMS

Model                Min Exec. Time   Max Exec. Time
Our System           0.962            3.039
Google Vision        2.456            12.571
Amazon Rekognition   2.596            8.446
Clarifai             2.147            8.803

Tables III and IV show the evaluation time and execution time taken by each system under comparison. It can be observed that our proposed scheme achieves more than two times faster minimum evaluation time and more than two times faster minimum execution time. Table V shows the network bandwidth used by each system in comparison. It can be observed that our proposed system uses less than 6% of the network bandwidth compared with the other existing schemes. This is important for a system that needs to be deployed in the field, where the Internet connection might be slow.
TABLE V
BANDWIDTH USAGE OF OUR PROPOSED SYSTEM COMPARED TO CLOUD-BASED SOLUTIONS

Model                 Bandwidth Usage (kb/s)
Our Proposed System   4.3
Google Cloud Vision   71.8
Amazon Rekognition    70.7
Clarifai              72.5

V. CONCLUSION

In this paper, we propose an edge-based Internet of Things solution to detect the intrusion of wild animals into agricultural fields. We propose an efficient and effective animal recognition system based on a convolutional neural network model to detect specific types of animals in the deployment area. While cloud-based recognition services such as Google Vision, Amazon Rekognition, or Clarifai provide a universal general-purpose model, we propose an optimized and specialized model trained using the Deployment Environment Aware Learning (DEAL) method. With DEAL, the model is trained on the specific types of animals that populate the target deployment area. Even though we use a high-performance cloud server for the training process, we deploy our recognition engine directly on relatively low-performance and power-efficient edge devices to perform offline recognition tasks. This approach reduces computation on the cloud, which allows for large-scale deployment in the future, and efficiently improves recognition performance. Experiments have shown that our proposed recognition system can perform offline image classification tasks with up to seven times higher accuracy and more than two times faster evaluation time in comparison with general-purpose cloud recognition systems. Also, it consumes less than 6% of the network bandwidth and only a fraction of the energy as well as other computational resources compared with existing approaches.

REFERENCES

[1] W. Shi, J. Cao, Q. Zhang, Y. Li, and L. Xu, "Edge Computing: Vision and Challenges," IEEE Internet of Things Journal, vol. 3, pp. 637–646, 2016.
[2] L. Atzori, A. Iera, and G. Morabito, "The Internet of Things: A survey," Computer Networks, vol. 54, pp. 2787–2805, 2010.
[3] S. D. T. Kelly, N. K. Suryadevara, and S. C. Mukhopadhyay, "Towards the implementation of IoT for environmental condition monitoring in homes," IEEE Sensors Journal, vol. 13, pp. 3846–3853, 2013.
[4] H. Movafegh and M. Rastgarpour, "Integration challenges of intelligent transportation systems with connected vehicle, cloud computing and internet of things technologies," IEEE Wireless Communications, vol. 22, pp. 122–128, 2015.
[5] K. A. Patil and N. R. Kale, "A model for smart agriculture using IoT," 2016 International Conference on Global Trends in Signal Processing, Information Computing and Communication (ICGTSPICC), pp. 543–545, 2016.
[6] D. Evans, "The Internet of Things - How the Next Evolution of the Internet is Changing Everything," CISCO white paper, pp. 1–11, 2011.
[7] J. Gubbi, R. Buyya, S. Marusic, and M. Palaniswami, "Internet of Things (IoT): A vision, architectural elements, and future directions," Future Generation Computer Systems, vol. 29, pp. 1645–1660, 2013.
[8] B. Zhang, N. Mor, J. Kolb, D. S. Chan, K. Lutz, E. Allman, J. Wawrzynek, E. A. Lee, and J. Kubiatowicz, "The Cloud is Not Enough: Saving IoT from the Cloud," 7th USENIX Workshop on Hot Topics in Cloud Computing, HotCloud '15, Santa Clara, CA, USA, July 6-7, 2015, pp. 21–21, 2015.
[9] S. M. S. Zabir, P. Yoosook, C. Khamsaeng, S. Kiatikitikul, and N. Shiratori, "DHEA: A dynamic hierarchical edge architecture for a participatory approach toward IoT evolution," in 2018 5th International Conference on Business and Industrial Research (ICBIR), Bangkok, Thailand: IEEE, 2018, pp. 7–12.
[10] C. Liu, Y. Cao, Y. Luo, G. Chen, V. Vokkarane, M. Yunsheng, S. Chen, and P. Hou, "A New Deep Learning-Based Food Recognition System for Dietary Assessment on An Edge Computing Service Infrastructure," IEEE Transactions on Services Computing, vol. 11, pp. 249–261, 2018.
[11] S. Sharma, D. Shah, R. Bhavsar, B. Jaiswal, and K. Bamniya, "Automated detection of animals in context to Indian scenario," Proceedings - International Conference on Intelligent Systems, Modelling and Simulation, ISMS, vol. 2015-Septe, pp. 334–338, 2015.
[12] P. Christiansen, K. A. Steen, R. N. Jørgensen, and H. Karstoft, "Automated detection and recognition of wildlife using thermal cameras," Sensors (Switzerland), vol. 14, pp. 13778–13793, 2014.
[13] T. Sutojo, P. S. Tirajani, D. R. I. M. Setiadi, C. A. Sari, and E. H. Rachmawanto, "CBIR for classification of cow types using GLCM and color features extraction," 2017 2nd International Conferences on Information Technology, Information Systems and Electrical Engineering (ICITISEE), pp. 182–187, 2017.
[14] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, "Backpropagation Applied to Handwritten Zip Code Recognition," Neural Computation, vol. 1, pp. 541–551, 1989.
[15] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," Advances in Neural Information Processing Systems 25 (NIPS 2012), pp. 1–9, 2012.
[16] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255, 2009.
[17] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8689 LNCS, 2014, pp. 818–833.
[18] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," in International Conference on Learning Representations (ICLR), 2014, pp. 1–14.
[19] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 07-12-June, pp. 1–9, 2015.
[20] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," arXiv, vol. 7, pp. 171–180, 2015.
[21] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation," Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 580–587, 2014.
[22] R. Girshick, "Fast R-CNN," Proceedings of the IEEE International Conference on Computer Vision, vol. 2015 Inter, pp. 1440–1448, 2015.
[23] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, pp. 1137–1149, 2017.
[24] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications," in arXiv, 2017, p. 9.
[25] P. Garcia Lopez, A. Montresor, D. Epema, A. Datta, T. Higashino, A. Iamnitchi, M. Barcellos, P. Felber, and E. Riviere, "Edge-centric Computing," ACM SIGCOMM Computer Communication Review, vol. 45, pp. 37–42, 2015.
[26] M. Satyanarayanan, P. Simoens, Y. Xiao, P. Pillai, Z. Chen, K. Ha, W. Hu, and B. Amos, "Edge analytics in the internet of things," IEEE Pervasive Computing, vol. 14, pp. 24–31, 2015.
[27] T. H. Luan, L. Gao, Z. Li, Y. Xiang, G. Wei, and L. Sun, "Fog Computing: Focusing on Mobile Users at the Edge," in arXiv, 2015, pp. 1–11.
[28] K. Gai, M. Qiu, H. Zhao, L. Tao, and Z. Zong, "Dynamic energy-aware cloudlet-based mobile cloud computing model for green computing," Journal of Network and Computer Applications, vol. 59, pp. 46–54, 2016.
Extracting Components from Thai-Official Documents using Image Processing and Machine Learning Techniques

Phatthanaphong Chomphuwiset, Computer Science Department, Mahasarakham University, Maha Sarakham, Thailand, [email protected]
Krisana Kotprom, Computer Science Department, Mahasarakham University, Maha Sarakham, Thailand, [email protected]
Panu Chaisrirak, Computer Science Department, Mahasarakham University, Maha Sarakham, Thailand, [email protected]
Khanabhorn Kawattikul, Department of Information System, Rajamangala University of Technology Tawan-ok, Chanthaburi, Thailand, [email protected]

Abstract—This work proposes a technique for classifying Thai-official documents and extracting the document components (titles, official id, dates, from, attachments, references, others). The classification is performed using a two-stage approach, i.e. (i) an initial classification and (ii) a refinement process. The classification is carried out on specific regions (the headers) of the documents, as different types of documents contain different information in their header sections. The results of the classification process are then refined in a post-processing procedure in order to improve the classification results. Detecting and extracting the components of the documents is carried out by applying a mathematical morphological operation method to detect a set of component candidates. False candidate rejection is performed using a classification-based technique, i.e. Random Forest and CNN. The results of the experiments show that the proposed technique produces promising results for both document classification and component extraction.

Keywords—Thai-official document classification, document component detection, Convolutional Neural Network, Random forest, template matching

I. INTRODUCTION

The emergence of technology brings tremendous changes and effects to many sectors and organizations worldwide. Automatic document management systems (ADMSs) are one of the examples that have been implemented and used in many organizations recently. The systems are typically implemented to transform the traditional paper-based workflow scheme into electronic-based systems, so as to shorten process lines, archive information, and increase the productivity of the workflow. Thai official documents are a type of document used to communicate among official organizations in Thailand. The conventional approach to sending and receiving the documents relies on manual record procedures, which fundamentally entail separating the documents into different types and recording meta-information from the documents. In recording the documents, document components such as titles, receiving dates, and sender information are generally extracted as meta-information and collected in the archive system. With a large volume of documents, the manual operation of both classifying the document types and recording the meta-information is essentially tedious and prone to human error. An automated system, as a consequence, is introduced. There are a number of studies on automatic document management, ranging from low-level processing to high-level techniques that process the documents as image-document data and perform a classification task, for example using domain-specific and deep learning approaches [1], [2].
K. Andreas et al. propose a deep learning-based approach for classifying document images into different types [1]. The work applies a Convolutional Neural Network (CNN) architecture to extract a descriptive feature before the classification is carried out using an extreme learning method. Image augmentation is used to increase the variation of the data, so as to enlarge the training set and improve performance. As opposed to a standard CNN, the work uses an extreme learning method in the dense layer. In addition to CNN-based methods, region-specific approaches are among the applicable methods that have been proposed in the literature [3], [4]. Such work constructs a two-level learning approach. The first level extracts domain-specific information from the visual appearance of the document, trained by a VGG16-based model [5]. Then, the output obtained by incorporating the specific information is used to identify distinct regions using visual clues that can be expected to discriminate document types. The document domain is then learned through a fine-tuned network. The work reports promising results for classifying different types of document images.

This work proposes a technique for classifying Thai-official documents and extracting the document information. There are two steps in the classification task, i.e. an initial classification and a refinement process. The classification is carried out using the information of a specific region in the documents.
The result of the classification process is refined in a post-processing procedure in order to improve the classification results. Extracting information from the documents is performed by applying a mathematical morphological operation method to identify candidate artefacts of the components. The false candidates are rejected using a classification method.

The rest of the paper is organized as follows: Section II explains the proposed classification and component extraction techniques. Section III demonstrates the experiments and the results of the evaluation of the proposed technique, and Section IV provides the conclusion and discussion.

Fig. 1: Example of document types (from left to right): internal, external, command and announcement.

II. METHODOLOGY

This paper presents a technique for Thai official-document classification and document component extraction. The work is separated into two parts. The first part performs document classification. The classification is carried out to divide the document images into different types using specific document-pattern information, as shown in Figure 1. Each document type has an identical pattern located in some specific areas of the documents. Therefore, the key idea of the classification is to find some dominant features that can describe and distinguish the different types of documents. In the second part, document component extraction is carried out. The results from the first step are used in the extraction process, as each type of document contains different patterns of document structure and information. In this section, the proposed method is explained. The header section of the documents (as a specific region) is exploited in the classification process. Therefore, the header sections of the images are extracted and defined as sub-images. These sub-images are used to extract discriminative features before the classification is performed using learning algorithms. The component extraction is performed by applying a mathematical morphological operation method to identify candidate artefacts of the components. Then, false positive rejection is carried out using a classification-based technique.

A. Header section segmentation

Specific areas or regions are used as visual clues in the classification process. In the general format, all Thai-official documents contain the emblem of Gadura in the header section of the documents (see Figure 4). Each type of document, however, aligns the emblem differently, except the announcement and command types, which align the emblem in the center of the top of the document header (see Figure 1). In addition, each type of document contains identical and unique patterns, such as the alignment of titles. Therefore, the header section of the documents can be used to separate or divide the documents into different types. In this study, the proportion of the header section region must be determined; Section III will demonstrate the proper proportion of the header segment that can gain considerably promising results.
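As a concrete illustration of the header segmentation, the short sketch below crops the top portion of a page image as the header sub-image; the use of OpenCV and the default proportion are assumptions for illustration (Section III later finds that 25% works best).

# Minimal sketch of the header-section segmentation: keep only the top s% of the page
# image as the sub-image used for classification (the exact tooling is our assumption).
import cv2

def crop_header(image_path: str, proportion: float = 0.25):
    """Return the top `proportion` of the document image (0 < proportion <= 1)."""
    page = cv2.imread(image_path)                 # BGR page image
    h = page.shape[0]
    return page[: int(h * proportion), :, :]      # header region only

header = crop_header("document.png", proportion=0.25)   # 25% performed best (Table I)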
B. Classifying the document type

The classification is performed using a Convolutional Neural Network (CNN). The model is decomposed into two steps, i.e. (i) feature extraction and (ii) classification. In the feature extraction step, the image (I) is convolved with a set of filters. Then, the feature subsets are fed to a fully connected neural network that outputs the class of the document.

Fig. 2: CNN architecture used for the document classification.

In standard CNNs, the classification problem can be cast as modelling a learning function (or model) that maps an input image I ∈ R^{h×w×d} (where h and w are the image resolution and d is the image depth) to a probability vector y ∈ R^C, where C is the number of classes; in this work C = 4 (i.e. internal, external, command and announcement). Each layer (l) of the CNN derives a transformation with adjustable parameters followed by a non-linear operation:

    I_l = f_l(W_l ∗ I_{l−1} + b_l)    (1)

where 1 ≤ l ≤ L is the layer index (L is the number of layers in the network), I_0 is the input image, W_l and b_l are the model parameters at layer l, ∗ is the convolution operator with pre-designated kernels for convolution layers or matrix multiplication for fully connected layers, and f_l is a layer-specific non-linearity (i.e. activation function), composed of ReLU(I) = max(0, I) together with a max-pooling mechanism, a local response normalization, or drop-out. The output of the last layer, I_L, is input to a softmax function, which outputs a probability vector over the target classes [5]. In this work, the header sections of the documents are segmented and then fed to a CNN with L = 5 layers for the classification task; the architecture of the CNN used in this work is shown in Fig. 2.
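To make Eq. (1) concrete, the sketch below expresses the classifier as a small Keras model with five weight layers ending in a four-way softmax. Since Fig. 2 is not reproduced in this text, the filter counts, input size, and dropout rate are our assumptions rather than the paper's exact architecture.

# Illustrative Keras version of the classifier in Eq. (1): a small CNN (five weight
# layers here) ending in a 4-way softmax over the document classes. Layer sizes and
# input resolution are assumptions.
import tensorflow as tf

def build_document_cnn(input_shape=(64, 256, 3), num_classes=4) -> tf.keras.Model:
    return tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),   # I_1 = f_1(W_1 * I_0 + b_1)
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(num_classes, activation="softmax"),  # probability vector y
    ])

model = build_document_cnn()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])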
C. Detecting the emblem of Gadura

Detecting the emblem of Gadura is a straightforward task. An input image (P) is converted to a grayscale image (I). Then, morphological dilation (δ) and erosion (δ^{-1}) are performed to generate a set of candidates for the symbol. To reject false positive detections, a rule-based method is applied. There are two criteria used to assess a detection, i.e. the location and the size of the emblem. Based on observation, the emblem is aligned in the center or on the left side of the header region of the documents.

Fig. 3: Example of detection of the emblem of Gadura using mathematical morphology operations: (a) dilation operation, (b) erosion operation, (c) final detection.

With the position information, false positive rejection is consequently carried out. The rejection is based on b_r ∈ [0, 1], b_r = a / b, where a is the length of the shorter side of the bounding box of a detected object and b is the length of the longer side (between the width and the height). A pre-determined threshold on the ratio (τ) is posed. If the ratio (b_r) of a bounding box is less than the pre-defined threshold (i.e., b_r < τ), the bounding box is rejected (eradicated). An example of the detection is demonstrated in Fig. 3.
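A hedged OpenCV sketch of this candidate generation and ratio-based rejection is given below; the binarization step, kernel size, iteration counts, and the value of τ are illustrative choices, since the text does not report them.

# Sketch of the emblem-of-Gadura detection: grayscale, dilation and erosion to form
# candidate blobs, then rejection of boxes whose side ratio b_r = a/b falls below a
# threshold tau. All numeric settings here are illustrative.
import cv2
import numpy as np

def detect_emblem_candidates(image_path: str, tau: float = 0.6):
    page = cv2.imread(image_path)
    gray = cv2.cvtColor(page, cv2.COLOR_BGR2GRAY)                  # P -> I
    binary = cv2.threshold(gray, 0, 255,
                           cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    kernel = np.ones((5, 5), np.uint8)
    blob = cv2.erode(cv2.dilate(binary, kernel, iterations=2), kernel, iterations=2)

    boxes = []
    contours, _ = cv2.findContours(blob, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        a, b = min(w, h), max(w, h)          # shorter and longer sides of the box
        if b > 0 and a / b >= tau:           # keep only roughly square candidates
            boxes.append((x, y, w, h))
    return boxes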
D. Document component extraction

Fig. 4: An example of a Thai-official document (internal format).

Document components are the key to extracting meta-information from the document images. In this work, there are seven document components in the Thai-official documents, and they are identified and extracted from the document images, i.e. (i) title, (ii) to, (iii) official id, (iv) date, (v) from, (vi) attachment and (vii) reference (demonstrated in Fig. 6). There are two sub-processes in the component extraction task.

In the first step, the component candidates are identified. Spatial region-specific information is used to identify the areas containing the components. This information diminishes the search space, reduces the computational time, and results in smaller candidate sets. An image (I) is divided into 9 regions (illustrated in Fig. 5). From visual observation, all the document components are in the top-row area. Therefore, these areas (the top areas, containing 3 regions) are used to define the candidate set. The candidate set localization is performed based on the morphological operation on the top-row region (explained in the previous section C), giving P = {p_1, p_2, ..., p_n}, where n is the number of component candidates extracted in the area.

In the second step, each candidate (p) is classified into one of the different component types (8 types, including "other"), K. Therefore, a mapping function is learned to map input candidates (P) to component types (K), f : P → K. A Random forest is used as the classifier to classify the candidates into the different types [6]. Each candidate, as an image (p), is processed to generate a shape descriptor; this work applies the Histogram of Oriented Gradients (HOG) to extract features from the candidate set before the classification is performed [9].
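A sketch of this second step is given below, pairing HOG descriptors with a 50-tree Random Forest (the tree count reported later in Section III); the candidate patch size and the HOG parameters are our assumptions.

# Sketch of the candidate classification step: HOG shape descriptors fed to a Random
# Forest with t = 50 trees. Patch size and HOG settings are illustrative.
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.ensemble import RandomForestClassifier

def hog_descriptor(candidate: np.ndarray) -> np.ndarray:
    """candidate: grayscale crop of one component candidate p."""
    patch = resize(candidate, (64, 128))               # normalize size before HOG
    return hog(patch, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

def train_component_classifier(candidates, labels) -> RandomForestClassifier:
    X = np.stack([hog_descriptor(p) for p in candidates])
    clf = RandomForestClassifier(n_estimators=50, random_state=0)   # t = 50
    clf.fit(X, labels)                                 # labels: the 8 component types K
    return clf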
III. EXPERIMENT AND RESULTS

The previous sections explain the proposed technique for classifying the documents into different types and extracting document components from the image data. This section, therefore, demonstrates the performance of the proposed technique.

A. Image Data

Image data of Thai-official documents were collected from (i) on-line resources and (ii) the archive of the department. There are 600 images of 4 document types, i.e. (i) internal (I), (ii) external (E), (iii) command (C) and (iv) announcement (A). Each document type comprises 150 image samples (balanced data). In addition to this dataset, 120 images of the documents were collected separately. This separate data set is used to determine an appropriate proportion of the document header (explained in Section II(A)) to use in the classification stage.

Fig. 5: Region-specific areas used for extracting document components.

Fig. 6: Document components detected in this study: (a) title, (b) to, (c) official id, (d) date, (e) from, (f) attachment, (g) reference, (h) other.

B. Experiment and Results

The image data are divided into test and train sets (30/70). They are then resized to the same resolution for the sake of simplicity. As described in Section II, the header section of the document is resected and used in the classification process. To determine an appropriate proportion of the header, the size (s) of the document header proportion is varied (an example is depicted in Fig. 7). An experiment was conducted and the result is demonstrated in Table I.

Fig. 7: Proportions of the document header used in the experiment, varying over 15%, 20%, 25% and 30% of the whole document image.

TABLE I: Accuracy of the document classification when varying the proportion of the document header.

Proportion of document header (%)   15      20      25      30
Accuracy (%)                        85.34   91.74   96.56   93.65

From Table I, extracting 25% of the document image from the top region (as the header section) yields the best result. A smaller proportion gives only marginally good outcomes, as the information contained in the area is not enough to differentiate the different types of documents. In addition, a larger header portion contains much more information, especially the content of the document, which is generally broad and generic. Therefore, the classification performance drops when the header proportion extracted from the document images is increased.

Fig. 8: Results of the classification into 4 classes (i.e. (i) internal, (ii) external, (iii) command and (iv) announcement): (a) result from the initial classification and (b) result obtained from the refinement process.

Fig. 8(a) demonstrates the overall performance of the document classification process (using 25% of the header section). There are some mis-classified documents, which specifically fall into the command and announcement document classes. By visual observation, the header sections of command and announcement documents are nearly identical, except that two distinct keywords indicate the document class, i.e. "command" in a command document and "announcement" in an announcement document. These words are located under the emblem of Gadura in the center of the document header (see Fig. 3(c)). To alleviate this mis-classification problem, a post-process is therefore carried out. Section II(C) explains the technique for detecting the emblem of Gadura in the document. Once the symbol is detected, a spatial displacement (l = 10 px) below the symbol is used to extract the first line of text located under it (Fig. 3(c)). This line of text is then fed to Tesseract [8] to perform optical character recognition (OCR). The first ten characters of the OCR output (T10) are used. Template matching is then carried out by comparing the output of the OCR with the template words ("command" (Com) and "announcement" (Ann)). For all documents classified as command or announcement, if the Levenshtein distance between T10 and Com is less than that between T10 and Ann, they are classified as command; otherwise they are classified as announcement documents. Fig. 8(b) shows the result obtained from the post-processing process, which is superior to the original classification.
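The refinement rule can be sketched as follows. Here pytesseract stands in for the Tesseract engine [8], the Levenshtein distance is written out explicitly so that no extra package is needed, and the English strings are placeholders for the actual Thai template words, which are not reproduced in this text.

# Sketch of the refinement step: OCR the first text line under the emblem, take the
# first ten characters (T10), and compare them to the two template words with the
# Levenshtein distance. The English template strings are placeholders for the Thai ones.
import pytesseract

def levenshtein(s: str, t: str) -> int:
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (cs != ct)))
        prev = cur
    return prev[-1]

def refine_class(line_image, com: str = "command", ann: str = "announcement") -> str:
    # line_image: the cropped first text line under the emblem (PIL image or array).
    text = pytesseract.image_to_string(line_image, lang="tha")
    t10 = text.strip()[:10]                    # first ten OCR characters (T10)
    return "command" if levenshtein(t10, com) < levenshtein(t10, ann) else "announcement"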
Fig. 9: Results of the evaluation of the document-component classification; the confusion matrices demonstrate the classification of document components into (1) title, (2) to, (3) official id, (4) date, (5) from, (6) attachment, (7) reference and (8) other: (a) the result obtained from Random Forest and (b) the result from CNN.

Fig. 10: Example of the document component extraction.

In the document component extraction, two classification techniques are applied in this work, i.e. (i) Random forest (t = 50) with HOG and (ii) CNN. There are 1,283 data images used in the classification (8 classes). Random forest is applied in this work to demonstrate an example of a conventional classification method using explicit feature extraction and a learning algorithm. The number of parameters of a Random Forest is minimal compared to other learning algorithms, e.g. an Artificial Neural Network (ANN). In addition, it can produce probability outputs for classification problems. The output of the extraction (as a classification process) is shown in Fig. 9. The results, shown in Fig. 9, indicate that the CNN produces a better outcome (achieving an accuracy of 0.99) than Random forest with HOG (an accuracy of 0.61). An example of the results is depicted in Fig. 10.

IV. CONCLUSION

The objective of this work is two-fold, i.e. (i) Thai-official document classification and (ii) document component extraction. In the document classification task, the classification is performed using a two-stage approach, i.e. an initial classification and a refinement process. The classification is performed using specific locations or region-specific information (the headers) of the documents, as different types of documents generally contain different information in these areas. The result of the classification process is then refined in a post-processing procedure, in order to improve the classification results. Detecting and extracting the components from the documents is carried out by applying a mathematical morphological operation method to detect the component candidates. False candidate rejection is performed using a classification technique, i.e. Random Forest and CNN. The results of the experiments show that the proposed technique produces promising results for both the classification and the component extraction.
In future work, document analysis techniques will be applied to provide an initial prior for the classification. In addition, OCR can be implemented on the detected components in order to develop an archive system for Thai-official documents.

REFERENCES

[1] L. Andreas, A. Z. Muhammad, E. Markus, L. Marcus, "Real-Time Document Image Classification using Deep CNN and Extreme Learning Machines", CoRR, abs/1711.05862, 2017.
[2] T. Chris, M. Tony, "Analysis of Convolutional Neural Networks for Document Image Classification", in Document Analysis and Recognition (ICDAR), 2017.
[3] D. Arindam, R. Saikat and B. Ujjwal, "Document Image Classification with Intra-Domain Transfer Learning and Stacked Generalization of Deep Convolutional Neural Networks", 2018 24th International Conference on Pattern Recognition (ICPR), 2018, pp. 3180-3185.
[4] A. W. Harley, A. Ufkes, and K. G. Derpanis, "Evaluation of deep convolutional nets for document image classification and retrieval", in Document Analysis and Recognition (ICDAR), 2015 13th International Conference on. IEEE, 2015, pp. 991–995.
[5] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition", CoRR, abs/1409.1556, 2014.
[6] D. Navneet and T. Bill, "Histograms of Oriented Gradients for Human Detection", in Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Volume 1, IEEE Computer Society.
[7] B. Leo, "Random Forests", Mach. Learn. 45, 1 (October 2001), 5-32.
[8] K. Anthony, "Tesseract: an open-source optical character recognition engine", Linux J. 2007, 159 (July 2007).
[9] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection", 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005, vol. 1, pp. 886-893.
2019 4th International Conference on Information Technology (InCIT2019) List of Reviewers) Adisak Suasaming Ferdin Joe John Joseph Akadej Udomchaiporn Hiroshi Tsunoda Albert Alexander Hutchatai Chanlekha Amorn Jiraseree-amornkun Jakkarin Suksawatchon Anantaporn Hanskunatai Jaree Thongkam Annop Monsakul Jian Qu Assadarat Khurat Jirabhorn Chaiwongsai Aziz Nanthaamornphong Kanakarn Ruxpaitoon Boonsit Yimwadsana Kanokwan Atchariyachanvanich Chadaporn Keatmanee Kanticha Kittipeerachon Chakadkit Thaenchaikun Kanuengnij Kubola Chakree Teekapakvisit Kanyarat Sriwistiyakun Chetneti Srisaan Karin Sumongkayothin Cholrit Luangjinda Karn Yongsiriwit Chutima Beokhaimook Kasem Thiptarajan Chuwong Phongcharoenpanich Kazuhiko Hamamoto Danai Phaoharuhansa Kiattisak Maichalernnukul Datchakorn Tancharoen Kingkarn Sookhanaphibarn Dechanuchit Katanyutaveetip Kitsiri Chochiang Duangjai Jitkongchuen Kitsuchart Pasupa Ekarat Rattagan Kosin Chamnongthai 311
2019 4th International Conference on Information Technology (InCIT2019) Krisana Chinnasarn Prajaks Jitngernmadan Kuntpong Woraratpanya Pramuk Boonsieng Kwankamon Dittakan Pranisa Israsena Maleerat Sodanil Preecha Tangworakitthaworn Nantika Prinyapol Pruegsa Duangphasuk Narungsun Wilaisakoolyong Pruet Putjorn Nilubon Kurubanjerdjit Ratchakoon Pruengkarn Nol Premasathian Rattana Wetprasit Nucharee Premchaiswadi Rojanee Khummongkol Nutchanun Chinpanthana Ruttikorn Varakulsiripunth Olarn Wongwirat Sakchai Tangwannawit Pakachart Puttipakorn Sakchai Thipchaksurat Pakapan Limtrairut Sakorn Mekruksavanich Pattarachai Lalitrojwong Saprangsit Mruetusatorn Phaisarn Sudwilai Sarayut Nonsiri Phayung Meesad Saromporn Charoenpit Pichit Sukchareonpong Sinchai Kamolphiwong Piyanuch Chaipornkaew Sirichai Hemrungrote Ponrudee Netisopakul Siriporn Supratid Pornavalai Chotipat Songsri Tangsripairoj Pornthep Rojanavasu Soontarin Nupap Prajak Chertchom Sophon Mongkolluksamee 312
2019 4th International Conference on Information Technology (InCIT2019) Srisupa Palakvangsa Na Ayudhya Vanvisa Chutchavong Suebtas Limsaihua Vasaka Visoottiviseth Sunantha Sodsee Vasin Chooprayoon Suppakarn Chansareewittaya Virach Sornlertlamvanich Suppat Rungraungsilp Vithida Chongsuphajaisiddhi Surangkana Rawungyot Voravika Wattanasoontorn Suttisak Jantavongso Warakorn Srichavengsup Takashi Mitsuishi Warangkhana Kimpan Tanapon Jensuttiwetchakul Wasuwath Pongkachorn Thana Sukvaree Watchareewan Jitsakul Thana Udomsripaiboon Werasak Kurutach Thanapon Noraset Wimol San-Um Theekapun Charoenpong Worapan Kusakunniran Thitinan Tantidham Worapat Paireekreng Thitiporn Lertrusdachakul Worawat Choensawat Thongchai Kaewkiriya Wudhichart Sawangphol Todsanai Chumwattana Yoko Nakajima Toshiaki Kondo 313