4. Fit and train your model.
5. Evaluate the model.

Loading your data
This step is pretty trivial in our model but is often the most complex and difficult part of building your entire program. You have to look at your data (whether an XOR gate or a database of factors affecting diabetic heart patients) and figure out how to map your data and the results to get to the information and predictions that you want.

Defining your neural-network model and layers
Defining your network is one of the primary advantages of Keras over other frameworks. You basically just construct a stack of the neural layers you want your data to flow through. Remember, TensorFlow is just that: your matrices of data flowing through a neural-network stack. Here you choose the configuration of your neural layers and activation functions.

Compiling your model
Next you compile your model, which hooks up your Keras layer model with the efficient underlying machinery (what they call the back-end) to run on your hardware. You also choose what you want to use for a loss function.

Fitting and training your model
This is where the real work of training your network takes place. You determine how many epochs you want to go through. The fit call also accumulates the history of what is happening through all the epochs, and we will use this to create our graphs.

Our Python code using TensorFlow, NumPy, and Keras for the two-layer neural network of Figure 2-6 follows. Using nano (or your favorite text editor), open up a file called TensorFlowKeras.py and enter the following code:

import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.layers import Activation, Dense
import numpy as np

# X = input of our 3 input XOR gate
# set up the inputs of the neural network (right from the table)
X = np.array(([0,0,0],[0,0,1],[0,1,0],
    [0,1,1],[1,0,0],[1,0,1],[1,1,0],[1,1,1]), dtype=float)
# y = our output of our neural network
y = np.array(([1], [0], [0], [0], [0],
    [0], [0], [1]), dtype=float)

model = tf.keras.Sequential()
model.add(Dense(4, input_dim=3, activation='relu',
    use_bias=True))
#model.add(Dense(4, activation='relu', use_bias=True))
model.add(Dense(1, activation='sigmoid', use_bias=True))
model.compile(loss='mean_squared_error',
    optimizer='adam',
    metrics=['binary_accuracy'])

print (model.get_weights())

history = model.fit(X, y, epochs=2000,
    validation_data = (X, y))

model.summary()

# printing out to file
loss_history = history.history["loss"]
numpy_loss_history = np.array(loss_history)
np.savetxt("loss_history.txt", numpy_loss_history, delimiter="\n")

binary_accuracy_history = history.history["binary_accuracy"]
numpy_binary_accuracy = np.array(binary_accuracy_history)
np.savetxt("binary_accuracy.txt", numpy_binary_accuracy, delimiter="\n")

print(np.mean(history.history["binary_accuracy"]))

result = model.predict(X).round()
print (result)
After looking at the code, we will run the neural network and then evaluate the model and results.

Breaking down the code
The first thing to notice about our code is that it is much simpler than our two-layer model written strictly in Python earlier in this chapter. That's the magic of TensorFlow/Keras. Go back and compare this code to the code for our two-layer network in pure Python. This is much simpler and easier to understand.

First, we import all the libraries needed to run our example two-layer model. Note that TensorFlow includes Keras by default. And once again we see our friend NumPy as the preferred way of handling matrices.

import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.layers import Activation, Dense
import numpy as np

Step 1, load and format your data. In this case, we just set up the truth table for our XOR gate in terms of NumPy arrays. This can get much more complex when you have large, diverse, cross-correlated sources of data.

# X = input of our 3 input XOR gate
# set up the inputs of the neural network (right from the table)
X = np.array(([0,0,0],[0,0,1],[0,1,0],
    [0,1,1],[1,0,0],[1,0,1],[1,1,0],[1,1,1]), dtype=float)
# y = our output of our neural network
y = np.array(([1], [0], [0], [0], [0],
    [0], [0], [1]), dtype=float)

Step 2, define your neural-network model and layers. This is where the real power of Keras shines. It is very simple to add more neural layers, and to change their size and their activation functions. We are also applying a bias to our activation function (relu, in this case, with our friend the sigmoid for the final output layer), which we did not do in our pure Python model. See the commented model.add statement below? When we go to our three-layer neural-network example, that is all we have to change by uncommenting it.
model = tf.keras.Sequential()
model.add(Dense(4, input_dim=3, activation='relu',
    use_bias=True))
#model.add(Dense(4, activation='relu', use_bias=True))
model.add(Dense(1, activation='sigmoid', use_bias=True))

Step 3, compile your model. We are using the same loss function that we used in our pure Python implementation, mean_squared_error. New to us is the optimizer: ADAM (a method for stochastic optimization) is a good default optimizer. It provides a method for efficiently descending the gradient applied to the weights of the layers.

One thing to note is what we are asking for in terms of metrics. binary_accuracy means we are comparing the outputs of our network to either a 1 or a 0. You will see values of, say, 0.75, which, since we have eight possible outputs, means that six out of eight are correct. It is exactly what you would expect from the name.

model.compile(loss='mean_squared_error',
    optimizer='adam',
    metrics=['binary_accuracy'])

Here we print out all the starting weights of our model. Note that they are assigned with a default random method, which you can seed (to do the same run with the same starting weights time after time), or you can change the way they are added.

print (model.get_weights())

Step 4, fit and train your model. We chose the number of epochs so we would converge to a binary accuracy of 1.0 most of the time. Here we load the NumPy arrays for the input to our network (X) and our expected output of the network (y). The validation_data parameter is used to compare the outputs of your trained network in each epoch and generates val_acc and val_loss for your information in each epoch, as stored in the history variable.

history = model.fit(X, y, epochs=2000,
    validation_data = (X, y))

Here we print a summary of your model so you can make sure it was constructed in the way expected.

model.summary()
Next, we print out the values from the history variable that we would like to graph.

# printing out to file
loss_history = history.history["loss"]
numpy_loss_history = np.array(loss_history)
np.savetxt("loss_history.txt", numpy_loss_history, delimiter="\n")

binary_accuracy_history = history.history["binary_accuracy"]
numpy_binary_accuracy = np.array(binary_accuracy_history)
np.savetxt("binary_accuracy.txt", numpy_binary_accuracy, delimiter="\n")

Step 5, evaluate the model. Here we run the model to predict the outputs from all the inputs of X, using the round function to make them either 0 or 1. Note that this replaces the criteria we used in our pure Python model, which was <0.1 = "0" and >0.9 = "1". We also calculate the average of all the binary_accuracy values of all the epochs, but the number isn't very useful, except that the closer to 1.0 it is, the faster the model succeeded.

print(np.mean(history.history["binary_accuracy"]))

result = model.predict(X).round()
print (result)

Now let's move along to some results.

Evaluating the model
When you run TensorFlow programs, you may see something like this:

/usr/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: compiletime version 3.4 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.5
  return f(*args, **kwds)
/usr/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: builtins.type size changed, may indicate binary incompatibility. Expected 432, got 412
  return f(*args, **kwds)

This is because of a problem with the way TensorFlow was built for your machine. These warnings can be safely ignored. The good folks over at TensorFlow.org say this issue will be fixed in the next version.
We run the two-layer model by typing python3 TensorFlowKeras.py in our terminal window. After watching the epochs march away (you can change this amount of output by setting the verbose parameter in your model.fit command), we are rewarded with the following:

...
Epoch 1999/2000
8/8 [==============================] - 0s 2ms/step - loss: 0.0367 - binary_accuracy: 1.0000 - val_loss: 0.0367 - val_binary_accuracy: 1.0000
Epoch 2000/2000
8/8 [==============================] - 0s 2ms/step - loss: 0.0367 - binary_accuracy: 1.0000 - val_loss: 0.0367 - val_binary_accuracy: 1.0000
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 4)                 16
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 5
=================================================================
Total params: 21
Trainable params: 21
Non-trainable params: 0
_________________________________________________________________
0.8436875
[[1.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [1.]]

We see that by epoch 2,000 we had achieved the binary accuracy of 1.0, as hoped for, and the results of our model.predict function call at the end match our truth table. Figure 2-6 shows the results of the loss function and binary accuracy values plotted against the epoch number as the training progressed. Figure 2-2 shows graphically what we are implementing.

A couple of things to note. The loss function is a much smoother linear curve when it succeeds. This has to do with the activation choice (relu) and the optimizer function (ADAM). Another thing to remember is that you will get a (somewhat) different curve each time because of the random initial values of the weights. Seed your random number generator to make it the same each time you run it. This makes it easier to optimize your performance.
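For example, a minimal sketch of seeding both generators near the top of TensorFlowKeras.py might look like this (the seed value 42 is arbitrary, and the exact TensorFlow call depends on your version):

import numpy as np
import tensorflow as tf

np.random.seed(42)        # seeds NumPy's random number generator
tf.set_random_seed(42)    # TensorFlow 1.x; in TensorFlow 2.x use tf.random.set_seed(42)

With both seeds set, the initial weights (and therefore the loss and accuracy curves) come out the same on every run, which makes comparing parameter changes much easier.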
BACKPROPAGATION IN KERAS
With our first neural network in this chapter, we made a big deal about backpropagation and how it was a fundamental part of neural networks. However, now we have moved into Keras/TensorFlow and we haven't said one word about it. The reason for this is that the backpropagation in Keras/TensorFlow is handled automatically. It's done for you. If you want to modify how it is doing it, the easiest way is to modify the optimizer parameter in the model.compile command (we used ADAM). It is quite a bit of work to dramatically modify the backpropagation algorithm in Keras, but it can be done.

When you run your training for the network, you are using the backpropagation algorithm and optimizing it according to the optimization algorithm and loss function specified when compiling the model.

FIGURE 2-6: Results of the two-layer training.

Note when the binary accuracy goes to 1.00 (about epoch 1556). That's when your network is fully trained in this case.

Changing to a three-layer neural network in TensorFlow/Keras
Now let's add another layer to our neural network, as shown in Figure 2-7. Open your TensorFlowKeras.py file in your favorite editor and change the following:
model.add(Dense(4, input_dim=3, activation='relu',
    use_bias=True))
#model.add(Dense(4, activation='relu', use_bias=True))
model.add(Dense(1, activation='sigmoid', use_bias=True))

Remove the comment character in front of the middle layer, and you now have a three-layer neural network with four neurons per layer. It's that easy. Here is what it should look like now:

model.add(Dense(4, input_dim=3, activation='relu',
    use_bias=True))
model.add(Dense(4, activation='relu', use_bias=True))
model.add(Dense(1, activation='sigmoid', use_bias=True))

Run the program and you will now have your results from the three-layer neural network, which will look something like this:

8/8 [==============================] - 0s 2ms/step - loss: 0.0153 - binary_accuracy: 1.0000 - val_loss: 0.0153 - val_binary_accuracy: 1.0000
Epoch 2000/2000
8/8 [==============================] - 0s 2ms/step - loss: 0.0153 - binary_accuracy: 1.0000 - val_loss: 0.0153 - val_binary_accuracy: 1.0000
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 4)                 16
_________________________________________________________________
dense_1 (Dense)              (None, 4)                 20
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 5
=================================================================
Total params: 41
Trainable params: 41
Non-trainable params: 0
_________________________________________________________________
0.930375
[[1.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [1.]]
WHY USE A GUI (GRAPHICAL USER INTERFACE) TO RUN TensorFlow?
As you should know by now, you spend a lot of time coding in text editors to build your models. For simplicity's sake, we exported our data out to Excel to produce the graphs in this chapter. Most of the time, we use our machines in a terminal window, but there is a big advantage to using your computer's full GUI desktop to open these terminal windows for editing. That big advantage is called TensorBoard. TensorBoard is a part of TensorFlow and is available to you in a browser such as Chrome or Firefox. You point TensorBoard at your job directory and you suddenly can do all sorts of visual analysis of your neural-network experiments.

FIGURE 2-7: Results of the three-layer training.

You now can see that you have three layers in your neural network. This is one reason why the TensorFlow/Keras software is so powerful. It's easy to tinker with parameters and make changes.

Notes on our three-layer run: First of all, it converges to a binary accuracy of 1.00 at about epoch 916, much faster than epoch 1556 from our two-layer run. The loss function is less linear than the two-layer run's. Just for fun and giggles, we changed the number of neurons to 100 per each of the hidden layers. As expected, it converged to a binary accuracy of 1.00 at epoch 78, much faster than the earlier run! Run your own experiments to get a good feel for the way your results will vary with different parameters, layers, and neuron counts.

Believe it or not, you now understand a great deal about how neural networks and machine learning work. Go forth and train those neurons!
Chapter 4
Doing Machine Learning in Python

IN THIS CHAPTER
»» Teaching a machine to learn something
»» How machines learn
»» Understanding the basics of machine learning
»» Using TensorFlow to do machine learning

What does it mean to learn something? One definition is "the acquisition and mastery of what is already known about something and the extended clarification of meaning of that knowledge." Another definition is that "learning is a relatively permanent change in a person's knowledge or behavior due to experience." At the current state of machine learning (and most likely for some time to come), it is the second definition that best fits what AI can do today.

Our culture has developed algorithms and programs that can learn things about data and about sensory input and apply that knowledge to new situations. However, our machines do not "understand" anything about what they have learned. They have just accumulated data about their inputs and have transformed that input to some kind of output. However, even if the machine does not "understand" what it has learned, that does not mean that you cannot do some pretty impressive things using the machine-learning techniques discussed in this chapter. Maybe the techniques we are developing now will lead the way to something much more impressive in the future.
What does it mean for a machine to learn something? We're going to use the rough idea that if a machine can take inputs and by some process transform those inputs to some useful outputs, then we can say the machine has learned something. This definition has a wide meaning. In writing a simple program to add two numbers, you have taught that machine something: It has learned to add two numbers.

We're going to focus in this chapter on machine learning in the sense of the use of algorithms and statistical models that progressively improve their performance on a specific task. If this sounds very much like our neural-network experiments in Chapter 2, you are correct. Machine learning is a lot about neural networks, but it's also about other sophisticated techniques.

Learning by Looking for Solutions in All the Wrong Places
One of the real problems with machine learning and AI in general is figuring out how the algorithm can find the best solution. The operative word there is best. How do we know a given solution is best? It is really a matter of setting goals and achieving them (the solution may not be best but maybe just good enough).

Some people have compared the problem of finding the "best" solution to that of a person wandering around on a foggy day trying to find the highest mountain in the area. You climb up a mountain and get to the top and then proclaim, "I am on the highest mountain." Well, you are on the highest mountain you can see, but you can't see through the fog. However, if you define your goal as being on the top of a mountain more than 1,000 feet high, and you are at 1,250 feet, then you have met your goal. This is called a local maximum, and it may or may not be the best maximum available.

In this chapter, most of our goal setting (training the machine) will be done with known solutions to a problem: first training our machine and then applying the training to new, similar examples of the problem.

There are three main types of machine-learning algorithms:

»» Supervised learning: This type of algorithm builds a model of data that contains both inputs and outputs. The data is known as training data. This is the kind of machine learning we show in this chapter.

»» Unsupervised learning: For this type of algorithm, the data contains only the inputs, and the algorithms look for the structures and patterns in the data.

»» Reinforcement learning: This area is concerned with software taking actions based on some kind of cumulative reward. These algorithms do not assume
knowledge of an exact mathematical model and are used when exact models are unavailable. This is the most complex area of machine learning, and the one that may bear the most fruit in the future.

With that being said, let's jump into doing machine learning with Python.

Classifying Clothes with Machine Learning
In this chapter, we use the freely available Fashion-MNIST (Modified National Institute of Standards and Technology) training database, which contains 60,000 fashion products from ten categories. It contains data in 28x28 pixel format, with 6,000 items in each category. (See Figure 3-1.) This gives us a really interesting dataset with which to build a TensorFlow/Keras machine-learning application, much more interesting than the standard MNIST machine-learning database that contains only handwritten characters.

FIGURE 3-1: A bit of the Fashion-MNIST database.

Training and Learning with TensorFlow
For this chapter, we are going to once again use TensorFlow/Keras to build some machine-learning examples and look at their results. For more about TensorFlow and Keras, refer to Chapter 2.
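If you want a quick feel for the dataset before building anything, a three-line sketch (assuming TensorFlow is already installed) shows its shape and label format:

import tensorflow as tf

(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.fashion_mnist.load_data()
print(train_images.shape, test_images.shape)   # (60000, 28, 28) (10000, 28, 28)
print(train_labels[:10])                        # the first ten labels, each an integer from 0 to 9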
Here we use the same five-step approach we used to build layered networks with Keras and TensorFlow in Chapter 2:

1. Load and format your data.
2. Define your neural network model and layers.
3. Compile the model.
4. Fit and train your model.
5. Evaluate the model.

Setting Up the Software Environment for this Chapter
Most of the action in this chapter is, as usual, in the command line, because you still have to type code and run software. However, we are going to display some graphics on the screen and use MatPlotLib to evaluate what your machine-learning program is doing, so please start a GUI (graphical user interface) if you haven't already.

If you are running on a headless Raspberry Pi, either add a keyboard, mouse, and monitor or stop now and bring up VNC (virtual network computing). With VNC, think of using your computer monitor as a display on a second computer, the Raspberry Pi, in this case. Many links on the web describe how to do this and how to bring up the GUI on your main computer. We are using VNC on a headless Raspberry Pi in this chapter. Feel free to connect a mouse, keyboard, and a monitor directly to the Raspberry Pi if you want. Figure 3-2 shows the GUI running on the Raspberry Pi (it is actually running on VNC on our Mac, but you can't tell from this image).

A great source for tutorials on setting up the software and connecting the Raspberry Pi is located at www.raspberrypi.org.

If you are missing some of the libraries that we use in this example, then search the web for how to install them on your specific machine. Every setup is a little different. For example, if you're missing seaborn, then search on "installing seaborn python library on [name of your machine]." If you do a search on seaborn for the Raspberry Pi, then you will find "sudo pip3 install seaborn."
FIGURE 3-2: A full GUI on the Raspberry Pi.

Creating a Machine-Learning Network for Detecting Clothes Types
Our main example of machine learning in Python uses an MNIST-format (MNIST means that it is a collection of grayscale images with a resolution of 28x28 pixels) Fashion database of 60,000 images classified into ten types of apparel, as follows:

»» 0 T-shirt/top

»» 1 Trouser

»» 2 Pullover

»» 3 Dress

»» 4 Coat

»» 5 Sandal

»» 6 Shirt
»» 7 Sneaker

»» 8 Bag

»» 9 Ankle boot

Getting the data: The Fashion-MNIST dataset
Turns out this is pretty easy, although it will take a while to first load it to your computer. After you run the program for the first time, it will use the Fashion-MNIST data copied to your computer.

Training the network
We will train our machine-learning neural network using all 60,000 images of clothes: 6,000 images in each of the ten categories.

Testing our network
Our trained network will be tested three different ways: 1) a set of 10,000 test photos from the Fashion_MNIST data set; 2) a selected image from the Fashion_MNIST data set; and 3) a photo of a woman's dress. This first version of the program will run a test on a 10,000-image set of files from the Fashion_MNIST database.

Our Python code using TensorFlow, NumPy, and Keras for the Fashion_MNIST network follows. Using nano (or your favorite text editor), open up a file called FMTensorFlow.py and enter the following code:

#import libraries
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import seaborn as sns

import tensorflow as tf
from tensorflow.python.framework import ops
from tensorflow.examples.tutorials.mnist import input_data
from PIL import Image
# Import Fashion MNIST
fashion_mnist = input_data.read_data_sets('input/data', one_hot=True)

fashion_mnist = tf.keras.datasets.fashion_mnist

(train_images, train_labels), (test_images, test_labels) \
    = fashion_mnist.load_data()

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

train_images = train_images / 255.0
test_images = test_images / 255.0

model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(28,28)))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dense(10, activation='softmax'))

model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_images, train_labels, epochs=5)

# test with 10,000 images
test_loss, test_acc = model.evaluate(test_images, test_labels)

print('10,000 image Test accuracy:', test_acc)

Breaking down the code
After you've read the description of the TensorFlow/Keras program in Chapter 2, this code should look much more familiar. In this section, we break it down into our five-step Keras process.
First, we import all the libraries needed to run our example two-layer model. Note that TensorFlow includes Keras by default. And once again we see our friend NumPy as the preferred way of handling matrices.

#import libraries
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import seaborn as sns

import tensorflow as tf
from tensorflow.python.framework import ops
from tensorflow.examples.tutorials.mnist import input_data
from PIL import Image

1. Load and format your data.
This time we are using the built-in data-set reading capability. It knows what this data is because of the import statement from tensorflow.examples.tutorials.mnist in the preceding code.

# Import Fashion MNIST
fashion_mnist = input_data.read_data_sets('input/data', one_hot=True)

fashion_mnist = tf.keras.datasets.fashion_mnist

(train_images, train_labels), (test_images, test_labels) \
    = fashion_mnist.load_data()

Here we give some descriptive names to the ten classes within the Fashion_MNIST data.

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

Here we change all the images to be scaled from 0.0–1.0 rather than 0–255.

train_images = train_images / 255.0
test_images = test_images / 255.0

2. Define your neural-network model and layers.
Again, this is where the real power of Keras shines. It is very simple to add more neural layers, and to change their sizes and their activation functions. We are also applying a bias to our activation function (relu, in this case, with softmax for the final output layer).
model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(28,28)))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dense(10, activation='softmax'))

3. Compile your model.
We are using the loss function sparse_categorical_crossentropy. This function is new to us in this book. It is used when you have assigned a different integer to each clothes category, as we have in this example. ADAM (a method for stochastic optimization) is a good default optimizer. It provides a method well suited for problems that are large in terms of data and/or parameters.

model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

Sparse categorical crossentropy is a loss function used to measure the error between categories across the data set. Categorical refers to the fact that the data has more than two categories (binary) in the data set. Sparse refers to using a single integer to refer to classes (0–9, in our example). Entropy (a measure of disorder) refers to the mix of data between the categories.

4. Fit and train your model.
I chose the number of epochs as only 5 due to the time it takes to run the model for our examples. Feel free to increase! Here we load the NumPy arrays for the input to our network (the database train_images).

model.fit(train_images, train_labels, epochs=5)

5. Evaluate the model.
The model.evaluate function runs your trained network against the test set and generates test_acc and test_loss for your information.

# test with 10,000 images
test_loss, test_acc = model.evaluate(test_images, test_labels)

print('10,000 image Test accuracy:', test_acc)
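Before looking at the results, here is one way to picture the "sparse" label format mentioned in Step 3. This little snippet is only an illustration and is not part of the program:

import numpy as np

y_sparse = np.array([9, 2, 0])      # ankle boot, pullover, T-shirt/top as plain integers
y_onehot = np.eye(10)[y_sparse]     # the one-hot rows that plain categorical_crossentropy would expect

print(y_onehot[0])                  # [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]

sparse_categorical_crossentropy works directly with the integer labels, so we never have to build the one-hot matrix ourselves.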
Results of the training and evaluation
I ran my program on the Raspberry Pi 3B+. You can safely ignore the code mismatch warnings and the future deprecation announcements at this point. Here are the results of the program:

Epoch 1/5
60000/60000 [==============================] - 44s 726us/step - loss: 0.5009 - acc: 0.8244
Epoch 2/5
60000/60000 [==============================] - 42s 703us/step - loss: 0.3751 - acc: 0.8652
Epoch 3/5
60000/60000 [==============================] - 42s 703us/step - loss: 0.3359 - acc: 0.8767
Epoch 4/5
60000/60000 [==============================] - 42s 701us/step - loss: 0.3124 - acc: 0.8839
Epoch 5/5
60000/60000 [==============================] - 42s 703us/step - loss: 0.2960 - acc: 0.8915
10000/10000 [==============================] - 4s 404us/step
10,000 image Test accuracy: 0.873

Fundamentally, the test results are saying that with our two-layer neural machine-learning network, we are classifying 87 percent of the 10,000-image test database correctly. We upped the number of epochs to 50 and increased this to only 88.7 percent accuracy. Lots of extra computation with little increase in accuracy.

Testing a single test image
Next is to test a single image (see Figure 3-3) from the Fashion_MNIST database. Add this code to the end of your program and rerun the software:

#run test image from Fashion_MNIST data
img = test_images[15]
img = (np.expand_dims(img,0))

singlePrediction = model.predict(img,steps=1)

print ("Prediction Output")
print(singlePrediction)
print()

NumberElement = singlePrediction.argmax()
Element = np.amax(singlePrediction)
print (\"Our Network has concluded that the image number '15' is a \" Doing Machine +class_names[NumberElement]) Learning in Python print (str(int(Element*100)) + \"% Confidence Level\") FIGURE 3-3: Image 15 from the Fashion- MNIST test database. Here are the results from a five-epoch run: Prediction Output [[1.2835168e-05 9.9964070e-01 6.2637120e-08 3.4126092e-04 4.4297972e-06 7.8450663e-10 6.2759432e-07 9.8717527e-12 1.2729484e-08 1.1002166e-09]] Our Network has concluded that the image number '15' is a Trouser 99% Confidence Level Woohoo! It worked. It correctly identified the picture as a trouser. Remember, however, that we only had an overall accuracy level on the test data of 87 percent. Next, off to our own picture. Testing on external pictures To accomplish this test, John took a dress from his wife’s closet, hung it up on a wall (see Figure 3-4), and took a picture of it with his iPhone. Next, using Preview on our Mac, we converted it to a resolution of 28 x 28 pixels down from 3024x3024 pixels straight from the phone. (See Figure 3-5.) CHAPTER 3 Doing Machine Learning in Python 403
FIGURE 3-4: Unidentified dress hanging on a wall.

FIGURE 3-5: The dress at 28x28 pixels.

Okay, a few very important comments. First of all, 28x28 pixels does not result in a very clear picture. However, comparing Figure 3-5 to Figure 3-3 from the Fashion-MNIST database, our picture still looks better. Most of the following code has to do with arranging the data from our JPG picture to fit the format required by TensorFlow. You should be able to use this code to easily add your own pictures for more experiments:

# run Our test Image
# read test dress image
imageName = "Dress28x28.JPG"
testImg = Image.open(imageName)
testImg.load()

data = np.asarray( testImg, dtype="float" )
data = tf.image.rgb_to_grayscale(data)
data = data/255.0

data = tf.transpose(data, perm=[2,0,1])

singlePrediction = model.predict(data,steps=1)

print ("Prediction Output")
print(singlePrediction)
print()

NumberElement = singlePrediction.argmax()
Element = np.amax(singlePrediction)

print ("Our Network has concluded that the file '"
    +imageName+"' is a "+class_names[NumberElement])
print (str(int(Element*100)) + "% Confidence Level")

The results, round 1
We should start out by saying these results did not make us very happy, as you will see shortly.

We put the Dress28x28.JPG file in the same directory as our program and ran a five-epoch training run. Here are the results:

Prediction Output
[[1.2717753e-06 1.3373902e-08 1.0487850e-06 3.3525557e-11 8.8031484e-09
  7.1847245e-10 1.1177938e-04 8.8322977e-12 9.9988592e-01 3.2957085e-12]]

Our Network has concluded that the file 'Dress28x28.JPG' is a Bag
99% Confidence Level

So, our neural network machine-learning program, after classifying 60,000 pictures and 6,000 dress pictures, concluded at a 99 percent confidence level . . . wait for it . . . that John's wife's dress is a bag.

So the first thing we did next was to increase the training epochs to 50 and to rerun the program. Here are the results from that run:

Prediction Output
[[3.4407502e-33 0.0000000e+00 2.5598763e-33 0.0000000e+00 0.0000000e+00
  0.0000000e+00 2.9322060e-17 0.0000000e+00 1.0000000e+00 1.5202169e-39]]

Our Network has concluded that the file 'Dress28x28.JPG' is a Bag
100% Confidence Level

The dress is still a bag, but now our program is 100 percent confident that the dress is a bag. Hmmmm. This illustrates one of the problems with machine learning. Being 100 percent certain that a picture is of a bag when it is a dress is still 100 percent wrong.

What is the real problem here? Probably the neural-network configuration is just not good enough to distinguish the dress from a bag. We saw that additional training epochs didn't seem to help at all, so the next thing to try is to increase the number of neurons in our hidden level.

What are other things to try to improve this? It turns out there are many. You can use CNNs (convolutional neural networks), data augmentation (increasing the number of training samples by rotating, shifting, and zooming the pictures; a short sketch of this appears right after the CNN code listing below), and a variety of other techniques that are beyond the scope of this introduction to machine learning.

We did do one more experiment. We changed the model layers in our program to use the following four-level convolutional-layer model. We just love how easy Keras and TensorFlow make it to dramatically change the neural network.

CNNs work by scanning images and analyzing them chunk by chunk, say in a 5x5 window that moves by a stride length of two pixels each time until it spans the entire image. It's like looking at an image using a microscope; you only see a small part of the picture at any one time, but eventually you see the whole picture.

Going to a CNN network on my Raspberry Pi increased the time for a single epoch to 1.5 hours, up from about 10 seconds per epoch previously.

The CNN model code
This code has the same structure as the last program. The only significant change is the addition of the new layers for the CNN network:

#import libraries
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import seaborn as sns

import tensorflow as tf
from tensorflow.python.framework import ops
from tensorflow.examples.tutorials.mnist import input_data
from PIL import Image

# Import Fashion MNIST
fashion_mnist = input_data.read_data_sets('input/data', one_hot=True)

fashion_mnist = tf.keras.datasets.fashion_mnist

(train_images, train_labels), (test_images, test_labels) \
    = fashion_mnist.load_data()

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

train_images = train_images / 255.0
test_images = test_images / 255.0

# Prepare the training images
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1)

# Prepare the test images
test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)

model = tf.keras.Sequential()

input_shape = (28, 28, 1)
model.add(tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu',
    input_shape=input_shape))
model.add(tf.keras.layers.BatchNormalization())

model.add(tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(tf.keras.layers.Dropout(0.25))

model.add(tf.keras.layers.Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(0.25))
model.add(tf.keras.layers.Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(tf.keras.layers.Dropout(0.25))

model.add(tf.keras.layers.Flatten())

model.add(tf.keras.layers.Dense(512, activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(0.5))

model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(0.5))

model.add(tf.keras.layers.Dense(10, activation='softmax'))

model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_images, train_labels, epochs=5)

# test with 10,000 images
test_loss, test_acc = model.evaluate(test_images, test_labels)

print('10,000 image Test accuracy:', test_acc)

#run test image from Fashion_MNIST data
img = test_images[15]
img = (np.expand_dims(img,0))

singlePrediction = model.predict(img,steps=1)

print ("Prediction Output")
print(singlePrediction)
print()

NumberElement = singlePrediction.argmax()
Element = np.amax(singlePrediction)

print ("Our Network has concluded that the image number '15' is a "
    +class_names[NumberElement])
print (str(int(Element*100)) + "% Confidence Level")
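Before we look at the results, here is the data augmentation idea mentioned earlier, sketched out with Keras's ImageDataGenerator. This is only a minimal, untested sketch of how you might bolt augmentation onto this CNN, not part of the program we actually ran, and the batch size and augmentation ranges are arbitrary choices:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# randomly rotate, shift, and zoom the training images on the fly
datagen = ImageDataGenerator(rotation_range=10,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             zoom_range=0.1)

# feed augmented batches to the model instead of the raw arrays
# (newer TensorFlow versions accept the generator directly in model.fit)
model.fit_generator(datagen.flow(train_images, train_labels, batch_size=32),
                    steps_per_epoch=len(train_images) // 32,
                    epochs=5)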
The results, round 2
This run on the Raspberry Pi 3B+ took about seven hours to complete. The results were as follows:

10,000 image Test accuracy: 0.8601
Prediction Output
[[5.9128129e-06 9.9997270e-01 1.5681641e-06 8.1393973e-06 1.5611777e-06
  7.0504888e-07 5.5174642e-06 2.2484977e-07 3.0045830e-06 5.6888598e-07]]

Our Network has concluded that the image number '15' is a Trouser

The key number here is the 10,000-image test accuracy. At 86 percent, it was actually lower than our previous, simpler machine-learning neural network (87 percent).

Why did this happen? This is probably a case related to "overfitting" the training data. A CNN model such as this can use complex internal models to train (many millions of possibilities) and can lead to overfitting, which means the trained network recognizes the training set better but loses the ability to recognize new test data.

Choosing the machine-learning neural network to work with your data is one of the major decisions you will make in your design. However, understanding activation functions, dropout management, and loss functions will also deeply affect the performance of your machine-learning program. Optimizing all these parameters at once is a difficult task that requires research and experience. Some of this is really rocket science!

Visualizing with MatPlotLib
Now that we have moved to a GUI-based development environment, we are going to run our base code again and do some analysis of the run using MatPlotLib. We are using a Raspberry Pi for these experiments, but you can use a Mac, PC, or another Linux system and basically do the same thing. If you can install TensorFlow, MatPlotLib, and Python on your computer system, you can do these experiments.

To install MatPlotLib on your Raspberry Pi, type pip3 install matplotlib.

We add the history variable to the output of the model.fit to collect data. And then we add MatPlotLib commands to graph the loss and the accuracy from our epochs and to add figure displays for our two individual image tests. Figure 3-6 shows the results of running this program.
Using nano (or your favorite text editor), open up a file called FMTensorFlowPlot.py, and enter the following code:

#import libraries
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import seaborn as sns

import tensorflow as tf
from tensorflow.python.framework import ops
from tensorflow.examples.tutorials.mnist import input_data
from PIL import Image

# Import Fashion MNIST
fashion_mnist = input_data.read_data_sets('input/data', one_hot=True)

fashion_mnist = tf.keras.datasets.fashion_mnist

(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

train_images = train_images / 255.0
test_images = test_images / 255.0

model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(28,28)))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dense(10, activation='softmax'))

model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(train_images, train_labels, epochs=2)
# Get training and test loss histories
training_loss = history.history['loss']
accuracy = history.history['acc']

# Create count of the number of epochs
epoch_count = range(1, len(training_loss) + 1)

# Visualize loss history
plt.figure(0)
plt.plot(epoch_count, training_loss, 'r--')
plt.plot(epoch_count, accuracy, 'b--')
plt.legend(['Training Loss', 'Accuracy'])
plt.xlabel('Epoch')
plt.ylabel('History')
plt.show(block=False)
plt.pause(0.001)

test_loss, test_acc = model.evaluate(test_images, test_labels)

#run test image from Fashion_MNIST data
img = test_images[15]
plt.figure(1)
plt.imshow(img)
plt.show(block=False)
plt.pause(0.001)

img = (np.expand_dims(img,0))

singlePrediction = model.predict(img,steps=1)
print ("Prediction Output")
print(singlePrediction)
print()

NumberElement = singlePrediction.argmax()
Element = np.amax(singlePrediction)

print ("Our Network has concluded that the image number '15' is a "
    +class_names[NumberElement])
print (str(int(Element*100)) + "% Confidence Level")
print('Test accuracy:', test_acc)

# read test dress image
imageName = "Dress28x28.JPG"
testImg = Image.open(imageName)
plt.figure(2)
plt.imshow(testImg)
plt.show(block=False)
plt.pause(0.001)

testImg.load()
data = np.asarray( testImg, dtype="float" )
data = tf.image.rgb_to_grayscale(data)
data = data/255.0
data = tf.transpose(data, perm=[2,0,1])

singlePrediction = model.predict(data,steps=1)

NumberElement = singlePrediction.argmax()
Element = np.amax(singlePrediction)

print(NumberElement)
print(Element)
print(singlePrediction)

print ("Our Network has concluded that the file '"
    +imageName+"' is a "+class_names[NumberElement])
print (str(int(Element*100)) + "% Confidence Level")

plt.show()

The results of running this program are shown in Figure 3-6. The window labeled Figure 0 shows the accuracy data for each of the five epochs of the machine-learning training, and you can see the accuracy slowly increases with each epoch. The window labeled Figure 1 shows the test picture used for the first recognition test (it found a pair of trousers, which is correct), and finally, the window labeled Figure 2 shows the dress picture, which is still incorrectly identified as a bag. Harumph.
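One small practical note: if you are running without a display attached, or simply want to keep the graphs, MatPlotLib can write a figure to a file instead of (or in addition to) opening a window. A minimal sketch, reusing the variables from the program above:

plt.figure(0)
plt.plot(epoch_count, training_loss, 'r--')
plt.plot(epoch_count, accuracy, 'b--')
plt.savefig("training_history.png")   # writes the figure to a PNG file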
FIGURE 3-6: Our Raspberry Pi GUI with MatPlotLib visualization.

Learning More Machine Learning
After seeing how useful Python is in building and experimenting with machine learning and neural networks, you can see how powerful it is. Even though you are just beginning to understand the theory behind a lot of the models you have used, you should feel that you now have a tremendous amount of ability to build and experiment with making machines learn.

Next step? We recommend the following books:

»» Machine Learning For Dummies, by John Paul Mueller and Luca Massaron

»» Deep Learning with Python, by Francois Chollet

»» And a great beginner's overview of the whole AI field: Artificial Intelligence For Dummies, by John Paul Mueller and Luca Massaron

Next, we explore using Python with some other AI applications.
Chapter 4
Exploring More AI in Python

IN THIS CHAPTER
»» Limitations on doing AI on your Raspberry Pi
»» Using the cloud to do AI
»» Using AI on graphics cards

After reading the previous three chapters, you have learned quite a bit about using some of the basics of artificial intelligence, specifically neural networks and machine learning. There is a lot more to AI than just these two things, though. We could look at advanced searching (not Google searching, but rather looking at big problem spaces and trying to figure out solutions to the problem using AI). We could also look at the whole problem of autonomous robotics (which we touch upon in Book 7), but this topic is very complicated.

Instead, in this chapter, we talk about other ways of doing AI software beyond the Raspberry Pi. Remember it took us seven hours to run five epochs of training on our large neural network? Sounds like we could use some bigger iron to accomplish more training in less time. That's what this chapter is about.

Limitations of the Raspberry Pi and AI
The Raspberry Pi is an inexpensive full-blown computing device. The Raspberry Pi 3B+, which we have used throughout this book, has the following major specifications:

»» CPU: Broadcom quad-core 64-bit processor @ 1.4GHz

»» GPU: Broadcom Videocore-IV
»» RAM: 1GB SDRAM

»» Networking: Gigabit Ethernet, 802.11b/g/n/ac WiFi

»» Storage: SD card

How does this stack up? For a $35 computer, very well. But for a dedicated AI computer, not so much.

The problems are not enough RAM (1GB isn't very much, especially for a Raspberry Pi to do AI) and not a very sophisticated GPU (graphics processing unit). Figure 4-1 shows the Raspberry Pi 3B+ processing chip.

FIGURE 4-1: The Raspberry Pi processing chip containing the Videocore-IV.

There are two mitigating circumstances that keep the Raspberry Pi in the running when it comes to doing and experimenting with AI. First, you can buy an AI accelerator that plugs into the USB ports of the Raspberry Pi, and second, you can use the Raspberry Pi to control processors and AI hardware up on the cloud for all the computationally heavy lifting.
THE BROADCOM VIDEOCORE-IV ON THE RASPBERRY PI 3B+
The Videocore-IV is a low-power mobile graphics processor. It is a two-dimensional DSP (digital signal processor) that is set up as basically a four-GPU core unit. These GPU core units are called slices and can be very roughly compared to GPU compute units, such as those used by AMD and Nvidia (which can have 256, 512, or more individual GPU units, far outclassing the Videocore's 4 units) to power their GPU cards, which are now proving to be very popular with AI researchers and hobbyists.

This processor is really designed to be used in video encoding and decoding applications and not so much for AI use. However, some researchers have made use of the four slices to accelerate neural-network processing on the Raspberry Pi, achieving up to about three times the performance of the four-core main processor used alone.

One of the main barriers to using the Videocore on the Raspberry Pi for AI types of applications is that the specifications, development tools, and product details have only been available under NDA (non-disclosure agreements), which does not go along with open-source development. However, you can now get full documentation and the complete source code for the Raspberry Pi 3B+ graphics stack under a very nonrestrictive BSD license, which should provide a path forward.

Remember from our previous chapter that the bulk of the computer time in building any kind of machine-learning AI system is for training, and that when that training is done, it doesn't take a lot of processing to actually characterize an unknown or new picture. This means you can train on one big machine and then deploy on a simpler computer, such as the Raspberry Pi, in the application. This doesn't work all the time (especially if you want to keep learning as the program runs), but if it does work, it allows you to deploy sophisticated machine-learning programs on much simpler and less expensive hardware.

Performing AI analysis or training on the small computers that are connected to a network is called edge computing or, phrased a different way, computing on the edge of the network.
Adding Hardware AI to the Raspberry Pi
It turns out that a number of companies have started to build specialized AI compute sticks, many of which can be used on the Raspberry Pi. Typically, you will find that there are Python libraries or wrappers, and often TensorFlow Python libraries, that support using these sticks. Two of the most interesting ones are

»» The Intel Movidius Neural Compute Stick (NCS): The Movidius NCS stick plugs into the USB port of the Raspberry Pi or other computer and provides hardware support for deep-learning-based analysis (refer to Chapters 1–3). For example, you can use the Amazon cloud to perform image analysis, processing, and classification up in the cloud from your small computer system, which moves your computationally expensive task from your Raspberry Pi to the cloud. This costs money and bandwidth (and adds latency to your system). Doing the analysis with your trained deep-learning neural network on the edge by using an NCS stick can help, and can possibly allow you to disconnect your device running on the edge of the network from the Internet entirely. It runs around 60X faster than doing image analysis on the Raspberry Pi and costs less than $100. Figure 4-2 shows the Movidius Compute Stick.

FIGURE 4-2: The Intel Movidius Neural Compute Stick 2.

You can do facial recognition, text analysis, monitoring, and maintenance using this NCS stick. Pretty cool!

There is one concept that we need to emphasize here, however. The NCS stick is used to perform analysis and to conduct inferences on data, but it is not used for training models! You still need to build and train the models. It has a good interface with Keras and TensorFlow, so this is possible to do in a reasonable fashion.
Think of it as an accelerator for use by your final project when the training is done.

»» The Google Edge TPU accelerator: The Google Edge TPU (tensor processing unit) has a USB Type-C socket that can be plugged into a Linux-based system to provide accelerated machine-learning analysis and inferences. Does the word tensor sound familiar? Tensors are matrices like the ones in our neural-network examples in Chapters 2 and 3. Figure 4-3 shows the Google Edge accelerator.

FIGURE 4-3: The Google Edge TPU accelerator.

Well, it turns out, much like the Intel NCS stick above, this device is all about executing trained machine-learning models. We still train the machine-learning networks using other techniques, and then we execute the model on the stick.

A FINAL COMMENT ON MACHINE-LEARNING ACCELERATORS
Oh, boy. In the next four years, this type of specialized hardware for running machine-learning models will explode. You will see multiple different architectures and solutions come from Google, Intel, Nvidia, AMD, Qualcomm, and a number of other smaller companies from around the world. Everybody is starting to climb on the AI accelerator hardware bandwagon.
AI in the Cloud
In the tech industry, everyone loves to use buzzwords such as the cloud. Often, the use of such language results in arbitrary and nebulous terms that leave consumers (or even sophisticated technical people) unsure what they actually mean.

When a company says "your data is in the cloud" or that "you can work in the cloud," this has nothing to do with being white, fluffy, or aboveground. Your "in the cloud" data is on the ground and is stored somewhere in a data center with a bunch of servers that are more similar to your PC or Mac than you may think.

Some people define the cloud as software or services that run on the Internet rather than on your local machine. This is correct to a degree, but nothing really runs on the Internet; it runs on machines that are connected to the Internet. Understanding that in-the-cloud software runs on servers and is not "just out there" tends to really quickly demystify the cloud and its functions.

If you have two computers networked together and use the other computer for a data server, you have your own "cloud." This goes for basic services like storing your data in the cloud, but there is much more than just storage available on the cloud, and that is where it gets really interesting.

The advantage of using the cloud is that you can use services and storage unavailable to you on your local network and (in one of the most important game changers of cloud computing) you can ramp your usage up and down depending on your computing needs on a dynamic basis.

Using the cloud requires Internet access. Not necessarily 100 percent of the time (you can fire off a cloud process and then come back to it later), but you do need connections some of the time. This limits the cloud in applications such as self-driving cars that aren't guaranteed to have good Internet access all the time. Interestingly, this "fire and forget" mode is useful for IOT (Internet of Things) devices where you don't want to stay connected to the net all the time for power considerations.

So, how do you use the cloud? That depends on the service and vendor, but in machine-learning applications, the most common way is to set up Python code on your local computer that calls cloud-based functions and applications. All cloud vendors provide examples.

What is a great consumer example of cloud usage? The Amazon Echo and Alexa. It listens to you, compresses the speech data, sends it to the Amazon AWS cloud, translates and interprets your data, and then sends back a verbal response or commands to make your lights come on.
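To make that concrete, here is a minimal sketch of calling one such service, Google's Cloud Vision API (which comes up again in the next section), from Python. It assumes you have a Google Cloud account, the google-cloud-vision package installed, and credentials already configured, and the exact API surface varies a little between library versions:

from google.cloud import vision

client = vision.ImageAnnotatorClient()

with open("street.jpg", "rb") as f:             # any local photo you want analyzed
    image = vision.Image(content=f.read())      # older releases use vision.types.Image

response = client.label_detection(image=image)  # the heavy lifting happens in Google's data center
for label in response.label_annotations:
    print(label.description, label.score)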
A number of cloud providers for storage and services exist, and more are arriving all the time. The top four cloud providers for AI at the time of this writing are

»» Google cloud

»» Amazon Web Services

»» IBM cloud

»» Microsoft Azure

Google cloud
The Google cloud is probably the most AI-focused cloud provider. You can gain access to TPUs (tensor processing units) in the cloud, which, like the Google TPU stick above, can accelerate your AI applications. Much of the Google cloud's functionality reflects the core skill set of the company: search. For example, the Cloud Vision API can detect objects, logos, and landmarks within images.

Some excellent students at the University of Idaho are building a Smart City application called ParkMyRide, which uses a Raspberry Pi–based solar-powered camera to take pictures of the street and determines street parking availability by using the Google Cloud Vision API. The software sends a picture of the street to Google and gets back the number of cars found and where they are in the picture. They then supply this information to a smartphone app, which displays it graphically. Pretty neat.

Other featured services of the Google cloud are video content search applications and speech-to-text/text-to-speech packages (think Google Home, very much like Amazon Alexa).

Like Amazon and Microsoft, Google is using its own AI-powered applications to create new services for customers to use.

Amazon Web Services
Amazon Web Services (AWS) is focused on taking their consumer AI expertise and supplying this expertise to businesses. Many of these cloud services are built on the consumer product versions, so as Alexa improves, for example, the cloud services also improve.

Amazon not only has text and natural language offerings, but also machine-learning visualization/creation tools, vision recognition, and analysis.
IBM cloud

The IBM cloud has gotten a bad rap over the past few years for being hard to use. One of the big reasons was that there were so many different options on so many different platforms that it was almost impossible to figure out where to start. In the past couple of years, it has gotten much better. IBM merged its three big divisions (IBM Bluemix cloud services, SoftLayer data services, and the Watson AI group) into one group under the Watson brand. There are still over 170 services available, so it is still hard to get going, but there is much better control and consistency over the process.

IBM's machine-learning environment is called Watson Studio and is used to build and train AI models in one integrated environment. IBM also provides huge searchable knowledge catalogs and has one of the better IOT (Internet of Things) management platforms available.

One of the cool things IBM offers is a service called Watson Personality Insights that predicts personality characteristics, needs, and values from written text. What would Watson Personality Insights make of the authors of this book? We will run the text of the finished book through Watson and report back to you on the Wiley blog. (A rough sketch of what calling this service from Python looks like appears at the end of this section.)

Microsoft Azure

Microsoft Azure has an emphasis on developers. Microsoft breaks down its AI offerings into three categories:

»» AI services
»» AI tools and frameworks
»» AI infrastructure

Similar to Amazon and Google, Microsoft's AI applications are built on consumer products that Microsoft has produced. Azure also has support for specialized FPGAs (field-programmable gate arrays: hardware that can be changed by programming) and has built out the infrastructure to support a wide variety of accelerators. Microsoft is one of the largest, if not the largest, customers of the Intel Movidius chips.

Microsoft has products for machine learning, IOT toolkits and management services, and a full and rich set of data services, including databases, support for GPUs and custom silicon AI infrastructure, and a container service that can turn your in-house applications into cloud apps.

Microsoft Azure is the one to watch for some pretty spectacular innovations.
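As promised, here is a rough sketch of what calling Watson Personality Insights from Python looked like with IBM's ibm-watson SDK. The API key, service URL, version date, and input file are placeholders, and IBM has since reorganized (and retired) some of these services, so treat this as an illustration of the call pattern rather than working production code.

# Rough sketch of calling Watson Personality Insights from Python.
# Assumes `pip install ibm-watson`; the API key, service URL, version date, and
# input file are placeholders. IBM has since retired this particular service,
# so this illustrates the call pattern rather than a guaranteed-working recipe.
from ibm_watson import PersonalityInsightsV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator("your-ibm-cloud-api-key")
service = PersonalityInsightsV3(version="2017-10-13", authenticator=authenticator)
service.set_service_url("https://api.us-south.personality-insights.watson.cloud.ibm.com")

with open("book_chapter.txt") as f:   # hypothetical text sample to analyze
    text = f.read()

profile = service.profile(text, accept="application/json",
                          content_type="text/plain").get_result()

# Print the Big Five personality traits and their percentile scores.
for trait in profile["personality"]:
    print(trait["name"], round(trait["percentile"], 2))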
AI on a Graphics Card

Graphics cards (see the Nvidia graphics chip in Figure 4-4) have been an integral part of the PC experience for decades. People often hunt for the latest and greatest graphics card to make their PCs better gaming machines. One thing becomes obvious after a while: Although CPU speed is important, the quality and architecture of the graphics card makes a bigger difference. Why? Because computing high-resolution graphics is computationally expensive, and the way to solve that is to build graphics cards around processors designed specifically for graphics work so they can share the burden. Thus was born the GPU (graphics processing unit), a specialized computer core that is designed to work with graphics.

FIGURE 4-4: Nvidia 256-core GPU chip.

Nvidia and others started building graphics cards that had multiple GPUs on them, which dramatically improved video resolution and frame rates in games. One thing to remember is that graphics algorithms are constructed using data structures called matrices (or tensors) that are processed in pipelines. Wait. Tensors? Matrices? Those sound suspiciously like the kind of data structures we use in AI and machine learning. Because of the way machine learning and deep learning are done and implemented, GPUs have proven to be useful and effective. Deep learning relies on a number of different types of neural networks (see Chapter 2), and we train and use these networks by using tensors.

Regardless of the type of neural network used, all the techniques rely on performing complex statistical operations. During the training (learning) operations, a multitude of images or data points are fed to the network and then trained
with the correct classification or correct answer. You correlate millions of tensors (matrices) to build a model that will get the right result. To speed up the training, these operations can be done in parallel, which turns out to be a very good use of the GPUs on a graphics board.

An individual GPU core is much simpler than a CPU core because it is designed for a specific, rather than general, purpose. This makes it cheaper to build multicore GPU chips than to build multicore CPU chips. The proliferation of graphics cards with many GPU cores has made these computers perfect for machine-learning applications. The combination of a powerful multicore CPU and many GPUs can dramatically accelerate machine-learning programs. TensorFlow in particular has versions of the software that are designed to work with GPU boards, removing a lot of the complication of using these boards.

To put it in perspective, our Raspberry Pi 3B+ has 4 processor cores and, in some sense, 4 GPU cores. One of the latest GPU boards from Nvidia has 3,584 cores. You can do a lot of fast training and execution of machine-learning networks using these large-core-count GPU boards.

The GPU-based boards are not the last step in this evolution of specialized computers and hardware to support AI applications. Even more specialized chips are starting to appear. At last count, there are over 50 companies working on chips that will accelerate AI functions. When we discussed the Microsoft Azure cloud offering earlier, we mentioned that Microsoft has built out infrastructure to support AI acceleration hardware in the cloud. This is one of the big reasons to watch what Microsoft is doing. The future is in more and more specialized hardware, especially as specialized hardware gets easier and easier to deal with from the user software side.
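If you want to see whether your own TensorFlow installation can take advantage of a GPU, a quick check like the one below works with the TensorFlow 2.x API (older 1.x releases use slightly different function names). TensorFlow places operations on the GPU automatically when one is available and falls back to the CPU otherwise.

# Quick check of whether TensorFlow can see a GPU (TensorFlow 2.x API).
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
print("GPUs visible to TensorFlow:", len(gpus))
for gpu in gpus:
    print(" ", gpu)

# A small matrix multiply; TensorFlow runs it on the GPU when one is available,
# otherwise it quietly falls back to the CPU.
a = tf.random.normal((1000, 1000))
b = tf.random.normal((1000, 1000))
c = tf.matmul(a, b)
print("Result computed on:", c.device)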
Where to Go for More AI Fun in Python

If you are interested in furthering your knowledge and abilities in machine learning and AI, check out the following sources for project inspiration. The important thing is to actually build programs and modify other people's programs to really learn the technology from experience.

»» “Is Santa Claus Real?,” Varun Vohra, https://towardsdatascience.com/is-santa-claus-real-9b7b9839776c
»» “Keras and deep learning on the Raspberry Pi,” Adrian Rosebrock, https://www.pyimagesearch.com/2017/12/18/keras-deep-learning-raspberry-pi/
»» “How to easily Detect Objects with Deep Learning on Raspberry Pi,” Sarthak Jain, https://medium.com/nanonets/how-to-easily-detect-objects-with-deep-learning-on-raspberrypi-225f29635c74
»» “Building a Cat Detector using Convolutional Neural Network,” Venelin Valkov, https://medium.com/@curiousily/tensorflow-for-hackers-part-iii-convolutional-neural-networks-c077618e590b
»» “Real time Image Classifier on Raspberry Pi Using Inception Framework,” Bapi Reddy, https://medium.com/@bapireddy/real-time-image-classifier-on-raspberry-pi-using-inception-framework-faccfa150909
BOOK 5: Doing Data Science with Python
Contents at a Glance

CHAPTER 1: The Five Areas of Data Science . . . . . . . . . . . . . . . . . . . . 429
    Working with Big, Big Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
    Cooking with Gas: The Five-Step Process of Data Science . . . . . . . . 432
CHAPTER 2: Exploring Big Data with Python . . . . . . . . . . . . . . . . . . . 437
    Doing Your First Data Science Project . . . . . . . . . . . . . . . . . . . . . . . . . 440
CHAPTER 3: Using Big Data from the Google Cloud . . . . . . . . . . . . . 451
    What Is Big Data? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
    Understanding the Google Cloud and BigQuery . . . . . . . . . . . . . . . . 452
    Reading the Medicare Big Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454
    Looking for the Most Polluted City in the World on an Hourly Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466
IN THIS CHAPTER
»» What is data science?
»» What is big data?
»» What are the five steps of data science?

Chapter 1
The Five Areas of Data Science

Data science impacts our modern lives in far more ways than you may think. When you use Google or Bing or DuckDuckGo, you are using a very sophisticated application of data science. The suggestions for other search terms that come up when you are typing? Those come from data science. Medical diagnoses and interpretations of images and symptoms are examples of data science. Doctors rely on data science interpretations more and more these days.

As with most of the topics in this book, data science looks intimidating to the uninitiated. Inferences, data graphs, and statistics, oh my! However, just as in our previous chapters on artificial intelligence, if you dig in and look at some examples, you can really get a handle on what data science is and what it isn't.

In this chapter we cover just enough statistics and “asking questions of data” to get you going and get some simple results. The purpose is to introduce you to the use of Python in data science and talk about just enough theory to get you started. If nothing else, we want to leave you with the process of data science and give you a higher level of understanding of what is behind some of the talking heads on television and the various press releases that come from universities. These people are always citing results that come from big data analysis and are often
overstating what they actually mean. An example of this is when one study says coffee is bad for you and the next month a study comes out saying coffee is good for you, and sometimes the studies are based on the same data! Determining what your results mean, beyond simple interpretations, is where the really hard parts of data science and statistics meet, and those questions are worthy of a book all their own. At the end of our data science journey, you will know more about the processes involved in answering some of these questions.

There is a mystery to data science, but with just a little knowledge and a little Python, we can penetrate the veil and do some data science. Python and the myriad tools and libraries available can make data science much more accessible. One thing to remember is that most scientists (including data scientists) are not necessarily experts in computer science. They like to use tools that simplify the coding and allow them to focus on getting the answers and performing the analysis of the data they want.

Working with Big, Big Data

The media likes to throw around the notion of “big data” and how people can get insights into consumer (and your) behavior from it. Big data is a term used to refer to large and complex datasets that are too large for traditional data-processing software (read: databases, spreadsheets, and traditional statistics packages like SPSS) to handle. The industry talks about big data using three different concepts, called the “three V's”: volume, variety, and velocity.

Volume

Volume refers to how big the dataset is that we are considering. It can be really, really big: almost hard-to-believe big. For example, Facebook has more users than the population of China. There are over 250 billion images on Facebook and 2.5 trillion posts. That is a lot of data. A really big amount of data.

And what about the upcoming world of IOT (Internet of Things)? Gartner, one of the world's leading analysis companies, estimates 22 billion devices by 2022. That is 22 billion devices producing thousands of pieces of data. Imagine that you are sampling the temperature in your kitchen once a minute for a year. That is over half a million data points. Add the humidity to the measurements and now you have 1 million data points. Multiply that by five rooms and a garage, all with
temperature and humidity measurements, and your house is producing 6 million pieces of data from just one little IOT device per room. It gets crazy very quickly. (A quick Python sanity check of these numbers appears at the end of this section.)

And look at your smartphone. Imagine how many pieces of data it produces in a day. Location, usage, power levels, and cellphone connectivity spew out of your phone into databases, your apps, and application dashboards like Blynk constantly. Sometimes (as we just recently found out from cellphone companies) location information is being collected and sold even without your consent or opt-in. Data, data, and more data. Data science is how we make use of this.

Variety

Note that photos are very different data types from temperature and humidity or location information. Sometimes they go together and sometimes they don't. Photos (as we discovered in Book 4, “Using Artificial Intelligence in Python”) are very sophisticated data structures that are hard to interpret and hard to get machines to classify. Throw audio recordings in on top of that and you have a rather varied set of data types.

Let's talk about voice for a minute. In Book 4, I talked about Alexa being very good at translating voice to text but not so good at assigning meaning to the text. One reason is the lack of context, but another reason is the many different ways that people ask for things, make comments, and so on. Imagine, then, Alexa (and Amazon) keeping track of all the queries and then doing data science on them to find out the sorts of things that people are asking for and the variety of ways they ask for them. That is a lot of data and a lot of information that can be gathered, not just for nefarious reasons, but to build a system that better serves the consumer. It goes both ways.

Data science has a much better chance of identifying patterns if the voice has been translated to text. It is much easier. However, in this translation you do lose a lot of information about tone of voice, emphasis, and so on.

Velocity

Velocity refers to how fast the data is changing and how fast it is being added to the data piles. Facebook users upload about 1 billion pictures a day, so in the next couple of years Facebook will have over 1 trillion images. Facebook is a high-velocity dataset. A low-velocity dataset (not changing at all) may be the set of temperature and humidity readings from your house over the last five years. Needless to say, high-velocity datasets take different techniques than low-velocity datasets.
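As promised, here is a quick back-of-the-envelope check of the home-sensor volume numbers from the Volume section, in a few lines of Python:

# Back-of-the-envelope check of the home-sensor volume estimate.
minutes_per_year = 60 * 24 * 365           # about 525,600 samples per sensor per year
readings_per_room = minutes_per_year * 2   # temperature plus humidity
rooms = 6                                  # five rooms plus a garage

total_readings = readings_per_room * rooms
print(minutes_per_year, "temperature samples per room per year")
print(total_readings, "readings per house per year")   # roughly 6.3 million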
THE DIFFERENCE BETWEEN DATA SCIENCE AND DATA ANALYTICS

In a real sense, data analytics is a subset of data science, specifically steps 3 through 5 in our data science list. (See “Cooking with Gas: The Five-Step Process of Data Science.”) There are a number of people who still like to differentiate between these two types of scientists, but the difference becomes less and less noticeable as time goes on. More and more techniques are being developed to do data analysis on big data (not surprisingly named “big data analytics”).

Currently, data science generally refers to the process of working out insights from large datasets of unstructured data. This means using predictive analytics, statistics, and machine learning to wade through the mass of data. Data analytics primarily focuses on using and creating statistical analysis for existing sets of data to achieve insights on that data. With these somewhat vague descriptions, you can see how the two areas are moving closer and closer together. At the risk of ridicule from my fellow academics, I would definitely call data analytics a subset of data science.

Managing volume, variety, and velocity

This is a very complex topic. Data scientists have developed many methods for processing data with variations of the three V's. The three V's describe the dataset and give you an idea of the parameters of your particular set of data. The process of gaining insights from data is called data analytics. In the next chapters, we focus on gaining knowledge about analytics and on learning how to ask some data analytics questions using Python. After doing data science for a few years, you will be VVVery good at managing these.

Cooking with Gas: The Five-Step Process of Data Science

We can generally break down the process of doing science on data (especially big data) into five steps. I'll finish out this introductory chapter by talking about each of these steps to give us a handle on the flow of the data science process and a feel for the complexity of the tasks. These steps are
1. Capture the data
2. Process the data
3. Analyze the data
4. Communicate the results
5. Maintain the data

Capturing the data

To have something to do analysis on, you have to capture some data. In any real-world situation, you probably have a number of potential sources of data. Inventory them and decide what to include. Knowing what to include requires you to have carefully defined what your business terms are and what your goals are for the upcoming analysis. Sometimes your goals are vague; you may “just want to see what you can get” out of the data. If you can, integrate your data sources so it is easy to get to the information you need to find insights and build all those nifty reports you just can't wait to show off to the management.

Processing the data

In my humble opinion, this is the part of data science that should be easy, but it almost never is. I've seen data scientists spend months massaging their data so they can process and trust it. You need to identify anomalies and outliers, eliminate duplicates, remove missing entries, and figure out what data is inconsistent. And all this has to be done appropriately so as not to take out data that is important to your upcoming analysis work. It's not easy to do in many cases. If you have house room temperatures of 170 degrees C, it is easy to see that this data is wrong and inconsistent. (Well, unless your house is burning down.) Cleaning and processing your data needs to be done carefully, or else you will bias, and maybe destroy, the ability to do good inferences or get good answers down the line. In the real world, expect to spend a lot of time doing this step. (A small pandas sketch of this kind of cleanup follows at the end of this section.)

Oh, and one more cleaning issue to worry about, budding data scientist: consumers are giving more and more false and misleading data online. According to Marketing Week in 2015, 60 percent of consumers provide intentionally incorrect information when submitting data online. We humbly admit to doing this all the time on online marketing forms and even to political pollsters, especially when we sense a political agenda in the questions. Bad boys we are.
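As a small taste of what this processing step looks like in Python, the sketch below uses the pandas library to drop duplicates, discard rows with missing readings, and flag physically implausible temperatures. The CSV file name and column names are made up for illustration; real cleanup is almost always messier than this.

# Minimal sketch of the data-processing step using pandas.
# The CSV file name and column names are made up for illustration.
import pandas as pd

df = pd.read_csv("room_readings.csv")   # hypothetical sensor log
print("Raw rows:", len(df))

df = df.drop_duplicates()                                  # remove exact duplicate records
df = df.dropna(subset=["temperature_c", "humidity_pct"])   # drop rows with missing readings

# Flag physically implausible values instead of silently deleting them,
# so you can decide how to handle them without biasing the analysis.
suspect = df[(df["temperature_c"] < -40) | (df["temperature_c"] > 60)]
print("Suspect temperature rows:", len(suspect))

clean = df.drop(suspect.index)
print("Clean rows:", len(clean))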