Jessica McKellar
That ended up being a good investment for Twisted and for me, as I went on to contribute many more patches, become a core maintainer and write a book about the library.
The enduring open source lesson from Twisted has thus been about the importance of establishing a culture that welcomes new contributors. This is both because it is the right thing to do and because attracting and retaining a diverse contributor base is critical for sustaining a large open source project, on which many people and companies depend.
Driscoll: Can you tell me more about Pilot, the company that you founded?
McKellar: Pilot is a bookkeeping firm (http://pilot.com). Unlike existing bookkeeping services, we are using software to automate the heavy lifting and a small team of pros to handle the rest. This results in books that are more accurate (less work and worry for you) and cheaper. It has been a delight to build this company on Python 3!
Driscoll: Thank you, Jessica McKellar.
11 Tarek Ziadé
Tarek Ziadé is a French Python developer and author. Past roles have included R&D developer at Nuxeo and software engineer at Mozilla. Today Tarek is a staff application engineer at Mozilla, where he creates tools for developers. He has written several Python books, in both English and French, including Expert Python Programming and Python Microservices Development. Tarek is the founder of Afpy, a French Python user group, and has delivered talks at both PyCon and EuroPython. He regularly contributes to open source Python projects.
Discussion themes: AI, v2.7/v3.x, Afpy.
Catch up with Tarek Ziadé here: @tarek_ziade
Mike Driscoll: Why did you become a programmer?
Tarek Ziadé: In hindsight, I became a programmer for two reasons: to become the god of my little world and to impress my mom, who is a programmer as well.
When I was six years old, I was at a fair with my mom. There was a giant paper sheet on the floor, with a turtle that had a pen. You could program the turtle with cards to tell it where to go and when to put the pen down on the paper. I was obsessed with that turtle. The feeling of planning what would happen felt so good.
Years later, my mom got us a serious computer (the Thomson TO8D), and I could program in BASIC and Assembly. I built incredible things. With my mom's help, I was driving robots.
Driscoll: What sort of things did you do with the robots with your mom?
Ziadé: Well, the computer we bought had a programmable serial port and extensions to get a parallel port, which was quite rare back then. We were driving stepper motors in BASIC or Assembly (with a cartridge), since the ports could be directly addressed. This was nothing fancy, but as a kid, being able to do at home something similar to what we were doing with that turtle amazed me.
My mom also got one of those fancy Olivetti laptops, with a small dot-matrix printer that could print stuff in three colors. We were having fun printing fractals. My mom was doing the heavy lifting (as a math teacher) and I was just tweaking the colors.
Driscoll: So how did you come across Python?
Ziadé: When I started coding professionally in the nineties, I was using Borland tools (C++Builder and Delphi) which could use VCL components. My company bought some VCL components, but we were highly frustrated by the poor support from their authors and some bugs.
That's when I discovered the Indy Project, which was developing and releasing open source VCL components that provided most network protocols. That library was to us what Requests is to Python today.
I got intrigued by this open source concept. Communities built around open source projects struck me as the way to go in software computing. Through my online research, I found out about the Zope Project and eventually discovered Python through that. A few months later, I joined a company that was building a Zope CMS.
Driscoll: Have you done anything with robots using Python?
Ziadé: Not really. I hacked a bit on a Raspberry Pi when I first got one. I also hacked a Wireless Ghetto Blaster using a suitcase, some old car speakers and a Raspberry Pi, with a Wi-Fi dongle and Mopidy.
I looked at the OpenCV library through Python to do some image processing. Most of the other electronics projects that I worked on were on Arduino and its pseudo-C language. My most advanced project was a small RC car and that was about it. I got a little bit bored after that.
Driscoll: Python is big in AI and machine learning at the moment. What do you think makes Python so popular?
Ziadé: I think that Python has become popular for AI because the SciPy community has built some state-of-the-art frameworks and libs in the past few years (pandas, scikit-learn, IPython/Jupyter) that lower the bar for scientists to use Python instead of R or other tools.
AI and machine learning innovation is spearheaded by academics. Since Python has steadily grown as one of the main languages for learning programming in academia, Python becomes a natural fit for them.
Driscoll: What did you personally like about Python?
Ziadé: I fell in love with Python and its community. Python is open source, versatile, and powerful, yet simple to code. Coming from a C++ and Delphi background, at first I thought that Python was a weak scripting language that could not be used to build serious applications.
Eventually, I became impressed by how simple it was to create Python programs that were concise and straightforward to understand. C++ and Delphi looked over-engineered at that point for all of the network applications that I was building. I could just write Python scripts that followed the KISS principle and build serious web applications that way.
Driscoll: What would you say are Python's strengths and weaknesses as a language?
Ziadé: Today, with over a decade of Python programming behind me, I think that Python's biggest strength is how visionary Guido van Rossum and the Python-Dev team are. As far as I can tell, every decision that was made in the last 20 years was a good one.
From the moratorium (a CPython freeze designed so that other implementations like PyPy and Jython could catch up), to how asynchronous features were gradually added, Python got modernized in the right direction. Each time that Python was getting a little bit behind compared to other languages, another feature would be added. Unlike some other languages that had a stellar start, then faded, Python is steadily getting bigger every year.
One weakness for Python is the standard library. The fact that a package added in the stdlib is rarely removed is an issue. For instance, the stdlib currently has two classes named Future that are slightly different. One is in asyncio and one is in concurrent.futures. I wish Python had a better story for its stdlib.
I think the biggest weakness of Python is the Python 2 versus Python 3 never-ending debate. That issue drove away some developers, because of the uncertainty about which version to use. It looks like we're getting past that debate now, which is great.
Driscoll: What is your opinion on the long life of Python 2.7?
Ziadé: I think that the transition took a while but it is happening transparently now and it is a success. The Python 2 versus 3 days are over, since the Python 3 ecosystem is now mature enough for most projects.
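As a brief aside on the standard library point above, the two Future classes can be seen side by side in a few lines. This is a minimal sketch rather than anything from the interview: the helper name blocking_square is made up for illustration, and asyncio.wrap_future is used only to show that the standard library needs an explicit bridge between the two classes.

```python
import asyncio
import concurrent.futures


def blocking_square(n: int) -> int:
    # Hypothetical stand-in for blocking work done in a thread.
    return n * n


async def main() -> dict:
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        # submit() returns a concurrent.futures.Future.
        thread_future = pool.submit(blocking_square, 7)
        # wrap_future() returns an asyncio.Future tied to the running loop,
        # which can be awaited from a coroutine.
        loop_future = asyncio.wrap_future(thread_future)
        result = await loop_future
    return {
        "same_class": asyncio.Future is concurrent.futures.Future,
        "thread_is_concurrent": isinstance(thread_future, concurrent.futures.Future),
        "thread_is_asyncio": isinstance(thread_future, asyncio.Future),
        "loop_is_asyncio": isinstance(loop_future, asyncio.Future),
        "result": result,
    }


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The concurrent.futures version is meant for threads and processes and has a blocking result() method, while the asyncio version is designed to be awaited inside an event loop, which is why the two are not interchangeable.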
To my knowledge, there are no major libs or frameworks that are still lacking Python 3 support. So there's no good reason to start a new project using Python 2.7. People just use Python and for most of them it will happen to be Python 3. One day Python 2.7 will cease to exist and nobody will really miss it.
Driscoll: How did you end up becoming an author of Python books?
Ziadé: When I started programming in Zope and Python, I was the creator and maintainer of a French forum called Zopeur. I was spending a lot of time answering all of the questions.
Zopeur started as a one-man project, so if I stopped answering questions, then nobody else answered them. I was also learning so much by actually searching for answers and by diving into the details.
The first book that I wrote about Python came about because I wanted to dive deeply into Python and make my work useful to others. I was filling a gap too, since there were no original books in French about Python.
Driscoll: What have you learned in the writing process?
Ziadé: Writing a book is a long and exhausting project. The first book took me nine months and was very painful to finish. It's easy to quit. It's also common to get lost in details and forget about the big picture. I've learned how to organize my thoughts and keep the big picture in my head.
When I wrote my first book in English, I also learned the hard way that it's difficult to write in a non-native language. You need to keep your sentences as straightforward and short as possible. I was also exposed to a larger community of readers, for better or worse.
The last thing to note about writing books is that you need to accept that your book will never be perfect. By the time you have finished writing, and you have read back through the first chapters, you will want to rewrite things all over again.
Driscoll: Have you learned anything from your readers? If so, what?
Ziadé: I've learned a lot from feedback from my readers. I still get a few emails from readers wanting to share their thoughts. Sometimes readers want to point out some mistakes or share solutions that they think are better.
I have received a few interesting threads that I wish had been available to me before my books were published. I think books that are published on the web in real time, allowing readers to send feedback as the writer delivers chapters, are superior for that reason.
Driscoll: Are you aware of any other books about Python that have come out in French since yours were published?
Ziadé: To be fair, there was a Zope book before mine. But as far as I know, mine was the first book entirely dedicated to Python which was written in French, by a native speaker. Since then, there have been dozens of books written in French about Python. I am the old guard now.
Driscoll: Why did you found the French Python User Group, Afpy?
Ziadé: As I mentioned earlier, I was maintaining a Zope/Python forum called Zopeur. At some point I had the idea of having a meeting in real life in Paris, with a dozen of the active members. We met for beers and we set up a foundation around Python. After that, I shut down my forum and we built Afpy from the ground up.
Driscoll: What challenges did you face then and are there any challenges currently?
Ziadé: The first few years of running Afpy were great. We were all good friends united around our passion for Python.
The first challenge that we met was how to integrate French companies that wanted to be part of Afpy. That took us a few years, because enterprises wanted to use our foundation as a tool to promote their business (sometimes aggressively). We were risking losing the original spirit of Afpy.
We were also a bit paranoid about what would happen if several developers from the same company were elected to the steering committee. But when we started to organize PyCon France, it became a natural fit for those companies to be sponsors. In hindsight, I think that we did the right thing by being protective.
Another challenge was trying to have more diversity in Afpy. We were mostly men and I wanted to make our foundation more welcoming to women. I did some work around that and found that diversity was a very controversial topic. Eventually, I got burnt out from politics and the work was not fun anymore.
I was Afpy president for seven years, so I felt that it was the right time to move on. I am not sure what the current status of Afpy is, since I'm not involved. Afpy still looks like a vibrant user group though, which is great.
Driscoll: What made you choose Zope over some of the other alternatives?
Ziadé: The standard was PHP-powered frameworks back then, but Zope was the cool stuff. Zope was very innovative and with Python it was more than web pages.
Plone was starting to take off and get very popular in France. Companies that specialized in building a CMS for government agencies often used Plone, because it had most features already built in. Plone, at one point, was at the top of the game for accessibility and groupware features.
Driscoll: Which Python web framework do you use now and why do you use it?
Ziadé: At Mozilla we do a lot of Django and Flask, and a bit of Pyramid. Occasionally we use some Twisted and Tornado. Since we're now shipping most stuff in Docker images, developers that start new projects are not tied to specific Python versions anymore. So asynchronous frameworks are starting to get used.
When I can pick my framework of choice, I like to use Bottle for very simple web services and Flask for bigger projects that need a bit of UI. There are a large number of Flask libraries out there. That said, the next server-side project that I will start will be built on aiohttp, that's for sure.
Driscoll: Are you working on any open source projects yourself that you would like to talk about?
Ziadé: I work on several projects, but a project that I am obsessed with right now is molotov (http://molotov.readthedocs.io/). It's a small load testing tool, based on Python 3.5+ and the aiohttp client, that we're using to test our web services.
The design focuses on making it as straightforward as possible for developers to write a load test, by describing a scenario using simple Python coroutines. Once we have a set of those functions, they are used to run simple smoke tests, load tests and distributed load tests.
Thanks to asyncio and aiohttp, the tool can send a pretty amazing load on our services and we're able to break most services with a single molotov client. I am adding some CI helpers on top of this tool, so we can continuously test the performance of our service.
One extension that I am going to add this quarter is the ability to deploy a stack with Docker images on AWS. The deployment will happen before the load test runs, and metrics will be collected once it's done. We also have a bigger project called Ardere, that drives AWS ECS for doing distributed tests. You can follow all of the work on those tools at https://github.com/loads.
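The idea of describing a load test as plain coroutines can be illustrated with the standard library alone. The sketch below is not molotov's API: fake_request, scenario_homepage, worker and run_load are hypothetical names, and the simulated request simply sleeps where a real test would call an HTTP client such as aiohttp.

```python
import asyncio


async def fake_request(latency: float) -> int:
    # Hypothetical stand-in for an HTTP call; it only simulates network latency.
    await asyncio.sleep(latency)
    return 200


async def scenario_homepage() -> None:
    # A scenario is just a coroutine that makes requests and asserts on them.
    status = await fake_request(0.01)
    assert status == 200


async def worker(scenario, iterations: int, results: dict) -> None:
    # Each worker runs the scenario repeatedly and records the outcome.
    for _ in range(iterations):
        try:
            await scenario()
            results["ok"] += 1
        except AssertionError:
            results["failed"] += 1


async def run_load(scenario, workers: int = 5, iterations: int = 4) -> dict:
    # All workers share one event loop, so many requests are in flight at once.
    results = {"ok": 0, "failed": 0}
    await asyncio.gather(
        *(worker(scenario, iterations, results) for _ in range(workers))
    )
    return results


if __name__ == "__main__":
    print(asyncio.run(run_load(scenario_homepage)))
```

Because the workers are coroutines rather than threads or processes, a single client can keep a large number of requests pending cheaply, which is one plausible reason a tool built this way can generate heavy load from one machine.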
Driscoll: What are you most excited about in Python today?
Ziadé: Asynchronous programming. The addition of async/await in the language and projects like aiohttp are truly putting Python back into the game of building network apps.
Of course, we have been able to do that with Twisted for over a decade, but now it's part of the core and implemented in a beautiful way. It's as easy as in Node.js to build async web apps in Python.
Driscoll: What changes would you like to see in future Python releases?
Ziadé: I'd love to see PyPy on a par with CPython (maybe we should have yet another moratorium so that PyPy catches up) and have the ability to run any of my projects with it (including C extensions).
More anecdotally, I would love to see setup.py killed in our packaging system. It's the source of many issues. I've tried and failed (see PEP 390), but maybe one day it will happen.
Driscoll: Thank you, Tarek Ziadé.
12 Sebastian Raschka
Sebastian Raschka received his doctorate in Quantitative Biology and Biochemistry and Molecular Biology in 2017, from Michigan State University. His research activities included the development of new deep learning architectures to solve problems in the field of biometrics. Sebastian is the bestselling author of Python Machine Learning, which received the ACM Best of Computing award in 2016. He contributes to many open source projects including scikit-learn. Methods that Sebastian implemented are being used in real-world machine learning applications such as Kaggle. He is passionate about helping people to develop data-driven solutions.
Discussion themes: Python for AI/machine learning, v2.7/v3.x.
Catch up with Sebastian Raschka here: @rasbt
Mike Driscoll: Could you give a little background information about yourself?
Sebastian Raschka: Of course! My name probably already gives it away, but I was born and raised in Germany, where I lived for more than two decades, until I had the urge to go on an adventure and study in the US.
I received my undergraduate degree from Heinrich-Heine University in Düsseldorf. I remember one day walking to the cafeteria and stumbling upon a flyer regarding a study abroad program with Michigan State University (MSU).
I was super intrigued and thought that this might be a worthwhile experience. So not long after that, I studied for two semesters at MSU and received a Bachelor Plus/International degree.
During those two semesters, I made many friends at MSU and thought that the scientific environment would provide an excellent opportunity for me to grow as a scientist, which is why I applied for grad school at MSU. I should say that this chapter of my life came with a happy ending, as I obtained my Ph.D. in December 2017. So that's my academic career.
During my time as a graduate student, I got heavily involved in open source in the context of data science and machine learning. Also, I am a passionate blogger and writer. Some people may have stumbled upon my book, Python Machine Learning, which was very well-received by both people from academia and the industry.
With my book, I tried to bridge the gap between purely practical (that is, coding) books and purely theoretical (that is, math-heavy) works. Based on all of the feedback that I received, Python Machine Learning turned out to be super useful to a broad audience. The book was translated into seven languages and is currently used as a textbook at the Loyola University Chicago, the University of Oxford, and many others.
Driscoll: Do you contribute to any open source projects?
Raschka: Yes, besides my writings, I am contributing to open source projects such as scikit-learn, TensorFlow and PyTorch. I also have my own little open source projects that I work on in my free time, including mlxtend and BioPandas.
mlxtend is a Python library with useful tools for the day-to-day data science tasks. It aims to fill the gap in the Python data science ecosystem, by providing tools that are not yet available in other packages. For example, the stacking classifiers and regressors, as well as the sequential feature selection algorithms, are very popular in the Kaggle community.
In addition, the frequent pattern mining algorithms, including Apriori and algorithms for deriving association rules, are super handy. Most recently, I added a lot of non-parametric functions for evaluating machine learning classifiers, ranging from bootstrapping to McNemar's test.
The BioPandas project arose from the need to work with molecular structures from different file formats more conveniently. During my Ph.D., many projects involved working with protein structures, or structures of small (drug-like) molecules.
There are many tools out there for that, but each has its own little sublanguage. To stay most productive, I didn't want to learn a whole new API for each little side project.
The idea behind BioPandas is to parse structural files into pandas DataFrames, a library and format that most data scientists are already familiar with. Once the structures are in a DataFrame format, we can use all of the power of pandas that is at our disposal, including its super flexible selection syntax.
A virtual screening tool that I recently developed, screenlamp, makes heavy use of BioPandas as its core engine. I could screen databases with more than 12 million molecules efficiently, which led to the successful discovery of potent G protein-coupled receptor signaling inhibitors, with applications to aquatic invasive species control, in collaboration with experimental biologists at MSU.
Besides all of my involvement in computational biology, one of my other passion projects involves semi-adversarial networks. Semi-adversarial networks are a deep learning architecture that I developed with my collaborators in the iPRoBe Lab at MSU, which we successfully applied in the context of privacy concerns in the field of biometrics.
In particular, we applied this architecture to perturb face images in such a way that they looked almost identical to the original input images, while soft biometric attributes, such as gender, were inaccessible to gender predictors. The overall goal is to prevent nasty things like profiling, based on soft biometric attributes, without a user's consent.
Driscoll: So why did you become a programmer?
Raschka: I would say that the primary driving factor for becoming a programmer was to be able to implement my 'crazy' research ideas.
In computational biology, we already have many tools at our disposal that we can use without the need to program ourselves. However, using existing tools (depending on the research task) can also be a bit limiting. If we want to try something new, especially if we want to develop new methods, then there is no way around learning how to program.
Like most people, I started with simple Bash scripting in a Linux shell. At some point, I realized that this wasn't quite enough, or not efficient enough. During my undergraduate studies in Germany, I took a bioinformatics class in Perl. When I saw what was possible with Perl, this was quite an eye-opening experience.
Later, when I was conducting statistical analyses and preparing data visualizations based on the data that I collected, I also got into R. Not long after that, I got into Python.
Driscoll: Why Python?
Raschka: Well, I mentioned that I started with Perl and R. However, one thing that most programmers have in common is that we consult the internet on a regular basis to look for useful pointers, and other tips and tricks for achieving certain subtasks.
Suffice it to say, I stumbled upon many different resources that were written in Python and I thought that it would be worthwhile learning this language. At some point, I moved away from Perl entirely and did all of my coding in Python: custom scripts for data collection, parsing and analysis.
I also have to mention that I did all of the statistical analyses and plotting in R. Actually, not too long ago, when I was revisiting an old project, I stumbled upon my old Frankenstein-esque scripts (Bash scripts and makefiles), which were running Python and R in tandem.
Now, back in 2012, when the scientific computing stack was growing quickly, I stumbled upon NumPy, SciPy, matplotlib and scikit-learn. I realized that everything that I did in R, I could also do in Python. I could avoid switching back and forth between languages in my projects.
Looking back, picking up Python was probably one of the best decisions that I made. Without Python, it wouldn't have been possible for me to be so productive. But besides research and work, I really enjoy being part of and interacting with the vivid Python community. Whether I am interacting with people via Twitter, or meeting people at conferences like PyData and SciPy, it's always a fun experience.
Driscoll: Python is one of the languages that is being used in AI and machine learning right now. Could you explain what makes it so popular?
Raschka: I think there are two main reasons, which are very related. The first reason is that Python is super easy to read and learn.
I would argue that most people working in machine learning and AI want to focus on trying out their ideas in the most convenient way possible. The focus is on research and applications, and programming is just a tool to get you there. The more comfortable a programming language is to learn, the lower the entry barrier is for more math and stats-oriented people.
Python is also super readable, which helps with keeping up-to-date with the status quo in machine learning and AI, for example, when reading through code implementations of algorithms and ideas. Trying new ideas in AI and machine learning often requires implementing relatively sophisticated algorithms and the more transparent the language, the easier it is to debug.
The second main reason is that while Python is a very accessible language itself, we have a lot of great libraries on top of it that make our work easier. Nobody would like to spend their time on reimplementing basic algorithms from scratch (except in the context of studying machine learning and AI). The large number of Python libraries which exist help us to focus on more exciting things than reinventing the wheel.
By the way, Python is also an excellent wrapper language for working with more efficient C/C++ implementations of algorithms and CUDA/cuDNN, which is why existing machine learning and deep learning libraries run efficiently in Python. This is also super important for working in the fields of machine learning and AI.
To summarize, I would say that Python is a great language that lets researchers and practitioners focus on machine learning and AI and provides less of a distraction than other languages.
Driscoll: Were there any moments where things may have gone another way, but serendipitously ended up the way that they did?
Raschka: That's a good question. Maybe the fact that Python was popular among the Linux community, but worked very well on Windows as well. This was likely a big contributor to Python becoming so popular today.
There are relatively similar languages out there like Ruby. The Ruby on Rails project was (and still is) super popular. If projects like Django hadn't started, Python might have become less popular as an all-rounder, which may have led to fewer resources and open source contributions being devoted to developing Python. In turn, Python may have been less popular as a language for machine learning and AI.
If Travis Oliphant hadn't started the NumPy project (it was called Numeric back then in 1995), I think fewer scientists would have picked up Python as a scientific programming language early in their careers. We would all still be using MATLAB.
Driscoll: So is Python just the right tool at the right time, or is there another reason that it's become so important in AI and machine learning?
Raschka: I think that's a bit of a chicken or the egg problem. To untangle it, I would say that Python is convenient to use, which led to its wide adoption. The community has developed many useful packages in the context of scientific computing. Many machine learning and AI developers prefer Python as a general programming language for scientific computing, and they have developed libraries on top of it, like Theano, MXNet, TensorFlow and PyTorch.
On an interesting side note, having been active in the machine learning and deep learning communities, there was one thing that I heard very often: "The Torch library is awesome, but it is written in Lua, and I don't want to spend my time learning yet another language." Note that we have PyTorch now.
Driscoll: Do you think this opens the door for any Python programmer to start experimenting with AI?
Raschka: I do think so! It depends on how we interpret AI, but regarding deep learning and reinforcement learning, there are many convenient packages with Python wrappers out there. Probably the most popular example at the moment would be TensorFlow.
Personally, I use both TensorFlow and PyTorch in my current research projects. I have been using TensorFlow since it was released in 2015 and like it overall. However, it is a bit less flexible when trying out unusual research ideas, which is why I recently got more into PyTorch. PyTorch itself is more flexible and its syntax is closer to Python; in fact, PyTorch describes itself as "a deep learning framework that puts Python first."
Driscoll: What could be done to make Python a better language for AI and machine learning?
Raschka: While Python is a language that is very convenient to use and nicely interfaces with C/C++ code, we have to keep in mind that it is not the most efficient language.
Computational efficiency is why C/C++ is still the programming language of choice for several machine learning and AI developers. Also, Python is not supported on most mobile and embedded devices. Here we have to distinguish between research, development and production.
The convenience of Python comes at a price, which is performance. On the other hand, speed and computational efficiency come with a trade-off in terms of productivity. In practice, I think that it's usually best to split tasks when working in a team, for instance, having people who specialize in research and trying new ideas, and people who specialize in taking prototypes to production.
I am mainly a researcher and haven't run into this problem yet, but I have also heard that Python is not good for production. I think this is mainly due to existing infrastructure, however, and the tools that are supported by the servers, so it's not really Python's fault per se.
In general, due to its nature as a high-level and general-purpose programming language, Python doesn't scale as well as other languages such as Java or C++, although they are more tedious to use. For instance, spending too much time in the Python runtime, when working with TensorFlow, can be a real performance killer. Improving the general efficiency of Python (I don't think this is really possible though while keeping Python as convenient as it is) would be beneficial to AI and machine learning. While Python provides a great environment for rapid prototyping, it is sometimes a little bit too forgiving, and dynamic types allow you to make mistakes more easily. I think the recent introduction of type hints may help to mitigate this issue to some extent. Also, keeping type hints optional is a great idea, because while they help with larger code bases, they can also be an annoyance for smaller coding projects.
Driscoll: What are you most excited about in Python today?
Raschka: I am super excited that I can do anything that I need in Python. I can spend my time efficiently on research and problem solving, without the need to spend most of my days learning new tools and programming languages.
Sure, sometimes it's good to look beyond the Python ecosystem, to see what's out there and what could potentially be useful. However, overall, I am super happy with the status quo of Python. I am excited about the continued development of the fundamental data science libraries like NumPy, which received a large grant from the Moore Foundation to focus on improving the library even further. Also, I recently saw a conference talk on the redesign of pandas, pandas 2, which will make this already great library even more efficient, without changing the user interface. The one thing I am probably most excited about, though, is the great community around Python. It's great to feel part of the Python community and to be in the same boat regarding advancing the landscape of tools and science. I can share knowledge, learn from others and share my excitement with like-minded people.
Driscoll: What do you think about the long life of Python 2.7? Should people move over?
Raschka: That's a good question. Personally, I always recommend using the latest version of Python. However, I also realize that this is not always possible for everyone.
If your project involves working on or with an older Python 2.7 code base, then it may not be feasible to make the switch in terms of resources. Regarding the long life of Python 2.7, we all know that Python 2.7 will not be officially maintained after 2020. One thing that might happen is that a subcommunity will take over the maintenance of Python 2.7. I also wonder whether it would be worthwhile to spend the energy and resources maintaining Python 2.7 after 2020 as a side project, versus taking the time to port Python 2.7 code bases over to Python 3.x. The long-term maintenance of Python 2.7 will always remain uncertain. Personally, I always install the latest version of Python when it comes out and do all of my coding in Python 3. However, most of my projects also support Python 2.7. The reason is that there are still many people using Python 2.7 who cannot switch, and I don't want to exclude anyone. So if it does not require any major hassle or clunky workarounds, then I write my code in a way that is compatible with both Python 2.7 and 3.x.
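As a rough illustration of what code "compatible with both Python 2.7 and 3.x" can look like, here is a minimal sketch. It is not taken from Raschka's projects; the `hostname` helper and the `PY2` and `text_type` names are hypothetical and chosen only for this example. Real projects typically also add `from __future__ import ...` lines at the very top of each module.

```python
import sys

PY2 = sys.version_info[0] == 2

# The standard library was reorganized in 3.x, so try the new location first.
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2.7

# Python 2 has separate `str` and `unicode` types; Python 3 only has `str`.
# Only the selected branch is evaluated, so `unicode` is never looked up on 3.x.
text_type = unicode if PY2 else str  # noqa: F821


def hostname(url):
    """Return the host part of a URL as text on either interpreter."""
    host = urlparse(url).netloc
    # On 2.7 a plain string literal is bytes, so decode it to text.
    if isinstance(host, bytes) and not isinstance(host, text_type):
        host = host.decode("ascii")
    return host


print(hostname("https://www.python.org/downloads/"))
```

The pattern is the one the answer hints at: isolate the few places where the two interpreters differ, and keep the rest of the code identical.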
Driscoll: What changes would you like to see in future Python releases?
Raschka: My apologies, but my answer is a rather boring one: I am quite happy with Python's current set of features and don't have anything significant on my wish list. One thing that I and multiple other people sometimes complain about is Python's Global Interpreter Lock (GIL). However, for my needs, it's typically not an issue. For instance, I like control over when to do multithreading or multiprocessing. I wrote my little multiprocessing wrappers (in the mputil package) to evaluate Python generators lazily, because eager evaluation was a memory-consumption issue when I was working with the vanilla Pool classes from the multiprocessing module in Python's standard library. Besides, there are great libraries out there, like joblib, which make multiprocessing and threading super convenient. On top of that, most libraries that I use for the heavy lifting when it comes to doing computations in parallel (Dask, TensorFlow, and PyTorch) already support multiprocessing and use Python more as a glue language, as I mentioned earlier, so that computational efficiency is never really an issue.
Driscoll: Thank you, Sebastian Raschka.
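The lazy-evaluation idea from Raschka's last answer can be sketched as follows. This is not the mputil implementation; `lazy_map`, `square`, and the parameter names are hypothetical. `Pool.map` turns its input into a list before dispatching work, so the sketch feeds the pool one bounded chunk at a time. It uses the thread-backed `Pool` from `multiprocessing.dummy`, which has the same API as `multiprocessing.Pool`, so that the example runs without process start-up concerns.

```python
from itertools import islice
from multiprocessing.dummy import Pool  # thread-backed, same API as multiprocessing.Pool


def square(x):
    return x * x


def lazy_map(func, iterable, n_workers=2, chunk_size=4):
    """Yield func(item) in input order, pulling at most chunk_size items at a time."""
    iterator = iter(iterable)
    with Pool(n_workers) as pool:
        while True:
            # Only this chunk is materialized in memory, not the whole generator.
            chunk = list(islice(iterator, chunk_size))
            if not chunk:
                break
            for result in pool.map(func, chunk):
                yield result


for value in lazy_map(square, (n for n in range(10))):
    print(value)
```

The trade-off is that workers idle briefly between chunks; a larger `chunk_size` reduces that overhead at the cost of holding more items in memory.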
13 Wesley Chun Wesley Chun is an American software engineer who has worked at Google for the past eight years. In his role as a senior developer advocate, Wesley encourages developers to adopt Google tools and APIs. He previously worked for Yahoo! and was one of the original Yahoo! Mail engineers. Wesley is a fellow of the Python Software Foundation (PSF) and runs CyberWeb Consulting, which specializes in Python training and technical courses. He is the bestselling author of the Core Python Programming book series and co-authored Python Web Development with Django. Wesley has also contributed to Linux Journal, CNET, and InformIT. Discussion themes: Yahoo! Mail, Python books, v2.7/v3.x. Catch up with Wesley Chun here: @wescpy
Mike Driscoll: So why did you become a programmer?
Wesley Chun: I've been fascinated by the ability to write code to solve problems for a long time now. My interest probably started during the latter years of high school. My programming teacher showed us how to write code implementing Gauss-Jordan elimination and have a computer solve systems of equations automatically. This demonstrated how code could be used to automate tedious work that previously required inefficient human power to compute. While we were only using Commodore BASIC, being able to implement that algorithm and watch it work successfully was one of the factors that motivated me to become a professional developer. Wanting to make people and processes more efficient has led to my multi-decade career as a software engineer.
Driscoll: So how did you come across the Python programming language?
Chun: Finding Python was not by choice. I had experience with C/C++ programming, as well as popular shell languages such as Tcl and Perl. Then I began working at a start-up company where Python became the primary development language. I learned Python and helped to build what was eventually to become Yahoo! Mail in the late 1990s.
Driscoll: How was Yahoo! Mail created?
Chun: In 1997, I was working at a start-up called Four11. True to its name, the first product released by the company was one of the first online versions of the telephone white pages directories. The Four11 service, while being a web app, was written entirely in C++, a monolithic binary that was burdensome to build and cumbersome to maintain. The CTO and co-founder began to look for a way to develop more nimbly. After researching a variety of scripting languages, the CTO discovered that if you left all of the hardcore work as C++, Python was a language that you could drop in as the front end and also use to replace the middleware. Our next product, RocketMail, was developed with this modified stack. We created our own web framework before that term even existed. Using this framework, our core team was able to launch a successful mail service, which caused Yahoo! to acquire our company. RocketMail became Yahoo! Mail and the rest is history.
Driscoll: So how did you end up becoming an author?
Chun: Becoming an author was also accidental. During one of my summer internships at college, I was given the task of writing a user manual for customers.
I learned how to write using Ventura Publisher and with that experience under my belt, my coding and writing have been paired together ever since. When I was exposed to Python in the workforce, there were only two Python books on the market. One was a large case study book, while the other was the first Python book, which was already somewhat outdated. The need for a book about Python, for developers coming from languages like C and shell scripts, drove me to craft the first Core Python Programming book.
Driscoll: What have you learned from writing Python books?
Chun: If I wasn't already a developer, then I could probably say that I learned Python from writing books. Any time that you write a book, you need to do some research into the subject matter. You should learn more information about your subject than is really necessary. In order to take a thorough look at a programming language, you must become familiar with both commonly-used features and corner cases.
Driscoll: How have your readers impacted your writing?
Chun: Having readers come up to me and let me know that I was one of their primary sources for learning Python always brings a smile to my face. Whenever possible, I ask for direct feedback from my readers so that I can make my books even better. Readers love the exercises after a chapter, which help to reinforce what they learned. They also appreciate the wide variety of topics covered.
Driscoll: Could you explain the idea behind CyberWeb Consulting?
Chun: Yes, my home business is meant to consolidate all of the freelance work that I perform for the Python community. CyberWeb Consulting incorporates magazine articles, the technical Python training courses that I teach and other Python-related consulting opportunities that come my way.
Driscoll: What projects are you working on now?
Chun: To this day, I still help people to discover how mundane and laborious tasks, which were once performed by humans, can now be automated. This frees people up to have higher pursuits. I'm currently a developer advocate at Google. I show developers how to integrate Google technologies into their apps, web or mobile. I started by advocating Google Cloud Platform products, but have since moved to the familiar G Suite productivity applications: Gmail, Google Drive, Calendar, Sheets, etc. While people are familiar with these well-known apps, I focus on teaching programmers about the developer platforms and APIs behind each of those tools. You'll often find me on the G Suite Developers blog or hosting the G Suite Developer Show (http://goo.gl/JpBQ40). On the Python side of the house, I'm working on the third edition of Core Python Programming, which was my first book. Readers familiar with Core Python Programming will know that the book is being broken up into two volumes. The third edition of the second half, Core Python Applications Programming, was published back in 2012. Now I'm writing the third edition of the first half. This latest book will be called Core Python Language Fundamentals, to better reflect its content.
I also have a Python blog, which I've honestly been neglecting. Fortunately for me, work has provided content for the blog because any of my work on Google developer products features a good deal of Python code.
Driscoll: What most excites you about Python at the moment?
Chun: Believe it or not, I'm most excited that people even know what Python is today. Back in the old days, nobody had ever heard of Python before. Python was such a great tool, so we hoped that the world would one day find out about it. I think we're there now. I'm also excited that we are near the end of the crossroads of having both Python 2 and 3. Python 3 adoption has taken off and most packages are now available.
Driscoll: What do you think about the long life of Python 2?
Chun: Soon Python 2 will be in the rear-view mirror. Those who are skeptical of Python 3.x may remain that way, but that group is slowly disappearing. Python moving from 2 to 3 is not the same as moving from Perl 5 to 6.
The long life of Python 2 was necessary because of the backwards incompatibility of Python 3. However, Python 2.6 and 2.7 are great migration tools. They are the only 2.x versions that have 3.x features backported to them, to help with the overall migration. I have been writing and speaking about the longevity of Python 2 for some time. Back in 2008, when 3.0 launched, I proclaimed that it would take a decade for the world to move to Python 3, due to its lack of compatibility with Python 2. Based on the momentum that I'm seeing today, I think that I'm going to be more accurate in my prediction than I thought was possible. My original statement was mostly a flippant and abstract one, which has gradually become more concrete and realistic over the past few years. But Python 3.6 is a great version to move over to!
Driscoll: Python is being increasingly used today for AI and machine learning. Why do you think this is?
Chun: Python makes a great language, regardless of the field that it is applied to. Python does not require its users to be computer scientists in order to be able to solve problems. The language syntax does not get in the way for those who want a tool to build solutions with. Python is also great at encouraging group collaboration because of its understandable syntax.
Driscoll: So how do you think that Python could be made a better language for AI and machine learning?
Chun: The continued development of existing Python libraries and the creation of new libraries would make working in the AI field even easier. That would help everyone.
Driscoll: What changes would you like to see in future Python releases?
Chun: I'd love to see fewer Python releases and fewer new features. I think what the language has today (Python 3.6) is great.
Sure, we need to have bug and security fixes. Additional performance improvements would also be welcome, along with the solving of the Global Interpreter Lock issue. However, I'd like to see the release schedules stretched out. Eventually, I'd like to see development mostly stop with Python, so that it could be recognized as a standard like C or C++. If further improvements need to be made, then they can come as revisions to the standard. Being recognized as a standard will bring about Python's legitimacy and its greater adoption, especially in larger corporations.
Driscoll: Thank you, Wesley Chun.
14 Steven Lott Steven Lott is an American software developer and author. He is an associate for the bank holding company Capital One and uses Python to build APIs for new products. Previously, he worked as a solution architect for CTG, which provides IT services. In 2003, Steven started using his talent for solving problems with Python to write books. He has since authored titles including Modern Python Cookbook, Python for Secret Agents, and Functional Python Programming. Steven creates educational content for the Python community and writes a tech blog. Discussion themes: Python pros and cons, Python books, v3.6. Catch up with Steven Lott here: @s_lott
Mike Driscoll: So why did you become a programmer?
Steven Lott: I started programming in the 1970s, when computers were rare. My school had two Olivetti Programma 101 calculators and an IBM 1620 computer. It was empowering being able to create useful behavior on these machines, such as simulating random events, drawing things and trying to design new kinds of games. A responsive and autonomous device was the ultimate toy, even when doing math homework. The idea of building things that were new and useful via software was compelling. Also, I had a bunch of friends who hung around in the computer room.
Driscoll: How did you start using Python?
Lott: In the late 90s, as object-oriented programming was building momentum, I started tracking the popular languages. I had a Macintosh with the port of Smalltalk-80, the THINK C++ compiler and a JDK 1.1. I made regular searches for emerging object-oriented programming technology and eventually found Python.
The barriers to entry for Python were so much lower than the other languages that I had learned. There was only a runtime and no complex toolchain required to build software. Python was replacing Perl, AWK, sed, and grep with one tool that handled a variety of use cases. By 2000, I was trying to build useful and working applications in Python.
Driscoll: What did you like about Python?
Lott: At first, I was drawn to the elegant simplicity of Python. The standard library provided an amazing array of tools. As I learned more, the vast ecosystem of modules and packages outside of the standard library showed me how much could be done. I used Python at work because I could solve a problem quickly. The language was wonderful for complex data wrangling problems. In many cases, success stemmed from getting started quickly and discovering the nuances and complications of a problem as early as possible. Python encourages you to fail quickly and start again on a new course. The more that I learn about NumPy, the more that I see Python as a kind of universal container for code. The NumPy libraries are based on C (and Fortran), so having a Python wrapper makes them widely available and useful.
The underlying reason for using Python wasn't clear to me until Guido van Rossum's keynote speech at PyCon 2016. Python's biggest strength stems from the community. Python's open source nature creates and encourages a community effort to build cool new things. Python has numerous other strengths, such as its wide adoption as a language. Python is used in numerous contexts: scientists are using it to analyze truly gigantic datasets and it's used to build scalable web services too. Python is also used recreationally by home hackers who are integrating their Alexa, Nest, and Arduino-based temperature sensors. Another strength of Python is sometimes called "batteries included." With a single download, you have all the tools you want. If you want to learn the language, then you can start with the distribution for your computer. If you want to do data science, then you can start with the Anaconda distribution, which is where lots of packages are bundled. The Python Software Foundation (PSF) takes active steps to be as inclusive as possible. The philosophy is that everyone should be able to learn and share their findings. Python's community believes that nobody should be excluded. We're all using Python to solve problems, so we all need help.
Driscoll: What are Python's weaknesses as a language?
Lott: I've collected a few lists of Python's weaknesses. Some of them are utterly farcical and I've seen many sentiments which make no sense at all. A few complaints about Python are meaningful. Overall, I've learned that most problems that are blamed on the Python language being slow are more often than not due to ineffective algorithm and data structure choices. Python's core runtime is remarkably fast. Fortran and C are considerably faster because they have optimizing compilers that produce code focused on the underlying chipset. The SciPy and NumPy use of binary code wrapped in Python addresses this concern nicely. Another issue is the opportunity for confusion when using Python. The orthogonality between language statements and data structures means that lists, sets, and dictionaries have some overlapping features. The immensely sophisticated implementation of Python data structures makes it possible to make a bad choice and get correct answers, but have horribly inefficient code. Lastly, a weakness for Python is the possibility of creating inheritance problems. Everything is dynamic, so it can be difficult for tools like Pylint to discern meaningful method redefinitions from spelling mistakes with similar-looking method names and plain bad design.
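The data-structure point in this answer can be made concrete with a small sketch. The two functions below are hypothetical names invented for illustration. Both return the same, correct answer, but the list version rescans `b` for every element of `a`, while the set version uses hash lookups.

```python
import timeit


def common_items_list(a, b):
    """Correct, but each `in` test scans the whole list."""
    b_list = list(b)
    return [x for x in a if x in b_list]


def common_items_set(a, b):
    """Same result, but each `in` test is a hash lookup."""
    b_set = set(b)
    return [x for x in a if x in b_set]


a = list(range(2000))
b = list(range(1000, 3000))

# Both choices give the same answer; only the running time differs.
assert common_items_list(a, b) == common_items_set(a, b)

print("list:", timeit.timeit(lambda: common_items_list(a, b), number=3))
print("set: ", timeit.timeit(lambda: common_items_set(a, b), number=3))
```

The exact timings depend on the machine, so none are quoted here; the gap widens as the inputs grow.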
The abc module has decorators that can be used to organize code and provide some help with checking redefinitions. The type definitions in the typing module allow mypy to locate potential problems.
Driscoll: So how did you end up becoming an author of Python books?
Lott: Most roles in my career more or less just happened to me, but becoming a writer was a conscious decision. In this case, I had decided that there could be value in teaching the Python language and the associated software engineering skills. I started to collect notes for a book in 2002. By 2010, I had tried self-publishing several books on Python. When Stack Overflow started, I was an early participant. There were many interesting Python questions. The questions showed gaps where more information was needed about Python specifically and software engineering in general. Over a few years, I answered thousands of questions about Python and somehow built up a large reputation.
Driscoll: What have you learned in the writing process?
Lott: I've learned about the difficulty of creating meaningful and interesting examples. An example needs to have a story arc and a problem that requires a solution.
Stories require drama and conflict, and that doesn't often surface when thinking about data structures and algorithms. I spend more time wandering around trying to think of examples than doing any other part of the writing process. A lot of the problems that I come up with are too large and complex. A snippet of code is difficult to describe if it doesn't solve a problem. For example, the traveling salesman problem has a compelling story arc that characterizes graph traversal. Having a story provides a framework for remembering the essential problem and seeing how the solution works. Pure code doesn't help anyone to understand why the language construct is important. Code only exists to solve a problem, so it's imperative to describe the problem. Creating stories requires the time to view the problem from a distance, which is essential for summarizing and abstracting out needless details. Finding the right details requires a deep understanding. I know that I've failed when the description of the code becomes long and complex, involving tangential topics.
Driscoll: What are the pros and cons of self-publishing your books versus using a regular publisher?
Lott: The difference between self-publishing and using a publisher is editing. The way that Python handles documentation testing (via the doctest module) means that the technical aspects of the content can be validated automatically. I've become better at this, but there are still some testing gaps in my published code. Other challenges are grammar, usage, clarity, precision, color, unity, coherence, and concision. With Packt Publishing, there's a pipeline of editors who ask questions and notice the incomprehensible parts, long before my book lands in the hands of a reader. When I self-published, I did what seemed best to me. Publishers manage costs, prices, and revenue streams adroitly. My job is to know Python and Packt Publishing handles the rest.
Driscoll: Have you learned anything from your readers? If so, what?
Lott: My readers have taught me the importance of using the Python doctest tool for checking each example in the body of a book. Readers have spotted numerous errors from code that I didn't check properly.
Driscoll: What has been your favorite interaction with a reader?
Lott: I work for a tech company in Northern Virginia. A co-worker was surprised to find out that I'd written Mastering Object-Oriented Python. They had bought the book based on recommendations and read the outline, without really looking at the author's name.
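The doctest workflow that Lott mentions on this page can be sketched in a few lines. This is an illustrative example rather than code from his books; `celsius_to_fahrenheit` and `run_examples` are hypothetical names. The interactive-session examples in the docstring are executed and compared with the output shown.

```python
import doctest


def celsius_to_fahrenheit(celsius):
    """Convert a temperature from Celsius to Fahrenheit.

    >>> celsius_to_fahrenheit(100)
    212.0
    >>> celsius_to_fahrenheit(-40)
    -40.0
    """
    return celsius * 9 / 5 + 32


def run_examples(func):
    """Run the docstring examples of one function and return the tallies."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # module=False skips module lookup; globs supplies the name the examples call.
    for test in finder.find(func, func.__name__, module=False,
                            globs={func.__name__: func}):
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_examples(celsius_to_fahrenheit)
print(results.failed, "failed out of", results.attempted)
```

In an ordinary module, calling `doctest.testmod()` or running `python -m doctest file.py` checks every docstring in the file in one go.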
Driscoll: So which of your books has been the most popular? Why do you think that people buy one book over another?
Lott: My most successful book has been Python for Secret Agents. It seems like the fun factor is part of that. If a book has a wide variety of fun exercises and problems, then readers can see how Python applies to the problems that they know and want to solve. If the book is too narrowly focused on one problem domain, or too abstract, then the practical applications become hard to envision.
Driscoll: What new and exciting trends are you seeing in Python?
Lott: Python 3.6 is fast and getting faster. The developers working on foundational algorithms have done impressive things. The new internal data structures for the dict save memory and run faster. This kind of internal re-engineering is exciting. There are huge benefits that come from having an upgrade with few visible changes to the language. Another exciting direction that Python is going in is connected to the mypy project and the type hints. You have a handy quality tool that doesn't involve a profound change to the language, or the development tools. This can help you to write more reliable code, without introducing significant overheads. If mypy becomes part of Pylint or Pyflakes, then that will help even more.
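To show what type hints look like in practice, here is a minimal sketch; the `word_lengths` function is a hypothetical example. The annotations are stored on the function but are not enforced when the program runs, which is why a separate checker such as mypy is needed to act on them.

```python
from typing import Dict, List, get_type_hints


def word_lengths(words: List[str]) -> Dict[str, int]:
    """Map each word to its length."""
    return {word: len(word) for word in words}


# The hints are plain metadata that tools can read.
print(get_type_hints(word_lengths))

# Normal use is unaffected by the annotations.
print(word_lengths(["spam", "eggs", "ham"]))

# A call such as word_lengths(42) is the kind of mistake mypy reports
# statically as an incompatible argument type, before the program is run.
```

Because the hints are optional, they can be added to one function at a time in an existing code base.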