Async IO in Python: A Practical Guide to Efficient API Calls
For a Python developer, handling data requests efficiently is crucial when working with large data sets, especially when the data comes from an external API. Today, we’re going to delve into Python’s asyncio library and how we can use it to call an API (Wikipedia, for instance) and speed up information retrieval.
What is Async IO?
Python’s asyncio is a standard-library module for writing single-threaded concurrent code using coroutines and the async/await syntax. It runs those coroutines on an event loop, multiplexes I/O access over sockets and other resources, and provides the primitives needed to build network clients and servers.
Before we get into the practical examples, it’s worth mentioning that asyncio is not a panacea for all speed-related problems. Its main strength lies in situations where a program is mostly waiting for I/O, which happens often in web scraping tasks or when interfacing with APIs.
Now, let’s dive into some code!
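Before turning to Wikipedia, here is a minimal sketch of why asyncio helps when a program is mostly waiting. The `fake_request` coroutine is a hypothetical stand-in for an API call; it uses `asyncio.sleep` to simulate network latency, so nothing here touches a real service.

```python
import asyncio
import time


async def fake_request(name: str, delay: float) -> str:
    """Hypothetical stand-in for an API call; sleep simulates network wait."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def main() -> list[str]:
    # gather() runs the three coroutines concurrently and returns their
    # results in the same order the coroutines were passed in.
    return await asyncio.gather(
        fake_request("a", 0.2),
        fake_request("b", 0.2),
        fake_request("c", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

print(results)
# The waits overlap, so this is roughly 0.2s rather than 0.6s.
print(f"elapsed: {elapsed:.2f}s")
```

Because the three waits overlap on one thread, the total time is close to the longest single wait instead of the sum of all three. If `fake_request` did heavy computation instead of waiting, this gain would disappear.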
Connect to the Wikipedia API using Async IO
To illustrate the usage of async IO, we will first write a script to fetch yesterday’s 100 most viewed Wikipedia pages. Then we will get the categories for each page, and finally collect them into a set so that duplicate categories are removed.
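The shape of that pipeline can be sketched without any network access. Everything below is an assumption made for illustration: `FAKE_CATEGORIES`, `fetch_top_pages`, and `fetch_categories` are hypothetical placeholders backed by in-memory data, not the actual Wikipedia endpoints or response formats.

```python
import asyncio

# Hypothetical in-memory data standing in for API responses.
FAKE_CATEGORIES = {
    "Page A": ["Science", "History"],
    "Page B": ["History", "Sports"],
    "Page C": ["Science"],
}


async def fetch_top_pages() -> list[str]:
    """Placeholder for the request returning the most viewed page titles."""
    await asyncio.sleep(0.01)  # simulated network latency
    return list(FAKE_CATEGORIES)


async def fetch_categories(title: str) -> list[str]:
    """Placeholder for the per-page request returning its categories."""
    await asyncio.sleep(0.01)  # simulated network latency
    return FAKE_CATEGORIES[title]


async def collect_unique_categories() -> set[str]:
    # Step 1: get the list of page titles.
    titles = await fetch_top_pages()
    # Step 2: request the categories of every page concurrently.
    per_page = await asyncio.gather(*(fetch_categories(t) for t in titles))
    # Step 3: flatten into a set, which drops duplicate categories.
    return {category for categories in per_page for category in categories}


unique_categories = asyncio.run(collect_unique_categories())
print(sorted(unique_categories))  # ['History', 'Science', 'Sports']
```

The per-page requests are the part worth running concurrently: with 100 pages, awaiting them one at a time would add up 100 round trips, while `asyncio.gather` lets the waits overlap.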