Sessions and (Python) HTTP modules

Some documentation best practices are written with the best intentions, but they can lead to unforeseen side effects or steer users toward bad decisions.

One of my favourite examples is package installation guidelines in the Linux world. It is not uncommon for Stack Overflow or even the package docs themselves to tell you to run

yum install MY_PACKAGE

This looks very easy, but if you try it and it turns out that you do not have sufficient permissions, you might be tempted to switch to the root user instead of just adding sudo in front of the command.

In the FOLIO Migration tools, we used the over-documented getting-started way to make HTTP requests like this:

r = requests.get('https://api.github.com/events')

What I did not know at the time is that these calls have a lot of overhead if you want to call the same server multiple times, because each call sets up a new connection. It also potentially eats up resources at the server level. I thought it was the server that was slow, not my code. When my colleagues started talking about session objects, things got way faster very quickly. It turns out this is a fundamental best practice, and something I should have known... Diving into the Advanced section of the requests docs, the first thing mentioned is the "Session" object:

The Session object allows you to persist certain parameters across requests. It also persists cookies across all requests made from the Session instance, and will use urllib3’s connection pooling. So if you’re making several requests to the same host, the underlying TCP connection will be reused, which can result in a significant performance increase (see HTTP persistent connection).
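To make the quote concrete, here is a minimal sketch of the Session pattern. The helper names `make_session` and `fetch_all` and the `Accept` header are my own illustrative assumptions, not code from the FOLIO Migration tools:

```python
import requests


def make_session() -> requests.Session:
    """Create a Session whose settings are shared by all its requests."""
    session = requests.Session()
    # Illustrative header (an assumption); it is sent with every request
    # made through this Session.
    session.headers.update({"Accept": "application/json"})
    return session


def fetch_all(session: requests.Session, urls: list[str]) -> list[int]:
    """Fetch each URL through one shared Session and return the status codes."""
    status_codes = []
    for url in urls:
        # Every call goes through the same Session, so urllib3 can reuse a
        # pooled connection to the host instead of opening a new one.
        response = session.get(url, timeout=10)
        status_codes.append(response.status_code)
    return status_codes
```

It could be used as `with make_session() as session: fetch_all(session, ["https://api.github.com/events"] * 3)`. Using the Session as a context manager closes its pooled connections when the block exits.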

The httpx package is less subtle:

If you do anything more than experimentation, one-off scripts, or prototypes, then you should use a Client instance

These things are indeed annoying, and I know it is hard to choose between a good getting-started one-liner and something more useful in the long term. I guess it is just part of the game.

For some time now, my former colleagues at EBSCO have made sure things move way faster in the FOLIO Client by using HTTPX Client objects, and I must say the FOLIO Migration tools have benefited tremendously from the improvement.