A Python Functional Programming Example
So I was working on my ref-man Emacs package, to which I had added a local Python Flask server that fetches data from the dblp API and the arXiv API; I'm also adding Semantic Scholar support. Originally there were only dblp calls, so as I modified the existing implementation I took a functional programming approach, generating partial functions and using them to adapt the existing code.
The old dblp function was `dblp_old`, which is still in the file:
```python
# json, Queue, Thread, flask's request and the module-level args
# are imported/defined elsewhere in the file
def dblp_old():
    # checking request
    if not isinstance(request.json, str):
        data = request.json
    else:
        try:
            data = json.loads(request.json)
        except Exception:
            return json.dumps("BAD REQUEST")
    j = 0
    content = {}
    # The requests are fetched in parallel in batches which can be configured
    while True:
        _data = data[(args.batch_size * j): (args.batch_size * (j + 1))].copy()
        for k, v in content.items():
            if v == ["ERROR"]:
                _data.append(k)
        if not _data:
            break
        q = Queue()
        threads = []
        for d in _data:
            threads.append(Thread(target=dblp_fetch, args=[d, q],
                                  kwargs={"verbose": args.verbose}))
            threads[-1].start()
        for t in threads:
            t.join()
        content.update(_dblp_helper_old(q))
        j += 1
    return json.dumps(content)
```
So a function `dblp_fetch` is given to the threads, with a data item and a queue as args. All it does is push the responses to the HTTP requests onto the queue; a helper function `_helper` then processes the results after fetching them from the queue.
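The post doesn't show `dblp_fetch` itself, so here's a minimal, network-free sketch of the producer side of that pattern. `fake_fetch` is a hypothetical stand-in: a real `dblp_fetch` would perform an HTTP GET against the dblp API and push `(query, response)` onto the queue instead.

```python
from queue import Queue
from threading import Thread

def fake_fetch(query, q):
    # a real dblp_fetch would put (query, <HTTP response>) here;
    # we fake the response so the sketch runs without a network
    q.put((query, f"response for {query}"))

q = Queue()
threads = [Thread(target=fake_fetch, args=[d, q]) for d in ["resnet", "attention"]]
for t in threads:
    t.start()
for t in threads:
    t.join()

# drain the queue, the same way the helper does
results = {}
while not q.empty():
    query, response = q.get()
    results[query] = response
```

Because `queue.Queue` is thread-safe, the workers can all `put` concurrently and the consumer drains everything once the joins complete.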
```python
def _helper(q):
    content = {}
    while not q.empty():
        query, response = q.get()
        if response.status_code == 200:
            result = json.loads(response.content)["result"]
            if result and "hits" in result and "hit" in result["hits"]:
                content[query] = []
                for hit in result["hits"]["hit"]:
                    info = hit["info"]
                    authors = info["authors"]["author"]
                    if isinstance(authors, list):
                        info["authors"] = [x["text"] for x in info["authors"]["author"]]
                    else:
                        info["authors"] = [authors["text"]]
                    content[query].append(info)
            else:
                content[query] = ["NO_RESULT"]
        elif response.status_code == 422:
            content[query] = ["NO_RESULT"]
        else:
            content[query] = ["ERROR"]
    return content
```
Fairly simple, as you can see. But now, if I want to add more APIs, I either write separate `_fetch` and `_helper` functions for each of them or modify the existing interface. Adding new functions is easier and hackier, but error-prone in the long run because:
- You'll end up copy-pasting code, which will lead to errors
- Changes in the implementation where that code interacts with the server will have to be replicated across all the copies
So I changed the functions in three ways. I removed the API-specific parts from the multithreaded section, so that arbitrary fetch functions are executed in parallel and their responses handled by a generic helper. I could have used a thread pool or a process pool, but I already had the threads implementation and it was working fine. And `q_helper` now takes three functions as parameters in addition to the queue:

```python
def q_helper(func_success, func_no_result, func_error, q):
    content = {}
    while not q.empty():
        query, response = q.get()
        if response.status_code == 200:
            func_success(query, response, content)
        elif response.status_code == 422:
            func_no_result(query, response, content)
        else:
            func_error(query, response, content)
    return content
```
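For comparison, the thread-pool alternative mentioned above could look roughly like this. This is a sketch, not the code the post uses; `fetch_one` is a hypothetical stand-in for `dblp_fetch` that returns its result instead of pushing it onto a queue.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_one(query):
    # a real version would return (query, <HTTP response>)
    return (query, f"response for {query}")

# map runs the fetches concurrently and yields results in input order
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(fetch_one, ["resnet", "attention"]))
```

One advantage of this style is that the pool handles batching and joining for you, at the cost of restructuring the fetch functions to return values rather than share a queue.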
So it's entirely independent of whether we fetch from dblp or arXiv or any other source. It doesn't even handle any of the responses itself but passes them on to those functions; data is shared via the `content` dict. But that leaves us with a tiny problem: the helper function in `post_json_wrapper` only takes the `q` as an argument, while our new `q_helper` requires three functions and the `q`. Now there are two ways to handle that: we can either pass all the functions to `post_json_wrapper`, or pass a new function which already has the three functions set. After all, there will be a separate helper for each API. So this is where the final piece of the functional puzzle falls in:

```python
_dblp_helper = partial(q_helper, _dblp_success, _dblp_no_result, _dblp_error)
```
What we're doing is defining a new function composed of the existing functions `q_helper`, `_dblp_success`, etc., but with everything except the `q` already set. An `arxiv_helper` is defined similarly, which results in fewer parameters being sent to `post_json_wrapper` — almost always a good thing. See the functools module for more information about `partial`.
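Here's the whole pattern end to end with toy callbacks and fake responses (`q_helper` is repeated so the snippet runs on its own; the `toy_*` functions are hypothetical stand-ins for `_dblp_success` and friends, whose real bodies parse dblp responses):

```python
from functools import partial
from queue import Queue
from collections import namedtuple

# a fake response object with just the fields q_helper inspects
FakeResponse = namedtuple("FakeResponse", ["status_code", "content"])

def q_helper(func_success, func_no_result, func_error, q):
    content = {}
    while not q.empty():
        query, response = q.get()
        if response.status_code == 200:
            func_success(query, response, content)
        elif response.status_code == 422:
            func_no_result(query, response, content)
        else:
            func_error(query, response, content)
    return content

def toy_success(query, response, content):
    content[query] = [response.content]

def toy_no_result(query, response, content):
    content[query] = ["NO_RESULT"]

def toy_error(query, response, content):
    content[query] = ["ERROR"]

# bind the three callbacks; the resulting helper takes only the queue
toy_helper = partial(q_helper, toy_success, toy_no_result, toy_error)

q = Queue()
q.put(("good", FakeResponse(200, "hit")))
q.put(("missing", FakeResponse(422, "")))
q.put(("broken", FakeResponse(500, "")))
result = toy_helper(q)
# → {'good': ['hit'], 'missing': ['NO_RESULT'], 'broken': ['ERROR']}
```

Swapping in a different API means writing three small callbacks and one `partial` line; `q_helper` itself never changes.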
So we've moved from an API-specific threaded function and helper to an API-agnostic one, while still keeping the original ideas of the implementation intact! Pretty neat! I might write about how functional programming can help reduce software complexity in another post. Even though this may look complex, keep in mind that most of the functions are stateless — they don't change over time — which leaves us with fewer things to worry about.