Category Archives: python

iter_words

I would like to iterate over a stream of words, say, from STDIN or a file (or any arbitrary input stream). Typically, this is done like this: … And then one can simply: … For a more concrete example, let's say we … Continue reading
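As a minimal sketch of the idea (not necessarily the code from the full post), a generator can yield words from any stream that iterates line by line. The name iter_words comes from the post title; the body and the sample input are assumptions.

```python
import io


def iter_words(stream):
    """Yield whitespace-separated words from any line-iterable text stream."""
    for line in stream:
        # str.split() with no arguments splits on runs of whitespace
        # and drops empty strings, so blank lines yield nothing.
        yield from line.split()


# An in-memory stream stands in for sys.stdin or an open file here.
demo = io.StringIO("spam eggs\n\n  foo\tbar\n")
words = list(iter_words(demo))
```

The same function should work unchanged with sys.stdin or a file object, since both iterate over lines.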

Posted in python

Punycode

I would like a webapp that supports UTF-8 URLs. For example, https://去.cc/叼, where both the path and the server name contain non-ASCII characters. The path /叼 can be handled easily with %-encodings, e.g., … Note: this is similar to the raw … Continue reading
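A rough sketch of both halves, using only the standard library: percent-encoding for the path and Python's built-in "idna" codec for the server name. The helper name ascii_url is hypothetical, and the "idna" codec implements the older IDNA 2003 rules, so treat this as an illustration rather than a complete solution.

```python
from urllib.parse import quote


def ascii_url(host, path, scheme="https"):
    """Return an ASCII-only form of a URL with a Unicode host and path."""
    # The built-in "idna" codec converts each non-ASCII label to its
    # Punycode form, prefixed with "xn--".
    ascii_host = host.encode("idna").decode("ascii")
    # quote() percent-encodes the UTF-8 bytes of the path and leaves "/" alone.
    ascii_path = quote(path)
    return f"{scheme}://{ascii_host}{ascii_path}"


url = ascii_url("去.cc", "/叼")
```

Decoding the host part with bytes.decode("idna") should recover the original Unicode name.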

Posted in python, software arch.

Graph Search

I would like to discover paths between two nodes on a graph. Let's say we have a graph that looks something like this: … The graph object contains a collection of nodes and their corresponding connections. If it's a bi-directional graph, … Continue reading
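One possible way to sketch this, assuming the graph is a plain dict that maps each node to a list of its neighbors (the post's actual graph object may differ), is a recursive depth-first search that yields every path without repeated nodes. The function name and the sample graph are hypothetical.

```python
def find_paths(graph, start, goal, path=None):
    """Yield every simple path (no repeated nodes) from start to goal."""
    path = (path or []) + [start]
    if start == goal:
        yield path
        return
    for neighbor in graph.get(start, ()):
        if neighbor not in path:  # skip nodes already on this path
            yield from find_paths(graph, neighbor, goal, path)


# Hypothetical bi-directional graph: each edge is listed in both directions.
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c"],
}
paths = list(find_paths(graph, "a", "d"))
```

Enumerating all simple paths can be slow on dense graphs; a breadth-first search would be the usual choice if only the shortest path is needed.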

Posted in python, software arch.

python unittest

I would like to set up unit tests for a python application. There are many ways to do this, including doctest and unittest, as well as 3rd-party frameworks that leverage python's unittest, such as pytest and nose. I found the plain-old … Continue reading
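For reference, a bare-bones unittest example might look like the following. The add() function and the test names are hypothetical stand-ins for real application code.

```python
import unittest


def add(a, b):
    """Hypothetical function under test."""
    return a + b


class AddTest(unittest.TestCase):
    def test_integers(self):
        self.assertEqual(add(2, 3), 5)

    def test_type_error(self):
        # Adding an int and a str raises TypeError.
        with self.assertRaises(TypeError):
            add(1, "1")


# Build and run the suite programmatically; from a shell,
# "python -m unittest <module>" runs the same tests.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```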

Posted in python

locking and concurrency in python, part 2

Previously, I created a "MultiLock" class for managing locks and lockgroups across a shared file system. Now I want to create a simple command-line utility that uses this functionality. To start, we can create a simple runone() function that leverages … Continue reading

Posted in python, shell tips, software arch.

locking and concurrency in python, part 1

I would like to do file-locking concurrency control in python. Additionally, I would like to provide a "run-once-and-only-once" functionality on a shared cluster; in other words, I have multiple batch jobs to run over a shared compute cluster and I … Continue reading

Posted in python, software arch.

zip archive in python

I would like to create zip archives within a python batch script. I would like to compress individual files or entire directories of files. You can use the built-in zipfile module, and create a ZipFile as you would a normal … Continue reading
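A short sketch with the built-in zipfile module that handles both cases, a single file or a whole directory tree. The helper name zip_path and the demo file names are assumptions made for illustration.

```python
import os
import tempfile
import zipfile


def zip_path(source, archive):
    """Compress a single file, or a whole directory tree, into archive."""
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        if os.path.isfile(source):
            zf.write(source, os.path.basename(source))
            return
        for root, _dirs, files in os.walk(source):
            for name in files:
                full = os.path.join(root, name)
                # Store paths relative to the source directory.
                zf.write(full, os.path.relpath(full, source))


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "data")
    os.makedirs(os.path.join(src, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(src, rel), "w") as fh:
            fh.write("hello\n")
    archive = os.path.join(tmp, "data.zip")
    zip_path(src, archive)
    with zipfile.ZipFile(archive) as zf:
        names = sorted(zf.namelist())
```

Writing the archive outside the source directory avoids zipping the archive into itself.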

Posted in python, shell tips

timeout command in python

I would like to add a timeout to any shell command such that if it does not complete within a specified number of seconds the command will exit. This would be useful for any long-running command where I'd like … Continue reading
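In Python 3 one way to get this behavior is the timeout argument of subprocess.run(). This is a sketch of that approach, which may differ from the technique in the full post; the helper name run_with_timeout is hypothetical.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Run cmd; return its exit code, or None if it exceeded the timeout."""
    try:
        # On timeout, subprocess.run() kills the child and raises TimeoutExpired.
        return subprocess.run(cmd, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        return None


# A command that finishes immediately, and one that would run too long.
quick = run_with_timeout([sys.executable, "-c", "pass"], 10)
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 0.5)
```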

Posted in python, shell tips

python slice and sql every Nth row

I would like to retrieve every Nth row of a SQL table, and I would like this accessed via a python slice function. A python slice allows access to a list (or any object that implements a __getitem__ method) by … Continue reading
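To illustrate how a slice reaches __getitem__, here is a sketch that translates rows[start:stop:step] into a modulo query. It uses an in-memory sqlite3 table as a stand-in for MySQL or Oracle, and the class, table, and column names are all hypothetical.

```python
import sqlite3


class SqlRows:
    """Expose a table with a 0-based idx column through slice syntax."""

    def __init__(self, conn, table):
        self.conn = conn
        self.table = table  # assumed to be a trusted name, not user input

    def __len__(self):
        sql = f"SELECT COUNT(*) FROM {self.table}"
        return self.conn.execute(sql).fetchone()[0]

    def __getitem__(self, key):
        if not isinstance(key, slice):
            raise TypeError("this sketch only supports slices")
        # slice.indices() resolves None and negative bounds against the length.
        start, stop, step = key.indices(len(self))
        if step < 1:
            raise ValueError("this sketch only supports positive steps")
        sql = (f"SELECT val FROM {self.table} "
               "WHERE idx >= ? AND idx < ? AND (idx - ?) % ? = 0 ORDER BY idx")
        return [r[0] for r in self.conn.execute(sql, (start, stop, start, step))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (idx INTEGER PRIMARY KEY, val INTEGER)")
conn.executemany("INSERT INTO demo VALUES (?, ?)", [(i, i * 10) for i in range(10)])
rows = SqlRows(conn, "demo")
every_third = rows[::3]
odd_window = rows[1:8:2]
```

The filtering happens in the database, so only every Nth row is transferred to Python.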

Posted in data arch., mysql, oracle, python

python, finding recurring pairs of data

I would like to find all pairs of data that appear together at least 10 times. For example, given a large input file of keywords:

>>> foo, bar, spam, eggs, durian, stinky tofu, ...
>>> fruit, meat, vinegar, sphere, foo, … Continue reading
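A small sketch of one way to count such pairs, assuming "appear together" means the two keywords occur on the same comma-separated line. The function name and sample data are hypothetical, and this in-memory Counter may not scale to very large inputs.

```python
from collections import Counter
from itertools import combinations


def recurring_pairs(lines, min_count=10):
    """Return {(a, b): count} for keyword pairs seen on at least min_count lines."""
    counts = Counter()
    for line in lines:
        # Deduplicate and sort so ("bar", "foo") and ("foo", "bar") count as one pair.
        keywords = sorted({k.strip() for k in line.split(",") if k.strip()})
        counts.update(combinations(keywords, 2))
    return {pair: n for pair, n in counts.items() if n >= min_count}


lines = ["foo, bar, spam"] * 10 + ["foo, eggs"] * 3
pairs = recurring_pairs(lines)
```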

Posted in python

python, analyzing csv files, part 2

Previously, we discussed analyzing CSV files, parsing the csv into a native python object that supports iteration while providing easy access to the data (such as a sum by column header). For very large files this can be cumbersome, especially … Continue reading

Posted in data arch., python, software arch.

python, analyzing csv files, part 1

I would like to analyze a collection of CSV (comma-separated-values) files in python. Ideally, I would like to treat the csv data as a native python object. For example,

>>> financial_detail = Report('financial-detail.csv')
>>> transactions = {}
>>> for row … Continue reading
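As a rough illustration of addressing CSV data by column header (this is not the post's Report class), csv.DictReader turns each row into a dict keyed by the header row. The function name, column names, and sample data are hypothetical.

```python
import csv
import io


def sum_by_column(stream, column):
    """Sum one numeric column, addressed by its header name."""
    # DictReader uses the first row as the keys of each row dict.
    return sum(float(row[column]) for row in csv.DictReader(stream))


# Hypothetical data; with a real file, pass open(path, newline="") instead.
sample = io.StringIO("account,amount\nrent,1200.50\nfood,300.25\n")
total = sum_by_column(sample, "amount")
```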

Posted in data arch., python, software arch.

python, unique files by content

I would like to retrieve a list of unique files by content rather than by filename. That is, if spam.txt and eggs.txt both contain the same contents, I want only one of them returned. A very simple approach is … Continue reading
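One possible approach, which may or may not match the one in the full post, is to hash each file's contents and keep the first path seen for each digest. The helper name and the demo files below are assumptions.

```python
import hashlib
import os
import tempfile


def unique_by_content(paths):
    """Return one path per distinct file content, keeping the first seen."""
    seen = {}
    for path in paths:
        with open(path, "rb") as fh:
            digest = hashlib.sha256(fh.read()).hexdigest()
        seen.setdefault(digest, path)  # later duplicates are ignored
    return list(seen.values())


# Demo: spam.txt and eggs.txt share the same contents.
with tempfile.TemporaryDirectory() as tmp:
    contents = {"spam.txt": b"same", "eggs.txt": b"same", "ham.txt": b"different"}
    paths = []
    for name, data in contents.items():
        paths.append(os.path.join(tmp, name))
        with open(paths[-1], "wb") as fh:
            fh.write(data)
    unique_names = [os.path.basename(p) for p in unique_by_content(paths)]
```

This reads each file fully into memory; for large files, hashing in chunks would be the safer variant.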

Posted in python, shell tips

python daemon

I would like to create a python daemon, completely detaching from the terminal or parent process, and yet retaining any log handlers through the python logging module. There is a wonderful example at cookbook-278731 of a well-behaved daemon, and see … Continue reading

Posted in python, software arch.

python logging

I would like customizable logging in python applications, and I would like to easily send log messages to multiple handlers without any modification of the application code. The built-in logging module provides a very robust and easy-to-use logging capability. In … Continue reading
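A minimal sketch of the multiple-handler idea with the built-in logging module: the application code only calls the logger, while handlers are attached separately. The logger name, the format string, and the in-memory streams (standing in for files or the console) are assumptions.

```python
import io
import logging

# Application code only ever asks for a logger and logs to it.
log = logging.getLogger("demo.app")
log.setLevel(logging.DEBUG)
log.propagate = False  # keep the demo output out of the root logger

# Handlers are configured separately from the application code.
everything = io.StringIO()
errors_only = io.StringIO()
fmt = logging.Formatter("%(levelname)s %(message)s")

all_handler = logging.StreamHandler(everything)
all_handler.setFormatter(fmt)

error_handler = logging.StreamHandler(errors_only)
error_handler.setLevel(logging.ERROR)  # this handler ignores lower levels
error_handler.setFormatter(fmt)

log.addHandler(all_handler)
log.addHandler(error_handler)

log.info("starting")
log.error("disk full")
```

Both handlers receive the same log calls, and each one filters by its own level.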

Posted in python, software arch.