Category Archives: python

Extremely Large Numeric Bases with Unicode

Previously, we discussed using Punycode for non-ASCII domain names with internationalized URLs, e.g., https://去.cc/叼. I would like to use this approach to create a URL shortening service, where we can create shortened URLs that use UTF-8 characters in addition to … Continue reading
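
As a rough illustration of the idea (not the code from the full post), an integer ID can be written in a very large base by using a long string of Unicode characters as the digits. The names `encode`, `decode`, and `ALPHABET`, and the choice of 1000 CJK ideographs, are assumptions made for this sketch.

```python
def encode(number, alphabet):
    """Encode a non-negative integer using the characters of `alphabet` as digits."""
    if number < 0:
        raise ValueError("number must be non-negative")
    base = len(alphabet)
    if number == 0:
        return alphabet[0]
    digits = []
    while number:
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    # Digits were collected least-significant first, so reverse them.
    return "".join(reversed(digits))


def decode(text, alphabet):
    """Invert encode(): turn a string of alphabet characters back into an integer."""
    base = len(alphabet)
    index = {char: value for value, char in enumerate(alphabet)}
    number = 0
    for char in text:
        number = number * base + index[char]
    return number


# Hypothetical alphabet: ASCII digits plus 1000 CJK ideographs (base 1010).
ALPHABET = "0123456789" + "".join(chr(cp) for cp in range(0x4E00, 0x4E00 + 1000))
```

With a base of 1010, any ID below roughly one billion fits in three characters, which is the appeal for short URLs.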

Posted in python

PseudoForm, a Bot-resistant Web Form

I would like to create an HTML form that is resistant to bots. This could be a classic comment form or really any web-accessible form. In most cases, requiring authentication and an authorization service (such as OAuth) would be sufficient … Continue reading

Posted in css, html, javascript, python, software arch.

Trie or Set

Given a grid or input stream of characters, I would like to discover all words according to a given dictionary. This could be a dictionary of all English words or phrases (say, for an autocomplete service), or for any language. … Continue reading
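
A minimal sketch of the trie side of that comparison, assuming a nested-dict trie and a plain string as the input stream. The names `build_trie`, `find_words`, and the `END` marker are hypothetical, not taken from the full post.

```python
END = ""  # marker key; can never collide with a single-character key


def build_trie(words):
    """Build a nested-dict trie; a node containing END marks a complete word."""
    root = {}
    for word in words:
        node = root
        for char in word:
            node = node.setdefault(char, {})
        node[END] = True
    return root


def find_words(text, trie):
    """Yield every dictionary word that starts at any position in `text`."""
    for start in range(len(text)):
        node = trie
        for end in range(start, len(text)):
            node = node.get(text[end])
            if node is None:
                break  # no dictionary word continues with this character
            if END in node:
                yield text[start:end + 1]
```

The trie lets the scan stop as soon as no word shares the current prefix, which a plain set of words cannot tell us.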

Posted in data arch., python

iter_words

I would like to iterate over a stream of words, say, from STDIN or a file (or any random input stream). Typically, this is done like this, And then one can simply, For a more concrete example, let's say we … Continue reading
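
One plausible shape for such a helper, assuming words are separated by whitespace and the stream yields lines (as `sys.stdin` and open files do). This is a sketch, not necessarily the `iter_words` from the full post.

```python
def iter_words(stream):
    """Lazily yield whitespace-separated words from any iterable of lines."""
    for line in stream:
        # str.split() with no arguments also skips blank lines and repeated spaces.
        yield from line.split()
```

Because it is a generator, it can be used as `for word in iter_words(sys.stdin): ...` without reading the whole input into memory.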

Posted in python

Punycode

I would like a webapp that supports UTF-8 URLs. For example, https://去.cc/叼, where both the path and the server name contain non-ASCII characters. The path /叼 can be handled easily with %-encodings, e.g., Note: this is similar to the raw … Continue reading
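
For reference, both encodings are available in the Python standard library. The sketch below uses `urllib.parse.quote` for the path and the built-in `idna` codec for the server name; the variable names are only illustrative.

```python
from urllib.parse import quote, unquote

# The path is percent-encoded from its UTF-8 bytes.
path = quote("/叼")

# The server name is converted label by label to its ASCII (Punycode) form.
host = "去.cc".encode("idna")

# Both conversions are reversible.
original_path = unquote(path)
original_host = host.decode("idna")
```

Note that the built-in `idna` codec implements the older IDNA 2003 rules, so a webapp with stricter requirements may need a third-party library.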

Posted in python, software arch.

Graph Search

I would like to discover paths between two nodes on a graph. Let's say we have a graph that looks something like this: The graph object contains a collection of nodes and their corresponding connections. If it's a bi-directional graph, … Continue reading
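
A small sketch of one way to do this, assuming the graph is a plain dict that maps each node to a list of its neighbours. It uses breadth-first search, so the path it returns has the fewest edges. The function name `find_path` is hypothetical.

```python
from collections import deque


def find_path(graph, start, goal):
    """Return a shortest path from start to goal as a list of nodes, or None."""
    if start == goal:
        return [start]
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        for neighbor in graph.get(path[-1], ()):
            if neighbor in visited:
                continue
            if neighbor == goal:
                return path + [neighbor]
            visited.add(neighbor)
            queue.append(path + [neighbor])
    return None  # goal is not reachable from start
```

For a bi-directional graph, each connection simply appears in the neighbour lists of both of its nodes.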

Posted in python, software arch.

python unittest

I would like to set up unit tests for a python application. There are many ways to do this, including doctest and unittest, as well as 3rd-party frameworks that leverage python's unittest, such as pytest and nose. I found the plain-old … Continue reading
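
For readers new to the built-in module, a test case looks roughly like this. The function `add` and the class `AddTest` are placeholders invented for the example.

```python
import unittest


def add(a, b):
    """Placeholder function under test."""
    return a + b


class AddTest(unittest.TestCase):
    def test_integers(self):
        self.assertEqual(add(2, 3), 5)

    def test_strings(self):
        self.assertEqual(add("a", "b"), "ab")
```

Saved in a file such as `test_add.py` (a hypothetical name), this can be run with `python -m unittest test_add`.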

Posted in python

locking and concurrency in python, part 2

Previously, I created a "MultiLock" class for managing locks and lockgroups across a shared file system. Now I want to create a simple command-line utility that uses this functionality. To start, we can create a simple runone() function that leverages … Continue reading

Posted in python, shell tips, software arch.

locking and concurrency in python, part 1

I would like to do file-locking concurrency control in python. Additionally, I would like to provide a "run-once-and-only-once" functionality on a shared cluster; in other words, I have multiple batch jobs to run over a shared compute cluster and I … Continue reading
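
One common building block for this is an exclusive lock file, created with `O_CREAT | O_EXCL` so that only one process can succeed. The sketch below is a simplified illustration and not the "MultiLock" class from this series. The name `lock_file` is hypothetical, and whether exclusive creation is reliable depends on the shared file system in use.

```python
import os
from contextlib import contextmanager


@contextmanager
def lock_file(path):
    """Yield True if this process created the lock file at `path`, else False."""
    try:
        # Fails with FileExistsError if another process already holds the lock.
        descriptor = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        yield False
        return
    try:
        with os.fdopen(descriptor, "w") as handle:
            handle.write(str(os.getpid()))  # record the owner for debugging
        yield True
    finally:
        os.remove(path)  # release the lock
```

A job would wrap its work in `with lock_file(path) as acquired:` and skip the work when `acquired` is false. A process that crashes without cleanup leaves a stale lock file behind, which a real implementation has to handle.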

Posted in python, software arch.

zip archive in python

I would like to create zip archives within a python batch script. I would like to compress individual files or entire directories of files. You can use the built-in zipfile module, and create a ZipFile as you would a normal … Continue reading
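
A short sketch of that approach with the standard `zipfile` module. The helper name `zip_path` is hypothetical, and the archive is assumed to be written outside the directory being compressed.

```python
import os
import zipfile


def zip_path(source, archive_name):
    """Compress a single file or a whole directory tree into `archive_name`."""
    with zipfile.ZipFile(archive_name, "w", zipfile.ZIP_DEFLATED) as archive:
        if os.path.isfile(source):
            archive.write(source, os.path.basename(source))
            return
        # Store entries relative to the parent so the top-level folder name is kept.
        parent = os.path.dirname(os.path.abspath(source))
        for folder, _subfolders, filenames in os.walk(source):
            for filename in filenames:
                full_path = os.path.join(folder, filename)
                archive.write(full_path, os.path.relpath(full_path, parent))
```

Empty directories are not recorded by this sketch, since only files are written to the archive.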

Posted in python, shell tips

timeout command in python

I would like to add a timeout to any shell command such that if it does not complete within a specified number of seconds, the command will exit. This would be useful for any long-running command where I'd like … Continue reading
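
In current Python 3, one way to get this behavior is the `timeout` argument of `subprocess.run`, which kills the child process and raises `subprocess.TimeoutExpired` when the limit is reached. The wrapper name `run_with_timeout` is hypothetical, and this may differ from the approach in the full post.

```python
import subprocess


def run_with_timeout(command, seconds):
    """Run `command` (a list of arguments); return its exit code, or None on timeout."""
    try:
        completed = subprocess.run(command, timeout=seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child process at this point.
        return None
    return completed.returncode
```

For example, `run_with_timeout(["sleep", "60"], 5)` would give up after five seconds and return `None`.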

Posted in python, shell tips

python slice and sql every Nth row

I would like to retrieve every Nth row of a SQL table, and I would like this accessed via a python slice function. A python slice allows access to a list (or any object that implements a __getitem__ method) by … Continue reading
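
To make the idea concrete, here is a sketch in which `__getitem__` receives a slice object and turns it into a query. It uses an in-memory `sqlite3` database as a stand-in for MySQL or Oracle, and it assumes an integer `id` column numbered 0, 1, 2, ... without gaps. The class name `TableSlice` is hypothetical.

```python
import sqlite3


class TableSlice:
    """Hypothetical wrapper that maps table[start:stop:step] onto a SQL query."""

    def __init__(self, connection, table):
        self.connection = connection
        self.table = table  # trusted identifier only; never pass user input here

    def __len__(self):
        query = f"SELECT COUNT(*) FROM {self.table}"
        return self.connection.execute(query).fetchone()[0]

    def __getitem__(self, key):
        if not isinstance(key, slice):
            raise TypeError("only slices are supported in this sketch")
        # slice.indices() resolves None and negative bounds against the row count.
        start, stop, step = key.indices(len(self))
        if step < 1:
            raise ValueError("only positive steps are supported in this sketch")
        query = (
            f"SELECT * FROM {self.table} "
            "WHERE id >= ? AND id < ? AND (id - ?) % ? = 0 ORDER BY id"
        )
        return self.connection.execute(query, (start, stop, start, step)).fetchall()
```

With this in place, `TableSlice(connection, "rows")[::3]` returns every third row, and the filtering happens in the database instead of in python.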

Posted in data arch., mysql, oracle, python

python, finding recurring pairs of data

I would like to find all pairs of data that appear together at least 10 times. For example, given a large input file of keywords: >>> foo, bar, spam, eggs, durian, stinky tofu, ... >>> fruit, meat, vinegar, sphere, foo, … Continue reading
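
A straightforward sketch of the counting step, assuming each input line is a comma-separated list of keywords and that the pair counts fit in memory. The function name `recurring_pairs` is hypothetical, and a very large file may need a different strategy.

```python
from collections import Counter
from itertools import combinations


def recurring_pairs(lines, minimum=10):
    """Return {(keyword_a, keyword_b): count} for pairs seen together on at least `minimum` lines."""
    counts = Counter()
    for line in lines:
        # De-duplicate and sort so each pair has one canonical ordering.
        keywords = sorted({word.strip() for word in line.split(",") if word.strip()})
        counts.update(combinations(keywords, 2))
    return {pair: count for pair, count in counts.items() if count >= minimum}
```

Since a file object iterates line by line, `recurring_pairs(open(filename))` would process the input without loading it all at once.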

Posted in python

python, analyzing csv files, part 2

Previously, we discussed analyzing CSV files, parsing the csv into a native python object that supports iteration while providing easy access to the data (such as a sum by column header). For very large files this can be cumbersome, especially … Continue reading
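
As a point of comparison, a sum by column header can also be computed in a single streaming pass with the standard `csv` module, without building the whole report in memory. The function name `sum_column` is hypothetical, and the column is assumed to contain only numeric values.

```python
import csv


def sum_column(stream, header):
    """Sum the numeric values under `header`, reading one CSV row at a time."""
    return sum(float(row[header]) for row in csv.DictReader(stream))
```

Here `stream` is any open text file or other iterable of lines whose first line is the header row.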

Posted in data arch., python, software arch.

python, analyzing csv files, part 1

I would like to analyze a collection of CSV (comma-separated-values) files in python. Ideally, I would like to treat the csv data as a native python object. For example, >>> financial_detail = Report('financial-detail.csv') >>> transactions = {} >>> for row … Continue reading

Posted in data arch., python, software arch.