Category Archives: Research

Fuzzy match sentences in Python

Let’s imagine you have a sentence of interest. You’d like to find all occurrences of this sentence within a corpus of text. How would you go about this? The most obvious answer is to look for exact matches of the sentence. You’d search through every sentence of your corpus,...
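To make the exact-versus-fuzzy idea concrete, here is a minimal sketch using only the standard library's `difflib`. The function name `find_matches`, the toy corpus, and the 0.9 threshold are illustrative assumptions, not taken from the full post, which may well use a different similarity measure.

```python
from difflib import SequenceMatcher


def find_matches(target, sentences, threshold=1.0):
    """Return (index, sentence, score) for every sentence whose
    similarity ratio to `target` is at least `threshold`.

    Hypothetical helper for illustration. A threshold of 1.0 keeps only
    case-insensitive exact matches; lower values admit near matches.
    """
    matches = []
    for i, sentence in enumerate(sentences):
        # ratio() is 2*M/T: M matched characters, T total characters.
        score = SequenceMatcher(None, target.lower(), sentence.lower()).ratio()
        if score >= threshold:
            matches.append((i, sentence, score))
    return matches


# Toy corpus (made up for this sketch).
corpus = [
    "The quick brown fox jumps over the lazy dog.",
    "The quick brown fox jumped over the lazy dog.",
    "An unrelated sentence about isotonic regression.",
]
target = "The quick brown fox jumps over the lazy dog."

exact = find_matches(target, corpus, threshold=1.0)
fuzzy = find_matches(target, corpus, threshold=0.9)

print([i for i, _, _ in exact])
print([i for i, _, _ in fuzzy])
```

Exact matching misses the second sentence because of a single changed word; lowering the threshold picks it up while still rejecting the unrelated sentence.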

Isotonic Regressions in scikit-learn

Isotonic regression is a great tool to keep in your repertoire; it’s like weighted least-squares with a monotonicity constraint. Why is this so useful, you ask? Take a look at the example relationship below. (You can follow along with the Python code here.) Let’s imagine ...
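As a rough illustration of "weighted least-squares with a monotonicity constraint", the sketch below implements the pool-adjacent-violators idea in plain Python. It is not scikit-learn's implementation (the library exposes this as `sklearn.isotonic.IsotonicRegression`); the function name `isotonic_fit` and the sample data are assumptions made for this example.

```python
def isotonic_fit(y, weights=None):
    """Non-decreasing weighted least-squares fit to the sequence `y`.

    Hypothetical helper: a pool-adjacent-violators sketch that assumes
    the observations are already sorted by their x values.
    """
    if weights is None:
        weights = [1.0] * len(y)

    # Each block holds [weighted mean, total weight, number of points].
    blocks = []
    for value, w in zip(y, weights):
        blocks.append([float(value), float(w), 1])
        # Merge backwards while the last two blocks break monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])

    # Expand each block's mean back to one fitted value per point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Made-up noisy but roughly increasing observations.
observed = [1, 3, 2, 4, 3, 5]
print(isotonic_fit(observed))  # [1.0, 2.5, 2.5, 3.5, 3.5, 5.0]
```

Each out-of-order pair is replaced by its weighted average, which is the least-squares answer once the fitted values are required never to decrease.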

AWS EC2 hs1.8xlarge Oracle ORION benchmark results

Benchmarking I/O with Oracle ORION is an important part of planning, baselining, and performance-tuning Oracle environments. I’ve previously provided ORION results for the hi1.4xlarge SSD-backed instance class, and based on some recent work, I wanted to provide an update for the newer hs1.8x...

Measuring the Complexity of the Law: The U.S. Code

Four years ago, Dan Katz and I began working on a project to measure the complexity of the law. Its genesis was, in every sense, an accident; in order to properly identify citations to the IRC in our VTR empirical review of U.S. Tax Court decisions, we had to deal with the informal, non-Blue ...

Grexit stage left: visualizing the online discussion around Greece’s possible Euro exit

While Tsipras and his Syriza coalition have been busy in the Greek parliament, the Internet has been abuzz with speculation that their platform will result in a Greek exit from the Euro currency. This prospect, affectionately dubbed “Grexit” by Citi in February, has been making the rou...