Four years ago, Dan Katz and I began working on a project to measure the complexity of the law.  Its genesis was, in every sense, an accident; in order to properly identify citations to the IRC in our VTR empirical review of U.S. Tax Court decisions, we had to deal with the informal, non-Bluebook citation standard used by Tax Court judges.  To do this, we scraped and parsed out all possible subsection citations to the IRC and identified these citations in written opinions (if you’re interested, there’s more on this supervised methodology in the VTR appendix).  When we were done, we had a directory hierarchy that corresponded to titles, parts, chapters, and sections of the U.S. Code.  Immersed in our NSF Fellowships at the University of Michigan Center for the Study of Complex Systems, we asked the contextually obvious question – could we measure the complexity of this system?
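To give a flavor of what matching informal citations involves, here is a minimal sketch of a regex-based extractor for IRC subsection citations.  The pattern and function name are hypothetical illustrations, not the actual VTR pipeline, which handles far more citation variants (see the VTR appendix for the real methodology):

```python
import re

# Hypothetical pattern for informal citations such as "sec. 162(a)(1)"
# or "section 401(k)"; the real pipeline covers many more variants.
CITATION_RE = re.compile(
    r"\bsec(?:tion|\.)?\s+(\d+[A-Z]?)((?:\([a-zA-Z0-9]+\))*)",
    re.IGNORECASE,
)

def find_irc_citations(text):
    """Return (section, subsection-path) tuples found in an opinion."""
    matches = []
    for m in CITATION_RE.finditer(text):
        section = m.group(1)
        # Split the trailing "(a)(1)" chain into its components.
        subs = re.findall(r"\(([a-zA-Z0-9]+)\)", m.group(2))
        matches.append((section, tuple(subs)))
    return matches

print(find_irc_citations(
    "Petitioner relies on sec. 162(a)(1) and section 401(k)."
))
# → [('162', ('a', '1')), ('401', ('k',))]
```

Once every citation resolves to a (section, subsection) path, building the title/part/chapter/section directory hierarchy is a straightforward grouping step.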

Since then, this question has taken many forms: text and XML, LaTeX and Word, Physica A and CELS.  The full paper, intended for broader consumption, has gone through hundreds of hours of major and minor revisions.  We kept sitting on the paper because it was never perfect, because there was always another refinement, because we always had another idea.

Last Thursday, this paper finally saw its first “official” release.  It’s still a draft, and we still haven’t decided which publication venue to pursue, but we felt it was important to contribute to the recent discussion around open Federal legislative material (see, e.g., this FWC article).

As part of the paper, we’ve released all necessary replication source code (Python) and data on GitHub.  You can view the repository here:


If you’re interested in reading the paper, you can get it here:

Measuring the Complexity of the Law: The United States Code
Daniel Martin Katz, Michael James Bommarito II

Abstract below for reference:

Einstein’s razor, a corollary of Ockham’s razor, is often paraphrased as follows: make everything as simple as possible, but not simpler. This rule of thumb describes the challenge that designers of a legal system face — to craft simple laws that produce desired ends, but not to pursue simplicity so far as to undermine those ends. Complexity, simplicity’s inverse, taxes cognition and increases the likelihood of suboptimal decisions. In addition, unnecessary legal complexity can drive a misallocation of human capital toward comprehending and complying with legal rules and away from other productive ends.

While many scholars have offered descriptive accounts or theoretical models of legal complexity, empirical research to date has been limited to simple measures of size, such as the number of pages in a bill. No extant research rigorously applies a meaningful model to real data. As a consequence, we have no reliable means to determine whether a new bill, regulation, order, or precedent substantially affects legal complexity.

In this paper, we address this need by proposing an empirical framework for measuring relative legal complexity. This framework is based on “knowledge acquisition,” an approach at the intersection of psychology and computer science, which can take into account the structure, language, and interdependence of law. We then demonstrate the descriptive value of this framework by applying it to the U.S. Code’s Titles, scoring and ranking them by their relative complexity. Our framework is flexible, intuitive, and transparent, and we offer this approach as a first step in developing a practical methodology for assessing legal complexity.
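The shape of such a relative ranking can be sketched as a composite of normalized structure, language, and interdependence measures.  The sketch below is only illustrative: the component measures, equal weights, and the input figures are placeholders, not the knowledge-acquisition metric or data from the paper:

```python
# Illustrative composite: each Title gets three raw measures --
# structure (section count), language (word count), and
# interdependence (cross-reference count) -- which are min-max
# normalized across Titles and averaged into a relative score.

def normalize(values):
    """Min-max normalize a list of numbers to [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

def rank_titles(titles):
    """titles: dict of name -> (sections, words, cross_refs)."""
    names = list(titles)
    # Transpose into one list per dimension, then normalize each.
    dims = [normalize(list(col)) for col in zip(*(titles[n] for n in names))]
    scores = {
        name: sum(dims[d][i] for d in range(len(dims))) / len(dims)
        for i, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Placeholder inputs, NOT figures from the paper:
example = {
    "Title 26 (Internal Revenue Code)": (9800, 3_400_000, 51_000),
    "Title 17 (Copyrights)": (1360, 190_000, 2_200),
}
for name, score in rank_titles(example):
    print(f"{name}: {score:.2f}")
```

Because each component is normalized across Titles before combining, the output is a relative ranking rather than an absolute complexity value, which matches the paper's framing of *relative* legal complexity.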