Four years ago, Dan Katz and I began working on a project to measure the complexity of the law. Its genesis was, in every sense, an accident; in order to properly identify citations to the IRC in our empirical review of U.S. Tax Court decisions, we had to deal with the informal, non-Bluebook citation style used by Tax Court judges. To do this, we scraped and parsed all possible subsection citations to the IRC and identified these citations in written opinions. When we were done, we had a directory hierarchy that corresponded to the titles, parts, chapters, and sections of the U.S. Code. Immersed in our NSF Fellowships at the University of Michigan Center for the Study of Complex Systems, we asked the contextually obvious question — could we measure the complexity of this system?
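To give a flavor of the citation-matching step, here is a minimal Python sketch of how informal IRC subsection citations might be pulled out of opinion text with a regular expression. This is purely illustrative — the pattern and function names are hypothetical, and the actual parser in the replication repository differs:

```python
import re

# Hypothetical pattern for informal IRC citations such as
# "section 61(a)(1)" or "sec. 162(a)"; not the actual parser
# used in mjbommar/us-code-complexity.
CITATION_RE = re.compile(
    r"(?:section|sec\.?|§)\s*"      # leading keyword or section symbol
    r"(\d+[A-Za-z]?)"               # section number, e.g. 61 or 280A
    r"((?:\([0-9A-Za-z]+\))*)",     # optional subsection path, e.g. (a)(1)
    re.IGNORECASE,
)

def extract_citations(text):
    """Return (section, subsection-path) tuples found in text."""
    return [(m.group(1), m.group(2)) for m in CITATION_RE.finditer(text)]

example = "Under section 61(a)(1), gross income includes compensation."
print(extract_citations(example))  # [('61', '(a)(1)')]
```

Tuples like these map naturally onto a directory hierarchy keyed by section and subsection, which is how the structure described above fell out of the citation work.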
Since then, this question has taken many forms: text and XML, LaTeX and Word, Physica A and CELS. The full paper, intended for broader consumption, has gone through hundreds of hours of major and minor revisions. We kept sitting on the paper because it was never perfect, because there was always another refinement, because we always had another idea.
Last Thursday, this paper finally saw its first "official" release. It's still a draft, and we still haven't decided which publication venue to pursue, but we felt it was important to contribute to the recent discussion around open Federal legislative material.
As part of the paper, we've released all necessary replication source code (Python) and data on GitHub. You can view the repository here: mjbommar/us-code-complexity.
Einstein's razor, a refinement of Ockham's razor, is often paraphrased as follows: make everything as simple as possible, but not simpler. This rule of thumb describes the challenge that designers of a legal system face — to craft simple laws that produce desired ends, but not to pursue simplicity so far as to undermine those ends. Complexity, simplicity's inverse, taxes cognition and increases the likelihood of suboptimal decisions.
While many scholars have offered descriptive accounts or theoretical models of legal complexity, empirical research to date has been limited to simple measures of size, such as the number of pages in a bill. In this paper, we address this gap by proposing an empirical framework for measuring relative legal complexity. This framework is based on "knowledge acquisition," an approach at the intersection of psychology and computer science that can account for the structure, language, and interdependence of law.