## Course material for Complex Systems 530 – Computer Modeling for Complex Systems

This term, I'm teaching Complex Systems 530 – Computer Modeling for Complex Systems at the University of Michigan Center for the Study of Complex Systems. In the spirit of open science, all course material will be available online on GitHub. You can browse the repository here: https://github.com/mjbommar/cscs-530-w2015. In the course, we're exploring why and

January 27th, 2015 | Consulting, Machine Learning, Programming, Training | 1 Comment

## Advanced approximate sentence matching in Python

In our last post, we went over a range of options for approximate sentence matching in Python, an important step in many natural language processing and machine learning tasks. To begin, we defined terms like: tokens: a word, number, or other "discrete" unit of text. stems: words that have had their "inflected" pieces removed based on

## Fuzzy match sentences in Python

Let's imagine you have a sentence of interest.  You'd like to find all occurrences of this sentence within a corpus of text.  How would you go about this? The most obvious answer is to look for exact matches of the sentence.  You'd search through every sentence of your corpus, checking to see if every character of the
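The contrast between exact and approximate matching can be sketched with the standard library alone. This is a minimal illustration, not the approach from the full post: the function names, the sample corpus, and the 0.8 threshold are all assumptions, and `difflib.SequenceMatcher` is just one convenient similarity measure.

```python
from difflib import SequenceMatcher


def exact_matches(target, sentences):
    """Return the sentences that equal the target character for character."""
    return [s for s in sentences if s == target]


def fuzzy_matches(target, sentences, threshold=0.8):
    """Return (sentence, ratio) pairs whose similarity meets the threshold.

    The threshold of 0.8 is an arbitrary illustrative choice.
    """
    results = []
    for s in sentences:
        # Compare case-insensitively; ratio() is 1.0 for identical strings.
        ratio = SequenceMatcher(None, target.lower(), s.lower()).ratio()
        if ratio >= threshold:
            results.append((s, ratio))
    return results


# Hypothetical corpus, already split into sentences.
corpus = [
    "The quick brown fox jumps over the lazy dog.",
    "The quick brown fox jumped over the lazy dog.",
    "An entirely unrelated sentence.",
]
target = "The quick brown fox jumps over the lazy dog."

print(exact_matches(target, corpus))
print([s for s, _ in fuzzy_matches(target, corpus)])
```

Exact matching finds only the identical sentence, while the fuzzy version also picks up the near-duplicate that differs by a single verb tense.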

## Isotonic Regressions in scikit-learn

Isotonic regression is a great tool to keep in your repertoire; it's like weighted least-squares with a monotonicity constraint. Why is this so useful, you ask? Take a look at the example relationship below. (You can follow along with the Python code here.) Let's imagine that the true relationship between x and y is characterized piece-wise by a sharp

June 8th, 2014 | Consulting, Machine Learning, Programming, Research | 1 Comment
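To make "weighted least-squares with a monotonicity constraint" concrete for the isotonic regression post, here is a minimal plain-Python sketch of the pool-adjacent-violators idea. It is not scikit-learn's implementation (the library provides `sklearn.isotonic.IsotonicRegression` for real use); the function name `isotonic_fit` and the sample data are assumptions for illustration, and the weights are assumed to be positive.

```python
def isotonic_fit(y, weights=None):
    """Non-decreasing weighted least-squares fit to y (pool adjacent violators).

    Assumes the points are already ordered by x and all weights are positive.
    """
    if weights is None:
        weights = [1.0] * len(y)

    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for value, weight in zip(y, weights):
        blocks.append([float(value), float(weight), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean2, w2, n2 = blocks.pop()
            mean1, w1, n1 = blocks.pop()
            total = w1 + w2
            pooled = (mean1 * w1 + mean2 * w2) / total
            blocks.append([pooled, total, n1 + n2])

    # Expand each block back out to one fitted value per input point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical noisy observations of an increasing relationship.
y = [1.0, 3.0, 2.0, 4.0, 3.5, 5.0]
print(isotonic_fit(y))  # → [1.0, 2.5, 2.5, 3.75, 3.75, 5.0]
```

Each out-of-order pair is replaced by its weighted mean, which is the least-squares value for a block that must be constant.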

## Is the Tax Code the longest Title?

Last week, I shared that Dan Katz and I had finally published a draft of our paper, Measuring the Complexity of the Law: The U.S. Code.  We'd previewed this research on Computational Legal Studies years ago.  Since then, we've received great feedback and a number of questions.   The most common question, even among legal professionals,

August 19th, 2013 | Law, Programming, Technology | 0 Comments

## Measuring the Complexity of the Law: The U.S. Code

Four years ago, Dan Katz and I began working on a project to measure the complexity of the law.  Its genesis was, in every sense, an accident; in order to properly identify citations to the IRC in our VTR empirical review of U.S. Tax Court decisions, we had to deal with the informal, non-Blue

August 13th, 2013 | Law, Research, Society | 0 Comments

## Revisiting text processing with R and Python

Back in 2011, I covered the relative performance difference of the most popular libraries for text processing in R and Python.   In case you can't guess the answer, Python and NLTK  won by a significant margin over R and tm.  Text processing with R seemed simple on paper, but performance and flexibility limitations have

May 25th, 2013 | Consulting, Programming | 0 Comments

## Generating SSH config from AWS hosts using boto

As a consultant and advisor to many firms running on or investigating AWS, I find SSH host and key management to be a constant struggle.  From IAM credentials to default OS logins, it's easy to lose time with constant lookups.  What we'd really like is to get a custom SSH config file for AWS.

March 9th, 2013 | Cloud, Consulting, Programming | 1 Comment
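As a rough sketch of what a generated SSH config for AWS hosts could look like, the snippet below formats a list of host records into `Host` blocks. It deliberately makes no boto calls: the host list, aliases, login names, and key paths are hypothetical stand-ins for whatever an AWS instance listing would return, and the function names are assumptions for illustration.

```python
def ssh_config_entry(alias, hostname, user, key_path):
    """Format one SSH config block for a single host."""
    lines = [
        f"Host {alias}",
        f"    HostName {hostname}",
        f"    User {user}",
        f"    IdentityFile {key_path}",
    ]
    return "\n".join(lines)


def build_ssh_config(hosts):
    """Join one block per host, separated by blank lines."""
    return "\n\n".join(ssh_config_entry(**host) for host in hosts) + "\n"


# Hypothetical records; in practice these would come from an instance listing.
hosts = [
    {
        "alias": "web-1",
        "hostname": "ec2-203-0-113-10.compute-1.amazonaws.com",
        "user": "ubuntu",
        "key_path": "~/.ssh/example-key.pem",
    },
    {
        "alias": "db-1",
        "hostname": "ec2-203-0-113-20.compute-1.amazonaws.com",
        "user": "ec2-user",
        "key_path": "~/.ssh/example-key.pem",
    },
]

config = build_ssh_config(hosts)
print(config)
```

With output like this saved to an SSH config file, `ssh web-1` would resolve the hostname, login, and key without a manual lookup.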

## Git Repository for Congressional Bill Statistics

After a nice Twitter conversation this morning, I finally got the impetus to release the source for my Congressional Bill Statistics data. You can find the source at this GitHub repository. I haven't taken the time to review licensing yet, but I won't be asserting anything more than CC3 Attribution on my code.

December 22nd, 2012 | Law, Programming, Research | 0 Comments

## Summary of community detection algorithms in igraph 0.6

Based on Launchpad traffic and mailing list responses, Gabor and Tamas will soon be releasing igraph 0.6. In celebration, I’ll be publishing a number of helpful lists and tables I’ve put together to organize information about igraph. In this post, we’ll cover the community detection algorithms (i.e., clustering, partitioning, segmenting) available in 0.6

June 17th, 2012 | Consulting, Programming | 0 Comments
