This term, I'm teaching Complex Systems 530 - Computer Modeling for Complex Systems at the University of Michigan Center for the Study of Complex Systems. In the spirit of open science, all course material will be available online on GitHub. You can browse the repository here: https://github.com/mjbommar/cscs-530-w2015. In the course, we're exploring why and
One of the more exciting and public projects we've been working on lately has finally come to light - our Supreme Court prediction project with Dan Katz and Josh Blackman. This project is exactly what you'd expect - a framework for predicting Supreme Court decisions, though unlike previous projects, it is meant to span the Court's entire history.
In our last post, we went over a range of options for performing approximate sentence matching in Python, an important task for many natural language processing and machine learning applications. To begin, we defined terms like: tokens: a word, number, or other "discrete" unit of text; stems: words that have had their "inflected" pieces removed based on
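To make those two definitions concrete, here is a minimal sketch of tokenization and stemming. The `tokenize` and `stem` helpers below are toy illustrations of my own, not the code from the post; a real pipeline would use something like NLTK's `word_tokenize` and `PorterStemmer` instead of this naive suffix list.

```python
import re

def tokenize(text):
    # Toy tokenizer: lowercase the text and pull out runs of
    # letters or digits as "discrete" units.
    return re.findall(r"[a-z0-9]+", text.lower())

# Toy suffix list for illustration only; a real stemmer (e.g. Porter)
# applies a much richer set of context-sensitive rules.
SUFFIXES = ("ing", "ed", "es", "s")

def stem(token):
    # Strip the first matching "inflected" suffix, leaving at least
    # a three-character base so short words pass through unchanged.
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

tokens = tokenize("The cats, running fast!")
stems = [stem(t) for t in tokens]
```

Here `tokenize("The cats, running fast!")` yields `["the", "cats", "running", "fast"]`, and stemming maps `"cats"` to `"cat"` - crude, but enough to show how both operations normalize surface text before matching.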
Let's imagine you have a sentence of interest. You'd like to find all occurrences of this sentence within a corpus of text. How would you go about this? The most obvious answer is to look for exact matches of the sentence. You'd search through every sentence of your corpus, checking to see if every character of the
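The exact-match baseline described above can be sketched in a few lines of Python. Both helper names below are hypothetical, and the sentence splitter is deliberately naive (it just breaks on terminal punctuation followed by whitespace), but the comparison itself is the character-for-character equality check the post describes.

```python
import re

def split_sentences(text):
    # Naive sentence splitter: break on ., !, or ? followed by whitespace.
    # Real corpora would need a proper sentence tokenizer.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def find_exact_matches(target, corpus):
    # Return the indices of corpus sentences that match the target exactly,
    # character for character.
    sentences = split_sentences(corpus)
    return [i for i, s in enumerate(sentences) if s == target]

corpus = "The cat sat. The dog ran. The cat sat."
matches = find_exact_matches("The cat sat.", corpus)
```

Running this finds matches at indices 0 and 2 - and also shows the approach's brittleness, since even a missing period or a single changed character makes a sentence invisible to exact matching, which is what motivates the approximate methods in the rest of the post.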
Isotonic regression is a great tool to keep in your repertoire; it's like weighted least-squares with a monotonicity constraint. Why is this so useful, you ask? Take a look at the example relationship below. (You can follow along with the Python code here). Let's imagine that the true relationship between x and y is characterized piece-wise by a sharp
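For readers who want to see the "weighted least-squares with a monotonicity constraint" idea in code: the standard fitting method is the pool adjacent violators algorithm (PAVA), and the sketch below is my own minimal pure-Python version of it, not the code linked from the post. In practice you would reach for `sklearn.isotonic.IsotonicRegression`.

```python
def isotonic_regression(y, w=None):
    """Fit a nondecreasing sequence to y, minimizing weighted squared error,
    via the pool adjacent violators algorithm (PAVA)."""
    if w is None:
        w = [1.0] * len(y)
    # Each block holds (mean, total weight, count of pooled points).
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge backwards while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    # Expand each block back to one fitted value per input point.
    fit = []
    for m, _, c in blocks:
        fit.extend([m] * c)
    return fit

fit = isotonic_regression([1, 3, 2, 4])
```

For the input `[1, 3, 2, 4]`, the violating pair (3, 2) is pooled to its mean, giving the nondecreasing fit `[1, 2.5, 2.5, 4]` - exactly the kind of monotone, piecewise-constant curve the example relationship in the post calls for.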