Fuzzy match sentences in Python
Let's imagine you have a sentence of interest. You'd like to find all occurrences of this sentence within a corpus of text.
Isotonic Regressions in scikit-learn
Isotonic regression is a great tool to keep in your repertoire; it's like weighted least squares with a monotonicity constraint. Why is this so useful, you
Featured in Wired: Measuring the Complexity of the Law
Thanks to Sam Arbesman (@arbesman) for featuring my paper with Dan, Measuring the Complexity of the Law: The United States Code, on his
Automating Oracle ORION I/O testing
Oracle ORION is a powerful tool for evaluating realistic OLTP and DSS/DW I/O performance. ORION should be a part of
AWS EC2 hs1.8xlarge Oracle ORION benchmark results
Benchmarking I/O with Oracle ORION is an important part of planning, baselining, and performance-tuning Oracle environments. I've previously provided ORION
ipython notebook for R: Quickstart for Ubuntu
If you're like me, you love ipython notebook but often write R. RStudio's integrated RMarkdown is nice, but for some
Is the Tax Code the longest Title?
Last week, I shared that Dan Katz and I had finally published a draft of our paper, Measuring the Complexity of
Slides from ReInvent Law Silicon Valley Talk
Live from ReInvent Law Silicon Valley, where I gave an Ignite-style talk drawing an analogy to law's future from finance's past. Slides embedded below and video
Automating Oracle Database deployment with Amazon Web Services, fabric, and boto – SEMOP Talk, Feb 12, 2013
I'll be giving a talk tonight on automated Oracle database deployment at the SouthEast Michigan Oracle Professionals (SEMOP) Meetup Group. While I'll be following up on this
Git Repository for Congressional Bill Statistics
After a nice Twitter conversation this morning, I finally got the impetus to release the source for my Congressional Bill Statistics data. You
Connecting R to an Oracle database with RJDBC
In many circumstances, you might want to connect R directly to a database to store and retrieve data. If the source database is an Oracle
Retrieving the VIX term structure in R
Much of my time lately has gone into analyzing and trading products in the volatility complex. As a result, I regularly watch the VIX
Natural Language Processing and Machine Learning for e-Discovery – Slides from guest lecture at MSU College of Law
Fellow Computational Legal Studies blogger and MSU law prof Dan Katz invited me to give an expert guest lecture for his e-Discovery seminar. This