The project I am working on is the Freelaw Project, and more specifically Juriscraper. The Freelaw Project can roughly be broken up into two distinct parts. Part one is courtlistener.com. This part contains everything that goes into the website, its administration and all the supporting software for search, such as Apache Nutch. Part two is Juriscraper and everything that goes into scraping, crawling and indexing court opinions. The two parts are logically separate and can operate independently, but they are managed by the same people because they are so interrelated. I am working specifically on Juriscraper and the scraping of websites for court opinions. I may sometimes refer to Juriscraper or courtlistener.com on their own, but it's important to realize they are sub-parts of the Freelaw Project hosted at freelawproject.org.
The Freelaw Project is open for both ideological and practical reasons. It's clear that the founders of the Freelaw Project, Brian Carver and Michael Lissner, have ideological reasons for keeping the project open. The purpose of the project is to provide easier access to court opinions and to benefit civil society, so the project is well grounded in ideological foundations. However, there are also practical reasons behind the openness of the Freelaw Project. By encouraging collaboration, the Freelaw Project benefits from having multiple contributors and diverse ideas. For example, Krist Jin is working on the Citation Ferret, a Firefox plugin for citations that utilizes courtlistener.com. If the Freelaw Project were not open, this plugin might never have been conceived.
The community of the Freelaw Project is relatively small and the community of Juriscraper is even smaller. As far as I can tell, only Michael Lissner, Brian Carver, Krist Jin and I have contributed to Juriscraper. I say this based on pull requests and submissions to Juriscraper as recorded on the Juriscraper bitbucket.org site. The Freelaw Project as a whole is a bit larger but still relatively small. Brian Carver established a mailing list on September 9, 2013 and introduced everyone currently involved in the project. At the time this was 10 people including myself.
The Freelaw Project consists of the website courtlistener.com and the associated software that courtlistener.com requires. I hesitate to call it a product since it's not actually productized in any sense. It is never 'shipped' nor does it have fixed release dates. Juriscraper is licensed under the BSD license while Courtlistener is licensed under version 3 of the GNU Affero General Public License.
All contributors to the Freelaw Project must sign a copy of the Freelaw Project's own contributor agreement before their code will be merged into the Freelaw Project. This is mainly meant to protect the Freelaw Project from issues that may arise when contributors submit code which they are not legally permitted to contribute, such as copyrighted code. The agreement places the burden on contributors to ensure that they are legally permitted to provide any submissions they make to the Freelaw Project. It also grants the Freelaw Project a patent and copyright license for any and all code contributed by the contributor.
In August of 2013 Brian Carver and Michael Lissner incorporated The Free Law Project as a non-profit corporation with the sole purpose of managing and growing the Freelaw Project. Their bylaws and corporate meeting minutes can be viewed online. The Free Law Project non-profit has even received funding to allow Michael Lissner to work full-time on the project.
The Freelaw Project has its main website at freelawproject.org for news and information regarding the project. In addition, there is a mailing list, a Trello site, a Juriscraper bitbucket.org site and a courtlistener Bitbucket site. There is also an extremely useful Virtual Machine available online to help people like me get started.
Infrastructure is an important part of open source projects, and infrastructure choices can have important consequences for a project's success. The Freelaw Project chose Bitbucket instead of GitHub because of its support for Mercurial as well as Git on the client side. I find Bitbucket a bit more intuitive than GitHub, and some members found Mercurial easier to use than Git. The mailing list became necessary once the number of people working on the Freelaw Project grew past three or four. Beyond that point it was easier to have everyone use a mailing list than to send personal emails to one another.
Last weekend I made my first contribution to Juriscraper by writing a scraper for the Virginia state appellate courts. It was a bite-sized task that I accomplished with only around 30 lines of code. I used the Virtual Machine as a pre-configured development environment, which greatly sped up my work on the scraper. Once I had written the code I pushed it to my fork of Juriscraper, then initiated a pull request and sent an email to the mailing list. Michael Lissner merged my pull request and gave me some good feedback with hints for next time. I find the infrastructure standard and easy to use. The experience was unremarkable and relatively straightforward.
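To give a sense of what a scraper of this size involves, here is a minimal sketch in Python. It is not the code I contributed and it does not use Juriscraper's own classes. The sample HTML, the base URL, the class name OpinionListParser, the function parse_opinions and the field names are all hypothetical, chosen only to illustrate the general idea of pulling case names, dates and download links out of a court's opinion listing page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical listing page. The real court site's markup is not described
# in this text and will almost certainly differ.
SAMPLE_HTML = """
<table>
  <tr><td><a href="/opinions/1234.pdf">Smith v. Jones</a></td><td>2013-10-01</td></tr>
  <tr><td><a href="/opinions/1235.pdf">Doe v. Roe</a></td><td>2013-10-08</td></tr>
</table>
"""


class OpinionListParser(HTMLParser):
    """Collect one record per table row: case name, date and download URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.opinions = []
        self._row = None   # data gathered for the row being parsed
        self._cell = None  # text fragments of the cell being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {"cells": [], "url": None}
        elif tag == "td" and self._row is not None:
            self._cell = []
        elif tag == "a" and self._row is not None:
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's base URL.
                self._row["url"] = urljoin(self.base_url, href)

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None and self._row is not None:
            self._row["cells"].append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            cells = self._row["cells"]
            # Assumed layout: first cell is the case name, second is the date.
            if len(cells) >= 2 and self._row["url"]:
                self.opinions.append({
                    "case_name": cells[0],
                    "case_date": cells[1],
                    "download_url": self._row["url"],
                })
            self._row = None


def parse_opinions(html, base_url):
    """Return a list of opinion records found in the given HTML."""
    parser = OpinionListParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.opinions


if __name__ == "__main__":
    for opinion in parse_opinions(SAMPLE_HTML, "http://example.com/"):
        print(opinion["case_date"], opinion["case_name"], opinion["download_url"])
```

A real scraper would fetch the page over the network and follow the conventions of the project it is contributed to. This sketch only covers the parsing step, using the standard library so it can be run without any setup.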