Interactive Natural Language Processing for Legal Text

Update: We received the best student paper award for our paper at JURIX’15!

In an earlier post, I talked about my work on Natural Language Processing in the clinical domain. The main idea behind the project is to enable domain experts to build machine learning models for analyzing text. We do this by designing usable NLP tools that spare users from having to send datasets off to machine learning experts or to understand the inner workings of the algorithms. The post also features a demo video of the prototype tool that we have built.

I was presenting this work at my program's bi-weekly meetings when Jaromir, a fellow ISP graduate student, pointed out that such an approach could be useful for his work as well. Jaromir also holds a degree in Law and works on building AI systems for legal applications. As a result, we ended up collaborating on a project that applies the approach to statutory analysis. While the project's main topic of discussion is the framework in which human experts cooperate with a machine learning text-classification algorithm, we also ended up augmenting our approach with a new way of capturing and re-using knowledge. In our tool, datasets and models are treated separately and are not tied together. So, if you had built a classification model for, say, statutes from the state of Kansas, you would not need to start from scratch when you later turn to analyzing laws from Alaska. This gives us a better starting point on all the performance measures and lets us build a model using fewer training examples.

Figure: The results of the cold-start (Kansas) and knowledge re-use (Alaska) experiments. KS stands for Kansas, AK for Alaska, 1p and 2p for the first (ML model-oriented) and second (interaction-oriented) evaluation perspectives, P for precision, R for recall, F1 for the F1 measure, and ROC with a number for the ROC curve of the ML classifier trained on that number of documents.
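
To make the re-use idea concrete, here is a minimal sketch of the cold-start/warm-start distinction in scikit-learn terms. It is a hypothetical stand-in, not the pipeline our tool actually uses: a hashing vectorizer keeps the feature space fixed across datasets, so a classifier trained on one state's statutes can be incrementally updated on another's.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

# Toy stand-ins for labeled statutory provisions (1 = relevant to the task).
kansas_docs = ["the board shall adopt rules and regulations",
               "no permit shall be issued unless"]
kansas_labels = [1, 0]
alaska_docs = ["the department may adopt regulations under this chapter"]
alaska_labels = [1]

# Hashing keeps the feature space identical across datasets, so a model
# trained on one state's statutes can later be updated with another's.
vectorizer = HashingVectorizer(n_features=2 ** 16)
model = SGDClassifier()

# Cold start: train on Kansas from scratch.
model.partial_fit(vectorizer.transform(kansas_docs), kansas_labels, classes=[0, 1])

# Knowledge re-use: warm-start on Alaska with only a few new labels,
# instead of collecting a full training set again.
model.partial_fit(vectorizer.transform(alaska_docs), alaska_labels)
```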

We will be presenting this work at JURIX’15, the 28th International Conference on Legal Knowledge and Information Systems. Previously, we had presented portions of this work at the AMIA Summit on Clinical Research Informatics and at the ACM IUI Workshop on Visual Text Analytics.

References

Jaromír Šavelka, Gaurav Trivedi, and Kevin Ashley. 2015. Applying an Interactive Machine Learning Approach to Statutory Analysis. In Proceedings of the 28th International Conference on Legal Knowledge and Information Systems (JURIX ’15). Braga, Portugal. [PDF] – Awarded the Best Student Paper (Top 0.01%).

Clinical Text Analysis Using Interactive Natural Language Processing

Update: Here’s our full paper announcement with source-code release…

I am working on a project to support the use of Natural Language Processing in the clinical domain. Modern NLP systems often make use of machine learning techniques. However, physicians and other clinicians who are interested in analyzing clinical records may be unfamiliar with these methods. Our project aims to enable such domain experts to make use of Natural Language Processing through a point-and-click interface. It combines novel text visualizations to help its users make sense of NLP results, revise models, and understand changes between revisions. It also allows them to make any necessary corrections to computed results, thus forming a feedback loop and helping improve the accuracy of the models.
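
Schematically, the feedback loop looks like the sketch below. The `review` function is a hypothetical stand-in for our point-and-click interface; everything else is plain scikit-learn, which is not necessarily what the tool uses internally.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for clinical notes and their labels (1 = symptom present).
notes = ["pt denies chest pain", "chest pain radiating to left arm",
         "no acute distress"]
labels = [0, 1, 0]

def review(notes, predictions):
    """Hypothetical stand-in for the point-and-click review step: show each
    note with its prediction and return (index, corrected_label) pairs for
    anything the expert disagrees with."""
    return []  # pretend the expert accepted every prediction

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(notes, labels)

while True:
    predictions = model.predict(notes)
    corrections = review(notes, predictions)
    if not corrections:
        break  # the expert is satisfied; stop iterating
    for index, corrected_label in corrections:
        labels[index] = corrected_label
    model.fit(notes, labels)  # retrain with the expert's corrections
```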

Here’s the walk-through video of the prototype tool that we have built:

At this point we are redesigning some portions of our tool based on feedback from a formative user study with physicians and clinical researchers. Our next step will be to conduct an empirical evaluation of the tool to test our hypotheses about its design goals.

We will be presenting a demo of our tool at the AMIA Summit on Clinical Research Informatics and also at the ACM IUI Workshop on Visual Text Analytics in March.

References

  1. Gaurav Trivedi. 2015. Clinical Text Analysis Using Interactive Natural Language Processing. In Proceedings of the 20th International Conference on Intelligent User Interfaces Companion (IUI Companion ’15). ACM, New York, NY, USA, 113-116. DOI 10.1145/2732158.2732162 [Presentation] [PDF]
  2. Gaurav Trivedi, Phuong Pham, Wendy Chapman, Rebecca Hwa, Janyce Wiebe, Harry Hochheiser. 2015. An Interactive Tool for Natural Language Processing on Clinical Text. Presented at 4th Workshop on Visual Text Analytics (IUI TextVis 2015), Atlanta. http://vialab.science.uoit.ca/textvis2015/ [PDF]
  3. Gaurav Trivedi, Phuong Pham, Wendy Chapman, Rebecca Hwa, Janyce Wiebe, and Harry Hochheiser. 2015. Bridging the Natural Language Processing Gap: An Interactive Clinical Text Review Tool. Poster presented at the 2015 AMIA Summit on Clinical Research Informatics (CRI 2015). San Francisco. March 2015. [Poster][Abstract]

Ugly Pic Tweet

Lately I have observed the Twitterati following a trend of tweeting “text” as images. My timeline was completely filled with such tweets today.

This is even encouraged by Twitter, as it expands all picture tweets by default.

So, to further spread this epidemic (and thereby convince Twitter to do something about it), I re-purposed one of my Interactive System Design class assignments [1] into an Ugly-Pic-Tweeter.
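
The core of it is just rendering a string onto a bitmap. Here's a minimal sketch with Pillow; the assignment code itself is different and does a bit more:

```python
from PIL import Image, ImageDraw

def uglify(text, size=(440, 220)):
    """Render a tweet's text onto a plain white image, ready to attach as a picture."""
    image = Image.new("RGB", size, color="white")
    draw = ImageDraw.Draw(image)
    # Default font, no styling: ugly on purpose.
    draw.multiline_text((10, 10), text, fill="black")
    return image

uglify("This could have just been a normal tweet.").save("ugly_tweet.png")
```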

Go ahead, start posting your own ugly pic tweets. May you fill your followers’ timelines with them!


Footnotes

  1. Thanks Julio for teaming up for the original assignment 🙂 ^

Kivy wrap for the summer

As I conclude my summer work on Kivy and Plyer, here's a post summarizing all the contributions I have made. It should also be a useful starting point when I wish to revisit any of this in the future.

To draw a comparison with the current state of Plyer development, this table lists the facades supported before the summer started (an X marks a supported platform):

Platform Android < 4.0 Android > 4.0 iOS Windows OSX Linux
Accelerometer X X X
Camera (taking picture) X X
GPS X X
Notifications X X X X X
Text to speech X X X X X
Email (open mail client) X

If you have been following the updates, you would have come across my weekly progress posts over the last couple of months. Here’s a list of all such posts since mid-summer for easy access (also check out my mid-summer summary post):

  1. I can haz commit access and other updates
  2. Maintenance work in progress
  3. Plyer on iOS
  4. More, more facades

And in comparison to the table above, this is how Plyer support looks as of today, after all these changes:

Platform Android < 4.0 Android > 4.0 iOS Windows OSX Linux
Accelerometer X X X X X
Camera (taking picture) X X
GPS X X
Notifications X X X X X
Text to speech X X X X X X
Email (open mail client) X X X X X
Vibrator X
Sms (send messages) X X
Compass X X X
Unique ID (IMEI or SN) X X X X X X
Gyroscope X X X
Battery X X X X X X
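
From an app's point of view, each row above is a single platform-independent call. For example, assuming current Plyer naming:

```python
from plyer import battery, notification

# One import, one call, any platform marked with an X above:
notification.notify(title="Build finished", message="All tests passed")
print(battery.status)  # a dict along the lines of {'isCharging': False, 'percentage': 80.0}
```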

Of course, there's more here than meets the eye: a lot of background work went into writing these facades. That included understanding the individual platforms' APIs and working with the other Kivy projects (Pyjnius and Pyobjus) that support this work. Some of the changes called for a re-write of old facades in order to follow a consistent approach. Since Plyer is at an early stage of development, I also contributed some maintenance code and wrote build scripts.
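
Boiled down, the consistent approach is a platform-neutral facade class plus one implementation per platform, selected at import time. The sketch below is illustrative; the names and details are simplified rather than Plyer's exact internals:

```python
class Battery(object):
    """Public facade: the platform-neutral API every implementation honors."""

    @property
    def status(self):
        return self._get_status()

    def _get_status(self):
        raise NotImplementedError()


class LinuxBattery(Battery):
    """One per-platform implementation; the library picks the right
    subclass for the running platform when the facade is imported."""

    def _get_status(self):
        # Read the charge level from sysfs (the exact path varies by machine).
        with open('/sys/class/power_supply/BAT0/capacity') as f:
            percentage = float(f.read())
        return {'isCharging': None, 'percentage': percentage}
```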

At the beginning of August, I took a two-week break from facade development and made recommendations on making Kivy apps more accessible. I looked into existing projects that could be useful for us and pointed to a possible candidate that we could adapt for our purposes. Here are the two posts summarizing my investigation:

  1. Towards Making Kivy Apps Accessible
  2. Towards Making Kivy Apps Accessible – 2

At this point, I would also like to include a thank-you note to everyone on #kivy and #plyer on freenode for helping me out whenever I got stuck. This was the first time I actively participated in IRC discussions over an extended period, and I tried to return the favor by offering help, when I could, to other new users. Apart from the chance to work with the Kivy community from all around the world (with so many timezones!), there were a couple of other firsts that I experienced while working on the project. Those served as good learning experiences and as motivation for making contributions to open source.

Overall, it was quite a fun experience contributing to Kivy over the summer, and I hope to continue doing so every now and then. As Kivy gains more popularity every day, I hope to see many more users diving into writing code for it and becoming part of this community. I hope these posts can also point them to relevant development opportunities.

More, more facades

This week I added many more facade implementations to Plyer. It was only a few days ago that I started working on iOS, and I am happy that the list has grown quite a bit this week.

I also added Plyer to the kivy-ios toolchain, i.e. it is now part of the build-all script and will be available for use in apps packaged with Kivy for iOS.

Apart from that, I also made a couple of maintenance fixes to close the holes I noticed in the checked-in code and to fix style problems in other contributions.

Although this update was a short one, it did involve a considerable amount of coding effort.

As the summer comes to a close, I will spend the next week wrapping up my work, polishing the rough edges in my contributions so far, and, of course, writing the “obvious” bits and pieces of documentation that I may have ignored till now.

Plyer on iOS

I finally got access to an iOS development device this week. Unlike the other platforms, I didn't have any prior experience developing for it, so I spent some time familiarizing myself with the tools. The process was mostly painless but did end up consuming some time.

I also set up the kivy-ios toolchain for the first time. After a couple of hello-world programs and fixes for minor typos in the example code, I moved on to exploring pyobjus for writing Plyer facades. I worked on a new version of the accelerometer facade that does not depend on the bridge.m file supplied with all kivy-ios packages by default. Once I am done moving all the sensors over, we could do away with the classes contained in the bridge entirely (note to self!).
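
The bridge-free version drives CoreMotion from Python via pyobjus, roughly as sketched below. This is a simplification from memory; the exact pyobjus incantations (in particular the framework-loading step) may differ:

```python
from pyobjus import autoclass
from pyobjus.dylib_manager import load_framework, INCLUDE

# Talk to CMMotionManager directly instead of going through bridge.m.
load_framework(INCLUDE.CoreMotion)  # assuming CoreMotion is among pyobjus' known frameworks
CMMotionManager = autoclass('CMMotionManager')

manager = CMMotionManager.alloc().init()
manager.startAccelerometerUpdates()

# Later, poll the most recent sample:
data = manager.accelerometerData
if data:
    acceleration = data.acceleration
    print(acceleration.x, acceleration.y, acceleration.z)
```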

While playing with this code, I also noticed something interesting on the Xcode dashboard:

It turns out that we had hit a major bug in Pyobjus that was causing the memory allocated for accelerometerData to leak. In fact, this would happen everywhere you'd otherwise need something like @autoreleasepool in your Objective-C code; Pyobjus objects didn't account for cases like these. Tito suggested a fix for the issue but is still working on finalizing it.

Meanwhile, I also created another iOS facade, this one for retrieving battery status. Although it was fairly short to code, I did have to learn quite a few background details before I could finish it. I hope this will make it easier to finish the other facades in the coming weeks.
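
For the curious, the gist of the battery facade is that UIDevice already exposes everything needed once monitoring is switched on. A rough sketch, not the exact Plyer code:

```python
from pyobjus import autoclass

device = autoclass('UIDevice').currentDevice()
device.setBatteryMonitoringEnabled_(True)  # pyobjus maps the selector's ':' to '_'

level = device.batteryLevel   # 0.0 to 1.0, or -1.0 when unknown
state = device.batteryState   # an enum: unknown / unplugged / charging / full
print('battery at {:.0f}%'.format(level * 100))
```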

Towards Making Kivy Apps Accessible, Part 2

In this second part of my post on making Kivy apps accessible, I would like to describe some of the existing libraries and APIs we can model our accessibility features upon. Last week we identified that the main obstacle to creating accessible Kivy apps is a missing module that could communicate widget states to screen readers. I explored other frameworks that may have tried to solve these problems before us, and will discuss one such project in particular.

Kivy includes pygame as one of its supported window providers. While looking for accessible apps in Python, I found a GUI engine for pygame called OcempGUI. As part of that project, the authors also worked on an accessibility module named Papi:

Papi, the Python Accessibility Programming Interface, is a Python wrapper around the GNOME ATK toolkit. It allows a developer to make python objects and applications easily accessibility aware without the need to install PyGTK and the GNOME accessibility components. Instead it only depends on ATK and – on the developers behalf – the ATK/AT-SPI bridge shipped with AT-SPI.

There is also some support for Microsoft Active Accessibility (MSAA) on Microsoft Windows.

Papi is not limited to GUIs built with OcempGUI; it can help support accessibility for any Python object. Here's an example accessible app using Papi:

[Embedded gist: an example accessible app using Papi]
I created a gist, as I couldn't find a way to embed a Sourceforge file, but you can find the complete project there. You can also read more about Papi and OcempGUI on their website. Unfortunately, the project appears to be no longer in active development; its last release was in 2008.

Assuming we can build upon this module, we'd be able to support accessible Kivy apps on Windows and Linux. That still leaves OS X accessibility, as well as the mobile platforms.

OS X has very good accessibility support, but only when you are writing Cocoa or Carbon apps. To provide a similar level of support from Kivy, you would likely need to hack around a bit. With input from the folks at #macdev on freenode, I set out to do just that. They suggested that I could subclass NSApplication and implement the NSAccessibility protocol within the Kivy app. This would involve creating a hierarchy of fake UI objects that provide accessibility implementations, as in a Cocoa app. I did make some progress by using our in-house project, PyObjus, to access the AppKit frameworks and subclass Objective-C classes, but the situation became a little too overwhelming to handle within this one week, and I haven't yet succeeded in creating a working proof of concept.

Fortunately, Apple has recently introduced a new API, starting with OS X 10.10, that includes an NSAccessibilityElement class. I am hoping this will help avoid creating fake UI objects to implement the accessibility protocols. Here are a couple of examples demonstrating it, though I haven't tried them out yet; I need to get access to the beta versions first. You can also watch the WWDC 2014 session videos on accessibility on OS X describing it.

That still leaves the mobile platforms. Here are links to the relevant docs pages for Android (http://developer.android.com/guide/topics/ui/accessibility/apps.html#custom-views) and iOS (https://developer.apple.com/library/ios/documentation/UserExperience/Conceptual/iPhoneAccessibility/Accessibility_on_iPhone/Accessibility_on_iPhone.html). I would like to use these bookmarks when I dive deeper into the mobile accessibility APIs. This week, however, I plan to go back to my original plan and continue the Plyer work from where I left off.