Update: We received the best student paper award for our paper at JURIX’15!
In an earlier post, I talked about my work on Natural Language Processing in the clinical domain. The main idea behind the project is to enable domain experts to build machine learning models for analyzing text. We do this by designing usable tools for NLP without really having the need to send datasets to machine learning experts or understanding the inner working details of the algorithms. The post also features a demo video of the prototype tool that we have built.
I was presenting this work at my program’s bi-weekly meetings where Jaromir, a fellow ISP graduate student, pointed out that such an approach could be useful for his work as well. Jaromir also holds a degree in Law and works on building AI systems for legal applications. As a result, we ended up collaborating on a project on using the approach for statutory analysis. While, the main topic of discussion in the project is on the framework in which a human experts cooperate with a machine learning text classification algorithm, we also ended up augmenting our approach with a new way of capturing and re-using knowledge. In our tool datasets and models are treated separately and our not tied together. So, if you were building a classification model for say statutes from the state of Alaska, when you need to analyze laws from Kansas you need not start from scratch. This allows us to be in a better starting place in terms of all the performance measures and build a model using fewer training examples.
Jaromír Šavelka, Gaurav Trivedi, and Kevin Ashley. 2015. Applying an Interactive Machine Learning Approach to Statutory Analysis. In Proceedings of the 28th International Conference on Legal Knowledge and Information Systems (JURIX ’15). Braga, Portugal. [PDF] – Awarded the Best Student Paper (Top 0.01%).
If you follow machine learning topics in the news, I am sure by now you would have come across Andrej Karpathy‘s blog post on The Unreasonable Effectiveness of Recurrent Neural Networks. Apart from the post itself, I have found it very fascinating to read about the diverse applications that its readers have found for it. Since then I have spent several hours hacking with different machine learning models to compose tabla rhythms:
Although Tabla does not have a standardized musical notation that is accepted by all, it does have a language based on the bols (literally, verbalize in English) or the sounds of the strokes played on it. These bols may be expressed in written form which when pronounced in Indian languages sound like the drums. For example, the theka for the commonly used 16-beat cycle – Teental is written as follows:
Dha | Dhin | Dhin | Dha | Dha | Dhin | Dhin | Dha
Dha | Tin | Tin | Ta | Ta | Dhin | Dhin | Dha
For this task, I made use of Abhijit Patait‘s software – TaalMala, which provides a GUI environment for composing Tabla rhythms in this language. The bols can then be synthesized to produce the sound of the drum. In his software, Abhijit extended the tabla language to make it easier for users to compose tabla rhythms by adding a square brackets after each bol that specify the number of beats within which it must be played. You could also lay more emphasis on a particular bol by adding ‘+’ symbols which increased their intensity when synthesized to sound. Variations of standard bols can be defined as well based on different the hand strokes used:
Dha1 = Na + First Closed then Open Ge
Now that we are armed with this background knowledge, it is easy to see how we may attempt to learn tabla like a language model using Natural Language Processing techniques. Predictive modeling of tabla has been previously explored in "N-gram modeling of tabla sequences using variable-length hidden Markov models for improvisation and composition" (Avinash Sastry, 2011). But, I was not able to get access to the datasets used in the study and had to rely on the compositions that came with the TaalMala software. This is comparatively a much smaller database than what you would otherwise use to train a neural network: It comprises of 207 rhythms with 6,840 bols in all. I trained a char-rnn and sampled some compositions after priming it with different seed text such as “Dha”, “Na” etc. Given below is a minute long composition sampled from my network. We can see that not only the network has learned the TaalMala notation but it has also understood some common phrases used in compositions such as the occurrence of the phrase “TiRa KiTa“, repetitions of “Tun Na” etc.:
Ti [0.50] | Ra | Ki | Te | Dha [0.50] | Ti [0.25] | Ra | Ki
| Ta | Tun [0.50] | Na | Dhin | Na
| Tun | Na | Tun | Na | Dha | Dhet | Dha | Dhet | Dha | Dha
| Tun | Na | Dha | Tun | Na | Ti | Na | Dha | Ti | Te | Ki |
Ti | Dha [0.50] | Ti [0.25] | Ra | Ki | Te | Dhin [0.50] |
Dhin | Dhin | Dha | Ge | Ne | Dha | Dha | Tun | Na | Ti
[0.25] | Ra | Ki | Ta | Dha [0.50] | Ti [0.25] | Ra | Ki |
Te | Dha [1.00] | Ti | Dha | Ti [0.25] | Ra | Ki | Te | Dha
[0.50] | Dhet | Dhin | Dha | Tun | Na | Ti [0.25] | Ra | Ki
| Ta | Dha [0.50] | Ti [0.25] | Ra | Ki | Te | Ti | Ka | Tra
[0.50] | Ti | Ti | Te | Na [0.50] | Ki [0.50] | Dhin [0.13]
| Ta | Ti [0.25] | Ra | Ki | Te | Tra | Ka | Ti [0.25] | Ra
| Ki | Te | Dhin [0.50] | Na [0.25] | Ti [0.25] | Ra | Ki |
Te | Tra | Ka | Dha [0.34] | Ti [0.25] | Ra | Ki | Ta | Tra
| Ka | Tra [0.50] | Ki [0.50] | Tun [0.50] | Dha [0.50] | Ti
[0.25] | Ra | Ki | Ta | Tra | Ka | Ta | Te | Ti | Ta | Kat |
Ti | Dha | Ge | Na | Dha | Ti [0.25] | Ra | Ki | Te | Dha
[0.50] | Dhin | Dhin | Dhin | Dha | Tun | Na | Ti | Na | Ki
| Ta | Dha [0.50] | Dha | Ti [0.50] | Ra | Ki | Te | Tun
[0.50] | Tra [0.25] | Ti [0.25] | Ra | Ki | Te | Tun | Ka |
Ti [0.25] | Ra | Ki | Te | Dha [0.50] | Ki [0.25] | Ti | Dha
| Ti | Ta | Dha | Ti | Dha [0.50] | Ti | Na | Dha | Ti
[0.25] | Ra | Ki | Te | Dhin [0.50] | Na | Ti [0.25] | Ra |
Ki | Te | Tra | Ka | Dha [0.50] | Ti [0.50] | Ra | Ki | Te |
Tun [0.50] | Na | Ki [0.25] | Te | Dha | Ki | Dha [0.50] |
Ti [0.25] | Ra | Ki | Te | Dha [0.50] | Ti [0.25] | Ra | Ki
| Te | Dha [0.50] | Tun | Ti [0.25] | Ra | Ki | Te | Dhin
[0.50] | Na | Ti [0.25] | Te | Dha | Ki [0.25] | Te | Ki |
Te | Dhin [0.50] | Dhin | Dhin | Dhin | Dha | Dha | Tun | Na
| Na | Na | Ti [0.25] | Ra | Ki | Ta | Ta | Ka | Dhe [0.50]
| Ti [0.25] | Ra | Ki | Te | Ti | Re | Ki | Te | Dha [0.50]
| Ti | Dha | Ge | Na | Dha | Ti [0.25] | Ra | Ki | Te | Ti |
Te | Ti | Te | Ti | Te | Dha [0.50] | Ti [0.25] | Te | Ra |
Ki | Te | Dha [0.50] | Ki | Te | Dha | Ti [0.25]
Here’s a loop that I synthesized by pasting a composition sampled 4 times one after the another:
Of course, I also tried training n-gram models and the smoothing methods using the SRILM toolkit. Adding spaces between letters is a quick hack that can be used to train character level models using existing toolkits. Which one produces better compositions? I can’t tell for now but I am trying to collect more data and hope to add updates to this post as and when I find time to work on it. I am not confident if simple perplexity scores may be enough to judge the differences between two models, specially on the rhythmic quality of the compositions. There are many ways in which one can extend this work. One there is a possibility of training on different kinds of compositions: kaidas, relas, laggis etc., different rhythm cycles and also from different gharanas. All of this would required collecting a bigger composition database:
If you have access to any good tabla compositions database(s) please do let me know. Thanks! — Gaurav Trivedi (@trivedigaurav) May 26, 2015
And then there is a scope for allowing humans to interactively edit compositions at places where AI goes wrong. You could also use the samples generated by it as an infinite source of inspiration.
Finally, here’s a link to the work in progress playlist of the rhythms I have sampled till now.
Update: Here’s our full paper announcement with source-code release…
I am working on a project to support the use of Natural Language Processing in the clinical domain. Modern NLP systems often make use of machine learning techniques. However, physicians and other clinicians, who are interested in analyzing clinical records, may be unfamiliar with these methods. Our project aims to enable such domain experts make use of Natural Language Processing using a point-and-click interface . It combines novel text-visualizations to help its users make sense of NLP results, revise models and understand changes between revisions. It allows them to make any necessary corrections to computed results, thus forming a feedback loop and helping improve the accuracy of the models.
Here’s the walk-through video of the prototype tool that we have built:
At this point we are redesigning some portions of our tool based on feedback from a formative user study with physicians and clinical researchers. Our next step would be to conduct an empirical evaluation of the tool to test our hypotheses about its design goals.
Gaurav Trivedi. 2015. Clinical Text Analysis Using Interactive Natural Language Processing. In Proceedings of the 20th International Conference on Intelligent User Interfaces Companion (IUI Companion ’15). ACM, New York, NY, USA, 113-116. DOI 10.1145/2732158.2732162 [Presentation] [PDF]
Gaurav Trivedi, Phuong Pham, Wendy Chapman, Rebecca Hwa, Janyce Wiebe, Harry Hochheiser. 2015. An Interactive Tool for Natural Language Processing on Clinical Text. Presented at 4th Workshop on Visual Text Analytics (IUI TextVis 2015), Atlanta. http://vialab.science.uoit.ca/textvis2015/ [PDF]
Gaurav Trivedi, Phuong Pham, Wendy Chapman, Rebecca Hwa, Janyce Wiebe, and Harry Hochheiser. 2015. Bridging the Natural Language Processing Gap: An Interactive Clinical Text Review Tool. Poster presented at the 2015 AMIA Summit on Clinical Research Informatics (CRI 2015). San Francisco. March 2015. [Poster][Abstract]
I recently prepared a deck of slides for my machine learning course. In the presentation, I talk about some of the recently proposed methods on learning from multiple annotators. In these methods we do not assume the labels that we get from the annotators to be the ground truth, as we do in traditional machine learning, but try to find “truth” from noisy data.
There are two main directions of work in this area. One focuses on finding the consensus labels first and then do traditional learning, while the other approach is to learn a consensus model directly. In the second approach, we may estimate the consensus labels during the process of building a classifier itself.
Here are the slides for the presentation. I would be happy to receive your comments and suggestions.
The good news is that the organizers have made available the entire set of video lectures in 4K for you to watch. But since Dr. Le did most of them on a blackboard and did not provide any accompanying slides, I decided to put the brief content descriptions of the lectures along with the videos here. Hope this will help you navigate the videos better.
Lecture 1: Neural Networks Review
Dr. Le begins his lecture starting from the fundamentals on Neural Networks if you’d like to brush up your knowledge about them. Otherwise feel free to quickly skim through the initial sections, but I promise there are interesting things later on. You may use the links below to skip to the relevant parts. The links are using an experimental script, let me know in the comments if they don’t work.
If you have already covered NN in the past then the first lecture may have been a bit dry for you but the real fun begins in this lecture when Dr. Le starts talking about his experiences of using deep learning in practice.
In this lecture, Dr. Le finishes his description on NN architectures. He also talks a bit about how they are being used at Google – for applications in image and speech recognition, and language modelling.
I have a (bad) habit of checking my Twitter feed while at work. Yesterday after my machine learning class, I found my timeline to be filled with Tweets mocking Rahul Gandhi about his first-ever television interview. Naturally, I was curious to know why and I tried to give it a listen. Most of his answers made no sense to me whatsoever! But then guess what? Who else is bad at responding to questions in natural language? The machines are! Maybe it was time to put them to a test and see if the machines could understand Mr. Gandhi. Making use of the transcript made available by the Times of India and some free NLP tools(ets), I spent a couple of hours (unproductive, ofcourse :P) trying to make sense of the interview.
Here’s a wordle summary of his answers, that would at least give you an overview about what was being spoken about during the interview:
Here are some of the most used (best) words from the transcript. The number times they were used are mentioned in parenthesis.
Next, I set out to generate a summary of his answers. And lo! to my surprise, it made perfect sense (contrary to what you usually get from a summarizer). This is the summary generated from the online tool at http://freesummarizer.com/:
What I feel is that this country needs to look at the fundamental issues at hand, the fundamental political issue at hand is that our Political system is controlled by too few people and we absolutely have to change the way our political system is structured, we have to change our Political parties, we have to make them more transparent, we have to change the processes that we use to elect candidates, we have to empower women in the political parties, that is where the meat of the issue but I don’t hear that discussion, I don’t hear the discussion about how are we actually choosing that candidate, that is never the discussion.
That ascribes huge power to the Congress party, I think the Congress party’s strength comes when we open up when we bring in new people, that is historically been the case and that is what I want to do.
The Gujarat riots took place frankly because of the way our system is structured, because of the fact that people do not have a voice in the system. And what I want to do. He was CM when Gujarat happened The congress party and the BJP have two completely different philosophies, our attack on the BJP is based on the idea that this country needs to move forward democratically, it needs push democracy deeper into the country, it needs to push democracy into the villagers, it needs to give women democratic powers, it needs to give youngsters democratic powers.
You are talking about India, we have had a 1 hour conversation here, you haven’t asked me 1 question about how we are going to build this country, how we are going to take this country forward, you haven’t asked me one question on how we are going to empower our people, you haven’t asked me one question on what we are going to do for youngsters, you are not interested in that.
There is the Congress Party that believes in openness, that believes in RTI, that believes in Panchayati Raj, that believes in giving people power. The Congress party is an extremely powerful system and all the Congress party needs to do is bring in younger fresher faces in the election which is what we are going to do and we are going to win the election.
In retrospect, repeating a few points several times is a good enough cue for an auto-summarizer to identify important sentences. This interview was perfect for a task like this as Mr. Gandhi repeated the same set of (rote) answers for almost every question that he was asked. Perhaps this is what he was hoping for? To make sure that when lazy journalists use automatic tools to do their jobs, it would give them a perfect output!
Now coming to the interesting bit. If you were a human listener like me and wanted to read the answers that he really did attempt to answer  , what would you do? Fear not! I have built an SVM classifier from this transcript that you could make use of in future. I used LightSide, an open source platform created by CMU LTI researchers to understand features from the transcript of his answers. Let’s get into the details then.
When you go for a interview, you could either choose to answer a question or try to avoid by cleverly diverting from the main question asked. In Rahul’s case, we have answers that can be mainly grouped into three categories – a) the questions that he answered, b) he managed to successfully avoid and c) the LOL category (the answer bears no resemblance to the question asked). I combined categories (b) and (c) to come up with classes: ANSWERED or UNANSWERED. You may check out my list of classes here and read the interview answers from the Times of India article here. They follow the same order as in the transcript with the exception of single line questions-answers that would’ve otherwise served as noise for machine learning. I selected a total of 114 questions in all out which 45 were answered and the remaining 69 were either successfully avoided or belonged to the LOL category  .
For feature extraction, I used quite simple language features like Bigrams, Trigrams, Line length after excluding stop words etc. You can download them in the LightSide feature format. I used the SVM plugin to learning the classification categories from the feature. Here is the final model that the tool built using the extracted features. And the results were surprising (or probably not :). With 10-fold cross validation, the resulting model had an accuracy of over 72%! An accuracy percentage like this is considered to be exceptional (in case you are not familiar with the field). The machines indeed understand Rahul Gandhi!
Unfortunately, I did not have enough data to run a couple of tests separately. We’ll have to probably wait for Mr. Gandhi to give his next interview for that. Hope that the Congress party members work as hard as the NLP researchers so that we can have a good competition by then!
He did make an effort to answer about 40% of the questions to his credit ^
A biomarker is a measurable indicator of a biological condition. Usually it is seen as a substance or a molecule introduced in the body but even physiological indicators may function as dynamic biomarkers for certain diseases. Dr. Sejdić and his team at the IMED Lab work on finding innovative ways to measure such biomarkers. During the ISP seminar last week, he presented his work on using low-cost devices with simple electronics such as accelerometers and microphones, to capture the unique patterns of physiological variables. It turns out that by analyzing these patterns, one can differentiate between healthy and pathological conditions. Building these devices requires an interdisciplinary investigation and insights from signal processing, biomedical engineering and also machine learning.
Listening to the talk, I felt that Dr. Sejdić is a researcher who is truly an engineer at heart as he described his work on building an Asperometer. It is a device that is placed on the throat of a patient to find out when they have swallowing difficulties (Dysphagia). The device picks up the vibrations from the throat and does a bit of signal processing magic to identify problematic scenarios. Do you remember the little flap called the Epiglotis that guards the entrance to your wind pipe, from your high school Biology? Well, that thing is responsible for directing the food into the oesophagus (food pipe) while eating and preventing it from going into wrong places (like the lungs!). As it moves to cover the wind pipe, it records a characteristic motion pattern on the accelerometer. The Asperometer can then distinguish between regular and irregular patterns to find out when should we be concerned. The current gold standard to do these assessments involve using some ‘Barium food’ and X-Rays to visualize its movement. As you may have realized, the Asperometer is not only unobstrusive but also appears to be a safer method to do so. There are a couple of issues left to iron out though, such as removing sources of noise in the signal due to speech or even breathing through the mouth. We can, however, still use it in controlled usage scenarios in the presence of a specialist.
The remaining part of the talk briefly dealt with Dr. Sejdić’s investigations of gait, handwriting processes and preference detection, again with the help of signal processing and some simple electronics on the body. He is building on work in biomedical engineering to study age and disease related changes in our bodies. The goal is to explore simple instruments providing useful information that can ultimately help to prevent, retard or reverse such diseases.