Category Archives: Projects

Machines learn to play Tabla, Part – 2

This is a followup on my earlier post on Machines Learn to play Tabla. You may wish it check it out first reading this one…

Three years ago, I published a post on using recurrent neural networks to generate tabla rhythms. Sampling music from machine learned models was not in vogue then. My post received a lot of attention on the web and became very popular. The project had been a proof-of-concept and I have wanted build on it for a long time now.

This weekend, I worked on making it more interactive and I am excited to share these updates with you. Previously, I was using a proprietary software to convert tabla notation to sound. That made it hard to experiment with sampled rhythms and I could share only a handful sounds. Taking inspiration from our friends at Vishwamohini, I am now able to convert bols into rhythm on the fly using MIDI.js.

Let me show off the new javascript synthesizer using a popular Delhi kaida. Hit the ‘play’ button to listen:

Now that you’ve heard the computer play, here’s an example of it being played by a tabla maestro:

Of course, the synthesized outcome is not much of a comparison to the performance by the maestro, but it is not too bad either…

Now to the more exciting part- Since our browsers have learned to play the tabla, we can throw in the char-rnn model that I built in the earlier post.  To do this, I used the RecurrentJS library and combined it with my javascript tabla player:

Feel free to play around with tempo and maximum character-limit for sampling. When you click on ‘generate’,  it will play a new rhythm every time. Hope you’ll enjoy playing with it as much as I did!

The player has a few kinks at this point I am working towards fixing them. You too can contribute to my repository on GitHub.

There are two areas that need major work:

Data: The models that I trained for my earlier post was done using a small amount of training data. I have been on a lookout for better dataset since then. I wrote a few emails, but without much success till now. I am interested in knowing about more datasets I could train my models on.

Modeling: Our model did a very good job of understanding the structure of TaalMala notations. Although character level recurrent neural networks work well, it is still based on very shallow understanding of the rhythmic structures. I have not come across any good approaches for generating true rhythms yet:

I think more data samples covering a range of rhythmic structures would only partially address this problem. Simple rule based approaches seem to outperform machine learned models with very little effort. Vishwamohini.com has some very good rule-based variation generators that you could check out.  They sound better than the ones created by our AI. After all the word for compositions- bandish, literally derived from ‘rules’ in Hindi. But on the other hand, there are only so many handcrafted rules that you can come up with before they starts sounding repetitive.

Contact me if you have some ideas and if you’d like to help out! Hope that I am able to post an update on this sooner than three years this time 😀

Announcing NLPReViz…

NLPReViz - http://nlpreviz.github.io

We have released the source code for our NLPReViz project. Head to http://nlpreviz.github.io to checkout its project page.

Also, here’s our new JAMIA publication on it:

Gaurav Trivedi, Phuong Pham, Wendy W Chapman, Rebecca Hwa, Janyce Wiebe, Harry Hochheiser; NLPReViz: an interactive tool for natural language processing on clinical text. Journal of American Medical Informatics Association. 2017. DOI: 10.1093/jamia/ocx070.

Increasing Patient-Provider Interaction with “Pharma-C”

This weekend I took part in The Pitt Challenge Hackathon hosted by the School of Pharmacy and the Clinical and Translational Science Institute. I found this hackathon interesting because it had specific goals and challenged the participants to “Change the way the world looks at Health.” I went to the event with absolutely no prior ideas about what to build. I enjoy participating in hackathons for a chance to work with a completely new group of team members every time. I joined a team of two software professionals Zee and Greg right after registration. We were then joined by a business major – Shoueb during the official team formation stage of the event. The hackathon organizers provided us with ample opportunities to have discussions with researchers, professors and practitioners about the problems they’d like to solve with technology.

We started with a lot of interesting ideas and everyone in the team had a lot to contribute. We realized that almost all of our ideas revolved around the concept of increasing the interaction between the patient and providers outside of the health care setting. Currently, the patients have little interaction with the health care providers apart from the short face-to-face meetings and sporadic phone calls. Providers are interested in knowing more about their patients during their normal activities. Patients would also feel better cared for when the providers are more vested in them. We began with a grand scheme of creating a three-way communication channel with patient, physicians and pharmacists. After having more discussions with the mentors, we soon understood our big challenges – ‘busy schedules’ and ‘incumbent systems.’ We decided to focus on patient-pharmacy interactions. We brainstormed ideas about how we can build a system that ties well with the existing systems and isn’t too demanding in terms of time, either from the pharmacists or the patients. We decided to call ourselves – “Pharma-Cand after appropriate amount of giggling over the name, we sat down to think about the tech.

This slideshow requires JavaScript.

We wanted to design a system that could be less intrusive than phone calls, where both participants must be available at the same time, but also more visible than emails that could be left ignored in the promotions inbox. We began with an idea of using an email based system that could also appear as Google Now Cards as notifications on phones and smart devices. To our disappointment, we learned that Google Now only supports schemas for a limited number of activities (such as restaurant reservations, flights etc.). As a result, we moved on to a custom notification service. We agreed upon using the Pushover app which made it very easy to build a prototype for the hackathon.

We built a web-based system that could be connected to the existing loyalty programs from the pharmacies. The patients could opt for signing up for additional follow-up questions about their prescriptions. These questions could be generic ones such as: How many prescribed doses have you missed this week?, Is your prescribed medicine affordable?, Do you have questions about your current prescription?; or specific follow-up questions about the drugs they are taking. One could be interested in knowing how the patients are doing, whether the drug is having the desired effects or even reminding them about the common side-effects. Once signed up, a weekly script could send notifications to the participants and collect their responses from their preferred devices. Having such a system in place would help the pharmacists gather better information about the patients and offer interventions. They could look at the summary information screen when they make their follow-up calls according the existing systems in place. We believe the such a system could benefit both the pharmacies and the users without disrupting their regular workflows.

During the course of 24 hours, we finished building a working prototype and could demo everything in real-time to all our judges. One addition that improve the challenge would be to release some datasets for the participants to work with. We wanted to try some interesting data analysis methods for our problems but were limited to work on data collection hacks. Overall, I enjoyed taking part in the Pitt Challenge Hackathon and will look forward to their future events.

On Clippy and building software assistants

I have been attending a reading group on visualization tools for the last few weeks. This is a unique multi-institution group that meets over web-conferencing at 4 PM EST / 1 PM PST on Fridays. It includes a diverse bunch of participants including non-academic researchers.

Every week we vote on and discuss a range of topics related to building tools for visualizing data.

This week, it was my turn to lead the discussion on the Lumiere paper. This is the research responsible for the now retired Clippy Office assistant. I also noticed a strong ISP presence in the references section as the paper focuses on Bayesian user modeling.

During the discussion, we talked about how we can offer help to use vis tools better. Here are my slides from it:

 

References

  1. Eric Horvitz, Jack Breese, David Heckerman, David Hovel, and Koos Rommelse. 1998. The lumière project: Bayesian user modeling for inferring the goals and needs of software users. In Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence (UAI’98), Gregory F. Cooper and Serafín Moral (Eds.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 256-265.
  2. Justin Matejka, Wei Li, Tovi Grossman, and George Fitzmaurice. 2009. CommunityCommands: command recommendations for software applications. In Proceedings of the 22nd annual ACM symposium on User interface software and technology. ACM, New York, NY, USA, 193-202.

 

Using Machine learning to help Manage Diabetes

I participated in the PennApps hackathon in Philadelphia this weekend. While most of the city was struck with a bad snow storm, a group of hackers holed up inside the Penn engineering buildings to work on some cool hacks. My team consisting of three other hackers: Daniel, Alex and Madhur, decided to work on an app that could predict blood glucose levels of diabetes patients by building machine learning models.

Our proof-of-concept. We have our own logo!
We have our own logo!

We used the OneTouch Reveal API to gather some data provided by the Johnson & Johnson’s company. They are the manufacturers of OneTouch glucose monitors for diabetes patients. They also give their patients an app for tagging events like exercise (light, moderate, heavy etc.), when they eat food and use insulin (different kinds – fast acting, before/after meals etc.). Our team thought that it might be a good idea to hack on this dataset to find out whether we could predict patients’ glucose levels without them having them to punch a hole in their fingers. A real world use case for this app would be to alert a patient when we predicted unusual glucose levels or have them do an actual blood test when the confidence on our predictions falls low.

We observed mixed results for the patients in our dataset. We did reasonably well for those with more data, but others had very few data points to make good predictions. We also saw that our predictions became more precise as we considered more data. Another issue was that the OneTouch API did not give sufficient information about food and exercise events for any of the patients – mostly without additional event tagging. As a result, our models were not influenced much by them.

The black line indicates the actual glucose levels measured. The pink line is our predictions at different timestamps. The shaded region indicates our prediction range. Whenever this region is broader, our confidence in prediction goes down.
The black line indicates the actual glucose levels measured. The pink line is our predictions at different timestamps. The shaded region indicates our prediction range. Whenever this region is broader, our confidence in prediction goes down.

We believe that in the near future, it would be common for the patients to have such monitors communicate with other wearable sensors such as smart watches. Such systems would be able to provide ample information about one’s physical activity etc., to make more meaningful predictions possible. Here’s a video demonstrating our proof-of-concept:

.

Interactive Natural Language Processing for Legal Text

Update: We received the best student paper award for our paper at JURIX’15!

In an earlier post, I talked about my work on Natural Language Processing in the clinical domain. The main idea behind the project is to enable domain experts to build machine learning models for analyzing text. We do this by designing usable tools for NLP without really having the need to send datasets to machine learning experts or understanding the inner working details of the algorithms. The post also features a demo video of the prototype tool that we have built.

I was presenting this work at my program’s bi-weekly meetings where Jaromir, a fellow ISP graduate student, pointed out that such an approach could be useful for his work as well. Jaromir also holds a degree in Law and works on building AI systems for legal applications. As a result, we ended up collaborating on a project on using the approach for statutory analysis. While, the main topic of discussion in the project is on the framework in which a human experts cooperate with a machine learning text classification algorithm, we also ended up augmenting our approach with a new way of capturing and re-using knowledge. In our tool datasets and models are treated separately and our not tied together. So, if you were building a classification model for say statutes from the state of Alaska, when you need to analyze laws from Kansas you need not start from scratch. This allows us to be in a better starting place in terms of all the performance measures and build a model using fewer training examples.

The results of the cold start (Kansas) and the knowledge re-use (Alaska) experiment. In the Figure KS stands for Kansas, AK for Alaska, 1p and 2p for the first (ML model-oriented) and second (interaction-oriented) evaluation perspectives, P for precision, R for recall, F1 for F1 measure, and ROC with a number for an ROC curve of the ML classifier trained on the specified number of documents.
The results of the cold start (Kansas) and the knowledge re-use (Alaska) experiment. In the Figure KS stands for Kansas, AK for Alaska, P for precision, R for recall, F1 for F1 measure, and ROC with a number for an ROC curve of the ML classifier trained on the specified number of documents.

We will be presenting this work at JURIX’15 during the 28th year of the conference focusing on legal information systems. Previously, we had presented portions of this work at the AMIA Summit on Clinical Research Informatics and at the ACM IUI Workshop on Visual Text Analytics.

References

Jaromír Šavelka, Gaurav Trivedi, and Kevin Ashley. 2015. Applying an Interactive Machine Learning Approach to Statutory Analysis. In Proceedings of the 28th International Conference on Legal Knowledge and Information Systems (JURIX ’15). Braga, Portugal. [PDF] – Awarded the Best Student Paper (Top 0.01%).