Using Machine Learning to Help Manage Diabetes

I participated in the PennApps hackathon in Philadelphia this weekend. While most of the city was struck by a bad snowstorm, a group of hackers holed up inside the Penn engineering buildings to work on some cool hacks. My team, which included three other hackers (Daniel, Alex, and Madhur), decided to work on an app that could predict the blood glucose levels of diabetes patients by building machine learning models.

Our proof-of-concept. We have our own logo!

We used the OneTouch Reveal API to gather data provided by Johnson & Johnson, the manufacturer of OneTouch glucose monitors for diabetes patients. They also give their patients an app for tagging events such as exercise (light, moderate, heavy, etc.), meals, and insulin use (different kinds: fast acting, before/after meals, etc.). Our team thought it would be a good idea to hack on this dataset to find out whether we could predict patients’ glucose levels without them having to prick their fingers. A real-world use case for this app would be to alert a patient when we predict unusual glucose levels, or to have them do an actual blood test when the confidence in our predictions falls low.

We observed mixed results for the patients in our dataset. We did reasonably well for those with more data, but others had too few data points to make good predictions. We also saw that our predictions became more precise as we considered more data. Another issue was that the OneTouch API did not give sufficient information about food and exercise events for any of the patients: most readings came without additional event tags. As a result, our models were not influenced much by these events.

The pink trend line shows our blood glucose predictions based on prior data. The shaded region indicates the prediction range: the broader this region, the lower our confidence in the prediction.
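Here is a minimal sketch of the alerting idea, and not our hackathon code. It assumes a plain moving average as a stand-in for the model and uses the spread of recent readings as the prediction range. The function names, the window size, and the thresholds (in mg/dL) are all hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def predict_with_range(readings, window=5, k=2.0):
    """Predict the next reading from the last `window` values.

    Returns (prediction, low, high): the moving average plus or
    minus k sample standard deviations of the window.
    """
    recent = readings[-window:]
    if len(recent) < 2:
        raise ValueError("need at least two readings")
    center = mean(recent)
    spread = k * stdev(recent)
    return center, center - spread, center + spread


def advise(readings, normal=(70.0, 180.0), max_band=60.0):
    """Decide what to tell the patient (thresholds are illustrative only)."""
    prediction, low, high = predict_with_range(readings)
    if high - low > max_band:
        return "test"   # range too wide: ask for an actual blood test
    if not normal[0] <= prediction <= normal[1]:
        return "alert"  # confident prediction outside the assumed normal range
    return "ok"


# Made-up readings, not data from the OneTouch API.
steady = [110, 112, 108, 111, 109]
erratic = [90, 160, 100, 190, 120]
high_trend = [200, 205, 198, 202, 204]
print(advise(steady), advise(erratic), advise(high_trend))  # ok test alert
```

The wide band for the erratic readings is what triggers the request for a real test, which mirrors how the shaded region in the plot is meant to be read.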

We believe that in the near future it will be common for patients to have such monitors communicate with other wearable sensors such as smart watches. Such systems would be able to provide ample information about one’s physical activity and other events, making more meaningful predictions possible. Here’s a video demonstrating our proof-of-concept:


Interactive Natural Language Processing for Legal Text

Update: We received the best student paper award for our paper at JURIX’15!

In an earlier post, I talked about my work on Natural Language Processing in the clinical domain. The main idea behind the project is to enable domain experts to build machine learning models for analyzing text. We do this by designing usable NLP tools, so that domain experts neither need to send their datasets to machine learning experts nor need to understand the inner workings of the algorithms. The post also features a demo video of the prototype tool that we have built.

I was presenting this work at one of my program’s bi-weekly meetings when Jaromir, a fellow ISP graduate student, pointed out that such an approach could be useful for his work as well. Jaromir also holds a degree in Law and works on building AI systems for legal applications. As a result, we ended up collaborating on a project that applies the approach to statutory analysis. While the main topic of the project is the framework in which a human expert cooperates with a machine learning text classification algorithm, we also ended up augmenting our approach with a new way of capturing and re-using knowledge. In our tool, datasets and models are treated separately and are not tied together. So, if you have built a classification model for, say, statutes from the state of Alaska, you need not start from scratch when you need to analyze laws from Kansas. This puts us in a better starting place on all the performance measures and lets us build a model using fewer training examples.
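To make the re-use idea concrete, here is a minimal sketch, and not the code of our tool. It assumes a tiny bag-of-words perceptron as the classifier. The `TextModel` class, the `accuracy` helper, and the example sentences are all made up for illustration; they are not real statutes. The point is only that the model object carries its learned weights from one dataset to the next.

```python
class TextModel:
    """A tiny bag-of-words perceptron that is not tied to any dataset."""

    def __init__(self):
        self.weights = {}

    def predict(self, text):
        score = sum(self.weights.get(w, 0.0) for w in text.lower().split())
        return score > 0

    def update(self, examples, epochs=5):
        """Continue training on (text, label) pairs; earlier weights are kept."""
        for _ in range(epochs):
            for text, label in examples:
                if self.predict(text) != label:
                    step = 1.0 if label else -1.0
                    for w in text.lower().split():
                        self.weights[w] = self.weights.get(w, 0.0) + step
        return self


def accuracy(model, examples):
    return sum(model.predict(t) == y for t, y in examples) / len(examples)


# Toy sentences; True means "relevant to public health" (a hypothetical label).
alaska = [
    ("the department shall report outbreaks of infectious disease", True),
    ("hospitals must notify the health officer of an emergency", True),
    ("a fishing license expires at the end of the calendar year", False),
    ("the board shall set fees for vehicle registration", False),
]
kansas = [
    ("the county health officer shall report disease cases", True),
    ("registration fees for trailers are set annually", False),
    ("the secretary shall notify hospitals of an infectious disease emergency", True),
    ("a hunting license expires at the end of the season", False),
]

cold = TextModel()                  # cold start: nothing learned yet
warm = TextModel().update(alaska)   # knowledge carried over from Alaska
print(accuracy(cold, kansas), accuracy(warm, kansas))  # 0.5 1.0

warm.update(kansas[:2])             # a few Kansas labels refine the same model
```

On this toy data the re-used model already labels the Kansas sentences correctly before it has seen a single Kansas label, which is the "better starting place" described above.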

The results of the cold start (Kansas) and the knowledge re-use (Alaska) experiments. In the figure, KS stands for Kansas, AK for Alaska, 1p and 2p for the first (ML model-oriented) and second (interaction-oriented) evaluation perspectives, P for precision, R for recall, F1 for the F1 measure, and ROC with a number for the ROC curve of the ML classifier trained on the specified number of documents.

We will be presenting this work at JURIX’15, the 28th edition of the conference on legal knowledge and information systems. Previously, we presented portions of this work at the AMIA Summit on Clinical Research Informatics and at the ACM IUI Workshop on Visual Text Analytics.

References

Jaromír Šavelka, Gaurav Trivedi, and Kevin Ashley. 2015. Applying an Interactive Machine Learning Approach to Statutory Analysis. In Proceedings of the 28th International Conference on Legal Knowledge and Information Systems (JURIX ’15). Braga, Portugal. [PDF] – Awarded the Best Student Paper (Top 0.01%).

Clinical Text Analysis Using Interactive Natural Language Processing

Update: Here’s our full paper announcement with source-code release…

I am working on a project to support the use of Natural Language Processing in the clinical domain. Modern NLP systems often make use of machine learning techniques. However, physicians and other clinicians, who are interested in analyzing clinical records, may be unfamiliar with these methods. Our project aims to enable such domain experts to make use of Natural Language Processing through a point-and-click interface. It combines novel text visualizations to help its users make sense of NLP results, revise models, and understand changes between revisions. It allows them to make any necessary corrections to computed results, thus forming a feedback loop that helps improve the accuracy of the models.
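The feedback loop can be sketched in a few lines. This is a simplified illustration under my own assumptions, not the implementation of our tool: the model is a plain word-count vote, and `train`, `predict`, and `review_round` are hypothetical names. The notes are made-up strings, not real clinical records, and the label means "the note reports pain".

```python
from collections import Counter


def train(labeled):
    """Count how often each word appears in positive vs. negative notes."""
    pos, neg = Counter(), Counter()
    for text, label in labeled:
        (pos if label else neg).update(text.lower().split())
    return pos, neg


def predict(model, text):
    pos, neg = model
    words = text.lower().split()
    return sum(pos[w] for w in words) > sum(neg[w] for w in words)


def review_round(labeled, notes, corrections):
    """One pass of the loop: predict, apply the reviewer's fixes, retrain."""
    model = train(labeled)
    for note in notes:
        guess = predict(model, note)
        final = corrections.get(note, guess)  # the reviewer overrides wrong guesses
        labeled.append((note, final))
    return train(labeled)


seed = [
    ("patient reports severe chest pain", True),
    ("patient denies chest pain", False),
]
before = train(seed)

# The model wrongly flags "reports no pain"; the reviewer corrects it.
after = review_round(list(seed), ["reports no pain"], {"reports no pain": False})

print(predict(before, "reports no discomfort"))  # True
print(predict(after, "reports no discomfort"))   # False
```

A single correction changes how a different, unseen note is classified, which is the kind of effect the revision views in the interface are meant to make visible.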

Here’s the walk-through video of the prototype tool that we have built:

At this point, we are redesigning some portions of our tool based on feedback from a formative user study with physicians and clinical researchers. Our next step will be to conduct an empirical evaluation of the tool to test our hypotheses about its design goals.

We will be presenting a demo of our tool at the AMIA Summit on Clinical Research Informatics and also at the ACM IUI Workshop on Visual Text Analytics in March.

References

  1. Gaurav Trivedi. 2015. Clinical Text Analysis Using Interactive Natural Language Processing. In Proceedings of the 20th International Conference on Intelligent User Interfaces Companion (IUI Companion ’15). ACM, New York, NY, USA, 113-116. DOI 10.1145/2732158.2732162 [Presentation] [PDF]
  2. Gaurav Trivedi, Phuong Pham, Wendy Chapman, Rebecca Hwa, Janyce Wiebe, and Harry Hochheiser. 2015. An Interactive Tool for Natural Language Processing on Clinical Text. Presented at 4th Workshop on Visual Text Analytics (IUI TextVis 2015), Atlanta. http://vialab.science.uoit.ca/textvis2015/ [PDF]
  3. Gaurav Trivedi, Phuong Pham, Wendy Chapman, Rebecca Hwa, Janyce Wiebe, and Harry Hochheiser. 2015. Bridging the Natural Language Processing Gap: An Interactive Clinical Text Review Tool. Poster presented at the 2015 AMIA Summit on Clinical Research Informatics (CRI 2015). San Francisco. March 2015. [Poster][Abstract]

Ugly Pic Tweet

Lately I have observed the twitterati following a trend of tweeting “text” as images. My timeline was completely filled with such tweets today.

This is even encouraged by Twitter, as it expands all picture tweets by default.

So to further spread this epidemic (to convince Twitter to do something about it), I re-purposed one of my Interactive System Design class assignments [1] into an Ugly-Pic-Tweeter.

Go ahead, start posting your own ugly pic tweets. May you fill your followers’ timelines with them!

 

Footnotes

  1. Thanks Julio for teaming up for the original assignment 🙂

Kivy wrap for the summer

As I conclude my summer work on Kivy and Plyer, here’s a post to summarize all the contributions I have made. It should also be a useful starting point when I wish to revisit any of this in the future.

To draw a comparison to the current state of Plyer development, this table shows a list of supported facades before the summer started:

Platform Android < 4.0 Android > 4.0 iOS Windows OSX Linux
Accelerometer X X X
Camera (taking picture) X X
GPS X X
Notifications X X X X X
Text to speech X X X X X
Email (open mail client) X

If you have been following the updates, you would have come across my weekly progress posts over the last couple of months. Here’s a list of all such posts since mid-summer for easy access (also check out my mid-summer summary post):

  1. I can haz commit access and other updates
  2. Maintenance work in progress
  3. Plyer on iOS
  4. More, more facades

And in comparison to the table above, this is what Plyer support looks like as of today after all these changes:

Platform Android < 4.0 Android > 4.0 iOS Windows OSX Linux
Accelerometer X X X X X
Camera (taking picture) X X
GPS X X
Notifications X X X X X
Text to speech X X X X X X
Email (open mail client) X X X X X
Vibrator X
Sms (send messages) X X
Compass X X X
Unique ID (IMEI or SN) X X X X X X
Gyroscope X X X
Battery X X X X X X

Of course, there’s more than meets the eye. A lot of background work went into writing these facades. This included understanding the individual platform APIs and working with the other Kivy projects that support this work, Pyjnius and Pyobjus. Some of these changes called for a re-write of old facades in order to follow a consistent approach. Since Plyer is at an early stage of development, I also contributed some maintenance code and wrote build scripts.
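For readers new to the idea, a facade here is a single Python interface with one backend per platform behind it. The sketch below is a simplified illustration of that pattern and is not copied from Plyer’s source. The class names, the `load_facade` helper, and the hard-coded battery values are assumptions made for the example; a real backend would call into the platform API (through Pyjnius on Android or Pyobjus on iOS) instead.

```python
class Battery:
    """Facade: the one interface that application code is written against."""

    @property
    def status(self):
        return self._get_status()

    def _get_status(self):
        # Platforms without a backend fall through to this default.
        raise NotImplementedError("battery is not supported on this platform")


class LinuxBattery(Battery):
    def _get_status(self):
        # Placeholder values; a real backend would query the OS here.
        return {"isCharging": True, "percentage": 80.0}


class AndroidBattery(Battery):
    def _get_status(self):
        # Placeholder values; a real backend would go through the Java API.
        return {"isCharging": False, "percentage": 42.0}


# Hypothetical registry mapping a platform name to its backend class.
BACKENDS = {"linux": LinuxBattery, "android": AndroidBattery}


def load_facade(platform, backends, facade):
    """Return the platform backend, or the bare facade if none exists."""
    implementation = backends.get(platform)
    return implementation() if implementation else facade()


for name in ("linux", "win"):
    battery = load_facade(name, BACKENDS, Battery)
    try:
        print(name, battery.status)
    except NotImplementedError as error:
        print(name, "unsupported:", error)
```

Keeping every backend behind the same property is what makes a consistent approach across facades possible: application code reads `battery.status` and never needs to know which platform it is running on.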

In the beginning of August, I took a break from facade development for two weeks and made recommendations on making Kivy apps more accessible. I looked into existing projects that could be useful for us and pointed at a possible candidate that we could adapt for our purposes. Here are the two posts summarizing my investigations:

  1. Towards Making Kivy Apps Accessible
  2. Towards Making Kivy Apps Accessible – 2

At this point, I would also like to include a thank-you note to everyone on #kivy and #plyer on freenode for helping me out whenever I got stuck. This was the first time I actively participated in IRC discussions over an extended period. I also tried to return the favor by offering help, when I could, to other new users. Apart from getting a chance to work with the Kivy community from all around the world (with so many timezones!), there were a couple of other firsts that I experienced while working on the project. Those served as good learning experiences and as motivation for making contributions to open source.

Overall, it was quite a fun experience contributing to Kivy over the summer, and I hope to continue doing so every now and then. Now that Kivy is gaining more popularity every day, I hope to see many more users diving into writing code for it and becoming part of this community. I hope these posts can also point them to relevant development opportunities.

More, more facades

This week I added many more facade implementations to Plyer. It was only a few days ago that I started working on iOS, and I am happy that the list has grown quite a bit this week.

I also added Plyer to the kivy-ios tool-chain, i.e. it is now a part of the build-all script and will be available for use in apps packaged with Kivy for iOS.

Apart from that, I also made a couple of maintenance fixes to close the holes that I noticed in the checked-in code and to fix style problems in other contributions.

Although this update was a short one, it did involve a considerable amount of coding effort.

As the summer is coming to a close, I will be spending the next week wrapping up my work, polishing the rough edges in my contributions so far, and of course writing the “obvious” bits and pieces that I may have left out of the documentation until now.

Plyer on iOS

I finally had access to an iOS development device this week. Unlike the other platforms, I didn’t have any prior experience developing for it. So I spent some time familiarizing myself with the tools and stuff. This process was mostly painless but did end up consuming some time.

I also set up my kivy-ios tool chain for the first time. After a couple of hello-world programs and fixing minor typos in the example code, I moved on to explore pyobjus further for writing Plyer facades. I worked on a new version of the accelerometer example that does not depend on the bridge.m supplied with all kivy-ios packages by default. Once I am done moving all the sensors, we could do away with the classes contained in the bridge (note to self!).

While playing with this code, I also noticed something interesting on the Xcode dashboard:

Turns out that we had hit a major bug in Pyobjus that was causing the memory allocated for accelerometerData to leak. In fact, this would happen everywhere you’d otherwise need to use something like @autoreleasepool in your Objective-C code. Pyobjus objects didn’t account for cases like these. Tito suggested a fix for the issue but is still working on finalizing it.

Meanwhile, I also created another iOS facade for retrieving battery status. Although it was a fairly short one to code, I did have to learn a lot of background material before I could finish it. I hope that this will make it easier for me to finish the other facades in the coming weeks.