Machine Learning – Page 2

Clinical Text Analysis Using Interactive Natural Language Processing

Update: Here’s our full paper announcement with source-code release…

I am working on a project to support the use of Natural Language Processing in the clinical domain. Modern NLP systems often make use of machine learning techniques. However, physicians and other clinicians, who are interested in analyzing clinical records, may be unfamiliar with these methods. Our project aims to enable such domain experts make use of Natural Language Processing using a point-and-click interface . It combines novel text-visualizations to help its users make sense of NLP results, revise models and understand changes between revisions. It allows them to make any necessary corrections to computed results, thus forming a feedback loop and helping improve the accuracy of the models.

Here’s the walk-through video of the prototype tool that we have built:

At this point we are redesigning some portions of our tool based on feedback from a formative user study with physicians and clinical researchers. Our next step would be to conduct an empirical evaluation of the tool to test our hypotheses about its design goals.

We will be presenting a demo of our tool at the AMIA Summit on Clinical Research Informatics and also at the ACM IUI Workshop on Visual Text Analytics in March.

References

Gaurav Trivedi. 2015. Clinical Text Analysis Using Interactive Natural Language Processing. In Proceedings of the 20th International Conference on Intelligent User Interfaces Companion (IUI Companion ’15). ACM, New York, NY, USA, 113-116. DOI 10.1145/2732158.2732162 [Presentation] [PDF]
Gaurav Trivedi, Phuong Pham, Wendy Chapman, Rebecca Hwa, Janyce Wiebe, Harry Hochheiser. 2015. An Interactive Tool for Natural Language Processing on Clinical Text. Presented at 4th Workshop on Visual Text Analytics (IUI TextVis 2015), Atlanta. http://vialab.science.uoit.ca/textvis2015/ [PDF]
Gaurav Trivedi, Phuong Pham, Wendy Chapman, Rebecca Hwa, Janyce Wiebe, and Harry Hochheiser. 2015. Bridging the Natural Language Processing Gap: An Interactive Clinical Text Review Tool. Poster presented at the 2015 AMIA Summit on Clinical Research Informatics (CRI 2015). San Francisco. March 2015. [Poster][Abstract]

Learning from multiple annotators

I recently prepared a deck of slides for my machine learning course. In the presentation, I talk about some of the recently proposed methods on learning from multiple annotators. In these methods we do not assume the labels that we get from the annotators to be the ground truth, as we do in traditional machine learning, but try to find “truth” from noisy data.

There are two main directions of work in this area. One focuses on finding the consensus labels first and then do traditional learning, while the other approach is to learn a consensus model directly. In the second approach, we may estimate the consensus labels during the process of building a classifier itself.

Here are the slides for the presentation. I would be happy to receive your comments and suggestions.

Quoc Le’s Lectures on Deep Learning

Update: Dr. Le has posted tutorials on this topic: Part 1 and Part 2.

Dr. Quoc Le from the Google Brain project team (yes, the one that made headlines for creating a cat recognizer) presented a series of lectures at the Machine Learning Summer School (MLSS ’14) in Pittsburgh this week. This is my favorite lecture series from the event till now and I was glad to be able to attend them.

The good news is that the organizers have made available the entire set of video lectures in 4K for you to watch. But since Dr. Le did most of them on a blackboard and did not provide any accompanying slides, I decided to put the brief content descriptions of the lectures along with the videos here. Hope this will help you navigate the videos better.

Lecture 1: Neural Networks Review

[JavaScript needed to view this video.]

Dr. Le begins his lecture starting from the fundamentals on Neural Networks if you’d like to brush up your knowledge about them. Otherwise feel free to quickly skim through the initial sections, but I promise there are interesting things later on. You may use the links below to skip to the relevant parts. The links are using an experimental script, let me know in the comments if they don’t work.

Introduction
Why Neural Networks: Motivation, Non-linear classification
Mathematical Expression for NN: Decision function, Minimizing Loss and Gradient Descent (Correction in derivative), Making decision
Backpropagation: Audience questions, Derivation for backpropagation, Backpropagation algorithm

Lecture 2: NNs in Practice

[JavaScript needed to view this video.]

If you have already covered NN in the past then the first lecture may have been a bit dry for you but the real fun begins in this lecture when Dr. Le starts talking about his experiences of using deep learning in practice.

Stochastic gradient descent
Clarifications from Lecture 1: Data partitioning is not needed, Derivative of the loss function, Tip – Write unit tests!
Ideas for practical implementations: Breaking Symmetry, Monitoring Progress on training, Underfitting and overfitting, How to select NN architecture and hyper-parameters, Other tips for improvements
Deep Neural Networks: Review of why NN, Shallow vs. Deep, Rectified Linear Units, Definitions for deep NN, History of NN
Deep NN Architectures: Autoencoder, Intuition for using autoencoders for initialization (Continued in the next lecture)

Lecture 3: Deep NN Architectures

[JavaScript needed to view this video.]

In this lecture, Dr. Le finishes his description on NN architectures. He also talks a bit about how they are being used at Google – for applications in image and speech recognition, and language modelling.

Pre-training with autoencoders
Convolutional NN (Convnets): Local receptive field, Why are Convnets useful?, Image classification, General Pipeline
Recurrent NN: Word Vectors
Applications: Google Brain and other ongoing work

Talk: The Signal Processing Approach to Biomarkers

A biomarker is a measurable indicator of a biological condition. Usually it is seen as a substance or a molecule introduced in the body but even physiological indicators may function as dynamic biomarkers for certain diseases. Dr. Sejdić and his team at the IMED Lab work on finding innovative ways to measure such biomarkers. During the ISP seminar last week, he presented his work on using low-cost devices with simple electronics such as accelerometers and microphones, to capture the unique patterns of physiological variables. It turns out that by analyzing these patterns, one can differentiate between healthy and pathological conditions. Building these devices requires an interdisciplinary investigation and insights from signal processing, biomedical engineering and also machine learning.

Listening to the talk, I felt that Dr. Sejdić is a researcher who is truly an engineer at heart as he described his work on building an Asperometer. It is a device that is placed on the throat of a patient to find out when they have swallowing difficulties (Dysphagia). The device picks up the vibrations from the throat and does a bit of signal processing magic to identify problematic scenarios. Do you remember the little flap called the Epiglotis that guards the entrance to your wind pipe, from your high school Biology? Well, that thing is responsible for directing the food into the oesophagus (food pipe) while eating and preventing it from going into wrong places (like the lungs!). As it moves to cover the wind pipe, it records a characteristic motion pattern on the accelerometer. The Asperometer can then distinguish between regular and irregular patterns to find out when should we be concerned. The current gold standard to do these assessments involve using some ‘Barium food’ and X-Rays to visualize its movement. As you may have realized, the Asperometer is not only unobstrusive but also appears to be a safer method to do so. There are a couple of issues left to iron out though, such as removing sources of noise in the signal due to speech or even breathing through the mouth. We can, however, still use it in controlled usage scenarios in the presence of a specialist.

The remaining part of the talk briefly dealt with Dr. Sejdić’s investigations of gait, handwriting processes and preference detection, again with the help of signal processing and some simple electronics on the body. He is building on work in biomedical engineering to study age and disease related changes in our bodies. The goal is to explore simple instruments providing useful information that can ultimately help to prevent, retard or reverse such diseases.

Talk: ISP Seminar

Turn-Taking Behavior in a Human Tutoring Corpus by Zahra Rahimi

In their research, Zahra and Homa, analyze turn-taking behavior between students in a human-human spoken tutoring system. This analysis could be helpful in understanding how users from different demographics interact with a tutor. In this study, they use sequences of speech and silence over time to mark ‘Active’ and ‘In-active’ states the dialogues between the tutor and the student. Considering both the tutor and student together we have four different combinations of these states, in which each one of them being active or inactive. The next step is to learn (using a semi-Markov process) a model from the dialogues. Using this model, they are able to measure the association of these models with features such as gender, scores obtained in the pre-test etc. The experiments provide some interesting results such as female students speak simultaneously longer with the tutor than male students; while their activities are less than their male counterparts. Also, for the students with a lower pre-test scores, the tutor tended to speak for longer time.

Content-Based Cross-Domain Recommendations Using Segmented Models by Shaghayegh Sahebi

Sherry presented her work on the job recommendation systems in her talk. This was done as part of her internship at LinkedIn last summer. The site originally used a single model to make job recommendations to the users by selecting features from their profiles. But, these profiles tend to vary a lot according to the job function the users play and the industry they are in. Professionals in academia, for example, may put a very different set of information on their resume as opposed to a banking executive. With this new study, they wish to segment users using these very features (current job function and industry etc.) before sending them to the recommender systems. This allows them to develop an efficient method of feature augmentation and adapt their algorithms.

The model was built and evaluated based on some pre-collected data. They evaluated the accuracy of the system in recommending the jobs that the users applied to. This, however, restricted them to a certain extent and an online A/B testing is still under process. We’ll have to wait and watch for the results to find out if they do better than the one-size-fits-all model that is currently in place.

Talk: Intelligent Tutoring Systems

Starting this week, I am adding a new feature on the blog. Every week I’ll be posting something about a talk or a colloquium that I attend. Serves as good talk notes, a writing practice and an assignment all in one full scoop? You bet it does!

The program that I am pursuing, Intelligent Systems Program provides a collaborative atmosphere for both students and faculty by giving them regular opportunities to present their research. It not only helps them gather feedback from others but also introduce their work to the new members of the program (like me!). As a part of these efforts, we have a series of talks called the ISP Colloquium Series.

For the first set of talks from the ISP Colloquium Series this semester, we had Mohammad Falakmasir and Roya Hosseini to present two of their award winning papers, both on Intelligent Tutoring Systems.

1. A Spectral Learning Approach to Knowledge Tracing by Mohammad Falakmasir

For developing intelligent tutoring systems that adapt to the student’s requirements, one would need a way to determine the student’s knowledge of skills being taught. This is commonly done by modeling it based on a couple of parameters. After learning from sequences of students’ responses to a quiz, one could predict the values of these parameters for future questions. This information could then be used to adapt the tutor to keep a pace that students are comfortable with. The paper proposes the use of a Spectral Learning ^[1] algorithm over other techniques such as Expectation Maximization (or EM) to estimate these parameters that model knowledge. EM is known to be a time consuming algorithm. The results of this paper show that similar or higher accuracy in prediction can be achieved while significantly improving the knowledge tracing time.

To design experiments with this new method, Mohammad and his co-authors analyzed data collected using a software-tutor. This tool was being used for an Introductory programming class at Pitt for over 9-semesters. They could then compare the performance of their new method over EM learning of parameters. They calculated both accuracy of prediction and root mean squared error as metrics for the comparison. Learning data was used from the first semester and tested against the second semester, and they could do this over and over again by learning data from the first-two semesters and predict the results from the third one and so on. This allowed them to back their results that show a time-improvement by a factor of 30(!), with a robust statistical analysis.

2. KnowledgeZoom for Java: A Concept-Based Exam Study Tool with a Zoomable Open Student Model by Roya Hosseini

Roya talks about open student modeling as opposed to a hidden one for modelling the students’ skills and knowledge. In her paper, she goes on to propose that a visual presentation of this model could be helpful during exam preparation. Using it one could quickly review the entire syllabus and identify the topics that need more work. I find it to be a very interesting concept and again something that I would personally like to use.

The authors designed a software tutor called Knowledge Zoom that could be used as an exam preparation tool for Java classes. It is based on a concept-level model of knowledge about Java and Object-oriented programming. Each question is associated with these concepts and specifies the pre-requisites that are needed to answer it. It also gives details on outcome concepts that could be mastered by working on a particular question. The students are provided with a zoom-able tree explorer that visually presents this information. Each node is represented using different sizes and colors that indicate the importance of the concept and the student’s knowledge in that area respectively. Another component of the tool provides students with a set of questions and adaptively recommends new questions. Based on the information from the ontology and indexing of the questions as discussed above, it can calculate how prepared a student is to attempt a particular question.

Evaluation of this method is done using a class-room study where students could use multiple tools (including KZ) to answer Java questions. They do a statistical analysis in comparison to the other tools that the features that KZ introduces. The results demonstrated that KZ helped students to reach their goals faster in moving from easy to harder questions. I was impressed by the fact that on top of these results, the authors decided to back it up with a subjective analysis by the students. Students preferred KZ over others by a great margin. They also received valuable feedback from them during this analysis.

While these tutors can currently support only concept-based subjects like programming and math where one could do by testing with objective-styled questions, the fact that we can intelligently adapt to a student’s pace of learning, is something that is really promising. I wish I could use some of these tools for learning my courses!

Footnotes

You can find out more about spectral learning algorithms here: http://www.cs.cmu.edu/~ggordon/spectral-learning/. ^

Futher Reading

M. H. Falakmasir, Z. A. Pardos, G. J. Gordon, P. Brusilovsky, A Spectral Learning Approach to Knowledge Tracing, In Proceedings of the 6th International Conference on Educational Data Mining. Memphis, TN, July 2013. Available: http://people.cs.pitt.edu/~falakmasir/images/EDMPaper2013.pdf
Brusilovsky, P., Baishya, D., Hosseini, R., Guerra, J., & Liang, M.,“KnowledgeZoom for Java: A Concept-Based Exam Study Tool with a Zoomable Open Student Model”, ICALT 2013, Beijing, China. Available: http://people.cs.pitt.edu/~hosseini/papers/kz.pdf

Category: Machine Learning

Clinical Text Analysis Using Interactive Natural Language Processing

References

Learning from multiple annotators

Quoc Le’s Lectures on Deep Learning

Lecture 1: Neural Networks Review

Contents

Lecture 2: NNs in Practice

Contents

Lecture 3: Deep NN Architectures

Contents

Talk: The Signal Processing Approach to Biomarkers

Talk: ISP Seminar

Further Reading

Talk: Intelligent Tutoring Systems

Footnotes

Futher Reading