I attended my first AMIA meeting last week. It was an amazing and exciting experience to meet close to 2,500 informaticians at once. In some ways it was also a bit overwhelming, both because of the scale of the event and because of being in the company of famous researchers whose papers I have read. If you weren’t able to attend the event in person, the good news is that a lot of informaticians are big into documenting stuff on Twitter. Check out my Twitter moment capturing the event as I saw it here, and the hashtag #AMIA2017 for even more…
I have been attending a reading group on visualization tools for the last few weeks. This is a unique multi-institution group that meets over web-conferencing at 4 PM EST / 1 PM PST on Fridays. It includes a diverse bunch of participants including non-academic researchers.
Every week we vote on and discuss a range of topics related to building tools for visualizing data.
This week, it was my turn to lead the discussion on the Lumière paper. This is the research behind the now-retired Clippy Office assistant. I also noticed a strong ISP presence in the references section, as the paper focuses on Bayesian user modeling.
During the discussion, we talked about how we can help people use vis tools better. Here are my slides from it:
- Eric Horvitz, Jack Breese, David Heckerman, David Hovel, and Koos Rommelse. 1998. The lumière project: Bayesian user modeling for inferring the goals and needs of software users. In Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence (UAI’98), Gregory F. Cooper and Serafín Moral (Eds.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 256-265.
- Justin Matejka, Wei Li, Tovi Grossman, and George Fitzmaurice. 2009. CommunityCommands: command recommendations for software applications. In Proceedings of the 22nd annual ACM symposium on User interface software and technology. ACM, New York, NY, USA, 193-202.
I recently presented on small data for my mobile health class. I have posted my slides here. I would be happy to receive your thoughts and comments:
Dr. Quoc Le from the Google Brain project team (yes, the one that made headlines for creating a cat recognizer) presented a series of lectures at the Machine Learning Summer School (MLSS ’14) in Pittsburgh this week. This has been my favorite lecture series from the event so far, and I was glad to be able to attend it.
The good news is that the organizers have made available the entire set of video lectures in 4K for you to watch. But since Dr. Le did most of them on the board and did not provide any accompanying slides, I decided to put the contents of the lectures along with the videos here.
In this post, I have included Dr. Le’s lecture videos and added content links with short descriptions to help you navigate them better.
Lecture 1: Neural Networks Review
Dr. Le starts from the fundamentals of neural networks, which is handy if you’d like to brush up on them. Otherwise, feel free to skim through the initial sections, but I promise there are interesting things later on. You may use the links below to skip to the relevant parts of the video. Let me know in the comments if they don’t work.
- Why Neural Networks: Motivation, Non-linear classification
- Mathematical Expression for NN: Decision function, Minimizing Loss and Gradient Descent (Correction in derivative), Making decision
- Backpropagation: Audience questions, Derivation for backpropagation, Backpropagation algorithm
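To make the topics above a little more concrete, here is a minimal sketch of a decision function, a squared-error loss, backpropagation and a single gradient descent step for a network with one hidden layer. It is not taken from Dr. Le’s board work: the parameter layout (`W1`, `b1`, `w2`, `b2`), the function names and the choice of sigmoid units with a squared-error loss are all my own assumptions for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_params(n_in, n_hidden, seed=0):
    """Small random weights (hypothetical layout: W1, b1, w2, b2)."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b2": 0.0,
    }

def forward(p, x):
    """Decision function: returns (hidden activations, network output)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(p["W1"], p["b1"])]
    output = sigmoid(sum(w * h for w, h in zip(p["w2"], hidden)) + p["b2"])
    return hidden, output

def example_loss(p, x, y):
    """Squared error on a single example."""
    return (forward(p, x)[1] - y) ** 2

def backprop(p, x, y):
    """Gradient of the squared error w.r.t. every parameter, via the chain rule."""
    hidden, output = forward(p, x)
    # Error signal at the output unit (derivative w.r.t. its pre-activation).
    delta_out = 2.0 * (output - y) * output * (1.0 - output)
    # Error signals at the hidden units, propagated back through w2.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(p["w2"], hidden)]
    return {
        "W1": [[d * xi for xi in x] for d in delta_hidden],
        "b1": delta_hidden,
        "w2": [delta_out * h for h in hidden],
        "b2": delta_out,
    }

def gradient_step(p, grads, lr):
    """One gradient descent update: parameter -= lr * gradient."""
    return {
        "W1": [[w - lr * g for w, g in zip(w_row, g_row)]
               for w_row, g_row in zip(p["W1"], grads["W1"])],
        "b1": [b - lr * g for b, g in zip(p["b1"], grads["b1"])],
        "w2": [w - lr * g for w, g in zip(p["w2"], grads["w2"])],
        "b2": p["b2"] - lr * grads["b2"],
    }

params = init_params(n_in=2, n_hidden=3)
x, y = [1.0, 0.0], 1.0  # a made-up training example
before = example_loss(params, x, y)
params_after = gradient_step(params, backprop(params, x, y), lr=0.1)
after = example_loss(params_after, x, y)
print(f"loss before: {before:.4f}, after one step: {after:.4f}")
```

With a small learning rate, the loss on this example should go down after the update. In a real setting you would loop over many examples and many steps instead of taking just one.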
Lecture 2: NNs in Practice
If you have already covered NNs in the past, the first lecture may have been a bit dry for you. The real fun begins in this lecture, when Dr. Le starts talking about his experiences of using deep learning in practice.
- Stochastic gradient descent
- Clarifications from Lecture 1: Data partitioning is not needed, Derivative of the loss function, Tip – Write unit tests!
- Ideas for practical implementations: Breaking Symmetry, Monitoring Progress on training, Underfitting and overfitting, How to select NN architecture and hyper-parameters, Other tips for improvements
- Deep Neural Networks: Review of why NN, Shallow vs. Deep, Rectified Linear Units, Definitions for deep NN, History of NN
- Deep NN Architectures: Autoencoder, Intuition for using autoencoders for initialization (Continued in the next lecture)
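Two of the practical ideas listed above, stochastic gradient descent and the tip to write unit tests, fit in a short sketch. The code below is my own illustration, not something shown in the lecture: it assumes a single sigmoid unit with a squared-error loss, checks the hand-derived gradient against finite differences, and then trains on a toy AND dataset. All names (`analytic_gradient`, `numerical_gradient`, `sgd`, `AND_DATA`) are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, x):
    """Single sigmoid unit; the last entry of `weights` is the bias."""
    z = sum(w * xi for w, xi in zip(weights, x)) + weights[-1]
    return sigmoid(z)

def loss(weights, x, y):
    """Squared error on one example."""
    return (predict(weights, x) - y) ** 2

def analytic_gradient(weights, x, y):
    """Hand-derived gradient of the squared error for one example."""
    out = predict(weights, x)
    delta = 2.0 * (out - y) * out * (1.0 - out)
    return [delta * xi for xi in x] + [delta]

def numerical_gradient(weights, x, y, eps=1e-5):
    """Central finite differences: slow, but hard to get wrong."""
    grad = []
    for i in range(len(weights)):
        up = weights[:i] + [weights[i] + eps] + weights[i + 1:]
        down = weights[:i] + [weights[i] - eps] + weights[i + 1:]
        grad.append((loss(up, x, y) - loss(down, x, y)) / (2 * eps))
    return grad

def test_gradient():
    """Unit test: the hand-derived gradient must match finite differences."""
    rng = random.Random(1)
    for _ in range(20):
        weights = [rng.uniform(-2, 2) for _ in range(3)]
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = float(rng.random() < 0.5)
        pairs = zip(analytic_gradient(weights, x, y),
                    numerical_gradient(weights, x, y))
        for analytic, numeric in pairs:
            assert abs(analytic - numeric) < 1e-6, (analytic, numeric)

def sgd(data, lr=0.5, steps=20000, seed=0):
    """Stochastic gradient descent: update on one randomly drawn example per step."""
    rng = random.Random(seed)
    # Small random initial weights rather than all zeros.
    weights = [rng.uniform(-0.1, 0.1) for _ in range(len(data[0][0]) + 1)]
    for _ in range(steps):
        x, y = rng.choice(data)
        grad = analytic_gradient(weights, x, y)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights

AND_DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0),
            ([1.0, 0.0], 0.0), ([1.0, 1.0], 1.0)]

test_gradient()
weights = sgd(AND_DATA)
for x, y in AND_DATA:
    print(x, y, round(predict(weights, x), 3))
```

The gradient check is cheap to write and catches most sign and indexing mistakes before they turn into a mysteriously non-converging training run. The learning rate and step count here are arbitrary values that happen to suit this toy problem.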
Lecture 3: Deep NN Architectures
In this lecture, Dr. Le finishes his description of NN architectures. He also talks a bit about how they are being used at Google for applications in image and speech recognition, and language modelling.
- Pre-training with autoencoders
- Convolutional NN (Convnets): Local receptive field, Why are Convnets useful?, Image classification, General Pipeline
- Recurrent NN: Word Vectors
- Applications: Google Brain and other ongoing work
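If the idea of a local receptive field in Convnets sounds abstract, the rough sketch below may help. It slides one small kernel over an image, so every output unit only looks at a small patch and all of them share the same weights. This is my own toy example with a made-up image and kernel, and the function name `conv2d_valid` is hypothetical. It uses no padding and a stride of 1, and, like most deep learning libraries, it does not flip the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Each output unit only sees a kh x kw patch: its local receptive field.
            # The same kernel weights are reused at every position.
            row.append(sum(kernel[a][b] * image[i + a][j + b]
                           for a in range(kh) for b in range(kw)))
        feature_map.append(row)
    return feature_map

# A 4x4 image with a vertical edge between the second and third columns.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A 2x2 kernel that responds to dark-to-bright transitions from left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]

print(conv2d_valid(image, kernel))  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map lights up only where the patch straddles the edge. Here the kernel has 4 weights no matter how large the image is, which is a big part of why Convnets scale to images.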
This week I attended a high-energy ISP seminar on Human-Data Interaction by Saman Amirpour. Saman is an ISP graduate student who also works with the CREATE Lab. His work-in-progress project on the Explorable Visual Analytics tool serves as a good introduction to this post:
While this may bear some resemblance to other projects, such as the famous Gapminder Foundation led by Hans Rosling, Saman presented a bigger picture in his talk and provided motivation for the emergence of a new field: Human-Data Interaction.
Big data is a term that gets thrown around a lot these days and probably needs no introduction. The big data problem has three parts: data collection, knowledge discovery and communication. Although we are able to collect massive amounts of data easily, the real challenge lies in using it to our advantage. Unfortunately, our machine learning algorithms are not yet sophisticated enough to handle this on their own. You really can’t do without a human in the loop for making sense of the data and asking intelligent questions. And as this Wired article points out, visualization is the key to allowing us humans to do this. But our present-day tools are not well suited for this purpose, and it is difficult to handle high-dimensional data. We have a tough time understanding such data intuitively. For example, try visualizing a 4D analog of a cube in your head!
So the relevant question is whether Human-Data Interaction (HDI) is really any different from the long-established areas of visualization and visual analytics. Saman suggests that HDI addresses much more than visualization alone. It involves answering four big questions on:
- Steering: helping users navigate the high-dimensional space. This is the main area of interest for researchers in the visualization area.
But we also need to solve problems with:
- Sense-making: how can we help users make discoveries from the data? Sometimes, the users may not even start with the right questions in mind!
- Communication: the data experts need a medium to share their models, which can in turn allow others to ask new questions.
- And finally, all of this needs to be done after solving the Technical challenges in building the interactive systems that support all of this.
Tools that can sufficiently address these challenges are the way to go in the future. They can truly help humans in their sense-making processes by providing them with responsive and interactive methods to not only test and validate their hypotheses but also communicate them.
Saman devoted the rest of the talk to demos of some of the tools that he has contributed to and gave some examples of beautiful data visualizations. Most of them were accompanied by a lot of gasping sounds from the audience. He also presented some initial guidelines for building HDI interfaces based on these experiences.
A biomarker is a measurable indicator of a biological condition. Usually it is seen as a substance or a molecule introduced in the body but even physiological indicators may function as dynamic biomarkers for certain diseases. Dr. Sejdić and his team at the IMED Lab work on finding innovative ways to measure such biomarkers. During the ISP seminar last week, he presented his work on using low-cost devices with simple electronics such as accelerometers and microphones, to capture the unique patterns of physiological variables. It turns out that by analyzing these patterns, one can differentiate between healthy and pathological conditions. Building these devices requires an interdisciplinary investigation and insights from signal processing, biomedical engineering and also machine learning.
Listening to the talk, I felt that Dr. Sejdić is a researcher who is truly an engineer at heart as he described his work on building an Asperometer. It is a device that is placed on the throat of a patient to find out when they have swallowing difficulties (dysphagia). The device picks up the vibrations from the throat and does a bit of signal processing magic to identify problematic scenarios. Do you remember the little flap called the epiglottis, which guards the entrance to your windpipe, from your high school biology? Well, that thing is responsible for directing food into the oesophagus (food pipe) while eating and preventing it from going into the wrong places (like the lungs!). As it moves to cover the windpipe, its movement registers as a characteristic motion pattern on the accelerometer. The Asperometer can then distinguish between regular and irregular patterns to find out when we should be concerned. The current gold standard for these assessments involves using some ‘Barium food’ and X-rays to visualize its movement. As you may have realized, the Asperometer is not only unobtrusive but also appears to be a safer way to do this. There are a couple of issues left to iron out though, such as removing sources of noise in the signal due to speech or even breathing through the mouth. We can, however, still use it in controlled usage scenarios in the presence of a specialist.
The remaining part of the talk briefly dealt with Dr. Sejdić’s investigations of gait, handwriting processes and preference detection, again with the help of signal processing and some simple electronics on the body. He is building on work in biomedical engineering to study age- and disease-related changes in our bodies. The goal is to explore simple instruments that provide useful information and can ultimately help to prevent, retard or reverse such diseases.
Yesterday, I attended a talk by Prof. David Forsyth. One of the perks of being a student is being able to attend seminars like these. The talk was mostly about his work on understanding pictures of rooms and inserting objects into them. It was a light talk and he did not go too much into the details of his paper. Apart from that, he gave an overview of the current work being done by computer vision researchers and his vision (pun intended) for the future. Overall, it was a fun talk to attend and a Friday evening well spent :D.
Here is a video showing a demo of the method, in case you are curious: