Making makefiles for your research code

Edit: This post needs a refresh using modern methods, namely Docker and Kubernetes. I hope to find some time to write a post on them one of these days…

There has been a lot of discussion lately about reproducibility in computer science [1] . It is a bit disappointing to know that a lot of the research described in recent papers is not reproducible. This is despite the fact that the only equipment needed to conduct a good part of these experiments is something you already have access to. In the study described in the paper here, only about half of the projects could even be built to begin with. And we are not yet talking about reproducing the same results. So why is it so hard for people who are leaders in the field to publish code that can be easily compiled? There could be many factors – lack of incentives, time constraints, maintenance costs etc. – that have already been put forward by others, so I won't really go into that. This post is about my experiences with building research code. And I have had my own moments of struggle with it now and then!

One is always concerned about not wasting too much effort on seemingly unproductive tasks while working on a research project. But spending time on preparing proper build scripts can in fact be more efficient when you need to send code to your advisor, collaborate with others, or publish it. Plus, it makes things easy for your own use. This becomes much more important for research code, which often has several different pieces delicately stitched together just to show the specific point that you are trying to make in your project. There's no way even you would remember how it all worked once you are done with it.

In one of my current projects I have code from four different sources, all written in different programming languages. My code (like most other projects) builds upon a couple of projects and work done by other people. It certainly doesn't have to be structured the way it is right now, and that structure probably wouldn't be acceptable for anything other than a research project. It seemed quite logical to reuse the existing code as much as possible to save time and effort. However, this makes it very hard to compile and run the project when you don't remember the steps involved in between.

One way to tackle this problem is to write an elaborate readme file. You could even use simple markdown tags to format it nicely. But that is not quite elegant enough as a substitute for build scripts. You wouldn't know how many "not-so-obvious" steps you'd skip during documentation. Besides, it wouldn't be as simple as running a single build command to try out the cool thing that you made. A readme, on the other hand, should carry other important stuff like a short introduction to the code, how to use it and a description of the "two"-step build process that you chose for it.

Luckily this is not a new problem, and generations of programmers have provided us with excellent tools for getting around it. They offer a mechanism to document your readme steps in a very systematic way. And there’s no reason you shouldn’t use them!

One such program that you may already know about is make. Here's a short and sweet introduction to make by @mattmight. Allow me to take this a little further to demonstrate why these tools are indeed so useful. Let's start with something simple. A very basic makefile could read something like:
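For instance, a minimal sketch for a project with a single C file might look like this (the file names and compiler flags are just placeholders; also remember that recipe lines in a makefile must be indented with a tab):

    # Build the executable 'hello' from hello.c
    hello: hello.c
    	gcc -Wall -o hello hello.c

    # Remove the generated binary
    clean:
    	rm -f hello

Typing make builds the default target, and make clean tidies things up.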

But its advantages become clearer when you'd like to handle a more complicated scenario. So let's cook up an example for that; say, I'd like to convert some Python code to C (don't ask why!) using Cython and then create an executable by compiling the converted C code. Here's how I'd probably write a makefile for it:
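Here is a sketch of what I mean, assuming a single hello.py that Cython converts (using its --embed option) into a C file with its own main(); the file names, Python version and include path are placeholders for my own setup:

    # Compile the generated C code against the Python headers and library
    hello: hello.c
    	gcc -o hello hello.c -I/usr/include/python2.7 -lpython2.7

    # Convert the Python source into C with an embedded interpreter entry point
    hello.c: hello.py
    	cython --embed -o hello.c hello.py

    clean:
    	rm -f hello hello.c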

Now the wins are quite obvious. It saves you from remembering such a long build command and also documents the steps you need to follow to build the code. But we still have a couple of issues left if you were to distribute your code. You'd notice that I have hard-coded my Python version as well as the path to the include directories in my makefile. Running this on a different computer would certainly cause problems. One way to handle this is to declare all the variables at the beginning of your makefile:
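Something along these lines (again, the concrete values are just placeholders for my machine):

    # Everything a user might need to tweak lives at the top
    PYTHON_VERSION = 2.7
    PYTHON_INCLUDE = /usr/include/python$(PYTHON_VERSION)
    PYTHON_LIB     = python$(PYTHON_VERSION)
    CC             = gcc

    hello: hello.c
    	$(CC) -o hello hello.c -I$(PYTHON_INCLUDE) -l$(PYTHON_LIB)

    hello.c: hello.py
    	cython --embed -o hello.c hello.py

    clean:
    	rm -f hello hello.c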

This makes it quite easy for the poor souls using your code to edit the variables according to their configurations. All of the things to change are conveniently located at the top. But wouldn’t it be nice if you could save them from all of this manual labor of finding the right paths for linking libraries, versions of the software installed etc. as well? Reading and understanding your code is already hard enough :D. A shell script could have been quite useful, no?

The awesome people behind GNU Autotools have already done the hard work and have given us a bunch of tools to do exactly what we need here. These tools include libtool, automake and autoconf to help you create and configure your makefiles.

To write a configure script, you'd first need a configure.ac file. This file is used by the autoconf tool to generate a script that fills in the variables in the makefile. Using these tools makes sure that all of your projects have a consistent two-step build process: anyone wanting to run your code simply runs the configure script followed by make to build your project. No manual tweaking of variables is required during these steps.

There are a couple of other helper tools that offer you the luxury of using macros, which cut your work further when writing these files. Let us continue with our Cython example here.

With just two statements in my configure.ac, I'd be able to create a configure script that fills in my makefile variables:
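Roughly like this; the project name and version are placeholders, and newer autoconf releases would spell the second statement as AC_CONFIG_FILES([Makefile]) followed by AC_OUTPUT:

    # configure.ac
    AC_INIT([cython-demo], [0.1])
    AC_OUTPUT(Makefile)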

And to tell it what to fill in, I'll add some placeholder text to my makefile and call it Makefile.in:
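The placeholders take the form @VARIABLE@, which configure substitutes when it writes out the final Makefile. The sketch below assumes the variables provided by the AC_PYTHON_DEVEL macro (introduced a little later) and an AC_PROG_CC line for @CC@; treat the names as illustrative:

    # Makefile.in
    CC              = @CC@
    PYTHON_CPPFLAGS = @PYTHON_CPPFLAGS@
    PYTHON_LDFLAGS  = @PYTHON_LDFLAGS@

    hello: hello.c
    	$(CC) -o hello hello.c $(PYTHON_CPPFLAGS) $(PYTHON_LDFLAGS)

    hello.c: hello.py
    	cython --embed -o hello.c hello.py

    clean:
    	rm -f hello hello.c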

At this point I can run autoconf to generate the configure script that would do all the work of figuring out and filling in the variables.
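The whole workflow then boils down to something like this (you run autoconf once and ship the generated configure script; your users only need the last two steps):

    autoconf        # reads configure.ac and produces the configure script
    ./configure     # detects paths and versions, writes Makefile from Makefile.in
    make            # builds the project as usual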

I can even code my own checks here. So let's add a couple: with my configure script, I'd like to not only assign the path for linking Python libraries but also check if the user has all the prerequisites installed on the system to be able to compile the code. You have the option to prompt the user to install the missing pieces or even start an installation for them. We'll stop at just printing a message asking the user to do the needful. So let's go back to our configure.ac.
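Here is a sketch of how the updated configure.ac might look. AC_PYTHON_DEVEL is not a built-in macro; it comes from the Autoconf Archive (current versions call it AX_PYTHON_DEVEL), so it has to be made available to autoconf, for example by dropping it into an m4/ directory. The rest are standard autoconf macros:

    # configure.ac
    AC_INIT([cython-demo], [0.1])
    AC_PROG_CC

    # Make sure the cython executable is available on the user's PATH
    AC_CHECK_PROG([HAVE_CYTHON], [cython], [yes], [no])
    if test "x$HAVE_CYTHON" != "xyes"; then
      AC_MSG_ERROR([cython was not found; please install Cython before building])
    fi

    # Locate the Python headers and libraries, requiring at least version 2.5
    AC_PYTHON_DEVEL([>= '2.5'])

    AC_OUTPUT(Makefile)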

Here I have added some code to check if cython is available on the user's machine. Also note that with the AC_PYTHON_DEVEL macro, I am making sure that the Python installed on the user's machine is newer than version 2.5. You can add more checks here depending on what else is needed for your code to build and run. The best part is that a lot of macros are already available, so you don't have to write them from scratch.

There’s more stuff that you could explore here: alternatives like cmake provide a more cross-platform approach to managing your build process and also have GUIs for these steps. A couple of other tools that can handle the configuration portion, such as pkg-config, exist as well, but unlike make they may not come pre-installed on most operating systems. There are a few language-specific project managers that you could also consider (like Rake for Ruby). If you are dealing with a Java project, then Ant or Maven are also good candidates. IDEs such as Netbeans create configuration files for them automatically. There are a lot of newer (relatively speaking) projects out there that let you easily package code involving web applications (more on this here) and make them ready for deployment on other machines.

Footnotes

  1. You might also be interested in this article written in response to the one raising questions about reproducibility ^

Talk: Human-Data Interaction

This week I attended a high-energy ISP seminar on Human-Data Interaction by Saman Amirpour. Saman is an ISP graduate student who also works with the CREATE Lab. His work-in-progress project on the Explorable Visual Analytics tool serves as a good introduction to this post:

While this may bear some resemblance to other projects, such as the famous Gapminder Foundation led by Hans Rosling, Saman presented a bigger picture in his talk and provided motivation for the emergence of a new field: Human-Data Interaction.

Big data is a term that gets thrown around a lot these days and probably needs no introduction. There are three parts to the big data problem: data collection, knowledge discovery and communication. Although we are able to collect massive amounts of data easily, the real challenge lies in using it to our advantage. Unfortunately, our machine learning algorithms are not yet sophisticated enough to handle this on their own. You really can’t do without the human in the loop for making sense of the data and asking intelligent questions. And as this Wired article points out, visualization is the key to allowing us humans to do this. But our present-day tools are not well suited for this purpose, and it is difficult to handle high-dimensional data. We have a tough time understanding such data intuitively. For example, try visualizing a 4D analog of a cube in your head!

So now the relevant question one could ask is whether Human-Data Interaction (or HDI) is really any different from the long-existing areas of visualization and visual analytics. Saman suggests that HDI addresses much more than visualization alone. It involves answering four big questions on:

  • Steering To help navigate the high-dimensional space. This is the main area of interest for researchers in the visualization area.

But we also need to solve problems with:

  • Sense-making i.e. how can we help users make discoveries from the data? Sometimes, the users may not even start with the right questions in mind!
  • Communication The data experts need a medium to share their models that can in turn allow others to ask new questions.
  • And finally, all of this needs to be done after solving the Technical challenges in building the interactive systems that support all of this.

Tools that can sufficiently address these challenges are the way to go in the future. They can truly help humans in their sense-making processes by providing them with responsive and interactive methods to not only test and validate their hypotheses but also communicate them.

Saman devoted the rest of the talk to demoing some of the tools that he has contributed to and gave some examples of beautiful data visualizations. Most of them were accompanied by a lot of gasping sounds from the audience. He also presented some initial guidelines for building HDI interfaces based on these experiences.

Machines understand Rahul Gandhi!

I have a (bad) habit of checking my Twitter feed while at work. Yesterday after my machine learning class, I found my timeline filled with tweets mocking Rahul Gandhi about his first-ever television interview. Naturally, I was curious to know why, and I tried to give it a listen. Most of his answers made no sense to me whatsoever! But then guess what? Who else is bad at responding to questions in natural language? The machines are! Maybe it was time to put them to a test and see if the machines could understand Mr. Gandhi. Making use of the transcript made available by the Times of India and some free NLP tools(ets), I spent a couple of hours (unproductive, of course :P) trying to make sense of the interview.

Here’s a wordle summary of his answers, which should at least give you an overview of what was spoken about during the interview:

Such system. Many people. Wow! Apparently the word ‘system’ was used 70 times during the entire interview.

Here are some of the most used (best) words from the transcript. The number of times each was used is mentioned in parentheses.

  1. system (70)
  2. people (66)
  3. going (52)
  4. party (51)
  5. country (44)
  6. want (34)
  7. congress (34)
  8. power (32)
  9. political (31)
  10. issue (26)

Next, I set out to generate a summary of his answers. And lo! to my surprise, it made perfect sense (contrary to what you usually get from a summarizer). This is the summary generated from the online tool at http://freesummarizer.com/:

What I feel is that this country needs to look at the fundamental issues at hand, the fundamental political issue at hand is that our Political system is controlled by too few people and we absolutely have to change the way our political system is structured, we have to change our Political parties, we have to make them more transparent, we have to change the processes that we use to elect candidates, we have to empower women in the political parties, that is where the meat of the issue but I don’t hear that discussion, I don’t hear the discussion about how are we actually choosing that candidate, that is never the discussion.

That ascribes huge power to the Congress party, I think the Congress party’s strength comes when we open up when we bring in new people, that is historically been the case and that is what I want to do.

The Gujarat riots took place frankly because of the way our system is structured, because of the fact that people do not have a voice in the system. And what I want to do. He was CM when Gujarat happened The congress party and the BJP have two completely different philosophies, our attack on the BJP is based on the idea that this country needs to move forward democratically, it needs push democracy deeper into the country, it needs to push democracy into the villagers, it needs to give women democratic powers, it needs to give youngsters democratic powers.

You are talking about India, we have had a 1 hour conversation here, you haven’t asked me 1 question about how we are going to build this country, how we are going to take this country forward, you haven’t asked me one question on how we are going to empower our people, you haven’t asked me one question on what we are going to do for youngsters, you are not interested in that.

There is the Congress Party that believes in openness, that believes in RTI, that believes in Panchayati Raj, that believes in giving people power. The Congress party is an extremely powerful system and all the Congress party needs to do is bring in younger fresher faces in the election which is what we are going to do and we are going to win the election.

In retrospect, repeating a few points several times is a good enough cue for an auto-summarizer to identify important sentences. This interview was perfect for a task like this as Mr. Gandhi repeated the same set of (rote) answers for almost every question that he was asked. Perhaps this is what he was hoping for? To make sure that when lazy journalists use automatic tools to do their jobs, it would give them a perfect output!

Now coming to the interesting bit. If you were a human listener like me and wanted to read only the answers that he really did attempt to give [1] , what would you do? Fear not! I have built an SVM classifier from this transcript that you could make use of in the future. I used LightSide, an open source platform created by CMU LTI researchers, to extract features from the transcript of his answers. Let’s get into the details then.

When you go for an interview, you can either choose to answer a question or try to avoid it by cleverly diverting from the main question asked. In Rahul’s case, we have answers that can be grouped into three categories – a) the questions that he answered, b) the ones he managed to successfully avoid and c) the LOL category (the answer bears no resemblance to the question asked). I combined categories (b) and (c) to come up with two classes: ANSWERED and UNANSWERED. You may check out my list of classes here and read the interview answers from the Times of India article here. They follow the same order as in the transcript, with the exception of single-line question-answer pairs that would’ve otherwise served as noise for machine learning. I selected a total of 114 questions in all, out of which 45 were answered and the remaining 69 were either successfully avoided or belonged to the LOL category [2] .

For feature extraction, I used quite simple language features like bigrams, trigrams, line length after excluding stop words etc. You can download them in the LightSide feature format. I used the SVM plugin to learn the classification categories from these features. Here is the final model that the tool built using the extracted features. And the results were surprising (or probably not :). With 10-fold cross validation, the resulting model had an accuracy of over 72%! An accuracy percentage like this is considered to be exceptional (in case you are not familiar with the field). The machines indeed understand Rahul Gandhi!

Unfortunately, I did not have enough data to run a couple of tests separately. We’ll probably have to wait for Mr. Gandhi to give his next interview for that. Hope that the Congress party members work as hard as the NLP researchers so that we can have a good competition by then!

Footnotes

  1. To his credit, he did make an effort to answer about 40% of the questions. ^
  2. These are solely based on my personal opinion. ^

The Story of Digital Color

Earlier this year, I did some work on digital color management for a project. During my readings for the project, I accumulated a lot of interesting articles that I thought I could share. The festive colors around inspired me to finally write about them (and also in case I need to refer to them again :D). In this post, I have put together a collection of articles, papers, wikis, comics, podcasts … that you can use to find out more about the subject. Let’s then begin our story, starting all the way back at how we see and perceive colors:

I. The background

Dispersion of light - Prism experiment.
The ‘splitting of white light into seven colors’ experiment. In reality, you see more of a continuous spectrum than the discrete seven colors shown above.

Before we get started on digital color, let’s refresh some high-school science topics. It is kind of mind-boggling to think about it, but the concept of color is something that you make up in your own head. Fundamentally, colors are electromagnetic radiation with frequencies in the visible range. Our eye’s retina is layered with mainly two types of photoreceptor cells: the rods (black-and-white vision) and the cones. The cones enable us to see color and are of three types: rho (more sensitive to longer wavelengths), gamma (medium) and beta (short). At this point I would like to introduce you to the applets designed by Prof. Mark Levoy. I’d strongly recommend playing around with them, at least the ones on the Introduction to color theory page.

Radiation of different wavelengths (or frequencies) excites these color receptors in our eyes to varying levels, which the brain then processes to give us the perception of color. This phenomenon is known as metamerism or the tri-stimulus response in humans. Reproducing color with just three primaries is cheaper and easier to control. A similar technique is exploited in building displays for our computer screens and mobile phones as well – the objects that we cherish the most and spend most of our time staring at. They also use three types of sources to produce all the colors on the display.

Having more or fewer types of color receptors changes how we see the world. Most men have 3 types of cones (8% have even fewer types and are color-blind), while women can have up to four due to genetic factors. This Oatmeal comic beautifully illustrates how the number of receptors affects color vision in erm.. the Oatmeal way. As a bonus, you also get to find out how the Mantis Shrimp’s vision is powered by technology superior to humans. [1] All of this, and the search for a tetrachromat woman, can be found in this Radiolab podcast on colors.

Now that you understand that the beauty is indeed in the eye, let’s get a little further into color theory and move on to our next topic.

II. Color Theory Basics

In the previous section, we learned that the illusion of color can be created with three primary colors. Based on this, we have two types of color systems:

a) Additive: We add varying quantities of the primaries to get other colors. If we are using R, G and B as our primaries, we can have R+G = yellow, R+B = magenta and B+G = cyan. This kind of color mixing is used in digital displays, where we have an individual source for each primary.

b) Subtractive: A paint or ink based medium follows such a system. It is named so because we perceive the color of an object as the kind of light that it reflects back, while it absorbs the rest of the colors. This can be thought of as subtracting colors from the incoming light and reflecting what remains. An example of this system could have Cyan, Magenta and Yellow as the primaries. We also add black to increase contrast and for other practical concerns in the popular CMYK color format.

The other type of system that you may have heard about deals with Hue, Saturation and Value (HSV). These three dimensions are supposed to describe how we actually understand colors:

Representation of an HSV color system. Hue is depicted as an angle around the circle, saturation along the radial line and value along the vertical axis through the center.
  1. Hue: name of the color – red, yellow, blue
  2. Saturation: a color’s ‘strength’. A neutral gray would have 0% saturation, while a saturated apple red would be 100%. Pink would be an example of something in between – an unsaturated red.
  3. Value: It deals with the intensity of light. You can kind of understand it as seeing colors in dim and bright light. You don’t see any color in very dim environments, nor when it is blindingly bright.

This color system is more suited for image processing and manipulation uses.

III. Color Spaces

Finally we have arrived at the computer science portion of our discussion. Since computer displays have to process and present all of these different types of colors, we must find a good way to represent them as well. Most color spaces have 3 to 4 components (or channels or dimensions). A complete digital image reproduction system would involve three phases: 1) acquiring the image (say using a camera), 2) transmitting it (saving it in an image format / using an A/V cable etc.) and finally 3) displaying it (on a screen, printing or projecting it). Some color formats are more suited for a particular phase in this system. These color spaces would be based on one of the color systems that we learned about in the section above.

The simplest of the color spaces would be a Gray color space. It only has a single channel with values varying from black to white. The common 3-channel color space families are:

RGB family: These are mainly used in displays and scanners. Members of this family include the sRGB and Adobe RGB color formats, among others. These formats are specified in international standards published by various organizations.

YUV / YCbCr / YCC family: These are the most unintuitive types of color spaces. They separate brightness information from color and were designed with transmission efficiency and storage in mind.

Device Independent Colors: All the color spaces that we have discussed so far may produce varied results on different output devices unless calibrated. Different devices have different ranges of colors that they can produce. [2] As a result, they are called device-dependent color spaces. To counter this, some clever folks at the CIE developed imaginary color spaces (although not useful for making any output devices [3] ) that specify color as perceived by a ‘standard’ human. Examples of these color spaces include CIE XYZ, L*a*b and L*u*v.

The Apple developer article on color management is a good source to read more on this topic.

IV. Color Display: Techniques and Terminologies

Now that we have covered most of the basic stuff, let’s briefly talk about other common terms related to color reproduction that you may have encountered.

  • Color Depth and Bits per pixel (BPP)
    These parameters define the number of bits used to represent a single pixel’s color. If you are using 8-bit color, you will use 3 bits (or 8 levels) each for R and G and 2 bits for B. This is assuming that your 3 channels are red, green and blue. Similarly, you can have other color depths like 16-bit and 24-bit (True Color), the latter representing 256 shades each of red, green and blue. Modern displays support something known as deep color (up to 64-bit) with a gamut comprising a billion or more colors. Although the human eye cannot distinguish between so many colors, we need these bigger gamuts for high dynamic range imaging. This also affects image perception, as we do not perceive red, green and blue in equal measure. So even though we may be producing more colors on the screen, we may not have a sufficient number of shades for the colors we are more sensitive to. Having a larger gamut takes care of all those shades that our eyes can distinguish but that need more information in order to be represented in the digital space.
  • Color Temperature
    Color temperature is derived from the color of radiation emitted by a black body when heated to a particular temperature. Hence, it is commonly specified in units of Kelvin. Our main interest here is adjusting the white point, i.e. the chromaticity of the color reproduced by equal red, green and blue components in an output device. This allows us to make the colors appear “warmer” or “cooler”.
  • Dynamic Range and Quantization
    You may have heard about newer camera phones having an HDR mode. These phones process a photograph so that it can have both clear shadows and bright regions in the same frame. The dynamic range of the image is what they are referring to here. Dynamic range is the ratio of the brightest to the darkest light level. These levels are quantized into several intensity levels in this range. An 8-bit device would have 256 such intensity levels.
  • Gamma
    We are able to better distinguish between color intensities at lower levels.

    Our eye is sensitive to the various intensities of colors in a non-linear or power relationship. This allows us to see clearly both indoors (or at night) and outdoors in bright daylight. [4] This leads us to something known as gamma correction in computer graphics. Roughly speaking, the encoded value follows a power law of the form V_encoded = V_linear^(1/γ), with γ typically around 2.2 for computer displays. You can map the linear levels of your images to a curve of your choice. This must be done according to the characteristics of the output device, like the monitor. When you adjust the brightness, contrast or gamma settings on your display, you are essentially manipulating this curve. This GPU Gems article from Nvidia’s Developer Zone tells you all you need to know about gamma. I think I’ll stop here, but this topic probably deserves more space than a short paragraph. There are a couple of excellent articles by Charles Poynton on this, such as The Rehabilitation of Gamma and the Gamma FAQ.

  • Dithering
    Our eyes tend to smear adjacent pixel colors together. This phenomenon is exploited to simulate higher color depth than an output device or image format could otherwise support. You can see some examples of dithering here. In another type of dithering, known as temporal dithering, colors are changed in consecutive frames to create a similar illusion.

V. The sRGB color format

Although now a bit dated, most consumer devices still use sRGB as the default color space. When you see sRGB content on compatible devices, you should see the same colors everywhere. You can find out more about what goes into making a standard color space here: http://www.w3.org/Graphics/Color/sRGB.html. After reading about the fundamentals of digital color, I hope you’ll have a good understanding of how to read a specification like that.

This is a very short post considering the breadth of topics that it deals with. I have tried to highlight the human factors that influenced the development of color theory and technologies throughout this post. They sure were able to pique my curiosity to find out more. I hope that you’ll also like this collection of resources to read about color. I would be glad if you could point out any errors and typos here.

Image Credits: Munsell System. © 2007, Jacob Rus

Footnotes

  1. Update: Study reveals that Mantis shrimp uses color to communicate! http://www.livescience.com/42797-mantis-shrimp-sees-color.html?cmpid=514636 ^
  2. More info here: http://graphics.stanford.edu/courses/cs178-10/applets/colormatching.html ^
  3. Device independent colors are useful for studying color theory. They are also used as intermediaries when converting between different color spaces. ^
  4. This could be similar to other psychometric curves: http://en.wikipedia.org/wiki/Psychometric_function ^

The Stroop Test

The Stroop task is among the popular research tools used in experimental psychology. It involves creating a conflict situation using words and colors. The original experiment was conducted by John Ridley Stroop way back in 1935. In these tasks, the participants are asked to read color words – like “red“, “green“, “blue” etc. – aloud. In his first experiment, Stroop compared the reading times of words in two conditions: a) the neutral condition, where words are presented in normal black ink on a white background; and b) the incongruent condition, which uses incompatible color-word combinations (e.g. the word “red“ printed in green ink). It was during his second experiment that he found that naming times were much faster for colored rectangles than for incongruent color-word combinations.

Other Stroop paradigms have also compared incongruent with congruent conditions. The Wikipedia article on the subject summarizes their results:

Three experimental findings are recurrently found in Stroop experiments:

  1. A first finding is semantic interference, which states that naming the ink color of neutral stimuli (e.g. when the ink color and word do not interfere with each other) is faster than in incongruent conditions. It is called semantic interference since it is usually accepted that the relationship in meaning between ink color and word is at the root of the interference.
  2. Semantic facilitation, explains the finding that naming the ink of congruent stimuli is faster (e.g. when the ink color and the word match) than when neutral stimuli are present (e.g. when the ink is black, but the word describes a color).
  3. The third finding is that both semantic interference and facilitation disappear when the task consists of reading the word instead of naming the ink. It has been sometimes called Stroop asynchrony, and has been explained by a reduced automatization when naming colors compared to reading words.

I needed a mobile (Yay, HTML5!) Stroop task as one of the experiments in a class project. First, I tried using the Google Speech API, but it was failing to recognize my own speech inputs, so I resorted to using regular buttons instead. I also wanted to vary the difficulty of the task, so I added an option for more color variations (more buttons, more targets, Fitts’ law …). Here’s what I could come up with in a couple of hours:

A Symphony of Drums

I spent my afternoon today at Tablaphilia – a tabla symphony composed by Pt. Samir Chatterjee. It was incredible to listen to 22 drummers playing together in harmony on a single stage, conducted as a symphony. Do you find it strange when I talk about harmonics while describing something performed on drums? A lot of people new to the tabla certainly do. You could sense the confusion in the audience as they watched the band “tune” their drums. Most drums, such as the ones used in a rock performance, produce only inharmonic vibrations. You’d probably prefer a dry sound without any ‘ring’ on those drums, contrary to what you get from a tabla.

A tabla is usually played as a set of two drums. The one on the left is the bass drum while the right one has a pitch and can produce harmonic overtones.

Tabla is part of a family of musical drums, including the Mridangam and Pakhawaj, that can produce harmonic overtones. In fact, the tabla appeared relatively late in Indian music. There are several theories about its origins, such as the legend involving the famous musician Amir Khusro [1] , and how he invented the tabla when a jealous competitor broke his drum. Many also believe that the tabla may have more ancient roots in the instrument known as the Tripushkara, which had three different parts like a drum kit: horizontal, vertical and embraced. I find this theory more convincing, but the absolute beginnings of the tabla remain unclear. However, we do know that the tabla became popular in the 17th century, when there was a need for a drum that could support the faster and more complex rhythm structures required by the then-emerging Sufi music style. The “finger and palm” technique used on the tabla was much more suited for it than the heavier-sounding Pakhawaj.

Alright, coming back to the central theme of the post: what allows a drum to produce harmonics? Without going into its mathematics (read: I can’t :D), allow me to refresh your memory about some concepts in harmonics. Perhaps you remember the experiments performed with the sound column and a tuning fork in your Physics lab. [2] You’d recall that a string vibrates at some fundamental frequency, and its integer multiples are known as harmonics. Stringed instruments can vibrate in a harmonic series, and a unique combination of these harmonics makes them sound the way they do. But what about a membrane such as the ones used on these drums? It turns out that the drum-head of the pitch drum (the right one; see picture) can also produce five such harmonics, giving it its musical effect. These are responsible for the different bols that are played on the tabla. There are two papers by the Nobel prize-winning physicist C.V. Raman, who explored them in a lot of detail: “Musical drums with harmonic overtones” (C.V. Raman and S. Kumar, 1920) and “The Indian musical drums” (C.V. Raman, 1934).

Unlike most drums, the tabla has a sophisticated mechanism to alter its pitch. All tablas are designed to have a pitch that can vary within a certain range. This is done by beating along the sides of the head and adjusting the tension in the straps on the side with wooden blocks. By carefully tuning several tablas, you can derive the different notes used in music and have something as amazing as this:

It was exhilarating to hear the “notes” played by the drummers during the concert piece today, as the tablas provided both rhythm and melody. As I was enjoying the music, I couldn’t help but wonder about the engineering skills and knowledge required to build these instruments. Most artisans responsible for building them have no formal training in the theory behind it. All the ingredients that go into making the drum, its shape, the three layers of membrane on the head and even the placement of the black spot (or syahi) contribute to the unique sounds of the instrument. Some of the techniques for building the drum have been passed down within families over hundreds of years. The exact recipe for making the syahi is still kept as a closely guarded secret in these families. This loading of the membrane with the syahi is what allows the tabla to produce overtones. Did you notice that it is placed off-center on the bass (the left drum), though? The membrane also has different regions which, when struck in a certain manner, produce the overtones. When you go about counting the different parameters and factors, it makes you wonder how, by mere experience, the makers of the tabla could have designed it.

Listening to Tablaphilia was an incredibly unique experience. I always enjoy it when students play together in a tabla class while learning a piece, but hearing a complete performance like this was very new for me. Usually, tabla performances are either done solo or accompany a melody instrument, but almost never as an ensemble. Some connoisseurs of Indian music would argue that a format like this doesn’t offer any room for improvisation. Agreed; but then what is music without variety and experimentation? I look forward to more such symphonies and concerts.

References

  1. C.V. Raman (1934), The Indian musical drums. Available: http://www.ias.ac.in/jarch/proca/1/179-188.pdf.
  2. C.V. Raman and S. Kumar (1920), Musical drums with harmonic overtones. Available: http://www.nature.com/nature/journal/v104/n2620/abs/104500a0.html.

Footnotes

  1. Amir Khusro the second, as I am told, but I couldn’t find a reference online ^
  2. The Wikipedia article on Harmonics has some really neat illustrations. ^

Talk: The Signal Processing Approach to Biomarkers

A biomarker is a measurable indicator of a biological condition. Usually it is thought of as a substance or molecule introduced into the body, but even physiological indicators may function as dynamic biomarkers for certain diseases. Dr. Sejdić and his team at the IMED Lab work on finding innovative ways to measure such biomarkers. During the ISP seminar last week, he presented his work on using low-cost devices with simple electronics, such as accelerometers and microphones, to capture the unique patterns of physiological variables. It turns out that by analyzing these patterns, one can differentiate between healthy and pathological conditions. Building these devices requires interdisciplinary investigation and insights from signal processing, biomedical engineering and machine learning.

Listening to the talk, I felt that Dr. Sejdić is a researcher who is truly an engineer at heart as he described his work on building an Asperometer. It is a device that is placed on the throat of a patient to find out when they have swallowing difficulties (dysphagia). The device picks up the vibrations from the throat and does a bit of signal processing magic to identify problematic scenarios. Do you remember, from your high school biology, the little flap called the epiglottis that guards the entrance to your windpipe? Well, that thing is responsible for directing food into the oesophagus (food pipe) while eating and preventing it from going into the wrong places (like the lungs!). As it moves to cover the windpipe, it records a characteristic motion pattern on the accelerometer. The Asperometer can then distinguish between regular and irregular patterns to find out when we should be concerned. The current gold standard for these assessments involves using some ‘barium food’ and X-rays to visualize its movement. As you may have realized, the Asperometer is not only unobtrusive but also appears to be a safer method. There are a couple of issues left to iron out, though, such as removing sources of noise in the signal due to speech or even breathing through the mouth. We can, however, still use it in controlled scenarios in the presence of a specialist.

The remaining part of the talk briefly dealt with Dr. Sejdić’s investigations of gait, handwriting processes and preference detection, again with the help of signal processing and some simple electronics on the body. He is building on work in biomedical engineering to study age- and disease-related changes in our bodies. The goal is to explore simple instruments that provide useful information and can ultimately help to prevent, retard or reverse such diseases.