Machines learn to play Tabla, Part – 2

This is a follow-up to my earlier post on Machines Learn to play Tabla. You may wish to check it out first before reading this one…

Three years ago, I published a post on using recurrent neural networks to generate tabla rhythms. Sampling music from machine learned models was not in vogue then. My post received a lot of attention on the web and became very popular. The project had been a proof-of-concept and I have wanted to build on it for a long time now.

This weekend, I worked on making it more interactive and I am excited to share these updates with you. Previously, I was using proprietary software to convert tabla notation to sound. That made it hard to experiment with sampled rhythms and I could share only a handful of sounds. Taking inspiration from our friends at Vishwamohini, I am now able to convert bols into rhythm on the fly using MIDI.js.
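To give a feel for what "converting bols into rhythm" involves, here is a minimal sketch in Python (not the JavaScript used by the player). It assumes a space-separated notation, a `-` token for rests, and evenly spaced bols. The `BOL_TO_NOTE` table and the `schedule` function are hypothetical names for illustration, and the note numbers are placeholders rather than the mapping the actual player uses.

```python
from typing import List, Tuple

# Hypothetical bol-to-MIDI-note table. The note numbers are placeholders,
# not the mapping used by the real player.
BOL_TO_NOTE = {"dha": 60, "dhi": 62, "ta": 64, "te": 65, "tu": 67, "na": 69}
REST = "-"  # assumed token for a silent slot


def schedule(notation: str, bpm: float = 120.0, bols_per_beat: int = 2) -> List[Tuple[float, int]]:
    """Turn a string of bols into (start_time_in_seconds, midi_note) events."""
    step = 60.0 / bpm / bols_per_beat  # duration of one bol slot in seconds
    events = []
    for i, bol in enumerate(notation.lower().split()):
        if bol == REST:
            continue  # a rest still occupies a slot but produces no sound
        events.append((round(i * step, 6), BOL_TO_NOTE[bol]))
    return events


events = schedule("dha dha te te dha dha tu na", bpm=120, bols_per_beat=2)
for start, note in events:
    print(f"{start:5.2f}s  note {note}")
```

In a browser, each event would then be handed to the synthesizer as a note-on call delayed by `start` seconds. Changing `bpm` only rescales the start times, which is why tempo can be adjusted without touching the notation.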

Let me show off the new JavaScript synthesizer using a popular Delhi kaida. Hit the ‘play’ button to listen:

Now that you’ve heard the computer play, here’s an example of it being played by a tabla maestro:

Of course, the synthesized outcome is not much of a comparison to the performance by the maestro, but it is not too bad either…

Now to the more exciting part: since our browsers have learned to play the tabla, we can throw in the char-rnn model that I built in the earlier post. To do this, I used the RecurrentJS library and combined it with my JavaScript tabla player:

Feel free to play around with tempo and maximum character-limit for sampling. When you click on ‘generate’, it will play a new rhythm every time. Hope you’ll enjoy playing with it as much as I did!
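For readers curious about what the character limit controls, here is a rough sketch of character-level sampling in Python. To keep it self-contained it swaps the recurrent network for a much simpler bigram model (counts of which character follows which), so it is a stand-in and not the char-rnn itself. The tiny corpus, the `temperature` parameter, and the function names are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count how often each character follows each other character."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts


def sample(counts, seed_char, max_chars, temperature=1.0, rng=None):
    """Generate up to max_chars characters, one at a time."""
    rng = rng or random.Random()
    out = [seed_char]
    while len(out) < max_chars:
        options = counts.get(out[-1])
        if not options:
            break  # no known continuation for the last character
        chars = list(options)
        # count ** (1 / T) matches a softmax over log-probabilities divided by T:
        # low T favours the most common continuation, high T flattens the choice.
        weights = [options[c] ** (1.0 / temperature) for c in chars]
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)


# Made-up training text for illustration only.
corpus = "dha dha te te dha dha tu na ta ta te te dha dha dhi na "
model = train_bigrams(corpus)
print(sample(model, "d", max_chars=40, rng=random.Random(0)))
```

Because every character is drawn at random from the model's distribution, each click on a 'generate' button would produce a different sequence, and the character limit simply caps how long the loop runs.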

The player has a few kinks at this point, and I am working towards fixing them. You too can contribute to my repository on GitHub.

There are two areas that need major work:

Data: The models for my earlier post were trained on a small amount of data. I have been on the lookout for a better dataset since then. I wrote a few emails, but without much success till now. I am interested in knowing about more datasets I could train my models on.

Modeling: Our model did a very good job of understanding the structure of TaalMala notations. Although character level recurrent neural networks work well, they are still based on a very shallow understanding of the rhythmic structures. I have not come across any good approaches for generating true rhythms yet.

I think more data samples covering a range of rhythmic structures would only partially address this problem. Simple rule-based approaches seem to outperform machine learned models with very little effort. Vishwamohini.com has some very good rule-based variation generators that you could check out. They sound better than the ones created by our AI. After all, the word for compositions, bandish, is literally derived from ‘rules’ in Hindi. On the other hand, there are only so many handcrafted rules that you can come up with, which may lead to repetitive sounds.

Contact me if you have some ideas and if you’d like to help out! Hope that I am able to post an update on this sooner than three years this time 😀

Talk: Socially Embedded Search

This week I attended a full house talk by Dr. Meredith Ringel Morris on Socially Embedded Search Engines. Dr. Morris put together a lot of material in her presentation and we (the audience) could appreciate how she presented all of it, with great clarity, in just one hour. But I think it would be tricky for me to summarize everything in a short post. Do check out Dr. Morris’ website to find out more information on the subject.

Social Search is the term for when you pose a question to your friends by using one of the social networking tools (like Facebook, Twitter). There is a good chance that you have already been using “Social Search” without knowing the term for it. So, why would you want to do that instead of using the regular search engines that you have access to? It may be simpler to ask your friends at times, and they could also provide direct, reliable and personalized answers. Moreover, this is something that could work along with the traditional search engines. Dr. Morris’ work gives some insight into the areas where search engineers have opportunities to combine traditional algorithmic approaches with social search. She tells us what kinds of questions are asked more often in a social search and which types are more likely to succeed in getting a useful answer. She goes further into how the topics of these questions vary among people from different cultures.

I really liked the part about “Search buddies” during the talk. In their paper, Dr. Morris and her colleagues proposed embedding automated agents that post relevant replies to your social search queries. One type of agent tries to figure out the topic of the question and, by looking at your friends’ profiles, recommends those who seem to be interested in that area. Another uses an algorithmic approach and posts a link to a web page that is likely to contain an answer to the question. It was interesting to learn how people reacted to the involvement of these automated agents. While some of the people in the experiment appreciated being referred to for an answer, many found the agents obnoxious when they failed to identify the context correctly.

In her more recent work, Dr. Morris has tried to solve these problems by recruiting real people from Mechanical Turk to answer questions on Twitter. Such an approach could respond to people’s questions in a smarter way by collecting information from several people. It could then respond in the form of a polling result and quote the number of people recommending a particular answer. It can also take into account any replies that the participant has already received from one of their followers; the automated agent would then put that answer to an opinion poll among the Turkers. Although such a system could provide more intelligent replies than ‘dumb’ algorithms, it may still fall short of responses from your friends, which would certainly be more personalized and better placed contextually.

During the Q&A session, one of the audience members raised a question (with a follow-up question by Prof. Kraut) about comparing these methods with question-and-answer websites such as Quora. While these sites may not provide results that are as personalized, they will certainly do better at drawing the attention of people interested in similar topics. It may not always be possible to find somebody amongst your friends to answer a question on a specialized topic.
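The polling idea mentioned earlier (collect answers from several crowd workers, then reply with the most recommended one and how many people backed it) can be illustrated with a small Python sketch. This is only a toy under my own assumptions: the function names and the sample votes are made up, and it is not a description of how Dr. Morris' system is implemented.

```python
from collections import Counter


def poll_summary(votes):
    """Tally free-text answers, ignoring case and surrounding whitespace."""
    tally = Counter(v.strip().lower() for v in votes if v.strip())
    total = sum(tally.values())
    return [(answer, n, total) for answer, n in tally.most_common()]


def format_reply(votes):
    """Build a short reply quoting how many people recommended the top answer."""
    summary = poll_summary(votes)
    if not summary:
        return "No recommendations yet."
    answer, n, total = summary[0]
    return f"{n} of {total} people recommend: {answer}"


# Hypothetical worker responses to a "where should I eat?" question.
print(format_reply(["Thai Basil", "thai basil ", "Pho House"]))
```

Quoting the count alongside the answer is what separates this from a single automated suggestion: the reader can judge how much agreement stands behind the reply.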

Dr. Morris’ talk provided some really good backing for some of the recent steps taken by search engines like Bing (having ties with both Twitter and Facebook), Google (and the Google plus shebang) and also Facebook (with Graph Search) in this direction. It would be interesting to see how social computing research shapes the future of internet search.

Further Reading

You can find Dr. Morris’ publications on this topic here: http://research.microsoft.com/en-us/um/people/merrie/publications.html

How about collaboration?

My previous post on Computers and Chess serves as a good prologue to this.

That’s me geeking out at the Jeopardy stage setup.

A little more than two years ago, IBM Watson played against and defeated the previous champions of Jeopardy!, the TV game show in which the contestants are tested on their general knowledge with quiz-style questions.[1] I remember being so excited while watching this episode that I ended up playing it over and over again, only to have the Jeopardy jingle loop in my head for a couple of days! Now, this is a much harder challenge for computer scientists to solve than making a machine play chess.

Computers have accomplished so many things that we thought only humans could do (play chess and Jeopardy!, drive a car all by themselves …). While these are by no means small problems, we still have a long way to go. While computers can solve problems that we as humans often find difficult (such as playing chess, calculating 1234567890 raised to the power 42 etc.), they cannot* do a lot of things that you and I take for granted. For example, a computer can’t comprehend this post as well as you do (Watson may not be able to answer everything), read it out naturally and fluently (Siri still sounds robotic) or make sense of the visuals on this page (and so on). *At least not yet.
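As a small illustration of how trivial that second example is for a machine, here is the calculation in Python, whose built-in integers have arbitrary precision. The variable names are mine; nothing else is assumed.

```python
# Python integers grow as needed, so the exact value is computed directly.
value = 1234567890 ** 42
digits = len(str(value))

print(value)
print("number of digits:", digits)
```

The result is exact, with no floating-point rounding, and comes back immediately. Doing the same by hand would take a person a very long time.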

Computers were designed as tools to help us with calculations or computations. By this very definition, are computers inherently better at handling certain types of problems while failing at others? Well, we have no answer [2] to this question now, and I at least hope that it isn’t in the affirmative so that someday we can replicate human intelligence. As we have seen in the past, we certainly cannot say that “X” is something that computers will never be able to do. But we can surely point out the areas in which researchers are working hard and hoping to improve.

Here’s a video that talks about the topic that I am hinting at. While I promise not to post many TED talks in future, you can be sure of finding this central idea (the first half of the talk) as a common theme on this blog. Also, I prefer the word “Collaboration” over “Cooperation” [3] :

TL;DR: Let’s not try to solve big problems solely with computers. Make computers do the boring repetitive work and involve humans to provide creative inputs or heuristics for the machines. Try to improve the interfaces that make this possible.

Although this was an idea envisioned in "Man-Computer Symbiosis" (Licklider J. C. R., 1960) more than half a century ago, researchers seem not to have given it due importance when [4] the computers failed to perform as well as expected. Of course, the more “X”s that computers are able to do by themselves, the more it frees us to do whatever we do best. When we look around and observe the devices that we use and how we interact with machines every day, we seem to have knowingly or unknowingly progressed in the direction shown by Licklider. With the furthering of research in areas such as Human Computing, Social Computing, and (the new buzzword) Crowd-sourcing, the interest shown in such ideas has never been greater.

References

  1. Licklider J. C. R. (1960), Man-Computer Symbiosis. IEEE. Available: http://groups.csail.mit.edu/medg/people/psz/Licklider.html.

Footnotes

  1. More about Watson from IBM here. See also, Jeopardy vs. Chess.
  2. Amazon’s Mechanical Turk does talk about “HITs” or Human Intelligence Tasks.
  3. In AI terms, it would indeed be multi-agent co-operation but then again we are not treating humans just as agents in this case.
  4. AI Winter: http://en.wikipedia.org/wiki/AI_winter

Computers and Chess

Deep Blue vs. Kasparov: 1996 Game 1. Deep Blue won this game but Kasparov went on to win the match by 4-2. In the 1997 re-match, however, Deep Blue won 3½–2½.

Designing an algorithm to play chess is a challenge that has attracted the attention of many mathematicians and computer scientists. The sheer number of combinatorial possibilities makes it hard for humans and computers alike to predict the result. There have been many highly publicized games pitting humans against (super)computers in the ’90s and ’00s, such as the Deep Blue vs. Kasparov one.

It was around the same time that I was starting out with chess and was interested in learning how to play better. My father had gifted me a copy of a computer game called Maurice Ashley Teaches Chess. It included playing strategies, past-game analysis and video coaching by the chess grandmaster Maurice Ashley. It also had a practice mode where you could play against the computer. I didn’t end up being a good chess player, but if my memory serves me right, it did not take me long to start beating the in-game AI. But things have changed a lot since then. Computers are not only faster and more powerful now (so they can explore a larger number of moves) but are also equipped with better algorithms to evaluate a decision. Let’s compare excerpts from the introductory chapters of two of my textbooks:

From "Cognitive Psychology" (Medin et al., 2004):

The number of ways in which the first 10 moves can be played is on the order of billions and there are more possible sequences for the game than there are atoms in the universe! Obviously neither humans nor machines can determine the best moves by considering all the possibilities. In fact, grandmaster chess players typically report that they consider only a handful of the possible moves and “look ahead” for only a few moves. In contrast, chess computers are capable of examining more than 2,000,000 potential moves per second and can search quite a few moves ahead. The amazing thing is that the best grandmasters (as of this writing) are still competitive with the best computers.

Now consider "Artificial Intelligence: A Modern Approach (3rd Edition)" (Russell et al., 2010):

IBM’s DEEP BLUE became the first computer program to defeat the world champion in a chess match when it bested Garry Kasparov by a score of 3.5 to 2.5 in an exhibition match (Goodman and Keene, 1997). Kasparov said that he felt a “new kind of intelligence” across the board from him. Newsweek magazine described the match as “The brain’s last stand.” The value of IBM’s stock increased by $18 billion. Human champions studied Kasparov’s loss and were able to draw a few matches in subsequent years, but the most recent human-computer matches have been won convincingly by the computer.

So, what happened in the six-year gap between the publication of these books? It turns out that there has indeed been such a shift in recent years. The computers’ superior performance stats can be seen on this Wikipedia entry. We have come a long way since the Kasparov vs. Deep Blue matches due to the advancements in both hardware and AI algorithms. Computers have now started not only winning but dominating human-computer chess matches, so much so that even mobile phones running slower hardware are reaching Grandmaster levels. I guess the time is right for switching to new board games! Btw, Checkers has been a solved problem since 2007: http://www.sciencemag.org/content/317/5844/1518.full. It will end in a draw (they have a computational proof of that) if both players use perfect strategies, i.e. ones that never lose.
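To make "solved" a little more concrete, here is a toy sketch in Python that exhaustively searches tic-tac-toe with minimax and confirms that perfect play from both sides ends in a draw. It is only an analogy: the checkers result required a vastly larger computation, and chess engines cannot search the whole game at all, so they rely on evaluation functions and pruning. The function names and the board encoding (a 9-character string with `.` for empty squares) are my own choices.

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, player):
    """Game value with perfect play, from X's view: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w == "X":
        return 1
    if w == "O":
        return -1
    if "." not in board:
        return 0  # board full, nobody won
    nxt = "O" if player == "X" else "X"
    results = [value(board[:i] + player + board[i + 1:], nxt)
               for i, cell in enumerate(board) if cell == "."]
    # X picks the best outcome for X, O picks the worst outcome for X.
    return max(results) if player == "X" else min(results)


print(value("." * 9, "X"))  # 0 means a draw under perfect play
```

The memoization via `lru_cache` matters even for a game this small, since many move orders lead to the same position. The same brute-force idea stops being feasible very quickly as the board and the number of pieces grow.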

Image Credits: en:User:Cburnett / Wikimedia Commons / CC-BY-SA-3.0 / GFDL

References

  1. Russell et al. (2010), Artificial Intelligence: A Modern Approach (3rd Edition), 49. Prentice Hall. Available: http://www.amazon.com/Artificial-Intelligence-Modern-Approach-Edition/dp/0136042597.
  2. Medin et al. (2004), Cognitive Psychology. Wiley. Available: http://www.amazon.com/exec/obidos/redirect?tag=citeulike07-20&path=ASIN/0471458201.