Machines learn to play Tabla, Part – 2

This is a followup on my earlier post on Machines Learn to play Tabla. You may wish it check it out first reading this one…

Three years ago, I published a post on using recurrent neural networks to generate tabla rhythms. Sampling music from machine learned models was not in vogue then. My post received a lot of attention on the web and became very popular. The project had been a proof-of-concept and I have wanted build on it for a long time now.

This weekend, I worked on making it more interactive and I am excited to share these updates with you. Previously, I was using a proprietary software to convert tabla notation to sound. That made it hard to experiment with sampled rhythms and I could share only a handful sounds. Taking inspiration from our friends at Vishwamohini, I am now able to convert bols into rhythm on the fly using MIDI.js.

Let me show off the new javascript synthesizer using a popular Delhi kaida. Hit the ‘play’ button to listen:

Now that you’ve heard the computer play, here’s an example of it being played by a tabla maestro:

Of course, the synthesized outcome is not much of a comparison to the performance by the maestro, but it is not too bad either…

Now to the more exciting part- Since our browsers have learned to play the tabla, we can throw in the char-rnn model that I built in the earlier post.  To do this, I used the RecurrentJS library and combined it with my javascript tabla player:

Feel free to play around with tempo and maximum character-limit for sampling. When you click on ‘generate’,  it will play a new rhythm every time. Hope you’ll enjoy playing with it as much as I did!

The player has a few kinks at this point I am working towards fixing them. You too can contribute to my repository on GitHub.

There are two areas that need major work:

Data: The models that I trained for my earlier post was done using a small amount of training data. I have been on a lookout for better dataset since then. I wrote a few emails, but without much success till now. I am interested in knowing about more datasets I could train my models on.

Modeling: Our model did a very good job of understanding the structure of TaalMala notations. Although character level recurrent neural networks work well, it is still based on very shallow understanding of the rhythmic structures. I have not come across any good approaches for generating true rhythms yet:

I think more data samples covering a range of rhythmic structures would only partially address this problem. Simple rule based approaches seem to outperform machine learned models with very little effort. Vishwamohini.com has some very good rule-based variation generators that you could check out.  They sound better than the ones created by our AI. After all the word for compositions- bandish, literally derived from ‘rules’ in Hindi. But on the other hand, there are only so many handcrafted rules that you can come up with which may lead to generating repetitive sounds.

Contact me if you have some ideas and if you’d like to help out! Hope that I am able to post an update on this sooner than three years this time 😀

AMIA 2017

I attended my first AMIA meeting last week. It was an exciting experience to meet with close to 2,500 informaticians at once. It was also a bit overwhelming due to the scale of the event as well as being in company of famous researchers whose papers you have read:

Twitter log from the 2017 AMIA Annual Symposium held from Nov 4 – 8 in Washington DC. AMIA brings together informatics researchers, professionals, students, and everyone using informatics in health care… [Click to view]

If you weren’t able to attend the event in person, the good news is that the a lot of informaticians are big into documenting stuff on twitter. Check out my twitter moment here and the hashtag #AMIA2017 for more…

Kivy wrap for the summer

As I conclude my summer work on Kivy and Plyer, here’s a post to summarize all the contributions I have made. It would also be useful to start from here when I wish to revisit any of this in future.

To draw a comparison to the current state of Plyer development, this table shows a list of supported facades before the summer started:

Platform Android < 4.0 Android > 4.0 iOS Windows OSX Linux
Accelerometer X X X
Camera (taking picture) X X
GPS X X
Notifications X X X X X
Text to speech X X X X X
Email (open mail client) X

If you have been following the updates, you would have come across my weekly progress posts over the last couple of months. Here’s a list of all such posts since mid-summer for easy access (also check out my mid-summer summary post):

  1. I can haz commit access and other updates
  2. Maintenance work in progress
  3. Plyer on iOS
  4. More, more facades

And in comparison to the table above, this is how the Plyer support looks like as of today after all these changes:

Platform Android < 4.0 Android > 4.0 iOS Windows OSX Linux
Accelerometer X X X X X
Camera (taking picture) X X
GPS X X
Notifications X X X X X
Text to speech X X X X X X
Email (open mail client) X X X X X
Vibrator X
Sms (send messages) X X
Compass X X X
Unique ID (IMEI or SN) X X X X X X
Gyroscope X X X
Battery X X X X X X

Of course there’s more than what meets the eye. There has been a lot of background work that went into writing them. This included understanding the individual platforms APIs and working with other Kivy projects — Pyjnius and Pyobjus that support this work. Some of these changes called for a re-write of old facades in order to follow a consistent approach. Since Plyer is at an early stage of development, I also contributed some maintenance code and writing build scripts.

In the beginning of August, I took a break from facade development for two weeks and made recommendations on making Kivy apps more accessible. I looked into existing projects that could be useful for us and pointed at a possible candidate that we could adapt for our purposes. Here are the two posts summarizing my investigations:

  1. Towards Making Kivy Apps Accessible
  2. Towards Making Kivy Apps Accessible – 2

At this point, I would also include a thank you note to everyone on #kivy and #plyer on freenode for helping me out whenever I got stuck. This was the first time I actively participated in IRC discussions over an extended period. I also tried to return the favor by offering help, when I could, to other new users. Apart from getting a chance to work with the Kivy community from all around the world (with so many timezones!), there were couple of other firsts as well that I experienced while working on the project. Those served as good learning experiences and a motivation for making contributions to open source.

Overall, it was a quite a fun experience contributing to kivy over the summer and I hope to continue doing so every now and then. Now as Kivy is gaining more popularity everyday, I hope to see many more users diving into writing code for it and be a part of this community. Hope these posts could also serve to point them to relevant development opportunities.

More, more facades

This week I added many more facade implementations in Plyer. It was only a few days ago that I had started working on iOS and I am happy that the list has grown quite a bit this week.

I also added Plyer in the kivy-ios tool-chain, i.e. it is now a part of the build-all script and would be available for use in apps packaged with Kivy for iOS.

Apart from that I also did a couple of maintenance fixes to close the holes that I noticed with the checked in code and fix style problems with other contributions.

Although this update was a short one, it did involve a considerable amount of coding effort.

As the summer is coming to a close, I will be spending the next week wrapping up my work, polishing the rough edges in the contributions till now, and of course write the “obvious” bits and pieces that I may have ignored from the documentation till now.

Plyer on iOS

I finally had access to an iOS development device this week. Unlike the other platforms, I didn’t have any prior experience developing for it. So I spent some time familiarizing myself with the tools and stuff. This process was mostly painless but did end up consuming some time.

I also setup my kivy-ios tool chain for the first time. After a couple hello-world programs and fixing minor typos in the examples code, I then moved on to further explore pyobjus for writing Plyer facades. I worked on a new version of the accelerometer example that was not dependent on bridge.m supplied with all kivy-ios packages by default. When I am done moving all the sensors we could do away with the classes contained in the bridge (note to self!).

While playing with this code, I also noticed something interesting on the Xcode dashboard:

Turns out that we had a hit a major bug in Pyobjus that was causing the memory allocated for accelerometerData to leak. In fact this would happen everywhere you’d otherwise need to use something like @autoreleasepool in your Objective C code. Pyobjus objects didn’t account for cases like these. Tito suggested a fix for the issue but is still working on finalizing it.

Meanwhile, I also created another iOS facade for retrieving battery status. Although it was a fairly short one to code but I did have to learn many background things before I could finish that. I hope that this will make it easy for me to finish the other facades in the coming weeks.

Towards Making Kivy Apps Accessible, Part – 2

In this second part of my post of on making kivy apps accessible I would like to describe some of the existing libraries and APIs we can model our accessibility features upon. Last week we identified that the main obstacle for creating accessible kivy apps is a missing module that could communicate the widget states to screen-readers. I explored other frameworks that may have tried to solve these problems before us and will discus one such project in particular.

Kivy includes pygame as one of the supported window providers. While looking for accessible apps in Python I found a GUI engine for pygame called OcempGUI. As a part of this project they also worked upon an accessibility module named Papi:

Papi, the Python Accessibility Programming Interface, is a Python wrapper around the GNOME ATK toolkit. It allows a developer to make python objects and applications easily accessibility aware without the need to install PyGTK and the GNOME accessibility components. Instead it only depends on ATK and – on the developers behalf – the ATK/AT-SPI bridge shipped with AT-SPI.

There is also some support for Microsoft Active Accessibility (MSAA) on Microsoft Windows.

Papi is not limited to the apps you can build with OceampGUI graphical user interfaces but can help support accessibility for any python object. Here’s an example accessible app using Papi:


I created a gist as I couldn’t find a way to embed a Sourceforge file but you can find the complete project there. You can also read more about Papi and OcempGUI on their website. Unfortunately, the project looks like that it is no longer in active development. It’s last release was in 2008.

Assuming that we can build upon this module we’d be able to support accessible Kivy apps on Windows and Linux. We are still left with taking care of OS X Accessibility and also on the mobile devices.

MacOS has a very good accessibility support but only when you are writing Cocoa or Carbon apps. In order to provide a similar level of accessibility support you would likely need to hack around a bit. With inputs from our folks at #macdev on freenode, I set out to do just that. They suggested that I could subclass NSApplication and implement the NSAccessibility protocol within the Kivy app. This would involve creating an hierarchy of fake UI objects that provide accessibility implementations as on a Cocoa app. I did make some progress with by using our in-house project — PyObjus to access AppKit frameworks and subclass objective-c classes. But the situation became a little to overwhelming for me to handle within this one week and I haven’t succeeded in creating a working proof of concept as yet.

Fortunately, Apple folks have recently launched a new API starting OS X 10.10 that includes a NSAccesibilityElement class. I am hoping that this would help avoid creating fake UI objects to implement their accessibility protocols. Here are a couple of examples demonstrating that but I haven’t tried them out yet. Need to get access to their beta versions first. You can also watch their WWDC 2014 session videos on Accessibility on OS X describing it.

We are still left with discussing about the mobile platforms. Here are links to the relevant docs pages for Android (http://developer.android.com/guide/topics/ui/accessibility/apps.html#custom-views) and iOS (https://developer.apple.com/library/ios/documentation/UserExperience/Conceptual/iPhoneAccessibility/Accessibility_on_iPhone/Accessibility_on_iPhone.html). I would like use these bookmarks when I dive deeper into exploring mobile accessibility APIs. However this week, I plan to go back to my original plan and continue the Plyer work from where I left off.

 

Towards Making Kivy Apps Accessible

After last week‘s discussion with tshirtman about making making Kivy apps more accessible, I spent this week exploring the accessibility options on the devices I own. The idea was to investigate and plan a set of common features that we could implement for Kivy.

I ended up playing with all the settings I that could find under the accessibility menus on an Android phone and a Macbook. I will soon repeat this exercise with an iPhone as well.

In comparison to my Galaxy running Android 4.4.2, these features were easier to use and more responsive to interact with on my Mac 10.9. Even the tutorials were simpler to follow and avoided statements like:

“When this feature is on, the response time could be slowed down in Phone, Calculator and other apps.”

I will keep my rantings aside for now and would instead like to focus on how we could support these features within Kivy apps. Following Apple’s way of organizing these features, you can group accessibility features into three broad categories:

Here's TalkBack on Android reading out my PIN to me!
TalkBack on Android reading out my PIN to me!
  • Seeing: This includes display settings such as colors and contrast, making text larger and zooming the screen. A lot of these settings come free of cost to the app developers. However, an important feature in this category includes the screen reader app (Example: VoiceOver and TalkBack). This is the main area of interest to us as the onus of making this feature work correctly rests with us and the app developers. I will get back to this after we finish classifying the features.
  • Hearing: We must make sure all our alert widgets in Kivy are accompanied by visual cues (such as a screen flash) along with audio indicators. Another feature that might be of interest to us would be to ensure that we provide a subtitles option along in our video player widget.
  • Interacting: It includes features like the assistant menu on Android to provide easy access to a limited number of commonly accessed functions. Most of these features are again managed and controlled by the operating systems themselves. Apart from a few design choices to make sure that app doesn’t interfere with their functions, there’s nothing that the app developers have to do to support them.

So the most important missing piece in the Kivy framework is to provide support to the screen-reading apps. These apps are designed to provide spoken feedback to what the users select (touch or point) and activate. For example if I hover over a menu-item or select a button on the screen, the app must describe what it is and also say what is does, if possible.

While settings such as zoom, larger font-sizes etc. are already taken care of by the frameworks that Kivy builds upon, we must provide explicit support for the screen readers. Here’s an example of an hello world kivy app and how it is seen by the screen-readers. There is nothing apart from the main windows buttons, i.e. “Close”, “Minimize” and “Zoom” that is available to the VoiceOver program to describe:

There is no meta information available to the screen-readers to describe the interface to the user.

The main challenge here is that none of the Kivy widgets are selectable at this point. Not only that makes it difficult to use them with screen-readers but it is also not possible to navigate them using a keyboard. We must provide a screen-reader support along with our feature on providing focus support for Kivy widgets.

Along with on_focus we must send a descriptive feedback that the screen-readers as a description about the widget. While most of the native OS UI elements have such content descriptors already defined, our kivy widgets lack such information. We must make an effort to make sure that common widgets such as text-inputs, select options etc. have a description consistent with the one natively provided by the platform.

So a major portion of this implementation must be dealt within Kivy itself. While we have previously adopted an approach for implementing the lowest common set of features in other Plyer facades, it would be a good idea to implement the content descriptors (or accessibility attributes) in a way that it covers all our target platforms. Plyer could then take on from there to make the the platform specific calls to support the native screen reading apps. We would need to provide a mechanism to transform these attributes into a form relevant on these platforms separately. We could safely ignore the extra set of attributes at this point.

In this post I have only outlined the first steps that we need to take for making Kivy apps more accessible. I will work on figuring out the finer details about implementing them in the coming weeks. Hopefully this would help in triggering off a meaningful discussion within the community and we can have a chance to listen to other opinions on this as well.

 

This is a two part post. You can continue reading on the topic here — Towards Making Kivy Apps Accessible – 2.