Talk: Look! New stuff in my room!

Yesterday, I attended a talk by Prof. David Forsyth. One of the perks of being a student is to be able to attend seminars like these. The talk was mostly about his work on understanding pictures of rooms and inserting objects into them. It was a light talk and he did not go too much into the details of his paper. Apart from that he gave an overview of the current work done by the computer vision researchers and his vision (pun intended) for the future. Overall, it was a fun talk to attend and a Friday evening well spent :D.

Here is a video showing a demo of the method, in case you are curious:

Talk: Understanding Storytelling

This week I attended a very interesting talk by Dr. Micha Elsner. Yes, this was one of those full-house ISP seminars. I was glad that I reached the venue a bit earlier than usual. Dr. Elsner started his talk by giving us an overview of the bigger goals he is looking at. His work is helping us formally understand storytelling and develop computational methods for it. If you have ever used Auto Summarize in Word, you’ll have an intuitive idea about how it works: it finds sentences with frequently used words to make a summary of the document. It can generate satisfactory summaries for articles that merely state some facts, but would fail miserably in trying to understand and summarize a story.
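To make the frequency idea concrete, here is a minimal sketch of a summarizer that scores sentences by how common their words are in the whole text. This is only my guess at the general approach, not how Word actually implements Auto Summarize; the function name `summarize` and the naive sentence splitting are assumptions for illustration.

```python
import re
from collections import Counter


def summarize(text, n_sentences=1):
    """Pick the sentences whose words are most frequent in the whole text."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Word frequencies over the entire document (lower-cased).
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank by score, then report the chosen sentences in their original order.
    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return [s for s in sentences if s in top]


print(summarize("Cats sleep. Cats eat fish. Dogs bark."))
```

A scorer like this has no notion of characters or events, which is exactly why it works for fact-stating articles and struggles with stories.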

Dr. Elsner’s approach focuses on observing social relationships between characters as the story unfolds, to understand the high level plot. He uses two basic insights about common plots in a story: a) it has an emotional trajectory, i.e. over time, we see a variation in negative and positive emotions, and b) characters interact with each other and have a social network just like in real life.

To begin his analysis, Dr. Elsner would first parse the text to identify characters from the noun phrases in the sentences. This step itself is not an easy one. For example, one character may be referred to by several different names through the chapters: Miss Elizabeth Bennet, Miss Bennet, Miss Eliza, Lizzy and so on. Once we have that, we could try understanding the relationships between different characters over the course of time. Simple functions measuring nearby mentions (co-occurrence) of the characters and their emotional trajectory curves are used to build a complex similarity measure. The emotional trajectory is plotted by finding words with “strong sentiment” cues. This makes up the first-order character kernel for measuring similarity. Next, he adds social network features to build the second-order kernel: characters are more similar if they each have close friends who are also similar.
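Here is a toy sketch of the two raw signals mentioned above: co-occurrence counts and a per-character emotional trajectory. It is not the kernel from the paper. The alias table, the cue-word lists and the function names are all made up for illustration, and real character identification is much harder than a dictionary lookup.

```python
from collections import Counter
from itertools import combinations

# Hypothetical alias table and sentiment cue words, for illustration only.
ALIASES = {"lizzy": "Elizabeth", "eliza": "Elizabeth",
           "elizabeth": "Elizabeth", "darcy": "Darcy"}
POSITIVE = {"happy", "love", "delighted"}
NEGATIVE = {"angry", "hate", "wretched"}


def tokens(paragraph):
    """Lower-cased words with surrounding punctuation stripped."""
    return [w.strip(".,;:!?\"'").lower() for w in paragraph.split()]


def cooccurrence(paragraphs):
    """Count how often two characters are mentioned in the same paragraph."""
    counts = Counter()
    for p in paragraphs:
        present = sorted({ALIASES[t] for t in tokens(p) if t in ALIASES})
        counts.update(combinations(present, 2))
    return counts


def emotion_trajectory(paragraphs, character):
    """Per paragraph: positive minus negative cue words if the character appears, else 0."""
    trajectory = []
    for p in paragraphs:
        toks = tokens(p)
        mentioned = any(ALIASES.get(t) == character for t in toks)
        score = sum(t in POSITIVE for t in toks) - sum(t in NEGATIVE for t in toks)
        trajectory.append(score if mentioned else 0)
    return trajectory


story = ["Lizzy was angry at Darcy.", "Darcy was happy.",
         "Eliza and Darcy were delighted."]
print(cooccurrence(story))
print(emotion_trajectory(story, "Elizabeth"))
```

Comparing curves and counts like these across two novels is the kind of input a similarity measure could be built on.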

I think that the method for testing the proof of concept was also an ingenious one. Dr. Elsner artificially re-orders the chapters of a book and attempts to distinguish the shuffled version from the original. Success here would imply that we have indeed been able to gather some understanding of the plot by using this method. A corpus of novels from Project Gutenberg is used as training data for this purpose. Do go through the links in the section below to find out more!

Further Reading

  1. Micha Elsner. Character-based Kernels for Novelistic Plot Structure. Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012), Avignon, France. Available: http://aclweb.org/anthology-new/E/E12/E12-1065.pdf
  2. Presentation slides are also available on Dr. Elsner’s page: http://www.ling.ohio-state.edu/~melsner/slides/novelpres.pdf

Talk: ISP Seminar

Turn-Taking Behavior in a Human Tutoring Corpus by Zahra Rahimi

In their research, Zahra and Homa analyze turn-taking behavior between the tutor and the student in a human-human spoken tutoring system. This analysis could be helpful in understanding how users from different demographics interact with a tutor. In this study, they use sequences of speech and silence over time to mark ‘Active’ and ‘In-active’ states in the dialogues between the tutor and the student. Considering the tutor and the student together, we have four different combinations of these states, with each of them being either active or inactive. The next step is to learn a model from the dialogues (using a semi-Markov process). They then measure how the learned models are associated with features such as gender and pre-test scores. The experiments provide some interesting results: female students speak simultaneously with the tutor for longer than male students do, while their overall activity is lower than that of their male counterparts. Also, for students with lower pre-test scores, the tutor tended to speak for a longer time.
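As a rough sketch of the state encoding described above (not the authors' code), two per-frame speech/silence sequences can be combined into the four joint states and then collapsed into runs with durations. State-plus-duration runs are what a semi-Markov model is fit over. The state names and function names are my own.

```python
from itertools import groupby

# Labels for the four joint states (names are my own, not from the paper).
STATE_NAMES = {
    (False, False): "both_silent",
    (True, False): "tutor_only",
    (False, True): "student_only",
    (True, True): "both_speaking",
}


def joint_states(tutor_active, student_active):
    """Combine two per-frame activity sequences into one joint-state sequence."""
    return [STATE_NAMES[(bool(t), bool(s))]
            for t, s in zip(tutor_active, student_active)]


def segments(states):
    """Collapse a frame sequence into (state, duration-in-frames) runs."""
    return [(state, len(list(run))) for state, run in groupby(states)]


tutor = [1, 1, 1, 0, 0, 0]    # 1 = speaking in that frame, 0 = silent
student = [0, 0, 1, 1, 0, 0]
print(segments(joint_states(tutor, student)))
```

Statistics such as the total time spent in `both_speaking` could then be compared across groups of students.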

Content-Based Cross-Domain Recommendations Using Segmented Models by Shaghayegh Sahebi

Sherry presented her work on the job recommendation systems in her talk. This was done as part of her internship at LinkedIn last summer. The site originally used a single model to make job recommendations to the users by selecting features from their profiles. But, these profiles tend to vary a lot according to the job function the users play and the industry they are in. Professionals in academia, for example, may put a very different set of information on their resume as opposed to a banking executive. With this new study, they wish to segment users using these very features (current job function and industry etc.) before sending them to the recommender systems. This allows them to develop an efficient method of feature augmentation and adapt their algorithms.
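The segmentation idea can be sketched as a simple dispatch: pick a model by the user's segment and fall back to the single global model otherwise. This is only an illustration of the concept, not LinkedIn's system. The class, the toy scoring functions and the profile fields are all hypothetical.

```python
def segment_key(profile):
    """Segment users by the features named in the talk: job function and industry."""
    return (profile.get("job_function"), profile.get("industry"))


def skill_overlap(profile, job):
    """Toy global scorer: number of skills shared by the profile and the job."""
    return len(set(profile.get("skills", [])) & set(job.get("skills", [])))


def academic_score(profile, job):
    """Hypothetical segment-specific scorer that also rewards research jobs."""
    return skill_overlap(profile, job) + (2 if job.get("research") else 0)


class SegmentedRecommender:
    def __init__(self, global_model, segment_models):
        self.global_model = global_model      # used when no segment matches
        self.segment_models = segment_models  # maps segment key -> scorer

    def recommend(self, profile, jobs, k=3):
        model = self.segment_models.get(segment_key(profile), self.global_model)
        return sorted(jobs, key=lambda job: model(profile, job), reverse=True)[:k]


jobs = [{"title": "Analyst", "skills": ["python", "sql"]},
        {"title": "Postdoc", "skills": ["python"], "research": True}]
recommender = SegmentedRecommender(
    skill_overlap, {("research", "academia"): academic_score})
academic = {"job_function": "research", "industry": "academia",
            "skills": ["python", "sql"]}
print(recommender.recommend(academic, jobs, k=1)[0]["title"])
```

The same profile skills lead to different top recommendations depending on which segment the user falls into.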

The model was built and evaluated based on some pre-collected data. They evaluated the accuracy of the system in recommending the jobs that the users applied to. This, however, restricted them to a certain extent, and online A/B testing is still in progress. We’ll have to wait and watch for the results to find out if they do better than the one-size-fits-all model that is currently in place.

Further Reading

  1. Z. Rahimi, Homa B. Hashemi “Turn-Taking Behavior in Human Tutoring Corpus.” AIED 2013. Available: http://link.springer.com/chapter/10.1007%2F978-3-642-39112-5_111

Talk: Socially Embedded Search

This week I attended a full-house talk by Dr. Meredith Ringel Morris on Socially Embedded Search Engines. Dr. Morris put together a lot of material in her presentation and we (the audience) could appreciate how she presented all of it, with great clarity, in just one hour. But I think it would be tricky for me to summarize everything in a short post. Do check out Dr. Morris’ website to find out more information on the subject.

Social Search is the term for when you pose a question to your friends by using one of the social networking tools (like Facebook or Twitter). There is a good chance that you have already been using “Social Search” without knowing the term for it. So, why would you want to do that instead of using the regular search engines that you have access to? It may be simpler to ask your friends at times, and they could also provide direct, reliable and personalized answers. Moreover, this is something that could work along with the traditional search engines. Dr. Morris’ work gives some insight into the areas where search engineers have opportunities to combine traditional algorithmic approaches with social search. She tells us what kinds of questions are asked more often in a social search and which types are more likely to succeed in getting a useful answer. She goes on further into how the topics of these questions vary with people from different cultures.

I really liked the part about “Search buddies” during the talk. In their paper, Dr. Morris and her colleagues have proposed implanting automated agents that post relevant replies to your social search queries. One type of agent tries to figure out the topic of the question and recommends friends who seem to be interested in that area by looking at their profiles. Another one uses an algorithmic approach and posts a link to a web page that is likely to contain an answer to the question. It was interesting to know more about how other people reacted to the involvement of these automated agents. While some of the people in the experiment appreciated being referred to for an answer, a lot of them found the agents obnoxious when they didn’t perform well in identifying the context.

In her more recent work, Dr. Morris has tried to solve these problems by recruiting real people from Mechanical Turk to answer questions on Twitter. Such an approach could respond to people’s questions in a smarter way by collecting information from several people. It could then respond to these questions in the form of a polling result and quote the number of people recommending a particular answer. It can also take into account any other replies that the participant may have already received from one of their followers. The automated agent would then present that answer for an opinion poll from the Turkers. Although such a system could provide more intelligent replies than ‘dumb’ algorithms, it may still fall short of responses from your friends, which would certainly be more personalized and better placed contextually.

During the QnA session, one of the audience members raised a question (with a follow-up question by Prof. Kraut) about comparing these methods with question-and-answer websites such as Quora. While these sites may not provide results that are as personalized, they will certainly do better at drawing the attention of people interested in similar topics. It may not always be possible to find somebody amongst your friends to answer a question on a specialized topic.
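The polling-style reply from the Mechanical Turk work can be sketched in a few lines: tally the workers' answers and quote how many of them recommend the most common one. This is just my illustration of the idea, not the system from the talk. The function name and the reply wording are made up.

```python
from collections import Counter


def poll_reply(worker_answers):
    """Summarize crowd answers as 'N of M people recommend: answer'."""
    # Normalize lightly so that 'Thai Place' and 'thai place ' count together.
    counts = Counter(a.strip().lower() for a in worker_answers if a.strip())
    if not counts:
        return None  # nobody answered
    answer, votes = counts.most_common(1)[0]
    return f"{votes} of {sum(counts.values())} people recommend: {answer}"


print(poll_reply(["Thai Place", "thai place ", "Noodle Bar", ""]))
```

A reply that a follower has already posted could be added to the list of candidates in the same way before the workers vote on it.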

Dr. Morris’ talk provided some really good backing for some of the recent steps taken by search engines like Bing (having ties with both Twitter and Facebook), Google (and the Google plus shebang) and also Facebook (with Graph Search) in this direction. It would be interesting to see how social computing research shapes the future of internet search.

Further Reading

You can find Dr. Morris’ publications on this topic here: http://research.microsoft.com/en-us/um/people/merrie/publications.html

Talk: Understanding Social Dynamics of Emergent Hashtag

This post is about a talk titled, “#Bigbirds Never Die: Understanding Social Dynamics of Emergent Hashtag” by Dr. Yu-Ru Lin in the ISP Colloquium Series. You may browse all such posts under the Talks category in the archives.

Hashtags can be simply defined as words that are prefixed by a “#” sign. They serve as a means to group meaningful messages together on social media. Twitter (and recently Facebook) makes it possible for users to search for specific hashtags to look at all the relevant posts on a topic. While Twitter wasn’t the first to use this concept, it has undeniably gained more popularity since its use on the micro-blogging site.
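Going by the simple definition above, pulling hashtags out of posts and grouping the posts by tag takes only a regular expression. This is a minimal sketch; Twitter's real tokenization rules are more involved, and the function names here are my own.

```python
import re

# A '#' followed by one or more word characters (letters, digits, underscore).
HASHTAG = re.compile(r"#(\w+)")


def hashtags(post):
    """Return the hashtags in a post, lower-cased and without the '#'."""
    return [tag.lower() for tag in HASHTAG.findall(post)]


def group_by_hashtag(posts):
    """Map each hashtag to the list of posts that use it."""
    groups = {}
    for post in posts:
        for tag in set(hashtags(post)):
            groups.setdefault(tag, []).append(post)
    return groups


posts = ["#bigbird forever", "watching the #debate", "#BigBird and #debate"]
print(hashtags("Save #BigBird! #debate2012"))
print(group_by_hashtag(posts)["bigbird"])
```

Lower-casing treats #BigBird and #bigbird as the same tag, which is how hashtag search usually behaves.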

Dr. Lin’s research is concerned with studying the rise of new hashtags (such as #bigbird) during the 2012 US Presidential Election debates. She presents an analysis of the emergence and evolution of such hashtags and, in turn, the topics that they represent. Posts were analyzed during the periods when new, never-before-used hashtags were created, used and shared by other people.

Since different people may be tweeting on the same topic around the same time, we can have several different candidates (e.g. #bigbird, #supportbird, #savebigbird etc.), but only a few gain more popularity amongst the fellow tweeters (or twitterers, take your pick!). Dr. Lin and her colleagues put them into two classes: ‘winners’ and ‘also-rans’. A ‘winner’ hashtag is considered to be one that emerges more quickly and is sustained for longer periods of time.

Now the question to be asked is: what factors are influential in making a hashtag a ‘winner’? Here are two of the important results from the study:

  • A hashtag is adopted faster when re-tweeted more. It also depends on the size of the audience that gets to read them.
  • More replies and diversity amongst the tweeters using them imply longer persistence.

I think that apart from the results above (which should be studied carefully by people involved in making promotional campaigns etc.), there is a lot more to take away from research like this. It not only gives us insights into the dynamics that come into play on social networks (which may be interesting to social science researchers) but also gives us tools and methods to analyze big data. It serves as an example of data-driven computational and statistical approaches to making sense of the conversations on social networking sites like Twitter.

Further Reading

  1. Y.-R. Lin, D. Margolin, B. Keegan, A. Baronchelli and D. Lazer, #Bigbirds Never Die: Understanding Social Dynamics of Emergent Hashtag, In Proceedings of the 7th International AAAI Conference on Weblogs and Social Media (ICWSM 2013), 2013. Available: http://arxiv.org/pdf/1303.7144v1.pdf

Presentations on the Cloud

A presentation on Google Drive.
With an old Microsoft Office-y feel.

Like many of you, I have been using Google Docs (or Google Drive) for a long time. It works just fine when you need to work with a group and have several members contributing to a project. In fact, that is the only application that I use for collaborating on documents and spreadsheets. You sometimes wonder how we even managed back when it wasn’t possible to edit documents online.

But when it comes to making presentations online, I haven’t been able to find a very usable solution. I have never found Google’s interface good enough. It takes some effort and time to get used to so many toolbars inside a browser.

Keynote on iCloud with a super easy interface.

I don’t really “create” new presentations on the cloud, but I do tend to edit them quite often and make a lot of changes before presenting. I tend to recall the points that I should (or shouldn’t 🙂 ) have included at times when I don’t have access to my computer. Or I’d be using my office computer, which has a different operating system, or even worse, be on mobile.

Keynote on iCloud offers something that seems just right for my needs. It has a super easy to use interface which looks very familiar across devices and has all the features that I frequently use. It is so much more convenient to revise presentations with it. You can seamlessly convert and download your presentations in the format of your choice when you are done. Or, if you don’t depend on the presenter view a lot, you can also play the presentation right from the browser.

I must admit that I am an Apple fan-boy when talking about user interfaces. iCloud not only offers the same desktop-like interface across all devices but presents all of that with very neat designs. The home page for iCloud is a good example.

iCloud has many more things to offer with just as stunning interfaces. I haven’t explored the other available apps since I haven’t really found many use-cases for them. For mail, calendar and contacts I still prefer to use the good old Google with its familiar power-user functions.

URI for me!

A Google search with my name yields more than 518,000 results. Nah, I am not popular (I wish!) but it turns out that there are a lot of “Gaurav Trivedi”s in this world. Yes, with the same first name and the same last name. A search on Facebook will give you results along with their pictures as well. So I do share my name with loads of “real” people. For the first time I wished that my parents had given me a middle name. It would have been easier to stand out.

The omniscient Google has sympathies for me!

Fortunately, I have been systematically using trivedigaurav as an identifier for myself online (You have been notified now!). For example, on this site (www.trivedigaurav.com) and on Twitter (@trivedigaurav). Now that I know that so many others with the same name exist, I have come to the realisation that I’ve been quite lucky to have that ID available to me, especially on popular sites. I am proud of the 5-year-old me who could think ahead 😉

But as an aspiring researcher, is this something that I should be concerned about? Would it be a good idea to have a pen-name now that I’d be starting to author more relevant academic writings? Here’s a question on StackExchange that deals with the same problem.


Update 8/24/17:

I got myself an Orcid ID: orcid.org/0000-0001-8472-2139, but haven’t really made use of it yet!