Talk: Socially Embedded Search

This week I attended a full-house talk by Dr. Meredith Ringel Morris on Socially Embedded Search Engines. Dr. Morris put together a lot of material in her presentation, and we in the audience could appreciate how she presented all of it, with great clarity, in just one hour. It would be tricky for me to summarize everything in a short post, so do check out Dr. Morris’ website to find out more about the subject.

Social Search is the term for posing a question to your friends through one of the social networking tools (like Facebook or Twitter). There is a good chance that you have already been using “social search” without knowing the term for it. So why would you want to do that instead of using a regular search engine? It may simply be easier to ask your friends at times, and they can provide direct, reliable and personalized answers. Moreover, this is something that can work alongside traditional search engines. Dr. Morris’ work gives some insight into the opportunities search engineers have to combine traditional algorithmic approaches with social search. She tells us what kinds of questions are asked more often in a social search and which of them are more likely to get a useful answer. She goes on to show how the topics of these questions vary across cultures.

I really liked the part about “Search buddies” during the talk. In their paper, Dr. Morris and her colleagues proposed embedding automated agents that post relevant replies to your social search queries. One type of agent tries to figure out the topic of the question and, by looking at profiles, recommends friends who seem to be interested in that area. Another takes an algorithmic approach and posts a link to a web page that is likely to contain an answer to the question. It was interesting to hear how people reacted to the involvement of these automated agents. While some participants in the experiment appreciated being referred to for an answer, many found the agents obnoxious when they performed poorly at identifying the context. In her more recent work, Dr. Morris has tried to solve these problems by recruiting real people from Mechanical Turk to answer questions on Twitter. Such an approach can respond to questions in a smarter way by collecting information from several people. It can then reply in the form of a polling result, quoting the number of people who recommend a particular answer. It can also take into account any replies that the asker has already received from their followers: the automated agent presents those answers to the Turkers for an opinion poll. Although such a system can provide more intelligent replies than ‘dumb’ algorithms, it may still fall short of responses from your friends, which would certainly be more personalized and better placed in context. During the Q&A session, one of the audience members raised a question (with a follow-up question by Prof. Kraut) about comparing these methods with question-and-answer websites such as Quora. While these sites may not provide answers that are as personalized, they certainly do better at drawing the attention of people interested in similar topics. It may not always be possible to find somebody amongst your friends who can answer a question on a specialized topic.
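
To make the poll-style replies concrete, here is a minimal sketch (my own illustration, not the actual system described in the talk) of how answers collected from Turkers might be tallied and quoted back as a poll result:

```python
from collections import Counter

def poll_style_reply(question, crowd_answers):
    """Summarize crowd-collected answers as a poll-style reply.

    crowd_answers: free-text answers gathered from workers
    (e.g., on Mechanical Turk) for a question posted on Twitter.
    """
    tally = Counter(a.strip().lower() for a in crowd_answers)
    top_answer, votes = tally.most_common(1)[0]
    return (f"{votes} of {len(crowd_answers)} people recommend "
            f"'{top_answer}' for: {question}")

# Hypothetical usage:
answers = ["Pamela's Diner", "pamela's diner", "The Porch", "Pamela's Diner"]
print(poll_style_reply("Best breakfast spot near campus?", answers))
```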

Dr. Morris’ talk provided some really good backing for recent steps taken in this direction by search engines like Bing (with ties to both Twitter and Facebook), Google (and the whole Google Plus shebang) and also Facebook (with Graph Search). It would be interesting to see how social computing research shapes the future of internet search.

Further Reading

You can find Dr. Morris’ publications on this topic here: http://research.microsoft.com/en-us/um/people/merrie/publications.html

Talk: Understanding Social Dynamics of Emergent Hashtag

This post is about a talk titled “#Bigbirds Never Die: Understanding Social Dynamics of Emergent Hashtag” by Dr. Yu-Ru Lin in the ISP Colloquium Series. You may browse all such posts under the Talks category in the archives.

Hashtags can be defined simply as words prefixed by a “#” sign. They serve as a means to group related messages together on social media. Twitter (and recently Facebook) makes it possible for users to search for specific hashtags and see all the relevant posts on a topic. While Twitter wasn’t the first to use this concept, it has unarguably gained most of its popularity through its use on the micro-blogging site.
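
As a quick aside (my own illustration, assuming nothing more than the definition above), extracting the hashtags from a tweet is a one-line pattern match:

```python
import re

def extract_hashtags(tweet):
    """Return all '#'-prefixed words in a tweet."""
    return re.findall(r"#\w+", tweet)

print(extract_hashtags("Who will save Big Bird? #SupportBigBird #debates"))
# ['#SupportBigBird', '#debates']
```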

Dr. Lin’s research concerns the rise of new hashtags (such as #bigbird) during the 2012 US Presidential Election debates. She presented an analysis of the emergence and evolution of such hashtags, and in turn the topics they represent. Posts were analyzed during the periods when new, never-before-used hashtags were created, used and shared by other people.

Since different people may be tweeting on the same topic around the same time, there can be several competing candidates (e.g., #bigbird, #supportbird, #savebigbird), but only a few gain popularity amongst fellow tweeters (or twitterers, take your pick!). Dr. Lin and her colleagues put them into two classes: ‘winners’ and ‘also-rans’. A ‘winner’ hashtag is one that emerges more quickly and is sustained for a longer period of time.

Now the question to ask is: what factors are influential in making a hashtag a ‘winner’? Here are two of the important results from the study (a rough sketch of how one might measure these notions follows the list):

  • A hashtag is adopted faster when it is retweeted more. Adoption also depends on the size of the audience that gets to read it.
  • More replies, and more diversity amongst the tweeters using a hashtag, imply longer persistence.
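
Here is a rough sketch of how one might compute proxies for a hashtag’s adoption speed and persistence from tweet timestamps. The early_n knob and the thresholds are hypothetical, not taken from the paper:

```python
from datetime import timedelta

def hashtag_stats(timestamps, early_n=100):
    """Proxies for how quickly a hashtag emerges and how long it lasts.

    timestamps: datetimes of tweets containing the hashtag.
    early_n: number of adoptions used to gauge emergence speed
    (a made-up knob, not a parameter from the paper).
    """
    ts = sorted(timestamps)
    time_to_early_n = ts[min(early_n, len(ts)) - 1] - ts[0]
    lifespan = ts[-1] - ts[0]
    return time_to_early_n, lifespan

def looks_like_winner(time_to_early_n, lifespan,
                      fast=timedelta(hours=1), long_lived=timedelta(days=2)):
    # Hypothetical thresholds: quick emergence plus sustained use.
    return time_to_early_n <= fast and lifespan >= long_lived
```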

I think that apart from the results above (which should be studied carefully by people who run promotional campaigns, for instance), there is a lot more to take away from research like this. It not only gives us insights into the dynamics at play on social networks (which may interest social science researchers) but also gives us tools and methods to analyze big data. It serves as an example of data-driven computational and statistical approaches to making sense of conversations on social networking sites like Twitter.

Further Reading

  1. Y.-R. Lin, D. Margolin, B. Keegan, A. Baronchelli and D. Lazer, #Bigbirds Never Die: Understanding Social Dynamics of Emergent Hashtag, In Proceedings of the 7th International AAAI Conference on Weblogs and Social Media (ICWSM 2013), 2013. Available: http://arxiv.org/pdf/1303.7144v1.pdf

Presentations on the Cloud

A presentation on Google Drive.
With an old Microsoft Office-y feel.

Like many of you, I have been using Google Docs (or Google Drive) for a long time. It works just fine when you need to work with a group and have several members contributing to a project. In fact, it is the only application I use for collaborating on documents and spreadsheets. You sometimes wonder how we even managed in the days before it was possible to edit your documents online.

But when it comes to making presentations online, I haven’t been able to find a very usable solution. I have never found Google’s interface good enough. It takes some effort and time to get used to so many toolbars inside a browser.

Keynote on iCloud with a super easy interface.

While I don’t really “create” new presentations on the cloud, I do tend to edit them quite often and make a lot of changes before presenting. I would recall points that I should (or shouldn’t 🙂 ) have included at times when I didn’t have access to my computer, or when I was on my office computer with a different operating system, or even worse, on mobile.

Keynote on iCloud offers something that seems just right for my needs. It has a super easy to use interface which looks very familiar across devices and has all the features that I frequently use. It is so much more convenient to revise presentations with it. You can seamlessly convert and download your presentations in the format of your choice when you are done. Or, if you don’t depend on the presenter view a lot, you can also play the presentation right from the browser.

I must admit that I am an Apple fan-boy when it comes to user interfaces. iCloud not only offers the same desktop-like interface across all devices but presents all of it with very neat designs. Take a look at the home page for iCloud, for example. iCloud has many more things to offer with just as stunning interfaces. I haven’t explored the other available apps since I haven’t really found many use-cases for them. For mail, calendar and contacts, I still prefer to use good old Google with its familiar power-user functions.

URI for me!

A Google search for my name yields more than 518,000 results. Nah, I am not popular (I wish!), but it turns out that there are a lot of “Gaurav Trivedi”s in this world. Yes, with the same first name and last name. A search on Facebook will give you results along with their pictures as well. So I do share my name with loads of “real” people. For the first time, I wished that my parents had given me a middle name. It would have been easier to stand out.

The omniscient Google has sympathies for me!

Fortunately, I have had a systematic strategy of using trivedigaurav as an identifier for myself online (you have been notified now!). For example, this site: www.trivedigaurav.com, and Twitter (@trivedigaurav). Now that I know so many others with the same name exist, I have come to the realisation that I’ve been quite lucky to have that ID available, especially on popular sites. I am proud of the 5-year-old me who could think ahead 😉

But as an aspiring researcher, is this something I should be concerned about? Would it be a good idea to adopt a pen-name now that I’ll be starting to author academic writings? Here’s a question on StackExchange that deals with the same problem.


Update 8/24/17:

I got myself an ORCID iD: orcid.org/0000-0001-8472-2139, but haven’t really made use of it yet!

Talk: Intelligent Tutoring Systems

Starting this week, I am adding a new feature to the blog. Every week, I’ll post something about a talk or colloquium that I attend. Serves as talk notes, writing practice and an assignment all in one full scoop? You bet it does!

The program that I am pursuing, the Intelligent Systems Program, provides a collaborative atmosphere for both students and faculty by giving them regular opportunities to present their research. It not only helps them gather feedback from others but also introduces their work to new members of the program (like me!). As part of these efforts, we have a series of talks called the ISP Colloquium Series.

For the first set of talks in the ISP Colloquium Series this semester, we had Mohammad Falakmasir and Roya Hosseini present two of their award-winning papers, both on Intelligent Tutoring Systems.

1. A Spectral Learning Approach to Knowledge Tracing by Mohammad Falakmasir

For developing intelligent tutoring systems that adapt to a student’s requirements, one needs a way to determine the student’s knowledge of the skills being taught. This is commonly done by modeling it with a small set of parameters. After learning these parameters from sequences of students’ responses to quizzes, one can predict students’ performance on future questions. This information can then be used to adapt the tutor to keep a pace that students are comfortable with. The paper proposes the use of a Spectral Learning [1] algorithm, instead of techniques such as Expectation Maximization (EM), to estimate the parameters that model knowledge. EM is known to be a time-consuming algorithm. The results of this paper show that similar or higher prediction accuracy can be achieved while significantly reducing the knowledge-tracing time.
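
For context, knowledge tracing typically models each skill with four parameters: initial knowledge, learning rate, guess and slip. Below is a minimal sketch of the per-response update at the heart of Bayesian Knowledge Tracing, with made-up parameter values; the paper’s contribution is a faster (spectral) way of estimating such parameters, not this update rule itself:

```python
def bkt_update(p_know, correct, p_guess=0.2, p_slip=0.1, p_learn=0.15):
    """One step of Bayesian Knowledge Tracing.

    p_know: current estimate that the student has mastered the skill.
    The guess/slip/learn values here are made up for illustration;
    estimating them from response sequences is what EM or spectral
    learning is for.
    """
    if correct:
        evidence = p_know * (1 - p_slip)
        posterior = evidence / (evidence + (1 - p_know) * p_guess)
    else:
        evidence = p_know * p_slip
        posterior = evidence / (evidence + (1 - p_know) * (1 - p_guess))
    # Account for the chance the student learned the skill just now.
    return posterior + (1 - posterior) * p_learn

p = 0.3  # initial mastery estimate
for answer in [True, False, True, True]:
    p = bkt_update(p, answer)
    print(f"P(mastered) = {p:.2f}")
```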

To design experiments with this new method, Mohammad and his co-authors analyzed data collected using a software tutor that had been used in an introductory programming class at Pitt for over 9 semesters. They could then compare the performance of their new method against EM learning of the parameters, calculating both prediction accuracy and root mean squared error as metrics. Training data came from the first semester and was tested against the second; they then repeated this by learning from the first two semesters and predicting results for the third, and so on. This allowed them to back their results, which show a time improvement by a factor of 30(!), with a robust statistical analysis.
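
That growing-window protocol is easy to sketch. Here, train_fn and predict_fn are stand-ins for the estimation method being evaluated (EM or spectral learning); this is my reading of the setup, not the authors’ code:

```python
import math

def rolling_evaluation(semesters, train_fn, predict_fn):
    """Train on semesters 1..k, test on semester k+1, for every k."""
    for k in range(1, len(semesters)):
        model = train_fn(semesters[:k])                 # learn parameters
        truth, preds = predict_fn(model, semesters[k])  # 0/1 labels vs. probabilities
        n = len(truth)
        acc = sum(t == round(p) for t, p in zip(truth, preds)) / n
        rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, preds)) / n)
        print(f"train 1..{k}, test {k + 1}: accuracy={acc:.3f}, RMSE={rmse:.3f}")
```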

2. KnowledgeZoom for Java: A Concept-Based Exam Study Tool with a Zoomable Open Student Model by Roya Hosseini

Roya talked about open student modeling, as opposed to a hidden model, for capturing students’ skills and knowledge. In her paper, she proposes that a visual presentation of this model can be helpful during exam preparation: using it, one can quickly review the entire syllabus and identify the topics that need more work. I find it a very interesting concept and, again, something that I would personally like to use.

The authors designed a software tutor called Knowledge Zoom that can be used as an exam-preparation tool for Java classes. It is based on a concept-level model of knowledge about Java and object-oriented programming. Each question is indexed with these concepts: it specifies the prerequisite concepts needed to answer it, as well as the outcome concepts that can be mastered by working on it. Students are given a zoomable tree explorer that presents this information visually, with each node’s size indicating the importance of the concept and its color indicating the student’s knowledge in that area. Another component of the tool presents students with a set of questions and adaptively recommends new ones: based on the ontology and the indexing of the questions discussed above, it can calculate how prepared a student is to attempt a particular question.
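
Here is a small sketch of what such concept-level indexing might look like, with readiness scored as the average knowledge over a question’s prerequisite concepts. The scoring rule is my guess at a plausible formula; the paper’s exact computation may differ:

```python
# Hypothetical concept-indexed question, in the spirit of Knowledge Zoom.
question = {
    "text": "What does this nested loop print?",
    "prerequisites": {"variables", "for-loops"},
    "outcomes": {"nested-loops"},
}

# Open student model: estimated knowledge per concept, in [0, 1].
student = {"variables": 0.9, "for-loops": 0.4, "nested-loops": 0.1}

def readiness(question, student):
    """Average knowledge over the question's prerequisite concepts."""
    prereqs = question["prerequisites"]
    return sum(student.get(c, 0.0) for c in prereqs) / len(prereqs)

print(f"readiness: {readiness(question, student):.2f}")  # 0.65
```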

The method was evaluated with a classroom study in which students could use multiple tools (including KZ) to answer Java questions, followed by a statistical analysis comparing the features KZ introduces against the other tools. The results demonstrated that KZ helped students reach their goals faster in moving from easy to harder questions. I was impressed that, on top of these results, the authors decided to back them up with a subjective analysis by the students, who preferred KZ over the others by a great margin and also provided valuable feedback during the study.

While these tutors can currently support only concept-based subjects like programming and math, where one can get by with objective-style questions, the fact that we can intelligently adapt to a student’s pace of learning is really promising. I wish I could use some of these tools for my own courses!

Footnotes

  1. You can find out more about spectral learning algorithms here: http://www.cs.cmu.edu/~ggordon/spectral-learning/. ^

Further Reading

  1. M. H. Falakmasir, Z. A. Pardos, G. J. Gordon, P. Brusilovsky, A Spectral Learning Approach to Knowledge Tracing, In Proceedings of the 6th International Conference on Educational Data Mining. Memphis, TN, July 2013. Available: http://people.cs.pitt.edu/~falakmasir/images/EDMPaper2013.pdf
  2. P. Brusilovsky, D. Baishya, R. Hosseini, J. Guerra and M. Liang, KnowledgeZoom for Java: A Concept-Based Exam Study Tool with a Zoomable Open Student Model, In Proceedings of ICALT 2013, Beijing, China, 2013. Available: http://people.cs.pitt.edu/~hosseini/papers/kz.pdf

How about collaboration?

My previous post on Computers and Chess serves as a good prologue to this one.

That’s me geeking out at the Jeopardy stage setup.

A little more than two years ago, IBM’s Watson played against and defeated the previous champions of Jeopardy!, the TV game show in which contestants are tested on their general knowledge with quiz-style questions.[1] I remember being so excited while watching this episode that I ended up playing it over and over again, only to have the Jeopardy jingle loop in my head for a couple of days! This is a much harder challenge for computer scientists to solve than making a machine play chess.

Computers have accomplished so many things that we once thought only humans could do (play chess and Jeopardy, drive a car all by themselves…). While these examples are by no means small problems, we still have a long way to go. A computer can solve problems that we as humans often find difficult (such as playing chess, or calculating 1234567890 raised to the power 42), yet it cannot* do a lot of things that you and I take for granted. For example, it can’t comprehend this post as well as you do (Watson may not be able to answer everything), read it out naturally and fluently (Siri still sounds robotic), or make sense of the visuals on this page (and so on). *At least not yet.
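
To see the asymmetry, that arithmetic example is a one-liner for the machine:

```python
# A 382-digit number, computed in well under a second.
n = 1234567890 ** 42
print(len(str(n)))  # 382
```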

Computers were designed as tools to help us with calculations or computations. By this very definition, are computers inherently better at handling certain types of problems while failing at others? We have no answer [2] to this question yet, and I at least hope that it isn’t in the affirmative, so that someday we can replicate human intelligence. As we have seen in the past, we certainly cannot say that “X” is something that computers will never be able to do. But we can surely point out the areas in which researchers are working hard and hoping to improve.

Here’s a video that talks about the topic I am hinting at. While I promise not to post many TED talks in the future, you can be sure of finding its central idea (the first half of the talk) as a common theme on this blog. Also, I prefer the word “Collaboration” over “Cooperation” [3]:

TLDR: Let’s not try to solve big problems solely with computers. Make computers do the boring, repetitive work and involve humans to provide creative inputs or heuristics for the machines. And let’s try to improve the interfaces that make this possible.

Although this idea was envisioned in "Man-Computer Symbiosis" (Licklider, J. C. R., 1960) more than half a century ago, researchers seem not to have given it due importance when [4] computers failed to perform as well as expected. Of course, the more “X”s that computers are able to do by themselves, the more it frees us to do whatever we do best. When we look around and observe the devices we use and how we interact with machines every day, we seem to have, knowingly or unknowingly, progressed in the direction shown by Licklider. With the furthering of research in areas such as Human Computation, Social Computing, and (the new buzzword) Crowd-sourcing, the interest shown in such ideas has never been greater.

References

  1. Licklider J. C. R. (1960), Man-Computer Symbiosis. IEEE. Available: http://groups.csail.mit.edu/medg/people/psz/Licklider.html.

Footnotes

  1. More about Watson from IBM here. See also, Jeopardy vs. Chess. ^
  2. Amazon’s Mechanical Turk does talk about “HITs” or Human Intelligence Tasks ^
  3. In AI terms, it would indeed be multi-agent co-operation but then again we are not treating humans just as agents in this case. ^
  4. AI Winter: http://en.wikipedia.org/wiki/AI_winter ^

Computers and Chess

Deep Blue vs. Kasparov: 1996 Game 1. Deep Blue won this game but Kasparov went on to win the match by 4-2. In the 1997 re-match, however, Deep Blue won 3½–2½.

Designing an algorithm to play the game of chess has been one of the challenges that have attracted the attention of many mathematicians and computer scientists. The sheer number of combinatorial possibilities makes it hard to predict the result for humans and computers alike. There were many highly publicized games pitting humans against (super)computers in the ’90s and ’00s, such as Deep Blue vs. Kasparov.
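
The classic answer to that combinatorial explosion is minimax search with alpha-beta pruning, which lets an engine skip branches that cannot change its final decision. Here is a generic sketch, with the chess-specific pieces (move generation, position evaluation) left abstract as placeholder functions:

```python
def alphabeta(state, depth, alpha, beta, maximizing, moves, play, evaluate):
    """Minimax search with alpha-beta pruning.

    moves(state) -> legal moves; play(state, move) -> next state;
    evaluate(state) -> score from the maximizing player's view.
    These are placeholders for the game-specific parts.
    """
    legal = moves(state)
    if depth == 0 or not legal:
        return evaluate(state)
    if maximizing:
        value = float("-inf")
        for m in legal:
            value = max(value, alphabeta(play(state, m), depth - 1,
                                         alpha, beta, False, moves, play, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the opponent will never allow this line; prune it
        return value
    else:
        value = float("inf")
        for m in legal:
            value = min(value, alphabeta(play(state, m), depth - 1,
                                         alpha, beta, True, moves, play, evaluate))
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value
```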

It was around the same time that I was starting out with chess and was interested in learning how to play better. My father had gifted me a copy of a computer game called Maurice Ashley Teaches Chess. It included playing strategies, past-game analyses and video coaching by the chess grandmaster Maurice Ashley. It also had a practice mode where you could play against the computer. I didn’t end up being a good chess player, but if my memory serves me right, it did not take me long to start beating the in-game AI. Things have changed a lot since then. Computers are not only faster and more powerful now (able to explore many more moves) but are also equipped with better algorithms for evaluating a decision. Let’s compare excerpts from the introductory chapters of two of my textbooks:

From "Cognitive Psychology" (Medin et.al., 2004):

The number of ways in which the first 10 moves can be played is on the order of billions and there are more possible sequences for the game than there are atoms in the universe! Obviously neither humans nor machines can determine the best moves by considering all the possibilities. In fact, grandmaster chess players typically report that they consider only a handful of the possible moves and “look ahead” for only a few moves. In contrast, chess computers are capable of examining more than 2,000,000 potential moves per second and can search quite a few moves ahead. The amazing thing is that the best grandmasters (as of this writing) are still competitive with the best computers.

Now consider, "Artificial Intelligence: A Modern Approach (3rd Edition)" (Russell et.al., 2010):

IBM’s DEEP BLUE became the first computer program to defeat the world champion in a chess match when it bested Garry Kasparov by a score of 3.5 to 2.5 in an exhibition match (Goodman and Keene, 1997). Kasparov said that he felt a “new kind of intelligence” across the board from him. Newsweek magazine described the match as “The brain’s last stand.” The value of IBM’s stock increased by $18 billion. Human champions studied Kasparov’s loss and were able to draw a few matches in subsequent years, but the most recent human-computer matches have been won convincingly by the computer.

So, what happened in the six-year gap between the publication of these books? It turns out that there has indeed been such a shift in recent years. The computers’ superior performance stats can be seen in this Wikipedia entry. We have come a long way since the Kasparov vs. Deep Blue matches, due to advancements in both hardware and AI algorithms. Computers have now started not only winning but dominating human-computer chess matches, so much so that even mobile phones running slower hardware are reaching Grandmaster levels. Guess it’s time to switch to new board games! By the way, Checkers has been a solved problem since 2007: http://www.sciencemag.org/content/317/5844/1518.full! It will end in a draw (there is a computational proof of that) if both players use perfect strategies, i.e. ones that never lose.

Image Credits: en:User:Cburnett / Wikimedia Commons / CC-BY-SA-3.0 / GFDL

References

  1. Russell et al. (2010), Artificial Intelligence: A Modern Approach (3rd Edition), 49. Prentice Hall. Available: http://www.amazon.com/Artificial-Intelligence-Modern-Approach-Edition/dp/0136042597.
  2. Medin et al. (2004), Cognitive Psychology. Wiley. Available: http://www.amazon.com/exec/obidos/redirect?tag=citeulike07-20&path=ASIN/0471458201.