IBM Watson Hackathon

Inspiration
We'd like to get out a little more, intellectually speaking. To make that happen, we wanted to show Red Ventures the value of cognitive computing by performing research at IBM's Watson Hackathon (http://www.ibm.com/smarterplanet/us/en/ibmwatson/watson-hackathon.html). Once Red Ventures sees the potential, we'll have a chance to lead the way.
 
Goal
We aimed to augment and optimize conversations between customers and sales professionals using natural language processing and the Watson Developer Cloud. More specifically, we set out to explore chat data and dig up insights. Moving forward, we'll turn those insights into tools and recommendations for our sales teams.
 
Team
+ Paul Prae
+ Renato Pereyra
+ Nathan Johnson
+ Thomas Bailey
 
How it works
First, we used the Alchemy API (http://www.alchemyapi.com/) to enrich thousands of sales chat logs with linguistic metrics (sentiment, keywords, etc.). Second, we looked for correlations between those linguistic metrics and successful chat outcomes.
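For a sense of what the enrichment step looked like, here is a rough sketch in Node.js. The endpoint and response fields follow AlchemyAPI's documented REST interface from that era (the service has since been folded into Watson), and the chat record shape is a placeholder we use for illustration, not our actual schema:

// Hedged sketch: enrich one chat with sentiment and keywords via AlchemyAPI.
// The chat record shape ({ id, agentText, customerText, outcome }) is hypothetical.
async function alchemy(endpoint, text) {
  const params = new URLSearchParams({
    apikey: process.env.ALCHEMY_API_KEY, // placeholder key
    text,
    outputMode: 'json',
  });
  const res = await fetch('https://access.alchemyapi.com/calls/text/' + endpoint, {
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: params.toString(),
  });
  return res.json();
}

async function enrichChat(chat) {
  const [agent, customer, keywords] = await Promise.all([
    alchemy('TextGetTextSentiment', chat.agentText),
    alchemy('TextGetTextSentiment', chat.customerText),
    alchemy('TextGetRankedKeywords', chat.agentText),
  ]);
  return {
    ...chat,
    agentSentiment: Number((agent.docSentiment || {}).score || 0),
    customerSentiment: Number((customer.docSentiment || {}).score || 0),
    keywords: (keywords.keywords || []).map((k) => k.text),
  };
}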
 
Challenges we ran into
+ Alchemy API rate limits (see the throttling sketch after this list)
+ Formatting data for Alchemy analysis
+ Data visualization (some representations hide important patterns)
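Since rate limits were the biggest bottleneck, here is a minimal sketch of the workaround pattern, assuming the enrichChat helper sketched above. The delay value is illustrative, not AlchemyAPI's actual limit:

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Process chats one at a time with a fixed pause between calls so we stay
// under the per-key rate limit. delayMs is illustrative, not Alchemy's real limit.
async function enrichAll(chats, delayMs = 300) {
  const enriched = [];
  for (const chat of chats) {
    enriched.push(await enrichChat(chat));
    await sleep(delayMs);
  }
  return enriched;
}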
 
Accomplishments that we're proud of
+ Verified three out of three hypotheses
+ Discovered trends that can be acted on to increase chat agent success
 
What we learned
We verified these hypotheses:
+ Chats with positive sentiment generate more follow-up calls from customers
+ Agents who talk like the customer generate more follow-up calls
+ Keywords from sales agent speech are good for predicting call outcomes
Visualizations of Watson Hackathon Results
We set forth on this trip to test specific hypotheses. Through that testing, we turned raw data into insights. One powerful thing about natural language processing is the ability to scale insight across many conversations simultaneously: during this hackathon, we used software to automate linguistic analysis across more than 3,000 chat conversations. We verified all three hypotheses we tested (though we still need to measure things like statistical significance and error rates).
 
On this page, you will find some visualizations we used to better understand the results of our experiments. The scripts and modules we wrote for our hack mostly involved data transformation and analysis. To explore and understand the output, we visualized our results with Excel, R, and Tableau.
 
Chats with positive sentiment generate more follow-up calls from customers
We used Alchemy API's Sentiment Analysis (http://www.alchemyapi.com/api/sentiment-analysis) for this portion. "Generally speaking, sentiment analysis aims to determine the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document" (http://en.wikipedia.org/wiki/Sentiment_analysis). We made sure to distinguish between the agent and the customer in the conversation; a sketch of that per-speaker split follows the insights below. Here is a visualization of our results using Tableau:
Some insights:
+ In general, the higher the sentiment rating for both the agent and the customer, the more successful the conversation. This is shown by the many chats in the top-right quadrant.
+ We can also see a drop-off in success when the agent is too positive. It seems that if the agent is overzealous, the chats are less successful. The sweet spot appears to be an agent sentiment between 0.3 and 0.5.
+ The agent's sentiment is consistently positive, but the customer's can vary greatly. This makes sense: our agents are trying to make a sale and need to keep a positive vibe, while the customer is under no such obligation.
+ Also notice how the customer's sentiment commonly sits around zero, which reflects a neutral sentiment. In these cases the customer may just want to get down to business, showing no emotion.
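As promised above, here is a minimal sketch of the per-speaker split that let us score each side separately. The transcript format (one "Agent:" or "Customer:" prefixed utterance per line) is a placeholder; our real logs had their own structure:

// Split a raw transcript into agent-only and customer-only text so each
// side can get its own sentiment score. The line format here is hypothetical.
function splitBySpeaker(transcript) {
  const sides = { agent: [], customer: [] };
  for (const line of transcript.split('\n')) {
    const match = line.match(/^(agent|customer):\s*(.*)$/i);
    if (match) sides[match[1].toLowerCase()].push(match[2]);
  }
  return {
    agentText: sides.agent.join(' '),
    customerText: sides.customer.join(' '),
  };
}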
Agents who talk like the customer generate more follow-up calls
In this case, we wanted to see what happens when the agent talks like the customer. To do this, we compared the vocabulary, or word choice, of both sides. An example of an agent matching the vocabulary of a customer would be if the agent repeated a question back to the customer for clarification. Our analysis clearly shows a benefit when an agent reflects the vocabulary of the customer.
 
As advised by the computational linguist we brought along, we chose cosine similarity (http://en.wikipedia.org/wiki/Cosine_similarity) as our measurement for comparing vocabularies; a sketch of the computation follows the insights below. Since we were doing our processing in Node.js, we were happy to find a package that does just this type of measurement: https://www.npmjs.com/package/cosine. Once the data was processed, we did a visual analysis of our results and created the following graph in Excel:
Some insights:
+ We can see the success ratio increase as we move right on the graph, showing a strong positive trend; the greatest increase in success occurs between 0.2 and 0.4 similarity.
+ According to this graph, the more similar the two sides of the conversation are to each other, the better for our metrics. This probably won't hold at extremely high values like 1.0, where the agent would literally be echoing the customer. We need to experiment with more data on the high end of the similarity measure to see where success drops off.
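For reference, here is a minimal sketch of the measurement itself: bag-of-words counts for each side, then cosine similarity over the shared vocabulary. The npm cosine package we used wraps essentially this math; the tokenizer here is illustrative:

// Count word occurrences; the tokenizer (lowercase alphabetic runs) is illustrative.
function bagOfWords(text) {
  const counts = new Map();
  for (const token of text.toLowerCase().match(/[a-z']+/g) || []) {
    counts.set(token, (counts.get(token) || 0) + 1);
  }
  return counts;
}

// Cosine similarity of two bag-of-words vectors: dot product divided by the
// product of magnitudes. 1.0 means identical word distributions, 0 means no overlap.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const [token, count] of a) {
    normA += count * count;
    if (b.has(token)) dot += count * b.get(token);
  }
  for (const count of b.values()) normB += count * count;
  return dot / (Math.sqrt(normA * normB) || 1);
}

// e.g. cosineSimilarity(bagOfWords(chat.agentText), bagOfWords(chat.customerText))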