Sentiment analysis is a machine learning method for evaluating the emotional tone expressed in written text. In marketing analysis, it is an important tool for gauging how satisfied customers are with the products and services they have purchased and reviewed. Most typically, this form of analysis focuses on customer reviews of the reliability of a particular product purchased at a retail outlet, or of the quality of service offered by a specific named business. However, sentiment analysis has rarely been used to generate metrics that permit comparisons of customer satisfaction between business sectors.
In this study, AlgoTactica used sentiment scoring algorithms to analyze 43,000 online customer satisfaction reviews. These covered the restaurant, real estate, and automotive business sectors, as well as a professional services sector spanning several practice areas, including legal, medical, and dental. The reviews were associated with businesses in a large North American city that has a population nearing several million. Four separate natural language processing algorithms for sentiment scoring were used, each based on its own mathematical formulation and each available in Python. For each review, the text was processed to extract four individual sentiment scores, which were then averaged. Each review also carried a numeric grade of one through five stars, assigned by the reviewer. This grade was paired with the averaged sentiment score to produce a final bivariate feature vector for each review.
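The scoring step described above can be sketched in Python roughly as follows. The text does not name the four algorithms used in the study, so the two scorers here (`lexicon_scorer` and `damped_scorer`) and their word lists are hypothetical placeholders, and the assumption that scores fall in the range -1 to 1 is ours. The sketch only illustrates averaging several scores and pairing the result with the star rating.

```python
from statistics import mean
from typing import Callable, Sequence, Tuple

# Hypothetical word lists, for illustration only.
POSITIVE = {"great", "friendly", "excellent", "good"}
NEGATIVE = {"rude", "slow", "terrible", "bad"}


def lexicon_scorer(text: str) -> float:
    """Placeholder scorer: (positive - negative) / matched words, in [-1, 1]."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def damped_scorer(text: str) -> float:
    """Second placeholder scorer: a more conservative variant of the first."""
    return 0.5 * lexicon_scorer(text)


def review_feature(
    text: str,
    stars: int,
    scorers: Sequence[Callable[[str], float]],
) -> Tuple[int, float]:
    """Return the bivariate feature (star rating, averaged sentiment score)."""
    if not 1 <= stars <= 5:
        raise ValueError("star rating must be between 1 and 5")
    averaged = mean(scorer(text) for scorer in scorers)
    return (stars, averaged)


feature = review_feature("great friendly staff", 5, [lexicon_scorer, damped_scorer])
print(feature)
```

In practice, each placeholder would be replaced by a call into a real sentiment library, with the scores rescaled to a common range before averaging.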
The Sentiment vs. Star Rating plot matrix at the top right displays the relationship between computed sentiment score and reviewer-assigned star rating for each business sector. The solid line shows the median sentiment (y-axis) plotted against the star rating (x-axis). For all business sectors, median sentiment rises across the first four rating levels; however, the difference in sentiment scores between ratings of four and five stars is smaller. This suggests that reviewers might have some difficulty in distinguishing between the last two levels, and that a scale of only four stars might be better. In the boxplots, there is a high degree of variance overall, spanning both negative and positive sentiment, for star ratings of one through three, which suggests that some reviewers use positive tones but still assign a low rating. For ratings of four and five stars, three of the sectors show very low variance and only positive sentiment. The exception is automotive, which shows comparatively higher variance and some negative sentiment at four stars.
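As a minimal sketch of how the median line could be computed, assuming each review is held as a `(stars, sentiment)` pair, the scores can be grouped by star rating and the median taken per group. The function name and the sample values are illustrative, not data from the study.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, Tuple


def median_sentiment_by_stars(
    features: Iterable[Tuple[int, float]],
) -> Dict[int, float]:
    """Group sentiment scores by star rating and return the median of each group."""
    groups = defaultdict(list)
    for stars, sentiment in features:
        groups[stars].append(sentiment)
    # Sort by star rating so the result reads left to right along the x-axis.
    return {stars: median(values) for stars, values in sorted(groups.items())}


# Illustrative values only.
sample = [(1, -0.5), (1, 0.25), (1, -0.25), (5, 0.5), (5, 0.75)]
print(median_sentiment_by_stars(sample))
```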
Density graphs of sentiment scores are shown in the Sentiment Distribution plot matrix at the middle right. While the distributions differ among the four sectors, in all cases the majority of expressed sentiment is positive, with a peaked concentration narrowly centered around the 0.5 value. Differences between the sectors can be summarized by computing the total area under the curve over the negative sentiment range, and then dividing that by the total area under the curve over the positive range, to produce a negative-to-positive area ratio.
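One way to compute such a ratio is sketched below in plain Python. It rests on several assumptions that the text does not confirm: scores lie between -1 and 1, zero is the boundary between negative and positive sentiment, the density is estimated with a Gaussian kernel of fixed bandwidth, and the areas are approximated with the trapezoid rule. The function names are illustrative.

```python
import math
from typing import Callable, Sequence


def gaussian_kde(samples: Sequence[float], bandwidth: float) -> Callable[[float], float]:
    """Return a density function built from Gaussian kernels centered on each sample."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x: float) -> float:
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def trapezoid_area(f: Callable[[float], float], lo: float, hi: float, steps: int = 400) -> float:
    """Approximate the integral of f over [lo, hi] with the trapezoid rule."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    total += sum(f(lo + i * h) for i in range(1, steps))
    return total * h


def neg_to_pos_area_ratio(
    scores: Sequence[float],
    bandwidth: float = 0.1,   # assumed smoothing width
    lo: float = -1.0,         # assumed lower bound of the score range
    hi: float = 1.0,          # assumed upper bound of the score range
) -> float:
    """Area under the density for scores below zero divided by the area above zero."""
    density = gaussian_kde(scores, bandwidth)
    negative_area = trapezoid_area(density, lo, 0.0)
    positive_area = trapezoid_area(density, 0.0, hi)
    return negative_area / positive_area


# Illustrative values only: mostly positive scores give a ratio below 1.
print(neg_to_pos_area_ratio([0.5, 0.6, 0.4, -0.3]))
```

A ratio near zero indicates almost entirely positive sentiment, while a ratio of 1 would indicate equal negative and positive mass.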
The Negative-to-Positive Area Ratio boxplots, shown at bottom right, illustrate the findings for each sector. These were constructed by bootstrap sampling to produce an ensemble of density curves for each sector, and then computing the area ratio for each individual curve. It is evident that the automotive services sector has a ratio of negative-to-positive sentiment that is much higher than for the remaining three business sectors. From this it is straightforward to conclude that customers are significantly less satisfied with the quality of services they purchase from this sector than with the services they buy from the other three sectors.
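The bootstrap step can be sketched as follows, assuming resampling with replacement at the original sample size. To keep the example short, `count_ratio` is a simplified stand-in that divides the count of negative scores by the count of positive scores; the study computed the ratio from areas under density curves, so any such function could be passed in its place. The resample count, seed, and sample values are arbitrary illustrative choices.

```python
import random
from typing import Callable, List, Sequence


def count_ratio(scores: Sequence[float]) -> float:
    """Simplified stand-in statistic: negative score count over positive score count."""
    neg = sum(s < 0 for s in scores)
    pos = sum(s > 0 for s in scores)
    return neg / pos if pos else float("inf")


def bootstrap_ratios(
    scores: Sequence[float],
    statistic: Callable[[Sequence[float]], float],
    n_resamples: int = 200,
    seed: int = 0,
) -> List[float]:
    """Resample the scores with replacement and compute the statistic on each resample."""
    rng = random.Random(seed)  # local generator so results are reproducible
    n = len(scores)
    return [statistic(rng.choices(scores, k=n)) for _ in range(n_resamples)]


# Illustrative values only: eight positive and two negative scores.
sector_scores = [0.5] * 8 + [-0.5] * 2
ratios = bootstrap_ratios(sector_scores, count_ratio, n_resamples=50, seed=1)
print(min(ratios), max(ratios))
```

The resulting list of ratios for each sector is the kind of ensemble that a boxplot would then summarize.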