Social Media Analytics for Higher Sales Conversion Rates
Social data analytics has recently gained recognition for predicting the outcomes of important events such as major political elections and box-office movie revenues. Actions such as tweeting, liking, and commenting can provide valuable insights into consumers' attention to a product or service. This presents an opportunity to harness data from various social media outlets and generate specific predictions of public acceptance and valuation of new products and brands. Gauging consumer interest through the analysis of social media content gives the sales team a new and vital tool for predicting sales numbers accurately.
This use case focuses on forecasting product sales based on social media and time-series analysis. We present a predictive model of product sales that uses sentiment and consumer reactions gathered from social media over successive time periods. The model illustrates how predictors derived from sentiment at different time scales can improve the prediction of future sales. The widespread belief that social media data is too noisy and too biased to correlate accurately with sales data was proven wrong using efficient Artificial Intelligence (AI) models. We developed a process that collects relevant data from influential social media outlets and uses state-of-the-art machine learning algorithms to predict sales with high accuracy.
The ultimate goal is to develop an accurate estimate of product sales prior to product release, giving sales teams valuable knowledge of the product's potential uptake by the target group. Another objective is to enable better decisions on sales strategy, e.g. defining the optimal product quantities needed for launches in different sales regions. One interesting case that can be detected via social media is negative feedback that can hinder the business from earning leads. To manage this and similar situations, companies need access to authentic feedback from potential customers so that they can react in a timely manner, either by finding a way to satisfy customers or by improving product quality.
In addition to predicting future product success or failure, the model can be configured to provide a detailed map of consumer satisfaction with a product that has already launched. Other criteria related to consumer demographics, such as geographic location and age group, can also be extracted and studied to build better sales strategies and targeted marketing campaigns.
The adopted approach is to collect customer sentiment data via social media analytics and use it to train a Machine Learning (ML) model that predicts the evolution of a commercial product or service. The proposed model predicts the success or failure of commercial products and services and highlights the most important trends based on sentiment analysis of social media feedback. It aims to help the sales team improve existing sales strategies or develop new ones to increase customer loyalty and retention. In addition, the tool can help detect false information and protect the business's brand and reputation. Here are the main steps taken towards building the predictive model:
- Extract data from social media (e.g. posts, comments, reactions, etc.)
- Analyze sentiments of social media feedback
- Generate datasets from Facebook, Instagram, and Twitter
- Predict the impact of those sentiments on future product performance
The first step consists of extracting data, including posts, comments, and reactions, from social media, namely Twitter, Facebook, and Instagram, through web scraping and the relevant APIs. The second step involves processing the extracted data with a proprietary sentiment analysis algorithm that draws on well-known lexicon and rule-based libraries specifically attuned to sentiments expressed in social media.
A dictionary of lexical features is used to score sentiments with a set of five heuristics. A lexical feature in this context is anything used for textual communication, including words, emoticons like “:-)”, acronyms like “LOL”, and slang like “meh”. Each lexical feature is mapped to a numerical intensity value. Lexical features are not the only things in a sentence that affect its sentiment: contextual elements such as punctuation, capitalization, modifiers, and conjunctions also change the emotion conveyed. All of these are accounted for in the set of five heuristics. The effect of these heuristics has been quantified using human raters in well-documented processes, which have proved highly effective at analyzing the sentiment of movie reviews and opinion articles.
After extracting the data and applying the sentiment analysis algorithm to it, the next step in the methodology is to generate a dataset that includes the percentage of positive, neutral, and negative feedback for each period, referred to here as the timestamp. The developed model is flexible: it can generate a dataset at different timestamps, including months, weeks, days, minutes, seconds, or any other arbitrary interval. A dataset is defined by a name, start date, end date, and timestamp.
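The lexicon and rule-based scoring described above can be illustrated with a minimal Python sketch. It is not the proprietary algorithm: the lexicon entries, the intensity values, the adjustment weights, and the names `score_post` and `label` are all assumptions chosen for illustration. It covers four of the contextual elements named in the text (punctuation, capitalization, modifiers, and the conjunction "but") and does not handle negation.

```python
import re

# Toy lexicon: intensity values are illustrative assumptions,
# not taken from any published sentiment lexicon.
LEXICON = {
    "love": 3.0, "great": 3.0, "good": 2.0, ":-)": 2.0, "lol": 1.5,
    "meh": -1.0, "bad": -2.5, "awful": -3.0, "hate": -3.0,
}
# Degree modifiers that strengthen the word that follows them (assumed weights).
BOOSTERS = {"very": 0.5, "really": 0.5, "extremely": 1.0}


def score_post(text: str) -> float:
    """Return a signed sentiment score for one post (0.0 means no signal)."""
    tokens = re.findall(r":-\)|[A-Za-z']+", text)
    lowered = [t.lower() for t in tokens]
    but_idx = lowered.index("but") if "but" in lowered else None

    total = 0.0
    for i, tok in enumerate(tokens):
        base = LEXICON.get(lowered[i])
        if base is None:
            continue
        sign = 1.0 if base > 0 else -1.0
        value = base
        # Capitalization: an ALL-CAPS sentiment word counts a little more.
        if tok.isupper() and len(tok) > 1:
            value += sign * 0.5
        # Modifier: a booster directly before the word strengthens it.
        if i > 0 and lowered[i - 1] in BOOSTERS:
            value += sign * BOOSTERS[lowered[i - 1]]
        # Conjunction: text after "but" dominates text before it.
        if but_idx is not None:
            value *= 0.5 if i < but_idx else 1.5
        total += value

    # Punctuation: up to three exclamation marks amplify the overall score.
    if total != 0.0:
        sign = 1.0 if total > 0 else -1.0
        total += sign * 0.3 * min(text.count("!"), 3)
    return total


def label(score: float, threshold: float = 0.5) -> str:
    """Map a score to the three classes used in the dataset."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for post in ["I love this burger", "good but really awful", "This is GREAT!!!"]:
        print(post, "->", label(score_post(post)))
```

In practice the thresholds and weights would be calibrated against human ratings, as the text describes, rather than set by hand as they are here.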
Recall that the mission of the predictive model, as explained above, is to estimate the evolution of commercial products and services based on sentiment analysis of social media feedback. We developed and tested several Machine Learning (ML) algorithms and trained them on social media data that we collected and curated following the rigorous process detailed earlier. To account for seasonal fluctuations in sales, the model uses time series forecasting to ensure consistently accurate predictions. Here are the sequential steps followed during the prediction process:
- Select a dataset of interest
- Train all Machine Learning (ML) algorithms with the given dataset
- After the training algorithms converge, the model selects the Machine Learning (ML) algorithm that provides the best accuracy
- Using the best algorithm identified in the previous step, predict the total sales revenues of the product of interest
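The four steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the models actually tested: the three candidate algorithms, the use of mean absolute percentage error (MAPE) as the accuracy measure, the chronological holdout of the last rows, and all names (`MeanBaseline`, `NearestNeighbour`, `SingleFeatureLinear`, `select_best`) are hypothetical choices made for the example. Each feature row is assumed to hold the sentiment percentages for one period, with non-zero sales as the target.

```python
from statistics import mean


class MeanBaseline:
    """Always predicts the average of the training sales."""
    def fit(self, X, y):
        self.value = mean(y)
        return self

    def predict(self, x):
        return self.value


class NearestNeighbour:
    """Predicts the sales of the most similar training period."""
    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, x):
        dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in self.X]
        return self.y[dists.index(min(dists))]


class SingleFeatureLinear:
    """Ordinary least squares on the first feature only."""
    def fit(self, X, y):
        xs = [row[0] for row in X]
        mx, my = mean(xs), mean(y)
        var = sum((v - mx) ** 2 for v in xs)
        cov = sum((v - mx) * (t - my) for v, t in zip(xs, y))
        self.slope = cov / var if var else 0.0
        self.intercept = my - self.slope * mx
        return self

    def predict(self, x):
        return self.intercept + self.slope * x[0]


def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    return mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))


def select_best(candidates, X, y, holdout=3):
    """Train every candidate, score it on the last `holdout` periods,
    and return the winner's name, all scores, and the winner refit on all data."""
    X_train, y_train = X[:-holdout], y[:-holdout]
    X_test, y_test = X[-holdout:], y[-holdout:]
    scores = {}
    for name, factory in candidates.items():
        model = factory().fit(X_train, y_train)
        scores[name] = mape(y_test, [model.predict(x) for x in X_test])
    best = min(scores, key=scores.get)
    return best, scores, candidates[best]().fit(X, y)


CANDIDATES = {
    "mean": MeanBaseline,
    "nearest": NearestNeighbour,
    "linear": SingleFeatureLinear,
}
```

Holding out the most recent periods, rather than a random sample, keeps the evaluation faithful to a forecasting setting in which only past data is available at prediction time.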
We followed this prediction process to forecast the total revenue generated by sales of the McDonald’s Big Mac meal. The first part of the process is to build a dataset to train the model and to gauge its accuracy on historical data before going live. The dataset is divided into two parts:
- The first part consists of features derived from people's feedback on social media. These features contain the percentage of positive, negative, and neutral feedback for a time span and timestamp defined by the user.
- The second part contains the sales or turnover for the same timestamps as the first part. This feature holds the sales figures provided by the customer.
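As a rough illustration of this two-part dataset, the sketch below groups already-labelled feedback into periods, computes the percentage of each sentiment class, and attaches the sales figure for the same period. The quarterly period key, the field names, and the function names `quarter_key` and `build_dataset` are assumptions made for the example; any other timestamp could be used by passing a different key function.

```python
from collections import Counter, defaultdict
from datetime import date


def quarter_key(d: date) -> str:
    """Map a date to a sortable quarter label such as '2020-Q1'."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def build_dataset(labelled_posts, sales_by_period, period_key=quarter_key):
    """Combine sentiment percentages and sales into one row per period.

    labelled_posts: iterable of (date, label) pairs, where label is
        'positive', 'neutral', or 'negative'.
    sales_by_period: dict mapping a period label to its sales figure.
    Periods without a sales figure are skipped.
    """
    counts = defaultdict(Counter)
    for posted_on, label in labelled_posts:
        counts[period_key(posted_on)][label] += 1

    rows = []
    for period in sorted(counts):
        if period not in sales_by_period:
            continue
        total = sum(counts[period].values())
        rows.append({
            "period": period,
            "pct_positive": 100.0 * counts[period]["positive"] / total,
            "pct_neutral": 100.0 * counts[period]["neutral"] / total,
            "pct_negative": 100.0 * counts[period]["negative"] / total,
            "sales": sales_by_period[period],
        })
    return rows


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    posts = [
        (date(2020, 1, 15), "positive"),
        (date(2020, 2, 1), "positive"),
        (date(2020, 3, 30), "negative"),
        (date(2020, 3, 31), "neutral"),
    ]
    print(build_dataset(posts, {"2020-Q1": 100.0}))
```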
Once the dataset is formed, it is used to train the Machine Learning (ML) algorithms. For the McDonald’s Big Mac application, we built a small dataset (15 rows) containing McDonald’s sales from January 2016 until March 2020. Predicted sales are reported as averages over three-month periods, and we predicted the average sales for the three months following March 2020. The table below shows the performance of each Machine Learning (ML) algorithm that we tested, including the algorithm with the highest accuracy.
Table 1: Performance of various Machine Learning (ML) algorithms on the social media dataset. The Bagging algorithm performed best and predicted sales revenue of $6.023M for the McDonald’s Big Mac.
We combined the models studied for Facebook, Instagram, and Twitter into a desktop application in which the user selects the dataset parameters, including the start date, end date, and time period, to get an estimate of the sales revenue for any product the user wants to forecast.
Conclusions and recommendations
This use case enumerated the steps needed to build a Machine Learning (ML) model that predicts the sales of commercial products from social media content. All results presented in this case study are based on sentiment analysis of social media feedback. We built a desktop app that selects the optimal Machine Learning (ML) algorithm and predicts the sales of a given product with an accuracy approaching 90%. We are currently studying whether adding sales information from competitors improves the model's accuracy.
Did we spark your interest?
Are you facing similar challenges in your company? Do you have any questions about the case study described? If you are interested in an in-depth discussion, write us an email or give us a call. We would be happy to present our consulting products and digital solutions to you in a personal discussion. We look forward to meeting you!
Dr. Mokhtar Sadok
Partner, Business Analytics / Artificial Intelligence / Big Data Practice
Strategy & Transformation Consulting