July 9, 2018

Retailer Reduces Costs by Automating Customer Feedback

Key Challenges

   Develop return themes from a static list of predefined categories based on business requirements.
   Detect typing errors and comments that need to be translated.
   Identify and weigh the relative importance of each word in a comment.

Business Case

Our client runs a worldwide chain of retail stores and an online shopping site. The client’s stores sell billions of dollars in first-party products annually. Its retail locations provide a great shopping experience for premium computer products. 

Our client needed to understand why customers were returning retail products. The client also needed to parse customer comments that were gathered in many languages in retail stores across the globe. If recurring patterns in the reasons for customer returns could be identified, the client could better address problems faced by the users of these products. The client would then be able to focus on specific areas where improvements were needed in the first-party product.

Prior to working with MAQ Software, the client kept records of customer comments relating to returns but had not analyzed them for any patterns. MAQ Software worked with the client’s Marketing Operations team and used machine learning models to review the customer data for useful insights.

MAQ Software collected the comments and translated them using case statements in SQL, then preprocessed them further using Python and R libraries. We then developed a web service to display a word tree map of client return themes.

Key Highlights

   Translated and categorized comments using case statements in SQL.
   Developed a custom Python script to correct typing errors and create a new corrected comments column in the dataset.
   Used R libraries to calculate the frequency and importance of each word.
   Used a support vector machine (SVM) model to identify themes.

MAQ Software imported the required data from the Azure server and selected the columns required for training the model. At the SQL level, we then preprocessed the translated data and categorized the comments into themes based on the rules identified by the business owner. Even after the categorization, there were many comments for which we could not identify themes. We categorized these comments manually to provide a rich data set for training the model.
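
The rule-based categorization can be pictured as an ordered list of keyword rules, much like a SQL CASE expression in which the first matching WHEN branch wins. The following is a minimal Python sketch of that idea, not the SQL used in the project. The theme names, keywords, and the `categorize` function are hypothetical.

```python
from typing import Optional

# Hypothetical keyword rules; the real rules came from the business owner.
# Order matters: as in a SQL CASE expression, the first matching rule wins.
THEME_RULES = [
    ("Price", ("cheaper", "lower price", "price match")),
    ("Hardware failure", ("broken", "not working", "won't turn on")),
]


def categorize(comment: str) -> Optional[str]:
    """Return the first theme whose keyword appears in the comment, else None."""
    text = comment.lower()
    for theme, keywords in THEME_RULES:
        if any(keyword in text for keyword in keywords):
            return theme
    return None  # No rule matched; such comments are left for manual labeling.


comments = [
    "Found it cheaper at another store",
    "Screen is broken after two days",
    "Changed my mind",
]
labels = [categorize(c) for c in comments]
```

Comments that come back as `None` correspond to the ones that had to be categorized manually before training.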

Once the SQL preprocessing was done, we used Python libraries to clean the comments data. We developed a custom Python script that performed a spell check on the comments and corrected typographical errors. We also used autocorrect libraries based on words in a linked dictionary.
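
The article does not name the spell-check or autocorrect libraries that were used. As a rough illustration of dictionary-based correction, the sketch below uses Python's standard `difflib` module as a stand-in. The vocabulary, the similarity cutoff, and the function names are assumptions.

```python
import difflib
import re

# Hypothetical dictionary; a real one would be far larger.
VOCABULARY = ["battery", "screen", "keyboard", "price", "broken",
              "the", "is", "too", "high"]


def correct_word(word, vocabulary=VOCABULARY, cutoff=0.8):
    """Return the closest dictionary word, or the word itself if none is close."""
    if word in vocabulary:
        return word
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_comment(comment):
    """Lowercase the comment, split it into words, and correct each word."""
    words = re.findall(r"[a-z']+", comment.lower())
    return " ".join(correct_word(w) for w in words)


# Build a "corrected comment" value next to the original, as a new column would.
rows = [{"comment": "The screeen is brokn"}]
for row in rows:
    row["corrected_comment"] = correct_comment(row["comment"])
```

Words with no sufficiently close dictionary entry are kept unchanged, so product names and other out-of-dictionary terms are not rewritten.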

Next, we processed the corrected comments using the text preprocessing libraries in R to identify word frequency and remove words with very high and very low frequencies. We then calculated the term frequency-inverse document frequency (TF-IDF) of each word to determine its relative importance within the comments.
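
The project used R libraries for this step. The Python sketch below only illustrates the two ideas: dropping words that appear in too many or too few comments, then scoring the remaining words with TF-IDF. It assumes one common TF-IDF variant (term count divided by comment length, multiplied by the natural log of total comments over comments containing the word). The thresholds and function names are hypothetical.

```python
import math
from collections import Counter


def filter_vocabulary(docs, min_df=2, max_df_ratio=0.5):
    """Keep words found in at least min_df comments and at most max_df_ratio of them."""
    df = Counter(word for doc in docs for word in set(doc))
    n = len(docs)
    return {w for w, c in df.items() if c >= min_df and c / n <= max_df_ratio}


def tf_idf(docs, vocabulary):
    """Return one {word: score} dict per comment, using only words in vocabulary."""
    n = len(docs)
    df = Counter(w for doc in docs for w in set(doc) if w in vocabulary)
    scores = []
    for doc in docs:
        counts = Counter(w for w in doc if w in vocabulary)
        total = sum(counts.values())  # Length of the comment after filtering.
        scores.append({w: (c / total) * math.log(n / df[w])
                       for w, c in counts.items()})
    return scores


docs = [
    "the screen is broken".split(),
    "the price is too high".split(),
    "the battery is broken".split(),
    "found a lower price".split(),
]
vocabulary = filter_vocabulary(docs)
scores = tf_idf(docs, vocabulary)
```

In this toy data, "the" and "is" are dropped as too frequent and words seen only once are dropped as too rare, leaving "broken" and "price" as the scored terms.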

We used the preprocessed dataset to train a model. We split the dataset into a training dataset (used to train the model) and a validation dataset (used to score the trained model). We trained a support vector machine (SVM) model on the TF-IDF scores of the comments. We found that this model identified most themes accurately. The model was then finalized and deployed as a web service.
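
The article does not say which SVM implementation or kernel was used. As an illustration only, the sketch below trains a binary linear SVM (without a bias term, for brevity) by sub-gradient descent on the regularized hinge loss, then scores it on a held-out validation split. The two-feature vectors stand in for TF-IDF scores of the hypothetical terms "price" and "broken", and all names, data, and hyperparameters are assumptions. A model with many themes would need one such classifier per theme or a library implementation.

```python
def dot(w, x):
    return sum(w_j * x_j for w_j, x_j in zip(w, x))


def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=100):
    """Sub-gradient descent on the L2-regularized hinge loss; labels are -1 or +1."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            margin = y_i * dot(w, x_i)
            # Regularization step: shrink the weights slightly.
            w = [(1.0 - eta * lam) * w_j for w_j in w]
            # Hinge-loss step: only samples inside the margin move the weights.
            if margin < 1.0:
                w = [w_j + eta * y_i * x_j for w_j, x_j in zip(w, x_i)]
    return w


def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1


# Hypothetical samples: ([tfidf_price, tfidf_broken], label),
# where +1 means a price theme and -1 means a hardware theme.
samples = [
    ([0.8, 0.0], 1), ([0.0, 0.8], -1), ([0.7, 0.0], 1),
    ([0.0, 0.7], -1), ([0.9, 0.0], 1), ([0.0, 0.6], -1),
]

# Deterministic split: every third sample is held out for validation.
validation = samples[::3]
training = [s for i, s in enumerate(samples) if i % 3 != 0]

weights = train_linear_svm([x for x, _ in training], [y for _, y in training])
correct = sum(predict(weights, x) == y for x, y in validation)
accuracy = correct / len(validation)
```

The toy data is perfectly separable, so the validation accuracy here says nothing about the accuracy achieved in the project.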

New return comments were sent to the web service, which allowed our client to see themes for the comments along with the accuracy of prediction.

Business Outcome

Using MAQ Software’s return theme model, product managers were able to identify the top reasons for product returns. Customers were returning products either because they had found a lower price elsewhere or because they were experiencing hardware failures.

The returns analysis allowed the client to focus on the specific areas in pricing and product quality that were causing the most customer dissatisfaction. By pinpointing problem areas, the client was able to save thousands of work hours and potentially millions of dollars by reducing manufacturing errors and refurbishment costs.

Outcome Highlights

   Identified the source of most component failures in hardware returns.
   Identified customers’ primary non-hardware-related reasons for returning products.
   Allowed client to focus on the areas where they could most effectively reduce customer returns.
