October 21, 2024

Automating product feature categorization with AI for an automotive components manufacturer


About our client

Our client is a leading provider of innovative cockpit electronics and electrification solutions. With a global presence and a focus on cutting-edge technology, they design, manufacture, and sell multiple automobile components.

The client needed an automated system to classify product feature descriptions, supplied by car manufacturers, into specific primary and secondary disciplines. The growing complexity and volume of this textual data had made manual classification both time-consuming and error-prone.


The issue at hand

·    Manual classification effort:
Previously, the client relied on a manual process to classify textual product features provided by manufacturers. This process was labor-intensive, prone to human error, and consumed a significant amount of time given the large volume of data.

·    Scalability issues:
As the volume of product feature data grew, the manual classification process became unsustainable. The client could not scale it to handle more data without hiring additional staff.

·    Inconsistent accuracy:
Due to the complexity of the product features and the subjective nature of manual classification, the accuracy of the classification was inconsistent. This variability led to inefficiencies in product management and data analysis.


How we stepped in with an innovative AI solution

Our client needed to automate the classification of product features (textual statements from car manufacturers) into predefined primary and secondary disciplines. To meet this need, we developed an NLP and ML system that categorizes product features into disciplines using the following workflow:

1.      Data preparation:

·       Gathered historical textual data with corresponding labels.

·       Cleaned and tokenized the text data, then generated embeddings using OpenAI models so that each statement had a structured numeric representation for the machine learning models.

·       Addressed imbalances in the dataset using data balancing techniques and by assigning appropriate class weights.

2.      Model development and training:

·       Developed and trained a machine learning model to classify product features, with a target accuracy of 95%.

·       Used Azure Synapse to store and process the training data.

·       Serialized the text classification models so that the trained model configurations could be deployed easily.

3.      Deployment and inference:

·       Stored the trained model in Blob Storage for future retrieval.

·       Used the model to classify new product features, and stored the predictions back in Blob Storage for analysis and further use.

·       Made the prediction results, stored in CSV format, easy to retrieve with Azure Storage Explorer.
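
The class-weighting step in the data preparation stage can be illustrated with a short sketch. The case study does not say which weighting scheme was used, so this example assumes the common "balanced" heuristic, where each class weight is n_samples / (n_classes * class_count). The discipline names are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {label: weight} with weight = n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Hypothetical discipline labels; the real label set is not given in the case study.
labels = ["Software"] * 6 + ["Hardware"] * 3 + ["Mechanical"] * 1
weights = balanced_class_weights(labels)

for label, weight in sorted(weights.items()):
    print(f"{label}: {weight:.3f}")
```

Rare disciplines receive larger weights, so misclassifying them costs more during training than misclassifying a frequent discipline.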


About the solution flow

Figure 1: Components of the AI-driven solution

Data collection and storage:

·       Historical data was collected and stored in Blob Storage using Azure Storage Explorer.

·       The storage system allowed for seamless retrieval of training data when required for model building.

Data retrieval and preprocessing:

·       Data was retrieved from Blob Storage to Azure Synapse using Jupyter Notebooks for preprocessing.

·       Preprocessing steps included cleaning, aggregation, and lemmatization of text data, ensuring it was suitable for machine learning models.
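
As a rough illustration of the cleaning step, the sketch below normalizes case, punctuation, and whitespace using only the Python standard library. The exact cleaning rules are not described in the case study, so the rules here (ASCII-only text, dropping duplicates) are assumptions, and the sample statements are invented. Lemmatization is not shown because it would require an NLP library.

```python
import re
import unicodedata


def clean_feature_text(text):
    """Lowercase a statement and strip punctuation and extra whitespace."""
    # Normalize Unicode so visually identical characters compare equal.
    text = unicodedata.normalize("NFKC", text).lower()
    # Replace anything that is not a lowercase letter, digit, or whitespace.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Collapse runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def clean_and_deduplicate(texts):
    """Clean each statement and drop empty or repeated results, keeping order."""
    seen = set()
    cleaned = []
    for raw in texts:
        text = clean_feature_text(raw)
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned


# Invented sample statements, for illustration only.
samples = [
    "The HUD shall display speed (km/h).",
    "the hud shall display speed  (KM/H)",
    "   ",
]
print(clean_and_deduplicate(samples))
```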

Data transformation:

·       Text data was transformed into tokenized vectors and numeric labels for input into machine learning models.

·       Embeddings were generated using OpenAI models, giving the classifier a numeric representation of the meaning of each statement.
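
The conversion to numeric labels and token ids can be sketched as follows. This is a minimal stand-in that assumes whitespace tokenization and a small in-memory vocabulary; it does not call any embedding service, and the rows and discipline names are hypothetical.

```python
def build_index(values):
    """Map each distinct value to a stable integer id (sorted for reproducibility)."""
    return {value: idx for idx, value in enumerate(sorted(set(values)))}


def encode_tokens(text, vocab, unknown_id=0):
    """Turn whitespace-separated tokens into ids; unseen tokens map to unknown_id."""
    return [vocab.get(token, unknown_id) for token in text.split()]


# Hypothetical rows: (cleaned text, primary discipline, secondary discipline).
rows = [
    ("display vehicle speed", "Software", "HMI"),
    ("mount display bracket", "Mechanical", "Housing"),
]

primary_ids = build_index(primary for _, primary, _ in rows)
secondary_ids = build_index(secondary for _, _, secondary in rows)

# Token ids start at 1 so that 0 stays reserved for unknown tokens.
tokens = {tok for text, _, _ in rows for tok in text.split()}
vocab = {tok: i for i, tok in enumerate(sorted(tokens), start=1)}

print(primary_ids)
print(encode_tokens("display vehicle speed", vocab))
```

Keeping separate indexes for primary and secondary disciplines lets each level be predicted and decoded independently.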

Model training and evaluation:

·       Various machine learning models were trained, tuned, and evaluated for performance.

·       The final model was serialized into a pickle file to ensure its reuse for future classifications.
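
To make the train, evaluate, and serialize cycle concrete, here is a minimal sketch. The case study does not name the final model, so a simple nearest-centroid classifier stands in for it, and the two-dimensional vectors are toy substitutes for real text embeddings.

```python
import math
import pickle


def train_centroids(vectors, labels):
    """Average the vectors of each class to get one centroid per label."""
    sums, counts = {}, {}
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [x / counts[label] for x in acc] for label, acc in sums.items()}


def predict(model, vec):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: math.dist(model[label], vec))


def accuracy(model, vectors, labels):
    """Fraction of vectors whose predicted label matches the true label."""
    correct = sum(predict(model, v) == y for v, y in zip(vectors, labels))
    return correct / len(labels)


# Toy 2-D vectors standing in for text embeddings (hypothetical data).
train_x = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_y = ["Software", "Software", "Hardware", "Hardware"]
model = train_centroids(train_x, train_y)

# Serialize to bytes and restore; in practice the bytes would be written to a file.
restored = pickle.loads(pickle.dumps(model))

held_out_x = [[0.7, 0.3], [0.3, 0.7]]
held_out_y = ["Software", "Hardware"]
print(accuracy(restored, held_out_x, held_out_y))
```

Evaluating the restored model rather than the in-memory one confirms that the serialized artifact behaves the same as the model that was trained.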

Prediction and inference:

·       The system was designed to accept new product feature data and classify it using the pre-trained model.

·       The inference results were stored back into Blob Storage and were available for download in CSV format for analysis.
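
The inference step can be sketched as a function that classifies each new statement and writes one CSV row per prediction. The column names and the keyword rule below are hypothetical; the rule is only a placeholder for the trained model, and the upload to Blob Storage is not shown.

```python
import csv
import io


def write_predictions(features, classify, out):
    """Classify each feature and write one CSV row per prediction."""
    writer = csv.writer(out)
    writer.writerow(["feature", "primary_discipline", "secondary_discipline"])
    for feature in features:
        primary, secondary = classify(feature)
        writer.writerow([feature, primary, secondary])


def keyword_classifier(feature):
    """Placeholder rule standing in for the trained model (hypothetical)."""
    if "display" in feature.lower():
        return "Software", "HMI"
    return "Hardware", "Unassigned"


# An in-memory buffer stands in for the CSV file that would be stored.
buffer = io.StringIO()
write_predictions(
    ["Display shows speed, in km/h", "Connector rated to 12 V"],
    keyword_classifier,
    buffer,
)
csv_text = buffer.getvalue()
print(csv_text)
```

Passing the classifier in as a function keeps the CSV-writing logic independent of whichever model is loaded at inference time.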

Result storage and retrieval:

·       Prediction results were stored securely in Blob Storage.

·       Results were accessible through Azure Storage Explorer, ensuring ease of access for future analysis or reporting.


How the solution yielded business value

·       Improved efficiency: The automated system significantly reduced the time and effort required for product feature classification, cutting human dependency by around 90%.

·       High accuracy: The model achieved close to 90% accuracy, ensuring reliable classification of product features into predefined categories.

·       Scalability: The solution was scalable, allowing the client to handle a larger volume of textual data without compromising speed or accuracy.

·       Cost savings: Automation removed most of the need for manual classification, resulting in substantial cost savings over time.

These results underscore the success of the NLP solution in transforming the organization’s approach to data management. By using advanced AI technologies, the client not only optimized its internal processes but also positioned itself for greater efficiency and competitiveness in the market.


For any further inquiries, contact Sales@MAQSoftware.com to see how AI can transform your business, improve productivity, and accelerate your delivery.