About our client
Our client is a leading provider of innovative cockpit electronics and electrification solutions. With a global presence and a focus on cutting-edge technology, they design, manufacture, and sell multiple automobile components.
They required an automated system to streamline the classification of product feature descriptions provided by car manufacturers into specific primary and secondary disciplines. The need arose from the increasing complexity and volume of textual data. These factors made manual classification both time-consuming and prone to errors.
The issue at hand
· Manual classification effort:
Previously, the client relied on a manual process to classify textual product
features provided by manufacturers. This process was labor-intensive, prone to
human error, and consumed a significant amount of time given the large volume
of data.
· Scalability issues:
As the volume of product feature data grew, the manual classification system
became unsustainable. The client faced difficulties in scaling the process to
accommodate increasing amounts of data without hiring additional resources.
· Inconsistent accuracy:
Due to the complexity of the product features and the subjective nature of
manual classification, the accuracy of the classification was inconsistent.
This variability led to inefficiencies in product management and data analysis.
How we stepped in with an innovative AI solution
To automate the classification of product features (textual statements from car manufacturers) into predefined primary and secondary disciplines, we developed an NLP and ML system that follows this workflow:
1. Data preparation:
· Gathered historical textual data with corresponding labels.
· Preprocessed text data by cleaning, tokenizing, and generating embeddings using OpenAI models to ensure a structured format for machine learning models.
· Addressed imbalances in the dataset using data balancing techniques and by assigning appropriate class weights.
2. Model development and training:
· Developed and trained a machine learning model to classify product features, with a target accuracy of 95%.
· Used Azure Synapse to store and process the training data.
· Serialized the text classification models so that the trained model configurations could be easily deployed later.
3. Deployment and inference:
· The trained model was stored in Blob Storage for future retrieval.
· The model was used to classify new product features, and predictions were stored back in Blob Storage for analysis and further use.
· The system allowed for easy retrieval of prediction results, stored in CSV format, using Azure Storage Explorer.
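The class-weighting idea from the data preparation step can be illustrated with a minimal sketch. It assumes the common "balanced" heuristic (total samples divided by the number of classes times the class count), which is the same formula scikit-learn uses for its `balanced` option. The case study does not state which weighting scheme was actually used, and the discipline names below are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class: n_samples / (n_classes * class_count).

    Rare classes receive larger weights so that errors on them count more
    during training.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Hypothetical, deliberately imbalanced discipline labels (illustration only).
labels = ["Software"] * 6 + ["Hardware"] * 3 + ["Mechanical"] * 1
weights = balanced_class_weights(labels)

for label, weight in sorted(weights.items()):
    print(f"{label}: {weight:.3f}")
```

Weights like these are typically passed to the classifier's loss function, so the rarest discipline is not ignored in favor of the most frequent one.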
About the solution flow
Figure 1: Components of the AI-driven solution
Data collection and storage:
· Historical data was collected and stored in Blob Storage using Azure Storage Explorer.
· The storage system allowed for seamless retrieval of training data when required for model building.
Data retrieval and preprocessing:
· Data was retrieved from Blob Storage into Azure Synapse using Jupyter Notebooks for preprocessing.
· Preprocessing steps included cleaning, aggregation, and lemmatization of text data, ensuring it was suitable for machine learning models.
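As a rough illustration of the cleaning step, the sketch below normalizes a raw feature statement using only the Python standard library. The function name `clean_text` and the sample sentence are hypothetical. Aggregation and lemmatization are not shown, since they depend on the client's data and on an NLP library the case study does not name.

```python
import re
import unicodedata


def clean_text(text):
    """Lowercase the text, drop punctuation, and collapse whitespace."""
    # Normalize Unicode variants (e.g. full-width characters) before matching.
    text = unicodedata.normalize("NFKC", text).lower()
    # Replace anything that is not a lowercase letter, digit, or whitespace.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Collapse runs of whitespace into single spaces and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical product feature statement (illustration only).
raw = "  The HVAC unit SHALL   pre-heat the cabin (max. 5 min)! "
cleaned = clean_text(raw)
print(cleaned)
```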
Data transformation:
· Text data was transformed into tokenized vectors and numeric labels for input into machine learning models.
· Embeddings were generated using advanced NLP techniques, enabling the system to understand and classify textual data effectively.
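The transformation into token IDs and numeric labels can be sketched as follows. This is a simplified stand-in, assuming whitespace tokenization and a vocabulary built from the training texts. The project generated embeddings with OpenAI models, which is not reproduced here, and the sample texts and discipline names are hypothetical.

```python
def build_index(items):
    """Map each distinct item to an integer ID, assigned in sorted order."""
    return {item: idx for idx, item in enumerate(sorted(set(items)))}


# Hypothetical cleaned feature texts and their discipline labels.
texts = ["display shows vehicle speed", "battery charging status display"]
labels = ["Cockpit Electronics", "Electrification"]

# Whitespace tokenization (an assumption; real pipelines often use subwords).
tokenized = [text.split() for text in texts]

vocab = build_index(token for doc in tokenized for token in doc)
label_ids = build_index(labels)

# Each text becomes a list of token IDs; each label becomes one integer.
token_vectors = [[vocab[token] for token in doc] for doc in tokenized]
numeric_labels = [label_ids[label] for label in labels]

print(token_vectors)
print(numeric_labels)
```

Sorting before assigning IDs keeps the mapping deterministic, so the same vocabulary and label IDs can be rebuilt at inference time.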
Model training and evaluation:
· Various machine learning models were trained, tuned, and evaluated for performance.
· The final model was serialized into a pickle file to ensure its reuse for future classifications.
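Serializing a model to a pickle file and loading it back can look like the sketch below. The "model" here is a plain dictionary of made-up parameters standing in for the trained classifier, and the file name `classifier.pkl` is hypothetical. In the described solution the file would then be uploaded to Blob Storage, a step that is omitted here.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model: hypothetical labels and parameters.
model = {
    "labels": ["Hardware", "Software"],
    "weights": [[0.2, -0.1], [-0.3, 0.4]],
}

# Write the model to a pickle file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "classifier.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)

# Later (for example, at inference time), load the model back from disk.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == model)
```

Pickle files can execute arbitrary code when loaded, so they should only be loaded from trusted storage locations.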
Prediction and inference:
· The system was designed to input new product data and classify it using the pre-trained model.
· The inference results were stored back into Blob Storage and were available for download in CSV format for analysis.
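Writing inference results to CSV might be sketched as below. The column names (`feature_text`, `primary_discipline`, `secondary_discipline`) and the sample rows are assumptions for illustration, since the case study does not describe the actual output schema. The upload to Blob Storage is left out.

```python
import csv
import io

# Hypothetical output columns; the real schema is not given in the source.
FIELDNAMES = ["feature_text", "primary_discipline", "secondary_discipline"]


def predictions_to_csv(rows):
    """Serialize prediction rows (a list of dicts) into CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up predictions for two feature statements.
predictions = [
    {
        "feature_text": "display shows vehicle speed",
        "primary_discipline": "Software",
        "secondary_discipline": "Hardware",
    },
    {
        "feature_text": "battery charging status, shown on cluster",
        "primary_discipline": "Hardware",
        "secondary_discipline": "Software",
    },
]

csv_text = predictions_to_csv(predictions)
print(csv_text)
```

Using `csv.DictWriter` rather than joining strings by hand ensures that feature text containing commas or quotes is escaped correctly.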
Result storage and retrieval:
· Prediction results were stored securely in Blob
Storage.
· Results were accessible through Azure Storage Explorer, ensuring ease of access for future analysis or reporting.
How the solution yielded business value
· Improved efficiency: The automated system significantly reduced the time and effort required for product feature classification, cutting human dependency by around 90%.
· High accuracy: The model achieved close to 90% accuracy, ensuring reliable classification of product features into predefined categories.
· Scalability: The solution scales with data volume, allowing the client to handle larger amounts of textual data without compromising speed or accuracy.
· Cost savings: Automation sharply reduced the need for manual classification, resulting in substantial cost savings over time.
These results underscore the success of the NLP solution in transforming the client's approach to data management. By using advanced AI technologies, the client not only optimized its internal processes but also positioned itself for greater efficiency and competitiveness in the market.
For any further inquiries, contact Sales@MAQSoftware.com to see how AI can transform your business, improve productivity, and accelerate your delivery.