---
license: mit
language:
- en
library_name: spacy
tags:
- named-entity-recognition
- b2b
- ecommerce
- order-processing
- product-extraction
- spacy
datasets:
- custom
metrics:
- f1
- precision
- recall
model-index:
- name: b2b-ecommerce-ner
  results:
  - task:
      type: token-classification
      name: Named Entity Recognition
    dataset:
      type: custom
      name: B2B Ecommerce Orders
    metrics:
    - type: f1
      value: 0.82
      name: F1 Score
    - type: precision
      value: 0.82
      name: Precision
    - type: recall
      value: 0.81
      name: Recall
---

# B2B Ecommerce NER Model

## Model Description

This is a Named Entity Recognition (NER) model trained specifically for B2B ecommerce order processing. The model extracts structured information from retailer-to-manufacturer order text, enabling automated order capture and processing.

## Supported Entities

The model identifies the following entity types:

- **PRODUCT**: Product names and descriptions (e.g., "Coca Cola", "Golden Dates", "Chocolate Cleanser")
- **QUANTITY**: Order quantities (e.g., "5", "10", "twenty")
- **SIZE**: Product sizes and measurements (e.g., "500ML", "250G", "1.25L")
- **UNIT**: Units of measurement (e.g., "units", "bottles", "packs")

## Features

- **High Accuracy**: Achieves an F1 score of 0.82 on B2B ecommerce order data
- **Product Catalog Matching**: Includes fuzzy matching against a comprehensive product catalog
- **Multi-language Support**: Handles mixed English/Hindi text common in Indian B2B commerce (experimental; see Limitations)
- **Real-world Patterns**: Trained on actual retailer order patterns and variations

## Usage

### Basic Usage

```python
from huggingface_model.model import B2BEcommerceNER

# Load the model
model = B2BEcommerceNER.from_pretrained("path/to/model")

# Extract entities from order text
results = model.predict(["Order 5 bottles of Coca Cola 650ML"])
print(results[0])
# Output: {
#   'text': 'Order 5 bottles of Coca Cola 650ML',
#   'entities': {
#     'products': [{'text': 'Coca Cola', 'label': 'PRODUCT', 'start': 19, 'end': 28}],
#     'quantities': [{'text': '5', 'label': 'QUANTITY', 'start': 6, 'end': 7}],
#     'sizes': [{'text': '650ML', 'label': 'SIZE', 'start': 29, 'end': 34}],
#     'units': [{'text': 'bottles', 'label': 'UNIT', 'start': 8, 'end': 15}],
#     'catalog_matches': [...]
#   }
# }
```

### Pipeline Usage

```python
from huggingface_model.model import pipeline

# Create NER pipeline
ner_pipeline = pipeline("ner", model="b2b-ecommerce-ner")

# Process text
entities = ner_pipeline("I need 10 packs of biscuits")
```

### Batch Processing

```python
# Process multiple orders at once
# (`model` is the B2BEcommerceNER instance loaded under Basic Usage)
orders = [
    "Order 5 Coke Zero 650ML",
    "Send 12 bottles of mango juice",
    "I need 3 units of Chocolate Cleanser 500ML"
]
results = model.predict(orders)
```

## Training Data

The model was trained on a dataset of 500 B2B ecommerce orders containing:

- Real retailer-to-manufacturer communications
- Mixed English/Hindi text patterns
- Various product categories (beverages, food items, personal care)
- Different order formats and structures
- 1,002 labeled entities across 4 entity types

## Model Performance

| Metric | Score |
|--------|-------|
| F1 Score | 0.82 |
| Precision | 0.82 |
| Recall | 0.81 |

The model shows strong performance across all entity types, with particularly good results on PRODUCT and QUANTITY recognition.

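For readers who want to reproduce scores like those in the table on their own annotated orders, here is a minimal sketch of span-level precision, recall, and F1. It assumes exact-match scoring, where a predicted entity counts as correct only if its start offset, end offset, and label all match a gold entity. This model card does not state how its reported scores were computed, so treat this as an illustration and not as the evaluation procedure that was used. The function name `span_prf` and the toy data are hypothetical.

```python
from typing import List, Set, Tuple

# One entity span: (start offset, end offset, label)
Span = Tuple[int, int, str]


def span_prf(gold: List[Set[Span]], pred: List[Set[Span]]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over a list of documents.

    Assumption: a prediction is a true positive only if start, end,
    and label are all identical to a gold span in the same document.
    """
    true_pos = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example for "Order 5 bottles of Coca Cola 650ML" (hypothetical prediction):
# the PRODUCT span is cut short, so 3 of the 4 predictions are correct.
gold = [{(6, 7, "QUANTITY"), (8, 15, "UNIT"), (19, 28, "PRODUCT"), (29, 34, "SIZE")}]
pred = [{(6, 7, "QUANTITY"), (8, 15, "UNIT"), (19, 23, "PRODUCT"), (29, 34, "SIZE")}]

p, r, f = span_prf(gold, pred)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
# precision=0.75 recall=0.75 f1=0.75
```

Exact-match scoring is strict: a partially correct span such as "Coca" instead of "Coca Cola" counts as both a false positive and a false negative, which is why a single boundary error lowers precision and recall together.
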
## Product Catalog Integration

The model includes a fuzzy matching system that can match extracted products against a catalog of 1,855+ products, providing:

- **Brand Matching**: Match to specific brands (e.g., "Coca Cola", "Ziofit")
- **Product Variants**: Find different sizes/variants of the same product
- **Confidence Scores**: Numerical confidence for each match (0-100)
- **SKU Mapping**: Direct mapping to product SKUs for order processing

## Limitations

- Performance may vary on product names not seen during training
- Best results with English text; mixed-language support is experimental
- Requires a product catalog file for fuzzy matching features
- Based on the spaCy framework, not transformer-based

## Technical Details

- **Framework**: spaCy 3.8+
- **Base Model**: en_core_web_sm
- **Training**: Custom NER component with 50 iterations
- **Entity Labels**: 4 custom entity types
- **Input**: Raw text strings
- **Output**: Structured entity information with optional catalog matching

## Installation

```bash
pip install spacy pandas fuzzywuzzy python-levenshtein
python -m spacy download en_core_web_sm
```

## Citation

```bibtex
@misc{b2b_ecommerce_ner_2025,
  title={B2B Ecommerce NER Model for Order Processing},
  author={Your Name},
  year={2025},
  howpublished={Hugging Face Model Hub},
  url={https://huggingface.co/your-username/b2b-ecommerce-ner}
}
```

## License

This model is released under the MIT License.