A Flat-Hierarchical Approach Based on Machine Learning Model for e-Commerce Product Classification

Author(s)
Harold Enrique Cotacallapa Mamani
Paulo Canas Rodrigues
Rodrigo Salas
Date Issued
January 1, 2024
Type
Article
Volume
12
Start Page
72730
End Page
72745
DOI
10.1109/access.2024.3400693
Abstract
In e-commerce, product classification directly affects operational efficiency and profitability, so automating it well matters. Machine learning algorithms are a leading way to automate this process. Such models are commonly designed with either a flat or a local (hierarchical) approach, and each has significant limitations: the local approach introduces taxonomic inconsistencies in its predictions, whereas the flat approach becomes inefficient on large datasets with high granularity. Our research therefore introduces a solution for hierarchical product classification based on a machine learning model that integrates the flat and local (hierarchical) approaches, using a 4-level electronic product dataset obtained from a renowned e-commerce platform in Latin America. To this end, we conducted a comparative analysis of seven machine learning algorithms: Multinomial Naive Bayes, Linear Support Vector Classifier, Multinomial Logistic Regression, Random Forest, XGBoost, FastText, and Voting Ensemble. The hybrid model performs better than models using a single approach: measured by weighted F1-score, it surpassed the top-performing flat model by 0.15% and the leading local model (Local Classifier per Level) by 4.88%. This paper also contributes a Spanish-language dataset of over one million products, discusses the preprocessing techniques tailored to that dataset, and addresses the study's limitations and avenues for future work.
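To make the taxonomic-inconsistency point concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical toy taxonomy (`PARENT`) and two illustrative helpers, `path_from_leaf` and `is_consistent`, none of which appear in the paper. It shows why a path derived from a single flat leaf prediction is always valid, while levels predicted independently (as a Local Classifier per Level might) can combine into a path that does not exist in the taxonomy.

```python
# Hypothetical toy taxonomy (an assumption for illustration only):
# child category -> parent category, with None marking the root level.
PARENT = {
    "Electronics": None,
    "Audio": "Electronics",
    "Computing": "Electronics",
    "Headphones": "Audio",
    "Laptops": "Computing",
}


def path_from_leaf(leaf):
    """Flat-style output: given only a predicted leaf, rebuild the root-to-leaf path."""
    path = []
    node = leaf
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]


def is_consistent(path):
    """Return True if the path starts at a root and each level is a child of the previous one."""
    if not path or PARENT.get(path[0], "missing") is not None:
        return False
    return all(PARENT.get(child) == parent for parent, child in zip(path, path[1:]))


# A path derived from one leaf prediction is consistent by construction.
flat_path = path_from_leaf("Headphones")

# Levels predicted independently can disagree: "Headphones" is not under "Computing".
per_level_path = ["Electronics", "Computing", "Headphones"]

print(flat_path, is_consistent(flat_path))
print(per_level_path, is_consistent(per_level_path))
```

A check like `is_consistent` is one simple way to count how often per-level predictions violate the hierarchy; how the paper's hybrid model resolves such conflicts is not described in the abstract.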
Subjects

Computer science

Machine learning

Artificial intelligen...

Random forest

Naive Bayes classifie...

Support vector machin...

Data mining

Classifier (UML)

Physical Sciences Com...
