Implementation and Evaluation of a MLaaS for Document Classification with Continuous Deep Learning Models / Walter-Tscharf, Franz Frederik Walter Viktor. - ELECTRONIC. - (2022), pp. 229-239. [10.1007/978-3-031-11232-4_20]

Implementation and Evaluation of a MLaaS for Document Classification with Continuous Deep Learning Models

Walter-Tscharf, Franz Frederik Walter Viktor
2022-01-01

Abstract

This paper presents a continuous training pipeline for improving deep learning models and assesses its feasibility through an evaluation. The purpose of the research is to analyze how a continuously learning neural network affects the quality of document classification when user feedback is taken into account. The hypothesis is that user feedback, incorporated through active learning, increases precision and thus makes document classification more efficient. To this end, the available technologies are identified by means of a utility analysis, and the necessary ones are selected to design a software concept. The design uses TensorFlow as the deep learning framework, Tesseract as the OCR engine, and Apache Airflow for life cycle management and for orchestrating the elements of the continuous training pipeline. This machine learning as a service (MLaaS) prototype makes it possible to explore the synergy between active learning, in the form of user feedback, and the quality of document classification achieved by deep learning. In an experiment, the implemented service is used to analyze the models' behavior in three different states. These cover synthetic data and active learning in the form of user feedback, drawn from data augmentation and from simulated realistic data. The results show that models enhanced by active learning achieve higher accuracy than models built from artificially generated data. The evaluation confirms the hypothesis that continuously learning models with user feedback generalize better in document classification. In conclusion, the paper sets out the technical requirements for implementing a machine learning as a service and affirms that active learning can be integrated into existing industrial systems.
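The feedback loop summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `KeywordClassifier` below is a hypothetical toy stand-in for the TensorFlow model, the sample documents and labels are invented for illustration, and OCR (Tesseract) and orchestration (Apache Airflow) are left out entirely. The sketch only shows the general idea of retraining a classifier once user-corrected labels are added to the training set.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption made for this sketch)."""
    return text.lower().split()


class KeywordClassifier:
    """Hypothetical toy stand-in for the deep learning model: per-label word counts."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)

    def fit(self, samples):
        # Retrain from scratch on the full list of (text, label) pairs.
        self.word_counts = defaultdict(Counter)
        for text, label in samples:
            self.word_counts[label].update(tokenize(text))
        return self

    def predict(self, text):
        tokens = tokenize(text)
        scores = {
            label: sum(counts[token] for token in tokens)
            for label, counts in self.word_counts.items()
        }
        # Ties are broken alphabetically so the result is deterministic.
        return max(sorted(scores), key=scores.get)


def accuracy(model, samples):
    hits = sum(model.predict(text) == label for text, label in samples)
    return hits / len(samples)


def retrain_with_feedback(model, training_set, feedback):
    """Append user-corrected samples to the training set and refit the model."""
    training_set.extend(feedback)
    return model.fit(training_set)


# Invented example data: initial synthetic samples, held-out documents,
# and user feedback in the form of corrected labels.
synthetic = [
    ("invoice total amount due", "invoice"),
    ("dear sir regards letter", "letter"),
]
holdout = [
    ("payment reminder balance outstanding", "invoice"),
    ("kind greetings from your friend", "letter"),
]
feedback = [
    ("reminder payment outstanding balance", "invoice"),
    ("greetings friend", "letter"),
]

training_set = list(synthetic)
model = KeywordClassifier().fit(training_set)
before = accuracy(model, holdout)

model = retrain_with_feedback(model, training_set, feedback)
after = accuracy(model, holdout)

print(f"accuracy before feedback: {before:.2f}, after feedback: {after:.2f}")
```

On this invented toy data the held-out accuracy rises from 0.50 to 1.00 after the feedback round. The numbers only illustrate the mechanism and say nothing about the results reported in the paper. In a pipeline like the one described, the retraining step would presumably be triggered by the orchestrator rather than called inline.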
2022
Advances in Architecture, Engineering and Technology
Switzerland
Springer, Cham
978-3-031-11231-7
978-3-031-11232-4
Files in this item:
978-3-031-11232-4.pdf (Adobe PDF, 1.39 MB; access restricted to archive managers)
Type: Publisher’s version (Publisher’s layout)
License: All rights reserved

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/363222