Journal article

Procode: A Machine-Learning Tool to Support (Re-)coding of Free-Texts of Occupations and Industries

Nenad Savic, Nicolas Bovio, Fabien Gilbert 1, José Paz, Irina Guseva Canu
1 IRSET-ESTER - Épidémiologie en Santé au Travail et Ergonomie
Irset - Institut de recherche en santé, environnement et travail
Abstract: Procode is a free web tool for the automatic coding of occupational data (free texts) that uses Complement Naïve Bayes (CNB) as its machine-learning technique. The paper describes the algorithm, its performance evaluation, and future goals for the tool’s development. Almost 30 000 free texts with manually assigned codes from the French classification of occupations (PCS) and the French classification of activities (NAF) were used to train CNB. A 5-fold cross-validation found that Procode predicts the correct classification code in 57–81% of cases for PCS and 63–83% of cases for NAF. Procode also integrates recoding between two classifications. In the first version of Procode, however, this operation is only a simple search of recoding links in existing crosswalks. The project will next focus on collecting data to support automatic coding to other classifications and on establishing a more advanced recoding method.
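To make the approach in the abstract concrete, the following is a minimal sketch of a Complement Naïve Bayes text classifier with a k-fold accuracy check, written in plain Python. It is not Procode's implementation: the whitespace tokenizer, the smoothing value `alpha`, the toy free texts, and the labels `HEALTH`, `TRANSPORT`, and `EDUCATION` are all assumptions made for illustration, and the labels are not real PCS or NAF codes.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokens; a real tool likely normalizes more.
    return text.lower().split()


class ComplementNB:
    """Complement Naive Bayes: each class is scored with word statistics
    gathered from the training texts of all *other* classes."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive smoothing (assumed value)

    def fit(self, texts, labels):
        per_class = defaultdict(Counter)
        total = Counter()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            per_class[label].update(tokens)
            total.update(tokens)
        self.vocab = set(total)
        self.classes = sorted(per_class)
        n_total = sum(total.values())
        self.weights = {}
        for c in self.classes:
            # Token mass in the complement of class c, plus smoothing.
            denom = (n_total - sum(per_class[c].values())
                     + self.alpha * len(self.vocab))
            self.weights[c] = {
                w: -math.log((total[w] - per_class[c][w] + self.alpha) / denom)
                for w in self.vocab
            }
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        counts = Counter(t for t in tokenize(text) if t in self.vocab)

        def score(c):
            return sum(n * self.weights[c][w] for w, n in counts.items())

        # A word that is rare outside class c gets a large weight for c.
        return max(self.classes, key=score)


def k_fold_accuracy(texts, labels, k=5):
    # Interleaved folds: sample i is held out in fold i % k.
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(texts)) if i % k != fold]
        held_out = [i for i in range(len(texts)) if i % k == fold]
        model = ComplementNB().fit([texts[i] for i in train],
                                   [labels[i] for i in train])
        correct += sum(model.predict(texts[i]) == labels[i] for i in held_out)
    return correct / len(texts)


# Hypothetical toy data, for illustration only.
texts = ["hospital nurse", "night nurse clinic", "truck driver",
         "bus driver city", "school teacher", "primary school teacher"]
labels = ["HEALTH", "HEALTH", "TRANSPORT",
          "TRANSPORT", "EDUCATION", "EDUCATION"]

model = ComplementNB().fit(texts, labels)
prediction = model.predict("nurse in a hospital")
accuracy = k_fold_accuracy(texts, labels, k=3)
```

With a realistic training set, `k_fold_accuracy(texts, labels, k=5)` would correspond to the 5-fold cross-validation mentioned in the abstract; on six toy texts the resulting figure carries no meaning beyond showing the mechanics.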

https://hal.univ-angers.fr/hal-03345693
Contributor: Fabien Gilbert
Submitted on: Wednesday, 15 September 2021 - 17:22:17
Last modified on: Monday, 25 July 2022 - 17:54:07

Citation

Nenad Savic, Nicolas Bovio, Fabien Gilbert, José Paz, Irina Guseva Canu. Procode: A Machine-Learning Tool to Support (Re-)coding of Free-Texts of Occupations and Industries. Annals of Work Exposures and Health, Oxford University Press, 2021, ⟨10.1093/annweh/wxab037⟩. ⟨hal-03345693⟩
