About me

I am a Chinese computational linguist. After five years of undergraduate study in French language and linguistics (at Dalian University of Foreign Languages and the University of Rouen), I decided to pursue a master's degree in computational linguistics at the National Institute for Oriental Languages and Civilizations to broaden my career prospects. During my master's, I completed three consecutive internships as a linguistic engineer at the startup XiKO. In 2016, I obtained a French ministerial scholarship for my doctoral study at University of Paris-Saclay, CNRS, LIMSI. In 2019, I defended my thesis in Natural Language Processing on recognising sub-sentential translation techniques, supervised by Anne Vilnat and Gabriel Illouz. I then worked for one year as a teaching and research assistant at LIMSI and University of Paris-Saclay (department of computer science). From September 2020, I will work for two years as a postdoctoral researcher at the Artificial Intelligence and Human Languages Lab of Beijing Foreign Studies University.

RESEARCH EXPERIENCE

Postdoctoral researcher

2020 - 2022
Beijing Foreign Studies University, Artificial Intelligence and Human Languages Lab, China

Participant in the ROSETTA Project

2020 - Present

ROSETTA: Resources for Endangered Languages through Translated Texts
Collaborative interdisciplinary project between Stanford University and the University of Lille
Working with Amel Fraisse, Jenn Ronald, Shelley Fisher Fishkin, Zheng Zhang and Alex Zhai

Teaching and Research Assistant

2019 - 2020
University of Paris-Saclay, CNRS, LIMSI, France

The courses that I taught are listed below.

PhD in Computer Science

2016 - 2019
University of Paris-Saclay, CNRS, LIMSI, France

Master internship

2015 - 2016
XiKO, Paris, France

Developed linguistic resources and worked on text mining
Prepared configuration files for crawling web pages
Compared and implemented feature selection methods for automatic document classification
Supervised by Gaël Patin and Damien Nouvel

PUBLICATIONS

CONFERENCE PROCEEDINGS

La réécriture monolingue ou bilingue facilite-t-elle la compréhension ?
Yuming Zhai, Gabriel Illouz and Anne Vilnat (2020), In Proceedings of the 27th Conférence sur le Traitement Automatique des Langues Naturelles (TALN'20). Nancy, France. [bibtex] [material]
Building an English-Chinese Parallel Corpus Annotated with Sub-sentential Translation Techniques
Yuming Zhai, Lufei Liu, Xinyi Zhong, Gabriel Illouz and Anne Vilnat (2020), In Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC'20). Marseille, France. [code] [bibtex]
Classification automatique des procédés de traduction
Yuming Zhai, Gabriel Illouz and Anne Vilnat (2019), In Proceedings of the 26th Conférence sur le Traitement Automatique des Langues Naturelles (TALN'19). Toulouse, France. [slide][code][bibtex]
Conception d'un outil d'aide à la compréhension écrite pour les apprenants de français langue étrangère
Yuming Zhai, Gabriel Illouz and Anne Vilnat (2019), In Proceedings of the 9th Conférence Environnements Informatiques pour l'Apprentissage Humain (EIAH'19). Paris, France. [poster][bibtex]
Towards Recognizing Phrase Translation Processes: Experiments on English-French
Yuming Zhai, Pooyan Safari, Gabriel Illouz, Alexandre Allauzen and Anne Vilnat (2019), In Proceedings of the 20th International Conference on Computational Linguistics and Intelligent Text Processing (CICLING'19). La Rochelle, France. [preprint][code][poster][bibtex]
Construction of a Multilingual Corpus Annotated with Translation Relations
Yuming Zhai, Aurélien Max and Anne Vilnat (2018), In Proceedings of the First Workshop on Linguistic Resources for Natural Language Processing@COLING (LR4NLP'18). Santa Fe, New Mexico, USA. [slide][bibtex]
Construction d'un corpus multilingue annoté en relations de traduction
Yuming Zhai (2018), In Proceedings of the 20th REncontres jeunes Chercheurs en Informatique pour le TAL (RECITAL'18). Rennes, France. [poster][bibtex]

MASTER THESIS

Étude sur l'apport de la sélection des caractéristiques dans la classification multi-classe des textes
Yuming Zhai (2016), Master thesis defended at National Institute for Oriental Languages and Civilizations (INALCO) (18/20). [slide]
Supervised by Gaël Patin and Damien Nouvel

TALKS

Construction of a Multilingual Corpus Annotated with Translation Relations
Yuming Zhai (2018), At the workshop "Cross-lingual Analysis and Multilingual Parallel and Comparable Corpus Annotation: Present and Future Tendency". University of Paris Diderot, France. [slide]

RESOURCES

Last modification: 05/12/2019. Licence: Attribution-NonCommercial-ShareAlike 4.0
If you reuse the EN-ZH annotation guidelines, please use this citation.
If you reuse the EN-FR annotation guidelines, please use this citation.
Annotation Guidelines of Translation Techniques for English-French
Annotation Guidelines of Translation Techniques for English-Chinese

TEACHING EXPERIENCE

Programming and database administration (Oracle SQL)

2017
IUT of Orsay, France, with Anne Vilnat

bachelor 1st-year practical classes (21hrs)

Operating systems and concurrent computing (C)

2019
University of Paris-Saclay, France, with Thomas Lavergne

bachelor 3rd-year practical classes (30hrs)

Introduction to object-oriented programming (Java)

2019
University of Paris-Saclay, France, with Alice Jacquot

bachelor 2nd-year practical classes (24hrs)

Database management system (Oracle SQL)

2020
University of Paris-Saclay, France, with Emmanuel Waller

bachelor 3rd-year practical classes (16.5hrs)

Advanced programming of interactive interfaces (JavaFX)

2020
University of Paris-Saclay, France, with Ouriel Grynszpan

bachelor 3rd-year practical classes (27hrs)

ACADEMIC SERVICE

Peer review experience
Conferences: ACL, EMNLP 2020; EACL 2021
Journals: Journal of Data Mining & Digital Humanities 2020