Biography

I am a PhD candidate currently working on the neural machine translation of user-generated content (e.g. social media posts).

Nishimwe is a Rwandan name meaning ‘Thanks be to God’. It is pronounced /niːʃiːmŋé/.

Fun fact: There is another Lydia Nishimwe, who is a singer. Though we share quite a few similarities, we are not related. Feel free to check out her YouTube.

Interests
  • Machine Translation
  • Lexical Normalisation
  • Sentence Embeddings
  • Language Models
  • Data Augmentation
  • Languages
Education
  • PhD in Computer Science, 2021-present

    Inria Paris, Sorbonne Université

  • MEng in Mathematics and Computer Science, 2017-2021

    École Centrale de Nantes

  • BSc in Mathematics and Computer Science, 2014-2017

    Université Grenoble Alpes

Languages

  • English: Native
  • French: Native
  • Spanish: Advanced
  • Swahili: Intermediate
  • German: Intermediate
  • Kinyarwanda: Elementary

Experience

ALMAnaCH Team, Inria
PhD Candidate
Oct 2021 – Present Paris, France

Robust Neural Machine Translation of User-Generated Content

  • 3 first-author publications at peer-reviewed NLP venues (2 conferences, 1 journal)
  • 2 participations in NLP shared tasks (1 on the organising team, 1 on a submitting team)
  • 10+ presentations (conferences, seminars, high school outreach programmes)
  • Peer-reviewed 4 conference paper submissions
  • Trained language and translation models in a High-Performance Computing environment
  • Submitted bug fixes and new features to NLP repositories on GitHub (fairseq, NL-Augmenter)

Tech stack: Python, PyTorch (fairseq, Transformers), SLURM
Organisation: GitHub/GitLab, Trello, Zotero
Office pack: LaTeX/Beamer, MS Word/PowerPoint/Excel

Orange Labs
Research Intern
Jun 2020 – Dec 2020 Lannion, France

Inference of masked sequences

  • Conducted an extensive literature review of decoding strategies for sequence-to-sequence (seq2seq) models (autoregressive, semi-autoregressive, non-autoregressive, monotonic, and non-monotonic)
  • Designed and executed experimental studies on decoding algorithms for reconstructing masked sequences of router logs from Orange

Tech stack: Python, TensorFlow, Keras
Organisation: Trello, Zotero
Office pack: LaTeX/Beamer

Mean-In-Full
Software Development Intern
May 2017 – Jul 2017 Meylan, France

Integrated a third-party app (Opencast, a Learning Management System) with RoCamRoll (the company’s product), including troubleshooting API interactions and ensuring smooth data flow between the two systems.

Tech stack: Erlang, HTTP

Laboratoire TIMA
Assembly Programming Intern
May 2016 – Jun 2016 Grenoble, France

Functional verification of an ARM7 microprocessor

  • Implemented the simulation of the microprocessor in VHDL
  • Implemented new features in C and their test files in ARM assembly

Tech stack: VHDL, C, ARM

🏆Stage d’excellence (Excellence Internship Program) - Université Grenoble Alpes🏆

Laboratoire VERIMAG
Functional Programming Intern
Jun 2015 – Jun 2015 Grenoble, France

Implemented the simulation of GPS logs

Tech stack: Lutin, Lustre

🏆Stage d’excellence (Excellence Internship Program) - Université Grenoble Alpes🏆

Publications

Making Sentence Embeddings Robust to User-Generated Content
The MRL 2022 Shared Task on Multilingual Clause-level Morphology

Contact