Tiberiu Boros

Research Scientist - Machine Learning

About me

I hold a PhD in Computer Science, specializing in Spoken Language Processing. I currently work at Adobe Systems and was previously a researcher at the Research Institute for Artificial Intelligence of the Romanian Academy.

Additionally, I maintain several open-source Machine Learning projects (such as TTS-Cube and NLP-Cube) and have contributed to others, such as the DyNet Machine Learning Framework (developed by Carnegie Mellon University and many others). My current research focus is on applied Natural Language and Speech Processing.

Selected projects

  • NLP-Cube: End-to-end Natural Language Processing pipeline for tokenization, sentence splitting, part-of-speech tagging, lemmatization and parsing. [Project page] [Project code]
  • TTS-Cube: End-to-end speech synthesis, based on jointly optimized Large Language Models (LLMs) and Generative Adversarial Networks (GANs). [Project page] [Project code]
  • libLOL: Machine learning and feature engineering for detecting Living off the Land attacks. [Project page] [Project code]
  • OSAS: Enrichment of security-related data for running processes through statistics and Natural Language Processing. [Project page] [Project code]

Selected publications

  • 2024

    • Boroș, T., Chivereanu, R., Dumitrescu, S., & Purcaru, O. (2024, May). Fine-Tuning and Retrieval Augmented Generation for Question Answering Using Affordable Large Language Models. In Proceedings of the Third Ukrainian Natural Language Processing Workshop (UNLP) @ LREC-COLING 2024 (pp. 75-82). [Paper] [Code] [Model]
  • 2023

    • Boros, T., Dumitrescu, S. D., Mironica, I., & Chivereanu, R. (2023). Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling. In Speech Synthesis Workshop. [Paper] [Code]
  • 2022

    • Boros, T. et al. Machine Learning and Feature Engineering for Detecting Living off the Land Attacks. In IoTBDS. [Paper] [Code]
  • 2021

    • Boros, T. et al. A Principled Approach to Enriching Security-related Data for Running Processes through Statistics and Natural Language Processing. In IoTBDS. [Paper] [Code]
  • 2019

    • Boroș, T. et al. Tripod: Learning Latent Representations for Sequences. In ConsiLR. [Paper] [Code]
  • 2018

    • Boroș, T. et al. "NLP-Cube: End-to-end raw text processing with neural networks." In CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. [Paper] [Code]
    • Boroș, T. et al. "GBD-NER at PARSEME Shared Task 2018: Multi-Word Expression Detection Using Bidirectional Long-Short-Term Memory Networks and Graph-Based Decoding." In LAW-MWE-CxG-2018. [Paper]
    • Dumitrescu, S.D., and Boros, T. "Attention-free encoder decoder for morphological processing." In CoNLL SIGMORPHON. [Paper]

Patents

  • US Patent App. 18/163,170: Voice audio compression using neural networks [Link]
  • US Patent 11,146,580: Script and command line exploitation detection [Link]
  • US Patent 11,816,210: Risk-based alerting for computer security [Link]