This project aims to evaluate existing models and develop a Multimodal Large Language Model (MLLM) specifically adapted to the Dominican dialect, capable of processing both audio and text inputs from Dominican speakers.

The Dominican Republic has a rich cultural identity, expressed through its distinctive form of Spanish—filled with slang, idioms, and local expressions. This linguistic diversity poses challenges for current Artificial Intelligence systems, which are primarily trained on English-language data or general Spanish variants.

Even Spanish-specialized models may struggle with the Dominican dialect, especially in contexts where clarity and precision are critical, such as emergency response systems. The project therefore evaluates, fine-tunes, and integrates models capable of understanding and responding appropriately in this linguistic context.
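As an illustration of the evaluation step, the sketch below transcribes a Dominican Spanish audio clip with an off-the-shelf speech recognition model and scores the transcript against a human reference using word error rate (WER). The model checkpoint, audio file path, and reference transcript are illustrative placeholders, not the project's actual evaluation setup.

```python
# Minimal sketch: score an off-the-shelf ASR model on Dominican Spanish audio.
# The checkpoint, audio path, and reference text are placeholders for illustration.
from transformers import pipeline
from jiwer import wer

# Load a general-purpose, Spanish-capable ASR model (placeholder checkpoint).
asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-small",
)

# Transcribe a sample recording of a Dominican speaker (placeholder path).
hypothesis = asr(
    "samples/dominican_speaker_01.wav",
    generate_kwargs={"language": "spanish"},
)["text"]

# Human reference transcript for the same clip (placeholder text).
reference = "Dame un chin de agua, que estoy en un apuro."

# Word error rate: higher values suggest the model struggles with the dialect.
print(f"WER: {wer(reference, hypothesis):.2%}")
```

The same protocol could be repeated across several candidate models, or before and after dialect-specific fine-tuning, to quantify improvement on Dominican speech.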

While the main focus is on improving emergency response systems, the resulting technology has broad potential applications in industries such as customer service, healthcare, and commerce.

Project FONDOCYT 2024-2-3A1-1057

GitHub page: https://github.com/lopezbec/Dominican_LLM_project