PhD candidate Sadia Nowrin, Computer Science, will present her Final Oral Examination (defense) on Monday, July 29, 2024, from 3 to 5 pm EDT via Zoom online meeting. The title of Nowrin’s defense is “Programming by Voice.”
Defense Abstract
Programmers typically rely on a keyboard and mouse for input, which poses significant challenges for individuals with motor impairments, limiting their ability to effectively input programs. Voice-based programming offers a promising alternative, enabling a more inclusive and accessible programming environment. Through interviews with motor-impaired programmers, we found that existing voice-based programming platforms require memorizing unnatural commands.
In this work, we explore how programmers speak single lines of Java code when they are not required to follow strict rules. Aiming for a more naturally spoken programming system, we adopted a two-step approach. In the first step, we focused on recognizing single lines of spoken code using a large pre-trained speech recognition model, Wav2Vec 2.0. By adapting it with just one hour of spoken programs and leveraging existing natural English language data, we reduced the word error rate (WER) from 28.4% to 8.7%. Decoding with a domain-specific N-gram model and rescoring with a large language model fine-tuned for programming improved the result further, to a WER of 5.5% on our test set.
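For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the recognized text and the reference transcript, divided by the number of reference words. The sketch below illustrates that standard definition only. The function name and the sample utterance are hypothetical, and the abstract does not say which tooling or text normalization was used in the evaluation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical spoken line of code and a recognizer output with two errors:
# "equals" is misrecognized as "equal" and "semicolon" is dropped.
reference = "int count equals zero semicolon"
hypothesis = "int count equal zero"
print(f"WER = {word_error_rate(reference, hypothesis):.1%}")
```

Under this definition, a WER of 5.5% corresponds to roughly one word error for every 18 reference words.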
In the second step, we translated the recognized text into the target line of code. We used a large language model known for generating code from comments and adapted it to learn how to generate single lines of code. This adaptation led to a significant improvement in the CodeBLEU score from 53.5% to 72.0% on our test set.