The Transpiler Project
Minor Data Driven Business Lab
Client company: BDO
En-Ruo Lin
Mathijs Thiecke
Project description
The Transpiler Project focuses on using natural language processing (NLP) to automate, or at least assist, the translation of source code from one programming language into another. The project has three goals: to research how NLP can be used for this, to carry out a proof of concept consisting of many small, specific use cases based on BDO’s requirements as determined during the assignment in order to demonstrate the actual capabilities, and to deliver a final verdict recommending a specific software package based on the results.
Context
BDO initiated this project to find a way to automate the translation of their legacy code, which currently requires a lot of time and manpower. By using NLP, BDO hopes to reduce the time and effort needed for code translation, modernize their legacy code, simplify program maintenance, narrow the range of programming languages employees need to know, and minimize the risk of delays and additional costs caused by legacy code.
Results
The Transpiler Project aimed to explore solutions for automating the translation of source code from one programming language to another using natural language processing. To start the project, research was conducted and multiple interviews were held to establish the requirements. Various solutions were evaluated for their ability to translate programming languages, explain code, and generate code from a text prompt. From this, a shortlist of five solutions was created: GitHub Copilot, OpenAI Codex, CodeXGLUE from Microsoft, TransCoder from Meta, and Ghostwriter from Replit.
These solutions were tested using a criteria matrix based on the established requirements. Testing showed that OpenAI Codex and GitHub Copilot had the highest success rates in translation, explanation, and code generation. OpenAI Codex was the most successful, with GitHub Copilot a close second. However, both solutions had limitations, such as difficulty translating certain programming languages and an inability to translate larger pieces of code.
The other three solutions on the shortlist were not successful in translating programming languages. CodeXGLUE and TransCoder were too complex to get working within the limited time available, and Ghostwriter was able to explain and generate code but not translate it.
Overall, OpenAI Codex and GitHub Copilot showed promise in their ability to translate programming languages, but there are still limitations to be addressed.
About the project group
The Transpiler Project was undertaken by En-Ruo Lin and Mathijs Thiecke, both 'ICT and Business' students at Fontys Hogeschool ICT in Eindhoven. During the minor 'Data Driven Business Lab', we were tasked by BDO to research how they can automate the translation of legacy code into up-to-date programming languages currently in use within BDO, for example from C# code to Python code. Throughout the semester, we researched and identified different software that aims to automate this process.