MANILA, Philippines — The world has become significantly smaller as communication barriers continue to be torn down by translator tools, Artificial Intelligence (AI), and other technology.
These tools, developed by tech giants like Microsoft and Google, work for major languages like English, Chinese, and even Filipino. Recently, they have also served to revive ancient languages like Aramaic and ancient Hebrew.
With these developments, mathematician Renier Mendoza, who also considers himself a keen advocate of preserving languages, realized that “there [still] weren’t any, if at all, tools like that for Baybayin,” the country’s precolonial writing system.
Database
“In comparison, tools like that for Japanese, Korean, and Chinese are already highly developed,” said the associate professor at the Institute of Mathematics of the University of the Philippines.
“So we wanted to develop our own tool using machine-learning algorithms,” he said.
Mendoza, fellow associate professor Rachelle Sambayan and master’s student Rodney Pino began developing the software in 2021. They now call it the first AI-powered Baybayin translator tool, one that can convert entire paragraphs and even pages of Baybayin text into something readable for Filipinos today.
Pino, whose thesis centers on this project, said the team hopes to contribute to efforts to decode precolonial texts and preserve the country’s ancestral alphabet. Two years ago, he started collating images for each of the 17 characters in Baybayin: “anything I could find that had Baybayin text on it,” he said.
After more than three months, the team had organized a Baybayin database of 17,000 images, with each of the 17 characters represented by 1,000 images and their variations, and then tested a support vector machine (SVM) on that collection.
Sambayan described the SVM as an algorithm within the translator tool that can recognize whether a character fed to the software is in Baybayin or Latin script.
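For readers curious about how such a classifier works, here is a minimal sketch of a linear SVM trained with subgradient descent on the hinge loss. It is not the team's code. The two features per character, the toy data points, and the names `train_linear_svm` and `predict` are all invented for illustration; a real system would work on features extracted from the images and would likely rely on an established library.

```python
# Toy data: each character image is reduced to two made-up features
# (say, share of dark pixels and a curviness score). Purely illustrative.
SAMPLES = [(0.8, 0.9), (0.7, 0.8), (0.9, 0.7),   # labeled Baybayin (+1)
           (0.2, 0.1), (0.3, 0.2), (0.1, 0.3)]   # labeled Latin (-1)
LABELS = [1, 1, 1, -1, -1, -1]


def train_linear_svm(samples, labels, epochs=500, learning_rate=0.05, reg=0.01):
    """Fit weights and a bias by subgradient descent on the hinge loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            score = sum(w * x for w, x in zip(weights, features)) + bias
            # Shrink the weights slightly (L2 regularization).
            weights = [w * (1 - learning_rate * reg) for w in weights]
            # Points that are misclassified or inside the margin
            # nudge the boundary away from themselves.
            if label * score < 1:
                weights = [w + learning_rate * label * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * label
    return weights, bias


def predict(weights, bias, features):
    """Return the script label for one feature vector."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return "Baybayin" if score >= 0 else "Latin"


weights, bias = train_linear_svm(SAMPLES, LABELS)
print(predict(weights, bias, (0.85, 0.9)))
print(predict(weights, bias, (0.1, 0.15)))
```

The idea is that the SVM looks for the boundary that leaves the widest gap between the two groups of examples, so new characters that fall clearly on one side get that side's label.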
The team converted all the images to black and white so that processing them would demand less computing power and memory from the tool.
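A rough sketch of what such a conversion can look like: every grayscale pixel is compared with a cutoff and stored as either 0 (black) or 1 (white). The function name `to_black_and_white`, the cutoff of 128, and the sample pixel values are assumptions made for this example, not details from the team.

```python
def to_black_and_white(gray_image, threshold=128):
    """Map each grayscale pixel (0-255) to 0 (black) or 1 (white)."""
    return [[1 if pixel >= threshold else 0 for pixel in row]
            for row in gray_image]


# A tiny 3x4 "image" with a dark stroke on a light background.
sample = [
    [250, 240, 30, 245],
    [235, 20, 25, 250],
    [240, 15, 230, 255],
]
binary = to_black_and_white(sample)
for row in binary:
    print(row)
```

Each pixel then needs only one bit instead of eight, which is one reason black-and-white images are lighter to store and process.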
Demonstration
The software’s translations are still fairly straightforward, as a demonstration by the team to the Inquirer showed.
In translating “alab ng puso” (heart aflame) from Baybayin, for example, the tool showed all possible translations, including variations on the spelling “puso,” such as “poso.”
Mendoza said the tool still had “issues” in transliterating similar-sounding vowels like “e” and “i,” and “o” and “u.”
But the software will not translate anything outside the Filipino dictionary, Sambayan said.
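The behavior described above can be pictured with a small sketch. It assumes that each Baybayin syllable is already recognized as a consonant plus a vowel class, where one mark stands for either “e” or “i” and another for either “o” or “u.” The names `candidate_spellings` and `dictionary_matches`, the syllable encoding, and the four-word `TOY_DICTIONARY` are hypothetical, and the sketch ignores syllable-final consonants.

```python
from itertools import product

# One Baybayin vowel mark can stand for two Latin vowels.
VOWEL_OPTIONS = {"a": ["a"], "e/i": ["e", "i"], "o/u": ["o", "u"]}

# Stand-in for a Filipino word list (illustrative only).
TOY_DICTIONARY = {"alab", "ng", "puso", "poso"}


def candidate_spellings(syllables):
    """List every Latin spelling the syllable sequence could represent."""
    choices = [[consonant + vowel for vowel in VOWEL_OPTIONS[vowel_class]]
               for consonant, vowel_class in syllables]
    return ["".join(parts) for parts in product(*choices)]


def dictionary_matches(syllables, dictionary):
    """Keep only the candidate spellings found in the dictionary."""
    return [word for word in candidate_spellings(syllables)
            if word in dictionary]


# The word read as "puso": consonant p and consonant s, each marked o/u.
puso_syllables = [("p", "o/u"), ("s", "o/u")]
candidates = candidate_spellings(puso_syllables)
matches = dictionary_matches(puso_syllables, TOY_DICTIONARY)
print(candidates)
print(matches)
```

In this toy setup, both “puso” and “poso” survive the dictionary check, which mirrors why the tool lists more than one reading for the same Baybayin word.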
Although the tool is still in its initial stage, she said it is the first such software that is able to translate large chunks of Baybayin text into Filipino.
The team is working on cutting down the tool’s processing time so it can handle larger volumes of text. Mendoza said it is also currently unable to translate from Filipino to Baybayin, as this would require a higher level of AI.
He said the software is also being developed further “on how to choose the better word for translation, depending on the syntax and context of the sentence — and that is where we would need greater linguistic expertise.”
Furthermore, the team hopes to develop a mobile version soon, “so there is a portable version for the public,” Mendoza said.
For now, the team has made the Baybayin database public so that more researchers can use it in their studies on Baybayin and AI.
‘For modern times’
Many precolonial documents are written in Baybayin, and some historians may not necessarily know or understand this alphabet.
“So while it is a bit [of a] niche [field], this tool still has a lot of possible uses in other aspects,” Mendoza said.
He said the team is not necessarily advocating Baybayin as the primary writing system in this day and age, yet they hope to preserve the alphabet and help future scholars.
“We’re really hoping to inspire more efforts in preserving the Baybayin,” Pino said. “We should be proud of our own alphabet and at the same time, digitize it for modern times.”