A few experiments making GPT-4 solve math problems in 16 different languages
It is said that mathematics is a universal language — mathematical concepts, theorems, and definitions can be expressed as symbols that are understandable regardless of language.
In this article, I test the mathematical capabilities of GPT-4 in sixteen different languages.
Early experiments showed GPT-4 scoring highly on the SAT Math and AP Calculus tests and on undergraduate-level mathematics. However, the majority of these experiments test GPT-4’s mathematical capabilities only in English. To better understand GPT-4’s mathematical capabilities beyond English, I prompt it on the same math problems in fifteen other languages.
So, how good is GPT-4 at math in different languages? In theory, it should be equally good (or bad) across all languages, but unfortunately (as you might have guessed), this is not the case. GPT-4 is much better at solving math problems in English. In the other languages, it solved only some of the problems, and how many depended on the language. For traditionally under-resourced languages such as Burmese and Amharic, however, GPT-4 was unable to solve the problems I gave it.
I use mathematical problems from the Project Euler website to test GPT-4. (This is also a throwback to one of my earlier articles from this year, where I used prompt engineering with ChatGPT to solve a few Project Euler problems.) Project Euler, named after the mathematician Leonhard Euler, is a website with hundreds of mathematical and computer programming problems of varying difficulty. Started in 2001, the site has over 850 problems (as of October 2023) and releases a new one approximately every week.
The great thing about Project Euler questions is that each problem has a numerically “correct” answer — this makes it easy to check if GPT-4’s answer is objectively correct or not. They also tend to be a lot more complicated than high-school or college-level math problems. Currently, there is no large-scale comprehensive understanding of GPT-4’s (or other large language models, for that matter) math…
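To make the "numerically correct answer" point concrete, here is a minimal sketch of how a model's free-text reply could be checked against a known answer. It is an illustration only, not the harness used for these experiments. It uses Project Euler Problem 1 (the sum of all multiples of 3 or 5 below 1000) as the example, and the helper names and sample replies are hypothetical. It also assumes the final answer is a non-negative integer that appears as the last number in the reply, which will not hold for every problem or every response.

```python
import re
from typing import Optional


def solve_problem_1(limit: int = 1000) -> int:
    """Reference answer for Project Euler Problem 1:
    the sum of all multiples of 3 or 5 below `limit`."""
    return sum(n for n in range(limit) if n % 3 == 0 or n % 5 == 0)


def extract_final_number(response: str) -> Optional[int]:
    """Return the last non-negative integer in a reply, or None if there is none.

    Assumption for this sketch: the model states its final answer last.
    """
    # Remove thousands separators so "233,168" is read as 233168.
    cleaned = re.sub(r"(?<=\d),(?=\d{3}\b)", "", response)
    matches = re.findall(r"\d+", cleaned)
    return int(matches[-1]) if matches else None


def is_correct(response: str, expected: int) -> bool:
    """True if the last number in the reply equals the expected answer."""
    return extract_final_number(response) == expected


expected = solve_problem_1()

# Made-up replies for illustration; these are NOT actual GPT-4 outputs.
sample_replies = [
    "The sum of all multiples of 3 or 5 below 1000 is 233,168.",
    "The answer is 233186.",
    "I am unable to compute this.",
]

for reply in sample_replies:
    print(is_correct(reply, expected), "|", reply)
```

Because the check only compares numbers, the same function can be applied to replies written in any language, as long as the final answer is written with digits.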
This post originally appeared on TechToday.