Recently I came across a short article in Trends in Cognitive Sciences entitled “The language of programming: a cognitive perspective” (Fedorenko et al. 2019). In it, the authors argue for a linguistically oriented path of programming education in schools (programming is currently grouped with the STEM disciplines), based on observed parallels between natural and programming languages. As they put it, “When you learn a programming language, you acquire a symbolic system that can be used to creatively express yourself and communicate with others.” Indeed, if something’s called a language and behaves like one, why not treat it like one?
Programming languages viewed linguistically
Now what’s that supposed to mean? What is it that allows us to view programming languages linguistically? Take the familiar hello world program for example (the implementation below is in Python 3).
print("Hello, world!")
The conventional way to understand this line of code is: print() is a function that takes a string argument and prints it to the terminal screen, and the line is a call to this function.
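To make the function-and-argument reading concrete, here is a minimal sketch in Python 3. The variable name greeting and the use of io.StringIO to capture the output are illustrative choices of mine, not part of the original one-liner.

```python
import io
from contextlib import redirect_stdout

# The argument: a string value (the name `greeting` is just for illustration).
greeting = "Hello, world!"

# The function: in Python 3, print is an ordinary callable object.
assert callable(print)

# The call: applying the function to the argument. The output is captured in
# a buffer here only so that the result can be inspected afterwards.
buffer = io.StringIO()
with redirect_stdout(buffer):
    print(greeting)

captured = buffer.getvalue()  # the text that print wrote, newline included
```

Run normally, without the buffer, the same call simply writes the greeting to the terminal.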
As we can see, this perspective, largely mathematical, is pretty neat, as it makes use of formal notions like “function” and “argument.” From a communicative perspective, however, no such terminology is necessary, because the function call is just a command (in the everyday sense of the word) or order—or in linguistic terms, an imperative clause. Natural languages have loads of imperative clauses, such as
- Put on your shoes!
- Gimme five!
So the above print call can be translated into English as
- Print “Hello, world!” to the screen!
Every imperative clause presupposes a listener or addressee. So does the print clause. In this case the listener is just the computer, hence the conception that a programming language is a medium of human-computer interaction. Actually this idea has long been exploited, with varying degrees of success, as evidenced by the invention of programming-language-like constructed languages (such as Lojban) and of natural-language-like programming languages (such as Shakespeare). There’s even a whole field called natural-language programming, though apparently not everyone agrees that it is justified.
Programming languages are limited languages
Needless to say, the human-computer interaction enabled by programming languages is much more limited than the human-human interaction enabled by natural languages. For example, programming instructions like print can only tell computers what to do, while imperative clauses in natural languages can also tell people what not to do, e.g.,
- Don’t be late!
how to behave, e.g.,
- Be quiet!
or in certain languages, what a person wishes another person to do, e.g.,
- (Hungarian) János kérte, hogy menjek. ‘John asked that I (should) go.’
(menjek is the imperative form of the verb menni ‘to go’)
None of the above meanings can be conveyed by quasi-imperative instructions in programming languages. We normally don’t tell computers not to do something (which is pointless) or how to behave (computers usually behave themselves anyway), let alone make wishes to them. Therefore, these aspects of communication, while absolutely necessary in human-human interaction, are simply out of the question in human-computer interaction.
Teaching or learning programming languages from a linguistic or communicative perspective is completely feasible, if not overkill. That’s because the inventory of grammatical phenomena one would come across in the most sophisticated programming language is still just a tiny subset of that in any natural language on Earth. To get a picture of this, let’s go over some of the grammatical layers in the verbal/clausal domain of the variety of human language called English.
- Predicate-argument core (e.g., I eat cookies.)
- Voice (e.g., Cookies are eaten.)
- Aspect (e.g., I am eating cookies.)
- Tense (e.g., I ate cookies.)
- Modality (e.g., I might eat cookies.)
- etc.
Apparently none of the above categories except the first one (i.e., the predicate-argument core) exists in programming languages, and even that first category is much simpler than the various predicate-argument relations in natural languages. For instance, many natural languages make case distinctions based on the particular types of structural or semantic relations between the predicate and the argument. The example below is from German, another variety of human language.
- Der Lehrer erklärt dem Schüler die Regel. ‘The teacher explains the rule to the student.’
There are three noun phrases in the above German sentence, which are in three different cases corresponding to three linguistic functions:
- der Lehrer ‘the teacher’ - nominative - subject
- dem Schüler ‘(to) the student’ - dative - indirect object
- die Regel ‘the rule’ - accusative - direct object
Programming languages don’t make the direct object vs. indirect object distinction, and they arguably also don’t need to highlight the subject, which in most cases is just the programmer. Thus, there’s no subject in a print clause, just as there’s no subject in an English imperative clause (e.g., Sit down!).
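The point about argument roles can be sketched in Python. The function explain and its parameter names are hypothetical, invented here to mirror the German sentence. In a plain positional call, nothing on the arguments themselves marks which role each one plays; only their order does.

```python
def explain(teacher, student, rule):
    """Hypothetical function mirroring the German example sentence."""
    return f"{teacher} explains {rule} to {student}."

# Roles are assigned purely by position: swapping two arguments changes
# who does what, and nothing on the arguments themselves signals the change.
sentence = explain("the teacher", "the student", "the rule")
swapped = explain("the student", "the teacher", "the rule")
```

German, by contrast, can also say Dem Schüler erklärt der Lehrer die Regel with the same meaning, because the case forms rather than the word order carry the roles.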
That all being said, I think we should give those interested in programming vs. natural language comparison a chance rather than simply tell them it’s futile. Whether or not a research project can yield useful results (a scientific issue) is a separate question from whether or not it should be banned (an ethical issue), not to mention that great ideas have often arisen from unorthodox experiments in human history. Therefore, I disagree with a user’s suggestion to close the StackExchange–Linguistics question “Any difference between natural and programming languages?” as off-topic based on the argument that “programming languages are outside the scope of linguistics.” Who are we to judge anyway?
Back to the short paper I mentioned at the beginning of this post: its cognitive comparison of natural and programming languages draws insights from both knowledge representation and computation, and it takes into account both language comprehension and generation. Below I briefly explain what these terms mean in the domain of language.
- Knowledge representation: the mental representation of words, phrases, and linguistic rules
- Computation: the mental processes involved in language use
- Comprehension: what happens in the mind when a person understands speech
- Generation: what happens in the mind when a person produces speech
These all sound very scientific. And interestingly, all four terms have found their way into AI, where researchers attempt to let computers emulate human intelligence. We are still far away from humanlike AI (“nowhere near close” according to Forbes), but the terminological borrowing suggests that a language component will be key to it if and when that eventually comes.
Beyond the programming vs. natural language comparison
The above-mentioned paper isn’t the only place where programming and natural languages have been compared, but it is one of the very few that strike a positive tone. Most other discussions on the issue I’ve found online focus on the design differences between the two types of language. For example:
- This Quora question: “What’s the difference between natural languages and programming languages?” (I find Daniel Ross’ answer especially informative.)
- This blog post: “Natural vs programming languages” (This one takes a similarity-based perspective.)
- This StackOverflow question: “What’s the difference between natural languages and programming languages in the context of their grammars?” (There’s only one response under this question but it’s a neat one.)
- This e-book chapter: “Formal and Natural Languages” (This one is from a textbook, so its style is pretty scholarly.)
- This Medium article: “Human languages vs. Programming languages” (This one is written by a linguist-turned-programmer and is therefore both more comprehensive and more balanced.)
I support Fedorenko et al.’s (2019) proposal of a linguistically oriented path of programming language education and firmly stand with those who want to experiment with new ideas (as long as they aren’t against the law). But that doesn’t mean I think such a new path would make CS education in schools any more efficient than the “old-fashioned,” mathematically oriented path. This is because mastering a programming language isn’t the same as mastering programming. There’s much more to the art of programming than just a bunch of function words and syntactic rules. According to many experienced programmers, what really makes programming difficult is not the linguistic part but the problem-solving (or “personality”) part. For example, I’m sympathetic to the following answers under the Quora question “What makes programming hard?”:
Steve Eng: Programming is hard because … solving problems from a programming perspective is a combination of art and talent.
Al Klein: Learning a programming language without first learning programming is like learning “chain saw” without first learning carpentry - then complaining that cutting scroll work into a wood block is hard.
Jussi Hämäläinen: Programming is hard simply because thinking is hard.
Baris Baser: What makes playing chess hard? …learning how to play chess can be considered relatively easy. Becoming a competent or great chess player, however, will not be easy.
See this and this post for similar views, and see this post for a step-by-step guide on how to really master programming after crossing the linguistic threshold.
Takeaway
- It is okay if you want to compare programming languages and natural languages, whether you are a linguist or a programmer. Fear not!
- While a linguistically oriented path of programming language education may be helpful in some way, it probably can’t resolve the real obstacle in learning to program: the art of problem-solving.
- So, perhaps a joint (i.e., half-mathematical-half-linguistic) path would prove more efficient in the long run.