What Is Syntax in Programming?
Computers are known for their enormous computational power, such as performing millions of calculations in a fraction of a second. That is truly amazing, but what is even more interesting is that, fundamentally, all computers are as dumb as a rock. Computers work only when told precisely what is required of them; only then can they put their computational brain to use. Now the real question is: how exactly are computers instructed to do something or to perform certain operations? The simple answer is ‘using language’. It’s not the language humans speak or the one that animals use to communicate. It’s a slightly different kind of language, commonly known as a programming language.
It’s the language that programmers use to instruct computers when, how, and what to do with the hardware connected to them. Just as there are numerous human languages, there are hundreds of programming languages. By analogy with human languages, programming languages differ in their rules and in how to ‘pronounce’ their ‘words’. This set of rules is called the syntax of a programming language. Let’s explore a crude definition to answer the question “what is syntax in programming?” Syntax refers to the set of rules defined, organized, and updated by the designers of a programming language. The syntax of a programming language is often shaped by the purpose and niche that the language is aimed at.
To answer the question “what is syntax in programming?” with an everyday example, consider a person doing calculations on two different calculators.
To clear the screen on the first calculator, he must press the red button labelled AC (meaning all clear). On the second calculator, the AC button cannot be pressed because it does not exist; instead, he must press the white button labelled ON/C (meaning clear). Note that the result obtained is the same, but how it is achieved is completely different and depends on the buttons of each device. The same is true of the syntax of programming languages.
Before the elements of syntax are explored further, the levels of programming languages need to be clarified. Programming languages whose syntax is close to human language patterns are called high-level programming languages, while those whose syntax stays close to the instructions of the hardware are called low-level programming languages. For example, to add 4 and 9 and save the result in a variable, a software developer can write a single short statement in Python (a high-level programming language).
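A minimal sketch of what such a Python statement might look like is given here. The variable name `result` is an assumption chosen for illustration.

```python
# Add 4 and 9 and keep the sum in a variable.
# `result` is an illustrative name, not one taken from the article.
result = 4 + 9
print(result)  # prints 13
```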
In x86 Assembly (a low-level programming language), the same task takes several instructions. Both versions contain the same ingredients: the numbers to be added, the operation command, and the variable in which the result is stored. The assembly version additionally names physical hardware registers, into which the values must be moved before any operation (such as addition in our case) can be performed on them.
Role of IDE in Syntax Understanding
It is essential to know that computers don’t directly understand programming languages that are close to human language patterns. Instead, code written in these languages needs a compiler or an interpreter to turn it into code that the computer understands. Similarly, a code editor is required to write the program and run it to see whether it works as intended. Code editors, interpreters, and compilers are commonly brought together in a software environment called an Integrated Development Environment (IDE).
As in human languages, the rules of syntax can be categorized as must-have, must-not-have, good-to-have, and does-not-matter when it comes to writing executable code in a given programming language. For example, in the English language the following rules can be considered must-have:
- To mark the end of a sentence, the use of a period or full stop (.) is necessary
- All new sentences must begin with a capital letter
On the other hand, the following rule is important for a syntactically correct sentence. However, breaking it does not prevent the sentence from being understood:
- Proper nouns and the pronoun “I” must begin with a capital letter
Similarly, a programming language has many rules that together make up its proper syntax. They include, but are not limited to, (variable) naming conventions, statement terminators, line continuation, commenting styles, indentation, case sensitivity, inclusions/exclusions, indexing, and data type declarations, along with the syntax errors and error codes reported when a rule is broken. In this article, a few of these syntax elements are elaborated to explain the concept of “what is syntax in programming?”:
1. Case Sensitivity
People can understand a word whether it is written in capital or small letters, but many programming languages are case-sensitive about code. For example, if there is a variable h=5, one can also assign another variable H=5, and it will be considered a different variable. This is because the rules of proper syntax state that the language is case-sensitive, so the programmer must be more careful about capitalization than in the English language. If a single letter is typed carelessly in the wrong case, the program treats the name as a completely different thing and will typically report an error, for example that the name has not been defined. In other words, a line of code that looks syntactically correct may still fail if it does not respect case sensitivity.
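The behaviour described above can be sketched in Python as follows. The value 7 for `H` and the names `total` and `Total` are assumptions chosen to make the difference visible. Note that in Python a wrongly capitalized name raises a `NameError` at run time rather than a syntax error.

```python
h = 5   # lower-case name
H = 7   # upper-case name: a separate variable, not an update of h

print(h, H)  # prints 5 7

# Using the wrong case for a name that was never defined fails at run time.
total = 10
try:
    print(Total)          # `Total` is not the same name as `total`
except NameError as err:
    message = str(err)
    print("NameError:", message)
```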
2. Code Statement Termination
Let’s say a Python programmer writes a line of code. To mark this line as a complete statement (like a complete sentence in a human language), Python’s syntax does not require any special character at the end; simply pressing the Return (Enter) key suffices. Other languages treat the end of a statement differently. In C++ and Java, a semicolon at the end of each statement is mandatory, and leaving it out produces a syntax error. In MATLAB, the semicolon is optional: the statement runs either way, but without the semicolon MATLAB prints the result of the statement in the command window. Take a simple addition of two numbers as an example: a=15, b=19, result=a+b.
In Python, these three statements need no terminator, and a stray semicolon at the end of a line is accepted but unnecessary. In MATLAB, each statement usually ends with a semicolon so that the intermediate values are not printed. In C++ or Java, the same statements would not compile without semicolons. Therefore, programmers must follow the proper syntax of the specific language they are using to ensure that the code runs without errors and behaves as expected.
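A short Python sketch of the addition example, using the values a=15 and b=19 from the text. The names `c` and `d` are assumptions added only to show where a semicolon can appear in Python.

```python
a = 15
b = 19
result = a + b   # the end of the line ends the statement
print(result)    # prints 34

# A semicolon is accepted but unnecessary in Python;
# its usual role is to separate several statements on one line.
c = 1; d = 2
```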
3. Code Line Continuation
Sometimes, programmers must write a long code statement that does not fit on a single line of the editor. In such a case, the statement needs to be continued across multiple lines, and MATLAB and C++ handle this differently.
Suppose a statement that produces x=6 is split over several lines in both languages. In MATLAB, proper syntax requires three dots (...) at the end of a line to signal that the statement continues on the next line. No such dots are needed in C++, where a statement may simply run over several lines until its terminator. The terminating semicolon is mandatory in C++, while in MATLAB it is optional and only suppresses the printed output.
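For comparison, Python (which the example above does not cover) offers two ways to continue a statement. The sketch below assumes the statement is the sum 1 + 2 + 3, chosen so that the result is 6 as in the text; the name `y` is illustrative.

```python
# Implicit continuation: an expression inside parentheses may span lines.
x = (1 +
     2 +
     3)

# Explicit continuation: a backslash as the very last character of the line.
y = 1 + \
    2 + \
    3

print(x, y)  # prints 6 6
```

The parenthesized form is generally preferred, because a backslash followed by any character, even a space, is an error.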
4. Indentation
This one is quite an interesting syntax rule. Say a software developer adds (by mistake, if he is lucky) a space, a tab, or a mix of both in front of some lines in a block of code, while other lines in the same block are indented differently. That violates Python syntax, and the programmer will get an indentation error, which is a kind of syntax error. In Python, proper syntax is to write every line of a block with the same indentation; otherwise, a line with different indentation is treated as belonging to a different block of code. This is not the case with MATLAB. In MATLAB, indentation has no effect on how the code is interpreted, so you can add as many spaces, tabs, or a mix of both as you like.
Proper indentation is fine with both Python and MATLAB, but mismatched indentation generates syntax errors in Python. Syntax-wise, MATLAB is still fine with mismatched indentation, although consistent indentation keeps the code readable.
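The difference can be checked in Python itself. The sketch below is one possible way to do so: it uses the built-in `compile` function to test two small snippets without running them. The helper name `check_indentation` and the two snippets are assumptions made for illustration.

```python
# Two versions of the same block: one consistently indented, one not.
consistent = "if True:\n    a = 1\n    b = 2\n"
mismatched = "if True:\n    a = 1\n      b = 2\n"

def check_indentation(source):
    """Return 'ok' if the source compiles, else the error's class name."""
    try:
        compile(source, "<example>", "exec")
        return "ok"
    except IndentationError as err:
        return type(err).__name__

print(check_indentation(consistent))  # prints ok
print(check_indentation(mismatched))  # prints IndentationError
```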
5. Indexing
Let’s take the example of the floors of a building to understand indexing in programming. There is a ground floor, and then there are other floors on top of it. In some buildings, the floor above the ground floor is considered the 1st floor, because it’s the first floor after the ground floor. In other buildings, the ground floor is considered the first floor, because that’s the first floor you’re on when entering the building.
The same is the case with indexing in programming. Consider a list of 5 numbers saved in a variable called x, such that x holds 10, 20, 30, 40, and 50. Visually, at position 1 there is the number 10, at position 2 the number 20, at position 3 the number 30, and so on. This positioning is called indexing in programming. Different programming languages dictate different indexing rules. For example, to get the first number from the list x, a programmer must write x[0] in Python, because in Python the 1st number is not considered to be at position 1. Instead, it is at position 0. This is called zero-based indexing. In MATLAB, one-based indexing is the proper syntax, so the first number is accessed with x(1).
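A small Python sketch of zero-based indexing, using the list of five numbers from the text. The names `first`, `second`, and `last` are illustrative.

```python
x = [10, 20, 30, 40, 50]

first = x[0]           # position 0 holds the first number
second = x[1]          # position 1 holds the second number
last = x[len(x) - 1]   # the fifth number sits at position 4

print(first, second, last)  # prints 10 20 50
```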
6. Case Sensitivity
Case sensitivity refers to how a programming language treats upper-case and lower-case letters. For example, are the words Foo and foo considered the same in the language? If they are the same (by the rules of syntax), the language is called case-insensitive. If they are regarded as different words, the language is case-sensitive. Case-sensitive languages include Python, MATLAB, and Java, whereas SQL (in its keywords) and Basic are examples of case-insensitive languages. Because of this rule, the choice of names in a case-insensitive language is a little more limited: grades and Grades are considered the same thing, so a different variable name must be chosen.
7. Commenting Style
In programming, lines of code are written for many different reasons, and sometimes complex logic is packed into a single line for the sake of efficiency, so that the meaning of the code is not obvious to a human reader. To make sure such lines are properly understood later, a comment is added before, after, or beside the code. A comment is written so that it is ignored when the code is compiled or interpreted. However, it remains available in the source for later review.
Different programming languages have different syntax for adding comments to code. For example, in Python a hash symbol (#) is used, whereas in MATLAB a percent symbol (%) indicates that the rest of the line is a comment, not code, and should not be executed. If the proper symbol is not used, the comment text is treated as code, and a syntax error may occur unexpectedly.
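A brief Python sketch of the commenting style described above. The variable names and values are assumptions chosen for illustration.

```python
# A full-line comment: everything after the hash symbol is ignored.
price = 20  # an end-of-line comment beside the code
quantity = 3

# total = price * quantity * 2   <- commented-out code does not run
total = price * quantity

print(total)  # prints 60
```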
Syntax plays a central and vital role in learning and understanding any computer programming language. Without properly understanding the syntax of a programming language, one cannot go far in developing meaningful, understandable, and efficient code that can be put to good use. The integrity and precision of efficient code remain intact only if the proper syntax of the chosen programming language is followed. Similarly, reproducibility is a big challenge in programming-based projects, and consistently following the proper syntax of the chosen language helps with debugging and troubleshooting should any issues arise in a bigger project.
Likewise, syntax also helps in understanding code when multiple smaller code snippets make up a bigger project. In such a situation, following proper syntax leads to simpler, cleaner code that is free of syntax errors.