Parsing is the process of analysing a string of symbols, expressed in natural or computer languages that will accord formal grammar. Expression Parsing in Data Structure means the evaluation of arithmetic and logical expressions. First, let’s see how an arithmetic expression is written:
An expression can be written with constants, variables, and symbols that can act as an operator or parenthesis. All this expression needs to follow a specific set of rules. According to this rule, the parsing of the expression is done based on grammar.
An arithmetic expression is expressed in the form of Notation. Now, there are three ways to write an expression in Arithmetics:
- Infix Notation
- Prefix (Polish) Notation
- Postfix (Reverse-Polish) Notation
However, when the expression is written, the output of the desired expression remains the same. Before getting started with the types of Notation, let’s see what Associativity and Precedence are in expression Parsing in Data Structure.
Read: Graphs in Data Structure
Before you get started, you need to know what Associativity property is; it provides you with the rules to rearrange parentheses in an expression to provide valid proof. This means a rearrangement of the bracket needs to give the same value as the parent equation. It provides a valid rule to replace the operators.
In an expression containing two or more operators, the operation performed does not matter unless the sequence of operands is not interchanged. If the expression is written using the brackets and in infix, changing the position does not change the value.
Because in Indo-European languages, expressions are read from left to right, most infix operators are left-associative; operators are evaluated in the same precedence. Rising in power is the rule used in considering the infix operators. Prefix operators are generally right-associative, and postfix operators are left-associative.
In some languages, operators and operands are given equal value, where the Associativity is not considered making this language sequence explicit. While in some languages, the operators are non-associative, this makes the use of complex expressions necessary for the use of brackets, which increases complexity for programmers.
Precedence in the Data Structure
Order of Precedence means what order the operators need to follow in a statement of expressions. This is commonly used while working with Infix Notation.
In the situation of <operator> <operand> <operator> operand between the two operators, the preference to allocate the operator is quite tricky. Hence, the operator Precedence rules are followed for calculation. E.g., Here, multiplication has higher precedence, and later the addition operation is performed.
- The most common but not so obvious rule is that multiplication and division operation must be performed before addition and subtraction. Typically, they are collected in the same fashion, so equal importance is provided for all the operators.
- Considering this operation in a logical format, variation is seen in “and” and “or.” Many languages provide equal importance, where “or” operation is given higher Precedence. Some languages consider multiplication or “&,” “&” addition “or” the equal Precedence, where most languages provide arithmetic operations with the highest Precedence.
- Overloading is caused due to no proper allocation of Precedence. Many languages provide negation (true/false) higher Precedence than vector algebra expressions, while some provide equal Precedence.
Also Read: Data Structure Project Ideas
Types of Notation
Now let us learn how the operator position decides the type of Notation.
1. Infix Notation
In Infix Notation, operators are used in between the operands. While reading an expression, Infix Notation is quite easy for humans. But it’s quite time and space consuming to process an infix argument when it comes to a computer algorithm. E.g: p + q
<operands> <operators> <operands>
Infix Notation needs additional information to perform the evaluation; rules are constructed into the expression language using the operator Associativity, Precedence, and brackets ( ) to override the rules.
For example: p * ( q + r ) / s
- Associativity rules suggest that expression needs to be performed from left to right, such that multiplication by p is done before the division of q.
- Similarly, rules for Precedence suggest that multiplication and division operation is performed before addition and subtraction operation is done.
2. Prefix Notation
Here operator is written first, followed by an operand. It is also named as Polish Notation. E.g. +pq
<operators> <operands> <operands>
E.g: p * ( q + r ) / s
Evaluation needs to be performed from left-to-right, and brackets do not alter or change the equation pattern. Here, addition needs to be completed before multiplication because the position “+” is left of “*.”
Here, every operator performs operations on values that are immediate to the left of them. For example, the “+” above uses the “q” and “r.” We can sum up brackets to make this overt:
((p (q r +) *) s /)
Thus, the “( )” considers and uses the two values after immediately preceding “p”, and the result of the +. Similarly, the “/” uses the outcome of the multiplication expression and the “s”.
3. Postfix Notation
Postfix Notation, primarily the operand, is written, followed by an operator. It is also named as Reverse Polish Notation, e.g., pq+
<operands> <operands> <operators>
As for Postfix, the same as the Prefix operation of the expression is left-to-right and “( )” are unnecessary. Here, operators perform on the two nearest values from the right. In the below example, brackets are added unnecessarily, to clear that there is no impact on the evaluation.
(/ (* p (+ q r) ) s)
Here “operator evaluation is from left-to-right” operation values are to their right, and if values themselves involve calculations, then there is a change in the order of evaluation. Taking the example listed above, see the “/” is the primary operator on the left.
It waits until the multiplication operation is completed. And primarily, the multiplication operation needs to be performed before the division calculation is started (and from the above example, it is clear that addition operation needs to be completed before the multiplication operation).
Because Postfix Notation operators use the value to its right; any values involving calculations will have the calculation already completed as we move to the left. So, we can conclude that expression calculation is not as same as the Prefix operator operation.
To highlight all three Notations, the operands come in the same order, and operators need to be moved to provide the right meaning during the calculation. This needs to be considered particularly when considering asymmetric operators “-” and “/”to make it clear p-q is ever q-r unless they have the same value; the values are equivalent to “pq -” or “- pq”
P+q ≡ +pq ≡ pq+
Infix- p * q + r / s
Prefix – pq * rs / +
Post fix – + * pq / rs
First, to perform the operation, multiply p and q and later divide r by s and, at last, add the results.
Below table briefs between the three Notations,
|Infix Notation||Polish Notation||Reverse polish notation|
Conversion between Notations
*To provide clear insight, the brackets are added in the expression,
|( (p * q) + (r / s) )||( (pq *) (r s /) +)||(+ (* pq) (/ rs) )|
|((p * (q + r) ) / s)||( (p (q r +) *) s /)||(/ (* p (+ qr) ) s)|
|(p * (q + (r / s) ) )||(p (q (r s /) +) *)||(* p (+ q (/ rs) ) )|
- You can start converting directly in the bracketed forms by operators in the bracket, e.g. (m + n) or (mn +) or (+ mn). Now repeat this in all the operators by removing the unwanted brackets.
- Now use this trick showed above to convert and parse trees – the equivalent parse trees for each node are:
Checkout: Data Structure & Algorithm in Python
Expression Parsing in Data Structure, Infix, Postfix, and Prefix Notations in Arithmetic expressions are quite different but have the same ways of writing expressions. Knowledge of these is essential in writing programs.
In a computer programming language, the expression is considered and parsed from the string. The Associativity and Precedence rule quite change in different languages.
Why choose a Data science course with upGrad?
Data science is one of the booming fields in computer science. Companies need programmers who have a good knowledge of the basics, which is fundamental for programming irrespective of coding language.
upGrad focuses on providing insightful and informative classes, covering every basic need for becoming a data scientist. upGrad’s 12-Month PG Diploma in Data Science, offered by IIIT Bangalore, is India’s 1st NASSCOM certified course, which comes with 1:1 personalised mentorship from Data Science Industry Experts, covering all the essential Programming Languages, Tools & Libraries. It provides you with the best foundation to start your high-paying data science job.