DWise1's Programming Pages

Evaluating (x + x * ++x)


Introduction

One night in Java class, the instructor wanted to demonstrate the trouble that can be caused by referencing a variable more than once in an expression that pre/post-increments/decrements it:
    x = 3;
    y = x + x * ++x;
    System.out.println("x = "+x+"; x + x * ++x = "+y+"\n");

Having always had the good sense to not try something like that, I simply applied the evaluation theory I had learned a couple decades prior and came up with an answer, 15. The instructor, however, came up with an answer of 20. But then he ran that Java program and was surprised when it came up with my answer (which I had not shared with him at the time, but shared with him afterwards via email).

It turns out that while there is a standard way specified to evaluate that expression when humans read it, the code that the compiler generates to evaluate it is implementation dependent and those implementation dependencies can change the results unexpectedly. And as my tests revealed, different languages and different types of computers will indeed generate different results, albeit in two different ways in this case. The bottom line and basic lesson from this test is: don't use the same variable more than once in an expression where you increment or decrement it. There can be a tendency to pack as much as you can in one line of code, especially in C/C++, but doing so could expose you to situations like this.


During the week after that class, I got to wondering about it so I wrote test programs in every language I had access to. I compiled my results and came to realize that the value of y depended very much on how the intermediate values of x were stored and retrieved. An assembly listing of the C code and reference to the Java virtual-machine specification confirmed my suspicions.

So I wrote up my results as posted below.


Sir:

I've been testing that problem we had trouble with in last week's 
Java class (Friday night):

    x = 3;
    y = x + x * ++x;
    System.out.println("x = "+x+"; x + x * ++x = "+y+"\n");

You were expecting y to be 20, whereas the Java program gave it as 15.

I have tried the same code in a variety of different C-like languages 
with the following outputs:

C (DOS program generated with Visual C++ v1.52):
for x = 3; x + x * ++x = 20

C (gcc under Linux):
for x = 3; x + x * ++x = 20

Perl:
for x = 3; x + x * ++x = 20

AWK:
for x = 3; x + x * ++x = 20


However ... 
Java:
for x = 3; x + x * ++x = 15

JavaScript:
for x = 3; x + x * ++x = 15 

PocketC (on Palm Pilot):
for x = 3; x + x * ++x = 15


It turns out that the result all depends on how the language retrieves 
the variable values.  Under Visual C++, I got the following exerpt 
from the assembly listing:

; y = fffc
; x = fffa
;     volatile int x;
; Line 6
;     int y;
; Line 7
;             
;     x = 3;            
; Line 9
    *** 00000d  c7 46 fa 03 00      mov WORD PTR -6[bp],OFFSET 3
;     y = x + x * ++x;
; Line 10
    *** 000012  83 46 fa 01         add WORD PTR -6[bp],OFFSET 1
    *** 000016  8b 46 fa        mov ax,WORD PTR -6[bp]
    *** 000019  f7 6e fa        imul    WORD PTR -6[bp]
    *** 00001c  03 46 fa        add ax,WORD PTR -6[bp]
    *** 00001f  89 46 fc        mov WORD PTR -4[bp],ax

Every reference to the x variable goes to that memory location (this 
turns out to be the crucial point).  The first operation performed 
on Line 10 is the pre-increment, which changes the value stored in x.  
After that point, the modified value for x is retrieved from that 
memory location for all the other references to x:

        x + x * ++x = 4 + 4 * 4 = 4 + 16 = 20

Both C compilers generated native Intel code, as I assume that both 
Perl and AWK had done as well.  Intel code does not normally perform 
arithmetic operations on a stack, but rather with registers and memory 
locations, as shown above.  Furthermore, it fetches the values from 
memory in the order in which they are used, in accordance with the 
precedence of the operators.


On the other hand, Java and JavaScript are not bound to a specific 
processor.  From what I have read about the Java Virtual Machine, 
it is very stack-oriented and does perform arithmetic operations on 
a stack.  That is in agreement with the theory I had learned years ago, 
which was not Intel-based (the 8080A had just become popular).

Here is what I understand that sequence of operations to have been in Java:

    y = x + x * ++x; --<convert to post-fix>--> y (x) (x) (x) (++) * + =

1. Fetch the first instance of x.  
    Read x from memory and push the value onto the stack.  
    Stack == [addr(y),3]  (top of stack is to the right)
2. Fetch the second instance of x.  
    Read x from memory and push the value onto the stack.  
    Stack == [addr(y),3,3]  (top of stack is to the right)
3. Fetch the third instance of x.  
    Read x from memory and push the value onto the stack.  
    Stack == [addr(y),3,3,3]  (top of stack is to the right)
4. Perform the pre-increment.
    Pop the top value, increment it, and push the result.  
    Also update x's memory location.
    Stack == [addr(y),3,3,4]  (top of stack is to the right)
5. Perform the multiplication.
    Pop the top two values, multiply them together, and push the result.
    Stack == [addr(y),3,12]  (top of stack is to the right)
5. Perform the addition.
    Pop the top two values, add them together, and push the result.
    Stack == [addr(y),15]  (top of stack is to the right)
6. Perform the assignment operation.
    Pop the stack and write the value to y's memory location.
    Stack == [] (ie, empty)
    
In standard post-fix evaluation, the values are read from the variables 
in strict left-to-right order and pushed onto the stack or combined with 
the top of stack if an operation is performed.  Therefore, the first 
references to x were read before the increment was ever performed.  On 
the other hand, the Intel-based code handled each reference to x by going 
back to the memory location even after the increment.

As we see from the PocketC test, a C program written on a stack-oriented 
platform yielded the same results as the Java program had.  That would 
mean that C code can compile and perform differently on different platforms 
(variations in the definition of integer types is well-known).  But since 
Java is designed to be platform-independent -- or rather dependent on a 
standard virtual machine --, I would expect Java programs to run more 
consistently across different platforms.  Once you know what to expect from 
a Java program, you should be able to expect the same results when you run 
it on any platform.

Though the real lesson is as you said: always test your code before you 
make pronouncements about it.

=================================================
LISTINGS:

test.c:

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    int x;
    int y;

    x = 3;
    y = x + x * ++x;
    printf("x = %d; x + x * ++x = %d\n",x,y);
    return 0;
}

-----------------

test.pl

#! perl -w
# file: test.pl

my $x = 3;
my $y = $x + $x * ++$x;
print "x = $x; x + x * ++x = $y\n";

## END ##

-----------------

test.awk

BEGIN{ x = 3; y = x + x * ++x;}
END{print("x = " x "; x + x * ++x = "y);}

-----------------

Test.java

public class Test
{
    public static void main(String args[])
    {
        int x;
        int y;

        x = 3;
        y = x + x * ++x;
        System.out.println("x = "+x+"; x + x * ++x = "+y+"\n");
        System.exit(0);
    }
}

-----------------

test.html

<HTML>
<HEAD>
<SCRIPT>
function doIt()
{
//    int x;
//    int y;
        
    x = 3;            
    y = x + x * ++x;
    document.write("x = "+x+"; x + x * ++x = "+y);
}
</SCRIPT>
</HEAD>
<BODY>
<SCRIPT>
doIt();
</SCRIPT>
</BODY>
</HTML>

=================================================

Return to Top of Page
Return to My Programming Home Page

Contact me.


Share and enjoy!

First uploaded on 2008 February 08.
Updated 2020 January 19