random bugs in non random code

The last bug I had to fix took me 2 days to discover. I updated from python 3.2 to python 3.4. One of the tests failed randomly when it was executed with python 3.4

The error happened in the LR0 parser

======================================================================
FAIL: testArithmetic (tests.unit.test_Parser.TestLR0Parser)
----------------------------------------------------------------------
Traceback (most recent call last):
    File "/home/travis/build/nesaro/pydsl/tests/unit/test_Parser.py", line 129, in testArithmetic
     self.assertFalse(parser(['123','+','+']))


nose.proxy.AssertionError: True is not false

(build details)

It is a simple operator grammar, and that expression should obviously fail. But sometimes it didn’t.

How the LR0 parser works

It scans the grammar and generates a table with states and transitions. When the parser is called, the input moves the state until the end of the string is consumed or a failure state is reached. The test returns the final state with the given input

Looking for the bug…

So the first thing I tried was to test if there was any difference in the inputs of the functions. After that, I compared the LR table between a success and a failure. Other than the order of a dictionary (which is expected to vary), there was no difference.

I inspected every hash implementation in the code, and made a few classes singletons, but that didn’t change the behaviour.

Eventually I looked at the changes in the state of the table. For the same input, two different values were being returned. At the beginning I thougt that there was an issue with the hash function, so I reviewed them again unsuccessfully. How can a dictionary return different values for the same key?

The bug and the fix

Even worse, the dictionary apparently had the same key twice.

dict.keys()
>>>> ['*','*',')',.....]

and here is the function that gets an element from the dictionary:

    def insert(self, state, token):
        """change internal state, return action"""
        for symbol in self[state]:
            from pydsl.Check import check
            if token != EndSymbol() and isinstance(symbol, TerminalSymbol) and check(symbol.gd,token):
                break
        else:
            if token != EndSymbol():
                return {"action":"Fail"}
            else:
                symbol = EndSymbol()
        try:
            return self[state][symbol]

the function is returning the first element that checks if the token matches the symbol. Two very similar symbols are stored in the dictionary, both have the same string representation and both have the same grammar structure. The function iterates over the dictionary and two elements could be returned… so one of them could be returned randomly. The bug was in the construction of the table (two elements with the same meaning had different entries) and in the way the dictionary is accessed (return the first element that matches,and ignore the rest)

So why python 3.4?

the bug never happened before python 3.4. What’s new in python 3.4 to explain the new behaviour?

from the 3.3 release:

Hash randomization is switched on by default.

which means that the hash of the keys in the dictionary are now randomized and therefore the order of the dictionary may differ between two executions.