The point was in Python 3.2:
foo⋅bar=42
# File "<stdin>", line 1
# foo⋅bar=42
# ^
# SyntaxError: invalid character in identifier
### This is another bug that is not in the scope of the post
### http://bugs.python.org/issue2382
print(ord("foo⋅bar"[3]))
# 8901
foo·bar = 42
print(ord("foo·bar"[3]))
# 183
A dot is a punctuation mark, no? And variable names shouldn't contain punctuation.
Besides, the two dots look the same, so shouldn't they be treated as the same character?
So I opened a bug and was pointed, very nicely, to the fact that the Unicode character "MIDDLE DOT" (code point 183, the one Python accepts) is indeed punctuation, but it also has the Unicode property Other_ID_Continue. As stated in Python's rules for identifiers, it is therefore perfectly legitimate in a name.
That is the point where you actively search for good documentation to understand what malfunctioned in your brain. Then a Perl coder pointed me to Perl Unicode Essentials by Tom Christiansen. Even though the first third is about Perl, it is the best presentation on Unicode I have read so far.
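To see the difference between the two dots without trusting your eyes, here is a minimal sketch using the standard `unicodedata` module and `str.isidentifier()`. The helper name `describe` and the two constants are mine, for illustration only; the characters are written as escapes so the code does not depend on how your editor renders them.

```python
import unicodedata

MIDDLE_DOT = "\u00b7"    # code point 183, accepted inside identifiers
DOT_OPERATOR = "\u22c5"  # code point 8901, rejected inside identifiers


def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return (ord(ch), unicodedata.name(ch), unicodedata.category(ch))


for ch in (MIDDLE_DOT, DOT_OPERATOR):
    candidate = "foo" + ch + "bar"
    # isidentifier() applies the same lexical rule as the compiler
    print(describe(ch), candidate.isidentifier())
```

The general category alone ("Po" for punctuation, "Sm" for math symbol) does not settle the question: what makes MIDDLE DOT acceptable is the extra Other_ID_Continue property, which `unicodedata` does not expose directly, so `isidentifier()` is the practical check.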
And then I understood my mistakes:
- I (visually) confused a glyph with a character: the same glyph can be used for different characters;
- Unicode is much more than simply extending the set of usable glyphs (that I knew, but I had not grasped that I knew so little).
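The glyph/character confusion is not specific to dots. As an illustration that is not from the original bug report, the sketch below takes three capital letters that most fonts draw identically and asks Python what they actually are. The variable names are hypothetical; only `unicodedata.name` is a real standard-library call.

```python
import unicodedata

# Three characters that usually share one glyph (an illustrative choice)
lookalikes = ["\u0041", "\u0391", "\u0410"]

# Map each code point, formatted as U+XXXX, to its official Unicode name
names = {"U+{:04X}".format(ord(ch)): unicodedata.name(ch) for ch in lookalikes}

for code_point, name in names.items():
    print(code_point, name)

# Same drawing on screen, yet three distinct characters for the interpreter
all_distinct = len(set(lookalikes)) == len(lookalikes)
print(all_distinct)
```

When an identifier that "looks right" is rejected, or two names that "look equal" compare unequal, printing the names this way is usually the quickest route to the culprit.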
By the way, if you need a reason to switch to the current production version, 3.3.0,
remember that Py3.3 is still improving its Unicode support.
py3.2:
"ß".upper() # ß
which is a wrong result, while in py3.3:
"ß".upper() # SS
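A small sketch of what this change means in practice, assuming Python 3.3 or later (on 3.2 the result differs, as shown above): uppercasing can change the length of a string, so code that assumes one character in, one character out will be surprised.

```python
sharp_s = "\u00df"  # the German sharp s

upper = sharp_s.upper()

# One character becomes two: case mapping is not length-preserving
print(upper, len(sharp_s), len(upper))
```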