Eval really is even more dangerous than you think


Preamble: I know about this excellent article:
http://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html

I have a bigger objection to using eval than Ned: Python has potentially unsafe base types.

I had this discussion with a guy at PyCon about safely processing templates and doing simple user-defined formatting operations, with data from user input interpolated by Python, without rolling your own home-made language: using Python for only the basic operations.

My friend told me that interpolating data with Python, with all builtins and globals removed, could be faster. After all, letting your customer specify "%12.2f" in their custom preferences for item prices can't do any harm. He even said: nothing wrong can happen, I even reduce the possibilities with a regexp validation. And they don't have room to fit Ned's trick in 32 characters, so how much harm can you do?

His regexp was complex, so I asked him: can I try something?

And I wrote "%2000.2000f" % 0.0, then '*' * 20, and 2**2**2**2**2.

All of them validated.
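To make this concrete, here is a minimal sketch of that kind of setup. The whitelist regexp and the restricted_eval helper are hypothetical stand-ins (my friend's actual regexp is not reproduced here); the point is only that short, innocent-looking inputs pass validation and still produce large values with every builtin removed.

```python
import re

# Hypothetical whitelist, standing in for the friend's regexp: digits,
# lowercase letters, quotes, spaces and a few operators, 32 characters at most.
SAFE_EXPR = re.compile(r"""[0-9a-z%.*'" ]{1,32}""")


def restricted_eval(expr):
    """Evaluate expr with no builtins and no globals, if it passes the whitelist."""
    if not SAFE_EXPR.fullmatch(expr):
        raise ValueError("rejected by the validator: %r" % expr)
    # An empty __builtins__ mapping stops Python from injecting the real one.
    return eval(expr, {"__builtins__": {}}, {})


padded = restricted_eval('"%2000.2000f" % 0.0')
stars = restricted_eval("'*' * 20")
tower = restricted_eval("2**2**2**2**2")

print(len(padded))         # characters produced by a 19-character input
print(len(stars))          # characters produced by an 8-character input
print(tower.bit_length())  # bits in the integer produced by a 13-character input
```

Nothing here needs a function call, an import or an attribute lookup, which is why removing builtins and globals does not help.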

Nothing wrong, is there?

My point is that even if we patched Python's eval function and/or managed to sandbox Python, Python is inherently unsafe in its base types, just as Ruby and PHP (and Perl) are.

And since we can't change the behaviour of the base types, we should never let people use a Python interpreter, even a reduced one, as a calculator or a templating language with uncontrolled user input.

Base types and keywords cannot be removed from any interpreter.

And take the string defined as:

"*" * much

This repeats the string "much" times and thus allocates that many bytes of memory (the same goes for Perl, PHP, Ruby, bash, Python, Vim script, elisp).
And it can't be removed from the language: operators like * and the base types are part of the core of the language. If you change them, you have another language.
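A quick way to watch that allocation, as a small sketch: repeated_size is a helper name I made up, and sys.getsizeof reports the size of the resulting string object in CPython.

```python
import sys


def repeated_size(count):
    """Return the size in bytes of the string object built by "*" * count."""
    return sys.getsizeof("*" * count)


# The input grows by a few characters, the allocation by orders of magnitude.
for count in (10, 10_000, 10_000_000):
    print(count, repeated_size(count))
```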

"%2000000.2000000f" % 0.0 is funny to execute, it is CPU hungry.

We could change it. But I guess that a lot of applications out there depend on Python/Perl/PHP/Ruby NOT throwing an exception when you do "%x.yf" with x+y bigger than the possible size of the number. And where would we set the limit?
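If an application only needs user-chosen float formats, one option is to skip eval entirely and set that limit itself. The sketch below is an illustration under assumptions, not a recommendation of specific values: MAX_WIDTH, MAX_PRECISION and format_price are names and numbers I picked, and only specs of the form %<width>.<precision>f are accepted.

```python
import re

# Hypothetical caps; the right values depend on the application.
MAX_WIDTH = 32
MAX_PRECISION = 8

# Accepts only %<width>.<precision>f, with both numeric parts optional.
FLOAT_SPEC = re.compile(r"%([0-9]{0,4})(?:\.([0-9]{0,4}))?f")


def format_price(spec, value):
    """Apply a user-supplied %-style float spec after checking its size."""
    match = FLOAT_SPEC.fullmatch(spec)
    if match is None:
        raise ValueError("unsupported format spec: %r" % spec)
    width = int(match.group(1) or 0)
    precision = int(match.group(2) or 0)
    if width > MAX_WIDTH or precision > MAX_PRECISION:
        raise ValueError("format spec too large: %r" % spec)
    # The spec is applied with the % operator directly; no eval is involved.
    return spec % value


print(format_price("%12.2f", 3.14159))
```

With these caps, "%12.2f" is applied as usual while "%2000.2000f" raises ValueError before any formatting happens.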

Using any modern scripting language as a calculator is like being a C coder who still does not understand why using printf/scanf/memcpy deserves immediate elimination from the C dev pool.

Take the int... when it overflows, Python dynamically allocates a bigger number. And since the exponentiation operator groups from right to left (2**2**2 is 2**(2**2)), chained powers grow even faster, allocating huge amounts of memory in very few iterations. (Ruby does too; Perl requires Math::BigInt to get this behaviour.)
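A small sketch of how fast this goes. tower_bits is a helper name I made up; it measures the result with bit_length() rather than printing it, because the numbers get too large to display comfortably.

```python
right_assoc = 2 ** 2 ** 2 ** 2      # parsed as 2 ** (2 ** (2 ** 2))
left_assoc = ((2 ** 2) ** 2) ** 2   # what left-to-right grouping would give


def tower_bits(height):
    """Bit length of 2 ** 2 ** ... ** 2 written with `height` twos."""
    value = 1
    for _ in range(height):
        value = 2 ** value
    return value.bit_length()


print(right_assoc, left_assoc)
for height in range(1, 6):
    print(height, tower_bits(height))
```

Do not try a height of 6: the exponent itself would already be a 65537-bit number, so the result cannot fit in any memory.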

It is not that Python is a bad language. It is an excellent one, because of «these flaws». C knight coders like to bash Python for this kind of behaviour, because of this uncontrolled use of resources. Yes, but in return we avoid the hell of malloc and have far fewer buffer overflows, bugs that cost resources too. And C does not avoid this either:

#include <stdio.h>

int main(void) {
    /* pads the output to a field 100000 characters wide */
    printf("%100000.200f", 0.0);
    return 0;
}

And OK, JavaScript does not have the "%what.milles" bug (nicely done, JS), but it probably has other ones.


So, the question is: how to be safe?

As long as we don't have powerful interpreters, like Python and the others, with resource control, we have to resort to other languages.


I may have an answer: use Lua.

https://pypi.python.org/pypi/lupa

I checked: most of this explosive base-type behaviour does not happen there.

But, please, never use Ruby, PHP, Perl, bash, Vim, elisp, ksh, csh or Python as a reduced interpreter for basic scripting operations or templating with uncontrolled user input (by controlled I mean reviewed by a human who knows how to code). Even for a calculator it is dangerous.

What makes Python a good language also makes it a dangerous language. I like it for the same reasons I fear letting user input be interpreted by it.

EDIT: format (http://pyformat.info/) is definitely a good idea.
EDIT++: http://beauty-of-imagination.blogspot.ca/2015/04/so-i-wrote-proof-of-concept-language-to.html

6 comments:

xhevahir said...

Don't Lua's load() and loadstring() have the same characteristics as Python's eval(), though? How is Lua safer in this respect?

jul said...

If the author of LuaJIT says Lua cannot be sandboxed, maybe we could trust him?
http://lua-users.org/lists/lua-l/2011-02/msg01606.html

So I decided to write my own POC of safe templating (http://beauty-of-imagination.blogspot.ca/2015/04/so-i-wrote-proof-of-concept-language-to.html) to solve the problem.

jul said...

And yes it took me 1 day, and I am self taught in language theory/CS.

keselbingo said...

We're all glad you're here, so strong and bringing light to the ignorant.

But why is this blog in the Planet Python feed?

jul said...

Because I politely asked for it. Asking if it was ok.

And the code is in Python: the base implementation of the language is 400 lines of code, and the license is free enough for people to use it. And it is on PyPI.

And it solves the unsafe python eval problem.

And yes you may be ignorant (or not), but I don't care.

Is it enough? I have an exhausting full-time job; I don't have time to do like other bloggers discovering the marvels of map/reduce, distributed systems, list/dict comprehensions. I mastered these a long time ago.

Last but not least, I am a programmer who can do Python, not a Python programmer. I don't have any fan attitude toward this language: I use it for its good parts, and try to fix the bad parts.

This post is about a solution for confined safe arbitrary code execution. Do you have a better pythonic solution?

I guess not.