Why we should say bye-bye to "hello world"!

In physics, when someone comes up with a wonderful idea, the next question is: can you show us something practical — often just fun — we could do with it?

When introducing programming, the classic "hello world" example is there to prove how easy it is and how wonderfully anyone can code.

In my opinion, this example attracts incompetent companies and utterly unskilled developers alike to a field that may provide us money, but also the burden of their misconceptions. And sometimes, cowardly, I dream I could trade the money I earn for living in a saner environment. Yes, at this level of insanity, I would trade freedom for comfort.

Follow my steps into the realm of the white rabbit on steroids.

What are you supposed to do to print "hello world" on a screen?

Short answer:
  • open a file
  • write the incantation called programming
  • save/compile/feed it to the interpreter
  • Magic Happens
Neat, no?
That is exactly the harmful way of thinking.

This short example is a kind of ode to the black magic of reusability that is plaguing today's world.

After you have done the hello world, you still know nothing, Jon Snow.

In fact, you remain ignorant of the biggest problem of programming: resources.

Resources you are wasting by requiring gigabytes of disk space, plus memory, a screen, a keyboard, an operating system and a 250 W power supply, just to be able to launch your editor.
The prerequisites you expect from a computer to do that are huge.

In assembly language, the hello world example is pretty straightforward.

Given that we have full control of memory, with access to both main memory and video memory (the framebuffer), it may fit in less than 6 kB.

I have no operating system. Just memory I can read and write.

First I must think ahead. Does a computer know how to print a character on a screen at startup?



For this, as in hello world, a firmware on the motherboard that knows nothing of the outside world has to boot, probe its devices, and bring them to life.

Actually, the motherboard could launch software to do it... if only it knew about the thousands of peripherals sold everywhere, and had the code to boot them, patched to the latest versions. But since deluded people falsely claimed that programming was black magic, every computer user has come to expect a computer with no operating system loaded to already be configured enough to print stuff on screen, like:

This is ambios feng shui 2008
Let me show off and print stuff on screen even though I should not have been programmed to do it, so that customers are happy watching my gibberish status.


So on top of my six thousand instructions, as simple as hello world itself, to make hello world work I also need tens of thousands of lines of code just to have the screen come to life.

But how do I get this code?

By the black magic of PCI.

A decoder at a fixed address that, when wired to the right cable, can exchange information to turn devices on, through a very elaborate dance like:

Knock knock, address (PCI id lit), (model id lit). I am the PCI investigator; state your identity.

Yes, "I am ...": a 256-word CV to print for your amazed eyes. I also want these relative addresses reserved for me, these interrupts registered for me, and these parameters set. Oh, and call me back later.

This is done; your extra info is ....

Okay, now you can go to this address, jump into my firmware in pure assembly, and execute it. Thanks....

Woohoo, I love executing arbitrary code..... Oh! And now I can run routines at fixed addresses to call code before your hello world even exists! In my "console"! I can print my status...
I am the mad CPU, woohoo....

If I were designing a device claiming to be secure, I would refuse to let the computer execute arbitrary code. I would be scared to see a screen working before MY code had set it up.

So we stack black magic on black magic for the sake of unrealistic expectations. We want computers that execute arbitrary code from outside vendors — which may be funded by people whose interests diverge from mine — to be secure... That is unrealistic.

We expect all computers to be the same, with a purely abstracted, twisted model that contradicts reality.

Claiming that this is atomic // executed in one cycle

*p = 4;

is a lie.
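Even from a high-level language you can see that a "single" assignment or increment is not one step. A quick sketch using Python's `dis` module (Python here only as an illustration — the point is language-agnostic):

```python
import dis

def store(p):
    p[0] = 4   # the "atomic-looking" *p = 4
    p[0] += 1  # even less atomic: read, add, write back

# One source line expands into several bytecode steps (load, add, store...);
# between any two of them, another thread or an interrupt can slip in.
ops = [i.opname for i in dis.get_instructions(store)]
print(ops)
```

The same decomposition happens, at a finer grain, in the micro-ops of a real CPU.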

Memory should not be presented as linear, because causality matters.

Huh. When you design a computer, the speed limit is set by the longest wire, in nanometers; with the frequency tied to wire length, and the speed of light as a hard limit, you cannot do much.

Open your computer and look: the memory banks are CENTIMETERS away from the CPU. Do the computation, look at that figure, look at the frequency, and say: what?

Yep: basically, if we insisted on accessing all memory in one cycle, our computers would be millions of times slower. And the more memory, the greater the physical distance. So the more linearly addressable memory you have, the lower your frequency. Real memory is never linear, and will not tend towards linearity.

That is the raison d'être of the address bus and of the L1/L2/L3 caches.

RAM is accessed in hundreds of cycles, cache in tens of cycles.

The more memory you want to address, the more layers of indirection you add.

The result: the more memory you use, or the faster your cores are, the more speed you lose.

The ultimate L0 memory is called the registers. That is where programming really happens: playing with a memory model of at most a few dozen words, in one cycle. Hardly any programming language lets you touch it directly. A claustrophobic place on a die of silicon, inside what you call a core. A place you are pushed away from by programming languages and operating systems. You never went there. It requires a hellish initiation from which you come back seeing the dark side of the world. Instead, you let compilers, interpreters and operating systems transform what you said into code that may not respect your intentions.
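The cost of non-local memory access can be glimpsed even from Python. A rough sketch (timings are machine-dependent, and Python adds its own overhead, so treat the printed numbers as illustrative only):

```python
import random
import time

N = 1_000_000
data = list(range(N))
sequential = list(range(N))
scattered = sequential[:]
random.shuffle(scattered)

def walk(indices):
    # Same amount of work either way: read N items and sum them.
    start = time.perf_counter()
    total = sum(data[i] for i in indices)
    return total, time.perf_counter() - start

seq_sum, seq_t = walk(sequential)
rnd_sum, rnd_t = walk(scattered)
# Same result, but the scattered walk defeats caches and prefetching.
print(f"sequential: {seq_t:.3f}s  scattered: {rnd_t:.3f}s")
```

On most machines the scattered walk is measurably slower, even though both loops do exactly the same arithmetic.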

So when you access the other memories (not the registers), every request is analysed, decoded and rerouted by circuitry.
They are all handled by the CPU, with circuitry complex enough to be considered a computer in itself once you have learned to program microcontrollers. It presents you data not from where you asked, but from another location: the cache. Magically! And in the process it may have disrupted the other code flowing through the pipelines, breaking the claim of isolation.

It makes the magic of performance happen by juggling three levels of indirection to mask the computer's real performance while perturbing other code. A cost you hit "only in the worst case".

The worst case being, ironically, the one you wish to reach: your computer fully occupied, to justify the investment.

Where is the worst case? Right at the transition to full load, in the best possible situation for a business: a lot of customers with different contexts, doing a lot of different things. You wish for every business to sit at the crossroads of many different profiles of buyers and providers; that is, you wish to become a de facto standard in a place where you minimize risk. Risk comes from not diversifying your sources of income and your providers.

Well, that is the worst case: the computer will page-fault more often than average, penalizing the best possible case, because of the expectations you had. You will work for the locality of variables: you are therefore going to cluster things and reduce differences. That is the purpose of big data: creating ways to reduce customers to a set of "relevant data", classifying them by sex, age, location... You also have devops justifying running the CPU at 100% to reduce OPEX, which degrades performance, which brings more levels of indirection, redirection, dependencies and availability issues... sucking you into the spiral of diminishing OPEX through devops black-magic mantras. Did I mention it also makes your CPU leak information, turning security into a nightmare?

Hell is paved with good intentions, and bankruptcy with inept cost reductions.

Just to avoid a worst case in a computer...

Just because some believe in black magic...
Just because people have no intuition...
Just because idiots propagate the rumour that once you have done a "hello world" you have learnt programming. No! You have unlearned that there is a price for everything, and that people have made a lot of questionable cost/price choices on your behalf.
Just because of this, your choices are constricting your business income.

You may not need an operating system in the first place. You may not need firmware, nor a bus to communicate with the hardware, nor even L3 cache. You may be totally fine using a 1.5 V battery to light LEDs on a gigantic banner, with 255 levels of colour per pixel, for the three days of a comic convention — because you obviously may not want to bring along a full computer with a 17-inch screen and install a C compiler, Linux, maybe Python, Apache... plus drivers for the banner that you might have to write yourself if it is custom, resulting in yet more lines of code for the sake of the black-magic god of interoperability. Just writing a hello-world module/driver for Linux outweighs the complexity of lighting a LED in assembly language.

Hello world instils a very bad habit of magical thinking about the use of resources.

Whatever happens, the printf equivalent used for hello world implies an abstraction of interfaces, and assumptions reflecting a model of what computers are that is totally disconnected from reality.

The first lie of hello world is that the context in which you program does not matter.
That resources are free.

All business should be about mastering costs and prices. And people who do not care about costs should be barred from going anywhere near production lines.

It is another lie of programming: that code is what matters. Code matters, but not as much as data structures, because the art of data structures is to map the data as close to the wires as possible, in a fashion that plays to the strengths of the computer. For stuff to print on a screen, you want a contiguous memory mapping, so that the specialized media/string/blitter routines can kick in.

If you want a good data model for accessing data in memory through a level of indirection, your data structures have to map the reality of the hardware memory.

For every use case, data structures are where the biggest productivity gains are found.

Read this essay for the complete argument: "You're Doing It Wrong" (Poul-Henning Kamp).

Yep, knowing about B-trees is more essential than doing recursion, OOP, or even functional programming. The language matters less than fine control of your data locations.

Hello world focuses on code, not on data.

And in four pages I have not even raised the issue of the bloated dependencies involved in hello world: "standard" libraries, module management, exec/thread models, scheduling, I/O and devices, dynamic libraries, security/user models, permissions, the optional arguments in the signature of printf, then the handling of function calls with the stack, implicit memory allocations...

Hello world is an invitation to get what programming is the wrong way. It pretends that knowing the map is more important than the territory.

Nothing comes as a solution except the careful analysis of a problem. Hello world is sold as a universal solution to your programming problems: expecting developers with limited knowledge of computers to make code work well enough in a business context. The result is cash flow poured into solutions with randomly diminishing returns, making your growth antagonistic to your profits, all on the basis of initial figures from a toy proof of concept that pretends to "scale".

Pushing hello world as the introductory step into programming is as misleading as claiming that being able to turn on a camera's flash is the first mandatory step towards being a fully competent photographer... Programming is not about coding and delivering solutions; it is about analyzing and solving problems, eventually by writing code, when that costs less money in the long run than not coding. For this, the complexity of computers should not be hidden under stacks of magic.

Every time someone writes "hello world", Cthulhu shimmers, kittens are killed, and you can hear the Knights of the Apocalypse approaching, one hello world at a time.

Hello world is magic; thus it should be excommunicated from programming culture and burnt as a witch. All references and books printing this example should be burnt in the name of the enlightenment of the people by pure reason. No obscurantism should be mistaken for the careful mastery earned through trial and error. Amen.

How I am experimenting with plants, stoichiometry and ecology

So, I am bored.

And when I am bored I do something new. And this time I am growing plants.

Tomatoes from the old times, from before mechanization. Because I hate gardening, but I love tomatoes with taste [easy-to-harvest tomatoes suck].

In the process I learnt a lot of things.

In fact, since I knew my grandfather had worked on the topic, out of curiosity I went to look at his research [on deep fertilization], because I was disappointed by my results with just water and by how my plants were dying.

I also remembered my talks with my totally ecology-crazy neighbour [ecological stoichiometry].

This article is not exactly about his talks. His talks are about the ratios of C / H / O / N / K / P, how their availability favours some species, and how equilibrium can emerge from the evolution of a complex system. Complex systems, chemistry, physics, non-Euclidean geometry, our families and good food have always been our favourite topics. Actually, the topic is broader than my own plants. Or not...

Well ... I live in a small flat in Montréal, I don't have the space to build an ecosystem. So I have to cheat.

Hence my grandfather. To cut it short: without prior knowledge, using his hints, making my own KNOP (potassium, nitrogen, phosphorus) cocktail and feeding it at the right time, I got actual results.
Chemistry works: the effect of a non-optimal 20-20-20 mix on my window chilis

What is the alternative to using my grandfather's KNOP chemical ratios?

Healthy food growing. With compost.

Well, my plants would die, it would be slow, and it would cost a lot. I would not be able to feed myself and, worse, it is NOT sustainable.

A compost's KNOP ratio is hard to control. So is its acidity (plants want a slightly acidic soil), and it is easy to grow bacteria and other stuff that can pollute your food, make you sick, or kill the plant. The worst part is the energy efficiency of the process, compared to the industrial chemical process, just to create the same chemicals "naturally".

As a scientist, I would not trust my knowledge without experimentation.

And growing plants has convinced me of one thing: so-called organic sustainable development is not.

It requires too much energy, knowledge, money and time. It also generates a lot of waste and inconvenience.

Actually, food being more expensive and harder to produce is Malthusianism. In every part of a city where "healthy food" appears, part of the population either cannot afford to eat, has to cut other spending, or is forced to move away.

Meanwhile, their tomatoes are not even as good as the ones my father used to grow.

I love tomatoes, and I want to help the world and the modest share this pleasure, so I want to herald my own message: don't listen to rich spoiled kids and obscurantist, stupid people. Learn to grow your own plants, learn to value knowledge, and make your own decisions according to where you are and what you have access to. I don't have space, I don't have time, so I go for KNOP; but it is totally legitimate, in a different environment, to go for other solutions (like [these amazing Swiss people]).

Follow your way, learn, have fun and awesome tomatoes. And kick the "healthy food" morons in the balls.
My mad scientist kit. And yes, I feed my plants with blood, mwahaha.
I have vampire plants.

IPv6: I don't like it.

How do you protect a house that has no front door?

So it is with IPv6: you can no longer define an inside and an outside. It is like a house where every room is accessible from the outside, and you have to put a doorbell, a lock, an alarm and an intercom back on every door.

And what a harebrained idea to put the MAC address inside the IP address: do we want to help governments track users that badly, while helping attackers at the same time?

It is like badging in with your identity card to move around inside every house, including your own.

How can IPv6 be adopted when the transition requires replacing every piece of equipment and all the know-how, while it is already so hard to find network admins competent in IPv4 (which is far simpler)?

It is like switching from two-storey wooden houses to earthquake-proof buildings with a minimum height of 40 storeys, and then being surprised that there is a housing crisis and that the poorest are excluded.

IPv6, HTTP/2.0, OAuth 2.0: same fight. They are elephantware that makes the internet regress from a digital Gutenberg press open to all into a speciality for digital Tartuffes worthy of a Molière play.

This tendency to pass off the obscure as profound is called obscurantism.

Fuck IPv6, fuck the modern IETF which has become a den of acculturated scholars, fuck Unicode and modern computing.

PS: reading NANOG is instructive.

PPS: I am going back to pursuing grandfather's research on soilless plant growing, and on improving humanity's daily life through simple techniques that can be taught easily. Long live daddy Glaenzer: I was not raised to become a big jerk.

Rant: infectious diseases are coming back, and we are all responsible

Darwin wins; infectious diseases are coming back:

- measles back in the UK in 2015 (anti-vaxxers)
- first case of diphtheria in Spain since 1986 (anti-vaxxers)
- measles, mumps (which can cause sterility) and whooping cough back in the USA in 2015 (anti-vaxxers)
- syphilis (Canada) (a reason I did not have time to find)
- tuberculosis (antibiotic-resistant; lack of care for the homeless) (Paris)
- poliomyelitis (conflict)
- diphtheria (Eastern Europe) (disorganization of health services after the post-Cold-War political upheavals)

The trend is to stupidly frame the problem as anti-vaxxers against the rest of the world. That is not true. With a good school education, there would be no anti-vaxxers.

The factors are certainly multiple (climate change, conflicts, insufficient aid to emerging countries, migration flows, megacities, recession, public debt). But within a trend of systemic re-emergence, the "avoidable" cases — caused by the choices of adults imposed on children who are vulnerable and have no freedom to choose — are not tolerable, all the more so since those choices can affect other people: medical and rescue personnel, and, first of all, other children...

Sure, anti-vaxxers are big jerks and I don't like them. But they are the tree that hides the forest.

Governments are even more responsible:
- disorganization of health services (policies based on first-order accounting visions);
- rising cost of care and falling efficiency (prevention costs less than treatment);
- abandonment of vulnerable populations, who become both victims and vectors (the homeless, victims of conflicts, refugees, migrants, disadvantaged populations);
- disorganization of education, favouring the training of a skilled workforce over the education of enlightened citizens able to understand what a vaccine and an infection are;
- conflicts that cost countries entire decades of progress (the southern zone, Syria (which, from experience, had quality medicine), Iraq, Afghanistan, Palestine, Ukraine);
- anti-science discourse (climate-change denial);
- climate change / energy policy.

I hope that one day criminal liability will actually be applied to politicians as well (the contaminated-blood scandal, mad-cow disease... which led to nothing).

That said, I am not giving a blank cheque to the pharmaceutical companies tempted to create marginal innovations in order to renew their patents.

That is why I support my buddy Charles Gagnon when he is among the rare nurses protesting against the wage freezes in his hospital.

And if I write, it is also to talk about the worst of all: the apathetic citizens who do understand, yet selfishly think "every man for himself" because they are still warm and safe... for now.

Our lack of solidarity will kill us if we are not careful, for we are as much the cause as the potential future victims, along with our children, nephews, nieces and loved ones.

Why big data is a fraud: the actual dot-com bubble according to CS 101

Just yesterday I was told: "You know, this big-O notation, these database indexes, that's 30 years old. It is not true any more."

And I answered: "Well, Newton said that if you jump from the seventh floor of this building you should die. That is 200 years old; why don't you give it a try? This knowledge is so old it must be obsolete."

The fact is big O notation still matters.

CS 101 cheatsheet

Basically, the first lesson of CS 101 is about complexity. It is very basic (except for the readers of Hacker News, who need a D3.js animation to think they understand something and cannot read text).

It says: whatever you do, the bigger a container is, the more resources it takes to retrieve an item from it. However, you can trade memory for speed (and vice versa).

Electronics 101 says: you can have all the memory you want, but the bigger it is, the more it costs you in wiring. You can trade indirection (speed) for money, but the cost of linear addressing grows more than linearly, so you end up trading speed for money anyway. Which means you get diminishing returns.

What does it mean?

Imagine you are poor and have one pair of socks. How long to find it? One iteration.

You are rich, you have 1000 pairs of socks and want to find one. How many iterations?
Well, it depends. If you are organized and have space (memory), you can pre-organize your socks in a fancy way. It will then take on the order of ~log2(1000) ≈ 10 looks to find your pair. If you are poor (or not versed in the art of organizing socks), you will have to examine your socks one by one, with an expectation of 1000/2 looks on average (the worst case being 1000, with a probability of 1/1000).

Hmm, ~10 CPU cycles versus 500 seems pretty good.
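The sock arithmetic can be checked in a few lines. A sketch comparing a linear scan against a binary search over 1000 "pre-organized" (sorted) socks:

```python
socks = list(range(1000))  # 1000 pairs, already sorted by the organized owner

def linear_steps(target):
    # The poor man's search: look at every sock until you find it.
    for steps, sock in enumerate(socks, start=1):
        if sock == target:
            return steps

def binary_steps(target):
    # The organized search: halve the pile each time, ~log2(1000) ≈ 10 looks.
    lo, hi, steps = 0, len(socks), 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if socks[mid] < target:
            lo = mid + 1
        elif socks[mid] > target:
            hi = mid
        else:
            return steps

print(linear_steps(501), binary_steps(501))  # hundreds of looks vs about ten
```

Same collection, same item, two orders of magnitude between the two strategies — that is the whole of CS 101 in one print statement.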

After all, actual computers have up to 16 cores doing 2.8 billion cycles per second (the limits being the speed of light and heat dissipation).

Well, no.

Transactions (the stuff that yields money) require a timeline, an order. This is guaranteed by using a single core. So for transactions you are bound to those 2.8 billion cycles per second, as long as they are related to one another...

You may use multithreading, but you will have points of serialization (joins/locks/memory barriers), and doubling the cores/processes/threads/instances/containers tends to yield only about a +40% increase in speed. It has diminishing returns.
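That +40% figure is what Amdahl's law predicts when a bit under half of the work is serialized. A small sketch (the 43% serial fraction is my assumption, chosen to match the figure above):

```python
def amdahl_speedup(serial_fraction, workers):
    # Amdahl's law: the serial part of the work caps the total speedup,
    # no matter how many cores/threads/containers you throw at the rest.
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)

# With ~43% of the time spent in serial sections (locks, joins, barriers),
# doubling the workers buys roughly +40%, and 1000 workers barely 2.3x.
for workers in (1, 2, 4, 8, 1000):
    print(workers, round(amdahl_speedup(0.43, workers), 2))
```

Note the asymptote: with that serial fraction, no number of workers will ever get you past ~2.3x.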

In the long run, CS 101 basically says: "the more data you store, the more resources you will use to search for an item." Cost is a monotonically growing function of size, with an efficiency that is less than linear.

Basically, if 100 customers cost you $1 to handle, 1000 customers will always cost you more than $10.

How bad is it?


Going to a register: 1 cycle
Going to cache: ~10-150 cycles (depending on the level)
Going to RAM: ~500 cycles
Going to a hard drive: millions of cycles...

You can trade memory for speed, but the cost of indirection gives the system an absolute minimum latency.

You may use a cache, but it only reduces latency as long as you are in an under-run situation. It just masks the problem, with a delayed effect.

Resource use, however smart you are, remains a more-than-linear function of the size of the collection you search, aggravated by relationships.

It basically means that even if you are Google or Facebook, the more data and customers you have, the more your costs will increase.

In economic terms: the more customers you have, the lower your margins.

Are dot-coms stupid?

Hell no. It is all a question of opportunity. Thanks to QE, the stock exchange is full of liquidity.

These diminishing returns become noticeable and measurable only after enough growth. Time series grow linearly with time, and the customer base... grows its own way (slowly, most of the time, and the IPO often comes before full success).

Venture capitalists are in it for the money. They don't care if the market will not be sustainable in 15 years; they aim for a profit within 10 years, before the effects become noticeable.

Developers are... well... clueless, or needing money to pay back their student loans, and powerless.

Customers... well... if they don't adopt the new technology that seems 10x less expensive (for now), they don't care about the next 5 years, since they would be wiped out by the competition within 2.

And founders are either clueless (and lucky), or they have enough money to mask the problem. (When you plan to sell your shares for 10 times your investment after 5 years, you can afford to invest in diminishing returns for 5 years; it is just a question of how wealthy you are, and in most western economies that basically correlates with being born wealthy.)

I say it out loud: a business with systematic diminishing returns is doomed. If more customers means less profit, something is wrong.

The stock market is totally out of touch with economic reality, and that is the sign of a bubble.


On my far-fetched yet realistic hypotheses for eradicating bedbugs

I will pass over in silence a possible experiment whose outcome I do not know, but whose circumstances are so preposterous that any outcome would certainly tarnish forever my reputation as a man of good sense.

Nevertheless, I have a hypothesis.

I thought I was looking for a repellent, because that is what people want, and I ended up looking for an attractant, because it is easy to verify that you have found one.

It comes down to a cost/risk analysis. The calculation is economically rational (how to trust research whose claims cannot be proven is a real problem). We wrongly imagine that finding a true positive is harder than detecting a true negative.

We imagine it is easier to describe what is than what is not.

How do I prove, when I have no bedbugs, that I am repelling them with a substance?

I cannot; I have to be trusted. Money is trust, after all (fiducia).

And it is easier for any individual to believe in what they can observe for themselves....

Nobody will buy a repellent on the market, because people will end up concluding (and not always wrongly) that they are being sold snake oil.

So repellent substances, although they are the ones needed, cannot be bought, for reasons of trust and therefore of logic. In plain terms: if you buy that, you may well be a sucker.

The eradication of bedbugs will come through a revolution I hate (since it led to the ban on absinthe): a revolution of hygiene.

Not the stupid kind, "don't do this, don't do that", but one that involves people in the creation of this new hygiene through their own intelligence, to ease its adoption.

If people can experiment and build their own body of knowledge, one they can trust, they will more readily adopt measures against this scourge.

I admittedly feel part of a cause as important as the one that spawned one of the first peasant revolts of the Vexin, to forbid salmon from being served every day of the week (yes, the Oise was full of salmon in the Middle Ages). But I am sticking to it.

I am like St Louis: not exactly the local big shot, but the gadjo coming back dazed. I feel like Villon, ready to say "Paris, near Pontoise". One of a crowd of obscure unknowns whose zaniness and valour rubbed shoulders with tragedy and cowardice.

Like those who inspired me, I do find it far-fetched and zany to say that citizens must take research in this applied field into their own hands, with their own intellectual and physical means.

It is not profitable research, and in any case you would be right to doubt, empirically, any repellent substance.

You must take part in the scientific, experimental resolution of this problem in order to get rid of it: by bringing ideas, and by reproducing experiments to validate or invalidate them. You should also propose solutions, themselves to be validated or invalidated.

I am talking about bedbugs, but in fact I think this applies to every problem.

So I wrote a Proof of Concept language to address the problem of safe eval

I told fellow coders: "Hey! I know a solution to the safe-eval problem: it is right under my eyes. I think I can code it in less than 24 hours, from scratch. It will support safe templating... because that is its primary purpose."


I was told my solution was over-engineering, because writing a language takes so much effort. Actually, it took me less time to write a language, without any theoretical knowledge, than the time I have lost in my various jobs, every single time, dealing with unsafe eval.

Here is the result, in Python: a Forth-based templating language that actually covers 90% of the real use cases I have experienced. A fair balance between time-to-code and the features people really use.

You don't actually need that much features.

https://github.com/jul/confined (+pypi package)

NB: work in progress.


How I was tortured as a student

When I was a student, I was kindly helped through the hell of my chaotic studies by people at a university called the ENS.

In exchange for their help, I had to code for data measurement and labs, with various languages, OSes and environments.

I was tortured: I liked programming, yet I did not have the right to do OOP or malloc, or to use new languages... Perl, Python, new versions of the C standard...

Even for handling numbers, the scientists despised Perl/Python for their inability to handle maths safely. I had to use the "Numerical Recipes" and/or Fortran. (I checked: in 2005 they tried Python and were disappointed; I guess that since then they may use numpy, which is basically a binding to safe Fortran ports in the spirit of the Numerical Recipes.) I was working on chaotic systems, which are really sensitive to initial conditions... a small error in the input propagates fast.

The people were saying: we need this code to work, we need to be able to reuse it, and we need our output to be reproducible and verifiable. KISS: Keep It Simple, Stupid. And even more stupid than that.

So I was barred from any unbounded resource usage and from any unsafe behaviour with base types.

Out of curiosity, I recently recompiled code I wrote back then — C piping its output to Tcl/Tk to draw graphical representations of multi-agent simulations — and it still works... It was written in 1996.

That is how I learnt programming: by doing the worst, most unfunky programming ever. I thought they were just stupid grumpy old men.

And I also had to use scientific equipment and software. Oddly enough, they all used Forth-like RPN notations to give users some basic programmability:

  1. ASYST
  2. RRD Tools
  3. pytables NUMEXPR extension http://code.google.com/p/numexpr
And I realized I understood why:

FORTH is easy to implement:
  • it is a simple left-to-right parsing technique: no backtracking, no states;
  • the grammar is easy to write;
  • the memory model makes it easy to confine within boundaries;
  • it is immutable in its serialization (you can dump the exec and data stacks and safely resume/start/transport them);
  • it is thus efficient for parallelization;
  • it can thus be used in embedded contexts (like measurement instruments that need to be autonomous AND programmable).
So I decided to give myself one day to code a safe confined interpreter in Python.

I was told it was complex to write a language, especially when, like me, you never had any lessons in (or interest for) parsing/language theory and you suck at mathematics.
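The properties listed above can be illustrated with a toy sketch: a left-to-right, stack-based evaluator with no backtracking and an explicitly bounded stack. The names here are hypothetical; this is not the actual API of the confined package.

```python
# A toy Forth-style evaluator: reads tokens left to right, no backtracking,
# one explicit data stack that can be dumped and resumed at any point.
# Hypothetical sketch, not the API of the `confined` package.
from decimal import Decimal

MAX_STACK = 64  # confine resources: the stack may not grow unboundedly

OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}

def eval_forth(source, stack=None):
    """Evaluate a whitespace-separated RPN program; return the data stack."""
    stack = list(stack or [])          # resuming from a dumped stack is trivial
    for token in source.split():
        if token in OPS:
            b, a = stack.pop(), stack.pop()
            stack.append(OPS[token](a, b))
        else:
            stack.append(Decimal(token))  # every literal is a Decimal
        if len(stack) > MAX_STACK:
            raise MemoryError("data stack limit exceeded")
    return stack

print(eval_forth("1.10 2.20 add"))  # [Decimal('3.30')]
```

Because the leftover stack is plain data, interrupting an evaluation, shipping the stack elsewhere, and resuming it later is trivial.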

Design choices

Having the minimum dependency requirements: the stdlib.

 One number to rule them all 

I have been bitten so many times in web development by floating point numbers, especially for monetary values, that I wanted a number type that could do fixed-point arithmetic. I have also been bitten so many times by problems where the input was sensitive to initial conditions that I wanted a number type better than IEEE 754 for controlling errors.
So I went for the stdlib number type based on the semi-official IEEE 854 standard: https://docs.python.org/2/library/decimal.html
Other advantages: its string representation is canonical and the regexp for it is well known. Thus it is easy to parse.
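A quick illustration of why Decimal beats binary floats for money:

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 exactly, so cents drift:
print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False

# Decimal does exact decimal arithmetic on the literal digits:
print(Decimal("0.1") + Decimal("0.2"))                    # 0.3
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
```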

In the face of ambiguity, refuse to guess

I will try to see input as (char *) and make the decoding explicit.
Rationale: if you work with SIP (I do), headers are latin1, and if you work in an international environment you may have to face incorrectly encoded data that could also be UTF-8; people here in Québec love to use accents éverywhere. So I want to be able to use it myself.
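Treating input as raw bytes and decoding explicitly, in the "refuse to guess" spirit, looks like this minimal sketch:

```python
# The same character has different byte forms depending on the encoding;
# guessing silently is how mojibake happens, so decoding must be explicit.
raw_latin1 = b"Qu\xe9bec"    # 'Québec' encoded in latin-1 (as in SIP headers)
raw_utf8 = b"Qu\xc3\xa9bec"  # 'Québec' encoded in UTF-8

def decode_explicit(data: bytes, encoding: str) -> str:
    """No default, no fallback: the caller must say what the bytes are."""
    return data.decode(encoding)  # raises UnicodeDecodeError if it is a lie

print(decode_explicit(raw_latin1, "latin-1"))  # Québec
print(decode_explicit(raw_utf8, "utf-8"))      # Québec
# decode_explicit(raw_latin1, "utf-8") raises UnicodeDecodeError:
# 0xe9 is not a valid UTF-8 sequence there, and that failure is the point.
```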

It is also the reason I used my check_arg library to enforce type checking on my operators and to document stuff the KISS way: function names should be explicit and their args should tell you everything.

Having a modular grammar, so that operators/base types can be added/removed easily.

I explained in a previous post how we cannot do a safe eval in Python because keywords cannot be controlled. So I decided to have a dynamic grammar built at tokenization time (the code has the machinery for it; it is not yet available through the API).

Avoid nested data structures and recursive calls

I wanted to make a language my mentors could use safely. I may implement a recursive eval in the future, but I will enforce a very limited recursion depth. And I see a solution for replacing nested calls: using the stack.
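Replacing recursive calls with an explicit, size-bounded stack can be sketched like this (an illustration of the idea, not the confined implementation):

```python
# Depth-first walk of a nested structure without recursion: the work list
# is an explicit stack whose size we can cap, unlike the C call stack.
MAX_DEPTH = 1000  # assumed limit; a confined interpreter would make it mandatory

def flatten(nested):
    """Flatten arbitrarily nested lists iteratively, left to right."""
    out, stack = [], [iter([nested])]
    while stack:
        if len(stack) > MAX_DEPTH:
            raise RecursionError("nesting limit exceeded")
        try:
            item = next(stack[-1])
        except StopIteration:
            stack.pop()
            continue
        if isinstance(item, list):
            stack.append(iter(item))  # descend without a recursive call
        else:
            out.append(item)
    return out

print(flatten([1, [2, [3, 4]], 5]))  # [1, 2, 3, 4, 5]
```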

Stateless and immutable only

I have seen people pickle functions so many times that I decided to have something more usable for remote execution. I also wanted my code to be idempotent. If parsing is seen as a function, I wanted to guarantee that

parsing(Input, Environment) => output 

would always produce the same output.
We can also serialize the exec stack and the data stack at any given moment and resume them later. I want no side effects. As a result there will be no time-related functions.

As a result you can safely execute remote code.
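The guarantee can be sketched as follows: evaluation is a pure function of its input and environment, and the resulting stack is plain data that serializes with json. The names here are hypothetical, not the confined API.

```python
import json

# A pure evaluation step: (input, environment) -> data stack.
# No clock, no globals, no I/O: the same call always yields the same result.
def parse_rpn(source, env):
    stack = []
    for token in source.split():
        if token in env:  # operators come only from the environment
            b, a = stack.pop(), stack.pop()
            stack.append(env[token](a, b))
        else:
            stack.append(int(token))
    return stack

env = {"add": lambda a, b: a + b}
first = parse_rpn("1 2 add 4", env)
second = parse_rpn("1 2 add 4", env)
assert first == second == [3, 4]  # idempotent: same input, same output

# The data stack is just data: it survives a serialize/resume round-trip,
# so it can be shipped to another machine and evaluation continued there.
wire = json.dumps(first)
assert json.loads(wire) == [3, 4]
```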

Resource use should be controlled

Stack size, size of the input, recursion level, the initial state of the interpreter (default encoding, precision, number behaviours): I want to control everything (that is what the context will be for, and all its parameters WILL be mandatory). That way I can guarantee as much as I can (I was even thinking of writing C extensions to make sure we DON'T use atof/atoi but strtol/strtof...).

This way I can avoid using an awful lot of virtual machines/docker/jails/whatever.
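A sketch of what a fully mandatory interpreter context could look like (hypothetical field names; the actual confined context may differ):

```python
from decimal import Context, Decimal

# Every knob is mandatory (keyword-only, no defaults): nothing is inherited
# silently from the host, so two runs on two machines start from the same state.
class InterpreterContext:
    def __init__(self, *, encoding, precision, max_stack, max_input):
        self.encoding = encoding
        self.decimal_ctx = Context(prec=precision)  # number behaviour
        self.max_stack = max_stack                  # bound on the data stack
        self.max_input = max_input                  # bound on the source size

    def check_input(self, source: bytes) -> str:
        if len(source) > self.max_input:
            raise ValueError("input larger than the configured limit")
        return source.decode(self.encoding)         # explicit decoding

ctx = InterpreterContext(encoding="latin-1", precision=8,
                         max_stack=128, max_input=4096)
print(ctx.check_input(b"1.5 2 add"))                   # 1.5 2 add
print(ctx.decimal_ctx.divide(Decimal(1), Decimal(3)))  # 0.33333333
```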

Grammar should be easy to read

Since I don't know how to parse, but I love Damian Conway, I looked at Regexp::Grammars and said: Oh! I want something like this.

There are numerous resources on Stack Overflow on how to parse the various base types exactly (floats, strings), how to alternate patterns... So it took me 3 hours to come up with a way to do it. I still know nothing of parsing theory, but I knew I would get a result.

I chose a grammar that can be written in a way that avoids backtracking (parsing left to right helped a lot), so that the regexps stay under control.

I am not sure of everything it does, but I am pretty sure it can be ported to C or anything else that guarantees NO nested/recursive use of resources (the regexps are not supposed to stay in a hardened version; this is just a good-enough parser written in 3 hours with my insufficient knowledge).
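The readable-grammar idea, inspired by Regexp::Grammars, can be sketched with Python's re.VERBOSE and named groups (an assumed simplification, not the actual grammar of the package):

```python
import re

# Each alternative is a named, commented rule; scanning with finditer is
# strictly left to right, token by token, so no backtracking across tokens.
GRAMMAR = re.compile(r"""
    (?P<NUMBER>  -?\d+(?:\.\d+)? )   # decimal literal, canonical form
  | (?P<STRING>  "[^"]*"         )   # double-quoted, no escapes (kept simple)
  | (?P<WORD>    [A-Za-z_]\w*    )   # operator / word name
  | (?P<SKIP>    \s+             )   # whitespace, discarded
""", re.VERBOSE)

def tokenize(source):
    for match in GRAMMAR.finditer(source):
        kind = match.lastgroup
        if kind != "SKIP":
            yield kind, match.group()

print(list(tokenize('1.5 2 add "hi"')))
# [('NUMBER', '1.5'), ('NUMBER', '2'), ('WORD', 'add'), ('STRING', '"hi"')]
```

One caveat of the finditer approach: characters that match no rule are silently skipped, so a hardened version needs an explicit error rule.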

I still think Perl is right

We should run our unit tests before installing. So my module refuses to install if the single actual test I put in (as a POC) does not pass.
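With setuptools, the test-before-install idea can be done by overriding the install command; this is a setup.py sketch whose details (package name, test layout) are placeholders:

```python
# setup.py sketch: refuse to install if the test suite fails.
import sys
import unittest
from setuptools import setup
from setuptools.command.install import install

def run_suite():
    """Discover and run the test suite; return True on success."""
    suite = unittest.defaultTestLoader.discover("tests")
    result = unittest.TextTestRunner().run(suite)
    return result.wasSuccessful()

class TestedInstall(install):
    """An install command that runs the tests first and aborts on failure."""
    def run(self):
        if not run_suite():
            sys.exit("tests failed: refusing to install")
        install.run(self)

setup(
    name="confined",  # the package from this post
    version="0.0.1",
    cmdclass={"install": TestedInstall},
)
```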


So it was really worth the time spent. And now I may have joined the big leagues (the «cour des grands») of coders who implemented their own language, from scratch and without any prior theoretical knowledge of how to write one. I have been geeking alone in front of my computer and my wife is pissed at me for not enjoying the day, but I made something good enough for my own use case.

Also, handling requirements with Python and running tests before install is hellish.

(Argh... and why does my doc not show up on PyPI?)