Random thoughts on time, symmetry and distributed systems.

On geometry, commutativity and relativity

TL;DR: it all boils down to the definition of time and the rupture of symmetries.


A distributed system is a system in which code can be executed on more than one instance independently and will give the same results wherever it is executed.

On the ideal distributed system as a vectorial system.


For a distributed system to work, you need a minimal property of the functions that are passed: the operations need to be commutative (and distributive).

Let A be a set of data, f and g functions that apply to the data, and A[i] the subset of data on instance i.

f(g(A)) == «sum» of (f ∘ g)(A[i]) over all instances/partitions.

Distributed systems, to avoid SPOFs, reroute operations to any instance that is available. Thus the results should be the same wherever they are produced.

We can either work iteratively on a vector of data, or in parallel on each element, as long as there is no coupling between elements (which can be expressed as: for k, l with k != l and k, l < i, A[k] · A[l] == 0 — each pair of elements is orthogonal, without relationships — thus the set of elements is a basis of size i).

The map/reduce philosophy states that data in n different locations can be treated independently and then reduced.
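A minimal sketch of that idea in plain Python (hypothetical partitions, with sum as the commutative reduction):

```python
from functools import reduce

# Hypothetical partitions of the data living on 3 different instances
partitions = [[1, 2, 3], [4, 5], [6]]

def mapper(x):
    return x * x  # a pure, stateless function applied independently everywhere

# Map each partition independently, then reduce; because addition is
# commutative and associative, the partition layout does not matter.
partial_sums = [sum(map(mapper, p)) for p in partitions]
total = reduce(lambda a, b: a + b, partial_sums)

assert total == sum(map(mapper, [1, 2, 3, 4, 5, 6]))
```

The same result comes out whether the data lives on one instance or three, which is the whole point.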


There are 2 kinds of functions (given you work in the basis):
* Transformations (V) => V: these functions apply a geometric transformation to the space (rotation, translation, homothety, permutation), also called Observables.
* Projectors (V) => Vi: these reduce the number of dimensions of the problem.

Data is a ket |ai> of states.
Transformations are Operators applying on the kets, such that O|ai> = |bi>.
If there exists an operator O^-1 such that O × O^-1 = identity, then O is reversible: it is a Transformation, or mapping.

O is called the function
|ai> is the input data
|bi> is the output



If dim(|bi>) < dim(|ai>) we have a projector.
If dim(|bi>) > dim(|ai>) we have a local increase of information in a closed system.


Given well-known functions that are linear, for a composed function to be a transformation of the significant space of the data we need the property that O × P = P × O, i.e. [P, O] = 0 (the commutator of the two operators); then you can do out-of-order execution.
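A hypothetical pair of linear operators in numpy, whose commutator is non-zero, so out-of-order execution would change the result:

```python
import numpy as np

rot = np.array([[0., -1.], [1., 0.]])    # 90-degree rotation
proj = np.array([[1., 0.], [0., 0.]])    # projection on the x axis

# [P, O] = PO - OP; non-zero here, so the order of application matters
commutator = proj @ rot - rot @ proj
assert not np.allclose(commutator, 0)
```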



But sometimes Projectors and Transformations do commute:

from random import randint
from numpy import array

MAX_INT = 30
DATA_PER_SERIE = 10
MAX_SERIE = 100

data = array([[randint(0, MAX_INT) for i in range(DATA_PER_SERIE)]
              for s in range(MAX_SERIE)])

print(sum(data) / len(data))  # sum first, divide once
print(sum(data / len(data)))  # divide each row first, then sum: same result


In an actual CPU, DIV and ADD are NOT commutative.

time(ADD) != time(DIV) — at the very least because the size of the circuits is not the same, and because min(time) = distance/c, where c is the speed of propagation of the information carrier. Here the information carrier is the pressure of the electron gas in the substrate (electrons have a mass and travel far slower than light, but pressure is a causal force, so c is the speed of light). What is true in a CPU is also true for a distributed system.

Computers introduce losses of symmetry; that is the root of all synchronization problems.



It happens when we have fewer degrees of freedom in the studied system than in the space of the input.

When we do this, it means that we are storing too much data.

To store just enough data you need a minimal set of operators O, P ... Z, each operator commuting with the others. It is called a basis.

Given a set of data expressed in the basis, the minimal operations that commute are also called the symmetries of the system.

Applied to a computer problem, this might puzzle a computer scientist.

I am globally saying that the useful information that lets you make sense of your data is not in the data, nor in the functions, but in the knowledge of which pairs of functions commute when applied to the data.

Knowing whether two dimensions i, j in a set of data projected onto the basis are independent is equivalent to saying that i and j are generated by two commuting operators.
I am saying that even if I don't know the basis of the problem and/or the coupling, if I find two operators such that for any input [O, P] = 0 // OP|a> = PO|a>, THEN I have discovered an element of the absolutely pertinent data.

Given actual data |ai> and |aj> where max(i) = n,
then <ai|aj> = 0 if and only if there exists a Projector that projects |ai> and |aj> onto two different transformations.


The iron rule is: the number of degrees of freedom lost by applying a projector must never result in having fewer dimensions than the basis.

First question: how do I get the first function?
Second: how do I know the size of the basis of functions that, combined together, describe the system in its exact independent degrees of freedom (the minimum set of valuable data)?
And last: how do I get all the generators once I know one?

Well, that is where human beings are supposed to do their jobs; that is where our added value is. In fact, you don't search for the first operator of the basis, you search for sets of operators that commute.

Determining the minimum set of information needed to describe a problem exactly, with independent pieces of information, is called compression.

So what is the problem with big data? And time?

Quantum-mechanical/vectorial/parallel computation is nice, but it has no clock.

In fact I lie.

If, given n operations [O0 ... On] applied to a set of data, there is one operation P such that [On, P] != 0, then we can't choose the order of the operations.

The rupture of symmetry in a chain of observables applied to data introduces a thing called time.

As soon as this appears, we must introduce a scheduler to make sure the chain of commuting observables is fully applied before chaining the next set of operations. This operation is called reduce.

That is the point where a system MUST absolutely have a transactional part in its operations.

Now let's talk about real world.

Relativity tells us that time varies from system to system. On the other hand our data should be immutable, but data that don't change are read-only.

And we want data to change, like our bank account figures.

And we also don't want to have to go physically to our bank to withdraw money. And the bank doesn't want us to spend more money than we have.

This property is called transactionality. It is a system accepting no symmetry, thus no factorisation.

It requires that a chain of operations MUST not be commutative.

At every step a non-linear function must be performed:
if bank_account < 0: stop the chain.
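A minimal sketch of that non-linear check (a hypothetical Account class; the lock is what serializes the chain):

```python
import threading

class Account:
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()  # the central point acting as absolute referential

    def withdraw(self, amount):
        with self.lock:               # forces a before and an after
            if self.balance - amount < 0:
                return False          # the non-linear branch: stop the chain
            self.balance -= amount
            return True

acct = Account(100)
assert acct.withdraw(60) is True
assert acct.withdraw(60) is False     # order now matters: no overdraft allowed
```

The two withdrawals no longer commute: swap their outcomes and the final state differs, which is why a lock (a clock) has to exist somewhere.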


This breaks the symmetry, and it requires a central point that acts as an absolute frame of reference (with its clock, for timestamping).

Banks are smart: they don't use fully transactional systems, nor distributed systems; they just use logs and some heuristics. There must be a synchronicity timing attack possible on such a system.

On the other hand, since operations are not chronologically commutative on a computer, and all the more on a set of computers, the main challenge of a distributed system is «time stamping» the events.

We have known since Einstein that clocks cannot agree across a distributed system without a mechanism.
Everybody thinks NTP is sufficient.

But NTP has small, discrete drifts. These drifts, which are almost unpredictable (sensitivity to initial conditions), introduce a margin of uncertainty on time.

Thus, for every system, the maximum reliable granularity should be computed, so that we can ensure the information physically has the possibility of being known before/after every change.

The bigger the system, the higher the uncertainty (relativity++).
Given the reduced operations that can commute on a set of data, the clock should also be computed from the length of the longest operation.


  

An opinionated versioning system based on mapping version strings to numbers in a weird base

While we have a convention in python for numbering: 
http://legacy.python.org/dev/peps/pep-0440/

We can mostly say that, thanks to "Windows 9", version numbering has shed an interesting spotlight on version comparison.

There are two camps of version handling:
- the naïve, who consider versions as strings;
- the picky, who consider versions as a very dark grammar that requires an ABNF-compliant parser.

Well, of course, I don't agree with anyone :) Versions are just growing monotonic numbers written in a weird base, but they must support at least the comparison operators: equal, superior/inferior, is_in.

Naïve people are wrong, of course

 

It gives the famous reasoning for why Windows might jump to Windows 10.
But is it better than:
https://github.com/goozbach/ansible-playbook-bestpractice/blob/915fce52aa82034cfd61cfbfefad9cf40b1e4f48/global_vars.yml

In this ansible playbook they might have a bug when CentOS 50 comes out.

So this does not seem to hit only the «clueless» proprietary coders :)


Picky people are much too right for my brain


Yes, python devs are right we need a grammar, but we don't all do python.

Given Perl, FreeBSD, Windows ... our software needs versions not only for interaction with modules/libraries within its natural ecosystem (for instance pip), but they should also fit nicely in upper containers' version conventions (OS, containers, other languages' conventions when you bind to foreign-language libraries ...). Version numbering needs a standard. And semantic versioning proposes a grammar but no parser. So here I am to help the world.

The problem is we cannot remember one grammar per language/OS/ecosystem, especially if they are conflicting.

PEP 440, with the post/pre weird special cases, does not look very inspired by the Tao of Python (in the probably wrongful opinion of someone who did not take the time to read the whole distutils mailing list, because he was too busy fighting a lot of software bugs at his job, and doing nothing at home).

So, as when there are already a lot of standards you don't understand or can't choose from ... I made mine \o/

Back to basics: versions are monotonically growing numbers that don't support + - / *, just comparisons

 

A version is a monotonically growing number.

Basically, if I publish a new version it should always be seen as superior to the previous one. Which is basically a number property.

In fact a version can almost be seen as a 3-digit (or n-digit) number in a special numbering such as:

version_number = sum(map(project_number_in_finite_base, "X.Y.Z".split(".")))

The problem is that if we reason in fixed-base logic, we have an Intel-memory-addressing-style problem: since each of the X, Y, Z numbers can cover an infinite range of values, there can be a loss of monotonic growth (there can be confusion in ordering).

So we can abstract version numbers as figures in an infinite base that are directly comparable.

I am happily using a subset of PEP 440 for my numbering, which is the following: http://vectordict.readthedocs.org/en/latest/roadmap.html

By defining
X = API > Y = improvement > Z = bugfix

I state for a user that: given a choice of my software, I guarantee your version numbers grow monotonically on the X / Y / Z axes, in such a fashion that you can focus on API compatibility, implementation (if the API stays the same but code changes without bugs, it is a change of implementation), and correctness.

As some devs do, I also informally use "2a", as in 1.1.2a, to flag a short-term bugfix that does not satisfy me (I thus strongly encourage people to switch from 1.1.2x to 1.1.3 as soon as it comes out). I normally keep the «letter thing» in the last number.

If people are fine with API 1 implementation 1, they should easily be able to pin their versions to grow to the next release without pain.

So how do we compare numbers in an infinite-dimensional basis in Python?

Well, we have tuples \o/

Thanks to the comparison arithmetic of tuples, they can be treated as numbers when it comes to "==" and ">", and those are the only 2 basic operations we should need on versions (all other operations can be derived from them).
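A minimal sketch of the tuple trick (a hypothetical parser, not the full one, which also handles the «letter thing»; padding makes «2» compare like «2.0.0»):

```python
def as_number(version, width=3):
    """Parse 'X.Y.Z' into a comparable tuple, padding with zeros."""
    figures = tuple(int(part) for part in version.split("."))
    return figures + (0,) * (width - len(figures))

assert as_number("2") == as_number("2.0.0")
assert as_number("1.10.0") > as_number("1.9.4")   # numeric order, as wanted
assert "1.10.0" < "1.9.4"                          # the naive string order fails here
```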

A version is a monotonically growing number, but in a non-fixed base.

Next_version != last_version + 1

If a version is a number V, comparison of V1 and V2 makes sense; addition or subtraction cannot.

One caveat of version numbering, though, is our confusing jargon:
if we decided versions were X.Y.Z, why do we expect version 2 to be equivalent to 2.0.0 instead of 0.0.2? Because when we say Python version 2, we expect people to hear Python version 2.x, and preferably the latest. Same for Linux 2 (covering 2.x.y ...): it is like writing «2» and expecting people to read «20», correcting according to the context.

So the exercise of version comparison is having a convention for comparing numbers along the API, implementation and bugfix dimensions, hierarchically, in spite of the indetermination introduced by our inconsistent human notation.


Just for fun, I made a parser from my own version strings to a numbering convention, including the latter twist where 2 means 2.0 or 2.0.0 when compared to 1.0 or 1.0.0. It addresses the examples to solve given in PEP 440.


It can be seen here.


Wrapping up


For me, a version is an abstract representation of a number in an infinite base, whose figures are hierarchically separated by points that you read from left to right.
I am saying the figures are drawn from a two-dimensional space of digits and letters, where digits matter more than letters. (Yes, I am putting a number inside a figure; it is sooo fractal.)

But most important of all, I think a versioning string is a representation of a monotonically growing number.

I am pretty sure PEP 440 is way better than my definition: it has been crafted by a consensus of people I deeply respect.

My problem is that I need to achieve the same goal as them, with less energy than they have for modeling what a version number is.

That is the reason why I crafted my own deterministic version numbering that I believe to be a subset of PEP 440.

Conclusion

 

My semantics might be wrong, but at least I have a KISS versioning system that works as announced, is easily portable, and for which I have a simple grammar that does quite a few tricks and allows an intuitive comprehension.

And human beings are wrong too (why is version 2 read as 2.0.0 when compared to 2.1.1, and as 2 when compared to 2.1 or 3?), but who cares? I can simply cope with it.

NB: it works with the "YYYY.MM.AA.number" (SOA) scheme too.

PS: thinking of adding the y-rcx stuff by slightly enhancing the definition of a figure.

PPS: I don't normally like talking to people, so I disabled comments, but for this one I am making an effort: http://www.reddit.com/r/programming/comments/2iejnz/an_opinionated_versioning_scheme_based_on_mapping/
because I am quite curious about your opinions.

Perfect unusable code: or how to model code and distributivity

So let's speak of what deterministic and non-deterministic code really are.

I am gonna prove that you can achieve nearly chaotic series of states with deterministic code \o/

Definitions:

Deterministic: code is deterministic if the same input always yields the same output.

Chaotic: a time series of values is considered chaotic if knowing the last n samples does not enable you to predict the t+1 term.

Turing machine: a computer that is worth no more than a cassette player.

Complex system: a set of simple deterministic objects, connected together, that can result in non-deterministic behavior.

lambda function: a stateless function, without internal state.

FSM (finite state machine): a thing made necessary in electronics because time is relativistic (Einstein).

Mapping: a mathematical operation/computer construct that describes a projection from a discrete input dimension A to a discrete output dimension B.


Now let's play real-life Turing machine.

Imagine I give you an old K7 player with a 30-minute tape, and at every minute mark I tell you the result of n × 3.
If you go to minute 3, the tape will say 9.
If you go to minute 5, you will hear 15.

This is the most stupid computer you can have.
My tape is a program. The index (in minutes) is the input, and the output is what is said.

So let's do it in Python. Basically we made a mapping from the index on the tape (in minutes) to the integer index × 3.
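A minimal sketch of that tape (everything is precomputed; running the "program" is just a lookup):

```python
# The whole "program" is precomputed: one answer per minute of tape
TAPE = [minute * 3 for minute in range(31)]  # a 30-minute band, minutes 0..30

assert TAPE[3] == 9    # go to minute 3, hear 9
assert TAPE[5] == 15   # go to minute 5, hear 15
```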



So what do we learn from this?

That I can turn code into a Turing machine that I can use as code with a 1:1 relationship; I have a ... mapping \o/

What does compiling do here?
It evaluates, for every possible input (an integer belonging to [0:255]), the output of a boolean function. It is a projection of 2^8 inputs => 2 outputs.
I projected a discrete space of inputs onto a discrete space of outputs.

Let's see why it is great

My code is fully deterministic and thread-safe, because it is stateless.

It is an index of all 256 solutions of f(x), one for every possible value.

If I have encoded a function that tells whether a number can be divided by X, and another for Y, then to get the function that tells whether a number can be divided by (X × Y) I just apply & (the bitwise AND operator) to the ints representing the two codes.

An int is very cool storage for a function.
With div2 / div3 I can, by applying all the «common bitwise operators», create a lot of interesting functions:

div2xor3: numbers that can be divided by 2 or 3, but not 6
not div2: every odd number
div2or3: multiples of 2, 3 and 6
div2and3: multiples of 6 only
...

I can combine the 16 bitwise boolean operations to directly obtain new functions.

In functional programming you build partial functions that you apply in an execution pipe; here you can directly combine the code at the «implementation level».
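The div2/div3 combination above can be sketched like this (hypothetical helper names; assuming the 8-bit input domain, one bit of the integer per possible input):

```python
N = 256  # all possible 8-bit inputs

def compile_predicate(pred):
    """Pack pred(x) for every x in [0, N) into one integer, one bit per input."""
    code = 0
    for x in range(N):
        if pred(x):
            code |= 1 << x
    return code

def run(code, x):
    """Evaluate the 'compiled' function: a single bit lookup."""
    return bool((code >> x) & 1)

div2 = compile_predicate(lambda x: x % 2 == 0)
div3 = compile_predicate(lambda x: x % 3 == 0)

div6     = div2 & div3   # divisible by 6
div2or3  = div2 | div3   # divisible by 2 or by 3
div2xor3 = div2 ^ div3   # by 2 or by 3, but not by 6

assert run(div6, 12) and not run(div6, 4)
assert run(div2xor3, 9) and not run(div2xor3, 6)
```

The combined functions never ran the original predicates: they were built purely by bitwise operations on the codes.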


My evaluation always takes the same number of cycles: I don't have to worry about the worst case, and my code will never suffer from indeterminism (neither in execution time nor in results). My code is ultimately thread-safe as long as my code storage and my inputs are immutable.


My functions are commutative, thus I can distribute them.

div2(div3(val)) == div3(div2(val)) (== div6(val))

=> combining functions is a simple AND of the codes

Why we don't use that in real life

First there is a big problem of size.

To store all the results for all the possible inputs, I have to allocate the cross product: size of input × size of output.

A simple multiplication-by-3 table for all the 32-bit integers would be 2^32 words of 32 bits: a 4-billion-entry array, 16 GB!

Not very efficient.

But if we work on a torus of discrete values, it can work :)

Imagine my FPU is slow and I need cos(x) with an error margin that tolerates a resolution of 1/256th of a turn. I can store my results as an array of precomputed cosine values, indexed by n % 256 :)
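A minimal sketch of such a table (assuming angles counted in 1/256ths of a full turn):

```python
import math

STEPS = 256  # angular resolution: 1/256th of a full turn

# Precomputed once; afterwards cos() costs one array lookup, no FPU call
COS_TABLE = [math.cos(2 * math.pi * i / STEPS) for i in range(STEPS)]

def fast_cos(step):
    return COS_TABLE[step % STEPS]   # the torus: indices wrap around

assert fast_cos(0) == 1.0
assert abs(fast_cos(64)) < 1e-9     # 64/256 of a turn = 90 degrees
```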

A cache with memoization also uses the same principle:
you replace code that takes long to compute by a lookup in a table.

It might be a little more evolved than reading a bit in an integer, but it is globally the same principle.

So actually, that is one of the biggest uses of this kind of Turing machine: efficient caching of precomputed values.

Another drawback is that the mapping makes you lose the information of what the developer meant.

If you just have the integer representing your code, more than one function can yield the same code. The mapping from the space of possible functions to the space of solutions is a surjection.

Thus if you have a bug in this code, you cannot revert back to the algorithm and fix it.

If I consider that I have not a number of n bits as input, but n inputs of 1 bit constituting my state vector, and that the output is my internal state, then I am modeling a node of a parallel computer. This «code» can be wired (a few clock cycles' cost) as a muxer that is deterministic in its execution time and blazingly fast.


What is the use of this anyway?

Well, it models deterministic code.

I can generate random code and see how they interact.

Conway's Game of Life is a setup of Turing machines interconnected in a massively parallel fashion.

So my next step is to prove I can generate pure random numbers with totally deterministic code.

And I can tell you I can prove that the condition for my modified Game of Life to yield chaotic-looking results is that the level of similarity between the codes of the automata is low (the entropy of patterns is high) AND 50% of the bits in each code are 0/1 (maximizing the entropy of the code in terms of bit ratio).

 






Date conventions: an extraordinary conservatism that is screwing up our lives and code

The other day, distracted as I am, I had to reprint, re-sign and resend all my forms.

How?

I had to write the date in numbers.

Well, September sounds like seven/septem (in Latin), thus it should be 07... wrong, it is 09. Every time, I get caught.

What is this mess?

The days of the week are supposed to be the 7 observable planets (at the time the convention was set) of the solar system (used in astrology, a sign of scientific seriousness).

Some months are 31 days long because one emperor wanted to be remembered as being as great (August) as another (Julius).
The year does not begin on the shortest day, or it would be a pagan fest discrediting the Roman Catholic empire. But it must have begun in March once in the past (hence sept = 7 ... december = 10).
It is made of 12 solar months because of the superstition that makes 13 a dangerous number...

And in computer coding it is a fucking mess.

Not to mention the unreliable local timezone information, so versatile because of political conventions that it makes it impossible to timestamp information in a really reliable way.

Why all of this?

We spend a non-negligible amount of time and energy dealing with this hell while trying to promote progress — something that could easily and substantially improve our lives.

We can change protocols and languages in a snap of the fingers to accelerate software execution, but one of the biggest burdens for reliable data storage is still there under our nose. And we are blind; it is the date mess.

Let's face it: a change would be welcome, and cool, with mostly benefits.

Dropping superstition and politics from our calendar would have many advantages. Here is my idea:

First, I like a solar calendar (a calendar that begins or ends when the distance to the sun is either at its greatest or its smallest), the shortest day being the winter solstice, the 21st of December. It used to be a "pagan" fest. But who cares? It is fun to have a feast when the worst of winter (or the best of summer) is at its peak.

Then 365 = 364 + 1 = 13 x 28 + 1

Well, 13 months of 28 days :) A 28-day month is coincidentally a lunar month, which in itself would make it a hybrid Catholic/Islamic-compliant calendar.

When to add an extra day and where?

Well, obviously, when at a certain point the officially shortest day is no longer the same as the sun's. Which day to add? Let's say the world's special day of every year is New Year's Eve; it could be nice to also have a day off on the longest day of the year, to enjoy it on bissextile (leap) years. Or to share the pleasure of enjoying New Year traditions with people from the other hemisphere, in the same season :)

This calendar will help the kids make sense of the world.

Knowing that Janus was an old, forgotten two-faced god representing present existence (one head looking to the past/memories, the other to the future/projects), worshiped by a gone civilization, won't help kids much more than learning about the seasons, and when and why they happen the way they do.

So I have a simple, internationally translatable scheme for naming.

Kids will learn day and math at the same time:

The first day of the week would be 0D, or zero day, then 1D (Day/Tag/Dias/whatever), then 2D ... By convention, to honor those who gave us the actual calendar, the representation of "day" could be D, like the first letter of day in Latin.

The next day would be incremented by 1, and so on. Every culture could use whatever sounds nice for the days' names and try to relate them to numbers. It does not have to be said "one"; it could be mono, uno, ein, the day of whatever comes by one.

This way, kids and people learning a new language learn basic numbering and week days at the same time...

I propose the days begin at 0, like hours: this way we have good algebraic rules for calculating, and it makes day/time math consistent.

A work week is thus the interval [0:5].
The week day in 4 weeks and 5 days is (d + 5) % 7 (the whole weeks drop out of the modulo).
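A minimal sketch of that weekday math (a hypothetical helper, days numbered 0 to 6 as proposed):

```python
def weekday_in(d, weeks, days):
    """Week day reached, starting from day d, after some weeks and days."""
    return (d + days) % 7   # whole weeks vanish modulo 7

assert weekday_in(0, 4, 5) == 5    # 4 weeks and 5 days after day 0
assert weekday_in(6, 0, 16) == 1   # what day is it in 16 days?
```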

And NOW I could write my date in numerical format the same way I say the day in everyday speech. I would no longer have any traps when I want to know what the day of the week will be in 16 days. I wouldn't need conversions ... I would be happy to forget a load of useless, tricky information.

Anyway, with my idea, we introduce math and the days of the week at the same time at school.

Okay, there is the case of the intercalary day, which is special. For that one there will be a rule, but now at least time calculus becomes as consistent as the metric system. And we can eradicate this Aristotelian stupidity that, for a science to be true, everything should be periodic and harmonic.

We make kids learn calculus in different bases (24/60/7/13/365), with a tinge of introduction to the reality of the world (telling them that days don't make exactly 24 hours because there is an epsilon of chaos in the real world that doesn't match our first models, and that periodicities don't always fall on whole numbers). So it is cool. It is a lot more information packed into a single convention.

We teach kids how to deal with inconvenient maths and how humans face it. And learning compromise is cool.

Kids can be taught to watch the moon, the sun, the seasons, and everything makes sense.

You can introduce trigonometry by calculating the hour of the day from the course of the sun. You can tell the story of how, with a camel and a stick, you can compute the Earth's radius...

You can relate the abstract universe and its vastness to our everyday life. You can encompass a great deal of our civilizations' progress simply by changing the date conventions. Indians, Arabs, Europeans, Mongols, Chinese, Babylonians ... they all gave us something worthy of the new calendar. It is not a calendar based on forgetting the past; on the contrary, it is the topmost conservative calendar. It would be denser in knowledge, culture and values than ever: a compressed set of knowledge, easy to manipulate even for those without the background.

Sticking to the date mess is not a technical issue but a civilizational one: it is just us being cavemen hitting our keyboards, respecting weird divinities of the past out of superstition. Superstition, even disguised in the technical words of cargo-cult science, is still superstition. Blindly spending billions on software bugs due to stupid conventions seems less efficient than fixing the real world. It would improve our lives even outside the computer world.
But I still write software to make things easy for people who want the kind of progress that makes their life easier, without wishing to change what gives us trouble in the first place: not computers or math, just stupid conventions.

Practical sense and laziness are pretty much the 2 main qualities of humanity I would like a system to make us share. Not a strongly superstitious, anachronistic, history-loaded artifact that screws up my everyday life. So that's my proposal for a better world.


Non linearity and clock in distributed system

How the word distributed matters

Let's imagine Einstein existed, and two observers live with different clocks.

According to their differential acceleration, their clocks diverge, as seen by a third observer.

Let's name observer 3 the user, and the two components of a distributed system A & B.

Let's assume A & B have a resource for treating information per second called bandwidth. Let's assume that the more load there is, the more time it takes to treat an instruction. Let's say this means the clock has «slowed down».

Let's assume that tasks are of the form f x g x h ... (data), where f, g and h are composable functions, and that A and B can each treat 2 possibly overlapping sets of functions.

Now let's make a choice: use only functions that support distributivity versus non-linear functions, and see what happens.

If {f, g, h} is a complete set of commuting operators (in French, an ECOC: Ensemble Complet d'Opérateurs qui Commutent), then f x g x h gives the same results as h x g x f applied to the data.

Thus the distributed system does not need a clock, because g(data) has no hidden chronological relationship with f(data). It works perfectly as mere message iteration between functions. The functions can be applied in any order: no chronology needed.
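A minimal sketch (three hypothetical pairwise-commuting operations; every schedule yields the same state, so no clock is needed):

```python
from itertools import permutations

# Three pairwise-commuting operations: multiplications commute
f = lambda x: x * 2
g = lambda x: x * 3
h = lambda x: x * 5

results = set()
for order in permutations((f, g, h)):
    value = 7
    for op in order:
        value = op(value)
    results.add(value)

# All 6 possible schedules give one single result: order is irrelevant
assert results == {210}
```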

Whereas if you use non-linear functions, you introduce time, and your system must have a central clock, which will be the limiting factor of your «distributed system», i.e. the fastest speed at which you can guarantee transactionality.

Ex: banking system.

I cannot let an actor spend money if his credit/account = 0.

So if I send in parallel the actions of putting money in and retrieving it, there is an order. I cannot let a user trigger withdrawal after withdrawal (fast operations) as long as the slow accounting information is not treated.

I have to take a lock: something to ensure an absolute state at a reference moment called now. As a result I cannot distribute my task evenly over a distributed system. A global lock must exist.

Because of pure geometry, acquiring and releasing this lock has a minimal, incompressible delay.

Inside a single CPU: speed of light, centimeters, nanoseconds.
Inside a LAN: network latency, meters, X ms.
Inside a WAN: kilometers, XXX ms.
Worldwide: X seconds.


Since there can be no global state without information being transmitted, a global state requires the introduction of a unique, absolute arbiter.

The more distributed and redundant your system, the more your clock slows down if you have transactions. That is an incompressible limiting factor.

What happens with Commutative operations?

Well, they are blazingly fast. You can implement a robust, blazingly fast distributed system.

Commutative functions can be executed out of order.

They just require to be routed, as many times as needed, through the right circuitry.

Meaning you can avoid a congested node easily, because you can reroute actively at the node level without knowledge of the global system, just from the congestion state of your neighbours.


An efficient distributed system should support only distributive operations, delivered asynchronously. And, reciprocally, if you want an asynchronous system that can scale, you should not accept non-commutative operations.

The commutative part should be distributed; the non-linear part should be treated, more logically, on a strongly synchronous, non-distributed system — a system where the clock is ideally one cycle, thus ideally close to the metal, using all the tricks of atomic instructions.

Oh, another lad is asking why you can't use your fancy NoSQL distributed system to make a nice leaderboard for his game.
Well, a > b is non-linear; sorting is non-linear. So, somewhere, a clock or a lock or a finite state machine or a scheduler has kicked in.
Non-linear operations introduce a before and an after, with loss of information.

If you need the result of a > b, a and b being the results of 2 distributive operations, then you have to wait for both a and b before proceeding in a non-reversible way.

To ensure everything is made in the right order, in a non-distributive way, there has to be time, so that actions happen in the right order. Every non-linear operation in a distributed system introduces a hidden clock.

Are non linear operation bad?

I call them filters. I think they are a hell of a good idea, but I say they are very hard to distribute. So we should live with them and architect our distributed systems accordingly.

(PS: in the map/reduce idea, Map can be seen as dedicated to the commutative operations, and Reduce to the non-commutative ones.)


PS: Let's imagine a distributed system with Forth or RPN.

: square ( n -- n² )  DUP * ;
: sum    ( a b -- a+b )  + ;
: radius ( a b -- a²+b² )  square SWAP square sum ;

Can I write a² + b² as a distributed system?

Data stack: a b
Exec distributed radius

Can I play the operations square, SWAP, square, sum in any order?
No.

f(a, b) = f(b, a), thus it is commutative. So what is the problem?

The use of a stack introduces a scheduling, because there is now an order relationship on the application of the operations, hidden in the data structure/passing. So a queue is also introducing a clock.
Distributed systems should be wired (message passing) and not programmed (actively scheduling tasks).
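The contrast between the stack version and the «wired» version can be sketched in Python (both functions are illustrative stand-ins, not real Forth):

```python
import random

# Stack version: SWAP hard-codes an execution order, i.e. a schedule.
def radius_stack(stack):
    # square SWAP square sum, applied to a stack holding [a, b]
    stack.append(stack.pop() ** 2)                # square
    stack[-1], stack[-2] = stack[-2], stack[-1]   # SWAP
    stack.append(stack.pop() ** 2)                # square
    stack.append(stack.pop() + stack.pop())       # sum
    return stack.pop()

# "Wired" version: each square is an independent message; the combiner sums
# whatever arrives, in whatever order the network delivers it.
def radius_wired(a, b):
    messages = [a ** 2, b ** 2]
    random.shuffle(messages)   # delivery order does not matter
    return sum(messages)

print(radius_stack([3, 4]), radius_wired(3, 4))   # both 25
```

Same function, same result, but only the wired form is order-free: the stack form smuggles a scheduler into the data structure.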

Heaviside function: a systemic mathematical root of social inequity

Abstract 

Just a random theory for fun, nothing really serious.

Assumption

  1. I hate analytics, so it will be a formal reasoning;
  2. We consider that social inequity is the inequity in the differential between tax being paid and tax being received, at the nth order;
  3. We consider that the social agents interact as entities formed in a network and that they tend to be over-represented the more «utility/wealth» they have (I still don't know of any hobo making it to the parliament);
  4. We consider that part of this interaction are with a special entity called «state» that have feedback loops on the agent:
    1. some for taking (VAT, IRS ...);
    2. some for giving back (education, health...)
    3. all these agents are interconnected and may have delay in propagation of the feedbacks;
  5. We consider that there is an agent called «parliament» that can interact with the «state» in such a way that it can change the network and the functioning of the agents;
  6. We consider that the utility function each agent uses to set his choices is based on a rationality composed of:
    1. a sum of dis/imitation of a neighbourhood;
    2. global rationality (the mathematical choice that maximizes my utility);
    3. temperature (a random factor where you put morals and stuff);
    4. temporal rationality (based on a short-term memory);
    5. partial access to the information related to the utility/state vector of the agent;

By the systemic nature of tax redistribution, our societies are bound to tend towards extremely unequal societies.

So basically we have a complex system: a set of simple systems interconnected together. It belongs to a young branch of mathematics called «complex systems».

These systems are quite nasty: it is very hard to analyse them mathematically, even though some statistical physics can help. Simulation can help. But reasoning is better.

So, what is my beef all about?

The BAD Guy

This is the problem: the Heaviside step function, H(x) = 0 for x < 0, 1 for x ≥ 0.


This function is non-linear. If I introduce it in any equation, I cannot use the usual mathematical means to predict tendencies. Averages, trends and estimations cannot work, by nature, with these functions. So it means that almost all predictive models based on «linear algebra», such as matrices, averages, derivatives and estimations, don't work.

And I claim I can solve it.

Just let's acknowledge that laws have effects.
Let's acknowledge that law is often formulated with clauses such as: IF income > x k$ THEN pay x% taxes ELSE pay y% taxes.
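A minimal Python illustration of what that clause does, with made-up threshold and rates (real tax codes are usually marginal, but many legal thresholds really are step functions):

```python
def heaviside_tax(income, threshold=50_000, low=0.10, high=0.30):
    """Step-function ('Heaviside') tax: the whole rate jumps at the threshold."""
    rate = high if income > threshold else low
    return income * rate

def marginal_tax(income, threshold=50_000, low=0.10, high=0.30):
    """Continuous alternative: only the income above the threshold pays the high rate."""
    if income <= threshold:
        return income * low
    return threshold * low + (income - threshold) * high

# One extra earned dollar around the threshold:
# the step version jumps by ~10,000 while the marginal one moves by ~0.30.
print(heaviside_tax(50_000), heaviside_tax(50_001))
print(marginal_tax(50_000), marginal_tax(50_001))
```

The cliff at the threshold is exactly the discontinuity the rest of this section blames: an infinitesimal change of input produces a finite jump of output.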

So we clearly have my bad guy hidden everywhere.

Now, let's have fun and imagine the utility (money) flowing through each of these cellular automata, based on the hypothesis that wealth is evenly distributed at the origin and that interactions are randomly distributed.
On some turns out of n, taxes are paid;
every n turns, income can be randomly given based on the discrete state of the automaton;
on some turns, the automaton's rules are changed by a subset of the people with more utility.
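The turn rules above can be sketched as a deliberately naive toy simulation (every parameter, rate and the flat redistribution rule are invented for illustration; whether inequality amplifies depends entirely on them):

```python
import random

def run(n_agents=100, turns=200, tax_threshold=10.0, tax_rate=0.3, seed=1):
    """Toy version of the automata above: even initial wealth, random pairwise
    transfers, a step-function tax, and flat redistribution of the proceeds.
    Every parameter here is invented for illustration."""
    rng = random.Random(seed)
    wealth = [10.0] * n_agents
    for _ in range(turns):
        a, b = rng.sample(range(n_agents), 2)   # a random interaction
        amount = min(1.0, wealth[a])            # an agent at 0 utility cannot play
        wealth[a] -= amount
        wealth[b] += amount
        pot = 0.0
        for i in range(n_agents):               # the Heaviside clause: IF w > t THEN pay
            if wealth[i] > tax_threshold:
                tax = wealth[i] * tax_rate
                wealth[i] -= tax
                pot += tax
        share = pot / n_agents                  # the «state» gives everything back, flat
        wealth = [w + share for w in wealth]
    return wealth

wealth = run()
print(min(wealth), max(wealth))   # dispersion after 200 turns
```

Total wealth is conserved by construction (transfers and the tax pot both sum to zero), so any dispersion you observe comes purely from the interaction and threshold rules.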

So the question is: how will it evolve?

Well, it is like visualizing a huge body with cells and a heart pumping. Which is nice.


I can predict that if there are Heaviside functions used by the «state», then it will evolve more often, and with bigger amplitude, towards an unfair system than a system with linear functions and no binary criteria...

The problem lies in the fact that there is «acausal» stuff in this system. Or delayed retroaction loops. And they tend to amplify violently.

Acausality means that an effect can have an effect on the cause (but always later in time). Taxing people too much will impact next year's potential income. The state's wealth is like a gigantic bathtub, but globally it requires sum(income) == sum(spending), and the income is solely taxes (I cheat, I know: I am closing the «state» system whereas it is an open system).

You will notice that the time constant of a feedback loop varies. Income taxes need one year to retro-propagate, while VAT feeds back almost immediately. Thus there are asymmetries both in the time constants and in the amplitudes of the feedbacks.

So now, we hit the run button of the simulation.

We follow agent 1, who is randomly chosen to need money from the state (food stamps? parental break? sickness?...). Utility increases.

Another agent may lose money coincidentally at the same moment (parking ticket, donation, ...). Utility decreases.

Now, we could imagine it is already the turn for paying your annual income tax.

And there is this Heaviside function tearing apart 2, 1 and the rest into clusters...

Time, in this asynchronous system, is the accident of the accident. Every time a transaction is made amongst agents, time increases (discretely).

Randomly, things will happen, with the same odds for everyone, unless their utility is null. When utility is null, you can't play outgoing interactions.

Now, 3rd turn: we are already playing the election. 1 and 2 are either above or below the utility of the crowd, so their odds of playing the election game are distinct.

At 0 utility you cannot play the game of election.

I make the following assumption: the probability of being elected is represented by a non-linear but growing function of the utility (wealth) that is 0 at 0, given the right kind of utility (if you have money but no time, you don't have «available wealth» for an outgoing interaction). Rule's name: «pas de bras, pas de chocolat» (no arms, no chocolate).

So, in the decision of running for office, I have to base my choice on:
my current mathematical interest;
my odds of winning;
and my «imitation factors».

Statistically, it is small, and should be considered the same kind of noise as random photons exciting the oxygen in the sky...

But let's add a little realistic bias to the agents:
they have a short-term memory;
they are all the more able to predict the future the more education they have.
In fact, this is too strong a hypothesis. Let's just say that something regulated by the state creates an asymmetry of information. I won't treat the case of a multiverse of rationalities per agent, but it should be treated. You can model them as sets of relationships to information, some being random (religious interpretation, or a star being behind the sun), some being relevant (bribery), plus a set of rules to dictate a future outcome based on «values». These have, of course, a feedback loop from the taxes. These supersets of individuals are legal entities, thus almost regular agents (polymorphism), like religions/schools/companies. For each superset an individual is in, the agent has access to rules and information based on a ratio of «fit with my own interest according to my memory». Such an agent can't sit in the parliament, but it can increase the odds of winning for people belonging to its superset.

For the sake of realism, we will consider that the lower this fitness variable is, the more the information is altered, so we randomly corrupt either a relationship or a rule.

But it is way too hard to code, so let's try a simple model that does not change much: a global, major education (Jacobinism) for the network of agents, where education is mostly a question of where you live, inclusive-or of how much money you have (through both your patrimonial and indirect income).

The agents still have a short-term memory and partial access to information, based on flags describing their cluster.

Well, at turn three I have 2 possible outcomes: 2 clusters of 1 and 1 cluster of n - 2, or 1 cluster of 2 plus the other ones.

Not much.

But every time an event happens, the simple fact that these gaps exist makes it re-propagate.

And since we said nothing about the height of this gap, it can make the difference between Charybdis and Scylla.

Imagine that you go to jail: you cannot earn money, and you can't play the game of election.

Imagine that, thanks to the tax system, you have a wonderful contract from DARPA. You are a cluster of one, but your utility for trying to change the system, because of potential bad surprises, rises. Who wants to pay taxes when you can rationally avoid it with less investment in utility?

Plus, the more education you have, the more you share your rationality with other agents, the more you see the retroaction loops and can predict the future (given that your rationality favours your agent), and you acknowledge the utility of sticking together. Karl Marx's Capital, in my opinion, was more useful for the powerful, to understand the need to act as a class, because they benefit the most from it. I sometimes wonder how much Karl Marx helped the emergence of the capitalism he was so strongly denouncing.

And remember, my situation impacts the neighbours on the network (wife, family ...).

Then, the more you see the retroaction feedback loops joining interests with yours, the more likely you are to interact with that agent... fast; and it spreads all the more in amplitude. It is strongly contaminating, the more it benefits you.

Every agent has its own reaction time, based on its channels of information.

Ex: some people know the Fed's new rates before they are even announced on the market. [find the link about orders at 14:00 in NY while the announcement is at 14:00 in Chicago]

So ... why does Heavyside make a difference?

These wheel-of-fortune non-linear effects also happen in a system without gaps.

The difference is that if you happen to use a continuous function, the result will smooth out the non-linear effect after n turns. Newcomers will come and live with the favoured, according to a progressive effect.

There will still be local optima with linear functions, which will make small valleys of clusters. But their depth will be smaller.

The Heaviside function will, of course, clusterize the population MORE, with more impact. Putting a binary flag on every single step introduces discrete domains with distinct rationalities: you try to find the channel that favours you best while risking the least. Some will have an interest in changing the laws to favour the conservation of the situation based on their interests; others will have a rationality of changing the «winning domains». Just random stuff you could model. Without knowing anything, you already know that the winning rationality will have to favour cluster effects, because the symmetry in the cause will have an impact in the effect. And it will be all the more efficient that it feeds back positively. All winning rationalities in a Heaviside-based complex system WILL favour strong discriminations that favour the clusters created by the initial Heaviside function. Here is the Capital's central thesis: there is a clear mathematical incentive for the more powerful to regroup together and, since they are favoured in their probability of having a positive action on the system for themselves, to favour clustering for their own good. They should favour laws that work best for them when society is (arbitrarily) fractioned by bigger gaps.

I don't say that all agents in the same conditions share the same rationality. Warren Buffett or Bill Gates asking to pay more taxes seem to contradict me.

It is just an effect of numbers: of imitation, spreading of information, majority of behaviour, and cumulative reinforcing effects.

The existence of the bias in representation/power systemically favours the strong clustering of conflicting rationalities, artificially. And it is, in my opinion, very hard to say whether education makes wealth or the opposite.
So saying that the clusters favoured in terms of wealth OR education will have a tendency to be over-represented in parliament is clearly the right way to say it.

Saying the more represented will favour their interest is kind of a trivial fact.

Rich people without education (no information, just lucky guys who won the lotto) won't care.

Poor people without education won't care.

Rich people (favoured by the clustering) and relatively poor people who acknowledge the bias (unfavoured by the clustering), with education, will care to change the system.

Now, if we introduce the fact that there is a clash when the tension is too big (we can measure an antagonistic rationality between two clusters with more than a certain amplitude), then it becomes unstable.

Every agent will tend to choose the information node/rule set that best serves its interests according to its rationality.

But, thanks to the clusters and the nature of the Heaviside function and all these binary flags it introduces, people's income will be levied through different paths that require different sets of information.

Thus we have diverging rationalities. And given enough education, there must be a conflict. If you see you have no chance of filling a gap, you don't try to fill the gap: you just change the gap. People will mechanically fight others belonging to arbitrary domains.

The funniest conclusion is that, in my model, the 99% should be called the 1%, and the 1% should be called the 0.01%.

1% vs 0.01% is the fight between (those who have favouring clusters and access to information) vs (those who don't belong to the more interesting clusters but have access to enough information to see it, or who are favoured but have an opposed rationality).

The simple fact of criticizing the 1% is already a proof you belong to the 1%.

The 99% movement, the Occupy Wall Street stuff, is not about trying to solve the inequity problem; it is about asking for a new order, because it is a rational choice for people who just want to be in the 0.01% and don't have access to it, yet.

Political disclaimer: I belong to the movement «we should all be the 100% and living happily ever after». The 100% in short.

So now, one big question: is it intrinsically bad to have a system that is more unstable than it would otherwise be? Is the discrete clustering bad?

Let's rephrase: do you prefer the funkiness of war, or the boringness of happiness and peace? Well, it depends, of course, on whether you have to die in the war or earn money from it.

A system that systematically induces arbitrary clusters of population that amplify themselves has less chance of being stable than a system without clustering.

In natural language: a society where rules are applied without any discrimination on the nature of the citizen is less likely to tend towards instability and strong self-amplifying discriminations. These discriminations are purely mathematical, amplified artefacts. Should we let artefacts rule our lives?

It is kind of better when people in a society share the same rationality, and there is less paranoia when the information is more symmetric.

Our systems are thus chaotic by nature, and behave more stochastically than they should, just because of a stupid function that introduces an arbitrary amplification of discrimination. It should be fixed. The laws should be rewritten to get rid of all the possible formulations like: IF blah THEN this ELSE that.

I strongly agree on the importance of sanctioning wrong behaviours or protecting the youngest (which are strong Heaviside functions); I don't agree with the multiplication of unnecessary non-linear clauses (IF SEX | EARN more than x$ ... THEN ...) in our social systems. They cluster us, and they make the effect of the law unpredictable, thus arbitrary. And as a human, I prefer control.

How could I prove I am right/wrong?


Well, if I were serious, I would bring proof. So I would have to make a simulation, give data, and make a model. Then I would claim to the world that I am an unrecognized genius. But I don't care; I am just waiting for my wife to come back, and it is my way of relieving the stress.


However, I gave multi-agent simulation a try: https://github.com/jul/KISSMyAgent
It could be used to model this. And I am pretty sure that by running a lot of simulations we would find that all our known systems (democratic, republican, communist, monarchist) are prone to this effect.

But I hated programming this stuff. So I don't recommend it.

I went on to an implementation based on distributed agents: https://github.com/jul/dsat

I began to use it in conjunction with graphite/carbon to store results. But it is faster to run the simulations in my head than on the computer, so I prefer to go directly to the results. ;)

So there it was: a recreational theory that is probably useless, but it was in my brain, so I unloaded it.

Just for fun: I just described a purely asynchronous distributed system.
It means that, with too many non-linear interactions, any real distributed system (the cloud, big clusters of distributed applications) also has this instability property.

Just think about it: I am saying that the cloud will become unstable one day, by nature. I am saying that the day it breaks, it will break in a massive, violent snowball effect, all the more so as non-linear rules are introduced (non-linear: switching traffic between interfaces, rejecting jobs on timeout, granting more resources to tasks that are already greedy on CPU instead of fixing the algorithm...). And since the effect is non-linear, we have no possible assessment of when and how. I have a strong suspicion the breakdown will be violent and undetectable. One day, you will wake up with an irreversible situation that affects you, without any possibility of foretelling it. Thus, no insurance can cover this phenomenon. No science... yet. We don't have mature analytical, theoretical and empirical tools to study these systems.

If I were you, I would not rely on systems that are chaotic and built by engineers who don't seem to see any problem with that. You just have a system that is stable as long as a given piece of network equipment in China doesn't flap its small BGP wings too much, yet oddly resists people trying to destroy its backbone with nukes...

I just hope it does not happen before I retire. ;) On the other hand, I am just a single guy without any credibility, and I seem to be a little too dramatic. So let's say it is just another stupid theory of no interest.

RFC 01 Human Handshaking protocol for instant messaging (work in progress)

Abstract


Instant Messaging (IM) can be disruptive and cognitively hard to handle because it requires context switching. This results in 2 potentially counterproductive effects:

  • lowering the quality of the conversation for both parties, who are not equally concentrated;
  • it can introduce a repulsion towards this protocol.

Since this is a human problem, this proposal is a human based solution.

Proposal


When you want to talk to someone, you ask for «real availability», a «time slot» and a «summary» of what you want to talk about, given a «priority». It is in the interest of both parties to agree on something mutually beneficial.

The idea is to propose a multicultural, loosely formal flow of conversation for agreeing to a talk in good conditions.
 

Implementation. 


Casual priority is fine and is the only proposed level.
Default arguments are:
  • time slot: 10 minutes (explained later). NEVER ask for more than 45 mins;
  • summary: What's up? (salamalecs explained later);
  • priority: casual (except if you want people to dislike you).

 

Time negotiation


Ex: «hey man, can you spare some 10 minutes for me?»


The interrogative formulation should put your interlocutor at ease, so he understands he can refuse or postpone.

Asking for an explicit time slot helps your interlocutor answer truthfully.

If the receiver is not answering it means he or she cannot.

Don't retry the opening message aggressively. Gracefully spacing the requests should be based on the history of conversations you have had. If you have not talked to someone for over 1 year, don't expect the person to answer you back in 5 mins, but rather in the same amount of time since you last interacted.

If you really want to push, multiply each retry delay by an order of magnitude. The minimum time before re-pushing should be set according to how busy your interlocutor is, your proximity with the person, and your «average level of interaction» over a rough moving average of one month.

It should never go below 5 mins for the first retry (with a good friend you interact a lot with) and 15 mins for a good friend you have not talked to in years.
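The retry schedule above can be sketched as a tiny Python helper (hypothetical, assuming the order-of-magnitude rule and the 5/15 minute floors):

```python
def retry_delays(first_delay_min, retries=4, factor=10):
    """Hypothetical helper: each retry waits an order of magnitude longer
    than the previous one, starting from a proximity-dependent minimum."""
    delay = first_delay_min
    for _ in range(retries):
        yield delay
        delay *= factor

print(list(retry_delays(5)))    # close friend: [5, 50, 500, 5000] minutes
print(list(retry_delays(15)))   # distant friend: [15, 150, 1500, 15000]
```

Setting `factor=2` instead gives the gentler ×2 spacing discussed later for answering solicitations.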

(try to find a rough simple equation based on sociogram proximity)

Summary/context.

Announcing the context

At this point, the talk is NOT accepted.
A tad more negotiation may be needed.

It is cool, for the persons interacting, to have a short summary, so that people can know if it will be "information" (asymmetric, with a higher volume from the emitter), "communication" (symmetric), or "advice" (asymmetric, but reversed).

Default is symmetric. Asymmetry is boring, and if so, you should think of NOT using IM.

Context: 

business related/real life related/balanced

Default:  balanced.

If you use IM for business-related stuff, I don't think this proposal applies to you. There are multiple ISO norms for handling support. People also tend to dislike doing free consulting, in an interruptive way, out of the blue. If you poke someone to ask him business-related stuff, you are probably asking for free consulting. Please, DON'T. There is no such thing as free beer. If you must, you should clearly propose a compensation, even if it is casual at the beginning.

Ex: Please, can you give me 10 mins of your time between now and Thursday on IEEE 802.1Q? I will gladly pay you back with a coffee on Sunday for your help.

Notice the importance of being polite. DON'T use imperative forms: they express orders. Use polite, structured forms. Give all the information in a single precise statement.

The more you need the advice, the less pushy you should be. It means you value this person much, and you should not alienate her/his goodwill.

Default : Salamalecs (work in progress)

When greeting each other, you can't help but notice that Muslims/Persians have an efficient, advanced human protocol for updating news on a social graph, called in French «salamalecs».
http://en.wikipedia.org/wiki/As-salamu_alaykum

I don't know about the religious part, but the human/cultural behaviour that results is clearly a handshaking protocol that seems pretty efficient.

I don't know how to transpose it yet in an occidental way of thinking, but I am working on it.

Receiver expected behaviour


People, in my opinion, tend to answer too much.

You have a life and a context. If you trust the person poking you, you expect him to know the obvious:
  1. you may not have time to answer;
  2. you may be dealing with a lot of stuff;
  3. it may be unsafe (either you are driving, or at a job interview)
  4. you may not be interested by the topic, but it does not mean you don't like the person.
Learn to not answer and not be guilty.

In the old days, we tended to send an ACK to every solicitation, because network delivery could fail (poorly configured SMTP, netsplits...) and we could not know if the receiver was connected.

Today, we receive far more solicitations, and we may forget about old messages.

If you did not answer, have faith in your interlocutor to re-poke you in a graceful way. The ×2 between every solicitation is based on the law of expected value («espérance» in French) when having incomplete information about the measure of an event.
Believe me: mathematically, it is pretty much a good idea to space every important solicitation by a ×2 factor (kind of like DHCP_REQUEST retries).

Once the topic/time are accepted, you can begin the conversation.
Content negotiation SHOULD NOT exceed 4 lines/15 minutes (waiting/1st retry included). The speed of negotiation should give you a hint about the expected attention span of the receiver.
If you can't spare the time for negotiating, DON'T answer back. It is awkward for both parties.

Time agreement: When // for how long.


minimum time slot: 7 mins.

Experimentally, it is good for better conversation: it makes you able to buffer your conversation in your head and raise the bandwidth.

Using a slow start that is casual, progressively getting into the subject, can be regarded as the human counterpart of old-time modems negotiating for the best throughput.

Your emitter is NOT a computer. Civility and asking questions about the context will help you adapt; it is not wasted time. It is clever to ask for news that is correlated to the ability of your receiver to be intellectually available. Slow start means you should not chain the questions in one interaction.

ex: Are you fine? How are you kids? Is your job okay?

Multiple questions are NOT a good opening. Always serialize your opening.

Making a branch prediction with combined questions may give awful results.

What if the guy lost his wife and kids due to his tendency to workaholism?

Once the time is agreed, you can set a hard limit by saying: clock on.

It is cool to let the person with the busiest context call the clock off.

It is fun to hold to your word about time. You'll learn in the process how time-consuming IM is.

A grace period after the clock is off is required to close the conversation gracefully with the usual polite formulations. It should be short and concise.

Ex:
A :  thks, bye :)
B : my pleasure, @++

References: 


To be done

* netiquette (IETF RFC 1855?)
* multitasking considered harmful
* something about RS-232 or any actual low-level HW protocol could be fun;
* maybe finding an outdated, old-fashioned book with funny pictures and a pedantic title like «le guide de la politesse par l'amiral mes fesses» ("the guide to politeness by admiral my arse") would be funny
* I really love salamalecs, so finding a good unbiased article by an anthropologist is a must
* putting in a fake normalization committee reference, or creating one like HNETF, could be fun: Human NOT an Engineer Task Force, with a motto such as «we care about all that is way above the applicative OSI layer», to parody/pay homage to the IETF.
* some SERIOUS hard data to back up my claims (×2 estimations, concentration spans, ...)

TODO 


format that as a PEP or RFC
make RFC 00 defining the RFC format/way of interacting to make this evolve
specify somewhere that it is a draft
find an IRC channel for discussing it :)
corrections (grammar/spelling)
experiment, and share to get feedback; maybe it could actually work.
don't overdo it.
make a nice state/transition diagram
provide a full example with a timeline (copy/paste, prune, s/// of an actual conversation that worked this way).
add a paragraph about multiculturalism and the danger of expecting people to have the same expectations as you

EDIT : name it salamalec protocol, I really love this idea.