In UNIX there is a specific error code, EAGAIN, which a system call can return. The kernel uses it whenever it is in a complex state where it is deemed too hard to produce a proper answer to the userland application. The solution is almost a non-solution: the kernel punts the context back to the user program and asks it to retry the operation. In the meantime the kernel can get rid of the complex state, so the next time the program enters the kernel, it may be in a different state where the trouble is gone.
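To make that retry pattern concrete, here is a minimal sketch in Python of the loop EAGAIN forces on userland code; the non-blocking socket and the sleep interval are illustrative choices of mine, not part of any particular API contract.

```python
# Sketch: the retry loop that EAGAIN pushes onto the caller.
# Reading from a non-blocking socket may fail with EAGAIN/EWOULDBLOCK;
# the caller is expected to simply go again later.
import errno
import socket
import time

def read_when_ready(sock: socket.socket, size: int = 4096) -> bytes:
    sock.setblocking(False)
    while True:
        try:
            return sock.recv(size)
        except OSError as e:
            if e.errno not in (errno.EAGAIN, errno.EWOULDBLOCK):
                raise
            # The kernel punted the problem back to us: wait briefly and retry.
            time.sleep(0.01)
```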
Here is an interesting psychological point: we can use our code to condition another person's brain into writing a program that serves our purpose. That is, we can design our protocols so that they force users to adopt certain behaviour in their programs. One such trick is deliberate fault injection.
Say you are serving requests through an HTTP server. Usually, people imagine that a successful request should always return 200 OK, but I beg to differ. Sometimes, say on 1 in 1000 requests, we deliberately fail the request and return a 503 Service Unavailable to the user. This conditions the user to write error-handling code for the request early on: you can't use the service properly without handling this error, since it occurs too often. You can even add a "Retry-After" header and have the client go again immediately.
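As a sketch of what that could look like on the server side (the handler, the exact 1/1000 rate and the Retry-After value are my own illustrative choices, not prescribed by the post):

```python
# Sketch of deliberate fault injection in an HTTP handler: roughly one in
# every 1000 requests is failed with 503 plus a Retry-After header, so
# clients are forced to handle that case from day one.
import random
from http.server import BaseHTTPRequestHandler, HTTPServer

FAULT_RATE = 1 / 1000  # probability of a deliberately injected failure

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if random.random() < FAULT_RATE:
            self.send_response(503)
            self.send_header("Retry-After", "1")  # ask the client to go again
            self.end_headers()
            return
        body = b"OK"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), Handler).serve_forever()
```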
This deliberate fault injection has many good uses.
- First, it forces users of your service to adopt a more biological, fault-tolerant approach to computing. Given enough of this kind of conditioning, programmers will automatically begin adding error-handling code to their requests, because otherwise their programs will not work.
- Second, it gives you options in case of accidents: say your system is suddenly hit by an emergency which elevates the error rate to 10%. This has little visible effect, since your users are already able to handle the situation.
- Third, you can break conflicts by rejecting one or both of the conflicting requests.
- Fourth, you can solve some distribution problems by failing the request and having the client retry.
- Fifth, simple round-robin load balancing is now useful. If a request hits an overloaded server, you just return 503 and the client will retry against another server (see the client-side sketch after this list).
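Here is the client-side counterpart referred to above: a sketch, with made-up hostnames and retry limits, of a client that retries against a round-robin pool of servers whenever it gets a 503.

```python
# Sketch: retry against a round-robin pool of servers when a 503 comes back.
# Hostnames, retry count and back-off values are illustrative only.
import itertools
import time
import urllib.error
import urllib.request

SERVERS = ["http://app1.example.com", "http://app2.example.com"]

def fetch(path: str, max_attempts: int = 5) -> bytes:
    pool = itertools.cycle(SERVERS)
    for _ in range(max_attempts):
        url = next(pool) + path
        try:
            with urllib.request.urlopen(url) as resp:
                return resp.read()
        except urllib.error.HTTPError as e:
            if e.code != 503:
                raise
            # Honour Retry-After if present, otherwise back off briefly,
            # then try the next server in the pool.
            time.sleep(float(e.headers.get("Retry-After", "0.1")))
    raise RuntimeError("all retries exhausted")
```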
I have a hunch that Amazon Web Services uses this trick. Against S3, I've seen an error rate suspiciously close to 1/500. It could be their own way of implementing a chaos monkey and conditioning all their users to write code in a specific way.
The trick is also applicable in a lot of other contexts. Almost every protocol has some point where you can deliberately inject faults in order to make clients behave correctly. It is very useful in testing as well: use QuickCheck to randomly generate requests and let a certain fraction of them be totally wrong. These wrong requests must then be rejected by the system; otherwise something is wrong with it.
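As a sketch of that testing idea, here is a property-based test using Python's Hypothesis as a stand-in for QuickCheck; validate_request is a hypothetical system under test, and the request shapes are invented for illustration.

```python
# Sketch: generate a mix of well-formed and deliberately corrupted requests
# and assert that the system rejects exactly the corrupted ones.
from hypothesis import given, strategies as st

def validate_request(payload: dict) -> bool:
    """Hypothetical system under test: accepts only well-formed requests."""
    return isinstance(payload.get("user"), str) and isinstance(payload.get("amount"), int)

well_formed = st.fixed_dictionaries({"user": st.text(min_size=1), "amount": st.integers()})
corrupted = st.fixed_dictionaries({"user": st.integers(), "amount": st.text()})

@given(st.one_of(well_formed, corrupted))
def test_bad_requests_are_rejected(payload):
    accepted = validate_request(payload)
    looks_valid = isinstance(payload.get("user"), str) and isinstance(payload.get("amount"), int)
    # The system must accept the valid requests and reject the wrong ones.
    assert accepted == looks_valid
```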
More generally, this is an example of computer programs being both formal and chaotic at the same time. One can definitely find interesting properties of biological processes worth copying into computer systems. While it is nice to be able to prove that your program is correct, the real world is filled with bad code, faulty systems, breaking network switches and so on. Reacting to this by making your system robust to smaller errors is definitely going to be needed, especially in the longer run, where programs will become even more complex and communicate even more with other systems; systems over which you have no direct control.
You can see fault injection as a type of mutation. The programs that cope with the mutation are the programs that should survive in the longer run.
Consider hacking the brains of your fellow programmers, and force them to write robust programs by conditioning their minds into doing so.
Thanks to DeadZen for proof-reading and comments.