Computer Science, Mathematics, Society, Relationships, Sex, 0xf00d, Pets, Leashes and Secretaries.
Tuesday, January 06, 2009
Common Erlang misconceptions
There is a common saying that Erlangs I/O layer is slow. Bullshit. The etorrent program can easily sustain 70 megabit with less than 5% of CPU usage on my old machines. The reason is that etorrent is shuffling 16k binary blocks around mostly. But that means you have a pointer to each 16k block and you are only a writev(2) call away from the disk.
This means I/O is not itself a problem. The problem that people often hit is Erlangs string handling. In Erlang a string can either by represented as an immutable array of 8-bit characters, called a binary; or it can be represented as a list of integers (ie, a linked list of integers). The latter is mutable in the sense that you can generate new integers and update the linked list from the front only.
Whenever you mutate the list of integers, garbage is generated. Allocation will happen and this directly translates to more garbage collection pressure later on. Had you had destructible arrays in Erlang it would have been very easy to circumvent the worst performance problems, but you will have to do with a limited zoo of types. In better languages, like Ocaml, Standard ML or Haskell, your type zoo is much richer here. You can choose between several immutable and mutable variants so the leverage is much greater.
First trick: Erlang has the concept of an IO list. These are lists of binary/string chunks. They are fast because they lend themselves to a writev(2) in the bottom of the system. Thus you don't have to concatenate strings in the Erlang language but you can hand a quasi-concatenated list to the VM and let C do the heavy-lifting for you.
Second trick: Mutating data as strings is slow. But if you have a abstract tree of tuples then it can be manipulated fast. If you think about the command [Bin | IOList] where we add a Binary to the front of an IOList, you'll know it happens in O(1) (constant) time. If we build a tuple {atom, D1, D2, ...}, we can build it in constant time. So the trick is to maintain your data in a non-string form for as long as possible and then only change it to a string at the very last point in time.
Third trick: Correct data structure. Ropes. Finger Trees. M-way trees. Don't brute force yourself over a string. Process the structure by pattern matching in a clever way so you minimize the amount of change needed. A good tree will have around logarithmic change whereas a list will have linear change. This leads to the next point: allocation.
Another crap problem in Erlang is that you have few (traditional) ways to minimize allocation. Any kind of operation will allocate data if you are not wary of it. Hence you should try to build programs that do not gloss over data and allocate all the time. Mapping of lists via list comprehensions is a good example. It is easy to understand but it will build a new list and garbage collection will cost you. If you can do multiple things in one list comprehension, you win traversals and allocations.
Whenever I hear about "Slow string processing" in Erlang I immediately think: "Another Ruby programmer trying to cram his language into Erlang". If you hit slowness the trick in Erlang is to redefine the problem! Don't try to string process yourself into oblivion. If the sole thing your problem concerns is string-processing, go use perl. Please. Or be clever and use Ocaml. Pre-process the data in other languages and make them into erlang terms which can be read fast by erlang. Erlangs power is the concurrency at which it is really good. There were no reason for giving it fast string processing.
Also, don't get bitten by the mnesia bug, please. It is almost always the case that mnesia and ETS are the wrong way to go. I got bitten by it multiple times in etorrent and now I am contemplating ripping all of it out again. Plain old data structures baby! There are some nasty caveats to mnesia and ETS and they are not related to the size limit of the database.
For mnesia, be aware that a failed transaction will be re-run! Now take 100 transactions tripping over each other. They will never finish. Etorrent did that when peers should update their state. Instant snail-speed application. Mnesia is crap at doing joins. Just forget it.
ETS is also problematic: Any store or get from ETS means a data copy. Don't do as etorrent and add large rows (128k+) to ETS. Your app will come to a grinding halt.
Don't message big data structures around. Any message incurs a copy (in the default VM). If you need to work with a big data structure, shove it into a process and then send the Pid of that process around or register him globally. You can think of a Pid as "Passing a pointer around" in this case.
And finally: Measure measure measure! Be scientific about your performance problems. You need to know exactly what causes it before jumping to conclusions. And don't worry too much about being parallel at the beginning. You can always split things up in finer granularity later.
Subscribe to:
Post Comments (Atom)
About Me
- Jesper Louis Andersen
- Lambda-loving CS Geek. Likes metal music. Likes dogs, cats. Does not like pictures of dogs and cats (unless they are lambdacats!)Has an unhealthy coffee addiction. Calls himself the coffee zombie in the morning (BEEEEANS!)Has a neverending curiosity gene, loves intelligence and passion.