1. Some torrent files contain a humongous number of files. Thousands. This is one of the problems you have to cope with as a client writer, and I plan to take care of it in both etorrent and combinatorrent. However, the solution I've adopted for etorrent is sinisterly beautiful, so I decided to write it down.
    The Problem
    Open files are limited on UNIX systems. This is to protect different applications against each other and to prevent resource exhaustion on the system. A typical limit is 1024 files, or fewer in some cases. In etorrent, a file is governed by a process which plays the role of that file. Whenever you want to do a file operation on that file, you get hold of the process Pid and send it a message. Simple.
    Resource usage is bounded by a timeout in these file processes. When a file has not been in use for 60 seconds, the process governing it terminates and frees up the resources held on that file. It works reasonably well. The problem, however, is that some torrents have more files than the file descriptor limit. When we check the files upon starting up, we unfortunately open more than 1024 of them and then hit the limit.
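If you want to see the limit on your own machine, here is a minimal sketch in Python (not etorrent code, and UNIX only). It uses the standard `resource` module; the helper name `open_file_limit` is just something I made up for illustration.

```python
import resource


def open_file_limit():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limit()
# The soft limit is the one a process actually runs into;
# resource.RLIM_INFINITY means "no limit".
print(f"soft limit: {soft}, hard limit: {hard}")
```

The soft limit is the number that matters here: once a process holds that many descriptors, further opens fail.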
    The Solution
    The solution is deceptively simple. We add a janitor process to the game. Whenever a new file is opened, the janitor gets informed and the file process enters itself into an ETS table. Whenever we do an operation on the file, a timestamp is bumped in the ETS table. This goes on and on; if a process dies, a monitor in the janitor cleans out its entry from the ETS table.
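To make the bookkeeping concrete, here is a toy model in Python rather than Erlang. It is only a sketch under my own assumptions: a plain dict stands in for the ETS table, string ids stand in for Pids, and the names `Janitor`, `register`, `bump` and `handle_down` are invented for this example. In the real thing the last of these would be triggered by the monitor.

```python
import time


class Janitor:
    """Toy stand-in for the janitor and its table: id -> last-use timestamp."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the example is deterministic
        self._last_used = {}     # plays the role of the ETS table

    def register(self, handle_id):
        """Called when a new file is opened."""
        self._last_used[handle_id] = self._clock()

    def bump(self, handle_id):
        """Called on every file operation; refreshes the timestamp."""
        if handle_id in self._last_used:
            self._last_used[handle_id] = self._clock()

    def handle_down(self, handle_id):
        """Called when a file handle dies; drops its entry."""
        self._last_used.pop(handle_id, None)

    def last_used(self, handle_id):
        return self._last_used.get(handle_id)

    def size(self):
        return len(self._last_used)


# A fake clock that ticks 0, 1, 2, ... on each call.
ticks = iter(range(100))
janitor = Janitor(clock=lambda: next(ticks))
janitor.register("a.dat")     # stamped 0
janitor.register("b.dat")     # stamped 1
janitor.bump("a.dat")         # restamped 2
janitor.handle_down("b.dat")  # entry removed
```

After this run only `a.dat` is left in the table, carrying the timestamp of its last operation.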
    Now, whenever a new file is opened we check the size of the table against a high watermark, 128 by default. If the table holds more entries than that, we extract the full table and order it by last bump. Thus, the first elements in the resulting list are the processes which were used least recently. We then ask enough of these to terminate to bring ourselves back under a low watermark - ensuring we won't be running the collection every time we add a new file process to the game.
    Updating the ETS table is expected to be rather cheap. The table is public, so each file-governing process maintains its own entry. I don't think they will spend much time waiting for each other on the table. And if they do, there is always {write_concurrency, true} we can set on the table.
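The selection step can be sketched as a small pure function. Again this is Python rather than Erlang and not the etorrent implementation. The function name `pick_victims` is made up, and the low watermark of 100 is an assumption of mine for the sake of the example; only the high watermark of 128 is the real default.

```python
def pick_victims(last_used, high=128, low=100):
    """Return the ids to terminate, least recently used first.

    last_used maps an id to its last-use timestamp. Nothing is returned
    until the table grows past `high`; then enough ids are returned to
    bring the table down to `low` entries. (low=100 is a made-up value.)
    """
    if len(last_used) <= high:
        return []
    # Oldest timestamp first, i.e. least recently used at the front.
    oldest_first = sorted(last_used, key=last_used.get)
    return oldest_first[: len(last_used) - low]


# 130 entries, where file0 has the oldest timestamp and file129 the newest.
table = {f"file{i}": i for i in range(130)}
victims = pick_victims(table)
```

With 130 entries the table is above the high watermark, so the 30 least recently used ids come back and the table ends up at the low watermark. The gap between the two watermarks is what buys the breathing room: the next 28 opens cost only a size check.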
About Me
I am jlouis. Pro Erlang programmer. I hack Agda, Coq, Twelf, Erlang, Haskell, and (Oca/S)ML. I sometimes write blog posts. I enjoy beer and whisky. I have a rather kinky mind. I also frag people in Quake.