The Problem
The number of open files is limited on UNIX systems. The limit protects applications from each other and prevents resource exhaustion on the system. A typical limit is 1024 open files, or fewer in some cases. In etorrent, each file is governed by a process which plays the role of that file. Whenever you want to do an operation on that file, you get hold of the process Pid and send it a message. Simple.
Resources are reclaimed by a timeout in these file processes. When a file has not been used for 60 seconds, the process governing it terminates and frees up the resources tied to that file. It works reasonably well. The problem, however, is that some torrents have more files than the file descriptor limit. When we check the files upon starting up, we unfortunately open more than 1024 of them and hit the limit.
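To make the idea concrete, here is a minimal sketch of such a file process with an idle timeout, assuming one gen_server per file. The module name, the raw open options and the read API are hypothetical; the point is only the timeout mechanism.

    -module(file_proc).
    -behaviour(gen_server).
    -export([start_link/1, read/3]).
    -export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).

    -define(IDLE_TIMEOUT, 60000).  %% stop after 60 seconds of inactivity

    start_link(Path) ->
        gen_server:start_link(?MODULE, Path, []).

    read(Pid, Offset, Len) ->
        gen_server:call(Pid, {read, Offset, Len}).

    init(Path) ->
        {ok, FD} = file:open(Path, [read, write, binary, raw]),
        {ok, FD, ?IDLE_TIMEOUT}.

    handle_call({read, Offset, Len}, _From, FD) ->
        %% Serve the request and re-arm the idle timeout.
        {reply, file:pread(FD, Offset, Len), FD, ?IDLE_TIMEOUT}.

    handle_cast(_Msg, FD) ->
        {noreply, FD, ?IDLE_TIMEOUT}.

    %% No request within the timeout: shut down and free the descriptor.
    handle_info(timeout, FD) ->
        {stop, normal, FD}.

    terminate(_Reason, FD) ->
        file:close(FD).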
The Solution
The solution is deceptively simple. We add a janitor process to the game. Whenever a new file is opened, the janitor is informed and the file process enters itself into an ETS table. Whenever we do an operation on the file, its timestamp is bumped in the ETS table. This goes on and on; if a file process dies, a monitor in the janitor cleans its entry out of the ETS table.
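A rough sketch of that bookkeeping, assuming a public ETS table named file_janitor_tab and a registered janitor process (the module, table and message names are made up for illustration):

    -module(janitor_sketch).
    -export([init_table/0, register_with_janitor/0, bump/0, handle_down/1]).

    %% In the janitor, at startup: create the table; the janitor owns it.
    init_table() ->
        ets:new(file_janitor_tab, [named_table, public, set]).

    %% In a file process, right after opening its file: insert an entry
    %% and tell the janitor so it can monitor us.
    register_with_janitor() ->
        ets:insert(file_janitor_tab, {self(), erlang:monotonic_time()}),
        file_janitor ! {new_file_process, self()}.

    %% In a file process, on every read/write: bump our timestamp.
    bump() ->
        ets:insert(file_janitor_tab, {self(), erlang:monotonic_time()}).

    %% In the janitor, when a monitored file process dies: drop its entry.
    handle_down(Pid) ->
        ets:delete(file_janitor_tab, Pid).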
Now, whenever a new file is opened, we check the size of the table against a high watermark, 128 by default. If more processes than that are open, we extract the full table and sort it by last bump, so the first elements of the resulting list are the processes that were used the longest ago. We then ask enough of these to terminate to bring ourselves back under a low watermark, ensuring we won't hit the resource collection every time we add a new file process to the game.
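A sketch of that reaping step, run by the janitor whenever a new file process registers. The low watermark value and the use of a plain stop message to ask a file process to terminate are assumptions for the sake of the example:

    -define(HIGH_WATERMARK, 128).
    -define(LOW_WATERMARK, 100).

    maybe_reap() ->
        case ets:info(file_janitor_tab, size) of
            N when N > ?HIGH_WATERMARK ->
                reap(N - ?LOW_WATERMARK);
            _ ->
                ok
        end.

    reap(Count) ->
        %% Sort all entries by timestamp, oldest bump first, and ask the
        %% Count least recently used processes to terminate.
        ByAge = lists:keysort(2, ets:tab2list(file_janitor_tab)),
        [Pid ! stop || {Pid, _Stamp} <- lists:sublist(ByAge, Count)],
        ok.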
Updating the ETS table is expected to be rather cheap. The table is public, so each file-governing process maintains its own entry. I don't think they will spend much time waiting for each other on the table. And if they do, there is always {write_concurrency, true} we can set on the table.
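If contention ever does become a problem, the only change in the sketch above would be the table options (again just an illustration, not necessarily what etorrent ends up doing):

    ets:new(file_janitor_tab,
            [named_table, public, set, {write_concurrency, true}]).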