1. At work, we have several geographically separated machines. This is no problem for Puppet: one just declares a variable for the location and then proceeds to factor out the differences between locations. I naively went along and chose one machine from each geo-location as the defining one for generic services in that location.

    We use OpenNTPd for our NTP services. Use the geo-location to pick the right DNS name/IP address, then set it in /etc/openntpd/ntpd.conf, ensure that the package is installed, and ensure the service is running on the host. Do the very same for the logcheck tool, though its configuration is less specialized to geo-locations. Go to lunch. When I come back, three hosts have tripped in the spider web of logcheck, all because their clocks are off. The magic is that the streamlining of hosts happens automatically. You don't have to hunt for the discrepancies; you just have to specify what the hosts should contain. I had no idea that some machines were without NTP. It is not because system administrators are lazy and have bad memories. It is because you need better tools to maintain your server farm, i.e., Puppet.

    The next thing I do is get apt to run via cron (there are several tools for this, pick your poison) and have it update the package database and download new updates; a sketch of the cron part follows below. Then the following little piece of magic from the Puppet website does wonders:
     file { "/etc/update_initiator":
      group => root,
      owner => root,
      mode  => 640,
      source => "puppet:///dist/config/update_initiator",
     }
    
     exec { "/usr/bin/apt-get -y dist-upgrade":
      refreshonly => true,
      subscribe => File["/etc/update_initiator"],
     }
    
    When '/etc/update_initiator' is changed, the dist-upgrade is run. The update_initiator file also serves as a descriptive changelog of when an update was pushed to the machines. Update that file on the puppetmaster and all machines happily install their updates.
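    For the cron part, a minimal sketch using Puppet's cron resource type (the schedule and the download-only flags are my own choices, not from the original setup):

     cron { "apt_nightly":
         # Refresh the package database, then pre-download upgrades.
         command => "/usr/bin/apt-get -qq update && /usr/bin/apt-get -dqy dist-upgrade",
         user    => root,
         hour    => 4,
         minute  => 15,
     }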

  2. No, this is not a post about Guitar Hero 3. It is about system administration and puppetry! If you manage more than 5 hosts you will need configuration management automation. Either you write your own scripts (shudder), use CFEngine (more shudder) or you use Puppet from Reductive Labs (there are .deb packages: puppet and puppetmaster).

    Choose a server that is going to be the puppetmaster. This guy should be fairly well protected, as he will be the server handing out configuration to other machines. Give him a repository in /etc/puppet via Subversion, Git, Hg, Bzr or your poison of preference. Make it such that a DNS lookup on puppet.[dnsdomainname] hits the puppetmaster. Enable the puppetmaster fileserver if not already done.

    Install puppet on the clients. Hand it the server address and it will try to communicate. It will fail miserably until you sign the clients' certificates on the puppetmaster. Then you are ready for the fun: Assimilation! I started with something easy. Behold, sudo.pp:
    class sudo {
        package { sudo: ensure => installed }

        file { "/etc/sudoers":
            owner   => root,
            group   => root,
            mode    => 440,
            source  => "puppet://puppet/dist/apps/sudo/sudoers",
            require => Package[sudo],
        }
    }
    
    This incantation tells Puppet that we need the 'sudo' package. Puppet is intelligent enough to use the right poison: portage, apt, aptitude, yum, pkg-install (Hi FreeBSD!), etc. for installing the package. Next it will update /etc/sudoers with a file fetched over the puppet protocol from the 'puppet' host (remember the DNS setup? You'll love it in the long run ;)), taking it from /dist/apps/sudo/sudoers. This path is prefixed with whatever is set in /etc/puppet/fileserver.conf, and the file is copied, checksummed and configured. B00m! You have now centralized your sudo configuration. Next up: NTP!
    class openntpd {
        package { openntpd: ensure => installed }

        file { "/etc/openntpd/ntpd.conf":
            source  => "puppet://puppet/dist/apps/openntpd/ntpd.conf",
            require => Package[openntpd],
        }

        service { openntpd:
            ensure    => running,
            enable    => true,
            pattern   => "ntpd",
            subscribe => [Package[openntpd], File["/etc/openntpd/ntpd.conf"]],
        }
    }
    
    Note the 'require' declaration, which orders the dependencies among resources, so we can be sure things trigger in the right order. Next we declare that openntpd needs a service 'openntpd' which should be running and enabled. The 'pattern' part tells Puppet to look for a program 'ntpd' to discern whether openntpd is running or not. Puppet is intelligent enough to do the enabling via 'update-rc.d' on my Ubuntu/Debian boxes. Final thing: locales (shudder).
    class locales {
        package { locales: ensure => installed }

        file {
            "/etc/environment":
                owner   => root,
                group   => root,
                mode    => 644,
                source  => "puppet://puppet/dist/config/environment",
                require => Package[locales];
            "/etc/locale.gen":
                owner   => root,
                group   => root,
                mode    => 644,
                source  => "puppet://puppet/dist/config/locale.gen",
                require => Package[locales],
                before  => Exec["generate locales"];
        }

        exec { "locale_gen":
            command     => "/usr/sbin/locale-gen",
            alias       => "generate locales",
            subscribe   => File["/etc/locale.gen"],
            refreshonly => true,
        }
    }
    
    The new thing is the 'exec' declaration, which will call '/usr/sbin/locale-gen' whenever the subscribed file '/etc/locale.gen' is updated (refreshed). Now you add the classes to the machines that need them through the 'node' facility of puppet (see the sketch below). And then you have centralized control over ntpd and locales as well. Of course, puppet can also do things depending on host types, fill in config-file templates, discriminate configuration in every conceivable way and so on. I can live with the fact that it is written in Ruby. It happens to be brilliant. If you want to start using it, I recommend going straight to the root of a complete setup and using that as a base for yours.
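    A minimal sketch of node declarations (host names invented for illustration):

     node "web01.example.com" {
         include sudo
         include openntpd
         include locales
     }

     # Fallback for hosts without a specific declaration.
     node default {
         include sudo
         include openntpd
     }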

  3. Some days ago I gave a talk at the local LUG on Erlang. The basic idea was to spark interest in the language while keeping it comprehensible to most people. Of course you need slides for such an occasion, and what tool but LaTeX is the solution? First, you'll need the BEAMER package. Ubuntu has a 'latex-beamer' package. Then you simply begin a LaTeX document with:

    Update: Changed 'fontend' to 'fontenc' which is the right thing to write here ;)

     \documentclass[20pt]{beamer}
     \usepackage[T1]{fontenc}
     \begin{document}
     ...

    Each slide page can then be added with the following construction

     \frame{
      \begin{center}
       Concurrency\\
       $+$\\
       Parallelism
      \end{center}
     }

    which creates a single slide frame on which text is placed. Calling 'pdflatex' on the document then produces a PDF you can show in 'evince' or a similar program. Run the document in presentation mode and you are set to go.

    The next part is graphics. One would think this is hard, but oh no! Not so! You just add \usepackage{graphicx} before the document environment. And then you can include images like I did in this slide (note that you can often omit the file-type ending and let LaTeX figure out that part by itself):

     \frame{
      \begin{center}
       \includegraphics[scale=0.5]{400px-Skull_and_crossbones.pdf}\\
       \textbf{\large{``Doom doom doom''}}
      \end{center}
     }

    Luckily, you are not limited to only PDF images. I used PNGs, JPEGs and a couple of other things.

    The Beamer homepage
    Slides from the talk
    Guy Kawasaki: the 10-20-30 rule for Pitching on YouTube
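    Putting the fragments above together, a minimal complete file (the image name is from my slide; substitute your own) should compile with pdflatex as-is:

     \documentclass[20pt]{beamer}
     \usepackage[T1]{fontenc}
     \usepackage{graphicx}
     \begin{document}
     \frame{
      \begin{center}
       Concurrency\\
       $+$\\
       Parallelism
      \end{center}
     }
     \frame{
      \begin{center}
       \includegraphics[scale=0.5]{400px-Skull_and_crossbones}\\
       \textbf{\large{``Doom doom doom''}}
      \end{center}
     }
     \end{document}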

  4. The first step in debugging Erlang programs is simple: Acknowledge that debugging concurrent programs is hard. Extremely hard. When you have that acknowledgement, the trick is to get your bag of debugging tools up to date. The very first thing you should do is to enable SASL. This can be done with the command
    application:start(sasl).
    
    or by using a boot script that includes the SASL application. SASL will enable different kinds of information in the shell, but it can also be configured to output to a file. SASL will send reports in different situations: whenever something CRASH-es or something PROGRESS-es, SASL will send a report to you.

    The next part of debugging is to understand how to dissect Erlang back-traces. Erlang will not report the line numbers where the error occurred. Hence, you should use many small functions, as each function will occur in the back-trace. You must also remember that the back-trace does not include tail-called stack frames. How do you read a back-trace from Erlang? The first thing is to know what the different error messages mean. The exit reason will tell us what was wrong at the point where the error was raised. The next part of the stack-trace looks like:
    ** exited: {function_clause,[{foo,fact,[0.00000e+0]},
                                 {foo,fact,1},
                                 {erl_eval,do_apply,5},
                                 {shell,exprs,6},
                                 {shell,eval_loop,3}]} **
    
    So what does this tell us? It says that the exit reason was a "function_clause" error, which arises because no clause matched in the foo module's fact function. The first tuple is the failing call itself, with its argument list: foo:fact/1 applied to the float 0.0. The following entries are the callers: the call came from foo:fact/1 again (a recursive call), which was invoked from "erl_eval:do_apply/5", and so on. More evil are local "fun (...) ..." declarations, which will be put onto the stack as well. Watch out for these.
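    For illustration, a definition along these lines (my reconstruction, consistent with the trace above) produces exactly that back-trace when you call foo:fact(1.0) in the shell:

     -module(foo).
     -export([fact/1]).

     %% Factorial over non-negative integers. Called with a float,
     %% the recursion reaches fact(0.0): the 0 pattern does not match
     %% a float and the N > 0 guard fails, raising function_clause.
     fact(0) -> 1;
     fact(N) when N > 0 -> N * fact(N - 1).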

    Assertions

    The single-assignment nature of Erlang lets us write assertions in a neat way in the code. If we write X = Y where both X and Y are already bound, we have written an assertion: it asserts that X and Y are equal. Note this is also true if X or Y are (partially) constant expressions. Use this to your advantage! Spray assertions all over your code. Write down what you expect the values to be. Write functions that test assumptions about the code. If an assertion is violated, you get an error right away. Use function guards. If you expect an integer, write
    foo(X) when is_integer(X) -> ...
    
    rather than foo(X) -> ... Sometimes guards can't be used, as only certain BIFs are allowed in guards. Then you write a longer function that tests your assumptions, and assert on that function. When things get concurrent, it is assertions that will save you. You should not worry too much about the cost of adding assertions: unless you add them in the cost-centre, they won't affect you much. Rather, you should worry about the correctness of the program. You can always remove assertions from critical parts and surround the critical part with a check if needed.
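    A small sketch of both styles (the function names are mine, purely illustrative):

     %% Guard style: only integers are accepted at all.
     double(X) when is_integer(X) -> X + X.

     %% Match style: '=' with both sides bound acts as an assertion.
     demo() ->
         Expected = 4,
         Expected = double(2),   %% crashes with badmatch if unequal
         ok.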

    Dialyzer

    One last debugging tool to mention is the dialyzer. It is a static checker for Erlang code, and it is able to find many discrepancies if you run it on your code. You should heed its warnings, for often they can alleviate a problem before it even becomes one. This concludes the 101 in Erlang debugging.

  5. Suppose you were an author writing books. A publisher wants to publish a novel on a particular subject; a sci-fi novel, say. Now imagine that the author will be paid nicely for each month he or she works on the book, but once the book has been written, it belongs solely to the publisher. The publisher will sell the book for the next 20-30 years and make a bit of money off it each month. The author will not see any of that money. A mediocre author would -- perhaps -- jump at this. A bestselling author, however, wouldn't. She would only accept writing the sci-fi novel if she got a considerable percentage of the earnings. She would know that her value is built by accumulating it in the books she writes, because they will throw off a small bit of money each month for 20-30 years.

    Now imagine you are a programmer at a company. Most programmers happily sign over all rights to the code to the company. They happily get their pay each month, and when they get fired or move on, their accumulated value is lost entirely. The paradoxical thing is: most programmers accept this. Some of their worth is in the code they wrote, but since it is entombed in the repository of the company, they have no access to that worth in their next job. I command this to stop. Now. Here is a set of ideas:

    First, you could tie the success of the company to your earnings, but be sure that you get something other than salary, which is gone when you leave. Stock options are one way. Or you may be one of the owners.

    Second, you could ignore the company and write code "in stealth" on the side. When you have a product you could try to shrink-wrap it into something salable. With luck, you can then either get something fun out of it, or a lot of money. Note that many companies disallow such activity. If you are at one such company, I suggest you move on and let them outsource your seat to India. They will be sorry for trying to outsource the system kernel (it always fails).

    Third, you could have the company release part of their code as open source. This leaves you with a way to take your work with you to another company, and with luck you also generate "street credit" in the Open Source community. In other words, you are generating accumulating value you can take with you when you leave. The company can get a feedback loop going on such open-sourced software, so it might end up being beneficial to them as well.

    Fourth, you could write software and release it under an open source license, but only do work on the system by contract. If a company wants tailoring, it is negotiable for a price. They get the best programmer for the job, and you back-fit the changes into your code, generating even more value in the project. Another path is to take money for support on the code base.

  6. So, one searches a job index in Denmark:

    ML: 0
    Lisp: 0
    Haskell: 0
    Erlang: 0
    Python: 1 - manage a Java EE build system(!)
    Perl: 2-3 system admin jobs

    The conclusion is clear: the Law of the Secret Weapons(tm) still holds true. You can pick anything from the list above and you will be able to enter almost any field with the best technical platform in 4-6 months' time. I am sick of the Java and .NET jobs which let me code in a mediocre language with mediocre IDE-crap on some project I have absolutely no love for. I can understand that the functional toolset and mindset of ML and Haskell puts people off. Lisp is harder to explain, since it has quite some backing in the commercial world after all. Same goes for Erlang - given that it currently has some of the hype. But seeing so few Python jobs astounds me. It is fairly easy to learn and has great backing from Google. It is not going to perish in the next 10 years. And you can deliver great software in a short amount of time with it. Python also has the advantage that its dynamic typing makes it easier for the PHP crowd to pick up. Static types are awfully cool, but they require some training and routine to use effectively to your advantage. Most integration work is also complete bliss to pull off in Python if you know what you are doing. So any company that wants to use any of the above can have me. I would be delighted to rip some Java/C# programs apart with a real language.

  7. First you find a pid. Then you run

     fprof:trace([start, {file, "f"}, {procs, [pid(0,1292,0)]}]).

    and let it cook for a while; "f" will contain the raw profile data. When it has cooked for some time, you can run

     fprof:trace(stop).

    to close the file and stop the trace. Then you read in the data with

     fprof:profile(file, "f").

    which builds a database in memory from the profile run. The analysis output is then generated in 150 columns with

     fprof:analyse([{dest, "f.analysis"}, {cols, 150}]).

    to the file "f.analysis". To understand the profile output, you can read the Tools User's Guide, which has a chapter on fprof and describes the output format. Let the CPU-cycle hunt begin!

  8. So I am hacking a lot of Erlang at the moment. The problem is that Erlang is a dynamically typed language, and bugs are sneaking into my code. This is where the dialyzer enters the game. It is a system which takes Erlang code, infers types for it and uses them in a static analysis in order to find errors. For me, this is a very valuable tool. While you can just run the code, it may take some time before you cover all cases, so periodically running the dialyzer can find bugs before they show up in running code. You compile your code and then you run dialyzer -r "path-to-ebin-beams", and the dialyzer will process your code and report problems. You will have to produce a PLT first. The PLT is a table seeded with type information from the other applications your code depends on. Creating the PLT takes time. A lot of time. If you run the dialyzer without a PLT, you get hints on how to produce the initial one (see the sketch below). I most often use the dialyzer before committing a larger set of code or after merging a branch upwards (from less stable to more stable code). Sometimes it finds messy stuff. And then I can go and fix it!
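    A sketch of that workflow (exact flags vary between Dialyzer versions, so trust the tool's own hints over this):

     # One-off, slow: build the initial PLT from the OTP applications you use.
     dialyzer --build_plt --apps erts kernel stdlib

     # Thereafter: analyse your compiled beams against the PLT.
     dialyzer -r path/to/ebin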

  9. So, writing a torrent client goes through 3 phases. In the first phase, you implement what is in the spec. These are the obvious things you need and the building blocks for the later steps. In the second phase, you read the spec and figure out how things must fit together. Luckily, there is only one way to fit them in most places, but there are some small nasty exceptions where you need to read the code of other clients to figure out what should happen. In the third phase, you go beyond the spec and implement all the features that are more or less undocumented, but crucial for speed. Here is an example:
    It does reciprocation and number of uploads capping by unchoking the four peers which it has the best download rates from and are interested. Peers which have a better upload rate but aren't interested get unchoked and if they become interested the worst uploader gets choked.
    So this is simple enough to implement: sort the peers on download speed to the client and go through them one by one, choking and unchoking accordingly (a sketch follows below). But you need to recalculate this on state changes. And then there is optimistic unchoking, which must also be handled. Seems easy enough. The etorrent client currently has numerous problems with the above scheme. First of all, the measurement of download rates must be a running average, dampened and fine-grained. Second, the 'four peers' part is a lie, and no modern client does that. Rather, they choose the number based on the sum of the download rates and a heuristic. Of course, this can only be understood if you read the source code of other clients. This is the undocumented, hairy part of the bittorrent protocol.
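    A stripped-down sketch of that sorting step in Erlang (the module and tuple shape are mine; it ignores optimistic unchoking and the better-but-uninterested case from the quote above):

     -module(choke_sketch).
     -export([partition/2]).

     %% Peers are {Pid, DownloadRate, Interested} tuples.
     %% Unchoke the N interested peers with the best download rates;
     %% every other peer ends up in the choked list.
     partition(Peers, N) ->
         ByRate = lists:reverse(lists:keysort(2, Peers)),
         Interested = [P || {_Pid, _Rate, true} = P <- ByRate],
         Unchoked = lists:sublist(Interested, N),
         {Unchoked, ByRate -- Unchoked}.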

  10. I have been working on an Erlang bittorrent client for too many years now. The reason it takes so long to get the code up and running is that it is a hobby project and that I have plenty of time to do it right. So at the moment the code works, but it does not have the stability I would like it to have in the long run. Still, the code does work for downloading smaller things, and I would like to have more hackers on the project.

    To document my setup, I had better explain what is used from the perspective of development. The editor is emacs with the distel mode on top of it; see the Distel Google Code site. The distel program allows me to dynamically upload recompiled code to a running Erlang node, so I can cut the fix time down to a minimum. For a dynamically typed language this is quite important, as most bugs are related to wrong types. Of course I run the standard Erlang distribution, without HiPE. I have not found a reason to compile the code for speed yet, as it eats less than 1% CPU when running. All database table access in the current code is linear, so there is definitely room for improvement, and it will be needed once you want to download more than a single torrent file.

    The code is kept in a Git repository at repo.or.cz, and the bug tracker is on the Google Code site. I tend to keep the master branch in a stable baseline state, so if you want to hack you had better base your work on that. I also keep my own main working branch in the repository, so you can track what I am doing that is not yet ready to go into the baseline.

    Do I like hacking Erlang? Yes and no. I really like the language, but the lack of a static type system is seriously pissing me off at the moment. But you can't have it all, and Erlang does have a production-quality system, which is rather hard to find with most other esoteric languages. The closest contenders are Haskell and Ocaml.
