1. So I spent the time necessary to look into this behavior-driven development (BDD) idea. Summing it up in one sentence: "BDD is computable specifications". That is, you write a specification which is itself a program. This program takes a component as input and produces either success or failure as output. A driver program then runs all the behavior-programs in succession and produces a report for the user or programmer.
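    A minimal sketch of that sentence, assuming Python as the host language. The component under test (a sorting function), the three behaviors, and the names `run_behaviors` and `format_report` are all made up for illustration; they are not taken from any BDD framework.

```python
from typing import Callable, Dict, List, Tuple

# The (hypothetical) component under test: anything that sorts a list of ints.
Component = Callable[[List[int]], List[int]]
# A behavior is a program: component in, success (True) or failure (False) out.
Behavior = Callable[[Component], bool]


def sorts_ascending(sort: Component) -> bool:
    return sort([3, 1, 2]) == [1, 2, 3]


def leaves_empty_list_empty(sort: Component) -> bool:
    return sort([]) == []


def does_not_mutate_input(sort: Component) -> bool:
    data = [2, 1]
    sort(data)
    return data == [2, 1]


BEHAVIORS: Dict[str, Behavior] = {
    "sorts ascending": sorts_ascending,
    "leaves an empty list empty": leaves_empty_list_empty,
    "does not mutate its input": does_not_mutate_input,
}


def run_behaviors(component: Component,
                  behaviors: Dict[str, Behavior]) -> List[Tuple[str, bool]]:
    """The driver: try every behavior in succession and collect the outcomes."""
    report = []
    for name, behavior in behaviors.items():
        try:
            ok = behavior(component)
        except Exception:
            # A component that crashes does not exhibit the behavior.
            ok = False
        report.append((name, ok))
    return report


def format_report(report: List[Tuple[str, bool]]) -> str:
    """Render the outcomes as one line per behavior."""
    return "\n".join(f"{'ok  ' if ok else 'FAIL'} {name}" for name, ok in report)


if __name__ == "__main__":
    print(format_report(run_behaviors(sorted, BEHAVIORS)))
```

    Running the driver against the built-in `sorted` reports every behavior as satisfied. Running it against a component that sorts in place would flag the last behavior as failed.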
    This is a natural extension of test-driven development, where one writes a computable test for each component in the software system before the component itself is written. The insight is that these tests specify behaviors of the system, so one can name them as such rather than as "tests".
    The idea of building a model of the specification which can be computed is not new. There are strong similarities with Design by Contract (Meyer, 1986), and I am sure you can find older examples in the Lisp community. Sadly, the term BDD is not rigorously defined, in keeping with the tradition of most software development terms. Thus it means different things to different people. Some will say that DbC is BDD and some will say that it is not.
    Main insight: if you have a specification written in English prose, you have no way to automatically check whether the program lives up to the specification. You can only test it manually, which is a tedious, cumbersome, and error-prone process. If the process is computable, we can get the machines to work for us and work out the details.
    Getting the machine to understand the specification means it needs to be precisely specified, like a computer program. Our hope is that while the program specified may be complex, the specification itself is not. Thus we can specify the behavior of the program and have some assurance that the specification is correct. If the specifications are as hard to write as the main program, we would need a specification of the specification, and it turns out to be turtles all the way down.
    There are several ways we could attempt to write down a formal specification. We could, for instance, use Hoare logic (HL), meaning we specify the behavior of the program in an assertion language different from the implementation language. One then tests the specification by verifying that the rules of HL are fulfilled.
    Another path is to make the type system strong enough to capture formal logic and specify the program in the type system. To achieve this, one must have a type system much more expressive and powerful than what most mainstream programming languages provide. Testing the specification then amounts to a type check. Aside: Expressive type systems like the ones found in Haskell or ML provide an approximation to this complete correctness. When we say that "type systems eliminate bugs in the program", we mean that the approximation caught a bug which would otherwise have slipped into the program. Note that (static) type systems are a knob you can turn. The more expressive the system, the better the approximation to the specification. Turned to 11, the type system is the specification, but most programmers abhor the idea of programming at this level of precision. End of aside.
    The BDD path is to define the specification in the host language as a program itself. Usually this can be done by a compiler from a domain-specific language (describing behaviors) into the language in which the tested component is written. The advantage of this approach is that you don't need a separate tool chain but can reuse the existing one for running your tests. Testing the specification is done by running the spec-program against the implemented components and observing their behavior.
    Thus, BDD is a runtime-oriented path to verifying the program specification. We check the program's correctness by executing the tests. I would imagine that we also design the behaviors such that they cover all paths of the component. Whether they do is fairly easily checked with a coverage checker.
    Personally, I see the BDD idea as a poor man's formal specification: it is what you end up with if you just use the available tools rather than define a spec-language. You end up with exactly the same problems as most formal specifications anyway: explaining and understanding what they mean is hard. The problem can be somewhat alleviated by clever tricks where the actual code is hidden behind a textual presentation which normal people would have a chance of understanding. This part of BDD circles around the problem of defining a good user interface for the specification in question. This is the part of BDD I find the most interesting.
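    One way to picture the "textual presentation" trick, as a rough sketch only: plain sentences are matched against patterns, and each pattern is bound to a small piece of host-language code. The `step` decorator, the `run_scenario` function, and the stack example are all hypothetical names invented here; real BDD tools differ in the details.

```python
import re
from typing import Callable, List, Tuple

# Registry pairing a sentence pattern with the code that implements it.
STEPS: List[Tuple[re.Pattern, Callable]] = []


def step(pattern: str):
    """Bind a readable sentence pattern to a step function."""
    def register(fn):
        STEPS.append((re.compile(pattern), fn))
        return fn
    return register


@step(r"Given an empty stack")
def given_empty_stack(ctx):
    ctx["stack"] = []


@step(r"When I push (\d+)")
def when_push(ctx, value):
    ctx["stack"].append(int(value))


@step(r"Then the top of the stack is (\d+)")
def then_top_is(ctx, value):
    assert ctx["stack"][-1] == int(value)


def run_scenario(text: str) -> bool:
    """Run each sentence through its step; True means the behavior holds."""
    ctx = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        for pattern, fn in STEPS:
            match = pattern.fullmatch(line)
            if match:
                try:
                    fn(ctx, *match.groups())
                except AssertionError:
                    return False
                break
        else:
            # A sentence nobody implemented is an error in the spec itself.
            raise ValueError(f"no step matches: {line!r}")
    return True


SCENARIO = """
Given an empty stack
When I push 7
Then the top of the stack is 7
"""

if __name__ == "__main__":
    print("ok" if run_scenario(SCENARIO) else "FAIL")
```

    The reader of the specification only sees the three sentences in `SCENARIO`; the step functions stay out of sight.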
    Also, the fine merits of BDD might be the first casualty of any deadline. Deadlines kill. Anything which fits the scheme of "short-term good, long-term bad" and incurs technical debt is put to use as soon as the deadline approaches. Everything falls to a fast cost-benefit analysis. The problem here is that there is no measurable way to evaluate different software development methodologies, so throwing dice is an equally good and faster analysis.

  2. Let there be no doubt that PostgreSQL is my database of choice. I like my databases to use SQL, a declarative language, for my operations and queries. I guess the reason for this is my strong knowledge of and experience with functional languages in general. The path from functional programming to SQL is rather short.
    This is a list, in no particular order, of reasons why I love PostgreSQL. Note that many of these things are far better explained in the excellent PgSQL documentation, so you can look things up there.
    • First, PgSQL takes types seriously. Data is typed from the outset and you need to convert data to match the given type. This means we rule out dates like '2009-02-31' by default. MySQL, for instance, fails this blatantly. It also allows us to import data into a typed language fairly easily as soon as we have a type mapping defined.
    • PgSQL has TOAST storage. This includes several techniques like compression and out-of-line storage. What this means is that you can process data faster, since disk I/O is what tends to kill database performance as soon as your data grows beyond 'fits-in-memory'.
    • PgSQL can do partial indexes. I keep partial indexes on unprocessed data. Hence, even if the table contains billions of rows, I can quickly grab the unprocessed records and process them. I also keep some partial indexes around for performance.
    • PgSQL does bitmap index scans. These can combine several indexes into a bitmap telling the system which rows or pages to retrieve. Again, it lowers the amount of disk I/O needed to access data. Wonderful for complex queries.
    • PgSQL uses multi-version concurrency control (MVCC). This means you can take backups of your database without having to worry too much about what happens while the backup runs. It also means readers do not block writers and writers do not block readers, so you get your ACID guarantees without hurting the performance of individual queries.
    • You can disable autocommit by default in the psql client. No more accidentally destroying data.
    • Data definitions (the DDL) are transaction safe! You can undo an ALTER TABLE statement if it goes wrong. It turns schema changes into a blissful operation: ALTER the table, CREATE VIEW to get the old view of the data back, COMMIT the transaction. You can do this on a live system, which is wonderful.
    • Need to add millions of rows? Use the COPY command (or its psql client cousin, \copy). It is a lot faster than INSERT INTO for large amounts of data. My work desktop machine with an old SATA disk imports millions of (rather large) rows in a matter of minutes. This includes building a couple of indexes on the data.
    • Need to export data? CREATE TEMPORARY TABLE export ( ... ), fill it with a SELECT statement, fire up COPY. Done. I've built many reports directly in the guts of the psql prompt with a couple of VIEWs and this technique.
    • PgSQL is fast, especially for complex queries. This holds even more for recent PostgreSQL releases, where performance was the optimization target. And it paid off. Combine it with a query cache or a dedicated key-value store and you have a very stable, robust and fast platform which can scale to thousands of simultaneous users.
    • Built-in full-text search. I've used this to answer users' search queries on the data with ease.
    And I get to combine PgSQL with OCaml, awk and R. Don't underestimate the data manipulation power of awk; it is extremely fast.
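    To make the partial-index point from the list concrete, here is a small sketch. It does not talk to PostgreSQL: to stay self-contained it uses SQLite through Python's standard `sqlite3` module, which accepts the same `CREATE INDEX ... WHERE ...` form. The `jobs` table, its columns, and the helper names are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for a PostgreSQL server (an assumption
# made only so the example runs without any setup).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs ("
    " id INTEGER PRIMARY KEY,"
    " payload TEXT NOT NULL,"
    " processed INTEGER NOT NULL DEFAULT 0)"
)

# The partial index only contains the rows that still await processing, so it
# stays small no matter how many processed rows pile up in the table.
conn.execute("CREATE INDEX jobs_unprocessed ON jobs (id) WHERE processed = 0")

# 10,000 rows, of which only every 1000th is still unprocessed.
conn.executemany(
    "INSERT INTO jobs (payload, processed) VALUES (?, ?)",
    [(f"job-{i}", 0 if i % 1000 == 0 else 1) for i in range(1, 10001)],
)


def unprocessed_ids(db: sqlite3.Connection) -> list:
    """Fetch the ids of unprocessed rows; the WHERE clause matches the index."""
    rows = db.execute("SELECT id FROM jobs WHERE processed = 0 ORDER BY id")
    return [row[0] for row in rows]


def mark_processed(db: sqlite3.Connection, job_id: int) -> None:
    """Flag a row as processed, which also drops it from the partial index."""
    db.execute("UPDATE jobs SET processed = 1 WHERE id = ?", (job_id,))


if __name__ == "__main__":
    print(len(unprocessed_ids(conn)), "rows left to process")
```

    In PostgreSQL the index definition would read the same, with a boolean `processed` column being the more natural choice.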

About Me
I am jlouis. Pro Erlang programmer. I hack Agda, Coq, Twelf, Erlang, Haskell, and (Oca/S)ML. I sometimes write blog posts. I enjoy beer and whisky. I have a rather kinky mind. I also frag people in Quake.