04.24.06

Free-software licenses: requirements vs. requests.

Posted in Programming and People at 6:39 pm by Brooks

I’ve been thinking quite a lot about free software licenses lately, and how important it is to get them right. A couple of recent events will serve to illustrate some of the important points:

  • The “listings” package for LaTeX is invaluable for including computer source code in a document. It’s something like syntax highlighting in a source code editor, but for typeset output rather than on-screen, and very easily user-customizable. Carsten Heinz spent almost a decade writing it and perfecting it, and it’s a marvel of intricate TeX programming — 218 pages of typeset user documentation and source code. Then, about a year ago, Carsten disappeared. As far as I know, nobody in the TeX community knows what happened to him; he simply stopped replying to email, and letters to his last known physical address just disappear. As a result, the official version of the listings package has effectively been frozen for the last year, despite a couple of important bugfixes that have been informally circulating. Luckily, this isn’t a permanent situation; it’s licensed under the LaTeX Project Public License, which provides a process by which the maintainership of the “official version” of a package can be transferred to someone else in such a case.
  • In a blog post I was reading this morning (I’m leaving this vague so as not to point fingers excessively at only one example of a common problem), the author mentioned a general-purpose data structure that he had included in his open-source simulation code, and provided a link to the source file for the implementation. Like many such implementations, it’s sufficiently general that it would be useful in many programs that I’m writing, and it’s a better implementation than I could come up with (and test!) in a day’s work. However, there’s a problem: The license for this software package is not quite the same as any common software license, and it requires that its particular set of conditions and disclaimer be included in any redistribution. Further, the license has the relatively common clause that any scientific publications which contain data resulting from use of the program must cite it.

I’ll start with the last item. It’s completely reasonable that the author of a piece of scientific software would want to be cited when their work is used, so what’s wrong with putting that in the license? The answer is a lesson that should have been learned years ago. Such a clause only makes sense for cases where what’s being redistributed is pretty much the same as the original work. Beyond the well-known problems described at that link (short form: someone else builds on it enough to merit citation and adds their paper, and after a few repeats of that it quickly becomes unwieldy), there’s the problem of partial reuse: if I include this little data structure implementation in my program, according to the license I then have to cite their paper in all of mine, even though my work is in a quite unrelated field. And so does anyone who uses my program. While that might arguably be appropriate for an advertising clause, it’s not appropriate for scientific citations.

So, functionally, I pretty much can’t reuse their data structure in my own software, if I wish to do so under its published license. That’s completely contrary to the spirit of free software; even though it’s open source and I can read it, the license prohibits me from simply copying it and reusing it unless I’m willing to agree to inappropriate terms. This almost certainly isn’t the intent of the program’s author; he simply didn’t consider the possibility of someone reusing a small piece of his program when he decided how to license it.

In this case, there’s probably a simple solution. If I want to use this code, I can write to him and ask for permission to use it under a different license than the one it’s published under — specifically, one which doesn’t contain this citation clause. Probably he’ll agree, and that will be that.

However, this is where the first example is relevant. What if it was something in Carsten’s code that I wanted to reuse, and he had a similar sort of clause in his license? Plain and simple, I’d be stuck. Even if I were willing to ignore the license on the assumption that his heirs wouldn’t find out or wouldn’t care enough to sue me, I certainly couldn’t relicense the code with a clear conscience.

That’s why license compatibility is important, and why it’s a bad idea to add private clauses like the “cite my paper” clause. Without that compatibility, free software isn’t a commons; it’s a balkanized set of little gardens fenced off from each other and unable to cross-fertilize. We’ve had this problem with railroad tracks, with fire hoses, with all sorts of things early in the industrial revolution; as engineers, you’d think we’d know better by now. Four feet eight and a half inches probably isn’t the optimal solution for any given railroad in isolation, but it’s certainly better in practice in the real world for nearly all of them, because it’s what everyone else uses, and in the long run it’s just not worth the trouble of being different.

“But,” you say, “I still want people to cite my paper when they use my software!”

Consider this: Most researchers and users of your program are, by and large, honorable and honest people. They understand the value of citing sources, and they appreciate that you’ve written this software and made it available for them to use. For the vast majority of them, all that you need to do to get them to cite your paper about the software is to ask.

License terms aren’t “asking”. They’re a demand, one with legal teeth behind it. They’re what you put in there for the people who you don’t trust to do right, and for the corporations that are too big to have morals at all. They ought to be the final line in the sand for the things that are vitally important: you don’t claim you wrote this, you don’t claim I’ll support it, you don’t put restrictions on it that I disagree with. If it’s not that critical, and if you wouldn’t sue someone for ignoring it, then setting it in that sort of legal inflexibility does more harm than good.

Thus, I propose the idea of adding a “Requests” or “Moral Obligations” section to software licenses, for this sort of thing. When I release a program that’s large enough to merit a paper citation if someone uses it, somewhere near the license clause in the documentation will be a section that says something like this: “The requirements in this section are not legally binding; however, they represent things that the author would appreciate: If you write a paper based on work that uses this software, please include a citation to my paper, and send me a copy of your paper. If you create a derivative work based on this software, tell me about it and let me know how I can get a copy, and include this request if it’s appropriate.”

In virtually all cases, that will work just as well as adding those as terms to the software license. (How do I know it will work? Well, it already works quite well for the papers themselves, though there the request is implicit rather than stated.) And, importantly, it will work equally well in the cases where something I asked for turns out to be terribly inconvenient for some reason that I didn’t think of when I wrote it, even if I’m not around to change the license terms.

6 Comments »

  1. TWAndrews said,

    May 23, 2006 at 1:47 pm

    Wouldn’t it be simpler to put something like the following in the license:

    “If it does not present an undue burden to use or reuse of , and a reasonable individual would understand the benefit of <code> in the licensed context, and the licensed work constitutes a substantial fraction of the new work, licensees should cite the original work.

    In all cases the licensee must make a good faith effort to inform licensor. A detailed email to the following address which describes the nature of the reuse shall be considered consistent with fulfillment of this obligation. Where reasonable, the licensee should include a copy of the resulting software or paper in this communication”

    Essentially it’s just a matter of having flexible license terms for attribution. The advantage of this over a “Requests” section is that terms like “good faith effort”, “reasonable individual” and “undue burden” all have a history in law, so it wouldn’t be making up entirely new concepts. This would ensure that the originator of the work got cited where appropriate, but that at the point where it was an enormous pain in the ass, the attribution would no longer be required.

  2. Brooks said,

    May 23, 2006 at 8:57 pm

    Thanks for the comment! That would certainly be better than the original wording; I like the idea of phrasing it as a good-faith effort and allowing for undue burdens.

    I’m not really convinced that it’s simpler than a “requests” section, though. From a legal standpoint, the “requests” section isn’t making up a new concept either; there’s lots of history of contracts having preambles and other things that are legally meaningless, and from the standpoint of the law this is no different from those.

  3. TWAndrews said,

    May 24, 2006 at 6:59 am

    there’s lots of history of contracts having preambles and other things that are legally meaningless

    That’s true. But I still think it would be easier to convince people to use open licenses if they felt they had some legal rights in terms of receiving credit, particularly in the case where the development is done by someone who works for a large organization with lawyers and management who are, shall we say kindly, less likely to see the importance of enabling easy reuse.

  4. Dan Gezelter said,

    May 31, 2006 at 1:22 pm

    The flexible attribution section is a great idea. You should come up with a formal wording and a name for this license and submit it to OSI for certification as an open source license. I’d switch over to it.

  5. The OpenScience Project » Free-software licenses said,

    May 31, 2006 at 1:59 pm

    [...] Everyone should go read Brooks Moses on Free-software licenses: requirements vs. requests. His post has made me re-think the license we use for our group simulation code. I’ve never liked GPL because it essentially guarantees that friends in the corporate world won’t be able to use our code in their products; the simplicity of the BSD-style license has always appealed to me. As many people who adopt the BSD-style license have done, I threw in this attribution clause: Acknowledgement of the program authors must be made in any publication of scientific results based in part on use of the program. An acceptable form of acknowledgement is citation of the article in which the program was described (Matthew A. Meineke, Charles F. Vardeman II, Teng Lin, Christopher J. Fennell and J. Daniel Gezelter, “OOPSE: An Object-Oriented Parallel Simulation Engine for Molecular Dynamics,” J. Comput. Chem. 26, pp. 252-271 (2005)) [...]

  6. naisioxerloro said,

    November 28, 2007 at 8:32 am

    Hi.
    Good design, who make it?
