The terms "Open Source" and "Free Software" almost always describe the licensing state of a piece of software. For the software projects that they are attached to, they describe a crucial attribute: your rights as a software developer and user to read, modify, build on, and use the source code of a project for your own purposes. "Open Source" and "Free Software" are also implicitly attached to the concept of an open development process, in which the software is worked on, improved, and distributed by a self-selected group of developers to the benefit of all, but this is not always the case. This article describes and discusses the development processes of four projects that I am familiar with in the Open Source GIS domain, and describes the role that I think OSGeo should play in ensuring open development for its member projects.
Karl Fogel's "Producing Open Source Software" is the book to read on this topic. It is the handbook of open development: it describes the attributes of a successful, openly developed open source project, and it even covers sticky topics such as money entering the project, forking, and governance. If you get a chance, I highly recommend you read through it, if only to become familiar with the attributes of open source projects with respect to development and with the cues that help you evaluate them.
During the summer of 2005, ERMapper released the source code to their wavelet compression/decompression engine, called ECW. While the source code is technically open, the licensing terms do not meet the requirements of any of the mainstream open/free software licenses, and by all accounts, the development process of ECW is anything but open.
My opinion is that ERMapper is in limbo with respect to opening ECW. They want outside contribution, especially with respect to building and deploying the software and fixing bugs, but they are concerned about their format and its possible divergence from the tons of data already out there in ECW format. The several licenses you can choose to fall under reflect this ambiguity, and each variant determines what you as a user can do with the software.
As a developer, your only real option at this point is to not bother. Measured against Fogel's criteria, ECW has hardly any of the attributes of a successful open source project:
- There is no common public source code repository.
- There is no development mailing list.
- There is no public bug tracking.
- There is no public history of development — why the software is the way it is.
I think a project like ECW (or even MrSID) has the potential to be a highly impactful open source project that can reach beyond the geospatial domain. Wavelet compression is becoming an important topic, and participation in a project that utilizes this has the potential to catalyze and seed many other development efforts. Hopefully, ERMapper will see that their worries about format divergence will actually be mitigated by a truly open development effort. Time will tell on this one…
GEOS has an interesting history. It started as a port of JTS to C++ by Refractions. It is the geometry engine underneath PostGIS, and it is used by many other open source projects in the 'C' camp of open source GIS to provide topology and geometric algebra operations. Licensing-wise, it meets the criteria of open source software, and it is licensed under the LGPL (there is no explicit LICENSE.txt in the source code, but each source file is described as "licensed under the LGPL").
Here are the attributes of Open Development that GEOS has:
- A source code repository
- A public bug tracker
- A development mailing list
GEOS meets most of the criteria listed in Fogel's book, but in my opinion it is missing a couple of key components needed to truly qualify as open development (disclaimer: I am a source code committer on the GEOS project). First, it really has no community-oriented governance model. The project leader is ostensibly a maintainer who is paid by Refractions to organize and push forward GEOS' development. Major developments must be vetted by Refractions (and are frequently funded through Refractions) before they can be undertaken. Releases are made to serve Refractions' business needs (PostGIS or client-funded improvements). These attributes make GEOS risky from the perspective of an individual, independent developer, because your investment in GEOS may be thwarted if it is not in line with the interests of the company that "owns" the project.
A quote from Fogel's money chapter illustrates this point more elegantly than I can:
However, funding also brings a perception of control. If not handled carefully, money can divide a project into in-group and out-group developers. If the unpaid volunteers get the feeling that design decisions or feature additions are simply available to the highest bidder, they'll head off to a project that seems more like a meritocracy and less like unpaid labor for someone else's benefit. They may never complain overtly on the mailing lists. Instead, there will simply be less and less noise from external sources, as the volunteers gradually stop trying to be taken seriously. The buzz of small-scale activity will continue, in the form of bug reports and occasional small fixes. But there won't be any large code contributions or outside participation in design discussions. People sense what's expected of them, and live up (or down) to those expectations.
The second, and in my opinion more important, divergence is that the head of GEOS is disconnected from its body. The definition of how the software is supposed to work (its design and architectural decisions) actually comes from JTS. Strict adherence to the "C++ port of a Java library" mindset, and GEOS' insistence on following JTS to the letter instead of in spirit, means that the technological advantages C++ could provide can't really be exploited. There's no feedback loop through which possible improvements made in GEOS make their way back to JTS. GEOS is the way it is because JTS is that way, and JTS defines what GEOS is. As a developer, you can make only small improvements and changes, and only as long as the parallelism between GEOS and JTS is not broken and the changes are in line with Refractions' interests. These constraints, in my opinion, cause internal forking efforts and retard the development of GEOS into what it aspires to be: a fast, correct, C++-based open source library for geometry and topological operations.
Frank Warmerdam's GDAL
GDAL meets most of Fogel's criteria for an open development project (disclosure: I am a committer and PSC member on the GDAL project). It has a source code repository with history going back almost to the first day that Frank checked code into the project. It has a bug tracker with thousands of bugs (most of which are labeled as "fixed"). It has a very active user community, a documentation website, and an IRC channel.
GDAL has historically followed the Linux model, where most changes go through Frank, and lieutenants are left in charge of certain areas of the software. From an open development and governance standpoint, GDAL is currently in transition. While we all believe that Frank has a time-stopping machine hidden up in the deep Canadian woods with him that allows him to be so prolific, there are limits to what one person can do, and GDAL is increasingly approaching them.
The move to OSGeo for GDAL has brought about the emergence of a Project Steering Committee, and Frank has relinquished outright dictatorial control of the project to it. The release process is slowly moving to a more community-oriented affair, and members of the GDAL development community are stepping forward to take care of maintenance of larger areas of the software. Sweeping technological changes and addition of features now go through a proposal and committee process to ensure that everyone can be made aware of such developments. Explicit communication (and the implicit history this creates) about these things ensures that developers on the project are roughly moving in the same direction.
In my opinion, MapServer meets Fogel's criteria for an open development project (disclosure: I am also a committer and PSC member on the MapServer project). Unlike the other three projects that I described, which are targeted almost exclusively toward other application developers, MapServer's audience includes both user-oriented web developers and developers of more standard GIS-type applications. This diversity manifests itself throughout the project, most notably in the governance structure and developer community, which I have already described in a previous post. It also contributes to MapServer's creeping featuritis, which is both its blessing and its curse.
Frank's MS RFC-1 has become one of the prominent models for project governance in OSGeo, and many of the member projects have copied or modified it to fit their culture and needs. MapServer has a ring (or two) of businesses that use it to their competitive advantage. Funding opportunities that arise from this ring feed back into the software in the form of new features, general maintenance, documentation, and mailing list support. Many of the businesses in the ring are represented in the PSC of MapServer. The project soldiers on, despite individual developers coming and going, and it still sees significant growth release over release as measured by software downloads, mailing list posts, and bug submissions.
I think that MapServer is a very functional, but slightly imperfect, model of open development. There are things described in Fogel's book that we aren't doing yet, or aren't so good at, but we're open to suggestions, and highly motivated individuals can have a lot of impact on the project. Technologically, the MapServer project is fairly conservative, and it's not inclined to put up with large refactorings. Its governance body is not elected by the community, in an effort to sidestep the sticky issue of determining who the community is and whether they have a right to vote on such things. Its diverse audience and diverse developer base pull the project in many directions at once, and its focus, as defined by "MapServer is not a GIS!", leaves an awful lot of room to define what it actually is.
How OSGeo can play a role in ensuring Open Development
Hop on the bus, Gus
An important part of a project's migration to OSGeo and its incubation is the codification of its governance culture. One goal of this governance is to increase what Sean calls the "bus number," or the number of developers (or entities) a bus would need to run over to kill the project. I think it is important to make a fine distinction between the bus number of the project and the bus number of the software, because they might not always be the same. OSGeo should aspire to ensure that the bus number of its member projects is much greater than one entity or person. In fact, I think this should be a requirement that must be met for incubation: if a project can't satisfy it, it should be clear that there isn't enough community support to keep the project viable. The problem, of course, is that actually measuring the bus number of a project is a rather messy endeavor.
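To make the measurement problem concrete, here is one crude way to put a number on it. This is only a sketch under a big assumption: that commit counts are a fair proxy for who holds the knowledge in a codebase, which they often are not. The function name `bus_number`, the 50% threshold, and the author names are all hypothetical choices made for illustration, not anything OSGeo or the projects above use. It also speaks only to the bus number of the software, not of the project as a whole.

```python
from collections import Counter


def bus_number(commit_authors, threshold=0.5):
    """Return the smallest number of authors whose commits together
    exceed `threshold` (a fraction) of all commits.

    `commit_authors` is an iterable with one author name per commit.
    Assumption: commit share approximates knowledge of the code.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0  # no history, nothing to measure

    covered = 0
    # Walk the authors from most to least prolific.
    for rank, (_author, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered / total > threshold:
            return rank
    # Threshold was never exceeded (e.g. threshold=1.0): everyone counts.
    return len(counts)


# One dominant committer: losing a single person loses most of the history.
dominated = ["alice"] * 8 + ["bob", "carol"]
# Evenly shared work: it takes three of four people to pass 50%.
shared = ["alice", "bob", "carol", "dave"]

print(bus_number(dominated))
print(bus_number(shared))
```

The author list could come from something like the output of `git log --format=%an`, one name per line. Even then, the result ignores documentation, release management, mailing list support, and funding, which is a large part of why the real number is so messy to pin down.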
Mo' money, mo' problems
Money coming into a project, a topic to which Fogel devotes an entire chapter, can create significant tension within that project. In my opinion, an overriding incentive for the initial member projects to form OSGeo was money: direct support for the projects through some kind of collective pass-through funding, money for visibility and marketing and the leverage that an organization like OSGeo can provide, and money saved by member projects sharing infrastructure that is common to all and commonly replicated. OSGeo must be clear, explicit, and careful about how money is distributed (if there is any to distribute). Perception is reality in this instance, and any shady stuff will probably be met with significant backlash.
Insight, foresight, more sight
Another role of OSGeo is to provide infrastructure to mediate disputes and to provide a neutral home for a project, one that is agnostic with respect to any specific entity. OSGeo could act as the grown-up in the case of intra-project disputes, as Apache has done in the past. It aspires to act as a facilitator, in the hope of congealing its member projects into a mass of non-overlapping, useful, and integrated technology. Finally, it can act as an educator, introducing the technology of its member projects and providing buoyancy in a way that a single project by itself could not.
Some final thoughts for those of you who've actually read this far. Open source projects, even in the GIS domain, vary widely with respect to their development practices and how open or closed they are. As an open source developer, volunteer, and user, I'm attracted to projects that are openly developed. I will not invest much effort in something where my stake, in effort, is not recognized and respected. Finally, a significant measure of OSGeo's success or failure, in my mind, is how good a job it does at fostering and ensuring open development of its member projects.