Research Project Overview

Understanding Open Source Software Development

Vince Freeh, Greg Madey and Renee Tynan
University of Notre Dame

This research project seeks to understand the free/open source software (F/OSS) phenomenon and to predict the pattern of growth exhibited by F/OSS projects over time. The F/OSS community is a genuine behavioral and technical puzzle, one with significant, far-reaching impact on the world's economy. It has developed a substantial amount of the infrastructure of the Internet and has several outstanding technical achievements, including the most popular web server (Apache), the most popular scripting language (Perl), and an operating system that successfully competes with Windows (Linux). These programs were written, developed, and debugged largely by part-time contributors who in most cases were not paid for their work, and without the benefit of traditional project management techniques.

We are developing a conceptual model to explain the motivations and key work processes underlying this extraordinary phenomenon.  Our preliminary analyses indicate that the F/OSS community can be usefully modeled as a social network, one that has the characteristics of a self-organizing, emergent system. Drawing on the social psychological theories of motivation, self-managing teams, and communication, we are creating a model of the social and task characteristics that predict the emergent properties of the system.

The F/OSS development community is a global virtual community. This gives us an advantage: its digital interactions are archived and can be mined with web- and data-mining techniques. Data are being collected at the community, project, and developer levels, both to characterize the F/OSS phenomenon as a whole across many projects and to investigate the behaviors and mechanisms at work within individual projects and among individual developers. The F/OSS phenomenon is being modeled using two complementary approaches: agent-based simulation using Java and Swarm, and social-network modeling. Preliminary data suggest that frameworks based on self-organizing systems, emergence, swarm intelligence and biocomplexity can accurately model the F/OSS phenomenon, and that a power-law relationship is present in the developer and project data.
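To make the power-law observation concrete, here is a minimal illustrative sketch. It is not the project's model (which uses Java and Swarm); it is a toy Python simulation under assumed rules. Each arriving developer either starts a new project or joins an existing one with probability proportional to its current size, a simple "rich get richer" mechanism that is known to produce heavy-tailed size distributions. The function names, the parameter `p_new_project`, and its default value are hypothetical choices made for illustration. The exponent estimate uses a common approximate maximum-likelihood formula for discrete data, which is known to be rough when `x_min` is small.

```python
import math
import random
from collections import Counter


def simulate_project_sizes(n_developers, p_new_project=0.2, seed=0):
    """Toy growth model (an assumption, not the project's actual simulation).

    Each arriving developer starts a new project with probability
    p_new_project; otherwise the developer joins an existing project
    chosen with probability proportional to its current size.
    Returns a Counter mapping project id -> number of developers.
    """
    rng = random.Random(seed)
    memberships = [0]  # one entry per developer, holding that developer's project id
    n_projects = 1
    for _ in range(n_developers - 1):
        if rng.random() < p_new_project:
            memberships.append(n_projects)
            n_projects += 1
        else:
            # Picking a uniformly random existing membership selects a
            # project with probability proportional to its size.
            memberships.append(rng.choice(memberships))
    return Counter(memberships)


def estimate_exponent(sizes, x_min=1):
    """Approximate maximum-likelihood estimate of a power-law exponent.

    Uses alpha ~= 1 + n / sum(ln(x / (x_min - 0.5))) over all x >= x_min.
    This approximation for discrete data is rough for small x_min.
    """
    tail = [x for x in sizes if x >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    log_sum = sum(math.log(x / (x_min - 0.5)) for x in tail)
    return 1.0 + len(tail) / log_sum


if __name__ == "__main__":
    sizes = simulate_project_sizes(5000, seed=42)
    alpha = estimate_exponent(list(sizes.values()))
    print(f"projects: {len(sizes)}, largest: {max(sizes.values())}, "
          f"estimated exponent: {alpha:.2f}")
```

Running the script prints the number of projects, the size of the largest project, and the estimated exponent. With real archived data, the simulated sizes would be replaced by observed counts of developers per project; checking whether a power law actually fits would require a goodness-of-fit test, which this sketch does not include.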

The research being performed within this project has a much broader impact. It is important to understand the F/OSS developer network because virtual communities are increasingly common. Corporations are enabling virtual self-managing teams and could benefit from lessons learned in the F/OSS movement. Other research projects investigating F/OSS from the software engineering, legal, governmental policy, economic, and business strategy perspectives will benefit from an explanatory model of the F/OSS phenomenon.

Release of the SourceForge.net Research Data

To advance the understanding of, and research on, the Free/Open Source Software phenomenon, portions of the data that may support such research will be made available to academic or scholarly researchers. All requests for data must be submitted in writing (by e-mail) to the Notre Dame PI (Greg Madey). Only academic and scholarly researchers are eligible to receive the data. To receive the data, a short questionnaire and agreement must be completed, signed and returned. A wiki for users of the research data is available here. See the Research Data page for more information.

The material presented at this web site is based in part upon work supported by the National Science Foundation, CISE/IIS-Digital Society & Technology, under Grant No. 0222829. 

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

