Monday, April 25, 2005
April 2005 Summary Reports
The April 2005 Summary Reports have been posted:
- Programming Languages
- Intended Audience
- Operating Systems
- Registration Dates
- Project Status
- Project Topic
- License Types
Old reports:
How to use this data
(Note: This message is updated periodically with new info.)
The FLOSSmole project provides data about:
(a) all projects on Sourceforge
(b) all developers on Sourceforge
(c) all projects on Sourceforge AND who is developing for them, their roles, whether they are an administrator, etc.
(d) all Sourceforge projects and their programming languages, operating systems, user interfaces, end user audience, registration dates, etc. (new: donations!)
(e) Edit, Oct-2005: much of the above, but for Freshmeat, also
(f) Edit, Jul-2006: also, Rubyforge
(g) Edit, Jul-2006: also, Objectweb
(h) Edit, Jan-2007: also, Free Software Foundation directory
(i) Edit, Feb-2007: also, SourceKibitzer donates data
We have done runs on Sourceforge starting in early 2004 and we have received donated Sourceforge data for December 2004 from Dawid Weiss in Poland.
We have also begun scraping Freshmeat, Rubyforge, and Objectweb, and we receive data from SourceKibitzer. Get the complete list of data sources here. (This is a list of each of our scrapes with its date and its "datasource" ID.) The abbreviations for the forges are RF (Rubyforge), SF (Sourceforge), FM (Freshmeat), OW (Objectweb), FSF (Free Software Foundation Directory), SK (SourceKibitzer).
We are now collecting information from Sourceforge every 60 days, and from Freshmeat/Rubyforge/ObjectWeb/FSF/SK every 30 days (monthly).
You can get all the raw data files from the project file release system.
In addition to the text files (database dumps), we have a basic query tool. Details and tips for using the query tool are available here.
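If you prefer to work with the text files directly, something like the following may be a starting point. It is only a rough sketch: it assumes a dump is tab-delimited with a header row and has a `datasource_id` column, and the column names `proj_unixname` and `date_registered` as well as the sample rows are invented for illustration. Check the actual files before relying on any of this.

```python
import csv
import io


def rows_for_datasource(handle, datasource_id, delimiter="\t"):
    """Return the rows of a delimited dump that belong to one datasource.

    Assumes (hypothetically) a header row containing a 'datasource_id' column.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    wanted = str(datasource_id)
    return [row for row in reader if row.get("datasource_id") == wanted]


# Made-up sample standing in for a real dump file; with a real file you would
# pass open("some_dump.txt", newline="") instead of the StringIO object.
sample = io.StringIO(
    "proj_unixname\tdatasource_id\tdate_registered\n"
    "alpha\t3\t2003-05-01\n"
    "beta\t4\t2004-11-12\n"
    "gamma\t4\t2005-01-30\n"
)

selected = rows_for_datasource(sample, 4)
print([row["proj_unixname"] for row in selected])
```

Filtering on the datasource ID keeps one scrape separate from the others, which matters if you load several dumps into the same place.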
Hope this helps, and please contact me at any time (mconklin AT elon DOT edu) to discuss the data or what is missing, what you'd like to do with it, etc.
Sunday, April 24, 2005
April 2005 Raw Data Released
I've released the raw data files for the April 2005 Sourceforge scrape.
This is good stuff! Summary reports coming soon.
- Get the Raw List of Projects (full list of SF projects, registration dates, etc)
- Get the Raw Project Data (includes operating systems, programming languages, etc)
- Get the Raw Developer Data (includes developer list and developer-projects list, with new administrative flag!)
Saturday, April 23, 2005
data donations
Thanks to all who have donated and used FLOSSmole data. Here is a short explanation of who has collected or donated data so far:
- Partially supported by NSF Grants 03-41475 and 04-14468. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
- Partial data donated by Dawid Weiss, Institute of Computing Science, Poznan University of Technology, from research funded by the European Commission via FP6 Co-ordinated Action Project 004337 in priority IST-2002-2.3.2.3 (CALIBRE), http://www.calibre.ie/
- Partial data donated by Megan Conklin, Elon University, Department of Computing Sciences.
- Partial data donated by Kevin Crowston and James Howison, Syracuse University.
- Partial data donated by Mark Kofman and Anton Litvinenko of SourceKibitzer.org
- [Your name here!]
| Datasource_ID | Donation Notes |
| --- | --- |
| 1 | Sourceforge full project data collected October 2004 by Megan Conklin. |
| 2 | Sourceforge full project data collected December 2004 by Dawid Weiss, donated and imported March 2005. |
| 3 | Sourceforge full project data collected January 2005 by Megan Conklin. |
| 4 | Sourceforge full project data collected April 2005 by Megan Conklin. |
| 5 | Sourceforge full project data collected July 2005 by Megan Conklin. |
| 6 | Sourceforge project data collected 2001-02-03 by Kevin Crowston, parsed and loaded by James Howison July 2005. |
| 7 | Sourceforge project data collected 2002-05-02 by Kevin Crowston, parsed and loaded by James Howison July 2005. |
| 8 | October 2005 SF run |
| 9 | Freshmeat, June 2005 |
| 10 | Freshmeat, June 2005 |
| 11 | Freshmeat, October 2005 |
| 12 | Freshmeat, November 2005 |
| 13 | Sourceforge, December 2005 |
| 14 | Freshmeat, December 2005 |
| 15 | Freshmeat, January 2006 |
| 16 | Sourceforge, February 2006 |
| 17 | Freshmeat, February 2006 |
| 18 | Freshmeat, March 2006 |
| 19 | Sourceforge, April 2006 |
| 20 | test Rubyforge run |
| 21 | Freshmeat, May 2006 |
| 22 | Sourceforge, June 2006 |
| 23 | Freshmeat, June 2006 |
| 24 | Rubyforge, July 2006 |
| 25 | Freshmeat, April 2006 |
| 26 | Freshmeat, July 2006 |
| 27 | ObjectWeb, August 2006 |
| 28 | Sourceforge, August 2006 |
| 29 | Freshmeat, August 2006 |
| 30 | Rubyforge, August 2006 |
| 31 | Rubyforge, September 2006 |
| 32 | Objectweb, September 2006 |
| 33 | Freshmeat, September 2006 |
| 34 | Sourceforge, October 2006 |
| 35 | Rubyforge, October 2006 |
| 36 | Objectweb, October 2006 |
| 37 | Freshmeat, October 2006 |
| 38 | Sourceforge, December 2006 |
| 39 | Rubyforge, December 2006 |
| 40 | Objectweb, December 2006 |
| 41 | Freshmeat, December 2006 |
| 42 | Freshmeat, January 2007 |
| 43 | Rubyforge, January 2007 |
| 44 | Objectweb, January 2007 |
| 45 | Free Software Foundation, January 2007 |
| 46 | Sourceforge, February 2007 |
| 47 | Freshmeat, February 2007 |
| 48 | Rubyforge, February 2007 |
| 49 | Objectweb, February 2007 |
| 50 | Free Software Foundation, February 2007 |
| 51 | SourceKibitzer, February 2007 |
| 52 | Freshmeat, March 2007 |
| 53 | Rubyforge, March 2007 |
| 54 | ObjectWeb, March 2007 |
| 55 | Free Software Foundation, March 2007 |
| 56 | SourceKibitzer, March 2007 |
| 57 | SourceForge, April 2007 |
| 58 | Freshmeat, April 2007 |
| 59 | Rubyforge, April 2007 |
| 60 | ObjectWeb, April 2007 |
| 61 | Free Software Foundation, April 2007 |
| 62 | SourceKibitzer, April 2007 |
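
If you want to pick out the datasource IDs for a single forge, the table above can be grouped with a few lines of Python. This is a rough sketch, not part of the FLOSSmole tools: the keyword matching on the notes column is my own assumption, and the function names are hypothetical. The sample rows are copied from the table.

```python
from collections import defaultdict

# Forge abbreviations as used on this site; the keywords used to recognize
# each forge in the free-text notes are an assumption made for this sketch.
FORGE_KEYWORDS = {
    "SF": ["sourceforge", "sf run"],
    "FM": ["freshmeat"],
    "RF": ["rubyforge"],
    "OW": ["objectweb"],
    "FSF": ["free software foundation"],
    "SK": ["sourcekibitzer"],
}


def parse_rows(table_text):
    """Yield (datasource_id, notes) pairs, skipping header or separator rows."""
    for line in table_text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 2 or not cells[0].isdigit():
            continue
        yield int(cells[0]), cells[1]


def forge_of(notes):
    """Return the forge abbreviation matched in the notes, or None."""
    lowered = notes.lower()
    for abbrev, keywords in FORGE_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return abbrev
    return None


def group_by_forge(table_text):
    """Map each forge abbreviation to the datasource IDs found for it."""
    groups = defaultdict(list)
    for ds_id, notes in parse_rows(table_text):
        groups[forge_of(notes)].append(ds_id)
    return dict(groups)


SAMPLE = """
| Datasource_ID | Donation Notes |
| 8 | October 2005 SF run |
| 9 | Freshmeat, June 2005 |
| 20 | test Rubyforge run |
| 27 | ObjectWeb, August 2006 |
| 45 | Free Software Foundation, January 2007 |
| 51 | SourceKibitzer, February 2007 |
| 57 | SourceForge, April 2007 |
"""

print(group_by_forge(SAMPLE))
```

Rows whose notes match none of the keywords end up under the key `None`, so they are easy to spot and fix by hand.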
Saturday, April 09, 2005
Sourceforge Bug Tracker data and analysis scripts
Just wanted to put in a pointer to the data and scripts that we used for our recent First Monday paper, The social structure of Free and Open Source software development. This data is part of OSSmole, and Megan and I are currently working on merging our databases. But it is available now on the Syracuse FLOSS research site if people want to jump in.