Life at Eclipse

Musings on the Eclipse Foundation, the community and the ecosystem

Archive for June 2008

The best laid plans…

Some of you might have noticed that when Ganymede was first launched yesterday, the eclipse.org website was reeeaaallllyyyy slow for a little while.

By way of explanation, I offer the following:
Nathan Releases Ganymede

But more seriously, many thanks to Denis, Matt, Nathan and Karl for all of their work in getting our IT infrastructure ready for the deluge.

Written by Mike Milinkovich

June 26, 2008 at 9:34 am

Posted in Uncategorized

I Have Ganymede. Do You?

As a Friend of Eclipse, I was able to download Ganymede this afternoon. Not only did I get it sooner, I got it faster. In fact, thanks to the dedicated bandwidth it was crazy fast to download.

Thanks go to Denis Roy and the webmaster team. It looks like the Ganymede distribution setup is well architected and running smoothly.

So friends get benefits!

Written by Mike Milinkovich

June 24, 2008 at 5:15 pm

Posted in Uncategorized

A Simple Thank You

Ganymede is just around the corner, and I wanted to take a moment to say thank you to the many Eclipse committers, project leaders and contributors who have all participated in its success. You have done our community proud.

I was tempted to list some of those who have made particularly significant contributions, but I am certain I would miss too many deserving people. Please feel free to provide their names in comments. From releng to packaging to developers to CQ reviewers to bug triagers to movie poster competition organizers to Ganymatic kickers, there are many individuals who have chipped in to help make Ganymede a reality.

One interesting note: as most readers are well aware, I’m pretty much useless when it comes to shipping these releases. My personal contribution is primarily to support the press and analyst activities around the release. As this was the third release train, Ian and I were pretty much convinced that the press interest in Ganymede would be lower than in past years. You know: same old same old. We were dead wrong. There is a ton of interest in Ganymede, and in how the Eclipse community continues to predictably deliver value and innovation year after year.

So: congratulations and thank you!

Written by Mike Milinkovich

June 23, 2008 at 7:08 pm

Posted in Uncategorized

Using Usage Data

Continuing from yesterday’s post on the Usage Data Collector, I thought we should also describe how we plan to use the UDC data. In the official, lawyer-approved Terms of Use, we have stipulated the following:

  • The Eclipse Foundation may, in its sole discretion, make available to organizations and individuals, on a case-by-case basis, the data that it collects through the Usage Data Collector, whether in raw or aggregated form.
  • The Eclipse Foundation will publish summary reports based on the data obtained. These reports will be made available in machine readable format that will allow individuals and organizations to undertake further analysis.
  • Potential uses of the summary reports may include, but are not limited to: 1) Eclipse project committers who want to better understand how individuals are using their projects, 2) usage of Eclipse projects and third-party Eclipse plug-ins, and 3) an estimate of the number of individuals using Eclipse. It is expected that the summary reports and raw data may be used for other purposes that we have not envisioned at this point in time.

However, I think it would be useful to explain more about how the Foundation plans to use the data. First, I think it is important to reinforce what data we are collecting and how we are collecting it.

  1. UDC will only be included in the EPP packages. If someone does not want to download UDC, the ‘classic’ Eclipse SDK will be ‘udc-free’.
  2. UDC is opt-in, so each user must agree to send the data, along with any optionally selected general demographic information.
  3. UDC provides the ability to filter the data, so you can send only information about org.eclipse bundles or specifically not send information about bundles that have ‘xxx.yyyyy’. This is important if you have sensitive plug-ins that you don’t want to share data about.
  4. All the data that is collected is anonymous. For each participant, a unique ID is created by the UDC that allows us to aggregate data for that participant. The unique ID does not allow us to identify the participant in any way. We are not collecting any IP addresses and cannot aggregate the data by organizations or companies. To be really specific: we cannot trace the IP address from the upload to the keys contained in the uploaded files. The data is completely anonymous from the source.
  5. We do plan to capture the country location so we can report the data by geography.
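
As a rough illustration of the filtering described in point 3, here is a minimal sketch in Python. It is not the UDC implementation: the function name `should_upload`, the event dictionaries, the glob-style patterns and the `com.example.secret` bundle name are all assumptions made up for this example.

```python
from fnmatch import fnmatchcase

def should_upload(bundle_id, include=("org.eclipse.*",), exclude=()):
    """Decide whether usage events for a bundle may be sent.

    Hypothetical helper: exclude patterns win over include patterns,
    and a bundle that matches no include pattern is never sent.
    """
    if any(fnmatchcase(bundle_id, pattern) for pattern in exclude):
        return False
    return any(fnmatchcase(bundle_id, pattern) for pattern in include)

# Made-up usage events; the field names are assumptions for illustration.
events = [
    {"bundle": "org.eclipse.jdt.ui", "what": "activated"},
    {"bundle": "com.example.secret.plugin", "what": "activated"},
    {"bundle": "org.eclipse.core.runtime", "what": "started"},
]

# Only events from bundles that pass the filter would be uploaded.
to_upload = [event for event in events if should_upload(event["bundle"])]
```

With the default patterns only the two org.eclipse events remain. To share everything except a sensitive plug-in, one could instead pass `include=("*",)` and `exclude=("com.example.secret.*",)`.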

So what do we plan on doing with the data? The first priority is to provide a service for the committers and projects. We intend to create and publish a series of reports that will only include information about bundles that include ‘org.eclipse’. Information about bundles from other organizations will not be included in these reports. We intend to make these reports publicly available. The committer community will not have access to the raw data.
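
To make the idea of an org.eclipse-only, machine-readable report a little more concrete, here is a small sketch in Python. It is an assumption-laden illustration, not the Foundation's reporting code: the record layout (anonymous ID, bundle ID), the `summarize` function, the prefix test and the JSON output format are all hypothetical.

```python
import json
from collections import defaultdict

def summarize(records):
    """Count distinct anonymous users per org.eclipse bundle.

    `records` is an iterable of (anonymous_id, bundle_id) pairs; this
    layout is assumed for the example. Bundles from other
    organizations are dropped before anything is counted.
    """
    users_by_bundle = defaultdict(set)
    for anonymous_id, bundle_id in records:
        if bundle_id.startswith("org.eclipse."):
            users_by_bundle[bundle_id].add(anonymous_id)
    return {bundle: len(users) for bundle, users in sorted(users_by_bundle.items())}

# Made-up sample data: two anonymous participants, one third-party bundle.
records = [
    ("a1", "org.eclipse.jdt.ui"),
    ("b2", "org.eclipse.jdt.ui"),
    ("a1", "org.eclipse.jdt.ui"),
    ("a1", "com.example.tool"),
    ("b2", "org.eclipse.cdt.core"),
]

report = summarize(records)
report_json = json.dumps(report, indent=2)  # machine-readable summary
```

The summary contains only counts per bundle, so neither the raw events nor the anonymous IDs appear in the published report.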

We have already been approached by a number of academics in universities who are interested in analyzing the data as part of their academic research. In principle we would like to support and encourage this type of research. One valuable result of UDC could be a better understanding of how people use IDEs and develop code. At this time, we don’t have a process to make the raw data available to academics, but if we do, the raw data would be made available under a confidentiality agreement that enforces the Eclipse Foundation privacy policy.

In the future we do think that organizations, in particular the Eclipse membership, will be interested in accessing the UDC data. We think they will be interested in understanding how their products are being used and how they compare with other products in the industry. If we provide this information, it would be in the form of machine readable reports, not access to the raw data. The reports would be scrubbed to ensure only information about the relevant products is included.

Finally, I’d like to address the question of ‘selling the data’. We will not sell the raw data to anyone. Period. However, we may sell reports of the data to organizations. At this time, we have no idea if there is any commercial value for any reports. We do hope there is some commercial value and we can develop either a future revenue stream for the Eclipse Foundation, or use the UDC reports to generate increased value for memberships at Eclipse.

As a reminder, the Eclipse Foundation is a not-for-profit entity that is funded by membership dues. If we want to provide additional services to the Eclipse community and its projects, we need funds. If we are able to create either a new revenue stream or enhanced membership revenue in a way that respects the privacy and integrity of the Eclipse community, it could be a good thing for the entire community.

I hope this provides a bit more insight into our current thinking. A lot will depend on how many people participate in UDC and the type of information we can report from the data. I am optimistic that this will be a great service for the Eclipse community.

Written by Mike Milinkovich

June 6, 2008 at 6:00 pm

Posted in Uncategorized

Collecting Usage Data

One of our challenges at the Eclipse Foundation is understanding how people are using Eclipse and which parts of it they use. Millions of people come to our web site to download the various projects, find different Eclipse based plug-ins (open source and commercial) and use them to create amazing software. If we can gain insight into how people use the different pieces of the Eclipse ecosystem we should be able to improve the overall user experience.

This whole initiative got started after I went to a talk at last year’s OSCON by Joe “Zonker” Brockmeier. At the end of his talk, he made the point that open source projects have a particular challenge in getting to know their users: we don’t ask people to register, and we don’t have even the most basic information we need to help improve our software. We lack the stats to make good decisions. His suggestion: ask your users to provide useful data. So that’s what we’re planning on doing. This is about helping our projects and our ecosystem to make Eclipse better.

For this reason, I am very interested in seeing the response we get to the Usage Data Collector (UDC) that we are planning to include in the Ganymede EPP packages. For those who might not have seen Wayne’s previous posts on this subject, UDC is a piece of technology that will track what people are using in Eclipse and how they use it. UDC has been included in the EPP Ganymede milestone packages and over 1500 individuals have participated during the past four months. We have created some initial reports and I hope that in the future we will be able to provide some interesting information for our committers and the wider Eclipse community.

As you can imagine with any data collection technology, privacy is a huge concern. Therefore, to be clear, UDC is 1) opt-in, so only people that agree to send the data will participate, and 2) completely anonymous. No personal data, including IP addresses, is being collected. In addition, the Eclipse Classic package will not contain any UDC code at all, so there is a simple option for users who really want to avoid this. For those who are interested you can review the code in CVS.

So far it seems that our approach to UDC has been well received by the community. No one has expressed any concerns to date, and 1500 opt-ins have more than met our expectations during the development phase.

Coincidentally, in the last month the Mozilla community has begun talking about a somewhat similar data collection program. In Mozilla, some strong opinions have been expressed about collecting data at all. Therefore, I want to make sure everyone in our community has an opportunity to respond to this program before we make the final decision to deploy it.

We are very excited about the potential of UDC but we also want to ensure we respond to any community concerns. Please feel free to contact me (mike at eclipse dot org) or, better yet, leave a comment with your thoughts on UDC.

Written by Mike Milinkovich

June 5, 2008 at 1:29 pm

Posted in Uncategorized