Mathias Hasselmann

No Privacy for FOSS Developers?

Benjamin, thank you for ruining this year's Christmas days for me. I know you didn't do this intentionally, but I have had really bad days since I became aware of the privacy strip each Free Software developer performs due to ill-considered "services" like ohloh.net.

I don't want anyone to do statistics about FOSS contributions. Those stats make me feel naked. What follows is a longish explanation of my feelings regarding such "services"; those not interested should just skip to my proposal.

Also I do not like the stats at ohloh.net, because they draw a very incomplete picture of my FOSS contributions when you search for my real name. The only way to change this is to register at ohloh.net and associate all my shell accounts. I don't want to register at any random web community just to force them to publish accurate information about me. I don't want this, as it costs me time, and I don't want this, as it restricts my freedom: I would have to accept their terms of use.

What makes people believe they are allowed to do such stats? Simply the fact that this information is available without much effort? Put drastically, it's also not much effort to break into houses, or to commit homicide - does the little effort make these actions legal?

A tough argument to counter my fears is "but Google has been doing similar things for years, and nobody complains". It is really hard to counter, and I haven't really found good words yet - so just some sketchy keywords: Websites are public by purpose, and their Google Groups archive is controversial. Actually - when checking for its name - I couldn't even find it from their front page anymore.

I need to resolve this privacy issue for myself, as it has dominated my thoughts for days now. So dramatically that I came up with insane ideas like revoking every FOSS contribution I ever made, or starting a service like ohloh.net that also publishes inaccurate information but charges monthly fees for correcting it. Yes, those ideas are close to insanity, but maybe they help to understand what harm services like ohloh.net do to me.

I started contributing to Free Software projects because I like to improve my tools and because I like working on exciting code. I never contributed in order to become a person of public interest - as services in the style of ohloh.net obviously assume.

Well, enough of whining, let's figure out how to address the problem.

One possibility I see for not getting listed on sites like ohloh.net would be using your copyrights to add a message like this to all of your ChangeLog entries and commit messages:

The information associated with this change shall only be used for
improving and distributing this software. Any other use, like doing
statistics for instance, is not permitted.

I don't know if this works, as I am no lawyer. I also agree that it would be quite ridiculous to have this message associated with each commit. Instead I suggest formalizing our AUTHORS files in the spirit of our MAINTAINERS files:

 Mathias Hasselmann
 E-Mail: mathias hasselmann gmx de
 Userid: hasselmm
 License: code-only

License would contain a list of permissions:
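As a rough illustration of how a statistics service could honor such a formalized AUTHORS entry, here is a minimal Python sketch. It assumes entries are separated by blank lines, that the first line of an entry is the name, and that the remaining lines are "Key: value" pairs. The permission token "stats" is purely hypothetical; only "code-only" appears in the example above. Following the opt-in idea, a missing License field is treated as "no permission".

```python
SAMPLE = """\
Mathias Hasselmann
E-Mail: mathias hasselmann gmx de
Userid: hasselmm
License: code-only
"""


def parse_authors(text):
    """Parse blank-line-separated AUTHORS entries into dictionaries."""
    entries = []
    current = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            # A blank line closes the current entry, if any.
            if current:
                entries.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip().lower()] = value.strip()
        else:
            # A line without a colon is taken to be the author's name.
            current["name"] = line
    if current:
        entries.append(current)
    return entries


def may_collect_stats(entry):
    """Return True only if the entry explicitly opts in to statistics.

    Assumption: permissions are a comma-separated list, and the
    hypothetical token "stats" grants permission. Anything else,
    including a missing License field, means "do not collect".
    """
    permissions = {p.strip() for p in entry.get("license", "").split(",")}
    return "stats" in permissions


if __name__ == "__main__":
    for author in parse_authors(SAMPLE):
        print(author["name"], may_collect_stats(author))
```

With the sample entry above, the author is parsed with a License of "code-only" and is therefore excluded from statistics.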

I hope this post helps my upset brain to move its focus away a bit from this topic now - Christmas days are coming, damnit!

But I also hope for some constructive feedback, without getting flamed too much for my differing opinion regarding sites like ohloh.net.

Comments

bkor commented on December 22, 2007 at 11:11 a.m.

That site seems specifically made to track people. That will likely conflict with European privacy laws. Just request your information to be removed. This won't help with the next unknown service, but I don't think the next service will understand (for them) random text in an AUTHORS file.

Andre Klapper commented on December 22, 2007 at 11:48 a.m.

either they grep the ChangeLog (that's what i expect for ohloh.net, i have no other explanation why you yourself have to "collect" and mark your contributions from different projects if you set up an account there) or they check for the svn account name used (like e.g. http://cia.vc/stats does).

i disagree with your comparison to break into a house - our code is public, so is our source and our ChangeLog. :-/

Thomas Vander Stichele commented on December 22, 2007 at 11:57 a.m.

Mathias,

the reason homicide or breaking into a house is illegal is because the law says it is illegal.

AFAIK collecting commit stats from projects is not yet denied in the law, so it would only be illegal if you make a case before a judge and he agrees.

Erik Snoeijs commented on December 22, 2007 at 1:16 p.m.

I disagree, Thomas. The site is collecting and distributing information about a private person without the knowledge or consent of that person.

Yes the information is publicly available, but they change the context of the information enough in my eyes to let it become new information.

Which in some countries might very well be against the law.

Yes it doesn't rank up with homicide or breaking and entering, but it might still be illegal.

And also i can see how this information that you don't even know exists might at some point cost you a job, because your new boss thought you commented your code too little, without ever seeing the code itself.

A service like this should be ok, but it should really only be showing stats about people who sign up.

Thomas Vander Stichele commented on December 22, 2007 at 1:39 p.m.

Erik, I was just commenting on the fact that the law specifically states you can't kill someone or break into someone's house.

It could still be argued to be found illegal, but it would have to be done before a judge.

As for where I actually stand on this - my personal feeling is indeed that if the code is open, and your repository is open, I don't see ohloh doing anything wrong.

If it costs you a job because you comment too little, that's probably a) a good thing (comment some more) and b) not different from your employer googling your name, finding you have worked on project X, and noticing that your commits to project X don't have many comments.

But that was not the original point I was making. You may be right on ethical grounds, I was just pointing out where today the difference is with killing or breaking in.

ulrik sverdrup commented on December 22, 2007 at 2:06 p.m.

I agree with Mathias.

It's mostly the availability of data that we think is really deep and hidden, like at which time we committed certain changes or the ability to search across all tarballs we ever made (google code) that makes this all feel wrong.

It doesn't matter to me if it's illegal here and right now, I think this is the kind of trend we have to resist.. just like we don't want our societies to end up with big data mills where information of where you were or what you did is available in abundance.

Mikkel Kamstrup Erlandsen commented on December 22, 2007 at 2:08 p.m.

In Denmark there is no doubt that Ohloh is illegal. You are not allowed to collect personal information and publish it in any media or form without consent.

Another thing is that it is to my knowledge entirely illegal (even with consent) to cross-reference sources (e.g. two different source repos) and even store/analyze/publish that data!

I assume laws (maybe a bit weaker) govern the EU as a whole...

kiwi commented on December 22, 2007 at 2:26 p.m.

I have similar problems with my privacy. I work in a totally different field, and need to have my name clear from associations in the web.
That's why I have decided to use a false name when contributing to open source projects. Even my blog is under a false name.
I don't want to be a public person in the open source coding world.
I also change my nick quite often to give no-one the feeling of actually knowing who I am. That of course leads to the fact that no-one respects my opinions, like they would, if they knew all the things I've done (not to say, that I've done that much)... I'm always a newcomer to every community and get treated that way.

Matthew W. S. Bell commented on December 22, 2007 at 3:54 p.m.

Wow. That's so anti-open... I think you may be taking part in the wrong community.

Dominic Lachowicz commented on December 22, 2007 at 3:54 p.m.

It's not about the information being available "without much effort". It's about information being freely published by you, publicly, and others having access to that. If you don't want your data to be collected, don't publish it publicly and freely on the internet. It really is that simple.

Your "breaking and entering" example falls flat. It's more like your deciding to publish your house's contents on a billboard outside (or constructing your house entirely out of glass), and then complaining that passers-by can know what's inside. You have taken no reasonable steps to protect your privacy. In fact, you've done the polar opposite.

What one does in the public square isn't "private", and other members of the public have the right to talk publicly about what they saw others do there. FOSS development is by definition development in the public square. ESR called his paper "The Cathedral and the Bazaar" with good reason.

That you wrote some code on $date is a fact, and facts (or collections thereof) aren't protected by copyright in at least the US (Feist v. Rural).

You don't want to register @ sites like ohloh to correct any errors because those actions would "restrict your freedom". But asking them to not talk about what you've done in public restricts their freedom of speech. If they publish inaccurate information, it is not unreasonable to ask them to correct or retract any inaccuracies. It seems that they already have a simple policy in place for doing precisely that.

If you don't want people to aggregate CVS/SVN stats, ask the GNOME sysadmins to lock out or anonymize anoncvs/anonsvn (or take whatever other measures that are necessary so that ohloh can't get at the statistics). Or don't contribute. Or contribute under a pseudonym. No one is forcing your hand here.

You can't revoke your copyright on code already released under the L/GPL. The licenses give everyone the right to continue distributing your already-published code under those licenses.

Besides, I fail to see the harm. You're a superb engineer and I'd be proud of your accomplishments. If someone doesn't want to hire you because of something they saw on ohloh, they're an idiot.

Patrick Wagstrom commented on December 22, 2007 at 4:41 p.m.

Mathias,

I understand your concern for privacy, for the most part it's a completely valid concern. It's disturbing how much stuff that was once deep and private is becoming public. Although, I don't believe that Ohloh is as bad as the ill conceived irseek, which secretly logged IRC chatter, a normally unlogged medium.

Yet, I don't think you'll ever get around saying that you can't use the data for statistics. While I don't see a ton of value for ohloh, there is a body of individuals doing statistical work on artifacts (bugzilla, email, subversion) from GNOME. This comprises not only commercial entities, but academic researchers, and there has been some talk of the GNOME marketing team doing analysis to better understand how to market the product. It seems that under your definition providing a figure about the number of bugs closed, of commits made, or heck, running a blame program or diff would be considered some sort of statistics.

In my personal case, I've analyzed many years worth of data from GNOME to see what impact the influx of commercial participants has had in the community. This, in turn, helps the community identify behaviors that contribute to a strong community, and which firms may just be using the output of the community for private gain.

So, the question remains, do you have privacy? Sorta. Ohloh and other associated services are not aggregating information about your behavior online. The GNOME foundation protects information about individual developers -- I should know, I've asked them about attendees at GUADEC and BGS several times and been shot down. The information that is out there on you, was mostly revealed by you. It remains out there because there is value to it -- value for new users, value for developers, and yes, even value for people doing stats on the data trying to make the community a better place.

Murray Cumming commented on December 22, 2007 at 5:31 p.m.

I personally think that privacy is going to be non-existent in the near future, due to technology, just as restrictions on copying data are being made irrelevant. We might want it to be otherwise, but it can't be stopped. I think I can get used to it - all of us having all the data is better than having just some people having all the data. I do wonder if humans in general will be able to adapt.

Quentin Hartman commented on December 22, 2007 at 6:22 p.m.

I understand your desire to maintain privacy. I share that same desire. As a result, I am very conscientious to be very selective about what information about myself I make publicly available. Participating in an open, trust-based community implies giving up a level of privacy. I'm sorry that you are regretting the level of exposure you gave yourself, but you should have thought about that before you joined in. You can't get it back, and your thoughts on how to do that are at best a pipe dream. "Putting information on the Internet is like pissing into the ocean; once it's out there, there's no getting it back."

If this really concerns you, I'd suggest contributing under a pseudonym.

Patrick Faux commented on December 22, 2007 at 7:43 p.m.

The problem is hardly with ohloh; you can get a lot of aggregated information directly from viewcvs on gnome.org. You should create a fake identity and contribute with that.

nona commented on December 23, 2007 at 1:01 a.m.

I agree with Quentin, the problem is in a way similar to the RIAA or MPAA trying to stop unauthorized distribution. Once the data is out there, it's nearly impossible to contain it. You might get lucky and just be unpopular, but if there's any value in your data, chances are that people and companies all over the world (with different intentions, different morals, different legalities) will munge it and redistribute it.

The only way around it is to start using a number of pseudonyms, and avoid compromising your identity.

Anonymous commented on December 23, 2007 at 9:03 a.m.

The license you propose, or any other license that has the same effect, would make any material licensed under it not Free Software or Open Source Software. It would thus prevent packages of your software from including changelogs, AUTHORS or MAINTAINERS files, patches, or any other information derived from the data you license under it. Unless phrased very carefully, it might even make the entire software package non-free.

I don't see any reasonable way you can prevent collection of these statistics while still contributing to FOSS projects.

More to the point, I agree with several previous commenters who pointed out that you gave up both the right and the ability to keep this information private the moment you published it on the web under a license that lets people use, copy, modify, and redistribute it. I can understand your concern that you did not foresee this use of your contributions. However, the same applies to many other uses of FOSS. Some people find it disturbing if their work gets used in a military application, or by a government or business they don't like, or in other ways they did not expect. Contributing to a FOSS project means that your code will get used in ways you don't foresee; that essentially *defines* FOSS.

(Regarding some comments that suggested EU laws potentially affecting such statistics collection, I would hope that such laws do not apply to information *published publicly with explicit permission to modify and redistribute*. If they do, that seems like a bug in such laws, and an unreasonable restriction on people.)

As you said in your post, "Websites are public by purpose"; exactly the same applies to FOSS contributions. You made those contributions public intentionally, and put them under a license that lets people use, copy, modify, and distribute it.

That said, if other people feel as you do, Ohloh might do well to provide some means of opting out, ideally without registering for an account. (I see no good reason to require people to opt *in*; as I said, you published the information on the web.) Out of curiosity, would you consider it acceptable if you still appeared in the list of contributors for a project but not in any other statistics? (Otherwise, editing that contributor further would prove difficult.)

Yevgen Muntyan commented on December 26, 2007 at 4:15 a.m.

Nice post Mathias. Sure thing, they have all rights to do what they do, but perhaps some who care will see that "opt in, not opt out" isn't only for spam. Or perhaps not, after all we all love our brilliant ideas and can't believe they are not as great...

Scott Collison commented on January 3, 2008 at 10 p.m.

When I began thinking about Ohloh a couple of years ago, I realized that one of the fundamentally appealing things about open source is the transparency of open source. I felt it was one of the things that made open source software better software. This was in stark contrast to my brief experience at MSFT, where I learned that lack of transparency often leads to mediocre software. I felt that by enhancing open source transparency, we would contribute to the open source movement in our own modest way. And that's how Ohloh began.

So I was pretty surprised when I came across this post about sites like Ohloh impinging on the privacy of FOSS contributors, specifically GNOME contributors.

It's a fascinating debate that strikes at the core of what we mean by privacy and what the implications are when we release information about ourselves and our activities into the largest public forum in history, the Internet. My position in the debate can be summed up in the following points:
1. This is publicly available information that people publish freely on the Internet. This means the information is not private under any legal system that I've examined.
2. If you don't want to reveal your activities on the Internet then don't attach specific personal information to your contributions like your name.
3. Ohloh and every other participant in the Internet community has the privilege of free speech insomuch as we should be able to comment upon or analyze publicly available information.

So, I really don't believe that we violate the privacy of open source contributors or Ohloh users.

Yevgen Muntyan commented on January 3, 2008 at 10:40 p.m.

Scott, your points are correct, I believe. But you got to understand that people you publish information about may not like it, be it because of privacy concerns, or because of the way you do it (e.g. there are two or three me on ohloh.net, but I am lucky, I use my last name as a nick, not some "tbf"; or perhaps I do not want to be "a guy with kudo rank of 6", I want to think I am better than Linus and ohloh.net shows me I am not).

So, you are not violating my privacy. But you are ignoring my wish not to be on ohloh.net, so I get an unwanted "reward" for my contributions to "open source movement".

For the record, I asked to remove my information from ohloh.net, and got no reply.

Yevgen Muntyan commented on January 3, 2008 at 10:47 p.m.

I did receive a reply:

"""
We are very serious about privacy issues. However, in this case we are merely reporting about the existence and development activities of the contributor, "muntyan," which is listed in many public sources like viewcvs on gnome.org.

I recommend that you contact the owners of the projects you have worked on and have them change your alias. If the alias, "muntyan," is changed on these projects, then it will be changed on our site.

We also are serious about the accuracy of our metrics. Is there something on the site associated with "muntyan" that is specifically inaccurate? If so we would like to know so that we can take steps to correct the error.
"""

And this is what I wrote:

"""
Could you please remove my information from
ohloh.net? All three matches for "muntyan" in
the people search are mine.
"""

Mathias Hasselmann commented on January 3, 2008 at 11:10 p.m.

Muntyan: Ohloh's response definitely sucks. But I guess they did some legal research before choosing this reply. Berlin's district court did a ruling recently that supports Ohloh's position[1]. Actually their words seem to follow the justification given in that ruling. :-/

Call me inconsistent or lazy. I've decided to polish my Ohloh profile. _Forcing_ Ohloh to remove me would cost a lot of effort and money, and then there still are other sites, like cia.vc, or shaunm's pulse[2]. Who knows what other stuff exists. Who knows what other stuff will come.

The internet is such a privacy hazard. Well, this crap won't go away, as much as we wish. I wish I had the foresight to contribute using a pseudonym, now it's too late.

Actually I am quite serious about forcing new contributors to have a Facebook or whatever (crap) account, to educate them about the implications of contributing to Free Software.

[1] http://translate.google.com/translate...
[2] http://www.gnome.org/~shaunm/pulse/gn...

Scott Collison commented on January 4, 2008 at 12:04 a.m.

Yevgen, Thanks for keeping the dialog at a constructive level. From what I gather you don't like to be compared to other FOSS developers. This brings up an interesting point. Should Ohloh supply a KudoRank only for users that have registered on Ohloh? I continue this discussion on the Ohloh Blog, since I feel our users should also engage in this dialog.

http://www.ohloh.net/blog/privacyandk...

Mathias, To be more precise I did some legal research *before we started the company*, specifically because I did not want to impinge on peoples' privacy -- not because I wanted to craft a cunning legal response to Yevgen.

Yevgen Muntyan commented on January 4, 2008 at 1:17 a.m.

Scott, it's not quite like that. It's okay if someone compares me to other FOSS developers: like if some person takes his time to think about me and other developers, and do some comparison, and make some conclusions. Then I won't mind if he says that Linus did more for free software than I did. But I do hate all sorts of automatic ranks. Call me a fool, but I don't like when a script assigns some number to my name. Unless that's a highest number of course, since I like to be the best.

It's similar to the privacy thing: if someone mentions me in his writing on internet, it's okay (if he says bad things, I'll be mad about those things, not about the act of posting); but automatic gathering and publishing information about me without my consent is no good. You are saying that you are just publishing information about 'the contributor, "muntyan"', but I don't care - it's me that "muntyan" you see, a real person Yevgen Muntyan (and one of me on ohloh.net is listed like that - "Yevgen Muntyan"). Good or bad, but that's me and I care about me.

And so on. Mathias, sorry for using your blog again, but ohloh.net requires me to log in to leave a comment. I promise I will stop this flood!

Mathias Hasselmann commented on January 4, 2008 at 10:42 a.m.

Scott: I guess we agree, not everything that's technically possible should be allowed. In ancient times crossings of this border had to be justified case by case. At some point of history people got bored and invented law. So law is a good thing, but by its nature it's always incomplete. Therefore I have the opinion that not everything that's permitted by law is legal/correct/good/right.

Ideally I'd like to live in a world where people say: "Oh, that action hurts you? Ok, it is not essential for mankind, I'll stop it". Even better it'd be if people would act that thoughtfully in advance. Well, and I still fail to see what essential service Ohloh provides mankind by publishing contribution stats (KudoRank, number of commits)[1]. All usefulness I see in Ohloh is of a private kind: I see the benefit for the contributor who wants to compare his achievements. I see the benefit for human resources directors, or head-hunters, who want to evaluate the skills of a candidate.

I see the benefit of hiding stats for the individual contributor, who wants to keep this little piece of privacy. I fail to see the harm of hiding stats.

Yevgen: No reason to apologize, no need to stop. This blog allows comments - without registration - on purpose. Reminds me I should provide registration nevertheless, to save my friends from the Captcha pain.

[1] Btw, hiding contribution stats doesn't mean, you cannot use them for calculating KudoRank.

Bluebird commented on January 4, 2008 at 1:32 p.m.

With a stretch, I can understand that you don't want all the things that you did publicly to show up
on ohloh. But I wonder where you draw the line between "can show up, it's ok" and "can not show up,
it violates my privacy".

Is it ok for people to know that your posts on a public mailing list come from you ? That your
contributions to whatever free software are from you ? Do you want the knowledge of what you did to
be restricted to a limited circle, like only the inner community of the project you contributed to ?
What about tools like websvn, is that a privacy violation from your point of view ? What about
commit digests like the one for KDE, which reports the top committers, top bug fixers and so on ( http://commit-digest.org/issues/2007-... )?

You also seem to make a distinction between "anybody can find what I do" and "somebody with strong
motivation can find what I do", the second being ok but not the first.

If your stats were not publicly accessible on ohloh but only accessible to ohloh registered users,
would that be ok ? Being an ohloh registered member would then be an expression of the desire of the
user both to be recognised as a contributor and to recognise other contributors.

One could also think of an opt-out option: "this person has chosen to keep his contributions
private". This has the huge drawback that you must register to choose this option. But deciding to hide
a public information is a voluntary and unnatural process.

I personally like ohloh a lot. It gives some hard numbers on open source. We can now know which
programming language is more popular, to get a better idea of what it means for a project to be big:
number of contributors, daily commits, ... I also find the kudo rank interesting, it reminds me of
advogato. Many studies showed that one of the primary motivation for contributing to open source
(besides fun and pleasure to learn) is peer recognition. Ohloh gives a very concrete peer
recognition scheme. For me, ohloh is great to get a better understanding of many social aspects of
open source.

Mathias Hasselmann commented on January 4, 2008 at 2:16 p.m.

Bluebird: This lack of control is exactly the reason, why I finally decided to register at Ohloh. I hide those stats - as much as I wish - so let's ensure at least, that they are halfway accurate.

Regarding drawing the line: That's exactly the problem with privacy. It's always a very personal decision whether publication of some information hurts privacy or not. Guess that's the reason why privacy rights erode that quickly: There's always some guy who crosses someone's privacy border and refuses to go back.

Weak privacy laws, lack of punishment, and the fact that judges, too, are individuals with their own personal judgement of privacy make privacy very hard to protect.

Seeing all these problems, wise lawgivers ruled that privacy concerns must be handled per opt-in in Europe. Unfortunately US lawgivers/US society don't have this wisdom, which also leads to permanent degradation of privacy rights. I am not the one to stop this.

Regarding "deciding to hide a public information is a voluntary and unnatural process": See comments on opt-in above. Ohloh would print the online handle - it's public information that is in the commit logs anyway, but there would be no link pointing to some user profile/stats and there would be no icon showing any KudoRank. Well, and if there would be a link, the associated page would print "Foobar has not joined the Ohloh community yet. Click here to invite." Quite natural, ain't it?

Mathias Hasselmann commented on January 4, 2008 at 2:17 p.m.

Bah, "I hide those stats" should read: "I cannot hide those stats".

Lynoure Braakman commented on January 23, 2008 at 1:50 p.m.

I have not checked Ohloh yet, but I wonder how useful the stats are when it comes to measuring productivity. I'd expect a person's way of working to affect the number of commits they make way more than their usefulness as a coder. Some work iteratively over time, some do things in one go.

Anonymous commented on June 26, 2008 at 5:57 p.m.

You're a fkucing nutjob.

Wait, I'm making a point here. Notice the value I typed into the "Your Name" field? I put in "Anonymous". It's because I don't want my name associated with this comment.

Maybe you should do the same for your own public actions, if you don't want your name associated with them.

Lynoure Braakman commented on June 27, 2008 at 8:13 a.m.

The problem with being "Anonymous" is that people calling others "fkucing nutjobs" are giving it a bad name.