Privacy of open source app developers

When developing in public git repositories (or contributing by reporting bugs), one leaves traces of one’s activity with exact timestamps:

  • git commit timestamps
  • (at least at GitLab) timestamps of pushing
  • time of creating a release/tag
  • times of responding to issues

This makes it possible to gather data about developers (and a bit about the users): when they are normally active, whether they were offline for a few weeks or on holiday, …

I built a helper tool (Jonas L. / private-commit · GitLab) to reduce the first problem. However, I don’t see a solution for the other ones. Does anyone have experience with this, or any ideas?

That’s exactly the information you want to have in a VCS. That’s why git exists.
Otherwise you shouldn’t use git, or you could force-push everything daily via a cron job and remove all but the latest tag, but that abuses the idea of git.
I’d worry much more about the data I give to my social network of choice, messenger, fitness tracker, and apps, and about what friends reveal about me with their phones.

My problem isn’t the timestamps themselves; it’s that they are more exact than required. No normal user or developer cares whether I made a commit in the morning or in the evening, but this data exists and allows some analysis. As an example, take https://gitlab.com/fdroid/fdroid-bootstrap-buildserver/graphs/master/charts. Most commits are from Michael Pöhn according to https://gitlab.com/fdroid/fdroid-bootstrap-buildserver/graphs/master. I can see that most work happens from Tuesday to Friday between 8:00 and 16:00. Is this something that anyone needs to know? I guess not.
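To illustrate how little effort such an analysis takes, here is a minimal Python sketch that counts commits per weekday and hour. It assumes the author dates are available as ISO 8601 strings, one per line (for example from `git log --format=%aI`); the function name `activity_histogram` is made up for this example and is not part of any tool mentioned in this thread.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def activity_histogram(iso_timestamps):
    """Count commits per (weekday, hour) from ISO 8601 author dates.

    Hypothetical helper: the hour is taken in the time zone recorded
    in each timestamp, i.e. the committer's local time.
    """
    counts = Counter()
    for line in iso_timestamps:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Older Python versions do not parse a trailing "Z" for UTC.
        if line.endswith("Z"):
            line = line[:-1] + "+00:00"
        ts = datetime.fromisoformat(line)
        counts[(WEEKDAYS[ts.weekday()], ts.hour)] += 1
    return counts
```

Piping the author dates of a public repository into this function and looking at `counts.most_common()` is enough to see typical working days and hours.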

Edit: I don’t use social networks, and I only use open source apps (so not much data is collected through other channels).

Thanks for the pointer to “private-commit”. Rounding to the start of last week sounds a bit drastic though; I most often need day-precise commit dates for my own work.
More generally, I’d like to nosify this data, but this isn’t easy since it has value for me and the other devs.

People have been parsing commit data for a while; I remember lecturing a researcher about this years ago when he was commenting on the sleeping habits of a developer :wink:

I added separate commands for different rounding levels. Rounding to the day or to the hour is now possible too.
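For readers who want to see the idea, here is a rough Python sketch of rounding a timestamp down to the hour, day, or week and passing it to git through the `GIT_AUTHOR_DATE` and `GIT_COMMITTER_DATE` environment variables. This is not the code of private-commit; the function names and the choice of Monday as the start of the week are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

def round_down(ts, unit):
    """Round a timestamp down to the start of its hour, day, or week.

    Assumption for this sketch: a week starts on Monday.
    """
    if unit == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    day_start = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "day":
        return day_start
    if unit == "week":
        return day_start - timedelta(days=ts.weekday())
    raise ValueError(f"unknown unit: {unit}")

def git_date_env(ts, unit):
    """Build environment variables that make git record the rounded time.

    ts must be timezone-aware so that %z yields an offset such as +0000.
    """
    stamp = round_down(ts, unit).strftime("%Y-%m-%d %H:%M:%S %z")
    return {"GIT_AUTHOR_DATE": stamp, "GIT_COMMITTER_DATE": stamp}
```

One possible use is to merge the returned dictionary into `os.environ` and hand it to `subprocess.run(["git", "commit", ...], env=...)`. Note that this only affects the timestamps stored in the commit, not the time at which the hosting service sees the push.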

nosify = randomize? That works technically, but to keep the timeline linear, one would need a fake start time per day and a fake time span between commits, both of which could be chosen randomly. This would look more realistic, but I don’t mind if someone can see that the graphs at GitLab are wrong. Moreover, this would need some configuration and could still be easily detectable (or one makes the fake time depend on the size of the diff, but even that is not always realistic, considering the time spent searching for and fixing a bug …).
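A minimal sketch of that randomized approach in Python, assuming a random start time between 8:00 and 12:00 and random gaps of 5 to 90 minutes between commits. All of these ranges, and the function name `fake_timeline`, are arbitrary choices for illustration; they are exactly the kind of configuration mentioned above.

```python
import random
from datetime import date, datetime, time, timedelta

def fake_timeline(day, n_commits, rng=None):
    """Generate strictly increasing fake commit times for one day.

    Hypothetical helper: picks a random start between 08:00 and 12:00,
    then adds a random gap of 5 to 90 minutes before each next commit.
    With many commits the times can run past midnight; this sketch
    does not handle that case.
    """
    rng = rng or random.Random()
    start_minutes = rng.randint(8 * 60, 12 * 60)
    current = datetime.combine(day, time(0, 0)) + timedelta(minutes=start_minutes)
    stamps = []
    for _ in range(n_commits):
        stamps.append(current)
        current += timedelta(minutes=rng.randint(5, 90))
    return stamps
```

The result stays ordered, so the history still looks linear, but the uniform gaps are easy to tell apart from real work patterns.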

I think you question the whole concept of VCS / Git. :thinking:

Regarding reporting bugs: you could use a new account for each issue. But that might harm the GitHub / GitLab community because the number of “fake” accounts would rise.

The safest option would be to stop using electronic devices though. :ok_hand:

Plain git has no ticket system (and only saves timestamps which I can modify), so it has far fewer privacy problems.

The only problem I am talking about is the very exact timestamps. I know that there are more privacy problems (in git too), but to fix these I would have to stop using electronic devices connected to the internet (locally saved data is still a problem, but is not analyzed by third parties).

I could develop and report bugs using pseudonyms without links to my real identity (but it would be easy to make one small mistake …). However, I would personally not really trust something that comes out of nowhere. The problem I have is that more data than required is collected and publicly available; I accept that some data is required and saved.
