Talk:VCS package guidelines

From ArchWiki
Revision as of 13:25, 10 September 2015 by Scimmia (talk | contribs) (Guidelines on updating version numbers: Re)

Numeric vs Date Standard in Git pkgver

I'd like to see a standard emerge in AUR Git packages, but standards have to work down to the lowest common denominator. Only a small minority of GitHub projects properly tag their releases, so there can be only two LCD standards for Git packages: the Numeric standard, e.g. ("r1581.2b039da") and the Date standard, e.g. ("2014-01-01"). The date standard is much more readable, and actionable, and has the additional benefit of being backwards compatible with older Git packages.

I don't know if any standard can emerge even if we wanted it to, but please allow me to at least add the Date standard to the VCS package guidelines wiki (as opposed to scrubbing it, :).

Using the commit date would be incorrect because it doesn't refer to a specific commit and the dates are not always sequential. It would be even more wrong if it was the build date, as that doesn't refer to anything to do with the version. -- thestinger 21:33, 23 January 2015 (UTC)
it doesn't refer to a specific commit
Neither does "r1581.2b039da" refer to a specific commit unless you have the .git repo available on the command line, and it is considered best practice to prune $pkgdir of .git:
       find "$pkgdir" -type d -name .git -exec rm -r '{}' +
Can you explain what you mean by "the dates are not always sequential"? The Date standard looks like this:
       git log -1 --format="%cd" --date=short | sed "s|-||g"
It displays the latest commit date from the upstream Git repo.
In the edge case of breaking changes occurring upstream in Git repos on the same day, then package maintainers should increment pkgrel. In practice, "2014-01-01" is much more readable and useful than "r1581.2b039da". The readability benefits of "2014-01-01" vs. "r1581.2b039da" should not be underestimated due to edge cases. --Atweiden (talk) 18:39, 24 January 2015 (UTC)
A git commit id certainly refers to a specific commit. I don't know why you think this has anything to do with shipping the VCS repository in the package. The commit date of the most recent commit may be from 20 weeks ago while the earlier commits were from last week. Git does not have a linear history. It's not going to be changed to use a version format where it can go both backwards and forwards with no way of tracing it back to the commit. Readability is worthless if the information isn't meaningful... There are 2 hard requirements: newer revisions must have higher package versions without exceptions, and it must be possible to trace it back to a commit to properly identify and report bugs. -- thestinger 18:53, 24 January 2015
I am quite certain of what a commit ID looks like, e.g. 2d5749d0315a59f4a8bf73ebbe757b227c8c17a3, not r1581.2b039da.
Can you show me where to find "r1581.2b039da" on a GitHub/BitBucket repo? It doesn't exist anywhere that I can see. You need to have the repo locally, then cd into the Git repo and run:
   git rev-list --count HEAD
That is work-intensive, and it conflicts with the best practice of pruning AUR packages of .git repos.
I previously used the Wiki-recommended Numeric standard, so it's not like I haven't used it. I switched to the Date standard, updating hundreds of packages to reflect it. Why do you suppose that is?
Readability is worthless if the information isn't meaningful
You'd be hard pressed to convince me that the date "2014-01-01" is somehow *less* semantically meaningful or actionable than "r1581". I say this after years of maintaining hundreds of Git packages in the AUR. By grepping `pacman -Q` and making a two-second visit to GitHub, you can tell exactly which AUR Git packages are out of date. The Date standard is incredibly actionable and pragmatic, not to mention simple and immediately understandable by anyone.
newer revisions must have higher package versions without exceptions
Not a problem: "2014-01-01" < "2014-01-02"
it must be possible to trace it back to a commit to properly identify and report bugs
Given how frequently "stable" Go packages and C packages pull in Git dependencies as part of their build process without specifying a certain commit, I disagree with this proposition. Why would recording the version of acmedaemon as 1.5, when it pulls in Git packages without specifying commit ID, be all that different in bug reports than "2014-01-01"? In both cases, you will not be able to track down the exact cause of the bug by commit. In both cases, you will in all likelihood end up consulting the build date. --Atweiden (talk) 03:32, 25 January 2015 (UTC)
I am quite certain of what a commit ID looks like, e.g. 2d5749d0315a59f4a8bf73ebbe757b227c8c17a3, not r1581.2b039da.
In your scenario, 2b039da is the short version of the commit hash and does point to a specific commit. As for where you find it on GitHub, it's listed in the commit log, of course. Just look at the right-hand side of the page.
You'd be hard pressed to convince me that the date "2014-01-01" is somehow *less* semantically meaningful or actionable than "r1581".
In this case, human readability really is meaningless. Machine readability is all that matters.
Not a problem: "2014-01-01" < "2014-01-02"
And what happens when the commit after that is dated 2013-12-31? As thestinger pointed out, Git is not linear.
Scimmia (talk) 03:40, 25 January 2015 (UTC)
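To make the ordering concern above concrete, here is a minimal sketch. The helper names `date_pkgver` and `strictly_increasing` are hypothetical, the three committer dates are made-up data, and plain integer comparison is used as a stand-in for pacman's own version comparison. It only illustrates that a pkgver derived from the latest commit date is not guaranteed to increase from one commit to the next.

```shell
# Hypothetical helper: turn a short commit date (YYYY-MM-DD) into a
# Date-standard pkgver by stripping the dashes, like the sed call above.
date_pkgver() {
  printf '%s\n' "$1" | sed 's|-||g'
}

# Hypothetical helper: succeed only if each argument is strictly greater
# than the one before it. Integer comparison is an assumption standing in
# for pacman's version comparison.
strictly_increasing() {
  prev=0
  for v in "$@"; do
    [ "$v" -gt "$prev" ] || return 1
    prev=$v
  done
  return 0
}

# Made-up committer dates, listed in the order the commits land on master.
versions=""
for d in 2014-01-01 2014-01-02 2013-12-31; do
  versions="$versions $(date_pkgver "$d")"
done

# Word splitting of $versions is intentional here.
if strictly_increasing $versions; then
  echo "monotonic:$versions"
else
  echo "goes backwards:$versions"
fi
```

With the sample dates the third version is lower than the second, so the check fails. A revision count taken from the same history would still have gone up by one with each of those commits.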
2b039da is the short version of the commit hash
Can this information be easily compared on GitHub to your local package version to check if it is out of date? The answer is no.
It is not actionable, except by cloning the repo yourself, running the Git command, and then manually scanning the Git log for short versions. It's very work-intensive.
Git is not linear
Neither are stable packages on PyPi:
Pacman views the upgrade of 0.9.9 -> 0.9.9a as a downgrade. See for a real world example of this edge case. Edge cases should be no reason to prevent readers of the wiki to make informed choices about Git packaging standards. --Atweiden (talk) 04:08, 25 January 2015 (UTC)
Yes, the short hash of each commit is listed in the commit log. Also note that the revision count (the "r" number listed first) is also listed at the top of each repo's page.
As for PyPi's versioning scheme, their choice is different than what Arch's vercmp assumes. The packager needs to take this into account, just as all packagers must take upstream's idiosyncrasies into account. I don't see how it's relevant here.
You also never answered my question, what happens when the last commit is dated before the commit previous to it? This question is critical, as it would cause your versioning scheme to go backwards by any and all measures. That makes it unacceptable.
Scimmia (talk) 04:15, 25 January 2015 (UTC)
   printf "r%s.%s" "$(git rev-list --count HEAD)" "$(git rev-parse --short HEAD)"
The output of the above command does not correspond to any piece of information on GitHub.
If PyPi packagers should take edge cases into account for PyPi packages, then they should do the same for Git packages. Edge cases aren't reason enough to scrub a competing standard from the wiki. --Atweiden (talk) 04:23, 25 January 2015 (UTC)
I'm sorry if you can't find this information on GitHub. It's very, very plainly displayed in the places I already pointed you to. If you would take a minute to check instead of just jumping at me, you might find it.
Scimmia (talk) 04:31, 25 January 2015 (UTC)
the short hash of each commit is listed in the commit log. Also note that the revision count (the "r" number listed first) is also listed at the top of each repo's page.
This is (partially) incorrect. The short hash of each commit is not listed. What is listed in the commit log is the truncated commit ID.
The revision count point I will retract. It didn't appear to work for my own Git repos, but that was because I ran the command from a Git subrepo (edge case :). The date still has the advantage of being more human readable than r1582. If a package has version 2011-10-22, I will know this project is either unmaintained, or it's been a really long time since I last rebuilt it. All without leaving the terminal. --Atweiden (talk) 07:33, 25 January 2015 (UTC)
It's also very easy to look up or link to commits via the abbreviated commit hashes or proper identifiers produced by git describe. -- thestinger 04:33, 25 January 2015 (UTC)
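As a rough illustration of tracing a Numeric-standard pkgver back to its commit, the sketch below splits "r1581.2b039da" into its two parts. The helper names `pkgver_commit` and `pkgver_revcount` are hypothetical and assume the version has exactly the "r<count>.<short hash>" shape discussed here.

```shell
# Hypothetical helper: print the abbreviated commit hash, i.e. everything
# after the last dot of a version such as "r1581.2b039da".
pkgver_commit() {
  printf '%s\n' "${1##*.}"
}

# Hypothetical helper: print the revision count, i.e. the part before the
# first dot with the leading "r" removed.
pkgver_revcount() {
  rev=${1%%.*}
  printf '%s\n' "${rev#r}"
}

pkgver_commit r1581.2b039da     # prints 2b039da
pkgver_revcount r1581.2b039da   # prints 1581
```

Inside a clone of the upstream repository, the extracted hash can then be handed to a command such as git show.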
That has nothing to do with non-linear history. It has to do with an incompatibility between Pacman's version comparison and the one chosen by the project. There are ways of dealing with that, but it has nothing to do with this discussion. -- thestinger 04:24, 25 January 2015 (UTC)
The ideal version is the one produced by git describe, because it's easily understood by humans (last tag, commits since last tag, abbreviated commit), is also understood by the various Git commands and GitHub and is monotonically increasing thanks to the inclusion of the last version and revision count since then. The versions it produces (10-7-g8537989) can be directly used with commands like git show. Unlike the build date, it identifies a unique revision of the project, including the state of any submodules.
The only case where another method should be used is when the project doesn't yet have any tagged release or does the releases in a way that's not very sane. The releases should be tagged on master with additional tags in stable branches for fixes backported to old releases. If this isn't the case, the git describe output needs to be emulated. A date is a poor way of doing that, because it's not connected to a specific point in the revision history.
The only case where the commit date would work is if Git is being used as a crippled centralized version control system. If all commits get created against master (no merging branches or patches) in a centralized repository where the system clock is never moved backwards, it could work... but that's not the reality of non-linear, distributed version control.
A build date is even worse, because it only refers to the local state of time and the local state of the repository. Commits from before that timestamp will often be pushed to the repository.
-- thestinger 04:24, 25 January 2015 (UTC)
A packaging standard must be compatible with the lowest common denominator. A minority of Git repos are properly tagged. The Date works as a standard, tags do not.
   git log -1 --format="%cd" --date=short | sed "s|-||g"
The output of the above command shows the date of the most recent commit. If the most recent commit date happens to come before the date of a previous commit, I do understand this is suboptimal, but I consider it an edge case. Even in big projects with contributors in different time zones you will typically see branches merged in with PRs. PRs reset the date to something recent. The linearity of Git is rarely problematic, but I do see where you're coming from (I'm also a happy user of archversion for keeping packages up to date, though archversion works with the Date standard too :) --Atweiden (talk) 04:59, 25 January 2015 (UTC)
The git describe versioning system is the standard defined and used by Git itself. The dashes need to be converted into dots for pkgver and an r may be inserted to make it clearer which part is the revision number. If version-based tags on master are available, this is the only sane way to do it.
Git defaults to fast-forwarding / applying patches without merge commits, as long as there are no conflicts. GitHub will always generate a merge commit but that's not true for anything applied / merged by the developers and then pushed to master. -- thestinger 05:24, 25 January 2015 (UTC)
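The conversion described above (dashes to dots, with an "r" in front of the revision count) could be sketched as follows. The function name `describe_to_pkgver` is hypothetical, and the sketch assumes its input has the "<tag>-<count>-g<hash>" shape that git describe --long prints.

```shell
# Hypothetical helper: read git-describe style output on stdin and print a
# pkgver-safe string. The first sed expression prefixes the commit count
# with "r"; the second turns every dash into a dot.
describe_to_pkgver() {
  sed 's/\([^-]*-g\)/r\1/;s/-/./g'
}

# Sample input taken from the discussion above.
printf '%s\n' '10-7-g8537989' | describe_to_pkgver   # prints 10.r7.g8537989
```

In a pkgver() function this would typically be fed from the repository itself, for example `git describe --long | describe_to_pkgver`. Tags that carry a prefix such as "v", or that contain dashes of their own, may need extra handling.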
You've made the claim that most git repos aren't properly tagged, not only now but to me previously. Do you have anything to back that up? IME, the large majority of projects that are in a usable state are either tagged or get tagged as they become stable.
Scimmia (talk) 05:27, 25 January 2015 (UTC)
   git init
You now have your very own untagged Git repo :).
The majority of Git repos are untagged. If you want to have a standard compatible with the lowest common denominator, by definition you cannot depend on tags being there.
I regularly package Git software in the alpha stages and software maintained by people who are unfamiliar with tagging or aren't bothered to tag. --Atweiden (talk) 07:15, 25 January 2015 (UTC)
As suggested in VCS package guidelines#Fallback, I think the only date that can be used is the build date: for example I've recently used it in a PKGBUILD for a repo without tags or branches, also appending the first 7 characters of the commit hash (the repo is not cloned, so I could only use git-ls-remote). — Kynikos (talk) 03:30, 25 January 2015 (UTC)
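A minimal sketch of the build date plus abbreviated hash approach described above, for the case where the repo is not cloned. The function name `lsremote_pkgver` is hypothetical, the date is passed in as an argument so the example stays deterministic, and the input is assumed to be a single line in the "<full hash><TAB><ref>" form that git ls-remote prints.

```shell
# Hypothetical helper: read one line of `git ls-remote <url> HEAD` output
# on stdin and print "<date>.<first 7 characters of the hash>".
# $1 is the date string to use, e.g. the output of `date +%Y%m%d`.
lsremote_pkgver() {
  read -r hash _ref
  printf '%s.%.7s\n' "$1" "$hash"
}

# Sample line built from the full commit ID quoted earlier in this thread;
# 20150125 is an arbitrary example date.
printf '%s\tHEAD\n' 2d5749d0315a59f4a8bf73ebbe757b227c8c17a3 | lsremote_pkgver 20150125
# prints 20150125.2d5749d
```

In a PKGBUILD the call might look like `git ls-remote "$_url" HEAD | lsremote_pkgver "$(date +%Y%m%d)"`, where `_url` is a placeholder variable for the upstream repository address.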

Updating a CVS repo

I don't use cvs. How can you describe the pkgver for cvs (for pacman 4.1)?
-- Dracorp (talk) 09:31, 6 April 2013 (UTC)

CVS is not supported in pacman 4.1 like the other VCS tools. You will need to update pkgver manually until CVS support is added.
-- Jstjohn (talk) 22:44, 15 April 2013 (UTC)
Yeah, but how about mentioning that in the article (as well as giving a download example)? Even if it's not that common anymore.
--Det (talk) 22:39, 2 May 2013 (UTC)
The download example can still be found in /usr/share/pacman/. The next version of the ABS package should update it a bit so the download happens in the prepare function where it belongs. As for pkgver, I think the generic example using date covers that, as there's not a way to get a version number from a CVS repo. Maybe a note to that effect?
-- Scimmia (talk) 07:17, 15 May 2013 (UTC)
That makes the most sense, but it might also be a good idea to rename the "Fallback" section to something like "Fallback / CVS" to make it more obvious even when you're just checking out the table of contents.
But as for ABS, as far as I can tell the last commit was over 8 months ago.
--Det (talk) 05:54, 19 May 2013 (UTC)
Hmm, there were a number of patches submitted last month for cleaning up the prototypes, looks like none have been committed yet. I do remember a discussion (IRC maybe?) questioning the proper place for the prototypes, so maybe that's why? Looking at the patches, I was mistaken anyway; they didn't update the darcs or cvs prototypes. Simple enough, I'll send in a patch myself.
--Scimmia (talk) 08:22, 19 May 2013 (UTC)
I use this dirty hack: cvs history -c -a | cut -d' ' -f2 | sort -u | tail -n 1 | sed 's|-||g' ; could probably be improved.
--Buhman (talk) 18:00, 6 June 2013 (UTC)

pkgver function for hg based on tags

I recently came across a way with hg to show the most recent tag, as well as the number of commits since that tag (similar to the output of git describe).

pkgver() {
  cd local_repo
  # latest tag, number of commits since that tag, short changeset ID
  hg log -r . --template '{latesttag}.{latesttagdistance}.{node|short}\n'
}

Please could this be included in the page.

Garyvdm (talk) 09:03, 23 July 2013 (UTC)

shorten hg version

To prevent long package file names, it is better to use this format:

pkgver() {
  cd "$_repo"
  # local revision number plus the first 4 characters of the changeset ID
  _id=$(hg identify -i)
  echo "$(hg identify -n).${_id:0:4}"
}

--Dlin (talk) 05:30, 26 August 2013 (UTC)


pkgver() {
  cd local_repo
  _id=$(cat manifest.uuid 2>/dev/null)
  echo "${_id:0:4}"
}

--Dlin (talk) 05:36, 26 August 2013 (UTC)


Re [1], I hope I wasn't the straw that broke the camel's back. I still fail to see why the Git section should be that large though, also in comparison to the other sections. -- Alad (talk) 11:26, 6 March 2015 (UTC)

set -o pipefail

This change mentioned set -o ... as one of the reasons for reverting, nevertheless one example using set -o pipefail still remains: VCS_package_guidelines#Git (the last example in that section). -- Lahwaacz (talk) 17:41, 6 March 2015 (UTC)

Guidelines on updating version numbers

There are a few version control AUR packages out there which use the correct combined format of release and git commit. However, sometimes the maintainers refuse to update the default pkgver upon new releases of the software, stating that it's unnecessary since it's updated automatically during installation anyway. While this is true, I personally feel it should still be the maintainer's responsibility to update the default pkgver upon actual releases (not individual commits) as otherwise a user will never receive an automatic update and instead has to force-reinstall all(!) such packages regularly.

Should we add this to the guidelines or how do others feel about this?

—This unsigned comment is by Airblader (talk) 09:51, 10 September 2015‎. Please sign your posts with ~~~~!

There's no official update mechanism for AUR packages, so this argument only applies to those using unofficial AUR helpers. Otherwise there's soname bumps; if the package is a library, then pkgver should likely be updated. If it is a library the package depends on, I don't know of a reliable way as pkgrel is stuck to 1 on VCS packages. After all, a soname bump could happen between releases or even commits. -- Alad (talk) 11:00, 10 September 2015 (UTC)
If you're using a VCS package, you should be paying attention to what is happening upstream and choose when you rebuild the package. Managing them the same way as release packages makes no sense.
Scimmia (talk) 13:25, 10 September 2015 (UTC)