Talk:VCS package guidelines

From ArchWiki
Jump to navigation Jump to search

Numeric vs Date Standard in Git pkgver

I'd like to see a standard emerge in AUR Git packages, but standards have to work down to the lowest common denominator. Only a small minority of GitHub projects properly tag their releases, so there can be only two LCD standards for Git packages: the Numeric standard, e.g. ("r1581.2b039da") and the Date standard, e.g. ("2014-01-01"). The date standard is much more readable, and actionable, and has the additional benefit of being backwards compatible with older Git packages.

I don't know if any standard can emerge even if we wanted it to, but please allow me to at least add the Date standard to the VCS package guidelines wiki (as opposed to scrubbing it, :).

Using the commit date would be incorrect because it doesn't refer to a specific commit and the dates are not always sequential. It would be even more wrong if it was the build date, as that doesn't refer to anything to do with the version. -- thestinger 21:33, 23 January 2015 (UTC)
it doesn't refer to a specific commit
Neither does "r1581.2b039da" refer to a specific commit without the .git repo cmdline interaction, and it is considered best practice to prune $pkgdir of .git:
       find "$pkgdir" -type d -name .git -exec rm -r '{}' +
Can you explain what you mean by "the dates are not always sequential"? The Date standard looks like this:
       git log -1 --format="%cd" --date=short | sed "s|-||g"
It displays the latest commit date from the upstream Git repo.
In the edge case of breaking changes occurring upstream in Git repos on the same day, then package maintainers should increment pkgrel. In practice, "2014-01-01" is much more readable and useful than "r1581.2b039da". The readability benefits of "2014-01-01" vs. "r1581.2b039da" should not be underestimated due to edge cases. --Atweiden (talk) 18:39, 24 January 2015 (UTC)
A git commit id certainly refers to a specific commit. I don't know why you think this has anything to do with shipping the VCS repository in the package. The commit date of the most recent commit may be from 20 weeks ago while the earlier commits were from last week. Git does not have a linear history. It's not going to be changed to use a version format where it can go both backwards and forwards with no way of tracing it back to the commit. Readability is worthless if the information isn't meaningful... There are 2 hard requirements: newer revisions must have higher package versions without exceptions, and it must be possible to trace it back to a commit to properly identify and report bugs. -- thestinger 18:53, 24 January 2015
I am quite certain of what a commit ID looks like, e.g. 2d5749d0315a59f4a8bf73ebbe757b227c8c17a3, not r1581.2b039da.
Can you show me where to find "r1581.2b039da" on a GitHub/BitBucket repo? It doesn't exist anywhere that I can see. You need to have the repo locally, then cd into the Git repo and run:
   git rev-list --count HEAD
That is work-intensive, and it conflicts with the best practice of pruning AUR packages of .git repos.
I previously used the Wiki-recommended Numeric standard, so it's not like I haven't used it. I switched to the Date standard, updating hundreds of packages to reflect it. Why do you suppose that is?
Readability is worthless if the information isn't meaningful
You'd be hard pressed to convince me that the date "2014-01-01" is somehow *less* semantically meaningful or actionable than "r1581". I say this after maintaining for years, hundreds of Git packages in the AUR. You can tell by grepping `pacman -Q` exactly which AUR git packages are out of date with a two-second visit to GitHub. The Date standard is incredibly actionable and pragmatic not to mention simple and immediately understandable by anyone.
newer revisions must have higher package versions without exceptions
Not a problem: "2014-01-01" < "2014-01-02"
it must be possible to trace it back to a commit to properly identify and report bugs
Given how frequently "stable" Go packages and C packages pull in Git dependencies as part of their build process without specifying a certain commit, I disagree with this proposition. Why would recording the version of acmedaemon as 1.5, when it pulls in Git packages without specifying commit ID, be all that different in bug reports than "2014-01-01"? In both cases, you will not be able to track down the exact cause of the bug by commit. In both cases, you will in all likelihood end up consulting the build date. --Atweiden (talk) 03:32, 25 January 2015 (UTC)
I am quite certain of what a commit ID looks like, e.g. 2d5749d0315a59f4a8bf73ebbe757b227c8c17a3, not r1581.2b039da.
In your scenerio, 2b039da is the short version of the commit hash and does point to a specific commit. As for where you find it on Github, it's listed on the commit log, of course. Just look at the right hand side of the page.
You'd be hard pressed to convince me that the date "2014-01-01" is somehow *less* semantically meaningful or actionable than "r1581".
In this case, human readability really is meaningless. Machine readability is all that matters.
Not a problem: "2014-01-01" < "2014-01-02"
And what happens when the commit after that is dated 2013-12-31? As thestinger pointed out, Git is not linear.
Scimmia (talk) 03:40, 25 January 2015 (UTC)
2b039da is the short version of the commit hash
Can this information be easily compared on GitHub to your local package version to check if it is out of date? The answer is no.
It is not actionable, except by cloning the repo yourself, running the Git command, and then manually scanning the Git log for short versions. It's very work-intensive.
Git is not linear
Neither are stable packages on PyPi:
Pacman views the upgrade of 0.9.9 -> 0.9.9a as a downgrade. See for a real world example of this edge case. Edge cases should be no reason to prevent readers of the wiki to make informed choices about Git packaging standards. --Atweiden (talk) 04:08, 25 January 2015 (UTC)
Yes, the short hash of each commit is listed in the commit log. Also note that the revision count (the "r" number listed first) is also listed at the top of each repo's page.
As for PyPi's versioning scheme, their choice is different than what Arch's vercmp assumes. The packager needs to take this into account, just as all packagers must take upstream's idiosyncrasies into account. I don't see how it's relevant here.
You also never answered my question, what happens when the last commit is dated before the commit previous to it? This question is critical, as it would cause your versioning scheme to go backwards by any and all measures. That makes it unacceptable.
Scimmia (talk) 04:15, 25 January 2015 (UTC)
   printf "r%s.%s" "$(git rev-list --count HEAD)" "$(git rev-parse --short HEAD)
The output of the above command does not correspond to any piece of information on GitHub.
If PyPi packagers should take edge cases into account for PyPi packages, then they should do the same for Git packages. Edge cases aren't reason enough to scrub a competing standard from the wiki. --Atweiden (talk) 04:23, 25 January 2015 (UTC)
I'm sorry if you can't find this information on GitHub. It's very, very plainly displayed in the places I already pointed you to. If you would take a minute to check instead of just jumping at me, you might find it.
Scimmia (talk) 04:31, 25 January 2015 (UTC)
the short hash of each commit is listed in the commit log. Also note that the revision count (the "r" number listed first) is also listed at the top of each repo's page.
This is (partially) incorrect. The short hash of each commit is not listed. What is listed in the commit log is the truncated commit ID.
The revision count point I will retract. It didn't appear to work for my own Git repos, but that was because I ran the command from a Git subrepo (edge case :). The date still has the advantage of being more human readable than r1582. If a package has version 2011-10-22, I will know this project is either unmaintained, or it's been a really long time since I last rebuilt it. All without leaving the terminal. --Atweiden (talk) 07:33, 25 January 2015 (UTC)
It's also very easy to look up or link to commits via the abbreviated commit hashes or proper identifiers produced by git describe. -- thestinger 04:33, 25 January 2015 (UTC)
That has nothing to do with non-linear history. It has to do with an incompatibility between Pacman's version comparison and the one chosen by the project. There are ways of dealing with that, but it has nothing to do with this discussion. -- thestinger 04:24, 25 January 2015 (UTC)
The ideal version is the one produced by git describe, because it's easily understood by humans (last tag, commits since last tag, abbreviated commit), is also understood by the various Git commands and GitHub and is monotonically increasing thanks to the inclusion of the last version and revision count since then. The versions it produces (10-7-g8537989) can be directly used with commands like git show. Unlike the build date, it identifies a unique revision of the project, including the state of any submodules.
The only case where another method should be used is when the project doesn't yet have any tagged release or does the releases in a way that's not very sane. The releases should be tagged on master with additional tags in stable branches for fixes backported to old releases. If this isn't the case, the git describe output needs to be emulated. A date is a poor way of doing that, because it's not connected to a specific point in the revision history.
The only case where the commit date would work is if Git is being used as a crippled centralized version control system. If all commits get created against master (no merging branches or patches) in a centralized repository where the system clock is never moved backwards, it could work... but that's not the reality of non-linear, distributed version control.
A build date is even worse, because it only refers to the local state of time and the local state of the repository. Commits from before that timestamp will often be pushed to the repository.
-- thestinger 04:24, 25 January 2015 (UTC)
A packaging standard must be compatible with the lowest common denominator. A minority of Git repos are properly tagged. The Date works as a standard, tags do not.
   git log -1 --format="%cd" --date=short | sed "s|-||g"
The output of the above command shows the date of the most recent commit. If the most recent commit date happens to come before the date of a previous commit, I do understand this is suboptimal, but I consider it an edge case. Even in big projects with contributors in different time zones you will typically see branches merged in with PRs. PRs reset the date to something recent. The linearity of Git is rarely problematic, but I do see where you're coming from (I'm also a happy user of archversion for keeping packages up to date, though archversion works with the Date standard too :) --Atweiden (talk) 04:59, 25 January 2015 (UTC)
The git describe versioning system is the standard defined and used by Git itself. The dashes need to be converted into dots for pkgver and an r may be inserted to make it clearer which part is the revision number. If version-based tags on master are available, this is the only sane way to do it.
Git defaults to fast-forwarding / applying patches without merge commits, as long as there are no conflicts. GitHub will always generate a merge commit but that's not true for anything applied / merged by the developers and then pushed to master. -- thestinger 05:24, 25 January 2015 (UTC)
You've made the claim that most git repos aren't properly tagged, not only now but to me previously. Do you have anything to back that up? IME, the large majority of projects that are in a usable state are either tagged or get tagged as they become stable.
Scimmia (talk) 05:27, 25 January 2015 (UTC)
   git init
You now have your very own untagged Git repo :).
The majority of Git repos are untagged. If you want to have a standard compatible with the lowest common denominator, by definition you cannot depend on tags being there.
I regularly package Git software in the alpha stages and software maintained by people who are unfamiliar with tagging or aren't bothered to tag. --Atweiden (talk) 07:15, 25 January 2015 (UTC)
As suggested in VCS package guidelines#Fallback, I think the only date that can be used is the build date: for example I've recently used it in a PKGBUILD for a repo without tags or branches, also appending the first 7 characters of the commit hash (the repo is not cloned, so I could only use git-ls-remote). — Kynikos (talk) 03:30, 25 January 2015 (UTC)

Updating a CVS repo

I don't use cvs. How can you describe the pkgver for cvs (for pacman 4.1)?
-- Dracorp (talk) 09:31, 6 April 2013 (UTC)

CVS is not supported in pacman 4.1 like the other VCS tools. You will need to update pkgver manually until CVS support is added.
-- Jstjohn (talk) 22:44, 15 April 2013 (UTC)
Yeah, but how about mentioning that in the article (as well as giving a download example)? Even if it's not that common anymore.
--Det (talk) 22:39, 2 May 2013 (UTC)
The download example can still be found in /usr/share/pacman/. The next version of the ABS package should update it a bit so the download happens in the prepare function where it belongs. As for pkgver, I think the generic example using date covers that, as there's not a way to get a version number from a CVS repo. Maybe a note to that effect?
-- Scimmia (talk) 07:17, 15 May 2013 (UTC)
That makes the most sense, but it might also be a good idea to rename the "Fallback" section to something like "Fallback / CVS" to make it more obvious even when you're just checking out the table of contents.
But as for ABS, as far as I can tell the last commit was over 8 months ago.
--Det (talk) 05:54, 19 May 2013 (UTC)
Hmm, there were a number of patches submitted last month for cleaning up the prototypes, looks like none have been committed yet. I do remember a discussion (IRC maybe?) questioning the proper place for the prototypes, so maybe that's why? Looking at the patches, I was mistaken anyway; they didn't update the darcs or cvs prototypes. Simple enough, I'll send in a patch myself.
--Scimmia (talk) 08:22, 19 May 2013 (UTC)
I use this dirty hack: cvs history -c -a | cut -d' ' -f2 | sort -u | tail -n 1 | sed 's|-||g" ; could probably be improved.
--Buhman (talk) 18:00, 6 June 2013 (UTC)

pkgver function for hg based on tags

I recent came across a way with hg to show the most recent tag, as well as the number of commits from this tag (similar to the output of git describe.)

pkgver() {
  cd local_repo
  hg log -r . --template '{latesttag}.{latesttagdistance}.{node|short}\n'

Please could this be included in the page.

Garyvdm (talk) 09:03, 23 July 2013 (UTC)

shorten hg version

To prevent long package file name, It is proper to use this format

pkgver() {
  cd $_repo
  _id=$(hg identify -i)
  echo $(hg identify -n).${_id:0:4}

--Dlin (talk) 05:30, 26 August 2013 (UTC)


pkgver() {
  cd local_repo
  _id=$(cat manifest.uuid 2>/dev/null)
  echo ${_id:0:4}

--Dlin (talk) 05:36, 26 August 2013 (UTC)


Re [1], I hope I wasn't the straw that broke the camel's back. I still fail to see why the Git section should be that large though, also in comparison to the other sections. -- Alad (talk) 11:26, 6 March 2015 (UTC)

set -o pipefail

This change mentioned set -o ... as one of the reasons for reverting, nevertheless one example using set -o pipefail still remains: VCS_package_guidelines#Git (the last example in that section). -- Lahwaacz (talk) 17:41, 6 March 2015 (UTC)

Guidelines on updating version numbers

There are a few version control AUR packages out there which use the correct combined format of release and git commit. However, sometimes the maintainers refuse to update the default pkgver upon new releases of the software, stating that it's unnecessary since it's updated automatically during installation anyway. While this is true, I personally feel it should still be the maintainer's responsibility to update the default pkgver upon actual releases (not individual commits) as otherwise a user will never receive an automatic update and instead has to force-reinstall all(!) such packages regularly.

Should we add this to the guidelines or how do others feel about this?

—This unsigned comment is by Airblader (talk) 09:51, 10 September 2015‎. Please sign your posts with ~~~~!

There's no official update mechanism for AUR packages, so this argument only applies to those using unofficial AUR helpers. Otherwise there's soname bumps; if the package is a library, then pkgver should likely be updated. If it is a library the package depends on, I don't know of a reliable way as pkgrel is stuck to 1 on VCS packages. After all, a soname bump could happen between releases or even commits. -- Alad (talk) 11:00, 10 September 2015 (UTC)
If you're using a VCS package, you should be paying attention to what is happening upstream and choose when you rebuild the package. Managing them the same way as release packages makes no sense.
Scimmia (talk) 13:25, 10 September 2015 (UTC)
Can you explain why it doesn't make sense to manage them the same way? One example is the package i3-git. As of writing this, it's version in the AUR is 4.10.2-xxxxx. The current i3 release is 4.10.4. So why *shouldn't* the package maintainer feel responsible to update it to 4.10.4-xxxxx? It's part of the Arch "feel" to be close to current development of software and I feel that if you maintain a package, you are responsible to update its version even if the actual installation process doesn't change. Sure I can rebuild my git packages, but that's really time intensive. I also cannot monitor all of those packages upstream with reasonable effort. This is exactly what a package management system has version numbers for and keeping them up to date is exactly what package maintainers are for, IMHO. Airblader (talk) 11:39, 11 September 2015 (UTC)
VCS packages will always build the latest revision, regardless of what pkgver on the AUR contains. If a package maintainer bumps pkgver of a VCS package, there is no assurance that the users will actually build the same (working) revision, hence the bumping makes no sense. If you need to track multiple git repos, use something like repocheck to find out if you actually need to rebuild the package. I'm sure there are similar tools for other VCS, or just write your own. -- Lahwaacz (talk) 12:13, 11 September 2015 (UTC)
I don't think that bumping makes no sense just because you build the latest version and not the one stated by pkgver. This is *always* the case, even if the version isn't bumped. The bump isn't meant to build exactly that version, it is meant to trigger an automatic update in the user's package management system that doesn't require checking out the repository and verifying it themselves. Again, IMHO this is what version numbers are for. If we don't care about such packages triggering an update or having any kind of comparable version number, what's the point in even giving them a (meaningful) number? We might as well just set pkgver=1 and be done with it, no? So why all the fancy pkgver() functions for VCS packages? I don't see how the repocheck you linked to is relevant for this. I don't have the repositories of my VCS packages laying around (only in /tmp right after I update them and only until I reboot). I'd like to approach this with a pragmatic argument: I hope we can agree that triggering the update for official releases of the package is some benefit. Given that, what's the *harm* in recommending to update pkgver? If there's an opinion that it makes no sense, that's one thing, but as long as it doesn't harm I think the small benefit sounds like reason enough to me. It's not like updating the pkgver of a package is difficult work. Airblader (talk) 13:20, 11 September 2015 (UTC)
"I hope we can agree that triggering the update for official releases of the package is some benefit." We can't. I really see no benefit there at all. "Given that, what's the *harm* in recommending to update pkgver?" The harm comes from annoying people using those AUR helpers you mentioned. They have correctly chosen when they want to rebuild that package, now their helper is complaining all of the time even though they don't want to rebuild right now. Trigger automatic rebuilds of VCS packages is rarely a good thing, as there are no upsides and real downsides.
Your argument that it is really time intensive to rebuild your VCS packages makes no sense here, as you have to rebuild them either way. If you're unable to monitor upstream, why are you using a VCS package anyway? If you have no idea what's going on, you have no idea what you're getting when you build/rebuild it.
Scimmia (talk) 13:38, 11 September 2015 (UTC)
I don't think that you choose VCS packages because you don't want your package management system to annoy you with updates. If that's something you don't want, then you don't use one at all (why would I choose to not receive updates for *that one* package, but for all 500 others?). IMHO, you choose VCS packages because you want to be close to current development, either "just because" or because you need(ed) it for something else. Also, sometimes there just is *only* the VCS version and no package for the release. Also, if you don't want to receive updates about a certain package, you can always ignore it in the config. I don't think it's good to deny *all* users the notification of updates just because *some* might not want them.
I also can't agree with "triggering automatic rebuilds of VCS packages is rarely a good thing". I never see upront what I actually build, so there's always that "risk". You don't install VCS packages if you want the release version. Furthermore, in practice, it seems rare that rebuilding a VCS package causes any issues since they usually point to the master branches which in most projects have some QA before merging.
Your point about having to rebuild them anyway and hence not saving time is not correct. You're arguing that I should just rebuild all VCS packages all the time just in case there was an update. I'm suggesting the information whether a package needs to be rebuilt is transported in the pkgver. This saves a *drastic* amount of time as I'm not speculating. Of course I can check the Github page of each package's upstream URL before rebuilding it, but this puts the work of versioning the package onto each and every user rather than the package maintainer. Again, if this is the desired behavior, we should recommend that VCS packages just set pkgver=1. I don't see why they need to provide any version info if it's completely unnecessary that it ever be updated. Airblader (talk) 13:48, 11 September 2015 (UTC)