User:Foxboron/GitMigration

From ArchWiki
Warning: Details in this migration plan have changed over the course of the past 3 years

Summary

The goal is to move away from using svn internally to manage PKGBUILDs and use git instead. This enables developers to use tools they are more familiar with and facilitates future outside contributions.

Owner

  • Name: User:foxboron
  • Name: and whoever else is still invested in this.

Description

Gitlab

It's important to mention gitlab as we are going for a "federated" approach. The implication here is that we still need one way to define what the actual PKGBUILD source is. For the sake of simplicity in this proposal, we envision a centralized package group on the upcoming gitlab instance. The structure would follow from the current repository layout.

 https://gitlab.archlinux.org/groups/packages/
     - core
         - linux
         - pacman
         - ...
     - extra
     - community
     - multilib

The access control for these repositories should be the same as it currently is. Developers have access to core and extra. Trusted Users have access to community. Access to multilib is granted on an as-needed basis.

This allows Developers and Trusted Users to create new canonical package repositories in the package group without devops support, and allows for smooth operation.

The default access for these repositories should be:

  • No issue creation
  • No Merge Requests.
  • Maintainers should be able to enable or disable this on a case-by-case basis.

We still have our bugtracker for handling issues, features and bugs, and not everyone wants the added commitment of handling merge requests directly on the repo. Each maintainer should therefore decide this for their own repositories.

This also allows us to *know* where repositories are for the sake of script integrations later on.
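Because every canonical repository sits at a predictable path, a script can derive its location from the repository name and the pkgbase alone. The following is a minimal sketch, assuming the group URL and layout shown above; the helper name `canonical_repo_url` is hypothetical and not part of any existing tool.

```python
# Hypothetical helper: derive where a canonical package repository lives.
# The base URL and the <repo>/<pkgbase> layout are taken from the listing
# above; nothing here is an existing dbscripts or devtools API.
PACKAGE_GROUP = "https://gitlab.archlinux.org/groups/packages"
KNOWN_REPOS = ("core", "extra", "community", "multilib")


def canonical_repo_url(repo: str, pkgbase: str) -> str:
    """Return the expected location of the canonical repository."""
    if repo not in KNOWN_REPOS:
        raise ValueError(f"unknown repository: {repo!r}")
    if not pkgbase or "/" in pkgbase:
        raise ValueError(f"invalid pkgbase: {pkgbase!r}")
    return f"{PACKAGE_GROUP}/{repo}/{pkgbase}"
```

For example, `canonical_repo_url("core", "pacman")` yields the `core/pacman` entry of the listing.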

git

The git interaction of this layout should be fairly straightforward. Remotes are cheap, and so are new repositories. The canonical repositories in the package group are where dbscripts intersects with the maintainer, along with a rewritten asp tool. Beyond this, any workflow should be supported. If the maintainer wishes to have a subtree, submodule or some other monorepo structure for easier management, it should be simple to construct this from the individual repositories.

When releasing a package, a *signed* tag of the pkgver needs to be created. This allows dbscripts to attest the validity of the package release, along with the signed package tarball the maintainer has created. Equally, devtools should aid the packager in these situations.
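The tagging step could look roughly like the sketch below. It only relies on the standard `git tag -s` and `git verify-tag` commands; the function names, the tag message, and the choice to use the version string directly as the tag name are assumptions. Note that `git verify-tag` checks against the GnuPG keyring of the user running it, so verifying against archlinux-keyring would need additional setup that is not shown here.

```python
import subprocess


def tag_command(pkgver: str) -> list[str]:
    """Build the git command that creates a signed, annotated tag.

    Assumption: the version string is used as the tag name unchanged.
    Versions with an epoch contain ':', which git does not allow in
    ref names, so a real tool would need a naming rule for those.
    """
    return ["git", "tag", "-s", "-m", f"release {pkgver}", pkgver]


def verify_command(pkgver: str) -> list[str]:
    """Build the git command that verifies the signature on a tag."""
    return ["git", "verify-tag", pkgver]


def release(pkgver: str, repo_dir: str = ".") -> None:
    """Create the signed tag, then verify it before anything is pushed."""
    subprocess.run(tag_command(pkgver), cwd=repo_dir, check=True)
    subprocess.run(verify_command(pkgver), cwd=repo_dir, check=True)
```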

State repository layout

dbscripts would maintain a separate git repository responsible for keeping track of the package version between releases. Each pkgbase is represented as a file denoting the package version.

$ tree /srv/dbscripts
    dbscripts
    ├── core-x86_64
    │   ├── bash
    │   ├── linux
    │   └── vim
    ├── extra-x86_64
    ├── multilib
    ├── ...

/srv/dbscripts/core-x86_64/bash:
bash 4.4.023-1 f19949abae19c11687679663c488a3321e138e4b
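A state file like the one above is easy to consume from scripts. The sketch below parses one line into its three fields. It assumes the fields are pkgbase, version and a 40-character hexadecimal git commit ID, which is how the example line reads but is not spelled out in this plan; `StateEntry` and `parse_state_line` are hypothetical names.

```python
from typing import NamedTuple


class StateEntry(NamedTuple):
    pkgbase: str
    version: str
    commit: str  # assumed to be the git commit the release was built from


def parse_state_line(line: str) -> StateEntry:
    """Parse one '<pkgbase> <version> <commit>' line of a state file."""
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    pkgbase, version, commit = fields
    # Assumption: the third field is a full SHA-1 object name.
    if len(commit) != 40 or any(c not in "0123456789abcdef" for c in commit):
        raise ValueError(f"not a 40-character hex commit ID: {commit!r}")
    return StateEntry(pkgbase, version, commit)
```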

This repository is tagged with an incremented number, or a date, and represents the state of the Arch Linux repositories over time. It would keep track of the repositories by recording the same commits as we have in today's svn repository.

The following Git log is from the experimental repository.

 commit b27b2aff2022712031ee818b69f4eef1697e7d7b (HEAD -> master, origin/master)
 Author: Morten Linderud <morten@linderud.pw>
 Date:   Wed May 27 00:23:00 2020 +0200

     db-remove: hello removed by fox:

 commit 2c7c56e3a1757eb47a9cbac71f0ba9b0bb552b49
 Author: Morten Linderud <morten@linderud.pw>
 Date:   Wed May 27 00:22:59 2020 +0200

     db-move: moved hello from [core] to [extra] (x86_64):

 commit b942f04f78d3260972f6307023243f57978e7892
 Author: Morten Linderud <morten@linderud.pw>
 Date:   Tue May 26 22:54:05 2020 +0200

     updated hello-1-1 in [core-x86_64]:

dbscripts

dbscripts has been refactored to remove the SVN interaction. The largest difference is the removal of the separation between community and packages. It's easier to provide one set of tools and have dbscripts attempt to guess where the upstream repository is.

This means /community/db-update has been replaced by /packages/db-update and so on.

Addition/Update of a package

db-update

Requirements:

  • Ensure tag is signed by key from archlinux-keyring
  • Ensure the new pkgver is always greater than the released one, as compared with vercmp
  • Never allow someone to replace the git commit with the same tag
  • Increment the tag of the
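The version and tag requirements above could be checked along these lines. This is a sketch only: `check_update` is a hypothetical function, the old version and commit are assumed to come from the state repository, and signature verification is not covered. The default comparator calls pacman's `vercmp`, which prints -1, 0 or 1; a different comparator can be passed in for testing.

```python
import subprocess
from typing import Callable, Optional


def run_vercmp(a: str, b: str) -> int:
    """Compare two versions with pacman's vercmp, which prints -1, 0 or 1."""
    result = subprocess.run(
        ["vercmp", a, b], check=True, capture_output=True, text=True
    )
    return int(result.stdout.strip())


def check_update(
    old_version: Optional[str],
    old_commit: Optional[str],
    new_version: str,
    new_commit: str,
    compare: Callable[[str, str], int] = run_vercmp,
) -> None:
    """Raise ValueError if the update is not a strict version increase."""
    if old_version is None:
        return  # first release of this pkgbase, nothing to compare against
    order = compare(new_version, old_version)
    if order < 0:
        raise ValueError(f"{new_version} is older than {old_version}")
    if order == 0 and new_commit != old_commit:
        # Same version but a different commit: the tag was replaced.
        raise ValueError(f"tag {new_version} now points to another commit")
    if order == 0:
        raise ValueError(f"{new_version} is already released")
```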

Removal of a package

db-remove would ensure the package is removed from the repository and that its entry in the state repository is removed as well.

Moving of a package

db-move would locate the package, move it between the given repositories and record the movement in the state repository.
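On the state repository side, a move amounts to relocating one file between two directories and committing the result. The sketch below shows only that file operation, assuming the `<repo>-<arch>/<pkgbase>` layout shown earlier and the commit message format seen in the experimental log; `move_state_entry` is a hypothetical name, and moving the actual package files is not covered.

```python
from pathlib import Path


def move_state_entry(
    state_root: Path, pkgbase: str, src: str, dst: str, arch: str = "x86_64"
) -> str:
    """Move a pkgbase state file between repos and return a commit message."""
    source = state_root / f"{src}-{arch}" / pkgbase
    target = state_root / f"{dst}-{arch}" / pkgbase
    if not source.is_file():
        raise FileNotFoundError(f"{pkgbase} is not in [{src}] ({arch})")
    if target.exists():
        raise FileExistsError(f"{pkgbase} already exists in [{dst}] ({arch})")
    target.parent.mkdir(parents=True, exist_ok=True)
    source.rename(target)
    # Message format copied from the experimental repository's log.
    return f"db-move: moved {pkgbase} from [{src}] to [{dst}] ({arch}):"
```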

devtools

The current changes can be found in Foxboron/devtools.

Changes:

  • Ensure signed tags are created with archrelease
  • Ensure a checksummed PKGBUILD matches the one in the built package.
  • Ensure .SRCINFO is created and committed.
  • Enforce commits from the master branch.
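The PKGBUILD checksum comparison could be done roughly as follows. This plan does not say where the checksum is read from; the sketch assumes the `pkgbuild_sha256sum` field that makepkg records in the package's .BUILDINFO file. The function names are hypothetical.

```python
import hashlib


def buildinfo_pkgbuild_sha256(buildinfo_text: str) -> str:
    """Extract pkgbuild_sha256sum from the text of a .BUILDINFO file."""
    for line in buildinfo_text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep and key.strip() == "pkgbuild_sha256sum":
            return value.strip()
    raise ValueError("no pkgbuild_sha256sum entry found")


def pkgbuild_matches(pkgbuild_bytes: bytes, buildinfo_text: str) -> bool:
    """Check that the PKGBUILD in git is the one the package was built from."""
    actual = hashlib.sha256(pkgbuild_bytes).hexdigest()
    return actual == buildinfo_pkgbuild_sha256(buildinfo_text)
```

The caller would read the PKGBUILD from the tagged commit and the .BUILDINFO from the built package tarball.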

abs

For users to fetch PKGBUILDs from our package repository we have a few options.

  • We can drop frontend tooling and point people towards the gitlab interface
  • Rewrite asp to work with the new structure and simplify the script
  • Provide a new script

Currently, no decision has been made on which option we are going with.

Testing

Testing server: repos.linderud.dev

Deployment repository: Foxboron/dbscripts

Currently all accounts are set up as orion.

Migration

For every directory in our svn repository (svn-community/svn-packages), a git repository has to be created. Anthraxx has created an svn-to-git migration script. Manjaro also already mirrors our svn2git repository on Gitlab, and we are considering using their script for this migration. [1]

Open points

  • How do we populate the dbscripts database?
  • How do we tag previous releases?
  • Code search - Do we want to construct a monorepo to facilitate repo-wide grepping?

Software requiring migration

Open-ended issues

Gemini git cache

Currently we are cloning all required git repositories upon dbscripts usage. Because this is a network operation there are concerns this would be slow when dealing with larger package migrations. The ideal solution would be to have a mirror of all git repositories locally on gemini and ensure these are updated when new commits are pushed to gitlab.
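One way to keep such a local mirror is a bare `git clone --mirror` per package that is refreshed with `git remote update`. The sketch below only builds the command to run; the function name and the idea of one mirror directory per package are assumptions, and the trigger that runs it after a push to gitlab is left out.

```python
from pathlib import Path


def mirror_command(url: str, mirror_dir: Path) -> list[str]:
    """Return the git command that creates or refreshes a local mirror."""
    if mirror_dir.is_dir():
        # Mirror exists: fetch new commits and drop refs deleted upstream.
        return ["git", "-C", str(mirror_dir), "remote", "update", "--prune"]
    # First use: create a bare mirror clone of the canonical repository.
    return ["git", "clone", "--mirror", url, str(mirror_dir)]
```

dbscripts could then read from the mirror directory instead of cloning over the network on every invocation.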

Security

This is the list of intended security goals of this migration:

  • Enforce git signed commits
  • Verify git signed commits in a git hook and against the keys in the pacman keyring.
  • Disallow force pushes
  • Disallow deploy keys - relevant as we need to ensure malicious commits are not present.
  • Use git tags to match the checksum of the uploaded PKGBUILD with the built package
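A server-side hook covering the first three goals might be built from the pieces sketched below. A pre-receive hook receives one `<old> <new> <ref>` line per updated ref on standard input; the helpers turn such a line into the git commands that list the pushed commits, verify each signature, and detect a non-fast-forward update. All function names are hypothetical, SHA-1 object names are assumed, and pointing GnuPG at the pacman keyring is not shown.

```python
ZERO = "0" * 40  # object name git uses for "ref does not exist" (SHA-1 repos)


def parse_ref_update(line: str) -> tuple[str, str, str]:
    """Split one pre-receive input line into (old, new, ref)."""
    old, new, ref = line.split()
    return old, new, ref


def rev_list_command(old: str, new: str) -> list[str]:
    """Command listing the commits a push would add to the repository."""
    if new == ZERO:
        raise ValueError("ref deletion: there are no new commits to check")
    if old == ZERO:
        # New ref: everything not already reachable from existing refs.
        return ["git", "rev-list", new, "--not", "--all"]
    return ["git", "rev-list", f"{old}..{new}"]


def verify_commit_command(commit: str) -> list[str]:
    """Command that fails unless the commit carries a valid signature."""
    return ["git", "verify-commit", commit]


def fast_forward_command(old: str, new: str) -> list[str]:
    """Command that exits non-zero when old is not an ancestor of new."""
    return ["git", "merge-base", "--is-ancestor", old, new]
```

The hook would reject the push if any `verify-commit` call fails or if the fast-forward check exits non-zero for an existing ref.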