User:Zatricky/DeltaMirrorFromNonDelta
Overview
Make a mirror available with deltas when the source repository does *not* provide deltas.
After basic testing, this procedure will be turned into a bash/sh script.
Also see:
https://wiki.archlinux.org/index.php/Deltup
https://github.com/sabooky/archdelta
https://bbs.archlinux.org/viewtopic.php?id=92085
Summary of procedure
On first sync:
- 1. configure a separate mirror path for the delta *version* of the mirror
- 2. use the standard rsync method to an “offline” path
- 3. configure hard links from the delta mirror to the same locations in the main mirror only for package files
- 4. use repo-add to add all .pkg files to new $repo.db files. Open questions: how does the signing work (does it create a new .sig for each package, or does it only sign the .db)? What are the security implications of signing previously unsigned packages, and can invalid packages/signatures be rejected automatically?
Subsequent syncs:
- 5. (As with 2 above). Complete normal sync to the offline path.
- 6. (As with 3 above). Sync hard links from the delta mirror to the same locations in the main mirror only for package files
- 7. use repo-add to add only the new files. Use the -d switch to automatically create deltas (as above: are these signed?)
- 8. remove old .pkg files
- 9. remove old deltas? Find an easy/KISS way to determine when old deltas have grown so large that pacman won’t bother using them. Can repo-* do this part automatically? If it does, would it *only* remove the db entry, or would it also delete the .delta file?
- 10. update lastsync
Questions still unanswered:
By creating lots of .delta files in /$repo/os/$arch/, might there be duplicates for multilib packages wasting mirror disk space? Should we follow the symlinks in /$repo/os/$arch/ to /pool/packages/ and mv && ln -s as is done with the .pkg files?
What is repo-elephant?? lol.
Detail
1. configure a separate mirror path for the delta version of the mirror without the dbs:
mkdir -p /srv/ftp/pub/archlinux-delta
2. use standard rsync method with delete:
rsync -rtlvH --delete-after --delay-updates --safe-links --max-delete=1000 rsync://source-server/archlinux/ /srv/ftp/.hidden/archlinux/
3a. get lists of all new package files for each repo/os/arch:
comm -23 <(cd /srv/ftp/.hidden/archlinux/ ; find . | sort) <(cd /srv/ftp/pub/archlinux-delta ; find . | sort) | grep '\.pkg' > $newlist
comm -13 <(cd /srv/ftp/.hidden/archlinux/ ; find . | sort) <(cd /srv/ftp/pub/archlinux-delta ; find . | sort) | grep '\.pkg' > $oldlist
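comm only gives meaningful results on sorted input, and sort and comm need to agree on the collation order. A minimal sketch of 3a as a reusable helper, using temporary files instead of process substitution so it also runs under plain sh. The function name list_pkg_changes and its argument order are made up for illustration:

```shell
# list_pkg_changes SRC DST NEWLIST OLDLIST   (hypothetical helper name)
# Writes package paths found only under SRC to NEWLIST (new packages)
# and package paths found only under DST to OLDLIST (old packages).
list_pkg_changes() (
    src=$1 dst=$2 newlist=$3 oldlist=$4
    tmp=$(mktemp -d) || exit 1
    # Sort both listings with the same collation that comm will use.
    (cd "$src" && find . | LC_ALL=C sort) > "$tmp/src"
    (cd "$dst" && find . | LC_ALL=C sort) > "$tmp/dst"
    # -23 keeps lines unique to the first file, -13 lines unique to the second.
    # grep exits non-zero when nothing matches, which is not an error here.
    LC_ALL=C comm -23 "$tmp/src" "$tmp/dst" | grep '\.pkg' > "$newlist" || true
    LC_ALL=C comm -13 "$tmp/src" "$tmp/dst" | grep '\.pkg' > "$oldlist" || true
    rm -rf "$tmp"
)
```

Example call: list_pkg_changes /srv/ftp/.hidden/archlinux /srv/ftp/pub/archlinux-delta /tmp/newlist /tmp/oldlist. Note that the pattern also matches .sig files belonging to packages, as in the one-liners above.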
3b. sync package files across via hard links only:
rsync -aplx --link-dest=/srv/ftp/.hidden/archlinux/ --exclude lastsync --exclude '*.db' --exclude '*.db.tar.gz' --exclude '*.abs.tar.gz' --exclude '*.files' --exclude '*.files.tar.gz' /srv/ftp/.hidden/archlinux/ /srv/ftp/pub/archlinux-delta/
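To check that 3b really produced hard links rather than full copies, the two trees can be compared file by file. This is only a sketch: report_unlinked is a made-up name, the '*.pkg.tar.*' pattern is an assumption about package file naming, and OFFLINE must be given as an absolute path because the function changes directory:

```shell
# report_unlinked OFFLINE DELTA   (hypothetical helper name)
# Prints every regular package file under DELTA that does not share an
# inode with the file at the same relative path under OFFLINE.
report_unlinked() (
    offline=$1 delta=$2
    cd "$delta" || exit 1
    # -type f skips the symlinks in $repo/os/$arch/ and checks real files only.
    find . -type f -name '*.pkg.tar.*' | while IFS= read -r f; do
        # -ef is true when both names refer to the same device and inode.
        [ "$f" -ef "$offline/$f" ] || printf '%s\n' "$f"
    done
)
```

Empty output means every package file in the delta mirror is a hard link into the offline path and therefore uses no additional disk space.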
4. Create new $repo.db as appropriate:
cd /srv/ftp/pub/archlinux-delta/$repo/os/$arch/ ; repo-add -s -k <key> $repo.db.tar.gz *.pkg{,.tar{,.{bz2,gz,xz,Z}}}
5. As with 2 above
6. As with 3 above
7. For each applicable .pkg in newlist, run repo-add with “-d” to automatically create a delta:
cd /srv/ftp/pub/archlinux-delta/$repo/os/$arch/ ; repo-add -s -k <key> -d $repo.db.tar.gz <package.pkg.ext>
The original (upstream) $repo.db cannot be reused on each sync, because it does not contain any delta data.
8a. remove all old package files:
cd /srv/ftp/pub/archlinux-delta ; for i in $(grep 'pool/packages' "$oldlist") ; do rm -f "$i" ; done
AND/OR??
find /srv/ftp/pub/archlinux-delta/pool/packages/ -type f -links 1 -delete
Both methods have edge cases where files might be “missed”; however, each method's edge cases are covered by the other.
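Since the two approaches in 8a cover each other's edge cases, a dry run that lists the union of both before anything is deleted may be worthwhile. The sketch below assumes that the old-package list holds paths relative to the mirror root (as find . would print them) and that a link count of 1 means the file has already disappeared from the offline path. stale_packages is a hypothetical name:

```shell
# stale_packages DELTA_ROOT OLDLIST   (hypothetical helper name)
# Prints, relative to DELTA_ROOT, every pool file that either method
# would remove. Nothing is deleted.
stale_packages() (
    root=$1 oldlist=$2
    {
        # Method 1: pool entries that the comparison step listed as old.
        grep 'pool/packages' "$oldlist" | sed 's|^\./||'
        # Method 2: pool files no longer hard-linked from the offline path.
        (cd "$root" && find pool/packages -type f -links 1)
    } | LC_ALL=C sort -u
)
```

Once the list looks right, it could be fed to rm from within the mirror root, for example with xargs.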
8b. remove all broken symlinks:
find -L /srv/ftp/pub/archlinux-delta -type l -delete
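With -L, find follows symlinks, so -type l can only match links whose target does not exist. Dropping -delete turns the command into a preview, roughly as in this sketch (list_broken_links is a made-up name):

```shell
# list_broken_links DIR   (hypothetical helper name)
# Prints symlinks under DIR whose target is missing, without deleting them.
list_broken_links() {
    # Because -L resolves every working symlink to its target, anything
    # still reported as type "l" is a link that could not be resolved.
    find -L "$1" -type l
}
```

Running list_broken_links /srv/ftp/pub/archlinux-delta before 8b shows which $repo/os/$arch/ entries lost their pool file in 8a.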
9. remove old irrelevant deltas:
cd /srv/ftp/pub/archlinux-delta/$repo/os/$arch/ ; find . -ctime +90 -name '*.delta' -exec repo-remove $repo.db.tar.gz {} \;
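Age is only a rough proxy for usefulness. For the open question of when a delta is too large for pacman to bother with, a size comparison might be a better test. The sketch below assumes a cutoff of 70% of the full package size, which to my knowledge matches the default UseDelta ratio in pacman.conf; treat that number as an assumption. It also only looks at a single delta, whereas pacman may weigh a whole chain of deltas. delta_is_oversized is a hypothetical name:

```shell
# delta_is_oversized DELTA PKG [PERCENT]   (hypothetical helper name)
# Exit status 0 when DELTA is larger than PERCENT% of PKG (default 70,
# an assumed cutoff), 1 when it is not, 2 when a file cannot be read.
delta_is_oversized() (
    delta_size=$(wc -c < "$1") || exit 2
    pkg_size=$(wc -c < "$2") || exit 2
    percent=${3:-70}
    # Integer arithmetic only: compare delta*100 against pkg*percent.
    [ $((delta_size * 100)) -gt $((pkg_size * percent)) ]
)
```

A pruning loop could call this for each .delta together with the package it upgrades to, and only pass the oversized ones on for removal.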
10. Update lastsync:
rsync /srv/ftp/.hidden/archlinux/lastsync /srv/ftp/pub/archlinux-delta/