
DeveloperWiki talk:NewMirrors

From ArchWiki

Reasonable success rate

Would it be too much to document what counts as a reasonable success rate for Tier 1 vs Tier 2 mirrors? There is currently a ticket about this, and it is not clear what to do in those situations. Experience will tell, but it would be nice to clear that up.

Also what's a reasonable amount of time for a Tier 2 to stay out of sync?

An example: https://archlinux.org/mirrors/osbeck.com/ syncs from https://archlinux.org/mirrors/f4st.host/. Both are out of sync, but I'm not sure it's reasonable to complain about the Tier 2 when its Tier 1 is out of sync, and it would be good to document how long an out-of-sync state is allowed to persist. Relevant bug example: https://bugs.archlinux.org/task/72507

Torxed (talk) 22:19, 16 February 2023 (UTC)

Maintainer list

Maybe we should add the current mirror maintainers to the wiki page?

Torxed (talk) 22:23, 16 February 2023 (UTC)

Respond to rate limiting and other issues

Another issue category that appears to pop up quite frequently is rate limiting that results in 403 responses, or simply slow traffic that results in transfers being terminated.

We should document the guidelines here.

Suggested limits (if any)

  • Allow for a conservative 4 parallel downloads in pacman
  • Allow for at least 100-200 packages per minute, per IP.
  • No active throttling below the given bandwidth definition
  • Require directory listing to be allowed on the mirror URL

Torxed (talk) 16:10, 19 February 2023 (UTC)
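
To make the suggested per-IP limit concrete, here is a minimal sketch of a sliding-window counter that admits up to 200 requests per minute per IP. The class name, the 200-per-60-seconds defaults and the example addresses are illustrative assumptions, not an agreed policy or any mirror's actual configuration.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client IP."""

    def __init__(self, limit=200, window=60.0):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip, now):
        """Return True if a request from `ip` at time `now` (seconds) is admitted."""
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit; the caller decides how to respond
        hits.append(now)
        return True


# Hypothetical usage with a tiny limit so the effect is visible.
limiter = SlidingWindowLimiter(limit=3, window=60.0)
print([limiter.allow("192.0.2.1", t) for t in (0, 1, 2, 3)])
```

A limit expressed this way counts requests rather than bytes, so it would complement, not replace, the bandwidth point in the list above.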

Rework Tier model/requirements

As outlined in FS bug report https://bugs.archlinux.org/task/71616, there are a few countries with several Tier 1 mirrors. The goal of the tier model was to offload the Tier 0 mirror. Given that a lot of countries now have several Tier 1 mirrors, the load on Tier 0 has grown significantly too.

We should also improve the general wiki entry, covering both how to set up a mirror and what the requirements are, as well as how to optimize mirrors for greater speed, as mentioned in the FS bug.

This is a discussion that we'll need to have before taking action.

Torxed (talk) 16:43, 19 February 2023 (UTC)

Automatic HTTPS addition?

I'm starting this discussion as noted in https://bugs.archlinux.org/task/66312. Several mirrors support HTTPS but only list HTTP.

I don't see this as urgent, but what should we do here? Some arguments I see against adding HTTPS for the mirrors that support it:

  1. Enabling HTTPS based on reports like this might put additional strain on the mirror, which the mirror admin did not agree to.
  2. The mirror admin might not actively maintain certificates, and therefore might not wish to commit to HTTPS support.

Benefits would be less risk of being monitored if the user is afraid of such things. But then again, they could opt out of HTTP mirrors in those cases.

Ideas?

Torxed (talk) 18:17, 4 March 2023 (UTC)

I agree on both points. My approach would be to batch email people who have HTTPS on their mirrors but not in their mirror entries. That way they are aware it's being used. --Pitastrudl (talk) 19:34, 14 September 2023 (UTC)
From my POV (tier 2 mirror) - I agree with both points, especially regarding the admin not agreeing to that traffic. The best approach would be to contact mirror admins and ask them to confirm (one-click link in email, like the email confirmation for new signups?) that the TLS entry should be added to the official mirror list. If they do not confirm within (let's say) two weeks, the entry is not added.
Implicitly assuming that the admin agrees with this change is - IMO - wrong. While I agree that HTTPS should always be supported and be the default option, I think that it should not be changed automatically without the admin agreeing. Aljaxus (talk) 11:21, 13 November 2023 (UTC)
I think it is also worth discussing whether it even makes sense to accept mirrors without at least HTTPS in 2023. Is there any reason for not raising the bar? Klausenbusk (talk) 03:02, 31 December 2023 (UTC)
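
Whichever approach is chosen, the first step is finding the mirrors that list HTTP but not HTTPS. A minimal sketch of that filter follows; the function name and the example URLs are hypothetical, and it only identifies candidates from the listed URLs. It does not probe whether a host actually serves HTTPS, and it says nothing about the admin's consent.

```python
from urllib.parse import urlsplit


def http_only_hosts(mirror_urls):
    """Return hosts that are listed with an http:// URL but no https:// URL."""
    schemes_by_host = {}
    for url in mirror_urls:
        parts = urlsplit(url)
        if parts.hostname is None:
            continue  # skip entries that do not parse as URLs with a host
        schemes_by_host.setdefault(parts.hostname, set()).add(parts.scheme)
    return sorted(
        host
        for host, schemes in schemes_by_host.items()
        if "http" in schemes and "https" not in schemes
    )


# Hypothetical mirror list entries, for illustration only.
listed = [
    "http://mirror.example.org/archlinux/",
    "https://mirror.example.org/archlinux/",
    "http://mirror.example.net/arch/",
    "rsync://mirror.example.net/arch/",
]
print(http_only_hosts(listed))
```

The resulting host list could then feed the batch email or confirmation-link idea discussed above.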

Add Expected Monthly Bandwidth

A forum thread from 2007 reports 6 TB of bandwidth usage by ftp.archlinux.org.

I'm hesitant to host a public tier 2 without having some rough estimate of monthly bandwidth use.

--Port19 (talk) 11:25, 6 September 2023 (UTC)

ftp.archlinux.org is one of the main servers and is probably pulled from by a lot of people, so its bandwidth is not comparable to that of a tier 2 mirror. It also depends heavily on where your mirror is located geographically and how many Arch users there are in your area. You could ask on the mirrors@archlinux.org mailing list for some statistics from other mirror owners.
--Pitastrudl (talk) 19:38, 14 September 2023 (UTC)
I can provide the numbers for the mirrors managed by Arch Linux's DevOps team. These figures are from the last 30 days and do not fluctuate much. The numbers should be seen as outliers, as geo.mirror.pkgbuild.com is used by e.g. the official Arch Linux container image, and we suspect a not insignificant amount of the traffic comes from CI jobs running at the big providers (GitHub, GitLab etc.).
Mirror                           Bandwidth
london.mirror.pkgbuild.com       28 TiB
america.mirror.pkgbuild.com      83 TiB
asia.mirror.pkgbuild.com         19 TiB
europe.mirror.pkgbuild.com       52 TiB
seoul.mirror.pkgbuild.com        8 TiB
sydney.mirror.pkgbuild.com       13 TiB
geo.mirror.pkgbuild.com (total)  203 TiB
Klausenbusk (talk) 03:24, 31 December 2023 (UTC)
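
For anyone sizing a link, the 30-day totals in the table above can be turned into an average throughput. The sketch below does that arithmetic; the helper name is made up for illustration, and an average hides peaks, so treat the result as a lower bound on what a link would need to sustain.

```python
TIB = 2**40  # bytes in one tebibyte

# 30-day traffic in TiB, copied from the table above.
traffic_tib = {
    "london": 28,
    "america": 83,
    "asia": 19,
    "europe": 52,
    "seoul": 8,
    "sydney": 13,
}


def average_mbit_per_s(tib, days=30):
    """Average throughput in Mbit/s for `tib` TiB transferred over `days` days."""
    return tib * TIB * 8 / (days * 86400) / 1e6


total = sum(traffic_tib.values())  # should match the "total" row of the table
for name, tib in traffic_tib.items():
    print(f"{name:8s} {average_mbit_per_s(tib):7.1f} Mbit/s")
print(f"{'total':8s} {average_mbit_per_s(total):7.1f} Mbit/s")
```

The per-region values sum to the 203 TiB given for geo.mirror.pkgbuild.com, which works out to an average of a little under 700 Mbit/s across all six mirrors.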

Update mirror size

Mirror size has grown since 2023. I am preparing to host an Arch mirror and downloaded the mirror in its entirety from mirror.accum.se just a few minutes ago. These are the current sizes:

32K	./multilib-staging
5.9G	./images
4.6M	./extra-staging
6.7G	./other
132K	./core-staging-debug
4.2M	./core
197M	./extra
456K	./gnome-unstable
3.1M	./multilib-debug
4.0K	./latest
916K	./kde-unstable-debug
152K	./multilib-testing
4.5M	./extra-staging-debug
255M	./wsl
3.0M	./multilib
15G	./archive
2.5M	./core-debug
6.9G	./iso
332K	./core-testing
140K	./multilib-testing-debug
56K	./core-testing-debug
32K	./multilib-staging-debug
69G	./sources
172G	./pool
104M	./extra-debug
184K	./gnome-unstable-debug
2.6M	./extra-testing-debug
340K	./core-staging
992K	./kde-unstable
2.8M	./extra-testing
275G	.

It was a bit of an inconvenience that the sizes are out of date: I made the drive in my virtual machine far too small and had to start all over again. Please update the article with the new sizes so that others do not run into the same problem.

All the best Ted Tewa (talk) 13:26, 18 May 2025 (UTC)

The sizing needs updating for sure, but debug packages do not need to be synced, so that will save you at least half, if not more, of the 275 GB. We should document that as well, since some T1 mirrors might be syncing them too. Pitastrudl (talk) 18:20, 18 May 2025 (UTC)
Oh that's nice. I'll tweak my rsync script accordingly. Thanks for that. We can count at least one T1 that syncs debug stuff, so that's a good point as well. Tewa (talk) 18:30, 18 May 2025 (UTC)
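
To re-check the numbers when updating the article, a listing like the one above can be totalled with a short script. This is a minimal sketch under stated assumptions: the function names are made up, it expects `du -h` style lines with 1024-based K/M/G/T suffixes, and the `-debug` filter only looks at top-level directory names. The listing does not show where the bulk of the debug packages is stored, so this filter alone may not reproduce the saving mentioned above.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a `du -h` style size such as '5.9G' or '32K' to bytes."""
    if text[-1] in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1]])
    return int(text)  # plain byte count without a suffix


def sum_du(lines, skip=lambda path: False):
    """Sum the sizes of the listed entries, ignoring the '.' total line."""
    total = 0
    for line in lines:
        size, path = line.split(None, 1)
        path = path.strip()
        if path == "." or skip(path):
            continue
        total += parse_size(size)
    return total


# A few lines from the listing above; paste the full output to total everything.
sample = ["5.9G\t./images", "2.5M\t./core-debug", "172G\t./pool", "275G\t."]
without_debug = sum_du(sample, skip=lambda p: p.endswith("-debug"))
print(round(without_debug / 1024**3, 1), "GiB")
```

Because `du -h` rounds each entry, a sum of the rounded values will differ slightly from the `.` total.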

No less than hourly sync interval is not enforced

In the Tier 2 requirements, it says:

"Do not sync more often than every hour, but you should sync at least once a day"

This is not really enforced. For example, see New Tier 2 Mirror - Milan, Italy, and the mirror status page sorted by μ Delay. Marbens (talk) 19:20, 23 May 2025 (UTC)
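
If the requirement were to be checked automatically, the quoted rule reduces to a simple bound on the time between syncs. The sketch below is only an illustration: the function names and the example timestamps are hypothetical, and it assumes a list of observed sync times is available from somewhere, which the page does not specify.

```python
from datetime import datetime, timedelta

MIN_INTERVAL = timedelta(hours=1)  # "Do not sync more often than every hour"
MAX_INTERVAL = timedelta(days=1)   # "you should sync at least once a day"


def interval_ok(interval):
    """True if a sync interval satisfies the quoted Tier 2 requirement."""
    return MIN_INTERVAL <= interval <= MAX_INTERVAL


def violations(sync_times):
    """Return consecutive sync pairs whose spacing breaks the requirement."""
    ordered = sorted(sync_times)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if not interval_ok(b - a)]


# Hypothetical sync times: the second sync follows the first after 15 minutes.
start = datetime(2025, 5, 23, 0, 0)
observed = [start, start + timedelta(minutes=15), start + timedelta(hours=2)]
for earlier, later in violations(observed):
    print("interval out of bounds:", later - earlier)
```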