User:Cactus/Aurscape

From ArchWiki
Revision as of 17:51, 23 July 2005 by Cactus (Talk | contribs)


aurscrape Contributed by: Aaron Griffin

http://bbs.archlinux.org/viewtopic.php?t=12037

Seeing as cvsup isn't working with the AUR yet, I made a little script to download the entire set of AUR package directories. Yeah, it takes a long time, since it goes over HTTP and scrapes Apache's directory listing format, but it works.

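#!/bin/bash
# AUR web scraping to get all PKGBUILDs
# Aaron Griffin [phrakture]
# Uses bash features (function, local, [[ ]]), hence the bash shebang.

BASEDIR="$HOME/aur"
PKGURL="http://aur.archlinux.org/packages/"
# wget saves a directory listing under this name
PKGFILE="index.html"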

# get_dir http://www.xyz.com/a/
# Recursively fetch every file listed in an
# Apache-formatted directory listing.
# The URL must end in a slash: wget then saves the listing
# as $PKGFILE, and entry names are appended directly to $1.
function get_dir()
{
   local thisdir=$(basename "$1")

   if [[ -n "$thisdir" ]]; then
      mkdir -p "$thisdir"
      cd "$thisdir" || return 1
      wget -q "$1"
      if [[ $? -eq 0 ]]; then
         # plain files are marked with the "[   ]" icon placeholder
         local files=$(grep "\[   \]" "$PKGFILE" |
                       sed 's@.*href="\([^"]*\)".*@\1@')
         # skip the parent dir to avoid infinite recursion
         local dirs=$(grep "\[DIR\]" "$PKGFILE" |
                      grep -v "Parent Directory" |
                      sed 's@.*href="\([^"]*\)".*@\1@')
         rm "$PKGFILE"

         for f in $files; do
            echo "downloading $thisdir::$f"
            wget -q "$1$f"
         done

         for d in $dirs; do
            get_dir "$1$d"
         done
      else
         echo "error downloading directory list : $1"
      fi
      # always step back out, even when the listing download failed
      cd ..
   else
      echo "usage: get_dir <apache url>"
   fi
}

# stop here if the download directory is missing
cd "$BASEDIR" || exit 1
[[ -f "$PKGFILE" ]] && rm -f "$PKGFILE"

get_dir "$PKGURL"
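
To see what the scraping step does without hitting the network, here is a rough sketch that runs the same kind of grep/sed pipeline over a made-up listing. The sample HTML and the helper name extract_hrefs are assumptions for illustration only; the real AUR listing may be laid out differently. It uses grep -F (fixed strings) instead of the escaped brackets in the script, so the markers need no quoting tricks.

```shell
#!/bin/sh
# Hypothetical sample of an Apache-style directory listing
# (written by hand for this example, not fetched from the AUR).
listing='<img src="/icons/back.gif" alt="[DIR]"> <a href="/packages/">Parent Directory</a>
<img src="/icons/folder.gif" alt="[DIR]"> <a href="patches/">patches/</a>
<img src="/icons/unknown.gif" alt="[   ]"> <a href="PKGBUILD">PKGBUILD</a>
<img src="/icons/unknown.gif" alt="[   ]"> <a href="foo.install">foo.install</a>'

# extract_hrefs MARKER (hypothetical helper)
# Reads a listing on stdin and prints the href target of every
# line containing MARKER, skipping the parent directory link.
extract_hrefs()
{
   grep -F "$1" |
   grep -v "Parent Directory" |
   sed 's@.*href="\([^"]*\)".*@\1@'
}

files=$(printf '%s\n' "$listing" | extract_hrefs "[   ]")
dirs=$(printf '%s\n' "$listing" | extract_hrefs "[DIR]")

# Like the script above, rely on word splitting to walk the names.
for f in $files; do
   echo "file: $f"
done

for d in $dirs; do
   echo "dir:  $d"
done
```

The directory names keep their trailing slash, which is what lets the script build the next URL by plain concatenation.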