Difference between revisions of "User:Cactus/Aurscape"

 
[[Category:Scripts]]
 

Revision as of 17:51, 23 July 2005


aurscrape Contributed by: Aaron Griffin

http://bbs.archlinux.org/viewtopic.php?t=12037

Seeing as cvsup isn't working with the AUR yet, I made a little script to download the entire set of AUR package directories. It takes a long time, because it works over HTTP and scrapes Apache's directory listing format, but it works.

#!/bin/bash
# AUR web scraping to get all PKGBUILDs
# Aaron Griffin [[phrakture]]

BASEDIR="$HOME/aur"
PKGURL="http://aur.archlinux.org/packages/"
PKGFILE="index.html"

# get_dir http://www.xyz.com/a/
# This function recursively gets all files listed in an
# Apache-formatted directory list. The URL must end in a slash,
# because file and directory names are appended to it directly.
function get_dir()
{
   local thisdir=$(basename "$1")

   if [[ -n "$thisdir" ]]; then
      mkdir -p "$thisdir"
      cd "$thisdir" || return 1
      if wget -q "$1"; then
         # entries whose icon alt text is "[   ]" are plain files
         local files=$(grep '\[   \]' "$PKGFILE" |
                       sed 's@.*href="\([^"]*\)".*@\1@')
         # skip the parent dir to avoid infinite recursion
         local dirs=$(grep '\[DIR\]' "$PKGFILE" |
                      grep -v 'Parent Directory' |
                      sed 's@.*href="\([^"]*\)".*@\1@')
         rm "$PKGFILE"

         for f in $files; do
            echo "downloading $thisdir::$f"
            wget -q "$1$f"
         done

         for d in $dirs; do
            get_dir "$1$d"
         done
      else
         echo "error downloading directory list : $1"
      fi

      # always return to the parent, even if the listing failed
      cd ..
   else
      echo "usage: get_dir <apache url>"
   fi
}

mkdir -p "$BASEDIR"
cd "$BASEDIR" || exit 1
[[ -f $PKGFILE ]] && rm -f "$PKGFILE"

get_dir "$PKGURL"
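
To see what the scraping step does without hitting the network, here is a rough offline sketch of the same grep/sed idea run against a made-up listing. The sample HTML and the `extract_href` helper are assumptions for illustration only; real Apache markup differs between versions, so the patterns may need adjusting.

```shell
#!/bin/bash
# Hypothetical sample of an Apache-style directory listing
# (invented for illustration, not captured from a real server).
listing=$(cat <<'EOF'
<img src="/icons/back.gif" alt="[DIR]"> <a href="/packages/">Parent Directory</a>
<img src="/icons/folder.gif" alt="[DIR]"> <a href="patches/">patches/</a>
<img src="/icons/unknown.gif" alt="[   ]"> <a href="PKGBUILD">PKGBUILD</a>
<img src="/icons/unknown.gif" alt="[   ]"> <a href="foo.install">foo.install</a>
EOF
)

# extract_href (hypothetical helper): reduce each input line
# to the value of its href attribute.
extract_href() {
   sed 's@.*href="\([^"]*\)".*@\1@'
}

# Plain files are the lines whose alt text is "[   ]".
files=$(printf '%s\n' "$listing" | grep '\[   \]' | extract_href)

# Directories are the "[DIR]" lines, minus the parent link.
dirs=$(printf '%s\n' "$listing" | grep '\[DIR\]' |
       grep -v 'Parent Directory' | extract_href)

printf 'file: %s\n' $files
printf 'dir:  %s\n' $dirs
```

With this sample, `files` ends up holding `PKGBUILD` and `foo.install`, and `dirs` holds only `patches/`. The parent link is filtered out, which is what keeps the recursion from climbing back up the tree.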