wenku8_fetch_se

From NAT, 7 Years ago, written in Bash, viewed 842 times. This paste is a reply to wenku8_fetch from NAT
URL https://code.nat.moe/view/cefa1fdc/diff
Viewing differences between wenku8_fetch and wenku8_fetch_se
#!/bin/bash
# A tool to download all novels on wenku8.com, the second edition. Minor bugs fixed.
#
# - Use fallback URL to fetch the real address to ensure that all the novels are downloaded.
# - Removed '?' after the file name.

STORE_PATH=./save/
FETCH_URL='http://www.wenku8.com/modules/article/articlelist.php?page='
DOWNLOAD_TYPE="utf8"
DOWNLOAD_URL="http://dl.wenku8.com/txt$DOWNLOAD_TYPE/__K/__ID.txt"
DOWNLOAD_FALLBACK="http://dl.wenku8.com/down.php?type=$DOWNLOAD_TYPE&id="
TEMP=temp.tmp
FROM=1
TO=93
TIMESTEMP_FORMAT="%H:%M:%S"

for page in $(seq $FROM $TO)
do
        echo "[$(date +$TIMESTEMP_FORMAT)] Starting page $page..."
        curl $FETCH_URL$page 2> /dev/null > $TEMP
        cat $TEMP | iconv -f gbk -t utf-8 | grep 'font-size:13px;' | sed -e 's/.*book\///g; s/.htm">/ /g; s/<\/a><\/b>//g;' > title$TEMP
        _ids=$(cat $TEMP | iconv -f gbk -t utf-8 | grep 'font-size:13px;' | sed -e 's/.*book\///g; s/\.htm.*//g')
        for novel in $_ids
        do
                echo "[$(date +$TIMESTEMP_FORMAT)] Downloading $(cat title$TEMP|grep $novel)"
                _this_url="$(echo $DOWNLOAD_URL|sed -e "s/__K/1/; s/__ID/$novel/;")"
                _this_save="$STORE_PATH/$(cat title$TEMP|grep $novel|tr ' ' '_'|dos2unix 2>/dev/null)"
                curl $_this_url > $_this_save 2> /dev/null
                # If the direct URL returned a 404 page, ask the fallback endpoint for the real address.
                [[ ! -z $(cat $_this_save | grep '404 Not Found') ]] && {
                        # Strip the trailing CR from the Location header before building the URL.
                        _this_url="http://dl.wenku8.com$(curl -I "$DOWNLOAD_FALLBACK$novel" 2> /dev/null | grep Location | tr -d '\r' | awk -F:\  '{print $2}')"
                        curl $_this_url > $_this_save 2> /dev/null
                }
        done
done
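
The two changes listed in the header comment (resolving the real address through the fallback URL, and removing the stray '?' from file names) can be isolated into small helpers. What follows is only a sketch, not part of the original paste: the function names `extract_location` and `sanitize_name` are hypothetical, and it assumes the stray '?' is a carriage return left over from the page's CRLF line endings.

```shell
#!/bin/bash
# Hypothetical helpers (assumed names, not from the original script).

# Read raw HTTP response headers on stdin and print the Location value.
# Header lines end in CRLF, so carriage returns are removed first.
extract_location() {
    tr -d '\r' | awk -F': ' 'tolower($1) == "location" { print $2; exit }'
}

# Turn an "id title" line into a file name: drop any carriage return
# (assumed to be what shows up as '?') and replace spaces with underscores.
sanitize_name() {
    printf '%s' "$1" | tr -d '\r' | tr ' ' '_'
}
```

In the script above, `extract_location` could take the output of `curl -I "$DOWNLOAD_FALLBACK$novel"` on stdin, and `sanitize_name` could be applied to each line of the title file before it is joined to `$STORE_PATH`.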

Replies to wenku8_fetch_se

Title Name Language When
wenku8_fetch_te NAT bash 7 Years ago.
