
How can I download the Vala site?

November 23, 2008 — BarryK
I want all of this site available locally:

I can't find it packaged up anywhere. Does anyone know if there is some way, some kind of spider thing, that can get it all?


Website Get
Username: Dougal
wget? Also, there's fancy stuff like httrack, which I think will convert the links etc.

Username: Leon

wget usage
Username: eprv
All you need is here

Username: kirk
I needed to do the same thing a couple of days ago, so I downloaded httrack. Here's a pet package if you want to try it: It was compiled in my pupplet, so hopefully there are no missing dependencies.

Username: kirk
Eric's server is over its limit, try this:

Teleport Pro with WINE
Username: happypuppy
If you don't mind running a Windows app under WINE, you can use 'Teleport Pro' instead - the best app (on any platform) for this kind of stuff. (I wish there were a native Linux version of this app.)

Using wget
Username: BarryK
Yeah, but there's a problem. I did try wget, like this: # wget -r The '-r' means recursive, and it does get all the pages. The problem is that the links don't work. The site has a server-side script that interprets the query on the end of the URL and serves the correct page. wget downloads the pages and names them according to that query, but the naming doesn't work for local links. That is, clicking a link in one of the pages should bring up another local page, but it doesn't. What is needed is a clever downloader that renames the links and the pages so that they will work locally. I don't know if there is such a thing.

Clever Downloader
Username: Dougal
As I mentioned, I think httrack gives you that option. Otherwise, use sed.

k parameter for local links
Username: Dingo
Adding the [b]-k[/b] parameter should rewrite the links locally. Have you already tried it? wget -r -k

Re: wget
Username: muggins
I should have added the real address: wget -m

Username: kirk
Ya, httrack is very slick. I used it to copy a website with tons of links and flash videos.

Broken links
Username: BarryK
I should have been more specific about the problem in the original post. It doesn't matter whether I use -k, -m, -r or whatever with wget, nor does httrack help. The links in the downloaded pages are broken. As Dougal commented, I'll have to use sed. wget names the pages like this:

index.html?path=cairo::Cairo::Path
index.html?path=cairo::Cairo::PathData
index.html?path=cairo::Cairo::PathDataHeader
index.html?path=cairo::Cairo::PathDataPoint
index.html?path=cairo::Cairo::PathDataType

and the links in the pages are named accordingly. But the browser doesn't like links to pages with those weird characters in them. I can probably fix it by doing a batch rename of all the files, then using sed to rename all the links in the pages.
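The batch-rename-plus-sed idea could be sketched roughly like this. The sample file and link are made up to mimic the filenames wget produced, and the "__" separator is an arbitrary choice of mine, not anything from the thread:

```shell
# Simulate one page as wget saved it (illustrative file and link):
printf '<a href="index.html?path=cairo::Cairo::PathData">PathData</a>\n' \
    > 'index.html?path=cairo::Cairo::Path'

# 1. Rename each "index.html?path=X" file to "X.html", turning "::" into "__"
for f in 'index.html?path='*; do
    new="$(printf '%s' "$f" | sed -e 's/^index\.html?path=//' -e 's/::/__/g').html"
    mv "$f" "$new"
done

# 2. Rewrite the links inside every downloaded page to match the new names:
#    strip the "index.html?path=" prefix and append ".html", then loop
#    (label :a, branch ta) replacing each "::" inside an href with "__"
for page in *.html; do
    sed -i -e 's/index\.html?path=\([^"]*\)"/\1.html"/g' \
           -e ':a' -e 's/\(href="[^"]*\)::/\1__/' -e 'ta' "$page"
done

grep 'href' cairo__Cairo__Path.html
# -> <a href="cairo__Cairo__PathData.html">PathData</a>
```

A real run would need more care (links may use relative paths or appear in src attributes too), but the shape of the fix is the same: one pass over the filenames, one pass over the pages.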

more wget
Username: us
wget -r -k -E -l 8 http://www... (-r: recursive, -k: convert links, -E: save pages with .html extensions, -l 8: recurse 8 levels)

Username: kirk
Barry, did you try the latest version of httrack? I just downloaded 125MB of the valadoc site, and all the links I tried worked, though I didn't download the whole site.

Re: httrack
Username: BarryK
kirk, thanks for testing httrack. Ok, I'll give it a go. I did start to use the .pet you uploaded, but the links seemed wrong. Anyway, I stopped due to the limited monthly bandwidth on my home satellite connection. I'll be in Perth by the end of this week. One of my relatives has a 40GB monthly limit, so I'll download from there.

Tags: woof