How can I download the Vala site?

I want all of this site available locally:

http://valadoc.org/

I can't find it packaged up anywhere. Does anyone know if there is some way, some kind of spider thing, that can get it all?


Posted on 23 Nov 2008, 18:07


Comments:

Posted on 23 Nov 2008, 19:04 by Dougal
Website Get
wget?

Also, there's fancy stuff like httrack (http://www.httrack.com/) that I think will change links etc.


Posted on 23 Nov 2008, 19:29 by Leon
DownThemAll!
https://addons.mozilla.org/en-US/firefox/addon/201


Posted on 23 Nov 2008, 19:37 by eprv
wget usage
All you need is here
http://linuxreviews.org/quicktips/wget/


Posted on 23 Nov 2008, 19:41 by Bill St. Clair
wget rocks
Yep, wget will do it for you. And you already have it, in Puppy.


Posted on 23 Nov 2008, 21:33 by kirk
httrack
I needed to do the same thing a couple days ago, so I downloaded Httrack. Here's a pet package if you want to try it:

http://puppylinux.ca/tpp/kirk/httrack-3.43.1-i486.pet

It was compiled in my pupplet, so hopefully there are no missing dependencies.


Posted on 23 Nov 2008, 21:46 by kirk
httrack
Eric's server is over its limit, try this:

http://www.filefactory.com/file/e266f9/n/httrack-3_43_1-i486_pet


Posted on 24 Nov 2008, 13:09 by happypuppy
Teleport Pro with WINE
If you don't mind running a Windows app with WINE you can use 'Teleport Pro' instead - the best app (on any platform) for this kind of stuff.

http://files.tenmax.com/Teleport_Pro_Installer.exe

(I wish there was a native Linux version of this app)



Posted on 24 Nov 2008, 18:10 by BarryK
Using wget
Yeah but there's a problem. I did try wget, like this:

# wget -r http://valadoc.org/

The '-r' means recursive, and it does get all the pages. The problem is that the links don't work.

The site has a server script running to interpret the query on the end of the URL and serve the correct page. wget downloads the pages and names them according to this query, but it doesn't work for local links. That is, clicking a link in one of the pages should bring up another local page, but doesn't.

What is needed is a clever downloader that renames the links and pages so that they will work locally. I don't know if there is such a thing.


Posted on 24 Nov 2008, 18:25 by Dougal
Clever Downloader
As I mentioned, I think Httrack gives you that option.
Else, use sed.


Posted on 24 Nov 2008, 18:29 by Dingo
k parameter for local links
Adding the -k parameter should rewrite the links so they work locally. Have you already tried it?

wget -r -k http://valadoc.org/


Posted on 24 Nov 2008, 18:43 by muggins
wget
wget -m http://www.etc mirrors the address locally, with links converted.


Posted on 24 Nov 2008, 18:47 by muggins
Re: wget
I should have added the real address:

wget -m http://valadoc.org/


Posted on 24 Nov 2008, 21:40 by kirk
httrack
Ya, httrack is very slick. I used it to copy a web site with tons of links and flash videos.



Posted on 25 Nov 2008, 5:53 by BarryK
Broken links
I should have been more specific about the problem in the original post. It doesn't matter whether I use -k, -m, -r or whatever with wget, nor does httrack help.

The links in the downloaded pages are broken.

As Dougal commented, I'll have to use sed. wget names the pages like this:

index.html?path=cairo::Cairo::Path
index.html?path=cairo::Cairo::PathData
index.html?path=cairo::Cairo::PathDataHeader
index.html?path=cairo::Cairo::PathDataPoint
index.html?path=cairo::Cairo::PathDataType

And the links in the pages are named accordingly. But the browser doesn't like links to pages with those weird characters in them. I can probably fix it by doing a batch rename of all the files, then using sed to rewrite all the links in the pages to match.
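
The rename-then-sed plan described above could be sketched roughly as follows. This is only a sketch, not something anyone in the thread actually ran. It assumes the mirrored pages sit in a single directory, that links are written as href="index.html?path=NAME" with double quotes, and that NAME contains nothing trickier than letters, digits and "::". The function name fix_mirror and the new naming scheme (cairo.Cairo.Path.html) are made up for illustration.

```shell
#!/bin/sh
# Sketch: make wget's "index.html?path=a::B::C" pages browsable offline.
# Assumptions (not from the thread): double-quoted href="index.html?path=NAME"
# links, all pages in one directory, NAME made of letters, digits and "::".
fix_mirror() {
    dir=$1

    # 1. Batch rename: "index.html?path=cairo::Cairo::Path" -> "cairo.Cairo.Path.html"
    for old in "$dir"/index.html\?path=*; do
        [ -e "$old" ] || continue            # glob matched nothing
        name=${old##*\?path=}                # keep only the part after "?path="
        new=$(printf '%s' "$name" | sed 's/::/./g').html
        mv -- "$old" "$dir/$new"
    done

    # 2. Rewrite the links in every page to match the new file names.
    #    First expression: drop the "index.html?path=" prefix, append ".html".
    #    Loop (:a ... ta): turn each "::" inside such a link into ".".
    for page in "$dir"/*.html; do
        [ -e "$page" ] || continue
        sed -e 's/href="index\.html?path=\([^"]*\)"/href="\1.html"/g' \
            -e ':a' \
            -e 's/\(href="[^":]*\)::\([^"]*\.html"\)/\1.\2/' \
            -e 'ta' \
            "$page" > "$page.tmp" && mv -- "$page.tmp" "$page"
    done
}
```

It would be called as "fix_mirror valadoc.org" from the directory where wget was run, preferably on a copy of the download first. The closing quote in the first sed expression matters: without it, a link to ...::PathData could be mangled by the rule for ...::Path, since one name is a prefix of the other.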


Posted on 25 Nov 2008, 14:04 by us
more wget
wget -r -k -E -l 8 http://www...

- recursive (-r)
- convert links (-k)
- save pages with an .html extension (-E)
- 8 levels deep (-l 8)
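
For reference, the short flags used in this thread can be spelled out with wget's long option names, which are easier to read in a script. The wrapper below is just a sketch: the function name mirror_site is made up, and the last three options are additions that nobody in the thread tried. Note that -m (--mirror) on its own is shorthand for -r -N -l inf --no-remove-listing and does not convert links, so -k is still needed with it.

```shell
#!/bin/sh
# Sketch: wget mirror command with each option explained.
# mirror_site is a hypothetical helper name, not part of wget.
mirror_site() {
    url=$1
    set -- --recursive --level=8            # -r -l 8: follow links, at most 8 levels deep
    set -- "$@" --convert-links             # -k: rewrite links to point at the local copies
    set -- "$@" -E                          # save HTML pages with an .html suffix
    set -- "$@" --page-requisites           # -p: also fetch images and stylesheets the pages use
    set -- "$@" --no-parent                 # -np: never climb above the starting directory
    # Assumption: on Linux this makes wget avoid '?' and ':' in local file
    # names, which may sidestep the awkward names discussed in this thread.
    set -- "$@" --restrict-file-names=windows
    wget "$@" "$url"
}
```

Usage would be "mirror_site http://valadoc.org/". Whether --restrict-file-names=windows actually produces working local links for this particular site is untested here.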


Posted on 26 Nov 2008, 10:33 by kirk
httrack
Barry,

Did you try the latest version of httrack? I just downloaded 125 MB of the valadoc site, and all the links I tried worked, though I didn't download the whole site.




Posted on 26 Nov 2008, 21:00 by BarryK
Re: httrack
kirk,
thanks for testing httrack on valadoc.org. OK, I'll give it a go. I did start to use the .pet you uploaded, but the links seemed wrong. But I stopped anyway due to the limited monthly bandwidth on my home satellite connection.

I'll be in Perth by the end of this week. One of my relatives has a 40GB monthly limit, so I'll download from there.