Jul 01, 2007

Generating Website Snapshots (Thumbnails)

How can I capture a snapshot of a web page and save it as a JPEG or PNG file? Today I briefly looked into this question.

On-line Snapshot Generators

Two classes of snapshot generators are available on the web. BrowserCam and NetRenderer support web site developers who need to verify their web designs on multiple platforms and browser versions. These snapshots are full-size and present exactly what would appear on a user’s screen. NetRenderer describes their solution as follows: “we use a proprietary C# application to control parallel rendering and to generate the virtual screenshot images.” BrowserCam allows you to connect to their rendering machine using VNC.

The second class of web-available snapshot generators offers its service to website authors as a way of enriching their sites’ graphics. The results are typically thumbnails in size, detail, and function. This class of generator is more in line with my possible use for snapshots.

WordPress offers Snap’s snapshot generator on all links within these blogs (to go to their site, hover on any link then click on the SnapShot logo in the lower-right footer). New snapshots are generated relatively quickly (i.e. minutes, not hours).

A similar offering comes from ShrinkTheWeb. Let’s try it on the New York Public Library web site:

[Image: ShrinkTheWeb website thumbnail]

The NYPL thumbnail might have been cached, so I also tried an unlisted site, and the thumbnail appeared within 10 seconds. The engine is usually available and offers parameters to control the size and quality of the snapshot. Unlike Snap’s, the image is directly embeddable in my web page.

If you compare the actual web site to the output of the various thumbnail engines, notice that graphics generated by Java, Flash, or JavaScript may not appear. I would expect BrowserCam to be the most faithful in this regard, since one actually connects to a real browser.

Currently I can’t easily get hold of the image file. I could fetch the image with wget or curl, but that’s not the same as generating the image directly. And Snap displays the image only in a pop-up.
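
The wget or curl route can be sketched in a few lines of Python. This is only a minimal sketch, not any service's official client: the names fetch_thumbnail and sniff_image_format are made up for illustration, and image_url is assumed to be the address of an already rendered thumbnail, such as the one a thumbnail service embeds in a page. The sketch downloads the bytes and checks the leading file signature to tell whether the result is a JPEG or a PNG.

```python
import urllib.request

# Leading bytes that identify the two formats of interest.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def sniff_image_format(data):
    """Return 'png', 'jpeg', or None based on the leading file signature."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return None


def fetch_thumbnail(image_url, dest_path, timeout=30):
    """Download image_url to dest_path and return the detected format.

    image_url is an assumption: the address of an image that some
    service has already rendered. Nothing is rendered locally.
    """
    with urllib.request.urlopen(image_url, timeout=timeout) as response:
        data = response.read()
    fmt = sniff_image_format(data)
    if fmt is None:
        raise ValueError("response is neither a JPEG nor a PNG image")
    with open(dest_path, "wb") as out:
        out.write(data)
    return fmt
```

As with wget or curl, this only copies an image that someone else has rendered; it does not generate one.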

Command Line Snapshot Generation

Ideally, I’d like to replicate snapshot functionality on a local workstation. In Linux, I would hope to generate a thumbnail by providing a URL and a view-port size to Firefox or Konqueror, then requesting that it save the image to a file in my preferred format. Something like this:

   browser --background --size 800x600 --jpeg www.yahoo.com > yh.jpg
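
To pin down the wished-for interface, here is a minimal Python sketch of how such a command line might be read. Everything in it is hypothetical: the options mirror the imaginary command above (plus an assumed --png alternative), the names parse_size and parse_args are invented for illustration, and the rendering step itself is left out entirely.

```python
import argparse


def parse_size(text):
    """Turn a string such as '800x600' into a (width, height) tuple."""
    width, sep, height = text.lower().partition("x")
    if not sep or not width.isdecimal() or not height.isdecimal():
        raise argparse.ArgumentTypeError("size must look like 800x600")
    return int(width), int(height)


def parse_args(argv):
    """Parse the hypothetical snapshot options; no rendering is done here."""
    parser = argparse.ArgumentParser(prog="browser")
    parser.add_argument("url", help="page to capture")
    parser.add_argument("--background", action="store_true",
                        help="run without opening a window")
    parser.add_argument("--size", type=parse_size, default=(800, 600),
                        help="view-port size as WIDTHxHEIGHT")
    # The output format flags exclude each other; PNG is an assumed default.
    fmt = parser.add_mutually_exclusive_group()
    fmt.add_argument("--jpeg", dest="format", action="store_const",
                     const="jpeg", help="write a JPEG image")
    fmt.add_argument("--png", dest="format", action="store_const",
                     const="png", help="write a PNG image")
    parser.set_defaults(format="png")
    return parser.parse_args(argv)
```

Calling parse_args(["--background", "--size", "800x600", "--jpeg", "www.yahoo.com"]) yields a size of (800, 600) and a format of "jpeg"; a real tool would then hand these values to a rendering engine.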

Alas, man pages and user documentation gave no evidence that Firefox, Opera, or Konqueror support a background mode of operation. Others have succeeded, however. In his Planet-PHP Website Thumbnails entry from 2005, J Eichorn describes using Mozilla in just such a fashion. Update: This functionality is available online at Bluga.net Webthumbs.

Things have evolved since 2005; I guess SeaMonkey is now the suite offering. If one wants access to the rendering engine, the Mozilla web site suggests embedding the Gecko engine directly using the Toolkit API.

posted in WebHosting by Bozzie

5 Comments to "Generating Website Snapshots (Thumbnails)"

  1. claude wrote:

    Hi Bozzie,

    > browser --background --size 800x600 --jpeg http://www.yahoo.com > yh.jpg
    THIS is the only useful approach to the problem. Some kind of webservice could return the pic file itself…

    Did you solve it already?
    Or have you found another solution that provides a “wgettable” image file?

  2. Jean-Marc Liotier wrote:

    What you really want is khtml2png : http://khtml2png.sourceforge.net/

  3. Greg wrote:

    After trying most of the major thumbnail providers, many of which proved unreliable, I chose Snapcasa. Their website thumbnails are free and do not have a watermark on them. It’s served an average of 13000 snapshots daily on three of my sites for almost two months now. I would recommend it.

  4. Olivier wrote:

    I was looking for a command line script that generates thumbnails but only found an example in the Qt documentation, so I ported it to Python. It is now available here: http://code.google.com/p/sitesnap/.

  5. Clint wrote:

    The example snapshot you display is broken.

 