Saving a web page to an HTTP Archive

An HTTP Archive file captures the full details of a series of HTTP requests and responses as JSON.

The shot-scraper har command records the requests made while loading a page and saves them as a .har file. It can also save a *.har.zip file that contains both that JSON data and the content of any assets that were loaded by the page.

shot-scraper har https://datasette.io/

This will save to datasette-io.har. You can use -o to specify a filename:

shot-scraper har https://datasette.io/tutorials/learn-sql \
  -o learn-sql.har

A .har file is JSON. You can view it using the Google HAR Analyzer tool.
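Because the file is plain JSON, it can also be inspected with a few lines of Python. The following is a minimal sketch rather than anything shipped with shot-scraper: the helper names summarize_har and load_har are hypothetical, and the sample data is hand-written for illustration. It relies only on the log.entries layout defined by the HAR 1.2 format.

```python
import json
from pathlib import Path


def summarize_har(har: dict) -> list[tuple[int, str, str]]:
    """Return (status, method, url) for every entry in a parsed HAR document."""
    rows = []
    for entry in har["log"]["entries"]:
        request = entry["request"]
        response = entry["response"]
        rows.append((response["status"], request["method"], request["url"]))
    return rows


def load_har(path: str) -> dict:
    """Parse a .har file from disk; the file is plain JSON."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# A tiny hand-written HAR fragment, used here only for illustration.
sample = {
    "log": {
        "version": "1.2",
        "entries": [
            {
                "request": {"method": "GET", "url": "https://datasette.io/"},
                "response": {"status": 200},
            }
        ],
    }
}

for status, method, url in summarize_har(sample):
    print(status, method, url)
```

To run this against a real recording, pass the result of load_har("datasette-io.har") to summarize_har instead of the sample.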

HTTP Archives can also be created as .har.zip files. These have a slightly different format: the har.har JSON does not include the full content of the responses, which is instead stored as separate files inside the .zip.

To create one of these, either add the --zip flag:

shot-scraper har https://datasette.io/ --zip

Or specify a filename that ends in .har.zip:

shot-scraper har https://datasette.io/ -o datasette-io.har.zip

You can view the contents of a HAR zip file using unzip -l:

unzip -l datasette-io.har.zip
Archive:  datasette-io.har.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
    39067  02-13-2025 10:33   41824dbd0c51f584faf0e2c4e88de01b8a5dcdcd.html
     4052  02-13-2025 10:33   34972651f161f0396c697c65ef9aaeb2c9ac50c4.css
     2501  02-13-2025 10:33   9f612e71165058f0046d8bf8fec12af7eb15f39d.css
    10916  02-13-2025 10:33   2737174596eafba6e249022203c324605f023cdd.svg
     5557  02-13-2025 10:33   427504aa6ef5a8786f90fb2de636133b3fc6d1fe.js
     1393  02-13-2025 10:33   25c68a82b654c9d844c604565dab4785161ef697.js
     1170  02-13-2025 10:33   31c073551ef5c84324073edfc7b118f81ce9a7d2.svg
     1158  02-13-2025 10:33   1e0c64af7e6a4712f5e7d1917d9555bbc3d01f7a.svg
     1161  02-13-2025 10:33   ec8282b36a166d63fae4c04166bb81f945660435.svg
     3373  02-13-2025 10:33   5f85a11ef89c0e3f237c8e926c1cb66727182102.svg
     1134  02-13-2025 10:33   3b9d8109b919dfe9393dab2376fe03267dadc1f1.svg
    31670  02-13-2025 10:33   469f0d28af6c026dcae8c81731e2b0484aeac92c.jpeg
     1157  02-13-2025 10:33   b7786336bfce38a9677d26dc9ef468bb1ed45e19.svg
    50494  02-13-2025 10:33   har.har
---------                     -------
   154803                     14 files
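
The link between har.har and the other files in the archive can be followed from Python as well. This sketch assumes that each entry's response.content object carries a "_file" key naming the archive member that holds the body, which is how Playwright is understood to write attached content; check a real har.har before relying on it. The function name attached_bodies and the in-memory archive are hypothetical and exist only so the example runs on its own.

```python
import io
import json
import zipfile


def attached_bodies(har_zip) -> dict[str, bytes]:
    """Map each request URL to its response body stored in a .har.zip.

    Assumption: response.content["_file"] names the archive member that
    holds the body. Entries without a matching member are skipped.
    """
    bodies = {}
    with zipfile.ZipFile(har_zip) as archive:
        har = json.loads(archive.read("har.har"))
        names = set(archive.namelist())
        for entry in har["log"]["entries"]:
            member = entry["response"].get("content", {}).get("_file")
            if member in names:
                bodies[entry["request"]["url"]] = archive.read(member)
    return bodies


# Build a small in-memory archive so the sketch runs without a real recording.
demo_har = {
    "log": {
        "entries": [
            {
                "request": {"url": "https://example.com/"},
                "response": {"content": {"_file": "abc123.html"}},
            },
            {
                # No "_file" key, so this entry is skipped.
                "request": {"url": "https://example.com/empty"},
                "response": {"content": {}},
            },
        ]
    }
}

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("abc123.html", "<h1>Hello</h1>")
    archive.writestr("har.har", json.dumps(demo_har))

buffer.seek(0)
for url, body in attached_bodies(buffer).items():
    print(url, len(body), "bytes")
```

With a real archive, pass its path, for example attached_bodies("datasette-io.har.zip").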

You can record multiple pages to a single HTTP Archive using the shot-scraper multi --har option.

shot-scraper har --help

Full --help for this command:

Usage: shot-scraper har [OPTIONS] URL

  Record a HAR file for the specified page

  Usage:

      shot-scraper har https://datasette.io/

  This defaults to saving to datasette-io.har - use -o to specify a different
  filename:

      shot-scraper har https://datasette.io/ -o trace.har

  Use --zip to save as a .har.zip file instead, or specify a filename ending in
  .har.zip

Options:
  -z, --zip              Save as a .har.zip file
  -a, --auth FILENAME    Path to JSON authentication context file
  -o, --output FILE      HAR filename
  --wait INTEGER         Wait this many milliseconds before taking the
                         screenshot
  --wait-for TEXT        Wait until this JS expression returns true
  -j, --javascript TEXT  Execute this JavaScript on the page
  --timeout INTEGER      Wait this many milliseconds before failing
  --log-console          Write console.log() to stderr
  --fail                 Fail with an error code if a page returns an HTTP error
  --skip                 Skip pages that return HTTP errors
  --bypass-csp           Bypass Content-Security-Policy
  --auth-password TEXT   Password for HTTP Basic authentication
  --auth-username TEXT   Username for HTTP Basic authentication
  --help                 Show this message and exit.