Taking multiple screenshots

You can configure multiple screenshots using a YAML file. Create a file called shots.yml that looks like this:

- output: example.com.png
  url: http://www.example.com/
- output: w3c.org.png
  url: https://www.w3.org/

Then run the tool like so:

shot-scraper multi shots.yml

This will create two image files, example.com.png and w3c.org.png, containing screenshots of those two URLs.

Use - to pass in YAML from standard input:

echo "- url: http://www.example.com" | shot-scraper multi -
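If you have many URLs, you may prefer to generate the configuration rather than write it by hand. The following is a minimal sketch, not part of shot-scraper itself: shots_yaml is a hypothetical helper name, and it quotes values with json.dumps on the assumption that ordinary filenames and URLs rendered as JSON strings are also valid YAML double-quoted scalars.

```python
import json


def shots_yaml(urls_by_output):
    """Render a mapping of output filename -> URL as a YAML list of shots.

    Hypothetical helper for illustration. Values are quoted with json.dumps,
    assuming plain filenames and URLs (no unusual Unicode escapes).
    """
    lines = []
    for output, url in urls_by_output.items():
        lines.append(f"- output: {json.dumps(output)}")
        lines.append(f"  url: {json.dumps(url)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Prints a config equivalent to the shots.yml example on this page.
    print(
        shots_yaml(
            {
                "example.com.png": "http://www.example.com/",
                "w3c.org.png": "https://www.w3.org/",
            }
        ),
        end="",
    )
```

You could save the printed text as shots.yml, or pipe it straight into shot-scraper multi - as described above.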

If you run the tool with the -n or --no-clobber option, any shots where the output file already exists will be skipped.

You can take a subset of the screenshots by naming the output files that you would like to create. For example, to take just the one.png and three.png shots that are defined in shots.yml, run this:

shot-scraper multi shots.yml -o one.png -o three.png
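To make the combined effect of -o and --no-clobber concrete, here is a small model of the selection rules described above. It is an illustration under stated assumptions rather than shot-scraper's own code: select_shots and its exists parameter are hypothetical names, and the sketch assumes that items without an output: key are skipped whenever -o is used.

```python
import os


def select_shots(shots, outputs=(), no_clobber=False, exists=os.path.exists):
    """Return the shots that would be taken under -o and --no-clobber.

    Illustrative model only. `exists` can be swapped out for testing.
    """
    selected = []
    for shot in shots:
        output = shot.get("output")
        # -o: when any outputs are requested, keep only the matching shots.
        if outputs and output not in outputs:
            continue
        # -n / --no-clobber: skip shots whose output file already exists.
        if no_clobber and output and exists(output):
            continue
        selected.append(shot)
    return selected
```

Calling select_shots(shots, outputs=["one.png", "three.png"]) on a list of three shots named one.png, two.png and three.png would keep only the first and last.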

The url: key can also be set to the path of a file on disk:

- output: index.png
  url: index.html

Use the --scale-factor option to capture all screenshots at a specific scale factor, which effectively simulates different device pixel ratios. This setting is useful for testing high-definition displays or emulating screens with various pixel densities.

For example, setting --scale-factor 3 results in screenshots with a device pixel ratio of 3, which is useful for emulating a high-resolution display such as the screen of Apple’s iPhone 12.

To take screenshots with a scale factor of 3 (tripled resolution), run the following command:

shot-scraper multi shots.yml --scale-factor 3

This will multiply both the width and height of all screenshots by 3, producing more detailed images that show the page as it would appear on a high-DPI display.

Use --retina to take all screenshots at retina resolution instead, doubling the dimensions of the files:

shot-scraper multi shots.yml --retina

Note: The --retina option should not be used in conjunction with the --scale-factor flag as they are mutually exclusive. If both are provided, the command will raise an error to prevent conflicts.
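The arithmetic behind these two options can be sketched as follows. This is an illustration, not the tool's implementation: device_scale_factor and image_size are hypothetical names, the default factor of 1 when neither option is given is an assumption, and the 400x800 size is borrowed from the width and height example elsewhere on this page.

```python
def device_scale_factor(scale_factor=None, retina=False):
    """Resolve the two options into a single factor (illustrative model)."""
    if retina and scale_factor is not None:
        # Mirrors the documented rule that the options are mutually exclusive.
        raise ValueError("--retina and --scale-factor cannot be used together")
    if retina:
        return 2.0
    # Assumption: a factor of 1 applies when neither option is passed.
    return float(scale_factor) if scale_factor is not None else 1.0


def image_size(css_width, css_height, scale_factor=None, retina=False):
    """Pixel dimensions of the image for a given CSS-pixel capture size."""
    factor = device_scale_factor(scale_factor, retina)
    return round(css_width * factor), round(css_height * factor)


if __name__ == "__main__":
    print(image_size(400, 800, scale_factor=3))  # (1200, 2400)
    print(image_size(400, 800, retina=True))  # (800, 1600)
```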

To take a screenshot of just the area of a page defined by a CSS selector, add selector to the YAML block:

- output: bighead.png
  url: https://simonwillison.net/
  selector: "#bighead"

You can pass more than one selector using a selectors: list. You can also use padding: to specify additional padding:

- output: bighead-multi-selector.png
  url: https://simonwillison.net/
  selectors:
  - "#bighead"
  - .overband
  padding: 20

You can use selector_all: to capture every element matching a selector, or selectors_all: to pass a list of such selectors:

- output: selectors-all.png
  url: https://simonwillison.net/
  selectors_all:
  - .day
  - .entry:nth-of-type(1)
  padding: 20

The --js-selector and --js-selector-all options can be provided using the js_selector:, js_selectors:, js_selector_all: and js_selectors_all: keys:

- output: js-selector-all.png
  url: https://github.com/simonw/shot-scraper
  js_selector: |-
    el.tagName == "P" && el.innerText.includes("shot-scraper")
  padding: 20

To execute JavaScript after the page has loaded but before the screenshot is taken, add a javascript key:

- output: bighead-pink.png
  url: https://simonwillison.net/
  selector: "#bighead"
  javascript: |
    document.body.style.backgroundColor = 'pink'

You can also set height, width, quality, wait and wait_for options on each item:

- output: simon-narrow.jpg
  url: https://simonwillison.net/
  width: 400
  height: 800
  quality: 80
  wait: 500
  wait_for: document.querySelector('#bighead')
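With this many keys available, a misspelled key is easy to miss. The sketch below is a hypothetical lint helper, not something shot-scraper provides: KNOWN_KEYS lists only the keys named on this page, so the tool may accept others that this check would wrongly flag.

```python
# Keys named on this page. Assumption: the list may be incomplete.
KNOWN_KEYS = {
    "output", "url", "selector", "selectors", "selector_all", "selectors_all",
    "js_selector", "js_selectors", "js_selector_all", "js_selectors_all",
    "padding", "javascript", "height", "width", "quality", "wait", "wait_for",
    "server", "sh", "python",
}


def unknown_keys(shots):
    """Map each item's index to a sorted list of keys not in KNOWN_KEYS."""
    problems = {}
    for index, shot in enumerate(shots):
        extra = sorted(set(shot) - KNOWN_KEYS)
        if extra:
            problems[index] = extra
    return problems
```

You would pass it the list of items loaded from your YAML file. An empty result means every key matched one of the names above.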

Running a server for the duration of the session

If you need to run a server for the duration of the shot-scraper multi session you can specify that using a server: block, like this:

- server: python -m http.server 8000

The server: key also accepts a list of arguments:

- server:
  - python
  - -m
  - http.server
  - 8000

With that server configured, you can now take screenshots of http://localhost:8000/ and any other URLs hosted by that server:

- output: index.png
  url: http://localhost:8000/

The server process will be automatically terminated when the shot-scraper multi command completes. If you pass the --leave-server option to shot-scraper multi, the server will be left running instead. You can then terminate it using kill PID, with the PID displayed in the console output.
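If you want a similar start-then-clean-up lifecycle in your own Python scripts, a context manager is one way to get it. This is a sketch of the general pattern and not how shot-scraper is implemented: running_server, leave_running and DEFAULT_COMMAND are hypothetical names, and the printed message is illustrative.

```python
import contextlib
import subprocess
import sys

# The same server as the example above, using the current Python interpreter.
DEFAULT_COMMAND = [sys.executable, "-m", "http.server", "8000"]


@contextlib.contextmanager
def running_server(command=DEFAULT_COMMAND, leave_running=False):
    """Start `command`, then stop it on exit unless leave_running is set."""
    process = subprocess.Popen(command)
    try:
        yield process
    finally:
        if leave_running:
            # Comparable in spirit to --leave-server: report the PID instead.
            print(f"Leaving server running, PID: {process.pid}")
        else:
            process.terminate()
            process.wait()


# Example usage (not run here, since it would bind port 8000):
#
#     with running_server():
#         ...  # take screenshots of http://localhost:8000/
```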

Running custom code between steps

If you are taking screenshots of a single application, you may find it useful to run additional steps between shots that modify that application in some way.

You can do that using the sh: or python: keys. These can specify shell commands or Python code to run before taking the screenshot:

- sh: echo "Hello from shell" > index.html
  output: from-shell.png
  url: http://localhost:8000/

You can also specify a list of shell arguments like this:

- sh:
  - curl
  - -o
  - index.html
  - https://www.example.com/
  output: example.png
  url: http://localhost:8000/

If you specify these steps without a url: key, they will still be executed as individual steps, without also taking a screenshot:

- sh: echo "hello world" > index.html
- python: |
    content = open("index.html").read()
    open("index.html", "w").write(content.upper())
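The ordering described in this section (run any sh: or python: code first, then take the screenshot only if there is a url:) can be modelled in a few lines. This is an illustrative sketch, not shot-scraper's actual code: run_steps and the take_shot callback are hypothetical names, and running string commands through the shell and Python code in a child interpreter are assumptions of this sketch.

```python
import subprocess
import sys


def run_steps(steps, take_shot):
    """Run each step's sh:/python: code, then call take_shot if it has a url:."""
    for step in steps:
        if "sh" in step:
            command = step["sh"]
            # Assumption: a string goes through the shell, a list is run as-is.
            subprocess.run(command, shell=isinstance(command, str), check=True)
        if "python" in step:
            # Assumption: Python code runs in a separate interpreter process.
            subprocess.run([sys.executable, "-c", step["python"]], check=True)
        if "url" in step:
            # Steps without a url: run their code but take no screenshot.
            take_shot(step)
```

Because check=True is set, a failing command raises subprocess.CalledProcessError and stops the run. That is a design choice of this sketch and not a statement about how the tool handles failures.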

shot-scraper multi --help

Full --help for this command:

Usage: shot-scraper multi [OPTIONS] CONFIG

  Take multiple screenshots, defined by a YAML file

  Usage:

      shot-scraper multi config.yml

  Where config.yml contains configuration like this:

      - output: example.png
        url: http://www.example.com/

  For full YAML syntax documentation, see:
  https://shot-scraper.datasette.io/en/stable/multi.html

Options:
  -a, --auth FILENAME             Path to JSON authentication context file
  --scale-factor FLOAT            Device scale factor. Cannot be used together
                                  with '--retina'.
  --retina                        Use device scale factor of 2. Cannot be used
                                  together with '--scale-factor'.
  --timeout INTEGER               Wait this many milliseconds before failing
  -n, --no-clobber                Skip images that already exist
  -o, --output TEXT               Just take shots matching these output files
  -b, --browser [chromium|firefox|webkit|chrome|chrome-beta]
                                  Which browser to use
  --browser-arg TEXT              Additional arguments to pass to the browser
  --user-agent TEXT               User-Agent header to use
  --reduced-motion                Emulate 'prefers-reduced-motion' media feature
  --log-console                   Write console.log() to stderr
  --fail                          Fail with an error code if a page returns an
                                  HTTP error
  --skip                          Skip pages that return HTTP errors
  --silent                        Do not output any messages
  --auth-password TEXT            Password for HTTP Basic authentication
  --auth-username TEXT            Username for HTTP Basic authentication
  --leave-server                  Leave servers running when script finishes
  --help                          Show this message and exit.