Seeding data
Populate preview databases with seed data or a production dump — once — using PULLPREVIEW_FIRST_RUN and state that persists across deploys.
Staging environments usually need data to be useful. PullPreview preserves the state of your environment between deployments to the same pull request, so any data loaded once persists across redeploys — the only question is how to load it the first time. This guide covers seeding from framework seeds and restoring a production dump, gated so it runs only once. The examples below target the Compose deployment target.
How state persists
PullPreview keeps Docker volumes between deployments on the same pull request. Once you load data into a database volume, it stays there across every redeploy of that PR. That means you only need to seed on the very first deployment.
To make first-run logic easy, PullPreview sets PULLPREVIEW_FIRST_RUN to true on the first deployment to an instance and false on every deployment after that. This variable, along with the other PULLPREVIEW_* variables, is written to /etc/pullpreview/env on the server and is available to your pre-script and for Compose interpolation. See environment variables for the full list.
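As a minimal sketch of first-run gating in a shell pre-script, assuming PULLPREVIEW_FIRST_RUN is already set in the script's environment as described above. The helper name run_on_first_run and its message are illustrative and not part of PullPreview.

```shell
#!/bin/sh
# Hypothetical helper (not part of PullPreview): run the given command only
# on the first deployment to an instance.
run_on_first_run() {
  # An unset variable is treated as "not the first run", so nothing is
  # reseeded by accident.
  if [ "${PULLPREVIEW_FIRST_RUN:-false}" = "true" ]; then
    "$@"
  else
    # Skipping still returns status 0, so callers do not see a failure.
    echo "Not the first deployment; skipping: $*"
  fi
}
```

A call such as `run_on_first_run bundle exec rails db:seed` would then seed once and do nothing on later deploys. Returning success when skipping matters if the caller retries on failure.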
Seeding from framework seeds
If your framework ships a seed task (for example Rails db:seed), run it as a one-off Compose service with a restart: on-failure policy. The service runs once, retrying only if it fails, then exits.
```yaml
services:
  db:
    image: postgres
  web:
    build: .
    command: bundle exec rails s
    depends_on: [db, seeder]
  seeder:
    build: .  # same image as web, so the Rails app and its seed task are available
    command: bundle exec rails db:seed
    restart: on-failure
    depends_on: [db]
```
Because state persists between deployments, the seeded data remains available on subsequent deploys. If running db:seed again would create duplicate data, gate the command on PULLPREVIEW_FIRST_RUN (see the dump example below for the pattern).
Seeding from a production dump
For more realistic previews, restore a dump of your production database. There are two approaches.
Restore manually over SSH
Admins can SSH into the preview server, so after the first deploy you can copy a dump up and restore it. The SSH user is ec2-user on AWS Lightsail and root on Hetzner — adjust the commands accordingly. For a Postgres service named db:
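```shell
# On your machine: copy the dump to the preview server
scp my-dump.gz ec2-user@SERVER_IP:/tmp/
# On the preview server: decompress and restore (-T disables TTY allocation so the pipe works)
zcat /tmp/my-dump.gz | docker compose exec -T -u postgres db pg_restore -d DBNAME
```
The aws CLI is preinstalled on every preview server, so you can also pull the dump straight from S3 on the server instead of copying it from your machine.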
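The on-server variant could look like the sketch below. It assumes the server has AWS credentials with read access to the bucket, that you run it from the directory holding the Compose project, and that the dump is gzip-compressed. The function name restore_from_s3 is hypothetical, and the bucket and database names are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: fetch a gzip-compressed dump from S3 and restore it
# into the Compose service named db. Intended to run on the preview server.
restore_from_s3() {
  s3_uri=$1   # for example s3://my-backup-bucket/latest-dump.gz
  dbname=$2   # target database name
  dump=/tmp/$(basename "$s3_uri")

  # Download the dump; stop here if the copy fails.
  aws s3 cp "$s3_uri" "$dump" || return 1

  # -T disables TTY allocation so pg_restore can read the dump from the pipe.
  zcat "$dump" | docker compose exec -T -u postgres db pg_restore -d "$dbname"
}
```

Calling `restore_from_s3 s3://my-backup-bucket/latest-dump.gz DBNAME` replaces the scp step while keeping the same restore command.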
Auto-fetch from S3 and restore on first run
To fully automate this, fetch the dump in your workflow before the PullPreview step, then have a seeder service restore it only on the first run.
Add a step to your workflow that downloads the dump into a directory that your Compose file mounts:
```yaml
# .github/workflows/pullpreview.yml: extra step before the pullpreview step
- name: Fetch dump
  env:
    AWS_ACCESS_KEY_ID: "${{ secrets.AWS_ACCESS_KEY_ID }}"
    AWS_SECRET_ACCESS_KEY: "${{ secrets.AWS_SECRET_ACCESS_KEY }}"
  run: |
    mkdir -p dumps/
    aws s3 cp s3://my-backup-bucket/latest-dump.gz dumps/
```
Then define a seeder service that restores the dump only when PULLPREVIEW_FIRST_RUN is true:
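```yaml
services:
  seeder:
    image: postgres
    # Run via a shell; `if` exits 0 when skipped, so on-failure does not loop.
    command:
      - sh
      - -c
      - |
        if [ "$PULLPREVIEW_FIRST_RUN" = "true" ]; then
          zcat /dumps/latest-dump.gz | pg_restore -h db -d DBNAME
        fi
    restart: on-failure
    volumes:
      - ./dumps:/dumps
    depends_on: [db]
```
Since PULLPREVIEW_FIRST_RUN is true only on the first deployment, the restore is skipped on every later deploy while the data stays in place. The command goes through sh -c because Compose does not run command strings in a shell, and the dump is decompressed with zcat because pg_restore does not read a gzip-compressed archive directly. Add connection options (for example -U and a PGPASSWORD environment variable) to match your database credentials.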
Next steps
- See environment variables for the full list of PULLPREVIEW_* variables available during deploys.
- The pre-script page covers another place to run first-run seeding logic, gated the same way on PULLPREVIEW_FIRST_RUN.
- For more on troubleshooting previews, see the FAQ.