Bug: Permission errors when setting up a Chromium user profile on Docker Desktop for Windows #1585

Open
@pilotkrc

Description

Provide a screenshot and describe the bug

archivebox.io has a "Working Around Sites that Block Archiving" section for archiving sites that actively resist snapshots, such as websites behind Cloudflare. It covers a few different methods; this bug report concerns the CHROME_USER_DATA_DIR link, which demonstrates how to set up a Chromium profile.

My instance runs via docker compose on Docker Desktop for Windows. The provided Docker VNC guide works fine until Step 3, where it hits a dead end. Step 3 is the following command, which I attempted to run from Docker Desktop's terminal:

docker compose run archivebox /usr/bin/chromium-browser --user-data-dir=/data/personas/Default/chrome_profile --profile-directory=Default --disable-gpu --disable-features=dbus --disable-dev-shm-usage --start-maximized --no-sandbox --disable-setuid-sandbox --no-zygote --disable-sync --no-first-run

I have tried to troubleshoot already using:

  • uncommenting PUID and PGID and setting them to the defaults, then to 1000 and 100, and then to values derived from my Windows SID (this may be irrelevant: the Devilbox documentation suggests that Docker on Windows uses SMB to mount volumes)
  • searching extensively for the Chromium SingletonLock file in the externally linked directory and in the container's own files (both the data and home directories configured for the container)
  • double-checking that the compose file has the lines from Step 2 uncommented and copied exactly, then repeating from Step 1 with a fresh container install
  • setting the image to version 0.8.5rc51 and retrying the instructions

Steps to reproduce

  1. Open docker-compose.yml, uncomment the personas volume and point it at the matching ArchiveBox directory on the host, then add the CHROME_USER_DATA_DIR and DISPLAY variables to the environment. Check that the novnc service matches the guide, apart from the host port, which was changed but confirmed to be accessible.
  2. In the Docker Desktop container terminal, run docker compose up -d novnc
  3. Run docker compose run archivebox /usr/bin/chromium-browser --user-data-dir=/data/personas/Default/chrome_profile --profile-directory=Default --disable-gpu --disable-features=dbus --disable-dev-shm-usage --start-maximized --no-sandbox --disable-setuid-sandbox --no-zygote --disable-sync --no-first-run
  4. Chromium aborts with the errors listed under "Logs or errors".

Logs or errors

[7:7:1104/042514.718415:ERROR:process_singleton_posix.cc(335)] Failed to create /data/personas/Default/chrome_profile/SingletonLock: Permission denied (13)
[7:7:1104/042514.719296:ERROR:chrome_main_delegate.cc(597)] Failed to create a ProcessSingleton for your profile directory. This means that running multiple instances would start multiple browser processes rather than opening a new window in the existing process. Aborting now to avoid profile corruption.
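
One way to narrow this down might be to check what the container user is allowed to do in the profile directory. On POSIX systems Chromium creates SingletonLock as a symbolic link, so a mount that accepts regular files but rejects symlinks could plausibly produce the same "Permission denied". The sketch below is not from the ArchiveBox docs: `check_profile_dir` and the probe file names are hypothetical, and it assumes it is run from a shell inside the archivebox container as the same user that launches Chromium.

```shell
#!/bin/sh
# Hypothetical helper (not part of ArchiveBox): reports whether the current
# user can create a regular file and a symlink inside the given directory.
check_profile_dir() {
    dir="$1"

    # The directory must exist before anything can be created inside it.
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi

    # Probe 1: plain file creation (fails on basic ownership/mode problems).
    if ! touch "$dir/.write_test" 2>/dev/null; then
        echo "cannot create files in $dir (uid=$(id -u) gid=$(id -g))"
        return 2
    fi
    rm -f "$dir/.write_test"

    # Probe 2: symlink creation (the kind of entry Chromium uses for its lock).
    if ! ln -s "probe-target" "$dir/.symlink_test" 2>/dev/null; then
        echo "cannot create symlinks in $dir"
        return 3
    fi
    rm -f "$dir/.symlink_test"

    echo "ok: $dir"
    return 0
}

# Example call, assuming the path from the error message above:
# check_profile_dir /data/personas/Default/chrome_profile
```

If the first probe fails, the problem is likely ordinary ownership or mode bits; if only the second fails, the mounted filesystem itself may not allow symlinks for this user.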

ArchiveBox Version

ArchiveBox v0.7.2 COMMIT_HASH=315c9f3 BUILD_TIME=2024-04-24 22:47:02 1713998822
IN_DOCKER=True IN_QEMU=False ARCH=x86_64 OS=Linux PLATFORM=Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.36 PYTHON=Cpython
FS_ATOMIC=True FS_REMOTE=True FS_USER=911:911 FS_PERMS=644
DEBUG=False IS_TTY=True TZ=UTC SEARCH_BACKEND=sonic LDAP=False

[i] Dependency versions:
 √  PYTHON_BINARY         v3.11.9         valid     /usr/local/bin/python3.11
 √  SQLITE_BINARY         v2.6.0          valid     /usr/local/lib/python3.11/sqlite3/dbapi2.py
 √  DJANGO_BINARY         v3.1.14         valid     /usr/local/lib/python3.11/site-packages/django/__init__.py
 √  ARCHIVEBOX_BINARY     v0.7.2          valid     /usr/local/bin/archivebox

 √  CURL_BINARY           v8.5.0          valid     /usr/bin/curl
 √  WGET_BINARY           v1.21.3         valid     /usr/bin/wget
 √  NODE_BINARY           v20.12.2        valid     /usr/bin/node
 √  SINGLEFILE_BINARY     v1.1.46         valid     /app/node_modules/single-file-cli/single-file
 √  READABILITY_BINARY    v0.0.11         valid     /app/node_modules/readability-extractor/readability-extractor
 √  MERCURY_BINARY        v1.0.0          valid     /app/node_modules/@postlight/parser/cli.js
 √  GIT_BINARY            v2.39.2         valid     /usr/bin/git
 √  YOUTUBEDL_BINARY      v2023.12.30     valid     /usr/local/bin/yt-dlp
 √  CHROME_BINARY         v124.0.6367.29  valid     /usr/bin/chromium-browser
 √  RIPGREP_BINARY        v13.0.0         valid     /usr/bin/rg

[i] Source-code locations:
 √  PACKAGE_DIR           23 files        valid     /app/archivebox
 √  TEMPLATES_DIR         3 files         valid     /app/archivebox/templates
 -  CUSTOM_TEMPLATES_DIR  -               disabled  None

[i] Secrets locations:
 √  CHROME_USER_DATA_DIR  1 files         valid     ./personas/Default/chrome_profile
 -  COOKIES_FILE          -               disabled  None

[i] Data locations:
 √  OUTPUT_DIR            6 files @       valid     /data
 √  SOURCES_DIR           1 files         valid     ./sources
 √  LOGS_DIR              1 files         valid     ./logs
 √  ARCHIVE_DIR           1 files         valid     ./archive
 √  CONFIG_FILE           81.0 Bytes      valid     ./ArchiveBox.conf
 √  SQL_INDEX             208.0 KB        valid     ./index.sqlite3

How did you install the version of ArchiveBox you are using?

Docker (or other container system like podman/LXC/Kubernetes or TrueNAS/Cloudron/YunoHost/etc.)

What operating system are you running on?

Windows (including WSL, WSL2, Docker Desktop on Windows)

What type of drive are you using to store your ArchiveBox data?

  • data/ is on a local SSD or NVMe drive
  • data/ is on a spinning hard drive or external USB drive
  • data/ is on a network mount (e.g. NFS/SMB/CIFS/etc.)
  • data/ is on a FUSE mount (e.g. SSHFS/RClone/S3/B2/OneDrive, etc.)

Docker Compose Configuration

services:
    archivebox:
        image: archivebox/archivebox:latest
        ports:
            - 8100:8000
        volumes:
            - ./data:/data
            - ./data/personas/Default/chrome_profile/Default:/data/personas/Default/chrome_profile/Default
        environment:
            # - ADMIN_USERNAME=admin            # create an admin user on first run with the given user/pass combo
            # - ADMIN_PASSWORD=SomeSecretPassword
            - CSRF_TRUSTED_ORIGINS=https://archivebox.example.com  # REQUIRED for auth, REST API, etc. to work
            - ALLOWED_HOSTS=*                   # set this to the hostname(s) from your CSRF_TRUSTED_ORIGINS
            - PUBLIC_INDEX=True                 # set to False to prevent anonymous users from viewing snapshot list
            - PUBLIC_SNAPSHOTS=True             # set to False to prevent anonymous users from viewing snapshot content
            - PUBLIC_ADD_VIEW=False             # set to True to allow anonymous users to submit new URLs to archive
            - SEARCH_BACKEND_ENGINE=sonic       # tells ArchiveBox to use sonic container below for fast full-text search
            - SEARCH_BACKEND_HOST_NAME=sonic
            - SEARCH_BACKEND_PASSWORD=SomeSecretPassword
            # - PUID=911                        # set to your host user's UID & GID if you encounter permissions issues
            # - PGID=911                        # UID/GIDs <500 may clash with existing users and are not recommended
            # - MEDIA_MAX_SIZE=750m             # increase this filesize limit to allow archiving larger audio/video files
            # - TIMEOUT=60                      # increase this number to 120+ seconds if you see many slow downloads timing out
            # - CHECK_SSL_VALIDITY=True         # set to False to disable strict SSL checking (allows saving URLs w/ broken certs)
            # - SAVE_ARCHIVE_DOT_ORG=True       # set to False to disable submitting all URLs to Archive.org when archiving
            # - USER_AGENT="..."                # set a custom USER_AGENT to avoid being blocked as a bot
            - CHROME_USER_DATA_DIR=/data/personas/Default/chrome_profile
            - DISPLAY=novnc:0.0
            # ...
            # add further configuration options from archivebox/config.py as needed (to apply them only to this container)
            # or set using `docker compose run archivebox config --set SOME_KEY=someval` (to persist config across all containers)
        # For ad-blocking during archiving, uncomment this section and pihole service section below
        # networks:
        #   - dns
        # dns:
        #   - 172.20.0.53

    archivebox_scheduler:
        
        image: archivebox/archivebox:latest
        command: schedule --foreground --update --every=day
        environment:
            - TIMEOUT=120                       # use a higher timeout than the main container to give slow tasks more time when retrying
            # - PUID=502                        # set to your host user's UID & GID if you encounter permissions issues
            # - PGID=20
        volumes:
            - ./data:/data
        # cpus: 2                               # uncomment / edit these values to limit scheduler container resource consumption
        # mem_limit: 2048m
        # restart: always

    sonic:
        image: valeriansaliou/sonic:latest
        expose:
            - 1491
        environment:
            - SEARCH_BACKEND_PASSWORD=SomeSecretPassword
        volumes:
            - ./sonic.cfg:/etc/sonic.cfg:ro    # use this if you prefer to download the config on the host and mount it manually
            - data-sonic:/var/lib/sonic/store

    novnc:
        image: theasp/novnc:latest
        environment:
            - DISPLAY_WIDTH=1920
            - DISPLAY_HEIGHT=1080
            - RUN_XTERM=no
        ports:
            # to view/control ArchiveBox's browser, visit: http://127.0.0.1:8080/vnc.html
            # restricted to access from localhost by default because it has no authentication
            - 127.0.0.1:8180:8080

networks:
    # network just used for pihole container to offer :53 dns resolving on fixed ip for archivebox container
    dns:
        ipam:
            driver: default
            config:
                - subnet: 172.20.0.0/24

volumes:
  data-sonic:

ArchiveBox Configuration

SECRET_KEY = ***
