Transferring data to/from PIC

How to provide data access to PIC massive storage

This page explains how to configure rclone to access PIC storage.

The configuration depends on the storage path you need to access.

Storage access methods
Storage path                   Protocol        rclone backend  Supported authentication  Example remote name
Paths under /pnfs/pic.es/data  WebDAV / HTTPS  webdav          OIDC token                pic-pnfs-oidc
Paths under /pnfs/pic.es/data  WebDAV / HTTPS  webdav          Macaroon bearer token     pic-pnfs-macaroon
Any other storage location     SSH / SFTP      sftp            SSH with ControlMaster    pic-storage

Requirements

You need:

  • rclone installed on your client machine.

For paths under /pnfs/pic.es/data, you also need one of the following:

  • an OIDC token setup using oidc-agent; or
  • a Macaroon token.

For any other storage location, you also need:

  • SSH access to the PIC host.
  • SSH ControlMaster configured on your client machine.

Install rclone

You can download the rclone binary directly without installing system packages. For example, on a Linux 64-bit machine:

$ curl -JLO https://downloads.rclone.org/rclone-current-linux-amd64.zip
$ unzip rclone-current-linux-amd64.zip

Alternatively, on Ubuntu you can install the Debian package:

$ cd /tmp
$ curl -JLO 'https://downloads.rclone.org/rclone-current-linux-amd64.deb'
$ sudo apt install ./rclone-current-linux-amd64.deb

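Check that rclone works. If you used the zip archive, run the binary from the extracted rclone-*-linux-amd64 directory, or copy it to a directory in your PATH first: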

$ rclone version
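
Before continuing, it can help to confirm that the tools listed under Requirements are on your PATH. The following is a minimal sketch; the function name missing_tools is hypothetical and not part of rclone, and which tools you check depends on the access method you plan to use.

```shell
# Print the name of every listed command that is not found on PATH.
# Returns 0 if all commands are available, 1 otherwise.
# (missing_tools is a hypothetical helper name.)
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, run missing_tools rclone oidc-token before setting up OIDC access, or missing_tools rclone ssh before setting up SFTP access. No output means that everything was found.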

Which configuration should I use?

Use the table below to choose the right configuration.

Choosing the right configuration
If you want to access...                                  Use this section
A path under /pnfs/pic.es/data using OIDC authentication  Configure rclone for PNFS data paths using OIDC
A path under /pnfs/pic.es/data using a Macaroon token     Configure rclone for PNFS data paths using a Macaroon
Any other storage location                                Configure rclone for any other storage location
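
The path-based part of this choice can be sketched as a small shell function. This is only an illustration: pic_backend_for_path is a hypothetical name, and the function cannot tell you whether to use OIDC or a Macaroon, which depends on what PIC has given you.

```shell
# Print the rclone backend that matches a PIC storage path:
# "webdav" for paths under /pnfs/pic.es/data, "sftp" for anything else.
# (pic_backend_for_path is a hypothetical helper name.)
pic_backend_for_path() {
    case "$1" in
        /pnfs/pic.es/data|/pnfs/pic.es/data/*) printf 'webdav\n' ;;
        *) printf 'sftp\n' ;;
    esac
}
```

For example, pic_backend_for_path /pnfs/pic.es/data/PATH_TO_YOUR_STORAGE_SPACE prints webdav, while any path outside /pnfs/pic.es/data prints sftp.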

Configure rclone for PNFS data paths

Paths under /pnfs/pic.es/data are accessed using WebDAV over HTTPS.

For these paths, configure rclone with the webdav backend.

Use one of the following authentication methods:

  • OIDC token, retrieved automatically with oidc-token.
  • Macaroon token, pasted into rclone as a bearer token.

Do not configure username/password authentication unless PIC explicitly instructs you to do so.

Configure rclone for PNFS data paths using OIDC

Use this option if you have been instructed to authenticate with OIDC.

OIDC requirements

Before configuring rclone, make sure that:

  • oidc-agent is available.
  • You have configured an OIDC account, for example pic-pnfs.
  • The following command returns a token:
$ oidc-token pic-pnfs
eyJhbGciOiJSUzI1[...]4YjAwg

If this command does not work, see Configuring oidc-agent for OIDC tokens.

Configure the WebDAV remote using OIDC

Start the rclone configuration wizard:

$ rclone config

Create a new remote:

No remotes found, make a new one?
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n

Enter name for new remote.
name> pic-pnfs-oidc

Select WebDAV:

Option Storage.
Type of storage to configure.
Choose a number from below, or type in your own value.
Storage> webdav

Enter the WebDAV base path (https://webdav.pic.es) or, if desired, a subdirectory under it (https://webdav.pic.es/PATH_TO_YOUR_STORAGE_SPACE).

Example:

Option url.
URL of http host to connect to.
url> https://webdav.pic.es

Select vendor other:

Option vendor.
Name of the WebDAV site/service/software you are using.
vendor> other

Leave username empty:

Option user.
User name.
user>

Leave password empty:

Option pass.
Password.
y) Yes, type in my own password
g) Generate random password
n) No, leave this optional password blank
y/g/n> n

When asked for a bearer token, leave it empty:

Option bearer_token.
Bearer token instead of user/pass, for example a Macaroon.
bearer_token>

Edit the advanced configuration:

Edit advanced config?
y) Yes
n) No
y/n> y

Set bearer_token_command:

Option bearer_token_command.
Command to run to get a bearer token.
bearer_token_command> oidc-token pic-pnfs

Accept the remaining advanced options by pressing ENTER, unless PIC instructed you to change them.

At the end, review and accept the configuration:

Configuration complete.
Options:
- type: webdav
- url: https://webdav.pic.es
- vendor: other
- bearer_token_command: oidc-token pic-pnfs

Keep this "pic-pnfs-oidc" remote?
y) Yes this is OK
e) Edit this remote
d) Delete this remote
y/e/d> y

Quit the rclone configuration menu:

e) Edit existing remote
n) New remote
d) Delete remote
r) Rename remote
c) Copy remote
s) Set configuration password
q) Quit config
e/n/d/r/c/s/q> q

OIDC configuration example

The resulting rclone configuration at ~/.config/rclone/rclone.conf should look similar to:

[pic-pnfs-oidc]
type = webdav
url = https://webdav.pic.es
vendor = other
bearer_token_command = oidc-token pic-pnfs
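
If you prefer not to step through the wizard, the same remote can probably be created in one command with rclone config create. The sketch below assumes a reasonably recent rclone that accepts key=value pairs; the wrapper name create_pic_oidc_remote is hypothetical.

```shell
# Create the OIDC-based WebDAV remote without the interactive wizard.
# $1: remote name (default: pic-pnfs-oidc)
# $2: oidc-agent account name (default: pic-pnfs)
# (create_pic_oidc_remote is a hypothetical wrapper around rclone.)
create_pic_oidc_remote() {
    remote=${1:-pic-pnfs-oidc}
    account=${2:-pic-pnfs}
    rclone config create "$remote" webdav \
        url=https://webdav.pic.es \
        vendor=other \
        bearer_token_command="oidc-token $account"
}
```

After running create_pic_oidc_remote, compare the stanza written to your rclone configuration file with the example above.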

OIDC usage examples

List the configured remote root:

$ rclone lsd pic-pnfs-oidc:

List a directory below the configured path:

$ rclone lsd pic-pnfs-oidc:PROJECT_OR_DIRECTORY

List files recursively:

$ rclone ls pic-pnfs-oidc:PROJECT_OR_DIRECTORY

Download data:

$ rclone copy pic-pnfs-oidc:PROJECT_OR_DIRECTORY ./local-copy

Upload data:

$ rclone copy ./local-data pic-pnfs-oidc:PROJECT_OR_DIRECTORY

Example using a concrete path below the configured WebDAV root:

$ rclone copy ./ntuples pic-pnfs-oidc:analysis/ntuples

Configure rclone for PNFS data paths using a Macaroon

Use this option if you have been given a Macaroon token.

The Macaroon is used as a WebDAV bearer token.

Configure the WebDAV remote using a Macaroon

Start the rclone configuration wizard:

$ rclone config

Create a new remote:

No remotes found, make a new one?
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n

Enter name for new remote.
name> pic-pnfs-macaroon

Select WebDAV:

Option Storage.
Type of storage to configure.
Choose a number from below, or type in your own value.
Storage> webdav

Enter the WebDAV base path (https://webdav.pic.es) or, if desired, a subdirectory under it (https://webdav.pic.es/PATH_TO_YOUR_STORAGE_SPACE).

Example:

Option url.
URL of http host to connect to.
url> https://webdav.pic.es

Select vendor other:

Option vendor.
Name of the WebDAV site/service/software you are using.
vendor> other

Leave username empty:

Option user.
User name.
user>

Leave password empty:

Option pass.
Password.
y) Yes, type in my own password
g) Generate random password
n) No, leave this optional password blank
y/g/n> n

Paste the Macaroon token as the bearer token:

Option bearer_token.
Bearer token instead of user/pass, for example a Macaroon.
bearer_token> YOUR_MACAROON_TOKEN

You normally do not need to edit the advanced configuration:

Edit advanced config?
y) Yes
n) No
y/n> n

At the end, review and accept the configuration:

Configuration complete.
Options:
- type: webdav
- url: https://webdav.pic.es
- vendor: other
- bearer_token: *** ENCRYPTED ***

Keep this "pic-pnfs-macaroon" remote?
y) Yes this is OK
e) Edit this remote
d) Delete this remote
y/e/d> y

Quit the rclone configuration menu:

e) Edit existing remote
n) New remote
d) Delete remote
r) Rename remote
c) Copy remote
s) Set configuration password
q) Quit config
e/n/d/r/c/s/q> q

Macaroon configuration example

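The resulting rclone configuration should look similar to the following. This example uses a subdirectory as the url instead of the base path entered in the wizard above: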

[pic-pnfs-macaroon]
type = webdav
url = https://webdav.pic.es/pnfs/pic.es/data/PATH_TO_YOUR_STORAGE_SPACE
vendor = other
bearer_token = YOUR_MACAROON_TOKEN

Macaroon usage examples

List the configured remote root:

$ rclone lsd pic-pnfs-macaroon:

List a directory below the configured path:

$ rclone lsd pic-pnfs-macaroon:PROJECT_OR_DIRECTORY

List files recursively:

$ rclone ls pic-pnfs-macaroon:PROJECT_OR_DIRECTORY

Download data:

$ rclone copy pic-pnfs-macaroon:PROJECT_OR_DIRECTORY ./local-copy

Upload data:

$ rclone copy ./local-data pic-pnfs-macaroon:PROJECT_OR_DIRECTORY

Example using a concrete path below the configured WebDAV root:

$ rclone copy ./ntuples pic-pnfs-macaroon:analysis/ntuples

Configure rclone for any other storage location

Use this section for storage paths that are not under /pnfs/pic.es/data.

These locations are accessed using SSH/SFTP.

Because logging in over SSH may require browser-based authentication, you should configure SSH ControlMaster. This allows rclone to reuse an already-authenticated SSH connection instead of authenticating again for every transfer.

Configure SSH ControlMaster

On your client machine, edit:

~/.ssh/config

Add an entry for the PIC SSH host. For example:

Host pic-ssh
    HostName ui04.pic.es
    User YOUR_PIC_USERNAME
    ControlMaster auto
    ControlPath ~/.ssh/cm-%r@%h:%p
    ControlPersist yes

Replace:

  • ui04.pic.es with the SSH host provided by PIC, if different.
  • YOUR_PIC_USERNAME with your PIC username.

Instead of restricting this configuration to pic-ssh, you can enable it for all SSH connections:

Host *
    ControlMaster auto
    ControlPath ~/.ssh/cm-%r@%h:%p
    ControlPersist yes

However, using a specific host entry is usually clearer and safer.

Open the first SSH connection

Before using rclone, open one SSH connection manually:

$ ssh pic-ssh

You may see a message similar to:

(tallada@ui04.pic.es) Authenticate at https://idp.pic.es/realms/PIC/device?user_code=FPTB-HKEV and press ENTER.

Open the URL in your browser, complete the authentication, then return to the terminal and press ENTER.

After this first authentication, subsequent SSH connections to the same host should reuse the existing ControlMaster session.

Test this with:

$ ssh pic-ssh
Last login: Tue Apr 14 13:25:33 2026 from 10.212.134.205
[tallada@ui04 ~]$

You can close this shell after testing.
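
You can also ask OpenSSH directly whether a ControlMaster connection is open, without starting a new login shell. The sketch below uses the standard ssh -O check and ssh -O exit control commands; the function names are hypothetical, and pic-ssh is the host alias from the example above.

```shell
# Return 0 if an SSH ControlMaster connection for the given host alias
# is already open, non-zero otherwise. Uses OpenSSH's "-O check".
# (Both function names are hypothetical helpers.)
pic_ssh_master_active() {
    ssh -O check "${1:-pic-ssh}" >/dev/null 2>&1
}

# Close the ControlMaster connection when it is no longer needed.
pic_ssh_master_close() {
    ssh -O exit "${1:-pic-ssh}"
}
```

For example, pic_ssh_master_active || ssh pic-ssh opens the first connection only when no authenticated session exists yet.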

Configure the SFTP rclone remote

Other storage locations use the rclone sftp backend.

Because we want rclone to reuse the SSH ControlMaster connection, configure the remote to use the external ssh command.

First, locate your rclone configuration file:

$ rclone config file

The command prints something like:

Configuration file is stored at:
/home/YOUR_LOCAL_USER/.config/rclone/rclone.conf

Edit that file and add:

[pic-storage]
type = sftp
ssh = ssh pic-ssh
shell_type = unix
known_hosts_file = ~/.ssh/known_hosts

The important line is:

ssh = ssh pic-ssh

This tells rclone to use your system SSH command, which can reuse the SSH ControlMaster connection.

SFTP configuration example

Example rclone configuration:

[pic-storage]
type = sftp
ssh = ssh pic-ssh
shell_type = unix
known_hosts_file = ~/.ssh/known_hosts

SFTP usage examples

List the remote root:

$ rclone lsd pic-storage:

List a specific storage path:

$ rclone lsd pic-storage:/PATH_TO_YOUR_STORAGE_SPACE

List files recursively:

$ rclone ls pic-storage:/PATH_TO_YOUR_STORAGE_SPACE

Download data:

$ rclone copy pic-storage:/PATH_TO_YOUR_STORAGE_SPACE ./local-copy

Upload data:

$ rclone copy ./local-data pic-storage:/PATH_TO_YOUR_STORAGE_SPACE

Example using a concrete absolute path:

$ rclone copy ./results pic-storage:/storage/projects/MY_PROJECT/results

General rclone usage

The same basic rclone commands can be used with all configured remotes.

Replace REMOTE_NAME with one of the remotes you configured, for example:

  • pic-pnfs-oidc
  • pic-pnfs-macaroon
  • pic-storage

List a remote directory

$ rclone lsd REMOTE_NAME:PATH

Examples:

$ rclone lsd pic-pnfs-oidc:
$ rclone lsd pic-pnfs-macaroon:
$ rclone lsd pic-storage:/PATH_TO_YOUR_STORAGE_SPACE

List files recursively

$ rclone ls REMOTE_NAME:PATH

Examples:

$ rclone ls pic-pnfs-oidc:PROJECT_OR_DIRECTORY
$ rclone ls pic-pnfs-macaroon:PROJECT_OR_DIRECTORY
$ rclone ls pic-storage:/PATH_TO_YOUR_STORAGE_SPACE

Download a remote directory

$ rclone copy REMOTE_NAME:PATH LOCAL_PATH

Examples:

$ rclone copy pic-pnfs-oidc:PROJECT_OR_DIRECTORY ./local-copy
$ rclone copy pic-pnfs-macaroon:PROJECT_OR_DIRECTORY ./local-copy
$ rclone copy pic-storage:/PATH_TO_YOUR_STORAGE_SPACE ./local-copy

Upload a local directory

$ rclone copy LOCAL_DIR REMOTE_NAME:PATH

Examples:

$ rclone copy ./local-data pic-pnfs-oidc:PROJECT_OR_DIRECTORY
$ rclone copy ./local-data pic-pnfs-macaroon:PROJECT_OR_DIRECTORY
$ rclone copy ./local-data pic-storage:/PATH_TO_YOUR_STORAGE_SPACE
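
After a transfer, rclone check can compare the source with the destination. The following is a minimal sketch, assuming that comparing file sizes is sufficient for your data; pic_verify_transfer is a hypothetical wrapper name, while check, --size-only and --one-way are standard rclone features.

```shell
# Compare a source with a destination after a transfer.
# Only file sizes are compared (--size-only), and only files present in
# the source are required to exist in the destination (--one-way).
# $1: source, $2: destination (either can be local or REMOTE_NAME:PATH)
# (pic_verify_transfer is a hypothetical wrapper around rclone.)
pic_verify_transfer() {
    rclone check "$1" "$2" --size-only --one-way
}
```

For example, after an upload: pic_verify_transfer ./local-data pic-storage:/PATH_TO_YOUR_STORAGE_SPACE. A non-zero exit status indicates that differences were found.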

Recommended upload flags

For uploads, we recommend:

--check-first -P --stats-one-line --transfers N_TRANSFERS --size-only

For many small files, N_TRANSFERS can be up to 350.

Example:

$ rclone copy ./local-data pic-pnfs-oidc:PROJECT_OR_DIRECTORY \
  --check-first -P --stats-one-line --transfers 350 --size-only

If uploading into directories with many files, for example more than 1000 files, also use:

--no-traverse

If uploading files larger than 200 MB, also use:

--multi-thread-streams 1

If uploading very large files, for example larger than 10 GB, also use:

--timeout 15m

Full example for a large upload to a path under /pnfs/pic.es/data:

$ rclone copy ./large-dataset pic-pnfs-oidc:PROJECT_OR_DIRECTORY \
  --check-first -P --stats-one-line --transfers 350 --size-only \
  --no-traverse \
  --multi-thread-streams 1 \
  --timeout 15m

Full example for a large upload to another storage location:

$ rclone copy ./large-dataset pic-storage:/PATH_TO_YOUR_STORAGE_SPACE \
  --check-first -P --stats-one-line --transfers 350 --size-only \
  --no-traverse \
  --multi-thread-streams 1 \
  --timeout 15m
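
The flag recommendations above can be collected into a small helper that prints the flags for a given upload. This is only a sketch of the rules on this page: the function name pic_upload_flags is hypothetical, and it assumes that MB and GB mean decimal units (10^6 and 10^9 bytes).

```shell
# Print the recommended rclone upload flags on a single line.
# $1: number of parallel transfers (N_TRANSFERS)
# $2: number of files in the destination directory
# $3: size of the largest file to upload, in bytes
# (pic_upload_flags is a hypothetical helper name.)
pic_upload_flags() {
    transfers=$1
    dest_files=$2
    largest=$3
    flags="--check-first -P --stats-one-line --transfers $transfers --size-only"
    # Destination directories with more than 1000 files.
    if [ "$dest_files" -gt 1000 ]; then
        flags="$flags --no-traverse"
    fi
    # Files larger than 200 MB (assumed to be 200 * 10^6 bytes).
    if [ "$largest" -gt 200000000 ]; then
        flags="$flags --multi-thread-streams 1"
    fi
    # Files larger than 10 GB (assumed to be 10 * 10^9 bytes).
    if [ "$largest" -gt 10000000000 ]; then
        flags="$flags --timeout 15m"
    fi
    printf '%s\n' "$flags"
}
```

The output is meant to be expanded unquoted so that the shell splits it into separate arguments, for example: rclone copy ./large-dataset pic-pnfs-oidc:PROJECT_OR_DIRECTORY $(pic_upload_flags 350 5000 20000000000).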

For more information, see the rclone documentation:

https://rclone.org/docs/

Configuring oidc-agent for OIDC tokens

This section is only needed if you use OIDC authentication for paths under /pnfs/pic.es/data.

Make sure oidc-agent is available.

Load oidc-agent

Initialize oidc-agent in the terminal session:

$ eval `oidc-agent`

If the account is already configured, load it with:

$ oidc-add pic-pnfs
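
Running eval `oidc-agent` in every new terminal starts a new agent each time. The sketch below starts one only when the current shell has no usable agent socket, then loads the account. It assumes that oidc-agent exports the OIDC_SOCK variable, as its standard output suggests; the function name pic_oidc_load is hypothetical.

```shell
# Start oidc-agent only if this shell does not already point at an
# agent socket, then load the given account (default: pic-pnfs).
# OIDC_SOCK is the socket variable exported by the oidc-agent output.
# (pic_oidc_load is a hypothetical helper name.)
pic_oidc_load() {
    account=${1:-pic-pnfs}
    if [ -z "${OIDC_SOCK:-}" ] || [ ! -S "$OIDC_SOCK" ]; then
        eval "$(oidc-agent)"
    fi
    oidc-add "$account"
}
```

Call pic_oidc_load directly in your shell, not in a subshell, so that the exported variables remain available to later oidc-token and rclone commands.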

Configure the OIDC account

This step only needs to be done once.

You need a recent version of oidc-agent, later than 5.0.0.

Ask your PIC contact for the client ID and client secret, and use them to replace CLIENT_ID_PROVIDED_BY_PIC and XXXXXXXXXXXXXXXXXX below.

Configure a pic-pnfs account to retrieve tokens from PIC:

$ oidc-gen -m --client-id CLIENT_ID_PROVIDED_BY_PIC \
  --client-secret XXXXXXXXXXXXXXXXXX \
  --pub --flow=device \
  --discovery-endpoint=https://idp.pic.es/realms/PIC/.well-known/openid-configuration \
  --scope="openid profile offline_access" \
  --redirect-uri=edu.kit.data.oidc-agent:/ pic-pnfs

The command will show a URL and a code:

No account exists with this short name. Creating new configuration ...
Generating account configuration ...
accepted

Using a browser on any device, visit:
https://idp.pic.es/realms/PIC/device

And enter the code: ASDF-GHJK
Alternatively you can use the following QR code to visit the above listed URL.

[ QR CODE ]

Enter encryption password for account configuration 'pic-pnfs':
Confirm encryption password:
Everything setup correctly!

Open the URL in your browser, enter the code, authenticate, then return to the terminal.

You will be asked to enter an encryption password twice. You will need this password when refreshing or reloading the oidc-agent account.

Reauthenticate if the refresh token has expired

If the oidc-agent process is restarted, or if your refresh token expires due to inactivity, you may need to reauthenticate:

$ oidc-gen --reauthenticate pic-pnfs

Example output:

Enter decryption password for account config 'pic-pnfs':
Generating account configuration ...
accepted

Using a browser on any device, visit:
https://idp.pic.es/realms/PIC/device

And enter the code: ASDF-GHJK
Alternatively you can use the following QR code to visit the above listed URL.

[ QR CODE ]

Enter encryption password for account configuration 'pic-pnfs' [***]:
Everything setup correctly!

Test OIDC token retrieval

After loading and configuring oidc-agent, test that you can retrieve a token:

$ oidc-token pic-pnfs
eyJhbGciOiJSUzI1[...]4YjAwg

If this command works, the rclone configuration using:

bearer_token_command = oidc-token pic-pnfs

should also work.

Obtaining a Macaroon for /pnfs/pic.es/data

This section is intended for contacts or administrators who need to generate Macaroons for users.

Macaroons are valid for up to 7 days.

The restricted path should be under /pnfs/pic.es/data.

Example (without a leading slash, because the path is appended to the door URL in the commands below):

RESTRICTED_PATH=pnfs/pic.es/data/PATH_TO_YOUR_STORAGE_SPACE

Read-only Macaroon

Use this for downloading data only.

The Macaroon will allow:

  • listing directories;
  • downloading files.

$ curl -u "${USER}" -X POST -H 'Content-Type: application/macaroon-request' \
  -d '{"caveats": ["activity:DOWNLOAD,LIST"], "validity": "P7D"}' \
  "https://door04.pic.es:8460/${RESTRICTED_PATH}"

Example response:

{
    "macaroon": "MDA2MGxvY2F0aW",
    "uri": {
        "targetWithMacaroon": "https://door04.pic.es:8460/${RESTRICTED_PATH}?authz=MDA2MGxvY2F0aW",
        "baseWithMacaroon": "https://door04.pic.es:8460/?authz=MDA2MGxvY2F0aW",
        "target": "https://door04.pic.es:8460/${RESTRICTED_PATH}",
        "base": "https://door04.pic.es:8460/"
    }
}

Give the value of the macaroon field to the user. The user should paste it into rclone as the WebDAV bearer_token.
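
To avoid copying the token by hand, the macaroon field can be extracted from the response. The following sketch uses sed and assumes that the field appears once and on a single line, as in the example response; extract_macaroon is a hypothetical name. A JSON-aware tool such as jq would be more robust if it is available.

```shell
# Read a Macaroon response (JSON) on standard input and print the value
# of its "macaroon" field. Assumes the key and value share one line.
# (extract_macaroon is a hypothetical helper name.)
extract_macaroon() {
    sed -n 's/.*"macaroon"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}
```

For example, append | extract_macaroon to the curl command above to print only the token.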

Upload Macaroon

Use this for uploading data.

The Macaroon grants full permissions on the requested path.

$ curl -u "${USER}" -X POST -H 'Content-Type: application/macaroon-request' \
  -d '{"validity": "P7D"}' \
  "https://door04.pic.es:8460/${RESTRICTED_PATH}"

Example response:

{
    "macaroon": "MDA2MGxvY2F0aW",
    "uri": {
        "targetWithMacaroon": "https://door04.pic.es:8460/${RESTRICTED_PATH}?authz=MDA2MGxvY2F0aW",
        "baseWithMacaroon": "https://door04.pic.es:8460/?authz=MDA2MGxvY2F0aW",
        "target": "https://door04.pic.es:8460/${RESTRICTED_PATH}",
        "base": "https://door04.pic.es:8460/"
    }
}

Give the value of the macaroon field to the user. The user should paste it into rclone as the WebDAV bearer_token.
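
The two requests above differ only in their JSON body. As an illustration, the body can be generated by a small helper; macaroon_request_body is a hypothetical name, and using a validity shorter than P7D assumes that the server accepts any ISO 8601 duration up to the 7-day maximum.

```shell
# Print the JSON body for a Macaroon request.
# $1: validity as an ISO 8601 duration (default: P7D)
# $2: optional comma-separated activities, for example DOWNLOAD,LIST;
#     when empty, no activity caveat is added (full permissions).
# (macaroon_request_body is a hypothetical helper name.)
macaroon_request_body() {
    validity=${1:-P7D}
    activities=${2:-}
    if [ -n "$activities" ]; then
        printf '{"caveats": ["activity:%s"], "validity": "%s"}\n' "$activities" "$validity"
    else
        printf '{"validity": "%s"}\n' "$validity"
    fi
}
```

The output can be passed to curl with -d "$(macaroon_request_body P7D DOWNLOAD,LIST)" for a read-only Macaroon, or -d "$(macaroon_request_body)" for an upload Macaroon.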

Troubleshooting

SSH authentication is requested repeatedly

This applies to any storage location accessed through SSH/SFTP.

Check that:

  • You have opened the first SSH connection manually with ssh pic-ssh.
  • Browser authentication was completed successfully.
  • Your ~/.ssh/config contains ControlMaster, ControlPath, and ControlPersist.
  • Your rclone remote uses the external SSH command:
ssh = ssh pic-ssh
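
To confirm which ControlMaster settings OpenSSH actually applies to the host alias, you can inspect the resolved configuration with ssh -G, which does not open a connection. This is a minimal sketch; the function name pic_ssh_control_settings is hypothetical.

```shell
# Print the ControlMaster-related settings that OpenSSH resolves for a
# host alias from ~/.ssh/config. "ssh -G" prints the effective
# configuration with lowercase keywords and does not connect.
# (pic_ssh_control_settings is a hypothetical helper name.)
pic_ssh_control_settings() {
    ssh -G "${1:-pic-ssh}" | grep -i -E '^control(master|path|persist) '
}
```

With the configuration shown on this page, the output should include a controlmaster auto line and a controlpath line pointing into ~/.ssh.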

OIDC authentication fails

This applies to paths under /pnfs/pic.es/data accessed using OIDC.

Make sure oidc-agent is running in the current terminal session. The following command starts a new agent:

$ eval `oidc-agent`

Load the account into the agent:

$ oidc-add pic-pnfs

Check that you can retrieve a token manually:

$ oidc-token pic-pnfs

If the token cannot be retrieved, reauthenticate:

$ oidc-gen --reauthenticate pic-pnfs

Macaroon authentication fails

This applies to paths under /pnfs/pic.es/data accessed using a Macaroon.

Check that:

  • The Macaroon was copied completely.
  • The Macaroon has not expired.
  • The Macaroon was generated for the correct path under /pnfs/pic.es/data.
  • The Macaroon has the required permissions, for example read-only or upload permissions.

If the Macaroon has expired, request or generate a new one.
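
Once you have a new Macaroon, you do not need to repeat the wizard: the stored token can be replaced with rclone config update. The sketch below assumes a reasonably recent rclone that accepts key=value pairs; pic_update_macaroon is a hypothetical wrapper name.

```shell
# Replace the bearer token stored in an existing Macaroon-based remote.
# $1: new Macaroon token
# $2: remote name (default: pic-pnfs-macaroon)
# (pic_update_macaroon is a hypothetical wrapper around rclone.)
pic_update_macaroon() {
    if [ -z "${1:-}" ]; then
        echo "usage: pic_update_macaroon NEW_TOKEN [REMOTE]" >&2
        return 2
    fi
    rclone config update "${2:-pic-pnfs-macaroon}" bearer_token="$1"
}
```

For example: pic_update_macaroon YOUR_NEW_MACAROON_TOKEN, followed by rclone lsd pic-pnfs-macaroon: to confirm that access works again.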