HDFS Access via VOSpace
Introduction
PIC provides access to the HDFS distributed file system through a VOSpace server that follows the IVOA VOSpace 2.1 standard.
This service is an alternative to WebDAV access, allowing users to manage their data programmatically and in a structured way using tools compatible with the Virtual Observatory (VO) ecosystem. It is especially aimed at users who require data management operations such as reading, writing, moving, and metadata querying within a standardized environment.
How to connect to the service
VOSpace Endpoint
The VOSpace server is available at the following URL:
https://vospace.pic.es/vospace
Compatible clients
You can access the service using tools compatible with the VOSpace 2.1 standard:
- CADC vos client (Python). Install with pip (the quotes keep the shell from treating >= as a redirection):
pip install "vos>=3.6.3"
- curl: can be used to perform HTTP operations following the examples defined in the VOSpace 2.1 standard, including node creation, file transfers, and property queries.
Authentication
The VOSpace server allows both anonymous and authenticated access. To access restricted data or their personal space, users must provide their PIC username and password.
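As an illustration, the following sketch builds anonymous and authenticated requests with the Python standard library only. It assumes the server accepts HTTP Basic authentication, which this page does not state explicitly; the helper names and the USERNAME/PASSWORD placeholders are hypothetical. The requests are built but not sent.

```python
import base64
import urllib.request
from typing import Optional

VOSPACE_URL = "https://vospace.pic.es/vospace"


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def build_request(path: str,
                  username: Optional[str] = None,
                  password: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a GET request; anonymous if no username is given."""
    request = urllib.request.Request(VOSPACE_URL + path, method="GET")
    if username is not None:
        # Assumption: the server understands HTTP Basic authentication.
        request.add_header("Authorization", basic_auth_header(username, password or ""))
    return request


# The same endpoint, once anonymously and once with credentials.
anonymous = build_request("/protocols")
authenticated = build_request("/protocols", "USERNAME", "PASSWORD")
```

To send either request, pass it to urllib.request.urlopen.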
Usage Guidelines
Once connected to the service, users can perform a variety of operations defined by the VOSpace standard. Supported functionalities include:
- getProtocols: Query supported transfer protocols.
- getViews: View available data formats (views).
- getProperties: Obtain node properties.
- createNode: Create files or folders.
- getNode: Retrieve node information.
- deleteNode: Delete files or folders.
- moveNode: Move nodes inside the VOSpace.
- put / get: Upload or download files.
- pushToVoSpace / pullFromVoSpace: Transfer files via external URLs.
Best Practices
- Avoid using the ! character in file or folder names, as it may cause compatibility issues with the CADC vos client.
- Use tools that comply with the VOSpace standard to ensure compatibility and avoid transfer errors.
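Before uploading a local directory tree, it can help to check it for names containing the ! character. The sketch below is one possible pre-flight check using only the standard library; the function name find_names_with_bang is hypothetical and the demonstration runs against a throwaway temporary directory.

```python
import os
import tempfile


def find_names_with_bang(root):
    """Yield paths under `root` whose file or folder name contains '!'."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if "!" in name:
                yield os.path.join(dirpath, name)


# Demonstration: one acceptable name and one that should be renamed.
with tempfile.TemporaryDirectory() as root:
    open(os.path.join(root, "ok.txt"), "w").close()
    open(os.path.join(root, "bad!name.txt"), "w").close()
    flagged = [os.path.basename(path) for path in find_names_with_bang(root)]

print("Rename before upload:", flagged)
```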
Troubleshooting
Cannot access the server:
- Verify the URL is correct:
https://vospace.pic.es/vospace
- Make sure PIC credentials are set correctly.
- Check client configuration (e.g., vos, curl).
Problems with properties:
- The setProperties operation is not allowed due to permission restrictions on HDFS.
Security and Data Management
- Protect PIC credentials. Do not share or store them in plain text.
- For automation, consider using secure credential managers or temporary storage.
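For automated jobs, one common pattern is to let a credential manager or CI secret store inject the credentials at run time instead of writing them into the script. The sketch below assumes two hypothetical environment variables, PIC_USERNAME and PIC_PASSWORD, and falls back to an interactive prompt when the password is not provided.

```python
import getpass
import os


def load_credentials(env=os.environ, prompt=getpass.getpass):
    """Read credentials from the environment, prompting for a missing password.

    PIC_USERNAME and PIC_PASSWORD are illustrative names, not defined by PIC.
    """
    username = env.get("PIC_USERNAME")
    if not username:
        raise RuntimeError("PIC_USERNAME is not set")
    # Prompt only when no password was injected, so nothing is hard-coded.
    password = env.get("PIC_PASSWORD") or prompt(f"Password for {username}: ")
    return username, password
```

Passing env and prompt as parameters keeps the function easy to test without touching the real environment.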
Usage Examples
Using the CADC vos client (Python)
This server supports the CADC vos client and operations like listdir, mkdir, copy, move, and delete. Minimum version: vos >= 3.6.3.
Configuration
- Create a config file (e.g. vos-config.ini):
[vos]
resourceID = https://vospace.pic.es/vospace http
- Export the file path as an environment variable:
export VOSPACE_CONFIG_FILE=/path/to/vos-config.ini
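To confirm that the variable points to a readable file with the expected section, a quick check such as the following may help. It is a minimal sketch using configparser from the standard library; read_resource_id is a hypothetical helper, and the demonstration writes a throwaway file with the same layout as the example above.

```python
import configparser
import os
import tempfile


def read_resource_id(config_path=None):
    """Return the resourceID value from the [vos] section of the config file."""
    path = config_path or os.environ.get("VOSPACE_CONFIG_FILE")
    if not path or not os.path.isfile(path):
        raise FileNotFoundError(f"Config file not found: {path!r}")
    parser = configparser.ConfigParser()
    parser.read(path)
    # configparser matches option names case-insensitively by default.
    return parser.get("vos", "resourceID")


# Demonstration against a temporary file laid out like vos-config.ini.
with tempfile.NamedTemporaryFile("w", suffix=".ini", delete=False) as handle:
    handle.write("[vos]\nresourceID = https://vospace.pic.es/vospace http\n")
resource_id = read_resource_id(handle.name)
os.remove(handle.name)
print("resourceID:", resource_id)
```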
- Create ~/.netrc for authentication:
machine https://vospace.pic.es/vospace login USERNAME password PASSWORD
- Set permissions:
chmod 600 ~/.netrc
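The permission step can also be verified from Python. The sketch below checks that a file grants no access to group or others, which is what mode 600 means; it assumes a POSIX system, and is_private is a hypothetical helper. The demonstration uses a temporary file rather than the real ~/.netrc.

```python
import os
import stat
import tempfile


def is_private(path):
    """Return True if neither group nor others have any permission on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


# Demonstration on a temporary file (POSIX permissions assumed).
with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name
os.chmod(demo_path, 0o644)
too_open = is_private(demo_path)   # group and others can read
os.chmod(demo_path, 0o600)
private = is_private(demo_path)    # owner-only access
os.remove(demo_path)
```

For the real file, call is_private(os.path.expanduser("~/.netrc")).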
Python usage examples
import vos
client = vos.Client()

# List contents
files = client.listdir("https://vospace.pic.es/vospace/nodes/user/my_user/")
print("Contents:", files)

# Create directory
client.mkdir("https://vospace.pic.es/vospace/nodes/user/my_user/mydir")

# Upload file
client.copy("localfile.txt", "https://vospace.pic.es/vospace/user/my_user/mydir/localfile.txt")

# Download file
client.copy("https://vospace.pic.es/vospace/user/my_user/mydir/localfile.txt", "downloaded.txt")

# Move file
client.move(
    "https://vospace.pic.es/vospace/user/my_user/mydir/localfile.txt",
    "https://vospace.pic.es/vospace/user/my_user/mydir/renamedfile.txt",
)

# Get node info
node = client.get_node("https://vospace.pic.es/vospace/nodes/user/my_user/mydir/renamedfile.txt")
print("Properties:", node.props)

# Delete resource
client.delete("https://vospace.pic.es/vospace/nodes/user/my_user/mydir")
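Note: These examples assume that the config file, the VOSPACE_CONFIG_FILE variable, and ~/.netrc are set up as described in the Configuration steps.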
Using curl
You can also interact with the server using curl or other HTTP tools.
Requests must follow the VOSpace 2.1 specification.
Important:
- Avoid using ! in URLs.
- Use full HTTP paths like /vospace/nodes/...
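When scripting HTTP calls, a small helper can keep node URLs consistent with these two rules. The following is a sketch only: node_url is a hypothetical helper that joins path components under /vospace/nodes/, percent-encodes each one, and rejects names containing !.

```python
from urllib.parse import quote

BASE_URL = "https://vospace.pic.es/vospace"


def node_url(*parts: str) -> str:
    """Join path components under /vospace/nodes/, percent-encoding each one."""
    for part in parts:
        if "!" in part:
            raise ValueError(f"'!' is not allowed in node names: {part!r}")
    # safe="" also encodes "/" so a component can never add an extra path level.
    encoded = "/".join(quote(part, safe="") for part in parts)
    return f"{BASE_URL}/nodes/{encoded}"


print(node_url("user", "my_user", "mydir"))
```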
VOSpace Server Endpoints
The server provides the following REST endpoints per the VOSpace 2.1 specification:
Method | Path | Description |
---|---|---|
GET | /vospace/protocols | Retrieves supported transfer protocols |
GET | /vospace/views | Retrieves available data views |
GET | /vospace/properties | Retrieves node properties |
GET | /vospace/capabilities | Retrieves server capabilities |
GET | /vospace/{job_id} | Retrieves job information |
GET | /vospace/{job_id}/phase | Retrieves job phase |
GET | /vospace/{job_id}/error | Retrieves job error info |
GET | /vospace/{job_id}/results/transferDetails | Retrieves transfer details |
GET | /vospace/{path} | Retrieves a file/folder node |
GET | /vospace/{path} | Downloads a file (streaming) |
PUT | /vospace/{path} | Creates a file or folder |
PUT | /vospace/{path} | Uploads a file |
POST | /vospace/ | Moves or copies a node (returns 303 redirect) |
POST | /vospace/synctrans | Push/pull transfer |
POST | /vospace/{path} | Set properties (not supported) |
POST | /vospace/{job_id}/phase | Change job phase |
DELETE | /vospace/{path} | Deletes a file or folder |
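The job endpoints in the table can be used to wait for a move or copy to finish. The sketch below polls GET /vospace/{job_id}/phase until the job reaches a final state. It rests on two assumptions not stated on this page: that the endpoint returns the phase as plain text, and that the final phases are COMPLETED, ERROR, and ABORTED, as in the IVOA UWS pattern. The function names are hypothetical, and the demonstration uses a stand-in fetch function instead of the network.

```python
import time
import urllib.request

BASE_URL = "https://vospace.pic.es/vospace"
TERMINAL_PHASES = {"COMPLETED", "ERROR", "ABORTED"}  # assumed, following UWS


def fetch_phase(job_id: str) -> str:
    """GET /vospace/{job_id}/phase and return the body as stripped text."""
    with urllib.request.urlopen(f"{BASE_URL}/{job_id}/phase") as response:
        return response.read().decode("utf-8").strip()


def wait_for_job(job_id, fetch=fetch_phase, interval=2.0, max_polls=30):
    """Poll the job phase until it is terminal or max_polls is reached."""
    for _ in range(max_polls):
        phase = fetch(job_id)
        if phase in TERMINAL_PHASES:
            return phase
        time.sleep(interval)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls")


# Demonstration with a fake fetch function; no request is sent.
fake_phases = iter(["QUEUED", "EXECUTING", "COMPLETED"])
result = wait_for_job("example-job", fetch=lambda _job: next(fake_phases), interval=0)
```

Injecting the fetch function keeps the polling logic separate from the HTTP call, so it can be exercised offline.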
Notes
- {path} and {job_id} are dynamic parameters.
- The server uses standard HTTP error codes: 400, 403, 404, 409, 500.
- PUT supports uploads and node creation.
- POST /vospace/ moves or copies nodes (responds with 303 See Other).
- setProperties is disabled due to HDFS permissions.
- File downloads return streamed content with GET.
- Avoid using ! in paths for full compatibility with the CADC client.
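In scripts, the error codes listed in these notes can be turned into readable messages. The sketch below pairs each code with its standard HTTP reason phrase and a short hint; the hints are an interpretation for illustration, not messages returned by the server, and explain_status is a hypothetical helper.

```python
from http import HTTPStatus

# Illustrative hints only; the server's own error body may be more specific.
HINTS = {
    400: "Check the request syntax and the node path.",
    403: "Check your PIC credentials and your permissions on the node.",
    404: "The node or its parent folder does not exist.",
    409: "The request conflicts with the current state, e.g. the node already exists.",
    500: "Server-side failure; retry later.",
}


def explain_status(code: int) -> str:
    """Return '<code> <reason phrase>: <hint>' for an HTTP status code."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown status"
    hint = HINTS.get(code, "No specific hint available.")
    return f"{code} {phrase}: {hint}"


for status in (400, 403, 404, 409, 500):
    print(explain_status(status))
```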