ArcoDataHub, part of the Italian AI Factory (IT4LIA), offers an innovative and efficient way to access live-updated ARCO (Analysis-Ready Cloud-Optimized) datasets across diverse domains. Here's everything you need to know to start working with our datasets.
Datasets are published in ARCO formats, including Zarr, COG, GeoParquet, and FlatGeobuf, with live updates (for example, weather radar data updated every 5 minutes). Access is controlled per user to keep data secure and to enforce usage quotas.
We'll show you how to obtain and use your authentication credentials, but let's start by understanding the basic workflow for accessing data through ArcoDataHub.
The first step to accessing ArcoDataHub is creating your account. Our registration process is designed to be simple and secure.
Once your account is set up, you can request access to specific datasets. Our approval process ensures data is used for legitimate research and educational purposes.
Explore our catalog of ARCO datasets spanning meteorology, agriculture, cybersecurity, and other domains.
Use our request form to apply for access, including your research purpose and institutional details.
The easiest way to get started with ArcoDataHub is using Python and Xarray. Make sure you have Python set up and install the required tools.
pip install xarray zarr dask aiohttp requests
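Before moving on, it can help to confirm that the packages above are importable. The following is a minimal sketch using only the standard library; `missing_packages` is a hypothetical helper name for illustration and is not part of ArcoDataHub.

```python
from importlib.util import find_spec

# Packages installed by the pip command above
REQUIRED = ["xarray", "zarr", "dask", "aiohttp", "requests"]


def missing_packages(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

If any package is reported missing, rerunning the pip command in the same Python environment is usually enough.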
Once your access is approved, you'll receive your personal access credentials. Here's how to use them to access datasets.
import xarray as xr
import requests
# Your personal access credentials (from your account page)
username = "your_username"
access_key = "your_access_key"
# Example: Access a dataset
dataset_url = f"https://{username}:{access_key}@api.arcodatahub.com/S3/dataset_name.zarr"
ds = xr.open_dataset(dataset_url, engine="zarr")
# Display dataset information (ds.info() prints directly and returns None)
ds.info()
# Access specific variables
if 'temperature' in ds:
    temperature = ds['temperature']
    print(f"Temperature data shape: {temperature.shape}")
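Because the credentials are embedded in the URL, characters such as `@`, `/`, or `:` in an access key would break it unless they are percent-encoded. The sketch below builds the URL safely; `build_dataset_url` is a hypothetical helper, the host and path pattern are copied from the example above, and it assumes the service accepts standard percent-encoded credentials.

```python
from urllib.parse import quote

# Host and path pattern copied from the example above
BASE_HOST = "api.arcodatahub.com/S3"


def build_dataset_url(username, access_key, dataset_name):
    """Build a credentialed dataset URL, percent-encoding both credentials."""
    user = quote(username, safe="")
    key = quote(access_key, safe="")
    return f"https://{user}:{key}@{BASE_HOST}/{dataset_name}"


# Placeholder values for illustration only
print(build_dataset_url("your_username", "key/with@chars", "dataset_name.zarr"))
```

The returned string can be passed to `xr.open_dataset(..., engine="zarr")` in place of the hand-written f-string.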
For more convenient access, you can configure your credentials using environment variables or configuration files:
# Using environment variables
import os
import aiohttp
import xarray as xr
username = os.getenv('ARCODATAHUB_USERNAME')
access_key = os.getenv('ARCODATAHUB_ACCESS_KEY')
# Configure Xarray with storage options. Sending the credentials as HTTP basic
# auth is an assumption that mirrors the username:access_key URL form shown earlier.
storage_options = {
    "client_kwargs": {
        "trust_env": True,
        "auth": aiohttp.BasicAuth(username, access_key),
    }
}
ds = xr.open_dataset(
    "https://api.arcodatahub.com/S3/dataset_name.zarr",
    storage_options=storage_options,
    chunks={},
    engine="zarr",
)
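When credentials come from the environment, an unset variable silently becomes `None` and the failure only shows up later as an authentication error. A minimal sketch of an early check follows; `load_credentials` is a hypothetical helper, and the variable names are the ones used in the example above.

```python
import os


def load_credentials(env=None):
    """Read the two ArcoDataHub variables and fail early if either is unset."""
    env = os.environ if env is None else env
    names = ("ARCODATAHUB_USERNAME", "ARCODATAHUB_ACCESS_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]


# Typical use, once both variables are exported in your shell:
# username, access_key = load_credentials()
```

Accepting an optional mapping instead of always reading `os.environ` keeps the helper easy to test without touching the real environment.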
Now that you're set up with ArcoDataHub and ready to access live-updated ARCO datasets, here are some suggested next steps:
If you have a dataset in mind that could benefit the community, let us know! Send your proposal to the registration email address. We welcome suggestions for new datasets and collaborations.
Do you know someone who could benefit from ArcoDataHub? Spread the word!
If you encounter any issues or have questions about using ArcoDataHub, don't hesitate to reach out to our support team. We're here to help you make the most of the Italian AI Factory's diverse ARCO data resources.