This page explains how to create a URL list and test your process for generating MD5 hashes. You can use Storage Transfer Service to transfer data from a list of public data locations to a Cloud Storage bucket. When you configure your transfer, you simply refer to the URL list.
Requirements
The following are requirements of URL lists:
- The URL list must be a tab-separated values (TSV) file.
- URLs must be sorted in UTF-8 lexicographical order.
- The server sets a strong Etag header in the HTTP response when it returns the URL list.
- The URL list is publicly accessible from a URL beginning with either http or https.
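The sorting requirement can be sketched as follows. This is a minimal illustration, not an official tool: the helper name `sort_urls` is hypothetical, and it assumes that "UTF-8 lexicographical order" means comparing the UTF-8 encoded bytes of each URL.

```python
def sort_urls(urls):
    """Return URLs sorted by their UTF-8 encoded bytes.

    Assumption: "UTF-8 lexicographical order" is a byte-wise comparison
    of the encoded URLs. Uppercase ASCII letters therefore sort before
    lowercase ones.
    """
    return sorted(urls, key=lambda url: url.encode("utf-8"))


if __name__ == "__main__":
    for url in sort_urls([
        "https://example.com/buckets/obj2",
        "https://example.com/buckets/obj1",
    ]):
        print(url)
```

Sorting the entries before writing the file avoids having to reorder lines by hand later.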
To ensure your data is transferable, verify the following:
- Each URL you specify is publicly accessible. For example, in Cloud Storage, you can share an object publicly and get a link to it.
- The server's robots.txt file allows access to each URL.
- The server hosting each object:
  - Supports Range requests.
  - Returns a Content-Length header in each response.
Formatting the URL list
Do the following to format a URL list:
- Create a tab-separated values (TSV) file.
- Insert the format specifier TsvHttpData-1.0 on the first line.
- Add one line for each object to transfer. Include the following tab-separated fields, in order, on each line:
  - The HTTP or HTTPS URL of a source object. When an object located at http(s)://[HOSTNAME]:[PORT]/[URL_PATH] is transferred to Cloud Storage, the name of the object in Cloud Storage is [HOSTNAME]/[URL_PATH].
  - The size of the object in bytes. Ensure that the specified size matches the actual size of the object when it is fetched. If the size of the object received by Cloud Storage does not match the size specified, the object transfer fails.
  - The base64-encoded MD5 checksum of the object. Ensure that the specified MD5 checksum matches the MD5 checksum computed from the transferred bytes. If the MD5 checksum of the object received by Cloud Storage does not match the MD5 checksum specified, the object transfer fails.
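One way to produce the size and checksum fields is sketched below, using only the Python standard library. The function name `url_list_line` is hypothetical, and the sketch assumes you already have the object's bytes in memory. Note that the checksum is the base64 encoding of the raw 16-byte MD5 digest, not of its hexadecimal string.

```python
import base64
import hashlib


def url_list_line(url: str, data: bytes) -> str:
    """Build one tab-separated URL list line for an object.

    Fields, in order: URL, size in bytes, base64-encoded MD5 checksum.
    """
    # Encode the raw digest bytes, not the hex representation.
    digest = hashlib.md5(data).digest()
    checksum = base64.b64encode(digest).decode("ascii")
    return f"{url}\t{len(data)}\t{checksum}"


if __name__ == "__main__":
    # Hypothetical URL and content, for illustration only.
    print(url_list_line("https://example.com/buckets/obj1", b"example content"))
```

For large objects, a streaming variant that feeds chunks to `hashlib.md5().update()` would avoid holding the whole object in memory.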
The following is a sample TSV file that specifies two objects to transfer:
TsvHttpData-1.0
https://example.com/buckets/obj1	1357	wHENa08V36iPYAsOa2JAdw==
https://example.com/buckets/obj2	2468	R9acAaveoPd2y8nniLUYbw==
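Before configuring a transfer, it can help to check a URL list locally. The following is a rough sketch of such a check, based only on the format rules described on this page. The names `parse_url_list` and `destination_name` are hypothetical, and the sort check assumes byte-wise comparison of UTF-8 encoded URLs. It does not contact any server, so it cannot confirm that sizes and checksums match the hosted objects.

```python
import base64
from urllib.parse import urlsplit

FORMAT_SPECIFIER = "TsvHttpData-1.0"


def parse_url_list(text: str) -> list[tuple[str, int, str]]:
    """Parse URL list text into (url, size, md5_base64) tuples.

    Raises ValueError if the text breaks one of the format rules.
    """
    lines = text.splitlines()
    if not lines or lines[0] != FORMAT_SPECIFIER:
        raise ValueError("first line must be the format specifier")
    entries = []
    for number, line in enumerate(lines[1:], start=2):
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(f"line {number}: expected 3 tab-separated fields")
        url, size, md5_b64 = fields
        if urlsplit(url).scheme not in ("http", "https"):
            raise ValueError(f"line {number}: URL must use http or https")
        # An MD5 digest is always 16 bytes once base64-decoded.
        if len(base64.b64decode(md5_b64, validate=True)) != 16:
            raise ValueError(f"line {number}: checksum is not a 16-byte MD5")
        entries.append((url, int(size), md5_b64))
    encoded = [url.encode("utf-8") for url, _, _ in entries]
    if encoded != sorted(encoded):
        raise ValueError("URLs are not sorted")
    return entries


def destination_name(url: str) -> str:
    """Return [HOSTNAME]/[URL_PATH] for a source URL, dropping any port."""
    parts = urlsplit(url)
    return f"{parts.hostname}{parts.path}"


if __name__ == "__main__":
    sample = (
        "TsvHttpData-1.0\n"
        "https://example.com/buckets/obj1\t1357\twHENa08V36iPYAsOa2JAdw==\n"
        "https://example.com/buckets/obj2\t2468\tR9acAaveoPd2y8nniLUYbw==\n"
    )
    for url, size, checksum in parse_url_list(sample):
        print(destination_name(url), size, checksum)
```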