PURL help

What is a PURL?

A PURL is a persistent URL: a permanent address for a resource on the web. When a user retrieves a PURL, they are redirected to the current location of the resource. When an author needs to move a page, they can update the PURL to point to the new location.

PURLs with a common prefix are grouped together into domains. Each domain has a single maintainer who can add new PURLs to the domain and make changes to existing PURLs within the domain.

PURL types

Each PURL has a target and a type. The target specifies where the PURL redirects to. The type is a status code from the HTTP specification. The default PURL type is "302 Found", meaning that the requested resource was found elsewhere. Use this status code unless there is a reason to use a different one.

A partial PURL is a special type that matches the beginning of a URL. The PURL resolver matches as much of a PURL as it can and appends the remainder to the end of the resolved URL. This reduces the need to create multiple PURLs to handle all of the resources that share a common location. For more on partials, see "Partial types and name resolution" below.

For more information, Wikipedia has a brief summary of HTTP status codes and their meanings. For much more detailed information, see the redirection section of RFC 9110.

PURL Type  Meaning                                          HTTP Shorthand
301        Moved permanently to a target URL                Moved Permanently
302        Simple redirection to a target URL               Found
303        See other URLs (use for Semantic Web resources)  See Other
307        Temporary redirect to a target URL               Temporary Redirect
404        Temporarily gone                                 Not Found
410        Permanently gone                                 Gone
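
As a rough illustration of what the types in the table mean at the HTTP level, the following Python sketch builds the status line and headers a resolver might send for each type. The function name `response_for` and the shape of the response are assumptions made for illustration, not the PURL service's actual implementation. The reason phrases come from Python's standard `http.HTTPStatus`.

```python
from http import HTTPStatus
from typing import Dict, Tuple

# Types from the table above, split by whether they redirect.
REDIRECT_TYPES = {301, 302, 303, 307}
GONE_TYPES = {404, 410}


def response_for(purl_type: int, target: str) -> Tuple[str, Dict[str, str]]:
    """Return a (status line, headers) pair for a PURL type (hypothetical helper)."""
    if purl_type not in REDIRECT_TYPES | GONE_TYPES:
        raise ValueError(f"unsupported PURL type: {purl_type}")
    status = HTTPStatus(purl_type)
    status_line = f"HTTP/1.1 {status.value} {status.phrase}"
    if purl_type in REDIRECT_TYPES:
        # Redirect types tell the client where to go next.
        return status_line, {"Location": target}
    # 404 and 410 carry no redirect target.
    return status_line, {}


print(response_for(302, "http://example.org/new-home"))
print(response_for(410, ""))
```

The partial type is left out here because it is not an HTTP status code.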

Partial types and name resolution

As mentioned above, partials are a special type that only has meaning within the PURL service; they are not part of any standard. Although they can reduce the need to create many PURLs, they can also be confusing because they act somewhat like wildcard forwarders. At a high level, PURLs are resolved as follows:
  1. a patron visits https://purl.archive.org/some/path/to/a/purl;
  2. the PURL service determines the correct domain to use within the patron-entered path;
  3. within that domain, a search is performed for a matching PURL according to the following logic:
    1. if there is an exact match based on the PURL's "name" field, that PURL is used and the patron is forwarded to the "target" of that PURL;
    2. otherwise, a case-insensitive match is attempted, and if one or more are found, the first case-insensitive match is used;
    3. finally, if one or more PURLs of the partial type match, the one with the longest "name" field is used.
To help visualize this, consider the following example:
name                                          type       target                                                             created
/example-domain                               302 Found  http://example.org/a-domain-can-be-a-purl-too                      2024-05-07 23:34:38
/example-domain/partial                       partial    http://example.org/partial                                         2024-05-07 23:34:55
/example-domain/partial/something/specific    302 Found  http://example.org/this-does-not-forward-to-the-partial-namespace  2024-05-07 23:37:59
/example-domain/partial/crazy/nested/partial  partial    http://example.org/nested-partial/destination                      2024-05-07 23:38:50
In the above example domain, the following would be true:
  1. https://purl.archive.org/example-domain would forward only to http://example.org/a-domain-can-be-a-purl-too.
  2. https://purl.archive.org/example-domain/partial would forward to http://example.org/partial, but it would also forward anything after the name of our partial, which here is named "partial". Partials can be named anything. Only the type matters.
  3. https://purl.archive.org/example-domain/partial/123 is an example of this, and would forward to http://example.org/partial/123, because "123" was added to the end of the partial.
  4. https://purl.archive.org/example-domain/partial/this/is/weird would forward to http://example.org/partial/this/is/weird for the same reason.
  5. https://purl.archive.org/example-domain/partial/something/specific isn't treated as a partial, even though this occurs within the "example-domain/partial" namespace. It is an exact match of a PURL with a 302 redirect and would be treated as such. Therefore it would forward only to http://example.org/this-does-not-forward-to-the-partial-namespace, and the partial would not come into play at all.
  6. https://purl.archive.org/example-domain/partial/crazy/nested/partial/file5.tar.gz matches two partials. Because the nested partial name "/example-domain/partial/crazy/nested/partial" is longer than "/example-domain/partial", it takes precedence, so this would forward to http://example.org/nested-partial/destination/file5.tar.gz.
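
The resolution rules and the example domain above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the PURL service's actual code: the `Purl` class and `resolve` function are hypothetical names, the domain lookup step is skipped, "first" case-insensitive match means first in list order, partial prefixes are matched case-sensitively on path segment boundaries, and the type of a non-partial match is ignored (its target is simply returned).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Purl:
    name: str    # path below the resolver host
    type: str    # "302", "partial", ...
    target: str


# The example domain from the table above.
PURLS = [
    Purl("/example-domain", "302",
         "http://example.org/a-domain-can-be-a-purl-too"),
    Purl("/example-domain/partial", "partial",
         "http://example.org/partial"),
    Purl("/example-domain/partial/something/specific", "302",
         "http://example.org/this-does-not-forward-to-the-partial-namespace"),
    Purl("/example-domain/partial/crazy/nested/partial", "partial",
         "http://example.org/nested-partial/destination"),
]


def resolve(path: str, purls: List[Purl] = PURLS) -> Optional[str]:
    """Return the URL to forward to, or None if nothing matches."""
    # 1. An exact match on the name wins.
    for purl in purls:
        if purl.name == path:
            return purl.target
    # 2. Otherwise, take the first case-insensitive match.
    for purl in purls:
        if purl.name.lower() == path.lower():
            return purl.target
    # 3. Otherwise, use the matching partial with the longest name
    #    and append the unmatched remainder of the path to its target.
    partials = [p for p in purls
                if p.type == "partial" and path.startswith(p.name + "/")]
    if not partials:
        return None
    best = max(partials, key=lambda p: len(p.name))
    return best.target + path[len(best.name):]


for path in ("/example-domain/partial/123",
             "/example-domain/partial/something/specific",
             "/example-domain/partial/crazy/nested/partial/file5.tar.gz"):
    print(path, "->", resolve(path))
```

Under these assumptions, the three paths in the loop resolve to the same targets as items 3, 5, and 6 in the list above.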

Claiming a PURL domain

The PURL service is now administered by the Internet Archive. If you have any difficulty making changes to your PURLs, please contact info@archive.org for assistance.

Administering PURLs

The PURL system is a service of the Internet Archive. To make changes to a PURL, users need a user account with the Internet Archive.

Search for a PURL

PURLs are grouped into domains. Domains can be searched from the home page.

The domain search shows a list of domains that match the search criteria. Each domain links to a domain details page.

Viewing the contents of a PURL domain

The domain details page displays the list of PURLs within the domain. This includes the name, redirect type and target for each PURL.

There is a form that can be used to add new PURLs.

Every PURL has a page that shows information about the PURL including the revision history. There is a link to the edit page.

Users can edit the type of PURL and the target URL.

Version

This is version 1.2.3.