NFT Metadata

Retrieving NFT metadata is a complex and time-consuming process. It requires reading individual smart contracts, understanding and working with various encodings and edge cases, crawling external resources stored on public HTTP or IPFS servers, and dealing with often unreliable or unresponsive third-party servers. Keeping metadata up-to-date is an even more complex task, as it requires constantly re-indexing millions of tokens.

Mnemonic solves these problems by indexing on-chain and off-chain NFT metadata in real time. It normalizes the metadata and keeps it up to date by continuously re-indexing changed metadata across the entire NFT universe. This makes it easy for developers and users to access and work with NFT metadata, regardless of the underlying blockchain or storage format.

Understanding metadata

Metadata is arguably one of the most critical pieces of data for builders creating NFT-related experiences. What people often don't see is how deceptively difficult it is to fetch and build with NFT metadata, especially at scale. Check out our high-level overview of the complexities of building with NFT metadata.

Before we jump into the technical details, it helps to outline a few basics.

Heterogeneous metadata standards and formats

While there are certain standard proposals that are commonly accepted by the developer community, it is not guaranteed that a smart contract implementation follows these standards.

In particular, because none of these standards is enforceable, it is impossible to build a deterministic approach that yields predictable results.

In essence, NFT metadata can be any text or binary data stored on-chain or off-chain.

Note: An NFT is not guaranteed to have any metadata at all.

After crawling hundreds of millions of NFTs (210 million NFTs were crawled by Mnemonic at the time of writing), we identified the following most common metadata formats:

On-chain JSON, HTML, JavaScript or text documents (plain or base64 encoded);
On-chain SVG (plain or base64 encoded);
Off-chain JSON, HTML, JavaScript or text documents (plain or base64 encoded);
Off-chain SVG (plain or base64 encoded);
Off-chain PNG, JPEG, MP4, MP3 and other image/video/audio formats;
Off-chain binary data (often either encrypted documents or archives).

Aside from that, there are tens of thousands of malformed JSON documents and URLs, as well as non-existent or invalid resources (for example, when the domain where the metadata used to be stored no longer exists).
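To make the variety above concrete, here is a minimal sketch of how a client without a normalizing API might classify a raw token URI. It is an illustration only, not Mnemonic's crawler logic: the function name `classify_token_uri` and the `kind` labels are hypothetical, and only a few of the listed formats are covered.

```python
import base64
import json
from urllib.parse import unquote


def classify_token_uri(raw: str) -> dict:
    """Roughly classify a raw token URI and decode any inline payload."""
    uri = (raw or "").strip()
    if not uri:
        return {"kind": "missing", "mime_type": None, "payload": None}

    if uri.startswith("data:"):
        # Data URI layout: data:[<mime type>][;base64],<payload>
        header, comma, body = uri[len("data:"):].partition(",")
        if not comma:
            return {"kind": "malformed", "mime_type": None, "payload": None}
        parts = header.split(";")
        mime_type = parts[0] or "text/plain"
        if "base64" in parts[1:]:
            try:
                payload = base64.b64decode(body, validate=True)
            except ValueError:  # binascii.Error is a subclass of ValueError
                return {"kind": "malformed", "mime_type": mime_type, "payload": None}
        else:
            payload = unquote(body).encode("utf-8")
        return {"kind": "on-chain", "mime_type": mime_type, "payload": payload}

    scheme = uri.split("://", 1)[0].lower() if "://" in uri else ""
    if scheme == "ipfs":
        return {"kind": "ipfs", "mime_type": None, "payload": None}
    if scheme in ("http", "https"):
        return {"kind": "http", "mime_type": None, "payload": None}

    # Plain on-chain JSON: the contract returns the document itself, not a URI.
    try:
        json.loads(uri)
    except ValueError:
        return {"kind": "unknown", "mime_type": None, "payload": None}
    return {"kind": "on-chain", "mime_type": "application/json", "payload": uri.encode("utf-8")}
```

Even this small sketch needs separate branches for data URIs, IPFS, HTTP, and inline JSON, and it still ignores HTML, JavaScript, media files, and binary payloads.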

With the Mnemonic API you don't have to worry about any of that; we've got you covered.

Dynamic metadata

Some contracts implement dynamic metadata: the metadata changes when certain on-chain conditions are met or when someone interacts with the contract externally.

Most commonly:

In-game assets obtain a new property based on the game/character progression.
A new property or trait is added by the creators of a collection.
A new property or trait is obtained by the holder based on certain conditions.
Metadata is revealed after a collection has been fully minted.

This creates a challenge, because it requires constant crawling of the metadata to ensure that it is up-to-date.

Mnemonic constantly crawls all NFTs across the blockchains at different paces based on various characteristics to provide the most up-to-date metadata to developers.
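One simple way a client could detect that a token's metadata has changed between two crawls is to compare content fingerprints. The sketch below is an assumption for illustration, not a description of how Mnemonic's crawler works; `metadata_fingerprint` and `has_changed` are hypothetical helper names.

```python
import hashlib
import json
from typing import Optional


def metadata_fingerprint(metadata: dict) -> str:
    """Hash a canonical JSON form so that key order does not affect the result."""
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint: Optional[str], current: dict) -> bool:
    """Return True when there is no earlier fingerprint or the content differs."""
    return previous_fingerprint != metadata_fingerprint(current)


# Usage: a collection that reveals its metadata after minting.
before = {"name": "Unrevealed", "image": "ipfs://placeholder"}
after = {"name": "Token #1", "image": "ipfs://revealed", "attributes": [{"trait_type": "Hat", "value": "Red"}]}
stored = metadata_fingerprint(before)
print(has_changed(stored, after))
```

Storing only the fingerprint keeps the comparison cheap, at the cost of not knowing which field changed.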

Missing metadata

As mentioned above, not every NFT is guaranteed to have metadata. In such cases, it is fair to ask whether an NFT that lacks metadata is valid in the first place.

There are other scenarios in which metadata may be missing or not available via the Mnemonic API:

The domain where the metadata was initially stored no longer exists or is not accessible due to other reasons.
The remote server where the metadata was stored is not available (either it was removed, is faulty, or is not able to serve HTTP requests anymore).
The tokenURI (ERC-721) or uri (ERC-1155) method is not implemented by the NFT smart contract.
The contract no longer exists.

Mnemonic makes its best attempt to crawl such metadata over a certain period of time by automatically revisiting these resources. However, after several unsuccessful attempts the crawler will abandon these URLs.

We flag such collections as not "trustworthy".
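The revisit-then-abandon behaviour described above can be sketched as a bounded retry loop. This is an assumption-laden illustration: the attempt count, the exponential delay schedule, and the name `fetch_with_retries` are hypothetical and do not reflect the crawler's real schedule.

```python
import time
from typing import Callable, Optional


def fetch_with_retries(
    fetch: Callable[[str], bytes],
    uri: str,
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Call fetch(uri) up to max_attempts times; return None once abandoned."""
    for attempt in range(max_attempts):
        try:
            return fetch(uri)
        except OSError:
            # Network-level failures (urllib's URLError is an OSError subclass).
            if attempt == max_attempts - 1:
                break
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ... between attempts
    return None
```

A caller that receives `None` would then mark the resource as unavailable instead of retrying forever. The `sleep` parameter is injectable so the schedule can be tested without waiting.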

Mnemonic schema

Mnemonic normalizes metadata from all the disparate formats into a common schema to ensure consistency and to make it easy for clients to consume.

// Metadata holds decoded NFT metadata as defined by the `tokenURI` of the NFT
// contract.
//
// The decoded part of the metadata holds only those fields which are defined
// as part of the token metadata specification (such as ERC-721 Metadata JSON
// Schema, ERC-1155 Metadata URI JSON Schema etc.).
message Metadata {
    // Details about original metadata resource returned by the `tokenURI`
    // (ERC-721) or `uri` (ERC-1155) methods of the contract that tracks this
    // NFT.
    //
    // In a special case, when this object is a part of the list-type endpoint
    // and contract returns URI longer than 2kB, the `uri` attribute will be
    // returned as an empty string, while `mimeType` will be filled with
    // corresponding MIME-type. To get full URI in this case – use dedicated
    // "NFT Details" endpoint.
    mnemonic.uniform.types.base.v1beta2.UriResource metadata_uri = 1;

    // Token name.
    string name = 2;

    // Token description.
    string description = 3;

    // Token image (if provided in the metadata).
    mnemonic.uniform.types.base.v1beta2.MediaUri image = 4;
}

// UriResource holds details about a resource (such as metadata, image, etc).
//
// A `uri` value in this message is not normalized and is returned as is like it was
// defined in the contract.
//
// Clients should follow the `mime_type` in order to properly handle a resource that
// this message represents.
//
// The `mime_type` holds the mime-type of an enclosed content, and not of an envelope
// type. Such that, if the `uri` represents a `base64` resource (either on-chain or
// remotely stored on IPFS or a server) the `mime_type` value will hold the mime-type
// of the content that is encoded by `base64`. For example, if a `uri` is a `base64`
// of a JSON content, the `mime_type` will be `application/json`.
message UriResource {
    // Original URI (may lead to HTTP(s), IPFS, base64 encoded data, etc).
    string uri = 1;

    // Detected MIME-type of the resource.
    string mime_type = 2;
}

// MediaUri holds the details about NFT media (image, animation etc), as well as provides
// a cached version if applicable.
//
// Caching is currently provided only for the following image types: `image/png`,
//  `image/jpeg`, `image/gif`, `image/svg+xml`. For any other types the original
// uri is provided.
message MediaUri {
    // A URI to the cached media resource (empty for unsupported media types).
    string uri = 1;

    // An original URI (may lead to HTTP(s) or IPFS). If this URI is a data URI or
    // an encoded SVG we will intentionally omit this field in the list responses
    // in order to prevent large payload sizes. In such cases, a `uri` should
    // be used instead which provides a cached version.
    //
    // This field is always populated in the full token metadata response.
    string uri_original = 2;

    // A MIME-type of the media resource.
    string mime_type = 3;
}

{
    "type": "object",
    "title": "Metadata",
    "properties": {
        "metadata_uri": {
            "$ref": "#/definitions/UriResource"
        },
        "name": {
            "type": "string",
            "description": "Token name."
        },
        "description": {
            "type": "string",
            "description": "Token description."
        },
        "image": {
            "$ref": "#/definitions/MediaUri",
            "description": "Token image (if provided in the metadata)."
        }
    },
    "definitions": {
        "UriResource": {
            "type": "object",
            "properties": {
                "uri": {
                    "type": "string",
                    "description": "Original URI (may lead to HTTP(s), IPFS, base64 encoded data, etc)."
                },
                "mime_type": {
                    "type": "string",
                    "description": "Detected MIME-type of the resource."
                }
            }
        },
        "MediaUri": {
            "type": "object",
            "properties": {
                "uri": {
                    "type": "string",
                    "description": "A URI to the cached media resource (empty for unsupported media types)."
                },
                "uri_original": {
                    "type": "string",
                    "description": "An original URI (may lead to HTTP(s) or IPFS)."
                },
                "mime_type": {
                    "type": "string",
                    "description": "A MIME-type of the media resource."
                }
            }
        }
    }
}

{
    "metadata": {
        "metadataUri": {
            "uri": "ipfs://QmPMc4tcBsMqLRuCQtPmPe84bpSjrC3Ky7t3JWuHXYB4aS/2596",
            "mimeType": "application/json"
        },
        "name": "Doodle #2596",
        "description": "A community-driven collectibles project featuring art by Burnt Toast. Doodles come in a joyful range of colors, traits and sizes with a collection size of 10,000. Each Doodle allows its owner to vote for experiences and activations paid for by the Doodles Community Treasury. Burnt Toast is the working alias for Scott Martin, a Canadian–based illustrator, designer, animator and muralist.",
        "image": {
            "uri": "https://ethereum.cdn.mnemonichq.com/0x8a90cab2b38dba80c64b7734e58ee1db38b8992e/2596/cc4aac0ab67cb60c67ef864c3ac5acd1febd3d6a23fffb8141fd8123ab67353f.png",
            "uriOriginal": "ipfs://QmcU7hF7r7oinete7skd15bmDXHR6khoJtkQtDaPLqU7iq",
            "mimeType": "image/png"
        }
    }
}

Full API definitions are available in the Buf repository.
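As a rough illustration of consuming the JSON response shown above, the following sketch maps the camelCase response fields onto small Python dataclasses that mirror the schema. The class layout and the `parse_metadata` helper are assumptions made for this example, not part of any official SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UriResource:
    uri: str
    mime_type: str


@dataclass
class MediaUri:
    uri: str
    uri_original: str
    mime_type: str


@dataclass
class Metadata:
    metadata_uri: Optional[UriResource]
    name: str
    description: str
    image: Optional[MediaUri]


def parse_metadata(response: dict) -> Metadata:
    """Build a Metadata object from a decoded JSON response, tolerating missing parts."""
    body = response.get("metadata") or {}
    raw_uri = body.get("metadataUri")
    raw_image = body.get("image")
    return Metadata(
        metadata_uri=UriResource(raw_uri.get("uri", ""), raw_uri.get("mimeType", "")) if raw_uri else None,
        name=body.get("name", ""),
        description=body.get("description", ""),
        image=MediaUri(
            raw_image.get("uri", ""),
            raw_image.get("uriOriginal", ""),
            raw_image.get("mimeType", ""),
        ) if raw_image else None,
    )
```

Treating `metadataUri` and `image` as optional matters in practice, because some responses carry `null` or empty values for them.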

Token URIs

The URIs obtained from the smart contracts (commonly from tokenURI or URI methods) are normalized into the following schema:

// A `uri` value in this message is not normalized and is returned as is like it was
// defined in the contract.
//
// Clients should follow the `mime_type` in order to properly handle a resource that
// this message represents.
//
// The `mime_type` holds the mime-type of an enclosed content, and not of an envelope
// type. Such that, if the `uri` represents a `base64` resource (either on-chain or
// remotely stored on IPFS or a server) the `mime_type` value will hold the mime-type
// of the content that is encoded by `base64`. For example, if a `uri` is a `base64`
// of a JSON content, the `mime_type` will be `application/json`.
message UriResource {
    // Original URI (may lead to HTTP(s), IPFS, base64 encoded data, etc).
    string uri = 1;

    // Detected MIME-type of the resource.
    string mime_type = 2;
}

Image/Media URIs

The URIs obtained from the metadata that contain links to images are normalized into the following schema:

// MediaUri holds the details about NFT media (image, animation etc), as well as
// provides a cached version if applicable.
message MediaUri {
    // A URI to the cached media resource (empty for unsupported media types).
    string uri = 1;

    // An original URI (may lead to HTTP(s) or IPFS). If this URI is a data URI
    // or an encoded SVG we will intentionally omit this field in the list
    // responses in order to prevent large payload sizes. In such cases, a `uri`
    // should be used instead which provides a cached version.
    //
    // This field is always populated in the full token metadata response.
    string uri_original = 2;

    // A MIME-type of the media resource.
    string mime_type = 3;
}

The uri field points to a cached version of the image when one is available (it is empty for unsupported media types), while uri_original provides the original URI as parsed from the metadata.

Note: If uri_original contains an on-chain SVG, it will be trimmed in some responses where NFTs are returned as a list to reduce the response size.
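A client typically wants a single URL to put in front of the user. The sketch below prefers the cached `uri` and falls back to the original one, rewriting `ipfs://` URIs to an HTTP gateway so a browser can load them. The `display_uri` helper and the default gateway choice are assumptions for illustration; any gateway could be substituted.

```python
from typing import Optional


def display_uri(image: Optional[dict], ipfs_gateway: str = "https://ipfs.io/ipfs/") -> Optional[str]:
    """Pick a loadable URL from an image object in an API response."""
    if not image:
        return None
    cached = image.get("uri") or ""
    if cached:
        return cached  # cached copy served from the CDN
    original = image.get("uriOriginal") or ""
    if original.startswith("ipfs://"):
        # Browsers generally cannot load ipfs:// directly, so go through a gateway.
        return ipfs_gateway + original[len("ipfs://"):]
    return original or None
```

Inline payloads such as base64-encoded SVGs are returned unchanged by this helper and still need to be decoded according to `mimeType`.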

URI normalization

Sometimes, retrieved URIs (from tokenURI or uri) may be pinned to a public IPFS gateway or represent a malformed URL (e.g. https://https://somedomain.com., yes, we've seen that too and much more).

All such URIs are automatically cleaned and normalized by the Mnemonic crawler, and served in the correct format in the API.

Particularly, all pinned URIs that point to public IPFS gateways such as https://ipfs.io/ipfs/QmeSjSinHpPnmXmspMjwiXyN6zS4E9zccariGR3jxcaWtq/131 will always be provided as ipfs://QmeSjSinHpPnmXmspMjwiXyN6zS4E9zccariGR3jxcaWtq/131 in the API responses.

All HTTP URLs are also cleaned and normalized to a canonical form that is uniformly supported by all HTTP clients and libraries.
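The two cleanups named above (gateway-pinned IPFS links and a doubled scheme) can be approximated with a couple of regular expressions. This is a simplified sketch under stated assumptions, not the crawler's actual rule set: `normalize_uri` is a hypothetical name, and any `http(s)://<host>/ipfs/<path>` URL is treated as a path-style gateway link.

```python
import re

# Path-style gateway URL, e.g. https://ipfs.io/ipfs/<cid>/<path>
_GATEWAY_PATH = re.compile(r"^https?://[^/]+/ipfs/(?P<path>.+)$", re.IGNORECASE)
# One or more scheme prefixes that are immediately followed by another scheme
_DOUBLED_SCHEME = re.compile(r"^(https?://)+(?=https?://)", re.IGNORECASE)


def normalize_uri(raw: str) -> str:
    """Collapse repeated schemes and rewrite gateway links to ipfs:// form."""
    uri = raw.strip()
    uri = _DOUBLED_SCHEME.sub("", uri)
    match = _GATEWAY_PATH.match(uri)
    if match:
        # The CID and path are kept verbatim because they are case-sensitive.
        return "ipfs://" + match.group("path")
    return uri


print(normalize_uri("https://ipfs.io/ipfs/QmeSjSinHpPnmXmspMjwiXyN6zS4E9zccariGR3jxcaWtq/131"))
# ipfs://QmeSjSinHpPnmXmspMjwiXyN6zS4E9zccariGR3jxcaWtq/131
```

Subdomain-style gateways and other malformations are out of scope for this sketch.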

Examples

Below are a few examples of the normalized Metadata provided by the Mnemonic API.

Example (ArtBlocks)

{
  "metadata": {
      "metadataUri": {
          "uri": "https://api.artblocks.io/token/235000341",
          "mimeType": "application/json"
      },
      "name": "Maps for grief #341",
      "description": "Maps for Grief explores the hyper-connected existence, filled with individuals who share common intents, ideals and feelings, but tend to live their sentiments in isolation. Its compositions oppose areas of fluid movement and immobility, revealing the structure of the piece itself: two interacting force fields that dictate the direction, as well as the strength – or absence – of movement.On the keyboard: Press A to have the lines gently animate. Use the 1/2/3 keys to inspect the fields that make up each iteration.",
      "image": {
          "uri": "https://ethereum.cdn.mnemonichq.com/0xa7d8d9ef8d8ce8992df33d8b8cf4aebabd5bd270/235000341/3eec3e3a57f2fef90c49bfa926da5b813f5674f7437d96f791dfbf7aff6ee8a2.png",
          "uriOriginal": "https://media-proxy.artblocks.io/0xa7d8d9ef8d8ce8992df33d8b8cf4aebabd5bd270/235000341",
          "mimeType": "image/png"
      }
  }
}

Example (BAYC)

Original IPFS URIs are provided as defined by the token metadata.

{
  "metadata": {
    "metadataUri": {
        "uri": "ipfs://QmeSjSinHpPnmXmspMjwiXyN6zS4E9zccariGR3jxcaWtq/2087",
        "mimeType": "application/json"
    },
    "name": "",
    "description": "",
    "image": {
        "uri": "https://ethereum.cdn.mnemonichq.com/0xbc4ca0eda7647a8ab7c2061c2e118a18a936f13d/2087/e7e47109f0dc797166f7615a3ff6e002092f760c98bfb499738a376fb9e91552.png",
        "uriOriginal": "ipfs://QmYhUX5fjigN2HgGmq3AcEtzVjSX3iR4EjKEoWCMTkwb6g",
        "mimeType": "image/png"
    }
  }
}

Note: name and description are returned empty because these fields are not defined in the original token metadata.

Example (CryptoPunks)

In many cases the metadata, including images, is stored on-chain. CryptoPunks images, for example, are encoded as base64 strings.

Clients should pay attention to the mimeType property in the response. Mnemonic detects the type of encoded data (on-chain or off-chain) to make it easier for the client to decode and display this data.

{
  "metadata": {
      "metadataUri": null,
      "name": "321",
      "description": "",
      "image": {
          "uri": "",
          "uriOriginal": "ZGF0YTppbWFnZS9zdmcreG1sO3V0ZjgsPHN2ZyB4bWxucz0iaHR0cDovL3d3dy5...<trimmed>",
          "mimeType": "image/svg+xml;base64"
      }
  }
}

In the example above, the image is a base64-encoded SVG, as indicated by the MIME type image/svg+xml;base64. In such cases, a client simply needs to decode the base64 string obtained from uriOriginal and render the SVG image.
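The decoding step can be sketched as follows. Decoding the visible prefix of the trimmed `uriOriginal` above yields text that begins with `data:image/svg+xml;utf8,<svg`, so this sketch also strips a data URI header when one is present. The helper name `decode_inline_svg` is hypothetical, and the handling is limited to the SVG case shown in the example.

```python
import base64
from typing import Optional


def decode_inline_svg(image: dict) -> Optional[str]:
    """Return SVG markup from a base64-encoded uriOriginal, or None if not applicable."""
    content_type, _, encoding = (image.get("mimeType") or "").partition(";")
    if content_type.strip() != "image/svg+xml" or encoding.strip().lower() != "base64":
        return None
    text = base64.b64decode(image.get("uriOriginal") or "").decode("utf-8")
    if text.startswith("data:"):
        # Drop the "data:image/svg+xml;utf8," style header and keep the markup.
        _, _, text = text.partition(",")
    return text
```

The returned string can then be rendered by whatever SVG-capable view the client uses; sanitizing untrusted SVG before rendering is advisable.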

Media caching

Ensuring NFT media loads quickly and reliably within your product can be tricky because most NFTs are stored and served by IPFS or third-party servers. To eliminate slow load times and potential time-outs or other media accessibility issues, Mnemonic caches NFT media and serves it through a CDN close to your clients so you can provide the best possible experience to your users.

Our endpoints that return NFT media in the response return both cached media URIs (when available) and original URIs.

Supported media types: .png, .gif, .webp, .jpg, .svg.
Media sizing: returned cached media has a default maximum dimension of 1600px.

Metadata freshness

Mnemonic's crawler uses various heuristics to identify when metadata needs to be re-indexed. The most common use case is when metadata is revealed only after the entire collection has been minted, or in a game where an object has obtained some new traits or properties.

Mnemonic constantly crawls metadata changes to deliver the most up-to-date results.