Metadata – Discoverability and Liability

I am going to assume you have a basic idea of what metadata is. For our purposes, we can borrow from Wikipedia and define it as “the information used to search and locate an object such as title, author, subjects, keywords, publisher.” Although we typically associate metadata with the digital online world, metadata actually predates computers and the Internet. Do you remember all that information on the little 3x5 cards in the library’s card catalog? That was metadata.

If you don't remember card catalogues, first, go ask your parents, and second, stop making us feel old.

In ye olden days, that information helped us find the books we were looking for in the library, after we’d walked five miles in the snow, uphill both ways, to get there. Today, metadata is even more crucial for finding books. For authors, that means metadata is critical to BUYERS finding your books online. Metadata is part of what search engines, such as those at Google or Amazon, use to locate results; crucially, it is the part that publishers can control.

A Story Related at RWA

During a workshop at the recent Romance Writers of America meeting in Atlanta, I was reminded of the importance of metadata to a book’s success by a story told by literary agent Kristin Nelson. One of her authors had a three-book series that was doing OK on Amazon, but not as well as they had expected. Even more strangely, the second book in the series was selling more copies than the first.

When she investigated, she discovered that the second book had more tags and that the books within the series did not have matching tags.

This series had been traditionally published. So, she and the author did not have the ability to change the tags themselves. They had to convince the publisher to fix them – and that apparently took quite some time, even with the agent providing the publisher with the exact tags they needed to put on Amazon. Once the tags had been fixed, the series jumped 200+ spots to #1, 2 and 3 on the relevant bestseller list.
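If you control your own tags, the kind of mismatch Nelson found is easy to check for. Here is a minimal sketch in Python, assuming you have each book’s tags typed into a simple dictionary. The titles, the tags, and the `missing_tags` helper are all made up for illustration; this is not how Amazon stores or exposes tags.

```python
# Hypothetical tag data; the titles and tags are invented for illustration.
series_tags = {
    "Book 1": {"romance", "historical romance"},
    "Book 2": {"romance", "historical romance", "regency", "series"},
    "Book 3": {"romance", "regency"},
}

def missing_tags(tags_by_book):
    """For each book, list the tags used elsewhere in the series but absent from it."""
    all_tags = set().union(*tags_by_book.values())
    return {
        book: sorted(all_tags - tags)
        for book, tags in tags_by_book.items()
        if all_tags - tags  # skip books that already carry every tag
    }

for book, missing in missing_tags(series_tags).items():
    print(f"{book} is missing: {', '.join(missing)}")
```

A book that already carries every tag in the series is left out of the report, so an empty result means the series is consistent.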

The importance of metadata in selling more books is hopefully something you are already familiar with if you self-publish. If you publish traditionally, then the story related by Nelson should convey to you the importance of not only metadata but also staying on top of what your publisher is (or is not) doing with your book.

Nelson took pains not to make it sound as if the publishers did not care about metadata. Even while she was struggling to get her author’s publisher to correct the metadata, the people at the publishing house acknowledged that fixing the metadata would help sales. The problem was two-fold. First, publishers simply aren’t structured for digital. They don’t have the tech people they need. Their personnel and corporate infrastructure from slush pile to publication is not shaped for digital. They are, and have been for a long time, structured for print. (See this recent post by Passive Guy for more confirmation of how the big publishers aren’t built for digital.)

Second, the numbers on metadata are pretty staggering once you examine them. If a publisher puts out 2,500 books a year, and each of them has 10 different files of metadata (one for each of the different online retailers), that’s 25,000 different profiles that have to be updated. And the numbers only grow when you consider they would need to do it for each book they have for sale, not just the ones published in a particular year. I seem to remember her characterizing the reluctance as an “if we do it for you, we’ll have to do it for everyone” attitude, but I don’t want to quote her.
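To see how quickly that grows, here is the same arithmetic as a short Python sketch. The 2,500 books and 10 retailers come from the example above; the ten-year backlist is my own assumption, chosen only to show the scale.

```python
books_per_year = 2_500     # figure from the example above
retailers_per_book = 10    # one metadata file per online retailer, per the example

def metadata_profiles(years_of_catalog):
    """Profiles to maintain if every title from that many years is still on sale."""
    return books_per_year * retailers_per_book * years_of_catalog

print(metadata_profiles(1))   # a single year of releases
print(metadata_profiles(10))  # hypothetical ten-year backlist at the same pace
```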

Tips for Metadata

Metadata? How do they work?

It is not my goal to make this post your one-stop source for metadata advice. It’s not something I’m an expert in, but I would feel remiss if I did not pass on some of the advice I heard at RWA and came across online. The gist of that advice is the importance of selecting keywords. Instead of going with what you think people search for, take the time to research what real users actually search for. You can, for example, search the frequency of Google searches for particular words and phrases here. Unfortunately, it looks like Google is phasing that tool out very soon. Those wanting to search frequencies in the future will need to create an AdWords account and use the new Keyword Planner tool. With Amazon, it is more of a trial-and-error approach. You type words into the appropriate search box (e.g. in the Kindle Store) and see what comes up. For both Google and Amazon, the auto-complete feature of their search boxes can provide you with useful information about the frequency of certain words or phrases.

Liability – Trademark Infringement Through Metadata

So, how does law come into this? For that, we need to know a little bit about trademarks.

A trademark is an identifier of the source of goods or services – in other words, a trademark tells the consumer who made something. An example would be Coca-Cola. A trademark can be infringed in one of two ways: by causing confusion or by causing dilution. Dilution, at least under federal law, only applies if the trademark is famous. Confusion is by far the more common theory used to find infringement. Trademark infringement can result in civil liability (and even criminal liability in cases of counterfeiting) and the granting of injunctive relief.

Because a brutal killing machine can be trusted on his choice of cola…

In a confusion analysis, a court asks whether the use of a similar trademark by the alleged infringer is likely to confuse a consumer as to the source of the goods. If I made my own soda and then put Coca-Cola or Cola-Coca on the label, white cursive script on a red background, then a court would probably conclude that a consumer might be confused as to the source of my cola. In other words, the court would probably say that someone would think my cola was actually made (or authorized) by the real Coca-Cola. That would mean I had infringed Coca-Cola’s trademark, among other things. The actual analysis is a little more developed than that, with a list of factors to be considered and so on, but that is more than you need to know for the purposes of this post.

How does this relate to metadata?

In the middle of the last decade, many search engines relied on metadata hidden in the code of websites in order to find relevant links for searchers. This in turn led many websites to put other companies’ trademarks into their own metadata. This practice resulted in web searches for specific trademarks returning not only the trademark holder’s site, but also competitors’ sites that had buried the trademark in their metadata.

And that led to trademark lawsuits. These cases turned on two complicated legal issues: 1.) whether putting the trademark into metadata unseen by the consumer was a “use in commerce” for the purpose of the trademark infringement analysis and 2.) whether initial interest confusion could serve as a basis for liability. Don’t worry about those details.

Before the law could become settled, everyone moved away from that type of metadata. Search engines stopped relying on it because it was too easy to abuse and thus not all that useful. Once search engines stopped caring, we stopped seeing the litigation. That means that we never really got a clear answer on whether the use of a trademark in unseen metadata infringed the rights in that trademark.

Instead, the litigation moved on to what could be gamed – purchased adwords. From the late 2000s to the present, we’ve seen a number of cases raising the same complicated issues in the context of adwords sold by search engines, primarily Google. The doctrine is still unsettled.

Regardless of that uncertainty, or even the probability that results are trending strongly away from liability, the key for you, the indie publisher in control of your own Amazon metadata, is that lawsuits get litigated! Remember how much we want to avoid that, regardless of the end result.

Further, litigants have shown themselves to be pretty irrational in these suits, spending huge amounts of money over metadata uses that have little to no impact on their actual bottom lines. Eric Goldman, one of the experts in this type of litigation (if not THE expert), has said the following in the context of the 1-800 Contacts litigation:

…economically irrational and socially wasteful litigation, such as plaintiffs who spend over a million dollars in legal fees on a problem that, at most, is worth tens of thousands of dollars…


With this list in mind, you can see why I hate the 1-800 Contacts v. lawsuit. 1-800 Contact has spent enormous amounts on legal fees—at least $650k as of 2010–pursuing for competitive keyword ads that had generated $20 in profit for (no, that’s not a typo) and, at maximum, a few tens of thousands of dollars in revenue for affiliates.


After 6 years in court, the case isn’t over yet. This week, the Tenth Circuit affirmed most of the district court’s opinion and emphatically rejected most of 1-800 Contacts’ lawsuit against for the competitive keyword advertising it and its affiliates did. However, a small issue got remanded for a jury trial, so the parties will get the pleasure of wasting many tens of thousands of dollars more to conduct the jury trial unless they can finally find a way to settle.

The seeming irrationality of trademark owners’ pursuit of these types of claims is probably a result of the accepted wisdom that failure to protect trademark rights can result in losing them, but the reason for the litigiousness is irrelevant for our purposes. What is important is that the risk of getting sued is probably greater than we would like.

What does this mean in its simplest terms? Think twice about using metadata that is someone else’s trademark or that could cause confusion, and then don’t do it.

Note also that this sort of misuse of metadata is something that the relevant intermediaries – e.g. Amazon – frown upon. Amazon’s Metadata Guidelines specifically prohibit the mention of another’s trademark or the unauthorized use of another author’s name or book title. That means even if no one chooses to sue you, Amazon could pull your book. Remember the modified second question for self-publishers.
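If you want a mechanical reminder before you hit publish, something like the following Python sketch can flag keywords that contain a term you have decided to stay away from. Everything in it is an assumption for illustration: the `PROTECTED_TERMS` list is a placeholder you would have to build yourself, the `flag_keywords` helper is hypothetical, and simple text matching says nothing about whether a court (or Amazon) would actually find a problem.

```python
import re

# Hypothetical blocklist: marks, author names, and titles that are not yours to use.
# These entries are placeholders, not a real or complete list.
PROTECTED_TERMS = ["coca-cola", "some famous author", "some bestselling title"]

def flag_keywords(keywords, protected_terms=PROTECTED_TERMS):
    """Return the keywords that contain a protected term, ignoring case and punctuation."""
    def normalize(text):
        # Lowercase and collapse anything that is not a letter or digit into one space.
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

    protected = [normalize(term) for term in protected_terms]
    flagged = []
    for keyword in keywords:
        cleaned = normalize(keyword)
        if any(term in cleaned for term in protected):
            flagged.append(keyword)
    return flagged

my_keywords = ["regency romance", "Coca Cola collectibles", "books like Some Famous Author"]
print(flag_keywords(my_keywords))
```

Because this is plain substring matching, it will sometimes flag innocent phrases and miss creative misspellings. Treat a hit as a prompt to think twice, not as legal advice.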

Take away: Metadata is critically important, but don’t try to get cute and game the system with someone else’s trademark, name, or title.
