Semantics (and Metadata) at the New York Times

October 17, 2009 glennas

***** Nov 10 2009 Update:
I have uploaded a summary doc of the NY Times presentation. Please click the following link to access: Semantics at The New York Times – notes – SemTech 2009
*****

Here is yet another great presentation from the SemTech 2009 conference, held this past June in San Jose: Semantics at the New York Times.

Here is a slide presentation that the New York Times delivered at a different conference, but it’s very similar to the one delivered at SemTech.

The (Long) History of Metadata at the New York Times

The presentation starts out by exploring the history of metadata at the New York Times, beginning with their Morgue archive, which was created at the newspaper’s inception in, if you can believe it, 1851. The so-called Morgue was not a collection of corpses (thank goodness), but rather a collection of newspaper clippings and photos.

No subject was too big or small to be indexed in the Morgue. As the Times VP of Digital Production Rob Larson states in the presentation, in 1907 the Times’ Managing Editor Carr Van Anda invested in the Morgue to add staff and rigor of organization to the files, and a Tagging system grew up around this effort.

At its zenith a few decades ago, the Morgue had a staff of 24 people, who created 600 new clip folders per week and cut up 36 editions of the final New York City edition of the Times, as well as copies of other prominent newspapers.

Within its main operation on the third floor, there were more than 4,000 cabinet drawers of newspaper clippings, covering 1,126,000 named individuals (including animals, etc.), 65,000 subject headings, 300,000 ships and planes, 500,000 places, and 500,000 corporations. (Wow!)

The Morgue is only one form of tagging system used at the Times – others include the New York Times Index and the NYTimes.com website.

So what is the Tagging workflow at the New York Times?

Here are a few slides from the presentation. The first slide depicts the tagging workflow at the New York Times, and which roles apply metadata at which step in the workflow.

Tagging at the NY Times

This visual, however, oversimplifies how metadata is actually applied in the editorial workflow. Here’s a very-hard-to-read workflow diagram of the stages at which metadata is applied at the NY Times, which suggests the overall complexity of the end-to-end workflow for both the Print and Online channels.

Tagging Workflow at the Times

Why Tag?

Another core visual, shown below, summarizes the motivation for tagging, that is, the various use cases for metadata-tagged content at the Times.

Tagging - Use Cases

Rob Larson specifically addresses the importance of metadata for generating NY Times Topic Pages, four examples of which are shown below:

Topic Pages - NY Times
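
To make the link between tagging and Topic Pages a bit more concrete, here is a minimal sketch of how tagged articles could be grouped into per-topic listings. This is not the Times' actual system, which the presentation does not detail; the `Article` type, the sample headlines and tags, and the `build_topic_pages` helper are all hypothetical names made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Article:
    # Hypothetical minimal article record: a headline plus its metadata tags.
    headline: str
    tags: List[str] = field(default_factory=list)


def build_topic_pages(articles: List[Article]) -> Dict[str, List[str]]:
    """Group headlines under each tag, giving one listing per topic page."""
    pages = defaultdict(list)
    for article in articles:
        for tag in article.tags:
            pages[tag].append(article.headline)
    return dict(pages)


# Made-up sample data, purely for illustration.
articles = [
    Article("Example story A", ["Semantic Web", "Newspapers"]),
    Article("Example story B", ["Newspapers"]),
]
topic_pages = build_topic_pages(articles)
```

The point of the sketch is simply that once every article carries consistent tags, a topic page is little more than a lookup by tag, which is presumably why consistent tagging matters so much for this use case.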

The Future

Next the presenters address the future of metadata (and now the talk turns more to “semantics”) at the NY Times.

What near-term plans does the Times have for evolving their metadata management practice? See the slide below:

Metadata Opportunities

Next up, the presenters discuss the New York Times’ various Open Data initiatives, and the APIs the Times is making available to the public to access and build applications on top of its data.

New York Times and Linked Data

Finally, the New York Times announced at SemTech the next phase of their Open Data strategy, which is to prepare their Corpus to be exposed to the Linked Data Cloud.

Interesting stuff.

glenn

Categories: Future of Newspapers, Semantic Web Tags: Future of Newspapers, Metadata, New York Times, News Media, Open Linked Data, Semantic Web
