Context first

A unified field theory of publishing

(This post, which is significantly longer than most others here, covers my prepared remarks for an October 21 presentation at the Internet Archive’s “Books in Browsers” conference.  I expect we’ll be able to also share the accompanying presentation materials in the not-too-distant future.)

In August, around the time that the Internet Archive began to assemble the program for this conference, David Wilk invited several people, including Bob Stein and me, to join him in Stamford for one in a series of discussions he calls “Publish the Future”.

After the session ended, Bob and I somehow wound up comparing notes about our proposed topics for “Books in Browsers”.  I was proud of my idea, and I particularly liked the title, “A unified field theory of publishing.”

Given Bob’s experience and depth of thought in our part of the universe, I was a little nervous showing the proposal to him, but he had shown me his, so…  He looked mine over and in full seriousness asked, “Is that REALLY the title of your talk?”  And I said, “Yes, it is.”

And then Bob told me that he gave a talk with exactly the same title two years ago.

Fortunately, and not surprisingly, Bob’s earlier talk took a different angle – broadly, how publishing roles might evolve and blend in a networked era.  Today, I’m talking about something else – the damage done by what I call the “container model of publishing” – but it’s also fitting that I could borrow the title from Bob, as I’ll start to explain in a moment.

Before I do that, though, my idea in a nutshell is this: book, magazine and newspaper publishing is unduly governed by the physical containers we have used for centuries to transmit information.  Those containers define content in two dimensions, necessarily ignoring that which cannot or does not fit.

Worse, the process of filling the container strips out context – the critical admixture of tagged content, research, footnoted links, sources, audio and video background, even good old title-level metadata – that is a luxury in the physical world, but a critical asset in digital ones.  In our evolving, networked world – the world of “books in browsers” – we are no longer selling content, or at least not content alone.  We compete on context.

I propose today that the current workflow hierarchy – container first, limiting content and context – is already outdated.  To compete digitally, we must start with context and preserve its connection to content.

We need to think about containers as an option, not the starting point.  Further, we must start to open up access, making it possible for readers to discover and consume our content within and across digital realms.

Without a shift in mindset, we are vulnerable to a range of current and future disruptive entrants.  Containers limit how we think about our audiences.  In stripping context, they also limit how audiences find our content.

Here, scale is not our friend.  It may well be the enemy.  As Clay Christensen first outlined in 1997, disruptive technologies don’t look or feel like what we typically value.  Often enough, they are cheaper, simpler, smaller and more convenient than their traditional analogues.

Already, smaller, more nimble digital upstarts have reversed the paradigm.  They start with context, vital to digital discoverability and trial, and use it to strengthen content.  Many startups forgo containers, or they create them only as a rendering of personal (consumer) preference.

Think Craigslist.  Think Monster.  Think Cookstr, a born-digital food site that started with and continues to evolve its taxonomy.  Context first.

As barriers to entry have fallen, I’ve started to think more about how traditional book, magazine and newspaper publishers can survive in a digital era.  There are both new and non-traditional established entrants across most publishing segments.  Their successes have pushed traditional publishers to look at ways to change business models and organize around customers.

It is time to see our publishing brethren – newspapers and magazines – as part of a disrupted continuum that affects us all.  Digital makes convergence not only possible; digital has made convergence inevitable.  Marketers have become publishers; publishers are marketing arms; new entrants are a bit of both.  Customers have become alternately competitors, partners and suppliers.

A few minutes ago I mentioned that Bob Stein and I share the title of this talk.  That overlap is a coincidence that reminded me of a passage from Salman Rushdie’s 1990 book, Haroun and the Sea of Stories.

In the book, Haroun sets off to find stories for his father, who has lost his ability to tell tales. Along the way, Haroun comes across Iff, the Water Genie, who at first does not treat Haroun kindly.  But at a low point, the Water Genie relents and starts to tell Haroun …

“… about the Ocean of the Streams of Story, and even though he was full of a sense of hopelessness and failure, the magic of the Ocean began to have an effect on Haroun.  He looked into the water and saw that it was made up of a thousand thousand thousand and one different currents, each one a different color, weaving in and out of one another like a liquid tapestry of breathtaking complexity …”

I’ll stop there.  We’ll return to this story in a bit, but for the moment I’d like to use it as a jumping-off point, a call for us to:

Imagine a world in which content authoring and editing tools are cheap, or even free.

Imagine a world in which storage is plentiful, even virtual.

And imagine a world in which content can be disseminated in a range of formats, at the figurative or literal push of a button.

That world exists today, with literally dozens of credible, widely accessible tools and resources.  These authoring, repository and distribution tools and resources make it possible for anyone to create, manage and disseminate digital as well as physical content.

The thing is, while that world is already here, it is far from evenly distributed.

A couple of years ago, in a discussion with Laura Dawson and Mike Shatzkin, I sketched out a version of this somewhat basic diagram of the kinds of content best served by the use of XML.

The typical winners are in the upper right – genres, like cooking, that have many components, or “chunks”, and a higher probability of being recombined or reused.

Our problem is, we’re not the only ones looking at these markets.

While publishers think of agile workflows as an opportunity to drive down the cost of making content for containers, a newer breed of “born-digital” competitors has started with context.  These new entrants are developing taxonomies and tools so that they can invade the same niches we thought we were making more efficient.

The challenge is not just being digital; it’s being demonstrably relevant to the audiences who now turn first to digital to find content.

New entrants – our real competition – start with the customer.  They develop contextual frameworks that help them differentiate both readers and themselves.  The new guys like the new tools because they are cheap, scalable and open-source.  In fact, they are already exploiting tools that many traditional publishers lament are “just too hard to learn”.

How did we get here?  There’s a reason.

In their physical forms, newspapers, magazines and books establish the boundaries of both content and context.  Historically focused on containers, we have become stuck using them as the primary source for digital content.

Only after we fill the physical container do we turn our attention to rebuilding the digital roots of content: the context, including tags, links, research and unpublished material, that can get lost on the cutting-room floor.

Most of that context never makes it back.  We have taken to using things like title-level metadata, some search engine optimization and occasionally effective use of syndication as proxies for something contextually rich.

Competing as we are against the “born-digital”, that’s not nearly enough.

Further, we treat readers as if their needs can be defined by containers.  But in a digital world, search takes place before physical sampling, much more often than the reverse.  Readers may at times look for a specific product, but more often they search for an answer, a solution, a spark that turns into an interest and perhaps a purchase.

Publishers are in the business of linking content to markets, but we’re hamstrung at search because we’ve made context the last thing we think about.

When content scarcity was the norm, we could live with a minimum of context.  In a limited market, our editors became skilled in making decisions about what would be published.  Now, in an era of abundance, editors have inherited a new and fundamentally different role: figuring out how “what is published” will be discovered.

To serve that new role, we must reverse our publishing paradigm.  We need to start with context and develop and maintain rich, linked, digital content.

We also need to use the tools we have (as well as ones we have yet to develop) to make containers an output of digital workflows, not the source of content in those workflows.  This is a fundamental change in our approach, but it is the only way that I see to compete in a digital-first, content-abundant universe.
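To make "containers as an output" concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any publisher's actual system: the `ContentChunk` record, its fields and the two render functions are all hypothetical names invented for this example. The point it tries to show is that the tagged record is the source, and each container is just one rendering of it.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class ContentChunk:
    """One reusable piece of content plus the context that travels with it.

    Hypothetical structure, for illustration only.
    """
    title: str
    body: str
    tags: list[str] = field(default_factory=list)     # contextual tags
    sources: list[str] = field(default_factory=list)  # links, research, notes


def render_print(chunk: ContentChunk) -> str:
    """A print-style container: title and body only; the context does not fit."""
    return f"{chunk.title.upper()}\n\n{chunk.body}"


def render_web(chunk: ContentChunk) -> str:
    """A web container: the same content, with its context still attached."""
    tags = ", ".join(escape(t) for t in chunk.tags)
    links = "".join(
        f'<li><a href="{escape(s)}">{escape(s)}</a></li>' for s in chunk.sources
    )
    return (
        f'<article data-tags="{tags}">'
        f"<h1>{escape(chunk.title)}</h1>"
        f"<p>{escape(chunk.body)}</p>"
        f"<ul>{links}</ul>"
        "</article>"
    )


# Hypothetical sample data.
chunk = ContentChunk(
    title="Roast chicken",
    body="Salt the bird a day ahead.",
    tags=["poultry", "weeknight"],
    sources=["https://example.com/dry-brining"],
)

print(render_print(chunk))
print(render_web(chunk))
```

Because both containers are derived from the same record, the print rendering can drop the tags and sources without destroying them. Starting from the print file instead would mean rebuilding that context after the fact.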

And I don’t think that this change in mindset (or workflow) will come easily.

Over time, we have adopted a series of mental models that constrain our ability to change.  The long history of using physical containers to distribute content, for example, has led us to conflate “format” with “brand”.

Perhaps there was a time when the physical nature of content products – their look and feel – dominated.  But in a digital era, I think that its time has passed.

In a similar way, we often speak of digital content as a derived or secondary use.  The recent debate about e-book rights underscores how deeply this bias runs.  Who “owns” e-book rights is a different topic, but the Open Road and Wylie dust-ups were telling for the question that was not asked: who owns the context that drives discoverability, use and value in a digital realm?

In a digital era, context supports discoverability, use and re-use.  Investing in context is now a requirement.

Unfortunately, our product focus and an obsession with scale lead us to worry more about finding ways to reduce costs. We think of making the physical object incrementally better, optimizing the creation, production and delivery of content in a single package.

Along the way, we miss opportunities to create agile, discoverable and accessible content.

I call this situation “container myopia”, paying homage to Ted Levitt’s 1960 article, “Marketing myopia”.  In the article, Levitt called on marketers to shift from a product-centered to a customer-centered paradigm.  He famously showed how railroad companies failed to see that they were in the transportation business, much as publishers have struggled to see that they are in the content solutions business.

In a digital realm, true content solutions are increasingly built with open APIs, something containers are pretty bad at.  APIs – application programming interfaces – provide users with a roadmap that lets them customize their content consumption.

The physical forms of books, magazines and newspapers have analog forms of APIs.  We’ve all figured out how to access the information contained in these physical products.  But, the physical form itself does not always make for a good API, something that Craigslist, the Huffington Post, Cookstr and others have capitalized on.

Open up your API, I contend, or someone else will.

Many current audiences (and all future ones) live in an open and accessible environment.  They expect to be able to look under the hood, mix and match chunks of content and create, seamlessly, something of their own.  Failure to meet those needs will result in obscurity, at best.

To illustrate that point, I want to bring you to perhaps the most hierarchical, inaccessible, closed environment I know of: an American public high school.  In particular, I’d like to take you to Columbia High School in Maplewood, New Jersey, where our youngest son, Charlie, is now a junior.  The school opened in 1927, and it has not changed much since then.

Last summer, Charlie learned (happily) that he had earned a 5 on the AP Art History exam.  This made him eligible to serve as a sort of teaching assistant for this year’s Art History class.  All he needed to do was align his free period with the scheduled slot for Art History.

I don’t know how many of you have tried to parse a high-school scheduling API.  It seems to rely on green-screen devices, stacks of forms and a queuing process that means you won’t have your new schedule in hand until two weeks after the start of the school year.

On a Friday in July, Charlie came home to find his junior-year schedule in the mail.  His free period did NOT align. Charlie has seen his brother and sister fight the powers that be at Columbia High School, at times unsuccessfully, and he decided to pursue a different course.

Lacking access to the master schedule, he went to a free resource – Facebook – posted his schedule there and asked anyone who attended Columbia High School to do the same.

By Sunday morning, he had gathered enough data to compile his own master schedule.  With this information in hand, he rearranged his classes, filled out a home-made “change form” and sent it to the high school on Monday morning.  “Please give me this schedule”, it said.  Problem solved.

Stories like this one, as well as everything Kirk Biglione says about DRM, have led me to see piracy as the consequence of a bad API.  Sixteen-year-olds expect access, or they invent it.  The future of content involves giving readers access to the rules, tools and opportunities of contextually rich content, so that they can engage with it on their own terms.

And whether they say it just like this or not, readers WANT good APIs.

Content is no longer just a product.  It’s part of a value chain that solves readers’ problems.

Readers expect publishers to point them to the outcomes or answers they want, where and when they want them.  We’re interested in content solutions that don’t waste our time, a precious commodity for all of us.

Perhaps most daunting: readers expect that their content solutions will improve over time.  They don’t care that much (or at all) about how it happens.

Companies that are good at aggregating solutions will reduce the time and hassle involved in finding and buying something.  Those firms have a leg up on their competitors.

Drawn from the prescient “lean consumption” model that James Womack and Daniel Jones debuted half a decade ago, these ideas are evident in aggregators like Amazon.  They’re embodied in services like Kobo and Kindle.  They’re not just products; they’re solutions.

So, if containers are now an option, and content must be made accessible, what is the role of context?

First, let’s establish a context of our own: Freed from physical constraints, we no longer have to write to length.  We can link; we can expand; we can annotate.

As low- or no-cost authoring, repository and distribution tools and resources become freely available, it is axiomatic that ours has become and will remain an era of content abundance.

Simply: content abundance is the precursor to the development (and maintenance) of context.

When there was only the Gutenberg Bible, we didn’t need Dewey.  When booksellers were smaller and largely independent, we didn’t have much need for BISAC codes.  And before online sales made almost every book in print evident and available, ONIX was an unattended luxury.

Digital abundance is pushing us to create much more than title-level metadata.  To manage abundance, we can (and do) use blunt instruments, like verticals, or somewhat more elegant tools, like search engines.

But when it comes to discovery, access and utility, nothing substitutes for authorial and editorial judgment, as evidenced in the structural and contextual tags applied to our content.

Context can’t be just a preference or an afterthought any more.  Early and deep tagging is a search reality.  In structural terms, our content fits search conventions, or it will not be referenced.

And in contextual terms, our content needs to be deeply and consistently tagged, or it will face an increasingly tough time being found.
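A small Python sketch may help show why untagged content is hard to find. It assumes a hypothetical three-title catalog and two made-up helper functions, `build_tag_index` and `find`; real search engines are far more sophisticated, but the basic dependency on tags is similar.

```python
from collections import defaultdict

# Hypothetical catalog: each item carries a title plus deeper contextual tags.
catalog = [
    {"title": "Weeknight Suppers", "tags": {"cooking", "chicken", "30-minute"}},
    {"title": "The Bread Book", "tags": {"cooking", "baking", "sourdough"}},
    {"title": "Untagged Classic", "tags": set()},
]


def build_tag_index(items):
    """Map each tag to the set of titles that carry it (an inverted index)."""
    index = defaultdict(set)
    for item in items:
        for tag in item["tags"]:
            index[tag].add(item["title"])
    return index


def find(index, *tags):
    """Return the titles that carry all of the requested tags, sorted."""
    if not tags:
        return []
    matches = set.intersection(*(index.get(tag, set()) for tag in tags))
    return sorted(matches)


index = build_tag_index(catalog)
print(find(index, "cooking"))             # ['The Bread Book', 'Weeknight Suppers']
print(find(index, "cooking", "chicken"))  # ['Weeknight Suppers']
print(find(index, "classic"))             # []
```

The third title exists in the catalog, but with no tags it never enters the index, so no query can surface it.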

We can’t afford to build context into content after the fact.  Doing so irrevocably truncates the deep relationships that authors and editors create and often maintain until the day, hour or minute that containers render them impotent.  Building back those lost links is redundant, expensive and ultimately incomplete.

This isn’t a problem of standards.  At Indiana University, Jenn Riley and Devin Becker have vividly illustrated our abundance of contextual frameworks.  The problem we face, the one we avoid at our peril, is implementing these standards.

Ultimately, that’s a function of workflow.

If strategy is a head, I liken workflow to a circulatory system.  We all know how hard it can be to change organizational direction, but in practice, it’s a matter of coordination.  Decide you want to go somewhere else, and your head tells your arms and legs to swing one way or another.

If you want to change workflow, though, you are looking at the publishing equivalent of a heart transplant.  And starting with context requires publishers to make a fundamental change in their content workflows.

At a time when we struggle to create something as simple as a clean ONIX feed, planning for and preserving connections to content is a challenge of significant proportion.  And we don’t have much time to get this new challenge right.

Although the precise changes in workflow will vary by publisher, certain principles apply.  I think moving from a mindset of “product” to “service” or “solutions” means at least four things for publishers:

• Our content must become open, accessible and interoperable.  Adherence to standards will not be an option;
• Because we compete on context, we’ll need to focus more clearly on using it to promote discovery;
• Because we’re competing with businesses that already use low- and no-cost tools, trying to beat them on the cost of content is a losing proposition.  We need to develop opportunities that encourage broader use of our content; and
• We will distinguish ourselves if we can provide readers with tools that draw upon context to help them manage abundance.

Clearly, we’ll need new skill sets to compete in an era of abundance.  We’ll probably have to add a lot more training than we have ever done internally.  But those aren’t the toughest challenges.  Changing workflow is.

I want to leave you on a stronger, happier note than that, though.  Change can be hard, and we all need reasons to try something different or new.  A short while ago, I asked you to leave Haroun and join me in a leap of imagination.

I’d like to travel back to the Sea of Stories, where the Water Genie is explaining to Haroun that …

“… these were the Streams of Story, and that each colored strand represented and contained a single tale.  Different parts of the Ocean contained different sorts of stories, and as all the stories that had ever been told and many that were still being invented could be found here, the Ocean of the Streams of Story was in fact the biggest library in the universe.  And because the stories were held here in fluid form, they retained the ability to change, to become new versions of themselves, to join up with other stories and so become yet other stories; so that unlike a library of books, the Ocean of the Streams of Story was much more than a storeroom of yarns.  It was not dead, but alive.”

Like Haroun, we in publishing can sometimes become filled with a sense of hopelessness and failure.  But like Haroun, we have been taken in by the magic of the Ocean.  And like Haroun, we’re perched atop a tapestry of breathtaking complexity.

Undeniably, it is a time of remarkable opportunity in publishing, one in which we are able to find and build upon those strands of stories, in context.

Yes, we face a significant challenge preparing for a very different world, but it is a challenge I think we have the insight and experience to meet.  What we choose to do now will begin to determine who writes those countless other stories, as well as which stories get told.

With that, I’ll close my tag.  This story is ending, but I hope that it spurs some stories of your own. As it does, I ask that you all think more about context, and that you continue to imagine.

Thank you for your time this afternoon, and thank you to the Internet Archive for hosting this conversation.

(With special thanks to Peter Brantley, Kirk Biglione, Laura Dawson, Kassia Krozser, Don Linn and Hugh McGuire for their feedback on various drafts of this presentation, as well as Frank O’Leary for his excellent work preparing a visual story to accompany these remarks.)

Posted by Eric Hellman
Oct 21, 2010  at  01:30 PM

No wonder you haven’t posted for a while!

Perhaps I can provoke you to take this further. If this is a “unified field theory” then what sort of particle does it predict?

One quibble. Containers are ALL about the API. The Codex provides a set of standard ways to access and navigate its content. An XML document container similarly builds a set of APIs onto content: DOM, SAX, XPath, etc. The challenge we face is not the dissolution of containers, but rather their recrystallization.

Posted by Brian O'Leary
Oct 21, 2010  at  02:22 PM

Yes, developing these ideas absorbed some bandwidth over the last couple of months :)

I think we agree about containers (in that they have an understood set of APIs).  In some instances, those APIs provide superior value to other options.

What I feel strongly is that using containers as the source of subsequent, digital content is a mistake that cripples the competitiveness of publishers.  That’s what I hope to change.

Posted by Terry Jones
Oct 21, 2010  at  03:47 PM

Hi Brian

It would be interesting to have a talk sometime - I have way too much to say to put it all into a blog comment :D

But, briefly, I think you should consider what happens if containers are made writable by anyone, at any time, without the need to ask for permission or to have your needs anticipated. I.e., what if we had a world of writable containers? That changes many things, though it takes some time to see what and why. A related thought / opinion is that APIs are actually just an intermediate step on the way to something much more open - a world of writable containers. It’s my belief that the latter is a more powerful world than the world of APIs.

I can say more if you’re interested. I’ve been thinking about these things for many years. If you’re curious, have a look at FluidDB, which we’re building for these exact purposes. It may look like a regular boring database, but it’s actually a world of fully writable containers for information.

Posted by Brian O'Leary
Oct 22, 2010  at  11:46 AM

@Terry - You raise an important point, one hinted at (though far from fully addressed) in the middle of my talk, when I said:

“APIs – application programming interfaces – provide users with a roadmap that lets them customize their content consumption.”

To continue the metaphor introduced in the post, though: I have no problem with a variation of containers that can be further evolved by readers (with notes, commentary, fan fiction ... name it).

The containers that trouble me are ones that necessarily truncate context and content.  If we start with this limiting kind of container, discoverability, use and reuse are limited, and the commentary you anticipate would likely be under-informed.

Posted by Noah Genner
Oct 22, 2010  at  02:33 PM

Really great post/talk. Should be required reading.


Posted by Brian O'Leary
Oct 22, 2010  at  04:06 PM

Thanks, Noah.  It was a privilege to address folks at this conference - the entire agenda is really very informative and challenging.

Posted by Mike McNamara
Oct 24, 2010  at  05:35 AM

Excellent post and totally agree about context, metadata and API.

The change to working practices, work flows & tools etc. within publishing needs to significantly speed up just to keep pace with the ever changing market.

Those that don’t change, ‘will’ be left behind wondering what happened, even though they only blinked for a second!

Posted by Charlie Ter Bush
Oct 27, 2010  at  03:51 PM

Fascinating, with a tip of the hat to the theoretical physicists.  Anything digital allows for infinite context, and the trick is in defining it, whether you’re a creator or a user.  I can’t/won’t get into a technology discussion, because that really isn’t the point of your essay. 

The containers blew up long ago in my field, legal publishing, and the search was on for context.  Our entire industry has fastened on user workflow contexts, as well illustrated by your (nicely-named) son.  Publishers provide the bits of content and the workflow (software) tools, bent to whatever strengths they may have (e.g., Bloomberg Law’s new interface, integrating with their business content). 

I think the next iteration, as you say, is for customers in this industry to create their own context and it will be fascinating to see how we respond.

Thanks.  You always make me think, and make it fun.

Charlie Ter Bush

Posted by Brian O'Leary
Oct 27, 2010  at  07:11 PM

Thanks for reading and commenting - the legal profession provides a clear example of how much can change, and how quickly.

Posted by fairuse
Dec 13, 2010  at  04:35 PM

This is great. I am still reading but you have to revisit the “context first” plan now that there is talk of advertising in ebooks.

By the way, I am about ready to kindle my android mobile. I still prefer Audible on the small screen. Time will tell.

Posted by Peter Turner
Mar 04, 2011  at  03:48 PM

Hi Brian: Really thought-provoking post. The dimension of content that most strikes me is how it relates to discoverability in an increasingly content-abundant environment. I come at this from a slightly different angle, from that of community and affiliation. If the environment we might choose to inhabit online is narrower, but deeper, in content, the value of what is discovered is greater if one’s affiliation with that content is greater. Make any sense?—Peter

Posted by Brian O'Leary
Mar 04, 2011  at  04:06 PM

Perfect sense.  I work with a range of association publishers, almost all organized around community.  The most effective ones align content and community, even going so far as to tap members to write, structure and edit the content.

Charlie Ter Bush’s comment, a couple above yours, talks about his experience with the legal profession.  The closer the connection between content and community, the more valued the asset becomes.

Without adequate contextual tagging, those connections are tough to establish in a content-abundant universe.  Established publishers with premium content could find themselves pushed aside not by a better-written book or article, but a more discoverable (and thus useful) one.
