Using technology to make in-house style guides more efficient

Companies that produce a lot of content, employ a lot of people, and want a consistent voice across their messaging often have a style guide, which is good. Leveraging technology can make it even better.

Let’s say a company wants certain text in uppercase and certain text in title case. Often the copywriters are expected to type the content in the required style. To me, this is the same as hard-coding boldness or color. Like other text transformations, case can be controlled by CSS. Companies could therefore simplify by having copywriters write all headlines and titles in title case. The content management system would label each item (via class or id) with a category, such as headline, subhead, etc., and the case of each could then be controlled via CSS. Not only would this simplify things for copywriters, cutting down on user error, but it would also make it simple to shift copy to a new format if the style guide ever changed.
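
A minimal sketch of that stylesheet, assuming hypothetical category classes emitted by the CMS:

```css
/* Hypothetical classes applied by the CMS; copywriters always write in
   title case, and the stylesheet decides the final casing per category. */
.headline { text-transform: uppercase; } /* "Big Launch Today" becomes "BIG LAUNCH TODAY" */
.subhead  { text-transform: none; }      /* left in title case as written */
```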

There are caveats. CSS cannot natively produce proper title case or sentence case: its capitalize transform simply uppercases the first letter of every word, and there is no transform that selectively lowercases text. Title case has quirky rules (certain words are capitalized, others aren’t) that would have to be scripted. Transforming to sentence case presents another problem: there would be no way to preserve or recreate the capitalization of proper nouns. If an organization knew every proper noun that might appear in its copy, this could be scripted as well, but something would almost certainly be missed, making it an imperfect solution.

I would handle all these situations by simply having copywriters write everything in sentence case. Transformations to title case could be achieved through scripting, as there are finite rules as to what gets capitalized. The capitalization of proper nouns would also be preserved. Meanwhile, changing to uppercase would be a simple CSS transformation.
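
Here’s a rough sketch of the scripting half; the list of minor words and the proper-noun handling are illustrative only, not a full style guide implementation, and uppercase would remain a one-line text-transform in CSS.

```ts
// Rough sketch: convert sentence-case copy to title case.
// The minor-word list is illustrative; a real style guide would define its own.
const MINOR_WORDS = new Set([
  "a", "an", "the", "and", "but", "or", "nor", "for", "of", "on", "in", "to", "at", "by",
]);

function toTitleCase(sentence: string): string {
  const words = sentence.split(" ");
  return words
    .map((word, i) => {
      // Words the copywriter already capitalized (proper nouns) are left alone.
      if (word[0] && word[0] === word[0].toUpperCase()) return word;
      // Minor words stay lowercase unless they are the first or last word.
      if (MINOR_WORDS.has(word.toLowerCase()) && i !== 0 && i !== words.length - 1) {
        return word.toLowerCase();
      }
      return word.charAt(0).toUpperCase() + word.slice(1);
    })
    .join(" ");
}

// toTitleCase("The future of content at Acme") -> "The Future of Content at Acme"
```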

This discussion has been web-focused, but I imagine something similar could be done for print.

Speaking of print, it amazes me that some organizations keep their print and online content storage separate. I would put them all together in a robust, customized CMS. Yes, the two have different needs, and those would have to be dealt with. But there is also a lot of crossover. Having everything in the same place would ensure consistency across the organization’s media.

The future of content, Part 4

I’ve talked about redesigning the web into a collection of interconnected pieces of content, and I’ve discussed monetizing such a paradigm. Now I’d like to go further into the value this reconstruction would bring to content creators, sharers, and users.

The way the web works right now, content creators and sharers typically must either have their own website or use third-party services in order to build an audience and make money. Under this paradigm, the websites (or their content streams) are the main point of interest, and the onus is on the site owners and managers to “keep the content fresh”. In the case of businesses, this includes finding and hiring/contracting creators and negotiating licensing agreements with third-party content providers. The now-now-now pace puts pressure on creators to write something, anything, in order to keep people coming back to the site. This has resulted in a glut of content that is posted for the sake of having new content posted. SEO marketing has exacerbated the issue with content posted for the sake of higher search engine rankings. People are wasting more and more time reading navel-gazing content that adds little value to the human community.

With a web that is truly content-driven, the focus would shift from trying to keep thousands of disparate sites and streams “fresh” to trying to produce and share content that is meaningful, impactful, and important. With IP issues handled through robust tagging, content would be available for anyone to share. Licensing would be streamlined, and creators would be directly paid for their work. Media houses could more confidently keep creators on staff; sharing would provide an obvious metric of a creator’s value. Creators could focus on more long-form pieces, knowing that their existing work would continue to be shared and monetized. There would be less pressure to post something, anything, every day.

The web has suffered from the adoption of the “always on” mindset. If there is nothing new to report, there is no need to invent something to report. Someone, somewhere, is always producing content; it’s a big world. Rather than polluting millions of streams with junk, media companies, news organizations, marketers, and individuals should shift their focus to finding and sharing value. Simply aggregating RSS feeds or repurposing content the way we’ve been doing it so far is not enough; it does not meet the needs of the user and it does not ensure that content creators are paid for their work. We need to rebuild the system from the ground up.

The future of content, Part 3

Over the past two days I’ve described a new model for web architecture, one whose primary unit is an individual piece of content stored in a universal repository, rather than a product (page, feed, API, etc.) hosted on a web server. (Read Part 1; read Part 2.) Today I’ll discuss how such a system might be monetized.

Currently, content is shared in many disparate ways. The Associated Press has its own proprietary format for allowing other news sites to automatically repost its content; it also allows its lower-tier affiliates to manually repost (i.e., by copying and pasting into their own content management system), so long as the copyright notice remains intact. Sites pay to be affiliates. Bloggers, of course, have done the manual copy-and-paste thing for years; nowadays a pasted excerpt with a link to the original is considered standard, and this of course brings little money to the original creator. Video sites, too, have their own distinct mechanisms for letting users share. Embedded video advertising allows the content creator to make some money on shares…assuming someone hasn’t simply saved the video and reposted it. Data is far more difficult to share or monetize. Some sites offer an API, but few laypeople know what to do with such a thing. The typical social media way of sharing data is by posting a still image of a graph or infographic, which is neither contextualized nor accessible.

In a system where every piece of content is tagged by creator and sharing any type of media is simple, IP could be more easily secured and monetized. Content tags could include copyright types and licensing permission levels. A piece of content might, for example, be set to freely share so long as it is always accompanied by the creator’s advertising. Ads could be sponsorship watermarks, preroll video, display banners or text that appear within the content unit, or something else entirely. The content creator would determine what advertising would be available for each piece of content, and the content sharers would each decide what advertising they are willing to have appear, or whether they’d rather purchase an ad-free license. Resharers who took the content from someone else’s share would not avoid the advertising choice: while they would have found the content at another sharer’s site or stream, the content itself would still be the original piece, hosted at the original repository, with all the original tags intact, including authorship and advertising.
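
To make that concrete, here is one hypothetical shape such tags might take; the field names are mine, not part of any existing standard.

```ts
// Hypothetical tag structure for a single content unit; names are illustrative.
interface ContentTags {
  contentId: string;                  // permanent identifier in the repository
  creator: { name: string; url?: string };
  copyright: "all-rights-reserved" | "creative-commons" | "public-domain";
  publicDomainDate?: string;          // ISO date on which the work enters the public domain
  licensing: {
    freeToShare: boolean;             // may be shared without negotiation...
    adsRequired: boolean;             // ...so long as the creator's advertising rides along
    adFormats: Array<"watermark" | "preroll-video" | "display-banner" | "inline-text">;
    adFreeLicenseAvailable: boolean;  // sharers may instead purchase an ad-free license
  };
}
```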

Content could also be set to automatically enter the public domain at the proper time, under the laws governing its creator, or perhaps earlier if the creator so wishes.

The first step in making all of this work is to have all content properly tagged and a system wherein content tags are quickly updated and indexed across the internet. The second step would be making sharing the “right” way so easy that very few would attempt to save someone else’s content and repost it as their own. As I mentioned in Part 2, I’m imagining browsers and sites that offer a plethora of in-browser editing and sharing options, far easier (and less expensive!) than using desktop applications. Making sharing and remixing easy and browser-based would also cut down on software piracy. Powerful creation suites would still be purchased by the media producers who need them to make their content, but the average person would no longer require a copy of Final Cut Pro to hack together a fan video based on that content.

The kind of tagging I’m talking about goes somewhat beyond the semantic web. Tags would be hard-coded into content, not easily removed (or avoided by a simple copy and paste). A piece of content’s entire history would be stored as part of the unit. Technologically, I’m not sure what this would involve, or what problems might arise. It occurs to me that over time a piece of content would become quite large through the logging of all its shares. But making that log indivisible from the content would solve many issues of intellectual property rights on the internet today. Simply asking various organizations who host disparate pieces of content to tag that content properly and then hoping they comply will not lead to a streamlined solution, especially given the problem of “standards” (as spoofed by xkcd).
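
As a purely illustrative sketch of what an indivisible, append-only share log might look like if it traveled with the unit itself:

```ts
// Illustrative only: every share event is appended to the unit's own history,
// so the log travels with the content rather than living on a sharer's server.
interface ShareEvent {
  sharerId: string;
  sharedAt: string;        // ISO timestamp
  context: string;         // the site, stream, or remix where the share appeared
  parentShareId?: string;  // present when the content was reshared from another share
}

interface ContentUnit {
  contentId: string;
  creator: string;
  body: unknown;           // the media itself: text, video, data, etc.
  history: ShareEvent[];   // append-only log, indivisible from the unit
}

function recordShare(unit: ContentUnit, event: ShareEvent): void {
  // In practice this would be an atomic repository operation, not a local
  // mutation; it is shown this way only to illustrate the append-only idea.
  unit.history.push(event);
}
```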

With a system like this, the web rebuilt from the bottom up, there would be no need for individual content creators to reinvent the wheel for websites, APIs, DRM, and advertising. They could instead focus on producing good content and contextualizing it into websites and streams. Meanwhile, the hardcore techies would be the ones working on the underlying system: the content repository itself, the way streams are created, how tagging and logging occur, how sharing is tracked, and so on. Media companies, or anyone else, could contribute to this process if they wanted to, but the point is they wouldn’t have to.

The future of content, Part 2

(This is the second in a series of posts about the future of content creation and sharing online. Part 1 contains my original discussion, while Part 3 considers monetization.)

Yesterday I imagined a web architecture that depends on individual pieces of highly tagged content, rather than streams of content. Today I’d like to expand on that.

Right now when a creator posts something to the web, they must take all their cues from the environment in which they are posting. YouTube has a certain category and tag structure. Different blogging software handles post tagging differently. News organizations and other media companies have their own specialized CMSes, either built by third parties, built in-house, or built by third parties and then customized. This ultimately leads to content that is typically only shareable through linking, copy-and-paste, or embedding via a content provider or CMS’s proprietary solution.

None of this is standardized. Different organizations adhere to different editorial guidelines, and these likely either include different rules for content tagging or neglect to discuss content tagging at all. And of course, content posted by individuals is going to be tagged or not tagged depending on the person’s time and interest in semantic content.

The upshot is, there is no way, other than through a search engine, to find all content (not just content from one specific creator) that relates to a certain keyword or phrase. And since content is tagged inconsistently across creators, and spammers flood the web with useless content, search engines are a problematic solution to content discovery.

In my idealized web, creators would adhere to a certain set of standards when posting content. The content posting interface would automatically give each section of content its own unique identifier, and the creator would be able to adjust these for accuracy–for example, marking an article as an article, marking the title as the title, and making sure each paragraph was denoted as a paragraph. If this sounds like HTML5, well, that’s intentional. I believe that in the interest of an accessible, contextualized web of information, we need all content posting interfaces to conform to web standards (and we need web standards to continue to evolve to meet the needs of content).
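
One way to picture the output of such an interface, with hypothetical identifiers and role names of my own invention:

```ts
// Hypothetical result of a standards-aware posting interface: every structural
// unit gets its own identifier that sharing and remixing can later reference.
interface ContentSection {
  id: string;                                    // e.g. "article-123.p2"
  role: "article" | "title" | "paragraph";
  text?: string;
}

const article: ContentSection[] = [
  { id: "article-123",    role: "article" },
  { id: "article-123.t",  role: "title",     text: "The future of content, Part 2" },
  { id: "article-123.p1", role: "paragraph", text: "Yesterday I imagined a web architecture..." },
  { id: "article-123.p2", role: "paragraph", text: "Right now when a creator posts something..." },
];
```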

Further, I think such systems should tag each unit of content so that its context and its sharing and linking history can be logged. This would provide extraordinarily rich information for data analysts, whose field is already growing and would explode upon adoption of this model.

In my vision, content would not be dependent on an individual or an organization to host it on a website at a particular IP address. Instead, there would be independent but interconnected content repositories around the world where all content would reside. “Permalinks” would go straight to the content itself.

Browsers would become content interpreters, bringing up individual pieces of content in a human-comprehensible way. Users could have their own browser settings for the display of different kinds of content. Theming would be big. And a user’s browser history could allow that browser to suggest content, if the user opted in.

But websites would still exist; content interpretation would not be the sole domain of browsers. Rather than places where content is stored and then presented, websites would be contextualized areas of related content, curated by people or by processes or both. Perhaps a site owner would always be able to post and remix their own content, but would need to acquire a license to post or remix someone else’s. Perhaps different remix sites would pop up, sites with in-browser video and image editing, that would allow users to become creators. All remixes would become bona fide content, stored in the repository; anyone could simply view the remix from their browser, but community sites could also share streams of related remixes.

With properly tagged content that is not tied to millions of different websites, content streams would be easy for anyone to produce. Perhaps browsers would facilitate this; perhaps websites would do so; perhaps both. The web would shift from being about finding the right outlets for content to finding the right content interpreter to pull in the content the user wants, regardless of source.

Such a system would have “social media” aspects, in that a user could set their browser or favorite content interpretation website to find certain kinds of content that are shared or linked by their friends, colleagues, and people of interest to them. This information, of course, would be stored with each piece of content in the repository, accessible to everyone. But users would also be able to opt out of such a system, should they wish to be able to share and remix but not have their name attached. The rest of the trail would still be there, linking from the remix to the original pieces, such that the content could be judged on its worth regardless of whether the creator was “anonymous user” or a celebrity or a politician or a mommy blogger.

Under this sort of system, content creators could be as nit-picky about the presentation of their content as they wanted. They could be completely hands-off, submitting content to the repository without even creating a website or stream to promote or contextualize it. Or they could dig in deep and build websites with curated areas of related content. Media companies that produce a lot of content could provide content interpretation pages and content streams that take the onus of wading through long options lists away from the user and instead present a few options the creator thinks users might want to customize. The point is, users would be able to customize as much as they wanted if they dug into the nitty-gritty themselves, but content creators would still be able to frame their content and present it to casual users in a contextualized way. They could also use this framework, along with licensing agreements, to provide content from other creators.

Comments would be attached to content items, but also tagged with the environment in which they were made–so if they were made on a company’s website, that information would be known, but anyone could also see the comment on the original piece of content. Content streams made solely of comments would be a possibility (something like Branch).

This system would be extremely complex, especially given the logging involved, but it would also cut down on a lot of duplication and IP theft. If sharing is made simple, just a few clicks, and all content lives in the same place, there’s no reason for someone to save someone else’s picture, edit out the watermark, and post it elsewhere. Since all content would be tagged by author, there would actually be no reason for watermarks to exist. The content creator gets credit for the original creation, and the person who shares gets credit for the share. This would theoretically lead to users following certain sharers, and perhaps media companies could watch this sort of thing and hire people who tend to share content that gets people talking.

Obviously such a paradigm shift would mean a completely different approach to content creation, content sharing, commenting, and advertising…a whole new web. I haven’t even gotten into what advertising might be like in such a system. It would certainly be heavily dependent on tagging. I’ll think more about the advertising side and perhaps address it in a Part 3.

The future of content

(This is the first in a series of posts about the future of content creation and sharing online. Part 2 expands on the ideas in this post, while Part 3 considers monetization.)

I recently read Stop Publishing Web Pages, in which author Anil Dash calls for content creators to stop thinking in terms of static pages and instead publish customizable content streams.

Start publishing streams. Start moving your content management system towards a future where it outputs content to simple APIs, which are consumed by stream-based apps that are either HTML5 in the browser and/or native clients on mobile devices. Insert your advertising into those streams using the same formats and considerations that you use for your own content. Trust your readers to know how to scroll down and skim across a simple stream, since that’s what they’re already doing all day on the web. Give them the chance to customize those streams to include (or exclude!) just the content they want.

At first I had the impression that this would mean something like RSS, where content would be organized by publish date, but customizable, so a user could pick which categories/tags they wanted. This sounded like a great way to address how people currently approach content.

Upon further contemplation, though, I don’t think it would go far enough. Sorting by date and grouping by category seem like good options for stream organization, but why limit ourselves? What if I want to pull in content by rating, for example?

What if, alongside a few curated content streams, users visiting a content creator’s site had access to all possible content tags, so that power users could not only customize existing streams but also create their own? As they chose tags, the remaining options would narrow dynamically, based on which content matched the chosen tags and which other tags that content carried. I’d want to be able to apply sub-tags when customizing a stream, so, for example, I could build a recipe stream that included all available beef entree recipes, but only sandwiches for the available chicken entree recipes. The goal would be to give users as much or as little power as they want, while creators maintained ownership of the content.
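
A rough sketch of how that narrowing might work, using made-up recipe data:

```ts
// Sketch of dynamic tag narrowing: given the tags a user has already chosen,
// return the other tags still available on the matching content. Data is made up.
type Tagged = { id: string; tags: string[] };

const recipes: Tagged[] = [
  { id: "r1", tags: ["recipe", "entree", "beef", "stew"] },
  { id: "r2", tags: ["recipe", "entree", "chicken", "sandwich"] },
  { id: "r3", tags: ["recipe", "entree", "chicken", "casserole"] },
];

function remainingTags(items: Tagged[], chosen: string[]): Set<string> {
  const matches = items.filter((item) => chosen.every((t) => item.tags.includes(t)));
  const remaining = new Set<string>();
  for (const item of matches) {
    for (const t of item.tags) {
      if (!chosen.includes(t)) remaining.add(t);
    }
  }
  return remaining;
}

// remainingTags(recipes, ["entree", "chicken"]) -> Set { "recipe", "sandwich", "casserole" }
```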

Think of all the fun ways users could then curate and remix the content. Personal newspaper sites like paper.li have already given us a glimpse of the possibilities, but with properly tagged content, the customization could be even better, especially if the content curation system they’re using is flexible. Users could pick the images they want, create image galleries, pull in video, and put everything wherever they wanted it, at whatever size they wanted, using whatever fonts and colors they wanted. And what if each paragraph, or perhaps even sentence, in an article had a unique identifier? A user could select the text they want to be the highlight/summary for the piece, without having to copy and paste (and without the possibility of inadvertently misrepresenting the content).

And what if the owner of the content could tell what text was used to share the content? With properly tagged content within a share-tracking architecture, each sharing instance would serve as a contextualized trackback to the content owner. Over time, they’d have aggregate sharing data that would provide valuable audience information: who shared the content, what text they used, what pictures they used, what data they used, what video they used. Depending on how the sharing architecture is built, perhaps the content owner could even receive the comments and ratings that are put on the content at point-of-share, helping them determine where to look for feedback. They could see who shared the content directly and who reshared it from someone else’s share. Whose shares are getting the most reshares? How do those content sharers share the content? What is the context; what other content are they sharing in that space? This could inform how the content owner chooses to share the content on their own apps and pages.

Websites would still exist, of course. They would just be far more semantic and dynamic. Rather than being static page templates, they’d be context-providing splash pages, pulled together by content curators. Anything could be pulled into these pages and placed anywhere; curators could customize the look and feel and write “connector text” to add context (such as a custom image caption referring back to an article). This connector text would then become a separate tagged unit associated with the content it is connecting, available for use elsewhere. The pages themselves would serve as promotional pieces for content streams users could subscribe to; the act of visiting such a page could send the user the stream information. And content shared alongside other content would then be linked to that content. Whenever a content creator presented two pieces of their own content together, those pieces would be tagged as explicitly linked. Content would also be tagged as linked whenever sharers presented it together, regardless of creator. Perhaps explicit links would be interpreted by the sharing architecture as stronger than other links; perhaps link strength could be dynamically determined by the number of presentations, whether the content had the same creator, and the trust rating of the sharers involved. Regardless, users could then browse through shares based on link strength if they chose.
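
As a purely hypothetical illustration of such a link-strength calculation (the inputs mirror the factors above; the weights are arbitrary):

```ts
// Purely hypothetical link-strength score between two pieces of content.
// The inputs mirror the factors mentioned above; the weights are arbitrary.
interface LinkSignals {
  timesPresentedTogether: number;  // how often the two pieces have appeared together
  sameCreator: boolean;            // explicitly linked by their shared creator
  sharerTrust: number;             // average trust rating of the sharers involved, 0 to 1
}

function linkStrength(s: LinkSignals): number {
  const presentationScore = Math.log1p(s.timesPresentedTogether); // diminishing returns
  const explicitBonus = s.sameCreator ? 2 : 0;                    // explicit links count more
  return presentationScore * (0.5 + 0.5 * s.sharerTrust) + explicitBonus;
}

// linkStrength({ timesPresentedTogether: 20, sameCreator: false, sharerTrust: 0.8 })
// ≈ 3.04 * 0.9 ≈ 2.74
```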

Author and copyright information would be built into this sharing system. Ideally, authors would be logged into their own account on a content management system such that their author information (name, organization, website, etc.) would be automatically appended to any content they create or curate. There would probably need to be a way for users to edit the author, to allow for cases where someone posts something for someone else, but this would only be available at initial content creation, to avoid IP theft. This author information would then automatically become available for a “credits” section in whatever site, blog, app, or other managed content area that content is pulled into. Copyright would be protected in that author information is always appended and the content itself isn’t being changed as it’s shared, just contextualized differently. Every piece of content would link back to its original author.

I’m imagining all of this applying to everything–not just text-based articles and still images, but spreadsheets, interactive graphs, video. Users would have in-browser editing capabilities to grab video clips if they didn’t want to present the entire video. They’d have the ability to take a single row out of a table to make a graph. Heck, they’d have the ability to crop an image. But no matter how they chopped up and reassembled the content, it would always retain its original author and copyright information and link back to the whole original. Remixes and edits would append the information of the sharer who did the remix/edit.

Essentially, rather than pages or even streams, I’m seeing disparate pieces of content, linked to other content by tags and shares. All content would be infinitely customizable but still ultimately intact. This would serve the way people now consume content and leave possibilities open for many different means of content consumption in the future. Meanwhile, it would provide valuable data to content creators while maintaining their content ownership.

I would love to work on building such a system. Anybody else out there game?

Since writing this post, I’ve found some related sites and thoughts: