Showing posts with label PDF. Show all posts

Wednesday, June 16, 2010

Why PDF?

Posted by Mark Brousseau

What's the purpose of PDF? Why can't you just send Word or Excel files? And why should you bother converting to PDF? Duff Johnson (duffjohnson@appligent.com), CEO of Appligent Document Solutions, explains:

Very few “love” PDF, but we all need it, because PDF is electronic paper.

For the efficient and reliable delivery of final-form electronic documents, there's nothing else quite like a PDF file.

For business and government organizations, “posting the PDF” is now essentially THE physical act of publication. Pretty much everyone with a computer is assumed to have a PDF Reader; it's a standard assumption in millions of interactions between consumers, business and government every day. Hundreds of millions of people “PDF it” when they want to share some content.

Current squabbles between the two companies aside, even Apple's display technology is based on Adobe's PDF.

So how did electronic paper get defined as PDF?

Fundamentally, it's all about portability. Reliable viewing and printing across platforms is one of the great Killer Apps of all time.

There are other technologies that deliver some of PDF's complete package, but PDF is built from the outset to work the same way in all places, period. It turns out that's the most important thing of all. There is a set of very specific reasons why PDF is the world's choice for electronic paper. No other format offers this combination of attributes.

Easy to make and share
Sure, you can send a Word, HTML, PowerPoint or any other file. But other formats, while just as easy to attach to an email, aren't quite as easy to share as PDF.

First and foremost, you can't be sure the recipients have the same version of PowerPoint (or whatever you are sending). You may not want to give them the ability to edit the document, and you don't want the hassle of passwords. Making a PDF is usually just a click or two, and for that amount of effort, it's clearly a smart move.

A typical Acrobat or Reader user doesn't think about their choice to use PDF at a fundamental level. They make, send and use PDF files precisely because, hey – why worry? PDF just works.

WHY YOU MIGHT CARE: Who doesn't like it easy?

Reliable, manageable presentation
There's just no excuse for poor presentation. From elaborate graphic design to simply making sure the page breaks happen just the way you've set them up, PDF delivers you from worrying about what it's going to look like or print on the other end.

Other formats might not look quite the same when opened on different machines, or can't be opened on a Mac. There may be font dependencies, differing page sizes, or other application or user settings that affect appearance. There may be undesirable information such as slide-show notes, metadata or track-changes information that's really a part of the file, and you might not want to share it!

Not only does PDF provide a completely faithful, high-fidelity rendering of your source document, but you can mix and match it with other documents from other sources. There's detailed management of all sorts of document functions, navigation features, accessibility and more, and it's all just ready to go, for users on every platform, inside of each and every PDF file.

WHY YOU MIGHT CARE: PDF delivery is entirely manageable and utterly predictable.

Convert from any source, use in every workflow
PDF files may be created from any application that can print, including desktop publishing, office software, design, database report and other applications. PDF files may also be produced from scanners, either with or without searchable text via OCR. You can even take a screen-shot and convert it to PDF and combine it with other PDF pages.

WHY YOU MIGHT CARE: Users can learn to make PDF files from any software in seconds, and every PDF file works with every other PDF file, so they can be shuffled and reorganized like... paper pages.

Smaller file-size, yet fully searchable
When converting to a PDF file, it's usually possible to reduce the file-size substantially below that of the original source files. Even for scanned documents, conversion to PDF generally means smaller files - and more importantly, scanned pages can be made into searchable PDFs.

WHY YOU MIGHT CARE: Although hard-drives are getting larger and larger, a 195 KB PDF file is usually preferred over a 2.95 MB Word file, especially if users aren't expected to edit it.
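To put the two file sizes quoted above in proportion, here is a minimal arithmetic sketch. The helper name `size_reduction` is hypothetical, and the calculation assumes binary units (1 MB = 1024 KB); with decimal units the result shifts only slightly.

```python
def size_reduction(source_bytes: float, pdf_bytes: float) -> float:
    """Return the fraction of the source size saved by the PDF (0.0 to 1.0)."""
    if source_bytes <= 0:
        raise ValueError("source size must be positive")
    return 1.0 - pdf_bytes / source_bytes


# Assumption: binary units, 1 KB = 1024 bytes and 1 MB = 1024 KB.
KB = 1024
MB = 1024 * KB

word_size = 2.95 * MB  # the Word file from the example above
pdf_size = 195 * KB    # the PDF file from the example above

saving = size_reduction(word_size, pdf_size)
ratio = word_size / pdf_size
print(f"PDF is {saving:.1%} smaller, roughly a {ratio:.1f}x reduction")
# prints: PDF is 93.5% smaller, roughly a 15.5x reduction
```

These are the sizes of one illustrative pair of files, not a measurement; actual savings depend heavily on the source content and conversion settings.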

Self-contained
Unlike most authoring formats, a properly-made PDF file includes all content, fonts, images, structure, signatures, encryption, scripts and other resources necessary to the appearance and proper function of the file in an ISO 32000 conforming reader.

PDF just works everywhere; it has no server or style-sheet dependencies, and each page may be extracted into its own self-contained PDF file.

WHY YOU MIGHT CARE: Self-contained files are inherently rugged and adaptable; for example, they can go offline, be emailed, FTPed or accessed in any preferred manner, always with the same result.
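As a rough illustration of what "self-contained" means at the file level, the sketch below assembles a tiny one-page PDF by hand using only the Python standard library. The function name `build_minimal_pdf` is hypothetical, and the file is deliberately minimal (one page, one filled rectangle, no fonts or images). Everything a reader needs, including the cross-reference table that locates each object, sits inside the single byte string.

```python
def build_minimal_pdf() -> bytes:
    """Assemble a one-page PDF whose every resource lives inside the file."""
    # Page content: set a mid-grey fill and paint a 200 x 100 point rectangle.
    content = b"0.5 g\n100 600 200 100 re\nf"
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << >> /Contents 4 0 R >>",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte position where this object starts
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    # Cross-reference table: one fixed-width, 20-byte entry per object.
    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset

    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)


pdf_bytes = build_minimal_pdf()
print(f"Built a {len(pdf_bytes)}-byte PDF with no external dependencies")
```

Writing `pdf_bytes` to a file opened in binary mode should give a document that viewers can open offline. Real PDF producers add much more (embedded fonts, compression, metadata), so treat this as a teaching sketch rather than a template.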

Makes content from any source accessible to users with disabilities
One of the great beauties of PDF is the ability to make almost any source content accessible to users with disabilities who must use Assistive Technology (AT) in order to read. From scanned documents to drawings, diagrams and multilingual content, PDF files may be tagged to provide a complete, high-quality reading and navigating experience.

Many applications can't generate accessible content by themselves, but converted to PDF, these documents may be structured and tagged for complete accessibility.

WHY YOU MIGHT CARE: For Federal agencies and contractors, Section 508 requires that electronic documents be accessible. Other jurisdictions are beginning to adopt similar regulations, and many businesses are choosing to post accessible content.

A multiplatform International Standard
PDF is a truly multiplatform technology, and it's here to stay. PDF is equally at home on Windows, Mac OS X, Linux, UNIX, Android and any other operating system.

No-one has ever had to pay Adobe a royalty to make PDF files, and the company has published the PDF Reference since the beginning. Until recently, Adobe kept the copyright and updated the Reference, the “rules of the road” for PDF, as they wished.

In 2008, Adobe ceded control of the PDF specification to ISO, the International Organization for Standardization. Now known as ISO 32000, PDF is an International Standard; it is no longer owned by Adobe Systems but is managed by diverse members of the electronic document industry, with free and open access to all interested parties as observers or full voting members.

WHY YOU MIGHT CARE: While PDF is everywhere, one lingering doubt for some has been the idea that Adobe Systems “owned” PDF and therefore, adopting PDF for critical business functions would create a vulnerability. Turning over PDF to ISO is the categorical solution to this concern – Adobe Systems or no, PDF is here to stay, and no-one owns your PDF files except you.

What do you think?

Sunday, May 16, 2010

Is PDF an open standard?

Is PDF an open standard? Duff Johnson, CEO, Appligent Document Solutions, weighs in:

On May 13, the founders of Adobe Systems stepped up to the microphone to deliver a response to Steve Jobs' open letter about Flash. They say Adobe has acted on open standards while Apple offers mere words.

At the outset, I must acknowledge that I owe my livelihood to the genius of these two gentlemen. The inventors of PostScript and PDF and the creators of Adobe Systems, Warnock and Geschke are gods in my Pantheon. They are the founding fathers of technologies that have been instrumental in making computers relevant to the modern everyday operations of government and business.

That said, claims about PDF being a true open standard need to be placed in context.

Adobe Systems has published the PDF Reference, the rulebook for PDF developers, since 1993. At the very beginning, if you wanted to make, view or manipulate PDFs you bought the book in the store for a few dollars. Pretty soon it was (and still is) available online at no charge.

On July 1, 2008, version 1.7 of the PDF Reference was rewritten as ISO 32000, a document managed by committees under the auspices of the International Organization for Standardization. ISO 32000 is managed by individual representatives of interested parties in open meetings under parliamentary rules. Anyone can observe and participate. While it is obviously heavily invested in the outcome of the committee's decisions, Adobe Systems has only one vote at the table, the same as any other.

By now, the rulebook for PDF is relatively mature and precise in its language. It was not always so. Adobe's very openness – their willingness to let third-parties in to make their own PDFs before the PDF Reference was a mature document – was and continues to be a source of pain.

Three of five PDF viewers displayed this PDF incorrectly.

When millions of PDF files from hundreds of different applications started flying around, two major problems with the rulebook for PDF emerged.

First, while the Reference set rules, it was not a cookbook; it included no recipes for how to create content on a PDF page.

Second, the Reference was ambiguous in some areas and left other matters under-considered, sometimes unaddressed.

When dealing with real-world documents, Adobe's software had to deal with these vagaries, so more rules were written; specific details of their implementation were crafted to address the issues encountered in the real world.

These new rules, however, were in the software, not the Reference. As the Reference developed, Adobe's implementation and the published rules began to diverge. It became possible to create a “legal” PDF file that otherwise perfectly serviceable software couldn't handle quite right (or handled dead wrong). In fact, because the early versions of the PDF Reference were so vague (relatively speaking), the range of possible oddities that were legal in a PDF was very wide indeed. A lot of sloppy PDF software was (and still is) written for this reason.

I remember discussing this problem with Adobe developers in the late 1990s. First and foremost, we all knew PDF had to be reliable. PDFs had to display the same way on-screen and in-print, no matter the platform. The problem with these “legal” but otherwise oddball PDF files was that if they displayed with problems in Adobe Reader, then Adobe (not the PDF's producer) would get the blame.

A pattern was established in which poorly-structured PDF files were roaming around in the wild, and that problem has worsened over time. As PDF has grown more popular, more and more applications of widely varying quality make bad PDF.

Adobe's solution was to engineer Adobe Reader to handle all the various oddball PDF files out there. It's one of the main reasons why Adobe Reader is a larger application to download and install compared to its rivals. Reader includes lots of code to deal with the thousands of different types of exceptions to “good” PDF that Reader users worldwide can and will encounter on a regular basis.

In their attempt to ensure that even the sloppiest PDF files still worked, Adobe created a situation in which developers could use (and have used) Adobe's Reader as the reference implementation for their PDF software.

In 2010, there is still no alternative to Adobe Reader when it comes to validating third-party software.

As the vice-chair of ISO 32000, that bothers me, and if you're relying on the idea that PDF is indeed an International Standard in your organization, it should bother you too.

To make the final move in ensuring PDF is a durable international standard, Adobe should release their test suite of PDF files used to test Adobe Reader. This could take form in several ways, the simplest of which would be a collection of PNG images demonstrating the authoritative rendering of example PDF pages.

This test suite should be referenced in the upcoming ISO 32000-2, the forthcoming update to the International Standard for PDF.

Only when this step is taken will it become possible to validate the open standard of ISO 32000 without the proprietary Adobe Reader, an objective which is fundamental to the project of PDF as an International Standard.
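To make the proposal concrete, here is a minimal sketch of how a viewer might be checked against such a suite. No open suite of this kind is described as existing here, so everything in the sketch is hypothetical: the names `mismatch_ratio` and `check_viewer`, the in-memory pixel grids standing in for decoded reference PNGs, and the thresholds.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels, e.g. decoded from a reference PNG


def mismatch_ratio(reference: Image, candidate: Image, tolerance: int = 0) -> float:
    """Fraction of pixels with any channel differing by more than `tolerance`."""
    same_shape = len(reference) == len(candidate) and all(
        len(ref_row) == len(cand_row)
        for ref_row, cand_row in zip(reference, candidate)
    )
    if not same_shape:
        return 1.0  # different dimensions count as a complete mismatch
    total = sum(len(row) for row in reference)
    if total == 0:
        return 0.0
    bad = sum(
        1
        for ref_row, cand_row in zip(reference, candidate)
        for ref_px, cand_px in zip(ref_row, cand_row)
        if any(abs(a - b) > tolerance for a, b in zip(ref_px, cand_px))
    )
    return bad / total


def check_viewer(
    references: Dict[str, Image],
    rendered: Dict[str, Image],
    max_mismatch: float = 0.0,
    tolerance: int = 0,
) -> Dict[str, bool]:
    """Map each reference page name to True if the viewer's rendering passes."""
    return {
        name: name in rendered
        and mismatch_ratio(ref, rendered[name], tolerance) <= max_mismatch
        for name, ref in references.items()
    }


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
suite = {"page-1": [[WHITE, BLACK], [BLACK, WHITE]]}          # authoritative rendering
good_viewer = {"page-1": [[WHITE, BLACK], [BLACK, WHITE]]}    # matches exactly
sloppy_viewer = {"page-1": [[WHITE, WHITE], [BLACK, WHITE]]}  # one pixel wrong

print(check_viewer(suite, good_viewer))    # prints: {'page-1': True}
print(check_viewer(suite, sloppy_viewer))  # prints: {'page-1': False}
```

A real harness would also need to fix resolution, color handling and anti-aliasing so that two correct viewers produce comparable pixels, which is why the sketch exposes a tolerance instead of demanding exact equality.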

Establishing an open test suite will make PDF truly an open standard in the spirit of Warnock and Geschke's letter. The advantages for consumers will be substantial. Adobe and software developers can produce conversion software to resolve the old files.

With no further excuse for sloppy code, non-compliant software will tend to die away, removing a major source of problems.

PDF will become truly reliable and based not only on an international standard, but one that may be readily validated.

Adobe will have begun the process of liberating itself from supporting old (and now invalid) PDFs, and will eventually be free to re-direct engineering resources away from propping up other people's software and into creative development.

I can't imagine a world without PDF; if it didn't exist it would have to be invented. PDF is indeed an open standard, but it's incomplete. It's time to finish the story and end the practice of making Adobe Reader a de facto reference implementation.

What do you think?

Tuesday, November 25, 2008

Green Information Management

By Mark Brousseau

Reduce, reuse, recycle is the green mantra that gained a lot of currency in the 1980s. Today, a large number of enterprises are aggressively pursuing 3R policies, covering everything from paper usage and disposal practices to energy usage and water consumption.

As Green IT moves from the notion of a paperless office into a mainstream corporate social responsibility, CIOs are now also identifying ways in which to minimize the corporate carbon footprint and at the same time achieve their strategic business objectives.

Stuart Butts, a founding member and director of Xenos Group, Inc., says there is, however, one other area that demands equal consideration by organizations: managing structured and unstructured data and documents. Very often enterprises will hold the same information in a variety of different electronic formats and in different physical locations to meet different requirements, Butts notes. Multiple silos of information in technologically incompatible systems mean that information cannot be shared in real time. In addition, this approach consumes inordinate amounts of storage space and incurs the costs that go with it.

With the explosive growth in data and documents, Butts says the time has come to apply reduce, reuse and recycle thinking to electronic business information. Embracing a more strategic, ‘green’ approach to information management will deliver a number of benefits, not the least of which is a dramatic reduction in the cost and complexity of power-consuming storage requirements, he believes.

Butts says there are a number of specific offerings that can help to reduce storage demands by eliminating redundancy and simplifying access to business critical information in real time. These include archiving, content migration and consolidation tools that enable real-time, on-demand transformation of customer statements and other key documents contained in electronic print files to PDFs for ePresentment.

By eliminating the constraints imposed by incompatible hardware or software platforms and disparate data and document archives, Xenos has helped organizations to reduce, reuse and recycle their data and documents to lower costs and improve information flow, Butts notes.

On-demand transformation to PDF format allows organizations to eliminate the unnecessary storage of large, graphically rich files, while streamlining version control with a technique known as “document resource optimization”. For some, this approach has effectively reduced storage requirements by as much as 90 percent, Butts says. Given that many large enterprises are spending as much as 70 percent of their IT budgets on their storage infrastructures, it’s time to apply reduce, reuse, and recycle thinking to data and storage needs, he adds.