Wednesday, October 29, 2008

...and this

...from Chris O'Brien at the Mercury News in California. If you're into Brewster Kahle and the Open Content Alliance, you'll want to read this.

there's more in the news today

more about Google's settlement and future digitization from the New York Times

Tuesday, October 28, 2008

now what are they up to?

Just announced was Google's new deal with publishers and authors for in-copyright and/or out-of-print books. This from the Association of American Publishers and this from Google Central. I like Google as much as the next person but I'm starting to feel like they're Microsoft. 

Is there a competitor? 

At this year's annual NDNP gathering, we talked a lot about the shrinking number of vendors who can meet (or are willing to meet) the NDNP spec. That narrowing margin is making some of us uncomfortable, and understandably so - whatever vendor is left standing can basically call the shots (not that they would but history suggests this could happen). Drawing on my limited knowledge of market lingo, competition is the market's brand of a check and balance system. Without that, who's looking after the consumer?

The first thing that popped into my mind about this new deal was: because everyone wants digital access, libraries will end up paying for some goods twice. That could solve a lot of problems, certainly, but it also means someone besides the library is storing what we've paid for and... oh wait, this is starting to sound like journals! There are libraries with obsolete formats that nothing can read anymore because of this "nice for everyone" arrangement. Need I go on?

If Google has no competitor for book digitization, then who is going to compete with them on the more tenacious formats like newspapers or A/V? There's NDNP, sure, but at only 100,000 pages in two years from each of us, we can't compete with the volume Google can afford to produce. Forget quality and preservation for now; users want access, and they're not going to consider preservation and quality. And, of course, they think Google is all that and a bag of Doritos! They couldn't possibly botch this.

Plus, with this new agreement is born the book equivalent of ASCAP and BMI, the music royalty overseers. The Book Rights Registry, they're calling it. So, now I'm thinking: Where does this stop?

Maybe it's not as much a monopoly as it seems. If you can make me feel better about Gargantuan Google, I'm listening!

....Oh, I'll have more to say about Google and newspapers later- thanks to all your comments, keep 'em coming (I promise they'll post much faster from now on).

Monday, October 20, 2008

google and newspapers

Recently, when Google announced it was getting into the newspaper digitization business, many of us already digitizing newspapers took note. And who wouldn't? It's Google: they do a lot of great things and they've got a lot of money to do more. They've tried their hand at books so it's only natural that newspapers should follow. Their announcement wasn't unexpected.
(Google sample newspaper page)

Nevertheless, it gave us pause to consider the impact(s) this might have on our own digitization efforts in several key areas:

  • the long-term preservation of the digital data....quality imaging notwithstanding (see below)
  • for those of us funded by grants - our livelihoods
  • and, most importantly, title selections

I can't imagine Google would have financial worries about maintaining the enormous amount of data these newspapers generate. Even if they save their master files in a compressed format like JPG, JP2 or, God forbid, some lesser format, they're still faced with loads of material to save in perpetuity. Choosing the right format and thinking in forever terms are but two issues involved with digital preservation, all of which are beyond the scope of this posting.

As to our livelihoods - between Google and the current economic collapse/crisis, it feels kind of silly to even talk about. Let's just be thankful to have jobs and leave it at that for now.

But title selection is a different animal altogether. If you're an NDNP awardee, as we are here at the University of Kentucky, then you're bound by the NEH rules. Of particular importance here is the fact that we cannot digitize titles that have been digitized by another entity, whether it's a commercial entity or someone like Google who may make them freely available. 

Some argue that there's plenty to go around, and that's a reasonable enough argument. There are millions, if not billions, of historic newspaper pages waiting to be digitized. So, yes, there's plenty to do in that respect. But what happens to "collections"? What happens to their preservation? And who is responsible for those two things?

Picture this: what would you think if you, as a researcher - professional or layperson - landed upon a website that had tons of newspaper pages only to find that just a few newspaper titles are available? Would you feel cheated? Would you feel like you've wasted your time because, now, you have to keep looking for what you need? Or would you feel satisfied?

Take Chronicling America or our own Kentuckiana Digital Library... How strange would it be to look at Kentucky's newspapers at the end of NDNP's 20-year cycle to find we have every historic Kentucky newspaper except Louisa's Big Sandy News or the Kentucky Reporter, for instance? Wouldn't it seem odd for the University of Kentucky - the state's flagship university and Kentucky's sole NDNP content provider - to have everything except those two titles? Would you feel cheated? Would you feel like you've wasted your time because, now, you have to keep looking for what you need? Or would you feel satisfied?

And what would we say, as an arbiter for the state's historic collections and digital preservation, to those newspapers who may have opted to have their titles digitized by Google or some other outfit instead of UK when their stuff comes up missing, corrupt, distorted, or otherwise unusable? "Since you didn't let us preserve the material, it's just lost. Sorry about your luck, Mr. Publisher."

In fact, it's not the publisher who stands to lose, but all of us - Kentuckian, American, Global citizens alike. Newspapers are a shared history and should be free to everyone. Further, it seems childish to want anything but the best preservation standards applied to every single page, no matter what your role may be. After all, who are we making this stuff for if not our children, or our children's children? Is it simply to glorify ourselves or is it really because this stuff matters?

I'd like to think it's the latter.