Goodreads Librarians Group discussion

Book Issues > Importing book data from publishers


message 1: by Otis, Chief Architect (new)

Otis Chandler | 315 comments Mod
Hi GR Librarians,

Just a heads up that we are starting to import book meta-data directly from publishers. We are starting with the "big six", so you will start seeing books sourced from Random House, Penguin, etc now.

It's important to understand why we are doing this. Originally we used all Amazon-sourced books, but then Amazon changed their rules to things we didn't like, so we switched to B&N. The worst of Amazon's rules were that Amazon had to be the only bookseller link on the book page, and that we couldn't use any of their data on our iPhone or Android apps. Now B&N's rules are getting worse, so we are going straight to the publishers, which have little to no rules - they just want us to use the data to promote their books!

Given this, please report any issues you see. The librarian logs should show that "Goodreads" is updating the files when you see changes. We will not override any images or descriptions that librarians have specifically put in, but Amazon or B&N sourced data will be overridden. Hopefully it will all be the same data, but if the covers are drastically different for some reason, we apologize - but trust me when I say it's for the better. Our data will now be our own, and we won't be subject to the whims of anyone else for what we can do with book meta-data.

thanks!

Otis
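
The override rule described above (librarian-entered images and descriptions are kept, Amazon or B&N sourced data is replaced) could be sketched roughly as follows. This is only an illustration: the source labels and the function name `should_override` are assumptions, not anything taken from the actual Goodreads importer.

```python
# Hypothetical source labels; the real importer's values are not given in this thread.
OVERRIDABLE_SOURCES = {"amazon", "bn"}


def should_override(existing_source: str) -> bool:
    """Return True if a publisher import may replace data from this source.

    Assumption: anything not listed as overridable (for example data entered
    by a librarian) is left alone.
    """
    return existing_source.strip().lower() in OVERRIDABLE_SOURCES


print(should_override("Amazon"))     # retailer data may be replaced
print(should_override("librarian"))  # librarian edits are preserved
```

Keeping the overridable sources in an explicit set means any unknown source is preserved by default, which matches the promise not to touch librarian edits.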


message 2: by Jessica (new)

Jessica | 963 comments Cool! Will we get some ebook info as well, since a bunch of that was coming from B&N?


message 3: by Moloch (new)

Moloch | 3185 comments So, no Amazon.it anymore?


message 4: by rivka, Librarian Moderator (new)

rivka | 44666 comments Mod
It looks like Amazon (all of them) are still sources. They just won't be the preferred source anymore, for those books where we can get the data from the publishers (without Amazon's restrictions).


message 5: by Moloch (new)

Moloch | 3185 comments Ok, thank you


message 6: by Random (new)

Random (rand0m1s) | 56 comments Does this mean we'll be able to go to one Buy Now/Online Stores type button or will we still be limited on books which have still been sourced from Amazon/BN?


message 7: by Grada (new)

Grada (BoekenTrol) (boekentrol) | 17 comments rivka wrote: "It looks like Amazon (all of them) are still sources. They just won't be the preferred source anymore, for those books where we can get the data from the publishers (without Amazon's restrictions)."

Do you have any idea how this will work when registering non-English books? "Manually add a book" will probably work like it does now, but what if I want to see if an ISBN or title from a newly published non-English book is already registered here on GR?


message 8: by rivka, Librarian Moderator (new)

rivka | 44666 comments Mod
Searching on GR will work the same as it does now. Info from publishers is imported in batches, not one book at a time.

Random, books imported from Amazon (or B&N) will still fall under the restrictions of such data. Those from publishers will not.


message 9: by Random (new)

Random (rand0m1s) | 56 comments rivka wrote: "Random, books imported from Amazon (or B&N) will still fall under the restrictions of such data. Those from publishers will not. "

That's what I thought, but figured I'd ask. I look forward to the day that BN doesn't have to have its own button and I can remove Amazon from my list. :D


message 10: by Peter (new)

Peter (pete_c) | 388 comments I guess that this means that we won't be forever correcting errors from Amazon listings, either! Hoo-ray! No more prenatal books!


message 11: by Sandra (new)

Sandra | 26087 comments LOL, I've found Publishers make mistakes too, especially page numbers, so we won't be out of a job just yet :)


message 12: by Lobstergirl (new)

Lobstergirl As I've said in other threads, I've spotted multiple mistakes so far. Random House overriding correct cover images with a blank image, Penguin importing author names with two spaces instead of one, etc. This will be a headache for librarians going forward.


message 13: by Lobstergirl (new)

Lobstergirl Otis - does getting data from publishers only apply to currently in-print books? What about older, or out of print books? Does moving away from Amazon as a data source mean that typing in an ISBN of an older book in the Goodreads searchbox will make that book harder to locate, and thus perhaps impossible to import?


message 14: by Vicky (last edited Nov 07, 2011 08:34PM) (new)

Vicky (librovert) | 2459 comments Otis,

I think we've found a case where a cover was added by a user and RandomHouse data overwrote it. See this post regarding this edition of Mostly Harmless.

The edition's cover has been contested before (and alternate cover edition made), there was an image uploaded by a user on June 27th, but Goodreads/RandomHouse overwrote the cover on November 5th.


message 15: by Otis, Chief Architect (new)

Otis Chandler | 315 comments Mod
Vicky and Lobstergirl: I think this post was prompted because we noticed a bug where some books were overridden by blank Random House data - it has been fixed, and the damage was relatively minimal. Sorry about that!


message 16: by Otis, Chief Architect (new)

Otis Chandler | 315 comments Mod
Penguin importing author names with two spaces

I haven't heard that one - do you have examples? I will report this to our engineer. This is the kind of stuff we need to know - thanks!


message 17: by Lobstergirl (new)

Lobstergirl Here's an example of a Penguin error. This book was imported from Penguin with the author as Stanley Elkin rather than Stanley Ellin. I've since made the correction.

http://www.goodreads.com/book/show/11...


message 18: by Lobstergirl (new)

Lobstergirl Otis wrote: "Penguin importing author names with two spaces

I haven't heard that one - do you have examples? I will report this to our engineer. This is the kind of stuff we need to know - thanks!"


http://www.goodreads.com/book/show/11...

Technically that one needs to be NABbed, of course.


message 19: by Virgilio (new)

Virgilio Machado (vapmachado) | 8 comments Otis, I think it is great that GR is starting to import book meta-data directly from publishers. I just opened Random House site and it shows that The Litigators by John Grisham ( http://www.randomhouse.com/book/21306... ) has an ISBN of 978-0-385-53513-7 Will that "meta-data" include ISBN with four hyphens or not? Will it also include the outdated 10 digit ISBN, or only the correct 13 digit ISBN? What about the title? Will it be "The Litigators" or "THE LITIGATORS"? Just curious. Thank you.


message 20: by rivka, Librarian Moderator (new)

rivka | 44666 comments Mod
Virgilio wrote: "Will that "meta-data" include ISBN with four hyphens or not?"

Definitely not. GR is not built to handle that. The book in question is already sourced from RH, and you can see it here: http://www.goodreads.com/book/show/11...


message 21: by Virgilio (new)

Virgilio Machado (vapmachado) | 8 comments Rivka, Thanks for attempting to reply to one of my three questions. You are answering on behalf of GR in what capacity? I'm sorry but I could not find any reference to you on this page ( http://www.goodreads.com/about/us ) and didn't get much from your profile ( http://www.goodreads.com/user/show/17... ), except that you claim to be a GR employee, which links to the first page above. Very confusing, Very secretive. Whatever suits you.

Please note that my question, poorly phrased, I must admit, was about the data that GR was getting from RH, not the data it is displaying on the page you were so kind to give the link to ( http://www.goodreads.com/book/show/11... ). To see the data displayed all you have to do is look at that page. Let me try again, using the same example, since it shows that the book "description" is from the "Publisher" ( http://www.goodreads.com/book/edit/11... ):

ISBN13: 9780385535137
Is this the ISBN13 as supplied by RH, or is it being shown after GR took out the four hyphens?

ISBN 0385535139
Was this ten digit, no longer in use, ISBN ( you may check that yourself http://www.amazon.com/Litigators-John... ) supplied by RH, or is it being shown after GR converted the ISBN13 to ISBN10?

What I have really struggled with is the way to write the title, when I own the book, which is not the case, I'm afraid. The question was: Should the title in GR be "The Litigators" or "THE LITIGATORS"?
I can see that GR is currently using "The Litigators" right beside a cover showing "THE LITIGATORS". That's all I could get from RH. Amazon was not much help either. The book shown ( http://www.amazon.com/Litigators-John... ) is a UK edition with a different ISBN, but titled (cover and inside cover) "The Litigators". So the question remains: for this edition ( http://www.goodreads.com/book/show/11... ), should the title in GR be "The Litigators" or "THE LITIGATORS"?

Thank you all so much for your time and patience.


message 22: by Jessica (new)

Jessica | 963 comments Virgilio wrote: "Rivka, Thanks for attempting to reply to one of my three questions. You are answering on behalf of GR in what capacity? I'm sorry but I could not find any reference to you on this page ( http://www..."

Most book databases, including the ones GR uses, remove the dashes from the ISBN.

The ten digit ISBNs are still in use, and I'm assuming that since they are importing with the 13 digit ISBNs from publishers that the 10 digits are being provided by the publishers.

In the case of THE LITIGATORS vs The Litigators the title would be imported as The Litigators, in this case. If it came in as THE LITIGATORS it would probably get corrected eventually to The Litigators by a librarian. Many books have titles in all caps but when they are discussed they are written in lower case. Plus it looks a lot better.


message 23: by rivka, Librarian Moderator (new)

rivka | 44666 comments Mod
I work part-time for GR. There are a few such people who are not on the "about us" page, although I think I'm the only one active in the two official groups.

The title on the book follows GR guidelines, regardless of its source. That means it should not be in all caps.

I'm not sure what you mean about the ISBN-13. GR automatically strips out all hyphens from ISBNs when they are imported. The fields in the database cannot handle them. Also, the hyphens are unnecessary and inconsistently used, as was pointed out in this previous discussion.

GR does not convert ISBN-13s to ISBN-10s, so the ISBN-10 must have come from the publisher. (And I agree with you that ideally ISBN-13s should be listed primarily, but making that change in the existing GR system is not so easy.)
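
To make the hyphen stripping and the ISBN-13/ISBN-10 relationship discussed here concrete, here is a minimal Python sketch. It is not Goodreads' import code (per the post above, GR does not do this conversion); the function names `strip_isbn` and `isbn13_to_isbn10` are made up for illustration. It only shows that a 978-prefixed ISBN-13 determines its ISBN-10 through the standard check-digit rule.

```python
def strip_isbn(raw: str) -> str:
    """Remove hyphens and spaces from an ISBN string (hypothetical helper)."""
    return raw.replace("-", "").replace(" ", "").upper()


def isbn13_to_isbn10(isbn13: str) -> str:
    """Derive the ISBN-10 for a 978-prefixed ISBN-13 (hypothetical helper)."""
    digits = strip_isbn(isbn13)
    if len(digits) != 13 or not digits.isdigit():
        raise ValueError("expected 13 digits")
    if not digits.startswith("978"):
        raise ValueError("only 978-prefixed ISBN-13s have an ISBN-10 form")
    body = digits[3:12]  # drop the 978 prefix and the ISBN-13 check digit
    # ISBN-10 check digit: weights 10 down to 2 over the nine body digits, modulo 11
    total = sum(int(d) * w for d, w in zip(body, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return body + ("X" if check == 10 else str(check))


print(strip_isbn("978-0-385-53513-7"))
print(isbn13_to_isbn10("978-0-385-53513-7"))
```

With the ISBN quoted earlier in the thread, this yields 9780385535137 and 0385535139, the same pair shown on the GR edit page. That the two numbers are consistent says nothing about which one the publisher actually supplied.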


message 24: by Peter (new)

Peter (pete_c) | 388 comments Otis wrote: "I haven't heard that one - do you have examples? I will report this to our engineer. This is the kind of stuff we need to know - thanks!"

Sons, which I just updated with data from my copy and multi-site-verified data, is another example. (The new image is a copy from another edition on GR.) The original data was really messed up.


message 25: by Lobstergirl (new)

Lobstergirl Otis wrote: "Penguin importing author names with two spaces

I haven't heard that one - do you have examples? I will report this to our engineer. This is the kind of stuff we need to know - thanks!"


Here's one that is not a NAB.

http://www.goodreads.com/book/show/11...


message 26: by Peter (new)

Peter (pete_c) | 388 comments Here's a bad import from Amazon: Last Summer.
On http://www.goodreads.com/book/edit/2454609.Last_Summer we see
isbn 9997517210 isbn 13 9789997517210
source Amazon.com
On the Amazon page we see
ASIN: 9997517210


Hope this helps.
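
One quick sanity check for a record like this is whether the numbers pass the standard ISBN check-digit tests. The sketch below assumes nothing about how GR validates imports; the function names are hypothetical. Passing the check only means a number is well-formed, not that it belongs to the right book.

```python
def is_valid_isbn10(isbn: str) -> bool:
    """Check the ISBN-10 checksum: weights 10 down to 1, sum divisible by 11."""
    s = isbn.replace("-", "").upper()
    if len(s) != 10 or not s[:9].isdigit() or not (s[9].isdigit() or s[9] == "X"):
        return False
    values = [int(c) for c in s[:9]] + [10 if s[9] == "X" else int(s[9])]
    return sum(v * w for v, w in zip(values, range(10, 0, -1))) % 11 == 0


def is_valid_isbn13(isbn: str) -> bool:
    """Check the ISBN-13 checksum: alternating weights 1 and 3, sum divisible by 10."""
    s = isbn.replace("-", "")
    if len(s) != 13 or not s.isdigit():
        return False
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(s))
    return total % 10 == 0


print(is_valid_isbn10("9997517210"))
print(is_valid_isbn13("9789997517210"))
```

Both numbers quoted in this post pass their checksums.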


message 27: by rivka, Librarian Moderator (last edited Nov 09, 2011 08:23PM) (new)

rivka | 44666 comments Mod
Let's try to keep this thread for issues with books imported from publishers, not Amazon? It's going to get very confusing otherwise. Feel free to start another thread.

(I also think that record is fine, but that's another story.)


message 28: by Virgilio (new)

Virgilio Machado (vapmachado) | 8 comments Jessica, Thank you so much for your comment. You do not present any evidence or quote any source for your statement that "Most book databases remove the dashes from the ISBN." I'll take your word for it for what it is worth, which is not much, given my biased skepticism.

The ten digit ISBNs are still in use for the same reason that I own a book that was purchased so long ago, that you don't even have a date for it (before 1911, it is an 1871 edition), but that's what I have and that is the book that I read some years ago (after 1960 :-) Other examples: people are still driving Type 49 Bugattis, but they don't build them anymore; you may talk about German Marks, but they're not in circulation anymore, they exist only as collector's items. Since 1 January 2007 ISBNs have consisted of 13 digits. That's almost five years ago. I could swear you would know this.

Your assumptions don't do me much good. You may assume as much and whatever you want. My question was about factual information. Thanks for giving your opinion, anyway. Given my biased skepticism, I doubt that any publisher is providing 10 digit ISBNs for new books.

Jessica and Rivka, The question of the title is the only one with any practical significance to the little that I do here at GR. I'm still stuck and confused. I read that librarian manual ( http://www.goodreads.com/help/librari... ) over and over and all I can see there is "Enter in the official title of the book as it is shown on the cover or binding. Use proper capitalization and punctuation (i.e. do not use all-caps or no-caps unless the author specifically formatted the title that way)."

What I see in the cover of that edition is THE LITIGATORS. Why would it "probably get corrected eventually to The Litigators by a librarian"? Jessica says that "many books have titles in all caps." Why is that not the case for THE LITIGATORS?

Correct me please if I'm wrong: book discussions are in "free format." I could not find any rules concerning author's names, titles, publishers, or anything else.

Jessica states that The Litigators looks a lot better than THE LITIGATORS. I couldn't agree more. What if three people disagree with both of us? Does the majority win?

Rivka mentioned some "GR guidelines." What are those? Where can I find them? Are they different from the librarian manual? I seem to recall reading that the manual is quite outdated. Finding and learning those GR guidelines would be very helpful.

I am very sorry to learn that you think that the ISBN hyphens are "unnecessary and inconsistently used" using as your only reference a "previous discussion" that was so downright embarrassing, that I refused to participate. I could help you and some other members on that "previous discussion" to learn a thing or two about ISBNs, but that doesn't come cheap since it is above and beyond my role here at GR. Let's leave those ISBN questions to the professionals. My only wish is that you all will refrain from making peremptory statements, without having the proper information and knowledge.

Rivka, I'll take your word, for all it is worth, that "GR does not convert ISBN-13s to ISBN-10s, so the ISBN-10 must have come from the publisher."

I don't recall writing anywhere, ever in my entire life, that on GR the ISBN-13s should be listed first. "Frankly, my dear, I don't give a damn."

If you say that "making that change in the existing GR system is not so easy," it will be your word against my intuition that all it takes is a change in a line of code. I'm only talking about the way the ISBNs are shown in the book page. For example:

instead of displaying: ISBN 0452284244 (ISBN13: 9780452284241)

to display: ISBN 9780452284241 (ISBN10: 0452284244)
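
As an illustration of the display change proposed here, a formatting helper might look like the sketch below. Nothing in this thread shows how the GR book page is actually rendered, so `format_isbn_line` and its behaviour for missing values are assumptions.

```python
from typing import Optional


def format_isbn_line(isbn10: Optional[str], isbn13: Optional[str]) -> str:
    """Build the ISBN display line with the ISBN-13 listed first (hypothetical)."""
    if isbn13 and isbn10:
        return f"ISBN {isbn13} (ISBN10: {isbn10})"
    if isbn13:
        return f"ISBN {isbn13}"
    if isbn10:
        # Older records may only carry a ten digit ISBN.
        return f"ISBN {isbn10}"
    return ""


print(format_isbn_line("0452284244", "9780452284241"))
```

Whether the real change is this small depends on how the fields are stored and used elsewhere, which is the point of disagreement in the thread.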

Sorry to be a pain in the neck. I must refrain from asking questions. I enjoy a good discussion too much. I would appreciate your help with the titles, because that's affecting my small contributions to GR. Everything else is just a lot of mumbo-jumbo. Please don't get mad. If you have to, don't get mad, get even. :-)

Keep up the good work you're doing at GR for everybody's benefit.


message 29: by rivka, Librarian Moderator (new)

rivka | 44666 comments Mod
Virgilio wrote: "Jessica and Rivka, The question of the title is the only one with any practical significance to the little that I do here at GR. I'm still stuck and confused. I read that librarian manual ( http://www.goodreads.com/help/librarian/... ) over and over and all I can see there is "Enter in the official title of the book as it is shown on the cover or binding. Use proper capitalization and punctuation (i.e. do not use all-caps or no-caps unless the author specifically formatted the title that way).""

You're right. That's confusing. I have updated it by removing the exception (which I'm pretty sure is not what has ever been suggested in discussions in this group).


message 30: by Virgilio (new)

Virgilio Machado (vapmachado) | 8 comments Rivka, Good job. Thank you. It does look a lot better. If you have time and patience for one more small detail... (it's not urgent):

What would be "proper capitalization"?

Some reference styles use The Litigators others The litigators. That is, the capitalization rules vary. Could you dig out what is the GR guideline?

I can dig up some detailed rules and examples for either case, if necessary.

This time, I hope I'm being helpful. :-)


message 31: by rivka, Librarian Moderator (new)

rivka | 44666 comments Mod
Maybe we should move this to a different thread.


message 32: by Virgilio (new)

Virgilio Machado (vapmachado) | 8 comments Yes, indeed. I had the same thought after posting. Could you please get it started? I don't have your status around here, nor am I skilled in getting a discussion going in reasonable terms. I wish you the best, and truly believe this is an important matter for which GR should have a clear guideline, even if it says: use the capitalization of your preference, but be consistent throughout the title. The discussion will, hopefully, get into that. Thank you so much for your kind attention.


message 33: by Peter (new)

Peter (pete_c) | 388 comments Otis wrote: "Hi GR Librarians,

Just a heads up that we are starting to import book meta-data directly from publishers...Given this, please report any issues you see. "


It looks like books imported from Penguin are getting 2 of the dimensions fields swapped. I have just fixed Ward Six and Other Stories. It originally had a width of 1.0" and a thickness of 5.0". This is not the first book I've seen this on.
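
A swap like this could in principle be flagged automatically. The sketch below is a heuristic only, assuming that a book's thickness should not exceed its width; the function name and the inch-based parameters are hypothetical, and unusually thick, narrow books would be flagged wrongly.

```python
from typing import Tuple


def swap_if_implausible(width_in: float, thickness_in: float) -> Tuple[float, float]:
    """Return (width, thickness), swapping them when thickness exceeds width.

    Heuristic assumption: a book is normally wider than it is thick.
    """
    if thickness_in > width_in:
        return thickness_in, width_in
    return width_in, thickness_in


# The record described above: width 1.0", thickness 5.0"
print(swap_if_implausible(1.0, 5.0))
```

In practice it is probably safer to flag such records for a librarian to review than to correct them silently.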


message 34: by Lobstergirl (new)

Lobstergirl Penguin and Macmillan both tend to import a record without format (hardcover, paperback etc.).


message 35: by Andy (new)

Andy | 136 comments Otis wrote: "Vicky and Lobstergirl: I think this post was prompted because we noticed a bug where some books were overridden by blank Random House data - it has been fixed, and the damage was relatively minimal..."
I don't know if these are cases of the relatively minimal or not. In case they aren't...
Four Del-Rey/Ballantine/Random House
http://www.goodreads.com/book/show/93...
http://www.goodreads.com/book/show/16...
http://www.goodreads.com/book/show/16...
http://www.goodreads.com/book/show/21...
Two TSR/Wizards of the Coast
http://www.goodreads.com/book/show/66692
http://www.goodreads.com/book/show/29...
I was looking through all my books for ones with "bad" covers, and saw these. The first two Del-Rey books are now 0 pages long. I tried to make sense of the librarian logs, but that didn't happen. I hope this actually helps... at least a little.


message 36: by Peter (new)

Peter (pete_c) | 388 comments Otis wrote: "Penguin importing author names with two spaces

I haven't heard that one - do you have examples? I will report this to our engineer. This is the kind of stuff we need to know - thanks!"


I just fixed 8 titles imported (on 6/30/2011) from Penguin that were listed under author "Ellery Queen" (2 spaces). They imported without cover, format or page count, as well. One specific example is The Chinese Orange Mystery.
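
The two-space author names could be caught at import time by collapsing runs of whitespace before the name is matched against existing authors. A minimal sketch, assuming the name arrives as a plain string (the function name `normalize_author_name` is made up):

```python
def normalize_author_name(name: str) -> str:
    """Collapse any run of whitespace to a single space and trim the ends."""
    return " ".join(name.split())


print(repr(normalize_author_name("Ellery  Queen")))
```

`str.split()` with no arguments splits on any whitespace and discards empty pieces, so tabs and leading or trailing spaces are handled the same way.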

