Goodreads Librarians Group discussion
note: This topic has been closed to new comments.
Large Book Data Import

My final question: Is it worthwhile to start repairing imports like
https://www.goodreads.com/book/edits/...
manually (author, etc.), or will that just trigger another import, in which case we should wait?

That being said, we have a list of cleanup scripts almost ready to run (removing quotation marks from the publisher names, reattributing ASINs to their original GR records, removing duplicate physical editions imported by amazon_sable).
We'll also be trying to merge some of the authors created by amazon_kcw and amazon_sable (with Dr and missing periods after initials) into pre-existing authors. Don't worry, we'll be especially careful to merge the NEW record into the OLD one and not vice versa.
And I'm glad my explanations have been helpful! I know it's a complex tree of if's and then's - we're trying to pare it down and simplify it. And all of your input has been incredibly helpful in revising our strategies :-)

Are we still reporting such errors here?
https://www.goodreads.com/book/edits/...
Imported by kcw (along with one similar edition) without ISBN/ASIN, but does not appear to be a duplicate unless I missed something, which is entirely possible as my eyes are blurring from sitting too long in front of this screen. ;-)

They are coming through as published by "Lonely Planet", so I'm assuming it's part of the data import.

Just wanted to give an update on the cleanup process. We've removed the extra quotes from publisher names for all books imported by amazon_sable. Please let me know if you see any we've missed!
Sam
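
For anyone curious what a quote cleanup like this might look like, here is a minimal sketch. It is not the script Goodreads ran; the function name `strip_wrapping_quotes` is made up for illustration, and it assumes the stray quotes wrap the whole publisher name.

```python
def strip_wrapping_quotes(publisher: str) -> str:
    """Remove quotation marks that wrap an entire publisher name.

    Hypothetical helper: only matching quote pairs at both ends are
    removed, so apostrophes inside a name are left alone.
    """
    quote_pairs = {'"': '"', "'": "'", "\u201c": "\u201d"}
    cleaned = publisher.strip()
    # Peel off one wrapping pair at a time, in case of doubled quotes.
    while len(cleaned) >= 2 and quote_pairs.get(cleaned[0]) == cleaned[-1]:
        cleaned = cleaned[1:-1].strip()
    return cleaned


print(strip_wrapping_quotes('"Lonely Planet"'))
```

A name that legitimately starts and ends with a quote character would also be stripped, so a real cleanup would probably want a manual review step for those.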

https://www.goodreads.com/book/show/2...
I didn't fix it so as to leave it visible. A quick search for 'inactive' in the title field shows many of these. Should be an easy script to remove these, I should think...

https://www.goodreads.com/book/show/2...
Not deleting because I thought it should be seen. Imported 1/17/14.


(I mean, eventually I (or some librarian) will have to go in and fix all of these records anyway to list Hideyuki Kikuchi as the primary author and 菊地 秀行 as secondary since we still have no aka feature, but it's a lot easier to cut'n'paste author names if I don't have to also fix title field and language.)

This book had Sourcebooks Casablanca, the publisher, listed as the author for the kindle edition:
https://www.goodreads.com/book/show/2...
I changed it but you can still see it in the change logs.

a) turned around and separated by comma
b) and still has the "M.A." addition.
https://www.goodreads.com/book/show/1...

edited to add: I think it's similar to the situation described here: https://www.goodreads.com/topic/show/...
Also, I'm still running into many Kindle editions with the ISBN included as well as the ASIN. Will those be fixed on their own, or should we try to correct them?
thanks!

I'm hoping to run the (Language Edition) cleanup script this week as well as to continue the removal of duplicate physical editions created during the import - that'll help remove a bunch of the books imported with neither asin nor isbn/isbn13.

Since on Goodreads that's the way to disambiguate authors with the same name, this is potentially dangerous because, while in other cases you can immediately see there's something wrong (last name first name, for example), this looks identical to the correct spelling.

Harper/Collins 2007 First Edition, first printing, measures 6 1/4" by 9 1/4" by 1", with 376 deckle edged pages and larger than average print. "Action, adventure...passion, a real page-turner in historical fiction"( Booklist).
-----------------------------------------------
Or this one that again overwrote a description:
The Rosetta Key
The Rosetta Key: A Novel, by William Dietrich (Author of Napoleon's Pyramids)
Hardcover book published by HarperCollins, First Edition, 1st printing, 2008

https://www.goodreads.com/book/show/1..."
You don't refer to all of your books by ASIN? ;-)
Yikes. Hopefully that isn't a common pattern... I can't imagine it is. Though my imagination has broadened in regards to styles of formatting titles and authors...

(Someone has already created a new edition with the valid ISBN13, so those two need to be merged, but I didn't want to do that before reporting this here.)

Jr.^^First^LastName
.Leading Jr. is wrong - should be after the last name
.Two spaces after Jr. ???
I already fixed the Sr. names, but I bet there are II, III, IV names with the same problem. And probably others not found yet.
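
A rough sketch of how the leading-suffix pattern could be caught in bulk, assuming the names look like "Jr.", then whitespace, then the rest of the name. The suffix list and the function name `move_leading_suffix` are my own assumptions, not anything Goodreads has confirmed using.

```python
import re

# Assumed list of suffixes; extend it as new patterns turn up.
SUFFIXES = ("Jr.", "Sr.", "II", "III", "IV")

_LEADING_SUFFIX = re.compile(
    r"^(?P<suffix>" + "|".join(re.escape(s) for s in SUFFIXES) + r")"
    r"\s+(?P<rest>\S.*)$"
)


def move_leading_suffix(name: str) -> str:
    """Move a leading generational suffix to the end of the name.

    Names without a leading suffix are returned unchanged, and spacing
    inside the rest of the name is not touched.
    """
    match = _LEADING_SUFFIX.match(name)
    if match is None:
        return name
    return f"{match.group('rest')} {match.group('suffix')}"


print(move_leading_suffix("Jr.  First LastName"))
```

Only the gap right after the suffix is dropped. Whitespace elsewhere in the name is left as it is, so the result should still be checked by hand before any author merge.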

https://www.goodreads.com/book/show/1...
One thing I've noticed about foreign-language editions of Kindle books is that they sometimes have (to use this one as an example) "Italian Edition" in the title with "English" listed in the language field (I have checked Amazon.com and Italian is listed in the language field there). This was another one where the ISBN was shown on the front, but auto-updated as soon as I went to edit.
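
To make the mismatch concrete, here is a small sketch of how a "(Language Edition)" tag in a title could be detected and moved into the language field. This is only a guess at the approach, not the actual cleanup script; `KNOWN_LANGUAGES`, `split_language_tag` and the example title are all made up.

```python
import re

# Assumed whitelist, so "(Kindle Edition)" or "(Second Edition)" is ignored.
KNOWN_LANGUAGES = {"Italian", "German", "French", "Spanish", "Japanese"}

_EDITION_TAG = re.compile(r"\s*\((?P<language>[A-Za-z]+) Edition\)\s*$")


def split_language_tag(title, language):
    """Return (title, language) with a trailing language tag moved out.

    If the title ends in "(<Language> Edition)" for a known language, the
    tag is removed from the title and replaces the language value.
    Otherwise both values are returned unchanged.
    """
    match = _EDITION_TAG.search(title)
    if match is None or match.group("language") not in KNOWN_LANGUAGES:
        return title, language
    return title[: match.start()], match.group("language")


print(split_language_tag("Example Title (Italian Edition)", "English"))
```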

https://www.goodreads.com/book/show/4...
https://www.goodreads.com/book/show/4...
https://www.goodreads.com/book/show/3...

Yep, Amazon has been changing images left and right in the past month. I keep going through my books and finding new ones. I reverted those for you.

Between this Amazon data feed and librarians "innocently" helping authors overwrite book covers even though it vandalizes reader bookshelves -- it's getting to be a mess. I know the new librarians are responsible for reading the manual, but perhaps the acceptance email should include some FAQs and policy reminders about common issues they'll face, reinforcing that Goodreads keeps all editions, merge versus delete, etc. Not a long list of items because that's what the manual is for, but a few brief reminders.

Signed with a drawing by Peter Sis on the half-title page. Free tracking.
The Book of Imaginary Beings

This way a record could be imported with author (example) "Benedetto XVI" but a librarian wouldn't need to go and manually merge the profile with "Pope Benedict XVI", because the two would already be grouped together.
Since Sarah has been very responsive in this thread, I take the opportunity to ask her to take this new feature into consideration! :-)

Already on our to-do list :-) Not sure when it will happen, but we're totally with you on this. I'll let you know!
Also, just a quick update - I'm running one of the scripts to remove duplicate books and books with no isbns or asins.

Will this affect ACEs in any way?

https://www.goodreads.com/book/show/1...

Example: https://www.goodreads.com/book/edits/...
Probably a pattern in the image name can be used to filter it out.
no-img-sm._V192198896_BO1,204,203,200_.gif
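
Something along these lines might work as a filter, assuming every placeholder image has a file name starting with "no-img" as in the example above. The function name and the example.com URLs in the usage are invented for illustration.

```python
import re

# Assumption: placeholder covers have a file name that starts with "no-img".
_PLACEHOLDER_IMAGE = re.compile(
    r"(^|/)no-img[^/]*\.(gif|jpe?g|png)$", re.IGNORECASE
)


def is_placeholder_cover(image_url: str) -> bool:
    """Return True if the file name looks like a 'no image' placeholder."""
    return _PLACEHOLDER_IMAGE.search(image_url) is not None


print(is_placeholder_cover("no-img-sm._V192198896_BO1,204,203,200_.gif"))
```

An import could then skip the cover update whenever this returns True, instead of overwriting an existing cover with the placeholder.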

https://www.goodreads.com/book/show/2...
https://www.goodreads.com/book/show/2...

Just FYI this is not a request topic for librarian edits. But thank you.

https://www.goodreads.com/book/show/2...
https://www.goodreads.com/book/show/2...

There was a contract for it a couple of years ago, but it's not active anymore.

https://www.goodreads.com/book/show/2...
https://www.goodreads.com/book/show/2......"
There are some other non-book items being imported periodically as well (calendars, cards, etc) - we're working on filtering those out and will do a cleanup once we have a better filtering system in place.

Good--I've had to NAB a number of church service supplies while editing books with the word "bread" in their title. Now what to do about the box of communion wafers someone has added to his/her "Read" list?

Someone had even added the "book". Curiously enough this was not imported by Amazon but by ingram. 0.0

It would be funny if they get a genre box soon. :D
That would be fine if we could differentiate between changes that correct what had previously been an erroneous mapping and changes that should maintain the established mapping, but simply clean up the GR book record. For instance, suppose the hardcover of a book on GR had mistakenly been assigned the isbn13 for the paperback edition. If someone corrects the isbn13 to match the hardcover edition we would want to unmap the book from its previous Amazon book and remap it. On the other hand, if a Librarian updates a title to more closely match the best practices for GR, we may want to maintain its mapping to its Amazon book even if the Amazon book's title remains as it was.
Right now there's no way to tell the difference between these two types of changes when a Librarian updates a book record. That's something we're working on.
Also, keep in mind that title and author matching are done using relatively loose matching criteria - they just have to be close enough to count as a match. We're working to improve that logic on the GR side further.
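
To illustrate the distinction, here is a rough sketch of one possible heuristic: treat a change to an identifier (isbn, isbn13, asin) as a reason to remap, and treat any other edit as a cleanup that keeps the mapping. This is purely an assumption on my part and not how the GR import decides today. The record type, the 0.6 threshold and the ISBN values are placeholders, and `difflib` stands in for whatever loose matching is really used.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

IDENTIFIER_FIELDS = ("isbn", "isbn13", "asin")
TITLE_MATCH_THRESHOLD = 0.6  # assumed value, would need tuning


@dataclass
class BookRecord:
    """Hypothetical, simplified stand-in for a GR book record."""
    title: str
    isbn: str = ""
    isbn13: str = ""
    asin: str = ""


def loosely_matches(a: str, b: str, threshold: float = TITLE_MATCH_THRESHOLD) -> bool:
    """Case-insensitive 'close enough' comparison of two titles."""
    ratio = SequenceMatcher(None, a.casefold(), b.casefold()).ratio()
    return ratio >= threshold


def mapping_action(before: BookRecord, after: BookRecord) -> str:
    """Return 'remap' if any identifier changed, otherwise 'keep'."""
    for field_name in IDENTIFIER_FIELDS:
        if getattr(before, field_name) != getattr(after, field_name):
            return "remap"
    return "keep"


old = BookRecord(title="The Rosetta Key: A Novel", isbn13="9780000000001")
new = BookRecord(title="The Rosetta Key", isbn13="9780000000001")
print(mapping_action(old, new), loosely_matches(old.title, new.title))
```

With this rule, the hardcover/paperback isbn13 correction would come out as a remap, while a title tidy-up would keep the existing mapping as long as the titles still loosely match.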