Goodreads Librarians Group discussion

note: This topic has been closed to new comments.
[Closed] Added Books/Editions > Large Book Data Import


message 201: by Sarah (new)

Sarah M (sarahsomeone) | 85 comments Michael wrote: "The process has to keep up and cling tight to a once established ASIN-GR BookID relation, regardless whether the original matching criteria still hold or not. It has to "remember" the GR BookID... "

That would be fine if we could differentiate between changes that correct what had previously been an erroneous mapping and changes that should maintain the established mapping, but simply clean up the GR book record. For instance, suppose the hardcover of a book on GR had mistakenly been assigned the isbn13 for the paperback edition. If someone corrects the isbn13 to match the hardcover edition we would want to unmap the book from its previous Amazon book and remap it. On the other hand, if a Librarian updates a title to more closely match the best practices for GR, we may want to maintain its mapping to its Amazon book even if the Amazon book's title remains as it was.

Right now there's no way to tell the difference between these two types of changes when a Librarian updates a book record. That's something we're working on.

Also, keep in mind that title and author matching are done using relatively loose matching criteria - they just have to be close enough to count as a match. We're working to improve that logic on the GR side further.
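To make "close enough to count as a match" concrete, here is a minimal sketch of loose title/author matching. It is only an illustration: the `normalize` and `loosely_matches` helpers and the 0.85 threshold are assumptions made for this example, not the matching logic Goodreads actually uses.

```python
import difflib
import re


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def loosely_matches(a: str, b: str, threshold: float = 0.85) -> bool:
    """Return True when two strings are similar enough after normalization.

    The threshold is a hypothetical value chosen for illustration only.
    """
    ratio = difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold
```

With a check like this, small punctuation differences between the GR title and the Amazon title would still count as a match, while unrelated titles would not.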


message 202: by Michael (new)

Michael (mwelser) | 217 comments Thank you, Sarah, for elaborating on the issue (that is an all-time record for GR staff actually explaining what is going on).

My final question: Is it worthwhile to start repairing imports like

https://www.goodreads.com/book/edits/...

manually (author! etc.), or will this just trigger another import, and we should wait?


message 203: by Sarah (last edited Jan 16, 2014 02:38PM) (new)

Sarah M (sarahsomeone) | 85 comments If there's something pressing that you'd like to update, by all means feel free to update it. If the book you're editing is a Kindle Edition with an ASIN, it's incredibly unlikely that it will be imported by amazon_kcw again (there would have to be some sort of race condition for that to happen - i.e. changes happen on both GR and Amazon simultaneously and the Amazon feed beats the GR edit). If it's a physical edition, there's a slight chance that a duplicate edition might be imported again if you edit or delete a GR book.

That being said, we have a list of cleanup scripts almost ready to run (removing quotation marks from the publisher names, reattributing ASINs to their original GR records, removing duplicate physical editions imported by amazon_sable).

We'll also be trying to merge some of the authors created by amazon_kcw and amazon_sable (with Dr and missing periods after initials) into pre-existing authors. Don't worry, we'll be especially careful to merge the NEW record into the OLD one and not vice versa.

And I'm glad my explanations have been helpful! I know it's a complex tree of if's and then's - we're trying to pare it down and simplify it. And all of your input has been incredibly helpful in revising our strategies :-)


message 204: by Keith (new)

Keith (kgf0) | 377 comments Sarah wrote: "Although as a separate note - in some cases kcw will create a physical book sans isbn/isbn13 because those isbns already exist in our catalog but have conflicting author/title information."

Are we still reporting such errors here?

https://www.goodreads.com/book/edits/...

Imported by kcw (along with one similar edition) without ISBN/ASIN, but does not appear to be a duplicate unless I missed something, which is entirely possible as my eyes are blurring from sitting too long in front of this screen. ;-)


message 205: by Sandra (new)

Sandra | 31402 comments I'm now seeing quite a few kindle editions that no longer exist at Amazon, so I can't verify authorship to combine etc.

They are coming through as published by "Lonely Planet", so I'm assuming it's part of the data import.


message 206: by Sam (new)

Sam | 5 comments Hey all,
Just wanted to give an update on the cleanup process. We've removed the extra quotes from publisher names for all books imported by amazon_sable. Please let me know if you see any we've missed!
Sam


message 207: by Z-squared (new)

Z-squared | 8576 comments Forgive me if this one has been pointed out, but I ran across an ASIN edition with "[INACTVE]" in the title:

https://www.goodreads.com/book/show/2...

I didn't fix it so as to leave it visible. A quick search for 'inactive' in the title field shows many of these. Should be an easy script to remove these, I should think...
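A rough sketch of what such a script could look like, assuming the marker always appears in square brackets. The pattern also tolerates the "INACTVE" spelling quoted above. The `strip_inactive_tag` name is hypothetical and this is not the actual cleanup script.

```python
import re

# Matches "[INACTIVE]" and the misspelled "[INACTVE]", in any letter case.
INACTIVE_TAG = re.compile(r"\[\s*inacti?ve\s*\]", re.IGNORECASE)


def strip_inactive_tag(title: str) -> str:
    """Remove the bracketed marker and tidy up any leftover whitespace."""
    cleaned = INACTIVE_TAG.sub(" ", title)
    return " ".join(cleaned.split())
```

Because only the bracketed form is matched, a legitimate title that happens to contain the word "inactive" is left alone.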


message 208: by Lobstergirl (last edited Jan 21, 2014 06:15PM) (new)

Lobstergirl Not sure what's going on with this entry. No ISBN or ASIN, imported not from Amazon but from "Goodreads," Amazon fake cover, and it has no data other than publisher. There's no need for this record to exist as there are many other extant editions of this work that do have data and legit covers.

https://www.goodreads.com/book/show/2...

Not deleting because I thought it should be seen. Imported 1/17/14.


message 209: by Plethora (new)

Plethora (bookworm_r) | 359 comments The imports have created a new author for Voltaire. Now all these editions of books listed for François Voltaire are not combined with the proper editions under just Voltaire.


message 210: by Cait (last edited Jan 24, 2014 09:30AM) (new)

Cait (tigercait) | 4988 comments Is there any ETA on a cleanup on the "([language] Edition)" thing for the amazon_kcw imports? I was doing some cleanup on 菊地 秀行 and there are ~100 new Kindle editions from amazon.co.jp (yay!) which all have "(Japanese Edition)" in the title and English as the language (boo!).

(I mean, eventually I (or some librarian) will have to go in and fix all of these records anyway to list Hideyuki Kikuchi as the primary author and 菊地 秀行 as secondary since we still have no aka feature, but it's a lot easier to cut'n'paste author names if I don't have to also fix title field and language.)
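For illustration, the "([language] Edition)" cleanup might be sketched like this: strip the suffix from the title and return the language so the language field can be corrected at the same time. The `KNOWN_LANGUAGES` set and the function name are assumptions for this example; a real script would need a complete language list.

```python
import re
from typing import Optional, Tuple

# Hypothetical, incomplete list; only these are treated as language suffixes.
KNOWN_LANGUAGES = {"Japanese", "Italian", "German", "French", "Spanish"}

EDITION_SUFFIX = re.compile(r"\s*\((?P<language>[A-Za-z]+) Edition\)\s*$")


def split_language_edition(title: str) -> Tuple[str, Optional[str]]:
    """Return (clean_title, language), or (title, None) if nothing to strip."""
    match = EDITION_SUFFIX.search(title)
    if match is None or match.group("language") not in KNOWN_LANGUAGES:
        return title, None
    return title[: match.start()], match.group("language")
```

Checking against a language list keeps suffixes such as "(Second Edition)" or "(Revised Edition)" from being stripped by mistake.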


message 211: by Lorre (new)

Lorre | 108 comments I found another sort of error in the import of data from amazon_kcw.
This book had Sourcebooks Casablanca, the publisher, listed as the author for the kindle edition:
https://www.goodreads.com/book/show/2...

I changed it but you can still see it in the change logs.


message 212: by Helmut (new)


message 213: by Helmut (new)

Helmut (schlimmerdurst) | 43 comments And here's another one where one author's name was
a) turned around and separated by comma
b) and still has the "M.A." addition.

https://www.goodreads.com/book/show/1...


message 214: by Andrea (last edited Jan 27, 2014 04:10AM) (new)

Andrea (andrea_b) | 571 comments This book https://www.goodreads.com/book/show/1... apparently came from an Amazon import, but it seems to be a vendor listing of volumes 1 (https://www.goodreads.com/book/show/1...) & 2 (https://www.goodreads.com/book/show/1...) together (the ISBN is fake)... should it be merged into one of the volumes or should it remain as a separate work?

edited to add: I think it's similar to the situation described here: https://www.goodreads.com/topic/show/...

Also, I'm still running into many Kindle editions with the ISBN included as well as the ASIN, will those be fixed on their own or should we try to correct them?

thanks!


message 215: by Sarah (new)

Sarah M (sarahsomeone) | 85 comments Thanks everyone for your continued input! Just checking in to let you know that last week we had to delay some of our cleanup work, but many of the scripts are ready to go as soon as we get the go ahead to run them. We'll try our best to keep you up to date as we move forward.

I'm hoping to run the (Language Edition) cleanup script this week as well as to continue the removal of duplicate physical editions created during the import - that'll help remove a bunch of the books imported with neither asin nor isbn/isbn13.


message 216: by Cait (new)

Cait (tigercait) | 4988 comments Thanks for keeping us updated, Sarah! :)


message 217: by Moloch (new)

Moloch | 3975 comments Something strange that I'm finding is that amazon_kcw imports authors with 2 spaces between first and last name: I've just fixed this one https://www.goodreads.com/author/show... imported on this record https://www.goodreads.com/book/show/1... as Mineko^^Iwasaki for some reason.

Since on Goodreads that's the way to disambiguate authors with the same name, this is potentially dangerous because, while in other cases you can immediately see there's something wrong (last name first name, for example), this looks identical to the correct spelling.
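Since the extra space is invisible on the page, a check has to look at the raw string. A minimal sketch, with hypothetical helper names, of how imported author names could be flagged for review:

```python
import re


def collapse_spaces(name: str) -> str:
    """Trim the name and reduce any run of whitespace to a single space."""
    return re.sub(r"\s{2,}", " ", name.strip())


def has_hidden_spacing(name: str) -> bool:
    """True when the name contains leading, trailing, or doubled whitespace."""
    return name != collapse_spaces(name)
```

This is meant as a flag rather than an automatic fix: because extra spaces are also used deliberately to disambiguate authors, collapsing them blindly could merge records that should stay separate.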


message 218: by Plethora (last edited Jan 28, 2014 07:56AM) (new)

Plethora (bookworm_r) | 359 comments Another example of a description that the import needs to be somehow coded to skip: Napoleon's Pyramids.

Harper/Collins 2007 First Edition, first printing, measures 6 1/4" by 9 1/4" by 1", with 376 deckle edged pages and larger than average print. "Action, adventure...passion, a real page-turner in historical fiction"( Booklist).
-----------------------------------------------
Or this one that again overwrote a description:
The Rosetta Key

The Rosetta Key: A Novel, by William Dietrich (Author of Napoleon's Pyramids)

Hardcover book published by HarperCollins, First Edition, 1st printing, 2008


message 220: by rivka, Former Moderator (new)

rivka | 45177 comments Mod
Looks like that's how it was listed on Amazon: http://webcache.googleusercontent.com...


message 221: by Sarah (new)

Sarah M (sarahsomeone) | 85 comments Lobstergirl wrote: "Lame title...

https://www.goodreads.com/book/show/1..."


You don't refer to all of your books by ASIN? ;-)

Yikes. Hopefully that isn't a common pattern... I can't imagine it is. Though my imagination has broadened in regards to styles of formatting titles and authors...


message 222: by Renske (new)

Renske | 12220 comments This one is from last month, so maybe this is not important any more. But on this edition the import changed a valid ISBN13 https://www.goodreads.com/book/show/1...
(Someone has already created a new edition with the valid ISBN13, so those two need to be merged, but I didn't want to do that before reporting this here.)


message 223: by Paw3pals (last edited Jan 29, 2014 10:31PM) (new)

Paw3pals | 1939 comments FYI - there are many Amazon Sable imports, most with no IDs but some with ASINs, that have the following wrong author name format:
Jr.^^First^LastName

.Leading Jr. is wrong - should be after the last name
.Two spaces after Jr. ???

I already fixed the Sr. names, but I bet there are II, III, IV names with the same problem. And probably others not found yet.
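A sketch of how the leading-suffix pattern could be detected and repaired in bulk (the "^" above stands for a space). The suffix list and the `move_leading_suffix` name are assumptions for illustration, and whether a comma belongs before the suffix is left open here.

```python
import re

# Longer Roman numerals come first so "III" is not cut short as "II".
LEADING_SUFFIX = re.compile(r"^(?P<suffix>Jr\.|Sr\.|III|II|IV)\s+(?P<rest>\S.*)$")


def move_leading_suffix(name: str) -> str:
    """Move a leading Jr./Sr./II/III/IV to the end and collapse extra spaces."""
    match = LEADING_SUFFIX.match(name.strip())
    if match is None:
        return name
    rest = " ".join(match.group("rest").split())
    return f"{rest} {match.group('suffix')}"
```

Names without a leading suffix are returned unchanged, so running it over the whole import would only touch the affected records.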


message 224: by Susie (last edited Jan 30, 2014 07:47AM) (new)

Susie (dragonsusie) | 2469 comments Here's one, from 5 December:
https://www.goodreads.com/book/show/1...

One thing I've noticed about foreign-language editions of Kindle books is that it sometimes has (to use this one as an example) "Italian Edition" in the title with "English" listed in the language field (I have checked Amazon.com and Italian is listed in the language field there). This was another one where the ISBN was shown on the front, but auto-updated as soon as I went to edit.


message 225: by TW (new)

TW | 4 comments I'm not sure if this is a "bad import" issue with Amazon, but it seems too coincidental. Several of my book cover images have changed over the past month. Here are a few of many. If I need to address this in a new thread let me know. Thanks.

https://www.goodreads.com/book/show/4...
https://www.goodreads.com/book/show/4...
https://www.goodreads.com/book/show/3...


message 226: by Scott (new)

Scott | 8610 comments TW wrote: "I'm not sure if this is a "bad import" issue with Amazon, but it seems too coincidental. Several of my book cover images have changed over the past month. Here are a few of many. If I need to addre..."

Yep, Amazon has been changing images left and right in the past month. I keep going through my books and finding new ones. I reverted those for you.


Debbie's Spurts (D.A.) | 6325 comments I think goodreads should create a static list of all image reversions for the month for librarians to check thru.

Between this amazon data feed and librarians "innocently" helping authors overwrite bookcovers even though it vandalizes reader bookshelves -- it's getting to be a mess. I know the new librarians are responsible for reading the manual, but possibly when accepted that acceptance email should include some FAQs and policy reminders of common issues they'll face. Reinforcing that goodreads keeps all editions, merge versus delete, etc. Not a long list of items because that's what the manual is for, but a few brief reminders.


message 228: by Plethora (new)

Plethora (bookworm_r) | 359 comments Here is another description that is from third-party.

Signed with a drawing by Peter Sis on the half-title page. Free tracking.

The Book of Imaginary Beings


message 229: by Moloch (new)

Moloch | 3975 comments One thing that this import is clearly showing is how much the AKA feature for authors would be useful.
This way a record could be imported with author (example) "Benedetto XVI" but a librarian shouldn't go and manually merge the profile with "Pope Benedict XVI", because the 2 would already be grouped together.

Since Sarah has been very responsive in this thread, I take the opportunity to ask her to take this new feature into consideration! :-)


message 230: by Sarah (last edited Jan 31, 2014 11:19AM) (new)

Sarah M (sarahsomeone) | 85 comments Moloch wrote: "One thing that this import is clearly showing is how much the AKA feature for authors would be useful.
This way a record could be imported with author (example) "Benedetto XVI" but a librarian shou..."


Already on our to-do list :-) Not sure when it will happen, but we're totally with you on this. I'll let you know!

Also, just a quick update - I'm running one of the scripts to remove duplicate books and books with no isbns or asins.


message 231: by Susie (new)

Susie (dragonsusie) | 2469 comments Sarah wrote: "Also, just a quick update - I'm running one of the scripts to remove duplicate books and books with no isbns or asins."

Will this affect ACEs in any way?


message 232: by Sarah (new)

Sarah M (sarahsomeone) | 85 comments This will only affect books imported by amazon_sable - so ACEs won't be affected.


message 233: by Deon (last edited Jan 31, 2014 01:20PM) (new)

Deon (deonva) | 3718 comments "Librarian" deleted the book with a cover and left the one without a cover.

https://www.goodreads.com/book/show/1...


message 234: by Empress (last edited Feb 02, 2014 01:29PM) (new)

Empress (the_empress) Is it possible for the "image-not-available" image from amazon to be excluded when creating entries?

Example: https://www.goodreads.com/book/edits/...


Probably a pattern in the image name can be used to filter it out.
no-img-sm._V192198896_BO1,204,203,200_.gif
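A minimal sketch of that kind of filter, assuming the placeholder can be recognized by a filename that starts with "no-img" as in the example above. The function name and the example.com URLs in the usage note are made up for illustration.

```python
import posixpath
import re
from urllib.parse import urlparse

# Assumption: placeholder covers always have a filename starting with "no-img".
PLACEHOLDER_NAME = re.compile(r"^no-img", re.IGNORECASE)


def is_placeholder_cover(image_url: str) -> bool:
    """True when the image filename looks like the 'no image' placeholder."""
    filename = posixpath.basename(urlparse(image_url).path)
    return bool(PLACEHOLDER_NAME.match(filename))
```

An import could skip setting a cover whenever this returns True, for example for `https://images.example.com/covers/no-img-sm._V192198896_BO1,204,203,200_.gif`, while a URL ending in something like `12345.jpg` would pass through.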


message 235: by Lobstergirl (new)

Lobstergirl Your Amazon overlords need to be told that the description "Fictional Novel" is not okay.


message 238: by Lobstergirl (new)

Lobstergirl This description is painful to look at.

https://www.goodreads.com/book/show/2...


message 239: by Denim (new)

Denim Datta (denimdatta) #238 is fixed.


message 240: by Empress (new)

Empress (the_empress) Denim wrote: "#238 is fixed."

Just FYI this is not a request topic for librarian edits. But thank you.


message 242: by Empress (new)

Empress (the_empress) I didn't know barnes noble imports books as well: https://www.goodreads.com/book/edits/...


message 243: by Cait (new)

Cait (tigercait) | 4988 comments Ellie [The Empress] wrote: "I didn't know barnes noble imports books as well: https://www.goodreads.com/book/edits/..."

There was a contract for it a couple of years ago, but it's not active anymore.


message 244: by Sarah (new)

Sarah M (sarahsomeone) | 85 comments Lobstergirl wrote: "Amazon is importing individual Maxim issues.

https://www.goodreads.com/book/show/2...
https://www.goodreads.com/book/show/2......"


There are some other non-book items being imported periodically as well (calendars, cards, etc) - we're working on filtering those out and will do a cleanup once we have a better filtering system in place.


message 245: by Kathy (last edited Feb 11, 2014 04:53PM) (new)

Kathy | 233 comments Sarah wrote: "There are some other non-book items being imported periodically as well (calendars, cards, etc) - we're working on filtering those out and will do a cleanup once we have a better filtering system in place."

Good--I've had to NAB a number of church service supplies while editing books with the word "bread" in their title. Now what to do about the box of communion wafers someone has added to his/her "Read" list?


message 246: by Scott (new)

Scott | 8610 comments GoodEats


message 247: by Empress (new)

Empress (the_empress) I like the author's name Artistic Churchware.

Someone had even added the "book". Curiously enough this was not imported by Amazon but by ingram. 0.0


message 248: by Lobstergirl (new)

Lobstergirl Oh now I want to shelve the communion wafers, dammit.


message 249: by Empress (new)

Empress (the_empress) Lobstergirl wrote: "Oh now I want to shelve the communion wafers, dammit."

It would be funny if they get a genre box soon. :D


Susanna - Censored by GoodReads (susannag) | 68 comments Lobstergirl wrote: "Oh now I want to shelve the communion wafers, dammit."

Same here!

