Goodreads Librarians Group discussion
      note: This topic has been closed to new comments.
    
  
  
      [Closed] Added Books/Editions > Large Book Data Import
    
  
  
      N/A
The above author has 632 works. They appear to be imports from amazon_kcw and do not have an ASIN or ISBN - all apparently are duplicates that need deleting?
      Sorry all - didn't mean to abandon you. I thought that someone had released the fix to stop books without asins or isbns from being created, but apparently that's not the case. I'll look into it today and report back with some info for you all.
    
      Thank you. It would be great if any duplicates can be dealt with as well (Or even deleted if they are unshelved)
    
      I'll try to fit the deletion of duplicates into this project. I'll let you know if it fits in schedule-wise. If it doesn't, I'll try to at least let you know when you can expect that to happen.
    
      Paw3pals wrote: "N/A The above author has 632 works. They appear to be imports from Amazon_kcw and Do Not Have ASIN or ISBN - all apparently are duplicates that need deleting?"
Wow, who's going to fix all that? It's ridiculous.
      Erin (Paperback Stash) wrote: "Wow, who's going to fix all that? It's ridiculous."
Voluntary masochists (aka librarians), I assume...
      I hear the words 'masochist' and 'librarian' buzzing in my ear... Are you talking about ME? hahaha
Not sure what would be a bigger challenge: N/A or Various (and its variations) :P
I wouldn't mind, and it's actually not such a bad idea to work on in between :)
      This book has had its real cover replaced by a generic one by amazon_kcw! >:(
https://www.goodreads.com/book/show/1...
      Scott wrote: "This book has had its real cover replaced by a generic one by amazon_kcw!"
...and, what makes matters worse, the overwritten original image was provided by a GR user and not a robot - which is in clear contradiction to "robots not allowed to change 'human' input" stated many times earlier in this thread.
Sarah, could you please enlighten us?
      Ellie [The Empress] wrote: "Just out of curiosity, how do you know the image has been replaced?"
Not being Scott, still trying to answer.
Yes, you have to look into the librarian log, and you have to select "this edition", which results in this link:
https://www.goodreads.com/book/edits/...
The entries are ordered from most recent to past. You can see that on June 4th, 2014, amazon_kcw changed the image (to the generic greenish one currently in use). Scrolling down (last entry), you will find that user "Netanella" has uploaded a (seemingly fitting, but one can "see" that only after undoing amazon_kcw's modification) cover image on Oct 15th, 2008.
Hth
      Michael wrote: "The entries are ordered from most recent to past. You can see that on June 4th, 2014, amazon_kcw changed the image (to the generic greenish one currently in use). Scrolling down (last entry), you will find that user "Netanella" has uploaded a (seemingly fitting, but one can "see" that only after undoing amazon_kcw's modification) cover image on Oct 15th, 2008."
I know. And I'm sure in this case Scott is right, as it is a script replacing an image uploaded by a user, but if that was two users, the log means nothing. Anyway, I don't want to flood this topic with irrelevant rants of how bullshit the log is!
      Sarah, I forgot something. The editions that were imported as Kindle but had ISBN numbers are changed by the fix to Kindles with ASIN numbers. Currently some display the ISBN number, but when you go to the edit page you see the ASIN number. Also, changing the language reverts the fix and assigns the change to the user changing the language code. I had another topic giving an example of this, but I have no time today to look for it.
      Sarah ----- Stop the imports....the amazon_kcw is overwriting covers again!
What a mess, every time I look at my bookshelf books have randomly changed again.
        
      That one appears to have been KCW updating an image it (or another import) had previously added. That is expected behavior. Otherwise we get stuck with a LOT of "cover soon" and "cover not available" images.
What is not expected (and we agree definitely problematic) is if KCW (or any other import) is updating an image uploaded by a user.
  
  
      rivka wrote: "That one appears to have been KCW updating an image it (or another import) had previously added. That is expected behavior. Otherwise we get stuck with a LOT of "cover soon" and "cover not availabl..."
I don't think that should be the behavior though, as Amazon allows cover changes willy-nilly on books, especially Kindle ones.
      The code to reject imports of books without isbns/asins is written and under review at the moment. I'll let you know when it goes live. There is a ticket to investigate KCW overwriting cover images, but I don't think I'll be working on that - I'll try to let you know what we find, though.
      Sarah wrote: "The code to reject imports of books without isbns/asins is written and under review at the moment. I'll let you know when it goes live."
What about the code to reject imports which overwrite attributes set by non-robots (i.e. librarians)? That would also take care of the cover images...
      Later this afternoon the code should go live that will:
1) Require imports to have at least one identifier
2) Disallow the author 'N/A'
I'd like to run a script to clean up all the books without identifiers (at least those with no reviews or shelvings), but that likely won't happen for a couple of weeks just due to timing reasons.
As for rejecting overwrites of user-provided data, that should already be the case. I've created a ticket to look at the cover image overwriting issue that's been reported. Let me know if you see more of those.
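For illustration only, the two rules in the post above (at least one identifier, no 'N/A' author) might be checked roughly as follows. This is a minimal sketch, not Goodreads' actual import code: `ImportRecord`, its fields, and `rejection_reason` are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImportRecord:
    """Hypothetical shape of one incoming book record (an assumption)."""
    title: str
    author: str
    isbn: Optional[str] = None
    asin: Optional[str] = None


def rejection_reason(record: ImportRecord) -> Optional[str]:
    """Return why the record would be rejected, or None if it passes."""
    # Rule 1: at least one non-blank identifier (ISBN or ASIN) is required.
    has_identifier = any((value or "").strip() for value in (record.isbn, record.asin))
    if not has_identifier:
        return "no ISBN or ASIN"
    # Rule 2: the placeholder author 'N/A' is not allowed.
    if record.author.strip().upper() == "N/A":
        return "author is the placeholder 'N/A'"
    return None
```

A record that returns `None` would go on to the normal import path; anything else would be skipped and, ideally, logged with the reason.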
      Sarah wrote: "As for rejecting overwrites of user-provided data, that should already be the case. I've created a ticket to look at the cover image overwriting issue that's been reported. Let me know if you see more of those."
Hi Sarah, great. Could you please indicate the relevant cut-off date, i.e. since when has this rule been (supposedly :-) ) imposed?
We all happen to stumble across offenders time and again, but it would be useless to report cases from, let's say, February if the rule had been in place only from April onwards...
      Michael wrote: "Hi Sarah, great. Could you please indicate the relevant cut-off date, i.e. since when has this rule been (supposedly :-) ) imposed?..."
I'll have to look up the exact date for the cutoff, but certainly if you see a cover being overwritten in May or afterward, report it!
      Sarah wrote: "There is a ticket to investigate KCW overwriting cover images, but I don't think I'll be working on that - I'll try to let you know what we find, though."
amazon_kcw rewriting itself:
https://www.goodreads.com/book/edits/...
        
      That is expected behavior; import sources are expected to overwrite themselves (and sometimes other sources, depending on which is higher in our hierarchy). Otherwise we get stuck with a LOT of "cover soon" and "cover not available" images.
What is not expected -- and definitely problematic -- is if KCW (or any other import) is updating an image uploaded by a user.
In this particular case it also appears to have been exactly the same image.
  
  
      rivka wrote: "That is expected behavior; import sources are expected to overwrite themselves (and sometimes other sources, depending on which is higher in our hierarchy). Otherwise we get stuck with a LOT of "co..."
I don't know if that's such a good idea anymore, considering Amazon has, for some reason, replaced a large quantity of real covers with generic ones on its site.
      Scott wrote: "I don't know if that's such a good idea anymore, considering Amazon has, for some reason, replaced a large quantity of real covers with generic ones on its site. ..."
I agree.
I also don't know that you would have many "cover soon" or "cover not available" from amazon_kcw.
I would rather have us manually update with an image when it becomes available. Right now it is overwriting its own legitimate covers with another cover that should be an alternate cover edition. For example, with the one I posted a few messages back, the cover shown when I shelved the book was legit and matched the book I had in hand (this was a paperback book).
      I agree with Bookworm R -- allowing amazon_kcw to overwrite images means that alternate cover editions are not being created. And that any member shelving the older image (or using for cover challenges, listopias, etc.) is seeing their book catalogs and efforts vandalized (or even being disqualified from team and individual challenges/games with cover parameters).
    
      rivka wrote: "... import sources are expected to overwrite themselves (and sometimes other sources, depending on which is higher in our hierarchy)..."
Would you care to divulge the currently applied hierarchy? As it stands, e.g. ingram may be overwritten by amazon_kcw. [sarcasm] For better data quality, or because it's the owner's feed? [/sarcasm]
        
      I don't happen to know it. But Sarah mentioned previously that we had moved KCW down to lower priority, mostly based on feedback in this thread. So it should no longer be happening that KCW is overwriting Ingram.
    
  
  
  
      kcw_amazon and sable_amazon have been set to a lower priority than pretty much any of our other book import sources. All data sources are considered lower priority than user-set data.
It does seem like we're having the same book imported multiple times (resulting in kcw overwriting kcw-uploaded covers) in contexts where we don't expect that behavior. I'll reach out to someone and see if I can figure out why we're seeing extra imports.
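As a rough sketch of the precedence rule described above (user-set data outranks every import source, kcw_amazon and sable_amazon sit below the other sources, and a source may overwrite its own earlier data), something like the following could apply. The numeric ranks, the `ingram` key, and `may_overwrite` are assumptions for illustration; the real hierarchy is not given in this thread.

```python
from typing import Optional

# Hypothetical ranks: higher wins. Only the relative order
# (user > other imports > kcw_amazon / sable_amazon) comes from the thread.
PRIORITY = {
    "user": 100,
    "ingram": 50,
    "kcw_amazon": 10,
    "sable_amazon": 10,
}


def may_overwrite(incoming: str, current: Optional[str]) -> bool:
    """Decide whether data from `incoming` may replace data set by `current`."""
    if current is None:
        # Nothing has been set yet, so any source may fill the field.
        return True
    if incoming == current:
        # A source is expected to be able to update its own earlier data.
        return True
    # Otherwise only a strictly higher-priority source may overwrite.
    return PRIORITY[incoming] > PRIORITY[current]
```

Under these assumed ranks, `may_overwrite("kcw_amazon", "user")` is false, which is the behavior the thread asks for with user-uploaded covers, while `may_overwrite("kcw_amazon", "kcw_amazon")` is true, matching the "sources overwrite themselves" behavior described earlier.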
      I have been working on cleaning up the improperly formatted "J" initials where no period exists, and when I went in to do some today there were a ton of newly created authors, by the book import I believe (source was onix ingram).
There must have been a large import yesterday, judging from the book creation dates, but the author names were created in violation of the GR policy of periods after all initials and no spaces.
An example book (I did not fix this one yet) is https://www.goodreads.com/book/show/2...
And I mean a very large number, it kind of makes the work I had been doing null and void.
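The initials policy mentioned above (a period after each initial, no spaces between consecutive initials) could be applied to incoming author names with something like this sketch. `normalize_initials` is a hypothetical helper, and treating any single uppercase letter as an initial is an assumption that will not suit every name.

```python
def _is_initial(token: str) -> bool:
    """True for a single uppercase letter, with or without a trailing period."""
    core = token.rstrip(".")
    return len(core) == 1 and core.isalpha() and core.isupper()


def normalize_initials(name: str) -> str:
    """Rewrite space-separated initials as 'X.Y.' and leave other tokens alone."""
    parts = []
    run = []  # consecutive initials collected so far
    for token in name.split():
        if _is_initial(token):
            run.append(token[0] + ".")
        else:
            if run:
                parts.append("".join(run))
                run = []
            parts.append(token)
    if run:
        parts.append("".join(run))
    return " ".join(parts)


print(normalize_initials("H G Wells"))      # H.G. Wells
print(normalize_initials("J. K. Rowling"))  # J.K. Rowling
```

Names without initials, such as "Stanislaw Lem", pass through unchanged, so running this before author matching should not create new variants.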
      I'm also confused and wondering if the import is what is doing this - some of my books are now foreign editions on my shelves. I know for a fact I didn't pick that version of the book. Likewise, I've had some books shelved for a while but the covers have been changed to an alternate cover.
It could be a librarian changing these things, yet it seems an odd coincidence that this happens more often now since we started the amazon imports.
      Erin (Paperback Stash) wrote: "I'm also confused and wondering if the import is what is doing this - some of my books are now foreign editions on my shelves. I know for a fact I didn't pick that version of the book. Likewise, I'..."
Was the log not helpful with identifying the reason for the change?
      FYI re: odd amazon_kcw import (esp title) @ Jun 07, 2014 05:24AM
www.goodreads.com/book/edits/22438305
ETA: left uncombined for now so to leave edit history clear.
ETA2: especially b/c this edition will probably require merging w/ matching existing one.
      Question: if I delete an incorrect cover and don't replace it with anything, will any of the imports come along and replace my deletion with a cover (I'm thinking of those hideous generic covers amazon is so fond of, but I mean the question more generally as well)?
Because in the log, the language used when I delete a cover is that I "uploaded" a cover. I "uploaded" a non-image, I suppose.
It's not as big a deal as if an import deletes a proper cover that a user has uploaded, which is actual lost data, but it's still a big annoyance given that it takes time to do librarian edits even if they are just deletes, and it's a waste of my time if my proper deletion is then overwritten with an incorrect cover.
      Is it possible for the scripts NOT to add items with edition handle DVD?
Example: https://www.goodreads.com/book/edits/...
https://www.goodreads.com/book/edits/...
https://www.goodreads.com/topic/show/...
      Hmmmm... What's up with these imported on 7/14/14 by amazon_kcw?
https://www.goodreads.com/book/show/2...
https://www.goodreads.com/book/show/2...
There are many many more: https://www.goodreads.com/search?page...
      Is anyone still checking this thing? amazon_kcw just imported this monstrosity:
https://www.goodreads.com/book/show/2...
      There's also this one imported on Jul 17:
https://www.goodreads.com/book/show/2...
The Kindle edition for this title was already imported on Jul 2
Also the ASIN B00LO767D2 doesn't exist on Amazon.it
      One thing I never understood is if it's fine to correct these bad imports once reported or if it's better to leave them alone, can you tell me? Thanks
    
      Hey hey - just returning from a little time off. Back to business now :-)
I've corresponded with the team that should be filtering out the TestASIN books and we'll hopefully get that prioritized for a fix.
Moloch - unless you're seeing a problem that you think will be really difficult to reproduce or explain if it's been fixed, you can go ahead and correct bad imports once they've been reported.
      Looks like another Onix Ingram import created a ton of Author Profiles with space separation to initials instead of the GR policy of period separation.
    
      Here's a fun one: https://www.goodreads.com/book/show/2...
Apparently J.K. Rowling is also a bassist named Andy Irvine. Who knew?
      Look - Twins!
Onix, this time. Yet another "H G Wells", yet another "Stanislaw Lem".
And, to be on the safe side, twice.
https://www.goodreads.com/book/show/2...
https://www.goodreads.com/book/show/2...
Committed Jul 23, 2014 02:20AM, so fairly recent.
Must be the import of #394...
      Imports with testasin in the title:
https://www.goodreads.com/book/show/2...
https://www.goodreads.com/book/show/2...
https://www.goodreads.com/book/show/2...
      It seems something has gone terribly wrong with this import source (amazon_kcw) and it is running rampant and needs to be put in time-out. See this thread about all the books given a single author and combined into a massive mess.
    
      Posted on the other thread as well re: the Harry Potter Box Set:
Hi all - found the source of this bug and will be pushing a fix later today. Thanks for your vigilance and work on this one. The bug had been there for a while, but other site behavior had stopped it from manifesting as a problem until relatively recently.
      I wonder if an automatic import is behind the error I fixed a few days ago in which a book by Martin Gottfried had been somehow listed as by Christopher Pike (the teen horror author). I can't tell from the changelog.
https://www.goodreads.com/book/edits/...
In general, I admit I have no idea how the programming behind Goodreads works, but once a source imports a book incorrectly and then we change it, what's to stop the source from importing it again?
      This topic has been frozen by the moderator. No new comments can be posted.
  




PLEASE STOP IMPORTS WITHOUT ANY IDENTIFYING NUMBER SUCH AS ISBN OR ASIN.