Goodreads Librarians Group discussion
note: This topic has been closed to new comments.
Large Book Data Import

The above author has 632 works. They appear to be imports from amazon_kcw and do not have an ASIN or ISBN. Apparently all are duplicates that need deleting?




Wow, who's going to fix all that? It's ridiculous.

Voluntary masochists (aka librarians), I assume...

Not sure what would be a bigger challenge: N/A or Various (and its variations) :P
I wouldn't mind, and actually it's not such a bad idea to work on in between :)

https://www.goodreads.com/book/show/1...

...and, what makes matters worse, the overwritten original image was provided by a GR user and not a robot - which is in clear contradiction to "robots not allowed to change 'human' input" stated many times earlier in this thread.
Sarah, could you please enlighten us?

Not being Scott, still trying to answer.
Yes, you have to look into the librarian log, and you have to select "this edition", which results in this link:
https://www.goodreads.com/book/edits/...
The entries are ordered from most recent to past. You can see that on June 4th, 2014, amazon_kcw changed the image (to the generic greenish one currently in use). Scrolling down (last entry), you will find that user "Netanella" has uploaded a (seemingly fitting, but one can "see" that only after undoing amazon_kcw's modification) cover image on Oct 15th, 2008.
Hth

I know. And I'm sure in this case Scott is right as it is a script replacing an image uploaded by a user, but if that was two users, the log means nothing. Anyway I don't want to flood this topic with irrelevant rants of how bullshit the log is!

The editions that were imported as Kindle but had ISBN numbers are changed by the fix to Kindles with ASIN numbers. Currently some display the ISBN number, but when you go to the edit page you see the ASIN number. Also, changing the language reverts the fix and assigns the change to the user changing the language code. I had another topic giving an example of this, but I have no time today to look for it.

Stop the imports....the amazon_kcw is overwriting covers again!
What a mess, every time I look at my bookshelf books have randomly changed again.
That one appears to have been KCW updating an image it (or another import) had previously added. That is expected behavior. Otherwise we get stuck with a LOT of "cover soon" and "cover not available" images.
What is not expected (and we agree definitely problematic) is if KCW (or any other import) is updating an image uploaded by a user.

I don't think that should be the behavior, though, as Amazon allows cover changes willy-nilly on books, especially Kindle ones.

There is a ticket to investigate KCW overwriting cover images, but I don't think I'll be working on that - I'll try to let you know what we find, though.

What about the code to reject imports which overwrite attributes set by non-robots (i.e. librarians)? That would also take care of the cover images...

1) Require imports to have at least one identifier
2) Disallow the author 'N/A'
I'd like to run a script to clean up all the books without identifiers (at least those with no reviews or shelvings), but that likely won't happen for a couple of weeks just due to timing reasons.
As for rejecting overwrites of user-provided data, that should already be the case. I've created a ticket to look at the cover image overwriting issue that's been reported. Let me know if you see more of those.
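For anyone curious what the two import rules above could look like in practice, here is a minimal sketch. It is not Goodreads code: the `ImportRecord` fields and the `rejection_reasons` helper are hypothetical names made up for illustration, and it assumes ISBN, ISBN13 and ASIN are the identifiers that count.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ImportRecord:
    # Hypothetical shape of one incoming book record; not the real schema.
    title: str
    author: str
    isbn: Optional[str] = None
    isbn13: Optional[str] = None
    asin: Optional[str] = None


def rejection_reasons(record: ImportRecord) -> List[str]:
    """Return the reasons a record should be rejected (empty list = accept)."""
    reasons = []
    # Rule 1: require at least one non-blank identifier.
    identifiers = (record.isbn, record.isbn13, record.asin)
    if not any(value and value.strip() for value in identifiers):
        reasons.append("no identifier")
    # Rule 2: disallow the placeholder author 'N/A' (case-insensitive here,
    # which is an assumption).
    if record.author.strip().upper() == "N/A":
        reasons.append("author is N/A")
    return reasons
```

A record with author "N/A" and no identifiers would fail both checks, while a record with a real author name and any one identifier would pass.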

Hi Sarah, great. Could you please indicate the relevant cut-off date, i.e. since when has this rule been (supposedly :-) ) imposed?
We all happen to stumble across offenders time and again, but it would be useless to report cases from, let's say, February if the rule had been in place only from April onwards...

I'll have to look up the exact date for the cutoff, but certainly if you see a cover being overwritten in May or afterward, report it!

amazon_kcw rewriting itself:
https://www.goodreads.com/book/edits/...
That is expected behavior; import sources are expected to overwrite themselves (and sometimes other sources, depending on which is higher in our hierarchy). Otherwise we get stuck with a LOT of "cover soon" and "cover not available" images.
What is not expected -- and definitely problematic -- is if KCW (or any other import) is updating an image uploaded by a user.
In this particular case it also appears to have been exactly the same image.
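The overwrite rule described here can be sketched roughly as follows. This is only an illustration under stated assumptions: the priority numbers are invented (the actual hierarchy is not given in this thread), `may_overwrite` is a hypothetical helper, and a current source of `None` stands for a value set by a user or librarian.

```python
from typing import Optional

# Invented priorities for illustration only; a higher number wins.
SOURCE_PRIORITY = {"amazon_kcw": 1, "ingram": 2}


def may_overwrite(incoming_source: str, current_source: Optional[str]) -> bool:
    """Decide whether an import may replace an existing value.

    current_source is None when the value was set by a human.
    """
    if current_source is None:
        # User-provided data is never overwritten by an import.
        return False
    if incoming_source not in SOURCE_PRIORITY or current_source not in SOURCE_PRIORITY:
        # Unknown sources are treated conservatively: no overwrite.
        return False
    # A source may overwrite itself, or any source at or below its priority.
    return SOURCE_PRIORITY[incoming_source] >= SOURCE_PRIORITY[current_source]
```

Under these made-up priorities, amazon_kcw can replace its own earlier cover but not one from ingram, and nothing replaces a cover uploaded by a user.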

I don't know if that's such a good idea anymore, considering Amazon has, for some reason, replaced a large quantity of real covers with generic ones on its site.

I agree.
I also don't know that you would have many "cover soon" or "cover not available" from amazon_kcw.
I would rather have us update manually with an image when it becomes available. Right now it is overwriting itself, replacing legitimate covers with another cover that should be an alternate cover edition. For example, with the one I posted a few messages back, the cover shown when I shelved the book was legit and matched the book I had in hand (this was a paperback book).


Would you care to divulge the currently applied hierarchy? As it stands, e.g. ingram may be overwritten by amazon_kcw. [sarcasm] For better data quality, or because it's the owner's feed? [/sarcasm]
I don't happen to know it. But Sarah mentioned previously that we had moved KCW down to lower priority, mostly based on feedback in this thread. So it should no longer be happening that KCW is overwriting Ingram.

It does seem like we're having the same book imported multiple times (resulting in kcw overwriting kcw-uploaded covers) in contexts where we don't expect that behavior. I'll reach out to someone and see if I can figure out why we're seeing extra imports.

Judging from the book creation dates, there must have been a large import yesterday, but the author names were created in violation of the GR policy of periods after all initials and no spaces between them.
An example book (I did not fix this one yet) is https://www.goodreads.com/book/show/2...
And I mean a very large number, it kind of makes the work I had been doing null and void.
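As a rough illustration of the initials rule mentioned above, an import could normalize names before creating authors. This is a sketch only, assuming the intended form is "H.G. Wells" rather than "H G Wells" or "H. G. Wells"; `normalize_initials` is a hypothetical helper and it does not handle run-together initials such as "HG Wells".

```python
def normalize_initials(name: str) -> str:
    """Add periods to single-letter initials and join consecutive ones."""
    parts = []
    run = []  # consecutive initials collected so far, e.g. ["H.", "G."]
    for token in name.split():
        bare = token.rstrip(".")
        if len(bare) == 1 and bare.isalpha():
            # A single letter, with or without a period, counts as an initial.
            run.append(bare.upper() + ".")
        else:
            if run:
                parts.append("".join(run))  # no spaces between initials
                run = []
            parts.append(token)
    if run:
        parts.append("".join(run))
    return " ".join(parts)
```

Names that are already in the expected form, or that contain no initials at all, pass through unchanged.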

It could be a librarian changing these things, yet it seems an odd coincidence this happens more often now since we started the amazon imports.

Was the log not helpful with identifying the reason for the change?

www.goodreads.com/book/edits/22438305
ETA: left uncombined for now so as to leave the edit history clear.
ETA2: especially b/c this edition will probably require merging w/ matching existing one.

Because in the log, the language used when I delete a cover is that I "uploaded" a cover. I "uploaded" a non-image, I suppose.
It's not as big a deal as if an import deletes a proper cover that a user has uploaded, which is actual lost data, but it's still a big annoyance given that it takes time to do librarian edits even if they are just deletes, and it's a waste of my time if my proper deletion is then overwritten with an incorrect cover.

Example: https://www.goodreads.com/book/edits/...
https://www.goodreads.com/book/edits/...
https://www.goodreads.com/topic/show/...

https://www.goodreads.com/book/show/2...
https://www.goodreads.com/book/show/2...
There are many many more: https://www.goodreads.com/search?page...

https://www.goodreads.com/book/show/2...

https://www.goodreads.com/book/show/2...
The Kindle edition for this title was already imported on Jul 2
Also the ASIN B00LO767D2 doesn't exist on Amazon.it


I've corresponded with the team that should be filtering out the TestASIN books and we'll hopefully get that prioritized for a fix.
Moloch - unless you're seeing a problem that you think will be really difficult to reproduce or explain if it's been fixed, you can go ahead and correct bad imports once they've been reported.


https://www.goodreads.com/book/show/2...
Apparently J.K. Rowling is also a bassist named Andy Irvine. Who knew?

Onix, this time. Yet another "H G Wells", yet another "Stanislaw Lem".
And, to be on the safe side, twice.
https://www.goodreads.com/book/show/2...
https://www.goodreads.com/book/show/2...
Committed Jul 23, 2014 02:20AM, so fairly recent.
Must be the import of #394...

https://www.goodreads.com/book/show/2...
https://www.goodreads.com/book/show/2...
https://www.goodreads.com/book/show/2...


Hi all - found the source of this bug and will be pushing a fix later today. Thanks for your vigilance and work on this one. The bug had been there for a while, but other site behavior had stopped it from manifesting as a problem until relatively recently.

https://www.goodreads.com/book/edits/...
In general, I admit I have no idea how the programming behind Goodreads works, but once a source imports a book incorrectly and then we change it, what's to stop the source from importing it again?
PLEASE STOP IMPORTS WITHOUT ANY IDENTIFYING NUMBER SUCH AS ISBN OR ASIN.