Goodreads Librarians Group discussion

Book & Author Page Issues > Warning: GoodReads scripts messing with authors


This Is Not The Michael You're Looking For | 949 comments I'm going to post this as a bug in the Feedback group, but wanted to warn librarians:

In the last day or so, "GoodReads" (according to the change log) has been updating authors on a number of books. I'm not sure what the motivation is (my guess is there is a script trying to update missing info; I don't know if this is related to the combine script or something else), but unfortunately it is starting to introduce errors into books.

For example, if the first author has been disambiguated by adding a single space, it's adding the non-spaced author as a second author (this happened to me when I discovered a new book on my home page by another author with the same name...a book which had already been disambiguated and still had the other disambiguated author as first author). It's also adding name variants (another librarian thread noted that GoodReads added Angela Hunter to a book by Angela M. Hunter) as secondary authors.

I don't know how widespread this is, but you may want to keep your eyes open for similar sorts of errors.


message 2: by rivka, Former Moderator (new)

rivka | 45177 comments Mod
Michael, I already told Otis about this issue, but I hadn't realized the degree, so I'll update him.


This Is Not The Michael You're Looking For | 949 comments Oh. I just posted the bug to feedback. I'll add a note that Otis has been informed.

Hard to know how serious it is, but seeing two examples of the same thing 10 minutes apart, without even looking for it, makes me think it might be pretty widespread.

I didn't really think about it much when I saw the odd error on the book with my name on it, but when I saw ♥Eva♥'s post bells went off. I'm currently looking at the global librarian change log. Almost every single edit on the first page is the addition of authors to a book by "GoodReads" and just skimming through the list I can see:

* "Eve Berman Do" added as second author to book by "Eve Berman"
* "Mike Bales" added as second author to book by "Michael C. Bales"
* "Johnson Wales University Staf" added as second author to book by "Johnson & Wales University"
* "Maureen E. Reed" added as second author to book by "Maureen Reed"
* "Bakvis, Herman / Skogstad, Grace Bakvis, Herman / Skogstad, Grace" added as second author to book by "Grace Skogstad"

And those are just from the first 20 or so entries...


This Is Not The Michael You're Looking For | 949 comments At the rate at which these changes are being made and the frequency with which they are redundant or wrong with respect to the already correct entries (I just did a refresh of the librarian edit page...) we're looking at potentially tens of thousands of errors.


message 5: by rivka, Former Moderator (new)

rivka | 45177 comments Mod
>_<


message 6: by Melody (new)

Melody (runningtune) | 13300 comments I've noticed several authors like this:
http://www.goodreads.com/author/show/...

With entries both with and without the punctuation after the middle initial, listed on the same book. I just started noticing this this morning.


message 7: by rivka, Former Moderator (new)

rivka | 45177 comments Mod
I'm going to leave that (twitchy as it makes me) so Otis can see it.


This Is Not The Michael You're Looking For | 949 comments Just think of all the additional edits you're going to be able to add to your librarian status! :-)


message 9: by rivka, Former Moderator (new)

rivka | 45177 comments Mod
Whut....-glare-


message 10: by Shanon (new)

Shanon (boban) Should we be editing these as we find them or waiting until the bug is definitely fixed?


message 11: by rivka, Former Moderator (new)

rivka | 45177 comments Mod
I suspect that until the script is repaired these will keep coming back. So it's up to you.


message 12: by Sandra (new)

Sandra | 31493 comments Thanks for the heads up people, if I see any I'll forcibly restrain myself until the bug is fixed.


This Is Not The Michael You're Looking For | 949 comments This is all part of a ploy by Otis to get top librarian. According to the logs he's made 312,649 edits this week!

11 hours later, the script is still chugging along. This should be fun.


message 14: by Lisa (new)

Lisa Vegan (lisavegan) | 2400 comments And I think Otis is goodreads' forever top librarian! ;-)


message 15: by Sandra (new)

Sandra | 31493 comments 317,000+ seems a bit steep for the rest of us to ever achieve.


message 16: by Melody (new)

Melody (runningtune) | 13300 comments http://www.goodreads.com/book/show/30...

Here's another thing - variations of the spelling of the last name.


message 17: by MissJessie (last edited May 15, 2010 10:44AM) (new)

MissJessie | 866 comments Earth to Good Reads (Otis????)

At the risk of sounding like a b____ once again a lot of work is being made for (Volunteer) librarians.

(And I'm not one, so it isn't personal.)

(Perhaps there is an "undo" option and this can be repaired.)

Testing is a good idea.

This isn't a situation where the good outweighs the bad, is it?

Good Reads is wonderful, but it rides on the backs of volunteers who must eventually get tired of undoing things (or, I hope they will) that could have been prevented!

I know I am making myself unpopular with the Powers that Be, but someone has to say it.

But I'll export my booklist now just the same....

OK, now rant back at me :)


This Is Not The Michael You're Looking For | 949 comments I have to admit this one irritates me a bit. But it also needs to be noted that this is not a "bug" per se...the script is doing exactly what it was supposed to do (which appears to be looking up books and adding 'missing' authors). The problem is I'm not sure the ramifications of the script were well thought out. They never thought about alternate spellings and disambiguations and/or they never realized how massive the changes would end up being.

It might be worthwhile for Otis to start asking the librarian group about massive auto-script updating before implementation. Despite some of the errors it introduced, the auto-combine script from last week seems to have been fairly useful (it did have some bugs in it, which is a different issue). The more I look at it, the more this author script is turning into a nightmare and I think a lot of the problems could have been predicted.

In some sense, this has the potential to partly undo every author correction ever made. Every disambiguation may have the single space version added as a co-author (it's not, thankfully, replacing any authors, just adding), every name variant correction will be re-added to books, etc. It's going to mean merging hundreds to thousands of additional authors (variant names) as well as deleting thousands to (don't want to think about it) secondary authors from books.

And why do these changes always get implemented right before the weekend?


message 19: by MissJessie (new)

MissJessie | 866 comments Thank you Michael for your comments. In the past I have felt like I was shouting into the wind; either I got "explanations" or more often, silence from the powers that be.

I really hope there is an undo feature that can be implemented, or failing that, a backup made before this potential disaster of an improvement was implemented.

As to weekend running, do you see anyone else either commenting or attempting to fix it :) ? Turn it on and run...... :)

As I said, I am not a librarian, and this is one of the reasons why I have never applied. I truly feel that GR (I will blame Otis, but just because he's handy) is taking advantage of a free labor force.

It is not impossible to A: TEST on a small data set. This kind of mess would show up on even a data set of 1000. Why not try that?

B. As you said Michael, ASKING the people affected who do the work is a concept too.

Sorry to sound so irritable, but this kind of thing really pi...es me off.

I realize that GR is free to you and me and so to a point complaining is rather churlish, but taking advantage of people's goodwill and free work for profit (and I don't believe for a minute there isn't one, and there should be) is unethical and frankly, not so bright. Sooner or later volunteers will get fed up.


message 20: by Patrick (new)

Patrick Brown | 101 comments Just popping in to say that Otis is well aware of this problem and working on fixing it. My guess is that it will involve reverting to an earlier, saved database, and that takes a LOT of time to do. So while a solution may not appear in the next few hours, something will be done about it.

As for the situation of taking advantage of librarians, I think that's unfounded. It's a volunteer position, after all. Anyone who doesn't want to be one doesn't have to. And even those with librarian status are under no obligation to do anything. Some people have librarian status and rarely use it. We're grateful for those who do, but I suspect they do it out of a desire to improve the site, not out of a deep sense of obligation.

As for checking with librarians before running scripts like this, I'll let Otis speak to that, as that's really his territory.


message 21: by MissJessie (new)

MissJessie | 866 comments Patrick: Good to know it's being worked on. Thanks for the update.

I understand that it's a volunteer position with no obligation imposed at all.

My point is that at some point even volunteers get P/O'd when changes are made that mess up work done without any apparent test/consideration whatever.

For example, when you revert to a saved database, whatever work done (correctly) since then will be lost. Maybe not much, maybe more, who knows? But that's very frustrating for someone who has done it.

I know there are volunteers on this site (and God Bless them) whose life seems to heavily revolve around being librarians, super librarians, whatever.
But I wonder at what point even those dedicated souls will get annoyed. And sorry, but I do think it's taking advantage of everyone's current good nature.

No one is forced, everyone wants a good database.
And they get a sense of satisfaction out of the job I imagine, or a sense of empowerment, or something.

BUT, if GR didn't have volunteers and had to pay someone to do this, what would it cost? Paid employees can't complain much if they wish to remain such, but volunteers can go elsewhere. Most won't. But some will.


message 22: by Otis (last edited May 15, 2010 03:41PM) (new)

Otis Chandler | 315 comments I have to say that I'm not aware that we're running any script to fix author data. Clearly this is happening in an automated way however, so maybe it's a bug in the site somewhere. Will report when it's found. Anyone have a good idea when it started?

Apologies for the errors. Please don't assume that we do these things on purpose or that we'll rely on librarians alone to clean it up. We're all in this together! :)


message 23: by Patrick (new)

Patrick Brown | 101 comments MissJessie wrote: "Patrick: Good to know it's being worked on. Thanks for the update.

I understand that it's a volunteer position with no obligation imposed at all.

My point is that at some point even volunteers ..."


Oh, it's certainly annoying. I didn't mean to imply otherwise. It's obviously a deeply frustrating experience to change something and see that work undone by an automated process. But to suggest that there is exploitation happening is incorrect. People are librarians because they want to be. If they no longer want to be librarians, as you point out, they will choose that. But the choice is, as always, theirs. We try to make the site the best it can be. Sometimes those efforts backfire, but there's nothing malicious about that.


message 24: by Otis (new)

Otis Chandler | 315 comments Ok I think I found it. We were importing some more Barnes & Noble book meta-data, and it was adding secondary authors and not doing good checks if the secondary author was already the primary author! :(

Good thing is we have logs of all of it, so we should be able to fix easily.
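
A minimal sketch of the kind of check described above as missing, assuming the import receives a primary author name and a list of candidate secondary author names as plain strings. The function names and the matching rule are hypothetical, not Goodreads code: a candidate is treated as a probable variant of the primary author if the two names are identical once spacing, punctuation, and case are ignored, or if they share first and last name and differ only in middle names or initials. Nicknames such as "Mike" for "Michael" are not caught by this rule.

```python
import re
from typing import List


def normalize(name: str) -> List[str]:
    """Lowercase, drop punctuation, and split on any run of whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return cleaned.split()


def is_probable_variant(primary: str, candidate: str) -> bool:
    """Heuristic (an assumption, not the site's rule): same name up to
    spacing/punctuation, or same first and last token."""
    a, b = normalize(primary), normalize(candidate)
    if not a or not b:
        return False
    if a == b:
        # Covers disambiguation by extra spaces and "M." versus "M".
        return True
    # Covers "Angela M. Hunter" versus "Angela Hunter".
    return a[0] == b[0] and a[-1] == b[-1]


def secondary_authors_to_add(primary: str, imported: List[str]) -> List[str]:
    """Keep only imported names that do not look like the primary author."""
    return [n for n in imported if not is_probable_variant(primary, n)]
```

A heuristic like this is better suited to holding a record for review than to deciding automatically, since distinct authors can have near-identical names (the Schlesinger father and son mentioned later in the thread are an example).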


This Is Not The Michael You're Looking For | 949 comments The rate seems to have slowed down a lot, which is good.

I'm playing high/low with the librarian logs trying to see when it started. To my surprise, some of the GoodReads changes seem to be much older than I expected. I don't know if every occurrence of

"GoodReads updated the book XXXX" by XXXX
additional author added: xxxx

is part of the problem or not. I don't recall ever seeing these in the past, but it's been a long time since I've scanned through the global librarian edit logs.

Based on that sort of log entry, the very earliest of these I can find (17000+ log pages back) is from March 18, 10:47am (#2732905) (this is approximately 1.7 million logged edits (from all librarians) ago). (Interestingly, the second earliest of all of these log entries that I found is an error where a variant of the author's name was added as second author). Prior to that entry, I don't see any GoodReads edits that involve adding additional authors to an existing book. I could easily have missed an earlier one, but they pick up pretty regularly in the log at that point, so I doubt I missed too many. From that point onward they start to dominate most log pages (except for the period where Otis was running his combine script, which overwhelmed most other edits at that point).
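
The "high/low" search through the log pages is essentially a binary search. A rough sketch, assuming page 1 is the newest log page and that a hypothetical predicate `has_edits(page)` reports whether a page contains any of the automated "additional author added" entries. It also assumes the entries appear on every page newer than the starting point, which is only approximately true for a real log:

```python
from typing import Callable, Optional


def oldest_page_with_edits(has_edits: Callable[[int], bool],
                           last_page: int) -> Optional[int]:
    """Return the highest-numbered (oldest) page where has_edits is True.

    Assumes has_edits is True for pages 1..k and False for every page
    after k. Returns None if no page has such edits.
    """
    lo, hi = 1, last_page
    answer = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if has_edits(mid):
            answer = mid      # edits seen here, so look at older pages
            lo = mid + 1
        else:
            hi = mid - 1      # too old, so look at newer pages
    return answer
```

With 17,000+ pages this needs roughly 15 page loads instead of a linear scan, at the cost of possibly missing an isolated earlier entry.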

Wow, looks like this has been going on for a while, but we've only noticed the problems it was causing very recently. I don't know if it's sped up in the last week or two or it just had to reach a saturation point until we could realize the problem was systemic (probably the latter).

When GoodReads is automatically updating a book, where is it pulling the data from? Amazon, B&N or both? What triggers an auto-update? (as opposed to a manual update). Could Amazon or B&N have reset their DB in some way, causing GR to automatically try to update everything?


This Is Not The Michael You're Looking For | 949 comments Ah...Otis found it in the time it took me to pin down the date.


message 27: by MissJessie (new)

MissJessie | 866 comments Hello Patrick:

Implications of maliciousness were not my intent. I NEVER supposed you/whoever sat there with a gleam in your eye and thought, "now what can I do to mess it up." :)

AND, am glad to know it's being attended to.

So sorry Patrick and Otis; malicious I do not think you are.

Overworked, yes.

After all, this IS the weekend.

And Bunwat, excellent points. I am not converted completely, but I see your points.

Jess


message 28: by Banjomike (last edited May 15, 2010 05:30PM) (new)

Banjomike | 5166 comments MissJessie wrote: "I NEVER supposed you/whoever sat there with a gleam in your eye and thought, "now what can I do to mess it up."

Have faith. There is always SOMEONE willing to screw up data, either deliberately or accidentally or stupidly. Like the helpful soul who deleted a popular Sony e-reader from the e-reader page.


message 29: by Sherry (last edited May 15, 2010 08:54PM) (new)

Sherry (ssaccoliti) | 601 comments So are we fixing, or are we waiting?
If we are reverting, I'll sit tight.
http://www.goodreads.com/book/show/24...
Just to make this more exciting, here's an author who has now been added automatically: Arthur M. Schlesinger, without the Jr., and both the father and the son are authors. This will not be a simple combine to fix this one manually. Guessing that there are many more examples as well.
Think I'll sit tight and wait until Monday


message 30: by Otis (new)

Otis Chandler | 315 comments We will fix - but as it's the weekend we may not get to it until next week :)


message 31: by MissJessie (new)

MissJessie | 866 comments Should librarians be warned to stop making corrections until a remedy is finished?

If GR has to revert, all work would be lost.

Although where to post it best, I have no idea.


message 32: by MissJessie (new)

MissJessie | 866 comments BunWat wrote: "Thanks for considering my points Jess, even if not converted completely. I am impressed that you considered them and didn't entrench. Very rational, I like it."

Thanks. I try to be.

AND, I didn't call you any pejorative names (nor you me). What a concept. :)


message 33: by rivka, Former Moderator (new)

rivka | 45177 comments Mod
Otis wrote: "We will fix - but as it's the weekend we may not get to it until next week :)"

Really, some people. ;)

I suspect that waiting until the fix is implemented is the best course, Jessie.


message 34: by Lisa (new)

Lisa Vegan (lisavegan) | 2400 comments Does this go for all librarian work or just author corrections? I've been doing a lot of work (for me, these days) recently. Thanks.


message 35: by rivka, Former Moderator (new)

rivka | 45177 comments Mod
Pretty sure just correcting these issues, as I believe Otis's implication is that fixing from logs should affect just the incorrectly added authors.

Anyway, that's what I'm doing. I just made several other edits.


message 36: by Lisa (new)

Lisa Vegan (lisavegan) | 2400 comments Thanks, Rivka.

