Goodreads Developers discussion

bugs > Encoding Errors

message 1: by Adam (new)

Adam (jademason) | 66 comments I am seeing a lot of encoding errors when using the book.show call. At first I thought it might just be a random book here or there, but it seems to be a pretty big issue affecting a large number of books. I went down the list of the top 10 bestsellers at B&N to see which returned encoding errors.

1. The Short Second Life of Bree Tanner - ENCODING ERROR at line 959 (Book ID:7937462)
2. The Girl Who Kicked the Hornets Nest - ENCODING ERROR at line 384 (Book ID:6892870)
3. Mockingjay - OK (Book ID:7260188)
4. The Help - OK (Book ID:4667024)
5. The Girl with the Dragon Tattoo - OK (Book ID:2429135)
6. Delivering Happiness - OK (Book ID:6828896)
7. Dead in the Family - OK (Book ID:7091488)
8. Game Change - OK (Book ID:6694937)
9. Women, Food, and God - OK (Book ID:6758423)
10. The Girl Who Played with Fire - ENCODING ERROR at line 95 (Book ID:5060378)

The book.show call is not the only one that returns results with encoding errors either; encoding errors have appeared in any call that includes user-generated content. Assuming this sample is representative of the entire library (3 of the 10 books above failed), users accessing Goodreads content via the API have roughly a 30% chance of running into an encoding error.
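For anyone who wants to repeat this kind of check, here is a minimal sketch in Python. It assumes the XML responses have already been fetched as raw bytes; the function names `check_xml` and `error_rate` are made up for illustration and are not part of the Goodreads API. Note that the parser reports any well-formedness error, so a failure is not guaranteed to be an encoding problem.

```python
import xml.etree.ElementTree as ET


def check_xml(payload):
    """Return "OK" if payload (bytes) is well-formed XML, else the failing line."""
    try:
        ET.fromstring(payload)
    except ET.ParseError as err:
        # err.position is a (line, column) tuple reported by the parser.
        line, _column = err.position
        return "ENCODING ERROR at line {}".format(line)
    return "OK"


def error_rate(results):
    """Fraction of entries in a {label: status} dict whose status is not "OK"."""
    bad = sum(1 for status in results.values() if status != "OK")
    return bad / len(results)
```

Feeding it one response per book and passing the collected statuses to `error_rate` gives the failure fraction for the sample.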

MICHAEL mentioned in another thread that this error is due to a bug in the version of Rails currently in use by the Goodreads site, and that upgrading to the latest version should eliminate the issue. Is there a date for that switchover? Is there any chance of resolving this issue in some other way in the meantime?

Thanks for your help, and a great site!


message 2: by Michael (new)

Michael Economy (michaeleconomy) OK, clearly this is a lot more severe than I thought. I think this particular bug is in my code. I'll see if I can fix it.

Thanks for bringing it to my attention.


message 3: by Adam (new)

Adam (jademason) | 66 comments No problem, and thanks for the quick response!


message 4: by Jonathan (new)

Jonathan Lin | 6 comments Hey Michael! Yah, I just wanted to add my appreciation for the quick responses in general for issues across the board. It makes developing easier to have someone to talk to when things are wonky.


message 5: by Michael (new)

Michael Economy (michaeleconomy) :D

You guys are really great at helping each other also!


message 6: by Ken (new)

Ken Dempster | 6 comments I am currently working with the API and ran into this bug. We are just interested in finding and retrieving book reviews. Do you have a better idea of how widespread this bug is? Do you have an estimated date when we are likely to see a fix? Thanks.


message 7: by Michael (new)

Michael Economy (michaeleconomy) Soon, I've fixed a lot of cases of this, but they're held up by another change.


message 8: by Ken (new)

Ken Dempster | 6 comments Thanks for the quick reply. When you get the fix in place, can you post that it has been resolved? Thanks.


message 9: by Michael (new)

Michael Economy (michaeleconomy) I'll try my best.


message 10: by Ken (new)

Ken Dempster | 6 comments I know the feeling. :)


message 11: by Rajababa (new)

Rajababa | 1 comments lovely gift


message 12: by Michael (last edited Oct 08, 2010 12:08PM) (new)

Michael Economy (michaeleconomy) OK, I think I fixed the major cause of improperly truncated Unicode strings.


Let me know if this actually makes a difference.
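To illustrate how truncation can corrupt Unicode text: cutting a UTF-8 byte string at an arbitrary byte offset can split a multi-byte character and leave an invalid sequence at the end. The sketch below is only a guess at the general technique, not the actual Goodreads fix (which is Rails code and was not posted here); `truncate_utf8` is a hypothetical helper name.

```python
def truncate_utf8(text, max_bytes):
    """Truncate text so its UTF-8 encoding fits in max_bytes without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    cut = max_bytes
    # UTF-8 continuation bytes look like 0b10xxxxxx. If the first excluded
    # byte is one of those, the cut falls inside a character, so back up
    # until the cut sits on a character boundary.
    while cut > 0 and (encoded[cut] & 0xC0) == 0x80:
        cut -= 1
    return encoded[:cut].decode("utf-8")
```

For example, "é" takes two bytes in UTF-8, so truncating "héllo" to 2 bytes has to drop the "é" entirely rather than keep half of it.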


message 13: by Adam (new)

Adam (jademason) | 66 comments Good work MICHAEL! I went through my list of books with known encoding issues and all appear to be coming up roses in my app. Thanks so much for looking into and addressing this issue!


message 14: by Michael (new)

Michael Testing my app I was browsing around and I think I ran into the same problem:

http://www.goodreads.com/book/show/68...

Thanks


message 15: by Michael (last edited Oct 11, 2010 01:27PM) (new)

Michael Economy (michaeleconomy) Michael wrote: "Testing my app I was browsing around and I think I ran into the same problem:

http://www.goodreads.com/book/show/68...

Thanks"


Someone got an End of Medium character in their review. I'm not sure what that is, or why it matters, but I think it makes the XML invalid, so I edited it out.
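For reference, End of Medium is the control character U+0019. XML 1.0 does not allow control characters below U+0020 other than tab, line feed, and carriage return, which is why a single one makes the whole response unparseable. A minimal Python sketch for stripping such characters before building or parsing XML follows; `strip_invalid_xml_chars` is a hypothetical name, and this is not the code used on the Goodreads side.

```python
import re

# Matches anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text):
    """Remove characters that XML 1.0 does not permit in a document."""
    return _INVALID_XML_CHARS.sub("", text)
```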


message 16: by Ken (last edited Oct 25, 2010 07:00PM) (new)

Ken Dempster | 6 comments We continue to see several encoding errors. What is weird with the one below is that it works fine when text_only=true. Not sure how consistent this is with the others that we are getting, but I plan on looking into them a little more.

http://www.goodreads.com/book/isbn?ke...


message 17: by Michael (new)

Michael Economy (michaeleconomy) Ken,

Thanks for reporting this. We'll look into it soon.


message 18: by M (new)

M (pandabearchews) | 2 comments Hi Ken,
We've removed the bad data. We'll continue to look for more examples. Please keep us posted if you find any more.
Thanks!


message 19: by Ken (last edited Oct 27, 2010 04:49PM) (new)

Ken Dempster | 6 comments No prob. I believe my co-worker has sent Michael a list directly of the ones that are having problems. I have not had a chance to look into those myself. If I find out any details on them, I will let you know.


message 20: by Michael (new)

Michael Economy (michaeleconomy) Oh, I did not know you guys were using the forums. :)

So yeah, we've got a list of 100 or so ISBNs that are giving back invalid XML, and we're working through them. Hopefully we'll push out another change tomorrow to address a large number of those.


message 21: by Casper (new)

Casper Gasper (caspergasper) | 32 comments I guess I'm missing something, but rather than manually checking each book in turn, why don't you just force the encoding of your data output into UTF-8? Most libraries should replace invalid UTF-8 with the standard replacement character U+FFFD (the question mark in a diamond). This is what I do at my end to work around the problem, although it would be better if it were solved on the server side.

Casper.
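The workaround described above might look like this in Python. It is a sketch only; `force_utf8` is a made-up name, and Casper's actual client code was not posted.

```python
def force_utf8(raw):
    """Decode raw bytes as UTF-8, substituting U+FFFD for any invalid sequence."""
    return raw.decode("utf-8", errors="replace")
```

The result is always a valid string, at the cost of losing whatever the broken bytes were meant to be. It does not remove control characters that are valid UTF-8 but illegal in XML, so that case still needs separate handling.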


message 22: by Michael (new)

Michael Economy (michaeleconomy) Casper wrote: "I guess I'm missing something, but rather than manually checking each book in turn why don't you just force the encoding of your data output into utf-8? Most libraries should replace invalid utf-8..."

That's pretty much exactly the solution we're working on today. :)


message 23: by Ken (new)

Ken Dempster | 6 comments I was planning on doing something like that. I just needed a bad one to test the solution out. That is why I was planning on going through the list. I wasn't sure if the problem was the same in all cases. I just haven't looked into the problem in that much depth. Thanks for the info. It will definitely save me some time.

I am using the forum because it might help someone else out and the responses have been pretty quick. :)


message 24: by Michael (new)

Michael (steelwolf) | 4 comments It looks like folks attempting to pass the RSS feed through SimpleXML are running into problems. I noticed this happening with my WordPress plugin as well as with similar plugins using the same method.

Are there still books out there causing non-UTF-8 data to be generated in the feeds?
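Until the feeds are clean, one client-side option is to try a strict parse first and only sanitize when that fails. The sketch below uses Python's ElementTree as a stand-in for SimpleXML and assumes the feed declares itself as UTF-8; `parse_feed_leniently` is a hypothetical helper, not something Goodreads or SimpleXML provides.

```python
import re
import xml.etree.ElementTree as ET

# Characters outside the set XML 1.0 allows in a document.
_INVALID_XML_CHARS = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def parse_feed_leniently(raw):
    """Parse feed bytes strictly; on failure, sanitize the bytes and retry once."""
    try:
        return ET.fromstring(raw)
    except ET.ParseError:
        # Replace invalid UTF-8 sequences with U+FFFD, then drop characters
        # XML forbids. A feed that is broken in some other way still raises.
        text = raw.decode("utf-8", errors="replace")
        text = _INVALID_XML_CHARS.sub("", text)
        return ET.fromstring(text.encode("utf-8"))
```

Parsing strictly first means well-formed feeds are never altered; only the broken ones pay the cost of the cleanup pass.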


message 25: by Michael (new)

Michael Economy (michaeleconomy) Michael wrote: "It looks like folks attempting to pass the RSS feed through SimpleXML are running into problems. I noticed this happening with my Wordpress plugin as well with similar plugins using the same method..."

There are probably some, but if you find any examples, let us know and we'll try to scrub them. We wrote a little method to clean out non-UTF-8 stuff.

