Goodreads Librarians Group discussion

note: This topic has been closed to new comments.
Additions to Librarian Manual > Added to the Manual: Author Sort fields

Comments 51-85 of 85

message 51: by lethe (new)

lethe | 16359 comments Perhaps a distinction can be made between accented letters (that use ` ´ ^) and extra letters, such as Krazykiwi mentions?

Accented letters are only used to mark stressed syllables and differences in pronunciation* (and in French, ^ denotes that the vowel used to be followed by an s).
This category also contains the Dutch ë, which just denotes the start of a new syllable:
Adriaan Morriën (mor-ri-en) versus Morrien (mor-rien)

(If anyone could tell me where the ë in Brontë comes from, I'd be much obliged! I've always been bemused by that one.)

(*I'm conveniently ignoring Icelandic sort order here, it's their own fault for being so difficult :P )


message 52: by Gaia (new)

Gaia | 125 comments Should German names such as Hans Müller or Christine Schäffer be registered like that, or should they be spelled Mueller and Schaeffer?


message 53: by Keith (new)

Keith (kgf0) | 377 comments I have a few things that don't appear to have been fully addressed in any thread I've seen so far, nor in the linked section of the Librarian Manual. I'll put each one in a separate comment in this thread since that seems easier to track.

First the easy(er) one: given that there are innumerable standards, and non-standard conventions, regarding proper-name alpha sorting across hundreds of languages, and given that this is, fundamentally, a computer program running on English-language servers, might it not be easiest on everyone (especially our over-worked developers, who are probably already deeply sorry that they tried to give us what we so frequently requested with full requirements documentation or user stories) to just have the sort-by fields all run off of ASCII/UTF-8 sort order?

At least that would be a single, canonical, discoverable, standardized rule common to computing for over 50 years, and likely easily implemented. Fundamentally, this is probably a question more for the development team than for us volunteers.


message 54: by Krazykiwi (new)

Krazykiwi | 1767 comments It'd be nice, but ASCII pretty much spectacularly manages to be out of order in every language I know anything about other than English, since it ignores the existence of any of these other language characters entirely. Not to mention its other eccentricities (A-Z sort before a-z, which would really really mess up all the nobility particles, the de/di/von/af etc).

ISO 8859-1 (aka Latin-1; or ISO 8859-15, aka Latin-9, but I doubt the euro character will show up in author names, so the difference is moot) is better, but still wildly out of order for... well everything. Those are charset standards more interested in inclusivity, and they tend to group "lookalike" characters together, having not really been intended for sort order/collation.

There is the Unicode Collation Algorithm, as well as ISO14651, and the EOR European sort ordering rules, all of which are intended to be tailored somewhat in implementation. Any of those would make a decent internationally agreed standard base to work from though for at least the European languages.

http://www.iso.org/iso/home/store/cat...
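To make the objection concrete, here is a minimal Python sketch of what a raw code-point sort does to a handful of names. The sample names are made up for illustration, and nothing here reflects how Goodreads actually sorts; a real implementation would more likely rely on a library that implements the Unicode Collation Algorithm.

```python
import unicodedata

# Hypothetical sample names, normalised to NFC so that each accented
# letter is a single code point.
names = [unicodedata.normalize("NFC", n) for n in
         ["von Braun", "Zola", "Ábel", "Adams", "de Gaulle", "Öberg"]]

# Plain sorted() compares code points: A-Z come before a-z, and
# accented letters come after every unaccented letter.
by_code_point = sorted(names)

# Case-folding repairs the upper/lower split, but the accented
# names still trail the rest of the list.
by_casefold = sorted(names, key=str.casefold)

print(by_code_point)
print(by_casefold)
```

In the first list "Zola" lands ahead of "de Gaulle" and "von Braun", which is exactly the nobility-particle problem; in the second the particles fall into place, but the accented names still sort last.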


message 55: by Keith (new)

Keith (kgf0) | 377 comments The Manual does not seem to address fully what to do with numbers, and how numbers interact with suffixes such as Jr. and Sr.

With the title sort, it has been easy enough to replace "Volume XI" in Title with "Volume 11" in Sort, and even to replace "Volume 9" with "Volume 09" so that the single-digit volumes don't get scattered among the double-digit ones, like:

Volume 89
Volume 9
Volume 90
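The scattering shown above, and the zero-padding fix, can be reproduced with a short Python sketch. The helper name `pad_numbers` and the default width of two digits are assumptions made for illustration only.

```python
import re

def pad_numbers(title: str, width: int = 2) -> str:
    """Left-pad every run of digits in `title` with zeros to `width`."""
    return re.sub(r"\d+", lambda m: m.group().zfill(width), title)

titles = ["Volume 89", "Volume 9", "Volume 90"]

# Character-by-character comparison puts "Volume 9" between 89 and 90.
as_written = sorted(titles)

# Sorting on the padded form ("Volume 09") restores numeric order.
padded = sorted(titles, key=pad_numbers)

print(as_written)  # ['Volume 89', 'Volume 9', 'Volume 90']
print(padded)      # ['Volume 9', 'Volume 89', 'Volume 90']
```

A width of two only helps up to 99 volumes; longer series would need a wider pad.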

Particularly given that Roman numerals, duplicating letters, sort especially badly in name fields—regardless of whether they go "Surname IX, Forename", or "Surname, Forename, IX" as I believe they should—it would seem similarly advantageous to have at least a suggestion in the Manual that Roman numerals be replaced by the corresponding Arabic numbers: "Surname, Forename, 9".

This may seem an esoteric rarity to those who rarely see personal names beyond the fourth generation like Walter Cronkite IV, but once you start getting into monarchs like Louis XIV (duplicated as Louis XIV of France, and Louis XIV Bourbon), Popes, Lamas, and assorted other types of religious and civil nobility, it can start to get a right mess.

Relatedly, I note that Jr. sorts before Sr. alphabetically, and III sorts before them both, which also gets silly.

Sort as written:

Lopez, Anna [lopez, anna]
Lopez, Carlos [lopez, carlos]
Lopez, Carlos X. [lopez, carlos x.]
Lopez III, Carlos [lopez, carlos, iii]
Lopez IX, Carlos [lopez, carlos, ix]
Lopez Jr., Carlos [lopez, carlos, jr.]
Lopez Sr., Carlos [lopez, carlos, sr.]
Lopez V, Carlos [lopez, carlos, v]
Lopez XI, Carlos [lopez, carlos, xi]
Lopez, George


Sort with Arabic numbers:

Lopez, Anna [lopez, anna]
Lopez, Carlos [lopez, carlos]
Lopez, Carlos X. [lopez, carlos x.]
Lopez Sr., Carlos [lopez, carlos, 01]
Lopez Jr., Carlos [lopez, carlos, 02]
Lopez III, Carlos [lopez, carlos, 03]
Lopez V, Carlos [lopez, carlos, 05]
Lopez IX, Carlos [lopez, carlos, 09]
Lopez XI, Carlos [lopez, carlos, 11]
Lopez, George


Finally, I note for everyone who might've overlooked it that the comma before the suffix/number/numeral in the "sort" field is important to distinguish those from middle names/initials.
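The Arabic-number proposal in this post can be sketched in Python as follows. Everything here is hypothetical: the function names, the two-digit padding, and in particular the mapping of Sr. to 1 and Jr. to 2, which is this post's suggestion rather than an established rule. The Roman numeral parser does not validate its input.

```python
ROMAN = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}

# Assumption taken from the proposal above: Sr. counts as 1, Jr. as 2.
WORD_SUFFIXES = {"sr.": 1, "jr.": 2}

def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral such as 'XIV' to an integer (no validation)."""
    values = [ROMAN[ch] for ch in numeral.lower()]
    total = 0
    for i, value in enumerate(values):
        # A smaller value before a larger one is subtracted (IV, IX, XL).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total

def sort_key(surname, forename, suffix=None):
    """Build 'surname, forename[, NN]' in lower case."""
    key = f"{surname}, {forename}".lower()
    if suffix is None:
        return key
    lowered = suffix.lower()
    number = WORD_SUFFIXES.get(lowered)
    if number is None:
        number = roman_to_int(lowered)
    # The comma before the number keeps it apart from middle initials.
    return f"{key}, {number:02d}"

authors = [
    ("Lopez", "George"), ("Lopez", "Carlos", "XI"), ("Lopez", "Carlos", "Jr."),
    ("Lopez", "Carlos X."), ("Lopez", "Carlos", "III"), ("Lopez", "Anna"),
    ("Lopez", "Carlos", "Sr."), ("Lopez", "Carlos", "IX"),
    ("Lopez", "Carlos", "V"), ("Lopez", "Carlos"),
]
keys = sorted(sort_key(*author) for author in authors)
print("\n".join(keys))
```

Sorting the generated keys reproduces the order of the second list above, with "lopez, carlos x." staying ahead of the suffixed entries because a space sorts before a comma.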


message 56: by lethe (last edited Oct 12, 2015 11:29PM) (new)

lethe | 16359 comments Sorting Sr. before Jr. looks really odd to me. Sorting should be alphabetical and not according to meaning IMO.

ETA In other languages, they may not use Sr. and Jr. Portuguese f.e. uses Filho and Neto for Jr. and III. In this case they happen to be alphabetically in the right order, but it becomes much too complicated if we have to sort suffixes according to meaning.

ETA something I thought of after logging off last night: you wouldn't sort Henry VIII before Elizabeth I, would you? And what's the difference? (And how often would one have books by both Sr. and Jr. on their shelves?)


message 57: by Elizabeth (Alaska)

Elizabeth (Alaska) I'm going to make one last effort at getting the GR display to conform to the standard. GR will be the only place where Last Jr, First appears.

http://www.chicagomanualofstyle.org/q...
http://www.english-for-students.com/L...
http://blog.apastyle.org/apastyle/201...


message 58: by lethe (new)

lethe | 16359 comments Elizabeth (Alaska) wrote: "I'm going to make one last effort at getting the GR display to conform to the standard. GR will be the only place where Last Jr, First appears."

I'm with you, Elizabeth.

(And I learned an interesting tidbit: 'Charles de Gaulle' is sorted 'de Gaulle, Charles', because Gaulle has only one syllable, contrary to 'Maupassant, Guy de'.)


message 59: by Krazykiwi (new)

Krazykiwi | 1767 comments That's really the only way that makes sense to me.

Additionally, sorting Jr, Sr, III by "meaning" is again, anglocentric, and would in the first case need additional logic added just to do that, and in the second, would need *mountains* of additional logic to make it non-anglocentric.

Imma stick a bunch of 'why did they DO that' examples in the spoiler, cos I can come up with these all day long, but skip 'em if they bore you.

(view spoiler)

Basically, getting *alphabetization* right is so insanely hard, trying to do anything past that seems slightly insane to me.


message 60: by lethe (new)

lethe | 16359 comments Krazykiwi wrote: "Additionally, sorting Jr, Sr, III by "meaning" is again, anglocentric, and would in the first case need additional logic added just to do that, and in the second, would need *mountains* of additional logic to make it non-anglocentric."

Yes, and it makes no sense. If a father John Smith has a son Jacob Smith, John is not going to be sorted before Jacob either. The only sorting should be alphabetical and numerical (with proper numbers, not words meaning 'the elder' and 'the younger').

"And thank goodness Swedish kings weren't in the habit some other royalties were of writing books."

ROTFL

"Willem van Oranje, since Orange isn't his surname either. I guess lethe knows what to do with him though :)"

I got 10/10 for sorting back when I studied LIS. ;)
It's been a while, but I still have the book! It does date back to when it was still all about the card catalogue though. Computer catalogues were not really acknowledged yet in those rules.

Kings and queens should be sorted on first name. Willem van Oranje, Henry VIII, Elizabeth I, and the Roman numerals should be sorted like their Arabic counterparts (8, 1), as Keith already said.

I'm supposing Carl XIV Gustaf was originally named Carl Gustaf and got the numeral when he became king? He should be sorted on Carl. Same with Charles. (My book actually gives the example of Willem IV Alexander van Nassau: should be sorted on Willem.)


message 61: by Krazykiwi (new)

Krazykiwi | 1767 comments Thanks lethe!

That's probably worth noting in the manual, since it's not obvious to those of us who didn't study LIS :)


message 62: by lethe (new)

lethe | 16359 comments Krazykiwi wrote: "Thanks lethe!

That's probably worth noting in the manual, since it's not obvious to those of us who didn't study LIS :)"


I'm not sure if GR is going to follow that rule though. They already decided to sort on pope (title) instead of first name as it should be, analogous to the king/queen rule.


message 63: by Elizabeth (Alaska)

Elizabeth (Alaska) lethe wrote: "I'm not sure if GR is going to follow that rule though. They already decided to sort on pope (title) instead of first name as it should be, analogous to the king/queen rule."

I think some discussion about some of this is ongoing, so maybe they'll give one last review to their decisions about sorting.

https://www.goodreads.com/topic/show/...


message 64: by lethe (new)

lethe | 16359 comments Here's hoping then :-)


message 65: by rivka, Former Moderator (new)

rivka | 45177 comments Mod
Thanks for all the feedback, suggestions, documentation of various methodologies, and other contributions to this thread.

After much debate, we have revised the new Manual section. It can be seen here: https://www.goodreads.com/help/show/4...


message 66: by lethe (last edited Oct 21, 2015 09:23AM) (new)

lethe | 16359 comments On the whole, I'm very happy with the revisions (special characters, titles, etc.)!

Personally, I don't mind that the suffixes are shown after the last name. It still looks more natural to me, even though from a sorting standpoint it's wrong.

Thank you very much for all your hard work!


message 67: by Elizabeth (Alaska)

Elizabeth (Alaska) Yes, thanks, Rivka. The sorting clarification for the accented characters (especially) gives us good direction.


message 68: by Elizabeth (Alaska)

Elizabeth (Alaska) lethe wrote: "It still looks more natural to me, even though from a sorting standpoint it's wrong."


The sorting part is correct. It's the display part that is wrong, but I'll learn to live with it.


message 69: by lethe (new)

lethe | 16359 comments Elizabeth (Alaska) wrote: "lethe wrote: "It still looks more natural to me, even though from a sorting standpoint it's wrong."

The sorting part is correct. It's the display part that is wrong, but I'll learn to live with it."


Yes, I keep confusing the two. Shown, not sorted. (Also, I still think that in the sort field the Roman numbers should be replaced by Arabic numbers. Guess I'll have to live with that :-P )


message 70: by rivka, Former Moderator (new)

rivka | 45177 comments Mod
lethe wrote: "Thank you very much for all your hard work!"

I am passing that along. Glad you like at least some of the changes.


message 71: by lethe (new)

lethe | 16359 comments rivka wrote: "lethe wrote: "Thank you very much for all your hard work!"

I am passing that along. Glad you like at least some of the changes."


It's more than "at least some", really!


message 72: by rivka, Former Moderator (new)

rivka | 45177 comments Mod
Even better.


message 73: by Elizabeth (Alaska)

Elizabeth (Alaska) Rivka, for me it's all but one. ;-)

Really ... from the developers to the manual writers and all in between, this has been a very successful effort for everyone.


message 74: by lethe (new)

lethe | 16359 comments I have a question regarding an example in the manual.

Under 'Special characters', we are told that special characters should be excluded from the sort by field and entered as follows:
capek, karel (display field: Čapek, Karel).

But under 'Diacritics', it says "The first column displays correct entry in the shelf display field, and the second column displays correct entry in the sort by field."
Č - č

Shouldn't that be c in the sort by field? And doesn't that also count for the ç and the ž?


message 75: by rivka, Former Moderator (new)

rivka | 45177 comments Mod
Good catch. Fixed.


message 76: by lethe (new)

lethe | 16359 comments Thank you!


message 77: by Krazykiwi (new)

Krazykiwi | 1767 comments Since I was plenty vocal about the issues with the last revision, I better also say, it looks great now. The guidelines are nice and clear, unambiguous and easy to apply.


message 78: by Keith (new)

Keith (kgf0) | 377 comments Thank you to everyone for the hard work on a much needed and long begged for feature. I have one area left for clarification which I meant to include with my previous posts before I ran out of time. Admittedly, this is something of a corner case.

As all of us here know, the disambiguation hack we've been using for authors with the same name has been to include extra spaces before the surname. In the sort field, we have no standard defined for discarding those spaces, or where to put them if they are to be retained (since they cannot be before the surname, and will be truncated if left trailing in their natural place).

Do we care about sorting within same-named authors, so that John Stanley's horror movie books are not scattered among John Stanley's Little Lulu comics? If we do care (and I'm not suggesting that we have to care), I think it would be worthwhile to decide upon and document a consistent means for doing so. That could look like including the number of spaces—only in the sort field—as a number like "Stanley, John 2", or our other hack of john^^stanley, like "stanley, john^^" in the sort field, or...?

FWIW, if we're going to do it at all, I prefer the numbers rather than the carets, but somebody may have a better idea. Clearly my ideas are not always useful (and thanks to Krazykiwi for reminding me why ASCII sort is a dumb idea even if we don't use mixed case in the sort field).

And if we're not going to do anything about these cases, we might want one line in the manual to say "ignore intervening disambiguation spaces" or something like that, just to make it clear that the issue was considered and settled.

Thanks again!


message 79: by Moloch (new)

Moloch | 3975 comments Thank you very much, this feature meant a lot to me too and I'm happy to have it.


message 80: by rivka, Former Moderator (new)

rivka | 45177 comments Mod
Keith wrote: "if we're not going to do anything about these cases, we might want one line in the manual to say "ignore intervening disambiguation spaces" or something like that, just to make it clear that the issue was considered and settled."

We have decided it is best not to add that additional layer of complexity. Your suggestion to add a line to the Manual entry is a good one though, and we have done so.
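For anyone scripting against exported data, ignoring the disambiguation spaces amounts to collapsing runs of whitespace before building a sort value. A minimal Python sketch, where the helper name `collapse_spaces` is an assumption and not anything Goodreads provides:

```python
def collapse_spaces(name: str) -> str:
    """Trim the name and reduce every run of whitespace to one space."""
    return " ".join(name.split())

# Two same-named authors, told apart only by extra spaces in the display name.
print(collapse_spaces("John Stanley"))    # John Stanley
print(collapse_spaces("John   Stanley"))  # John Stanley
```

Both authors end up with the same sort value, which matches the decision above not to order same-named authors among themselves.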


message 81: by Keith (new)

Keith (kgf0) | 377 comments rivka wrote: "Your suggestion to add a line to the Manual entry is a good one though, and we have done so."

Acknowledged; thank you.


message 82: by Krazykiwi (new)

Krazykiwi | 1767 comments After a little experimentation, I've found that the sort and display fields do auto-correct to the defaults if you blank them before saving. Which is awesome if the default sort/display is correct, which it is for the vast majority of authors. So the only ones that won't come out right are those that need manual correcting anyway.


message 83: by lethe (new)

lethe | 16359 comments Do we sort the letters not in the list analogous to the ones mentioned? F.e., should
Ondřej Mrázek
be sorted
mrazek, ondrej

i.e.,

Ř ř - r
Š š - s
ů - u
etc.?


message 84: by rivka, Former Moderator (new)

rivka | 45177 comments Mod
lethe wrote: "Do we sort the letters not in the list analogous to the ones mentioned? F.e., should
Ondřej Mrázek
be sorted
mrazek, ondrej

i.e.,

Ř ř - r
Š š - s
ů - u
etc.?"


Yes. The list is meant to be examples, not exhaustive.
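Because the list is illustrative rather than exhaustive, the general rule (drop the diacritic, keep the base letter) can be approximated mechanically. The Python sketch below uses the standard `unicodedata` module; the function names are hypothetical, and this is one way to approximate the rule, not the Manual's own procedure.

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Decompose accented letters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def sort_by_value(display: str) -> str:
    """Lower-case, diacritic-free form of a 'Surname, Forename' string."""
    return strip_diacritics(display).lower()

print(sort_by_value("Čapek, Karel"))    # capek, karel
print(sort_by_value("Mrázek, Ondřej"))  # mrazek, ondrej
```

Letters such as ø or ß have no canonical decomposition and pass through unchanged, so they would still need whatever explicit replacement the Manual prescribes.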


message 85: by lethe (new)

lethe | 16359 comments Thanks for the clarification!

