Kindle Notes & Highlights
a comparator construction method is used in place of a lambda (Items 14, 43):

```java
Collections.sort(words, comparingInt(String::length));
```
To me, 'comparingInt' is a strange name for a factory method. It's questionable whether it falls into any of the example factory-name styles listed in Item 1.
Item 34 says that enum instance fields are preferable to constant-specific class bodies. Lambdas make it easy to implement constant-specific behavior using the former instead of the latter.
To be fair, you could have done this without lambdas, too, by passing the operation as an anonymous class instance in a constructor arg.
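For concreteness, here's roughly what the instance-field approach looks like (a sketch along the lines of the book's Operation example; pre-Java-8, you'd pass an anonymous DoubleBinaryOperator implementation instead of each lambda):

```java
import java.util.function.DoubleBinaryOperator;

// Constant-specific behavior via a lambda stored in an instance field
public enum Operation {
    PLUS("+", (x, y) -> x + y),
    MINUS("-", (x, y) -> x - y),
    TIMES("*", (x, y) -> x * y),
    DIVIDE("/", (x, y) -> x / y);

    private final String symbol;
    private final DoubleBinaryOperator op;

    Operation(String symbol, DoubleBinaryOperator op) {
        this.symbol = symbol;
        this.op = op;
    }

    public double apply(double x, double y) {
        return op.applyAsDouble(x, y);
    }

    @Override public String toString() { return symbol; }
}
```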
Constant-specific class bodies are still the way to go if an enum type has constant-specific behavior that is difficult to understand, that can’t be implemented in a few lines, or that requires access to instance fields or methods.
The streams API was added in Java 8 to ease the task of performing bulk operations, sequentially or in parallel. This API provides two key abstractions: the stream, which represents a finite or infinite sequence of data elements, and the stream pipeline, which represents a multistage computation on these elements.
I think a great analogy, which comes from the Designing Data-Intensive Applications book, is piping commands in Unix. The first command emits data elements (very often lines from a file), each subsequent command operates on those lines (such as `grep` for filtering), and the output eventually lands in stdout, a file, etc.
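To make the analogy concrete, here's a minimal `grep`-like pipeline (file name and search term are made up):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class GrepLike {
    public static void main(String[] args) throws IOException {
        // Roughly: cat app.log | grep ERROR | head -10
        try (Stream<String> lines = Files.lines(Paths.get("app.log"))) {
            lines.filter(line -> line.contains("ERROR"))
                 .limit(10)
                 .forEach(System.out::println);
        }
    }
}
```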
It is shorter, but it is also less readable, especially to programmers who are not experts in the use of streams. Overusing streams makes programs hard to read and maintain.
It'd also be great to understand what the performance difference is, even if that is in the realm of micro-optimization. All the extra iterations and function invocations might have a real impact in contexts such as mobile apps.
In summary, some tasks are best accomplished with streams, and others with iteration. Many tasks are best accomplished by combining the two approaches. There are no hard and fast rules for choosing which approach to use for a task, but there are some useful heuristics. In many cases, it will be clear which approach to use; in some cases, it won’t. If you’re not sure whether a task is better served by streams or iteration, try both and see which works better.
Something that's not touched on here (will it be later?) is a recommendation on the use of streams at a more architectural level.
That is, here we see the comparison of streams vs non-streams mostly just inside of a method. However, I've seen applications that "replace" their stack with streams. For example, whereas a traditional mobile app will have networking, data, and UI layers fairly well separated, I've seen apps bring all of those into a stream built in the controller. (Long story short, I think that's bad.)
Note that the doc comment does not say “mod throws NullPointerException if m is null,” even though the method does exactly that, as a byproduct of invoking m.signum(). This exception is documented in the class-level doc comment for the enclosing BigInteger class. The class-level comment applies to all parameters in all of the class’s public methods. This is a good way to avoid the clutter of documenting every NullPointerException on every method individually.
I would err toward documenting it on the method. You have to tailor to real-world behavior, and people too often won't read class documentation, often because it can be very lengthy. java.util.Formatter's class doc is more than 10 pages, and it's the very last line that has this NullPointerException message.
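Here's a sketch of what I mean, using a made-up wrapper around BigInteger.mod; the point is that the @throws clause lives on the method itself:

```java
import java.math.BigInteger;
import java.util.Objects;

public class ModUtil {
    /**
     * Returns {@code value mod m}.
     *
     * @param value the dividend
     * @param m the modulus, which must be positive
     * @return {@code value mod m}, never null
     * @throws ArithmeticException if {@code m} is not positive
     * @throws NullPointerException if {@code value} or {@code m} is null
     */
    public static BigInteger mod(BigInteger value, BigInteger m) {
        Objects.requireNonNull(value, "value");
        Objects.requireNonNull(m, "m");
        return value.mod(m);
    }
}
```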
assertions are claims that the asserted condition will be true, regardless of how the enclosing package is used by its clients. Unlike normal validity checks, assertions throw AssertionError if they fail. And unlike normal validity checks, they have no effect and essentially no cost unless you enable them, which you do by passing the -ea (or -enableassertions) flag to the java command.
I don't like assert; it communicates a sense of guarantee or safety that may not be present. That is, it might tell the developer "we know the input value is never negative" but new code might not adhere to that. It might tell the developer "don't worry if a negative value comes in, this code will catch it before it does any damage to state," but assertion checks can be turned off.
To add on to that last point, it introduces a real and likely possibility of behavior that's different between development and production.
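A tiny illustration of that dev/prod divergence (the method and its claim are made up):

```java
public class AssertDemo {
    static double sqrt(double x) {
        // Claim: callers never pass a negative value.
        // `java AssertDemo` skips this check entirely;
        // only `java -ea AssertDemo` will throw AssertionError.
        assert x >= 0 : "negative input: " + x;
        return Math.sqrt(x);
    }

    public static void main(String[] args) {
        System.out.println(sqrt(-1)); // NaN without -ea, AssertionError with it
    }
}
```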
An important exception is the case in which the validity check would be expensive or impractical and the check is performed implicitly in the process of doing the computation.
I have a recommended exception that I like to call "garbage-in-garbage-out micro-optimization."
If the check
1. guards against programmer error (as opposed to input/user error) and is moderately expensive to perform,
2. is therefore expected to almost never fail in production, and
3. can be skipped because letting the exception be thrown during computation would not leave the object in an inconsistent state,

then it's better not to check in advance. Don't let protecting programmers from bad code compromise the good code that will run most of the time.
Note also that we did not use Date’s clone method to make the defensive copies. Because Date is nonfinal, the clone method is not guaranteed to return an object whose class is java.util.Date: it could return an instance of an untrusted subclass that is specifically designed for malicious mischief. Such a subclass could, for example, record a reference to each instance in a private static list at the time of its creation and allow the attacker to access this list. This would give the attacker free rein over all instances. To prevent this sort of attack, do not use the clone method to make a defensive copy of a parameter whose type is subclassable by untrusted parties.
This is venturing too deep into "making your Java app secure against malicious code," which a blurb here and there is wholly inadequate for.
For example, what do you even know about your runtime? Can it give the attacker access to the class loader? Instead, this should really just say something like "This is for defending against mistakes, but for Java security, here's a recommendation for where to learn more."
Long sequences of identically typed parameters are especially harmful. Not only won’t users be able to remember the order of the parameters, but when they transpose parameters accidentally, their programs will still compile and run. They just won’t do what their authors intended.
A third technique that combines aspects of the first two is to adapt the Builder pattern (Item 2) from object construction to method invocation. If you have a method with many parameters, especially if some of them are optional, it can be beneficial to define an object that represents all of the parameters and to allow the client to make multiple “setter” calls on this object, each of which sets a single parameter or a small, related group.
I'll repeat: it'd be great if Java supported named parameters (with support for some being optional).
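A minimal sketch of the builder-for-method-invocation idea (names made up); the chained setters read almost like named, optional parameters:

```java
public class RenderRequest {
    private int width = 640;   // defaults stand in for optional parameters
    private int height = 480;
    private int dpi = 72;

    public RenderRequest width(int width)   { this.width = width; return this; }
    public RenderRequest height(int height) { this.height = height; return this; }
    public RenderRequest dpi(int dpi)       { this.dpi = dpi; return this; }

    @Override public String toString() {
        return width + "x" + height + " @ " + dpi + " dpi";
    }

    public static void main(String[] args) {
        // Reads like named parameters; omitted ones keep their defaults
        System.out.println(new RenderRequest().width(1024).height(768));
    }
}
```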
Therefore you should avoid confusing uses of overloading. Exactly what constitutes a confusing use of overloading is open to some debate.
Let's try to word this recommendation in positive terms - when should you use overloading?
Use overloading to make things convenient for the caller: give them several entry points to what is essentially a single operation. That is, regardless of which one they call, the behavior should be the same/correct; overloading should simply make it easier to accept multiple types and/or relieve the caller from having to provide certain parameters (either so they can rely on default values, or because the parameters aren't applicable for some other inputs).
An obvious indicator of "essentially a single operation" is when an overload ends up calling one of its other overloads.
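A sketch of that shape (names made up); each overload is just a convenience entry point that delegates inward:

```java
public class Greeter {
    public static String greet() {
        return greet("world");       // supplies a default
    }

    public static String greet(String name) {
        return greet(name, "Hello"); // supplies a default
    }

    // The single underlying operation
    public static String greet(String name, String salutation) {
        return salutation + ", " + name + "!";
    }

    public static void main(String[] args) {
        System.out.println(greet());            // Hello, world!
        System.out.println(greet("Ada", "Hi")); // Hi, Ada!
    }
}
```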
The confusing behavior demonstrated by the previous example came about because the List<E> interface has two overloadings of the remove method: remove(E) and remove(int). Prior to Java 5 when the List interface was “generified,” it had a remove(Object) method in place of remove(E), and the corresponding parameter types, Object and int, were radically different.
The problem isn't that these two overloadings have the same number of parameters (well, it's a secondary problem), but that two very different operations were given the same name.
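Here's the confusion in miniature (my own sketch of the List example the book discusses):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RemoveConfusion {
    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(-1, 0, 1));
        list.remove(1);                  // remove(int index): removes the 0 at index 1
        list.remove(Integer.valueOf(1)); // remove(Object): removes the value 1
        System.out.println(list);        // [-1]
    }
}
```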
In summary, varargs are invaluable when you need to define methods with a variable number of arguments.
I think it'd be worth spelling out: varargs is designed for when you expect *developers* to need to provide a variable number of arguments, not simply to accept inputs with a variable number of elements.
In other words:
* Will different call sites of this method provide different numbers of arguments? → Use varargs (see the sketch below).
* Will the calling code collect variable-length input into an array and then pass it into this method? → Don't use varargs.
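A sketch of the first case, along the lines of the book's max example; requiring one leading parameter also rules out a nonsensical zero-argument call:

```java
public class Stats {
    static int max(int first, int... rest) {
        int result = first;
        for (int value : rest)
            result = Math.max(result, value);
        return result;
    }

    public static void main(String[] args) {
        // Call sites naturally list a few values inline - a good varargs fit.
        System.out.println(max(3, 7, 2)); // 7
        // If callers already hold an int[], declare the parameter as int[] instead.
    }
}
```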
In the unlikely event that you have evidence suggesting that allocating empty collections is harming performance, you can avoid the allocations by returning the same immutable empty collection repeatedly, as immutable objects may be shared freely (Item 17). Here is the code to do it, using the Collections.emptyList method. If you were returning a set, you’d use Collections.emptySet; if you were returning a map, you’d use Collections.emptyMap.
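(The code image didn't survive the copy; the book's version is essentially this, using the same Cheese example as the array snippet quoted below:)

```java
// Optimization - avoids allocating empty collections
public List<Cheese> getCheeses() {
    return cheesesInStock.isEmpty() ? Collections.emptyList()
        : new ArrayList<>(cheesesInStock);
}
```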
Caveat/suggestion: First, you should be confident that your client is okay with an immutable collection (or that they're okay with copying it into a new one themselves if they want to modify it).
Second, be sure to state in your documentation that the collection you're returning is immutable, as there's otherwise no clear indication—as pointed out earlier, there isn't an "ImmutableList" interface, which means your client might have to find out the hard way: at runtime.
If you believe that allocating zero-length arrays is harming performance, you can return the same zero-length array repeatedly because all zero-length arrays are immutable:

```java
// Optimization - avoids allocating empty arrays
private static final Cheese[] EMPTY_CHEESE_ARRAY = new Cheese[0];

public Cheese[] getCheeses() {
    return cheesesInStock.toArray(EMPTY_CHEESE_ARRAY);
}
```
Why not return the empty immutable array directly (rather than "copying" no contents into the empty array)?
In summary, never return null in place of an empty array or collection.
Never say never: There are exceptions—very few, to be fair—where null can be used to communicate something different from empty. For example, null can indicate that no value (where the value is a list of values) has been stored in the map or key-value store yet (and where storing an empty array may not be desirable).
In Java 8, there is a third approach to writing methods that may not be able to return a value. The Optional<T> class represents an immutable container that can hold either a single non-null T reference or nothing at all.
I'm still not convinced that Optional is a good thing, that it's something I want to use. Some of the reasons:
• It actually introduces another possibility of failure: if a method returns null instead of an empty optional.
• It introduces another object (instantiation/allocation).
I'd really love to know why this was chosen over adoption of a syntactic solution or annotations (such as @Nullable).
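To illustrate that first bullet: nothing stops an Optional-returning method from returning null, defeating the whole point (made-up sketch):

```java
import java.util.Optional;

public class OptionalPitfall {
    // Contract says "empty optional if absent"...
    static Optional<String> findNickname(String user) {
        if (user.isEmpty())
            return null;   // ...but this bug compiles fine: null instead of Optional.empty()
        return Optional.of(user.toLowerCase());
    }

    public static void main(String[] args) {
        System.out.println(findNickname("").isPresent()); // NullPointerException
    }
}
```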
Returning an optional that contains a boxed primitive type is prohibitively expensive compared to returning a primitive type because the optional has two levels of boxing instead of zero. Therefore, the library designers saw fit to provide analogues of Optional<T> for the primitive types int, long, and double.
Except it shouldn't be compared to returning just a primitive type, as that doesn't have an actual way to indicate "no value" (other than agreeing that some value, such as negative one, means no value, if such a sentinel is available). So, it should be compared to just a boxed primitive (one level of boxing).
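For reference, the primitive analogues look like this; orElse pushes the sentinel to the call site instead of baking it into the contract:

```java
import java.util.OptionalInt;
import java.util.stream.IntStream;

public class PrimitiveOptional {
    public static void main(String[] args) {
        // OptionalInt: no Integer box inside the Optional
        OptionalInt max = IntStream.of(3, 7, 2).max();
        System.out.println(max.orElse(-1)); // 7; -1 only if the stream were empty
    }
}
```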
If the text in the @return tag would be identical to the description of the method, it may be permissible to omit it, depending on the coding standards you are following.
Even when this is the case—which is very often—a good detail I often find myself putting here is whether/when the returned object may be null (often all I put here is `never {@code null}`)
“A college degree, such as B.S., M.S. or Ph.D.” will result in the summary description “A college degree, such as B.S., M.S.” The problem is that the summary description ends at the first period that is followed by a space, tab, or line terminator (or at the first block tag) [Javadoc-ref]. Here, the second period in the abbreviation “M.S.” is followed by a space. The best solution is to surround the offending period and any associated text with an {@literal} tag, so the period is no longer followed by a space in the source code:
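(The code image is missing here; the fix looks essentially like this. Inside {@literal}, the period is no longer followed by a space in the source, so the summary runs to the real end of the sentence.)

```java
/**
 * A college degree, such as B.S., {@literal M.S.} or Ph.D.
 */
public class Degree { }
```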
Nearly every local variable declaration should contain an initializer. If you don’t yet have enough information to initialize a variable sensibly, you should postpone the declaration until you do. One exception to this rule concerns try-catch statements. If a variable is initialized to an expression whose evaluation can throw a checked exception, the variable must be initialized inside a try block (unless the enclosing method can propagate the exception).
I would take this exception further: For variables that are initialized inside of a try-catch (or other control block, like if/elseif/else), do _not_ also initialize the variable at the time of declaration.
• This makes it clear (to someone else reading the code) that the variable only gets a usable value by making it through the block.
• It's also a useful practice while writing the block; it helps ensure every path that should initialize it does.
• It makes it clearer which paths result in that "default" value (rather than having to determine/infer it from the paths that don't set it).
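Here's a sketch of the above in practice; with no initializer at the declaration, the compiler's definite-assignment check enforces that every path through the block assigns a value:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TryInit {
    static String load(Path path) {
        String contents;   // deliberately no initializer here
        try {
            contents = new String(Files.readAllBytes(path));
        } catch (IOException e) {
            contents = "";  // the "default" path is explicit and easy to spot
        }
        return contents;
    }
}
```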
As of Java 7, you should no longer use Random. For most uses, the random number generator of choice is now ThreadLocalRandom. It produces higher quality random numbers, and it’s very fast. On my machine, it is 3.6 times faster than Random.
It'd be nice to hear about what's different about it and why it's better (what is it about Random's implementation that makes it bad / how was ThreadLocalRandom able to be better?). Seems like a small ask considering how much was just put into analyzing the bad example.
What if threads are short-lived and each thread asks for a random number only once? Seems like the cost of initializing a thread-local random object could be higher than the contention cost of using a shared Random (though in such a case, the difference is almost certainly insubstantial).
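Basic usage, for reference; each thread gets its own generator, so there's no contention on a shared seed:

```java
import java.util.concurrent.ThreadLocalRandom;

public class RandomDemo {
    public static void main(String[] args) {
        int roll = ThreadLocalRandom.current().nextInt(1, 7); // die roll: 1..6
        System.out.println(roll);
    }
}
```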
An alternative to using BigDecimal is to use int or long, depending on the amounts involved, and to keep track of the decimal point yourself. In this example, the obvious approach is to do all computation in cents instead of dollars.
This is a good recommendation, but when you're working with money there's a decent chance you'll need to do division or fractional multiplication (e.g., to calculate tax), which means you're right back to having to worry about rounding/precision, and mixing and matching (going back and forth between int/long and float/BigDecimal).
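A small sketch of both halves, with a made-up tax rate: exact addition in cents, plus the rounding decision that fractional math forces back on you:

```java
public class Cents {
    public static void main(String[] args) {
        long priceCents = 1999;                          // $19.99, tracked exactly
        // Fractional multiplication reintroduces a rounding-policy choice:
        long taxCents = Math.round(priceCents * 0.0825); // 164.9175 -> 165
        System.out.println(priceCents + taxCents);       // 2164
    }
}
```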
In nearly every case when you mix primitives and boxed primitives in an operation, the boxed primitive is auto-unboxed.
"In nearly every case," is a weird qualifier to not explain. Are there cases where the primitive is autoboxed (that isn't clear because it's getting provided as a parameter of the boxed primitive type)?
It is entirely appropriate to refer to an object by a class rather than an interface if no appropriate interface exists. For example, consider value classes, such as String and BigInteger.
For Android, even String should often be replaced with the CharSequence interface since non-String CharSequences are often used.
To summarize, do not strive to write fast programs—strive to write good ones; speed will follow. But do think about performance while you’re designing systems, especially while you’re designing APIs, wire-level protocols, and persistent data formats.
"IADA", or "Identifier, API, Data, and Architecture," is the order of difficulty of changing those aspects (i.e., Identifiers are the hardest to change, architecture is the easiest, relatively), and reinforces this recommendation.
That is, APIs are among the hardest to change, so focus on getting them right; later on, it's easier to change the implementation to make it faster or more efficient.
The remainder of a package name should consist of one or more components describing the package. Components should be short, generally eight or fewer characters. Meaningful abbreviations are encouraged, for example, util rather than utilities. Acronyms are acceptable, for example, awt. Components should generally consist of a single word or abbreviation.
But what about when a package component is multi-word? I haven't found any good stated convention for multi-word components (for example, "com.example.ui.viewmodel" vs "com.example.ui.view_model"; I know most tend toward the former).
There is some disagreement as to whether acronyms should be uppercase or have only their first letter capitalized. While some programmers still use uppercase, a strong argument can be made in favor of capitalizing only the first letter: even if multiple acronyms occur back-to-back, you can still tell where one word starts and the next word ends. Which class name would you rather see, HTTPURL or HttpUrl?
Glad to see that the 3rd edition has seen the light 😄! To elaborate on the reasoning, here's my note copied from the 2nd edition (https://www.goodreads.com/notes/22719418-effective-java/74-robert/4c2de9b9-b59f-44cb-8da7-1266ed8e18f9):
I believe the case for capitalizing just the first letter has strong justifications while the all-uppercase pattern does not.
Examples will use what an "HTTP SSL Request" class and variable might look like...
1. Overall simpler.
* CamelCase: Whether word, acronym, initialism, or abbreviation, capitalize just the first letter of it. For the variable, lowercase just the first letter.
* All-Uppercase: Rules are inevitably more complex (or the variable ends up being weird or inconsistent with the class name; see #3).
2. Consecutive acronyms/initialisms/abbreviations are distinguishable and readable.
* CamelCase: HttpSslRequest
* All-Uppercase: HTTPSSLRequest (where does one acronym end and the next begin?)
3. The variable for a CamelCase class is easy and consistent: simply lowercase the first letter: httpSslRequest (only the first character varies from its class name). For the all-uppercase convention, all variable name options have a drawback:
* hTTPSSLRequest: simply lowercase the first letter (easy rule, but a weird-looking variable)
* httpSSLRequest: lowercase the first acronym, but not subsequent ones (strange rule; variable name inconsistent with class name)
* httpSslRequest: use camelcase for the variable (inconsistent convention between class name vs the variable name)
* httpsslRequest: lowercase acronyms, but not words (more complex rule; harder to distinguish between acronyms)
* httpsslrequest: lowercase the entire variable (harder to distinguish where one "word" ends and the next begins; greater variance with class name)
* http_ssl_request: use snake case for variables (inconsistent convention between class and variable names; more difficult to perform a 'find' on)
Type parameter names usually consist of a single letter. Most commonly it is one of these five: T for an arbitrary type, E for the element type of a collection, K and V for the key and value types of a map, and X for an exception. The return type of a function is usually R. A sequence of arbitrary types can be T, U, V or T1, T2, T3.
> T, U, V, or T1, T2, T3
I would not follow this convention, for the most part. Why make someone have to keep a mapping in their mind for multiple types to multiple parameter names? And what if you need to insert or rearrange types? I would suggest giving the types more meaningful names, such as "<DM extends DataModel, VM extends ViewModel>"
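For example (DataModel and ViewModel are made-up types):

```java
public class MapperDemo {
    interface DataModel { }
    interface ViewModel { }

    // Descriptive type-parameter names: no mental T-to-meaning mapping needed,
    // and inserting or reordering parameters doesn't invalidate the names
    interface Mapper<DM extends DataModel, VM extends ViewModel> {
        VM toViewModel(DM model);
    }
}
```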
Methods that return a boolean value usually have names that begin with the word is or, less commonly, has, followed by a noun, noun phrase, or any word or phrase that functions as an adjective, for example, isDigit, isProbablePrime, isEmpty, isEnabled, or hasSiblings.
Because exceptions are designed for exceptional circumstances, there is little incentive for JVM implementors to make them as fast as explicit tests.
Kind of a spurious statement. JVMs (standard and competitors) have evolved over a long time, and a ton of effort has gone into optimizing all aspects, including making exceptions as fast as possible.
That's not to say that exceptions are as fast as explicit tests— they're inherently more costly (e.g., they have to collect the stack trace, instantiate the exception object, and be treated as the less likely code path by predictive optimizations)— but it's not because there's a lack of incentive.
The easiest way to eliminate a checked exception is to return an optional of the desired result type (Item 55). Instead of throwing a checked exception, the method simply returns an empty optional. The disadvantage of this technique is that the method can’t return any additional information detailing its inability to perform the desired computation. Exceptions, by contrast, have descriptive types, and can export methods to provide additional information (Item 70).
A method can also easily throw several types of checked exceptions representing different types of failures. An optional cannot help the caller understand what kind of failure occurred.
If an exception fits your needs, go ahead and use it, but only if the conditions under which you would throw it are consistent with the exception’s documentation: reuse must be based on documented semantics, not just on name.
And also only if there aren't other exceptions you want to throw that would map to the same standard exception. In that case, you'll likely want to subclass the standard exception.
To avoid this problem, higher layers should catch lower-level exceptions and, in their place, throw exceptions that can be explained in terms of the higher-level abstraction. This idiom is known as exception translation:
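(The code image is missing here; a self-contained sketch of the idiom, with made-up layer and exception names:)

```java
import java.util.NoSuchElementException;

class HigherLevelException extends RuntimeException {
    HigherLevelException(String message, Throwable cause) {
        super(message, cause);
    }
}

class Repository {
    Object find(String id) {
        try {
            return lowerLevelLookup(id);
        } catch (NoSuchElementException e) {
            // Translate: callers see the abstraction's exception, with the cause chained
            throw new HigherLevelException("No record for " + id, e);
        }
    }

    private Object lowerLevelLookup(String id) {
        throw new NoSuchElementException(id);
    }
}
```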
I'll share my more general version of this recommendation: a layer should insulate the layer above it from the layer below it. Ideally, a layer should not know anything about the layer two levels down (how it works or how it's implemented). The implementation of that layer (two down) should be able to be swapped out without affecting the one two above it.
Heeding this should raise the question: if your layer can't insulate its higher and lower layers, is that layer necessary?
The lower-level exception (the cause) is passed to the higher-level exception, which provides an accessor method (Throwable’s getCause method) to retrieve the lower-level exception:
It's worth adding a warning here (in line with my previous note): you shouldn't base your code's behavior on exceptions extracted from 'cause'. You can't count on the layer the exception came from not changing it, or even on that layer not being swapped out entirely.
I have had to violate this, as it was the only way to get the detail I needed of the underlying cause in order to present the user with the most appropriate UI. And I knew it made my code more fragile (but that it was worth it, in part because I had a fallback in place — in case the underlying implementation changed — where the UI would gracefully degrade).
That reinforces another lesson here: your exception should provide explicit programmatic information about the cause (so the consumer doesn't have to actually consult the cause) if it may be important for the application logic (i.e., not just to help the developer with debugging).
Most standard exceptions have chaining-aware constructors. For exceptions that don’t, you can set the cause using Throwable’s initCause method. Not only does exception chaining let you access the cause programmatically (with getCause), but it integrates the cause’s stack trace into that of the higher-level exception.
I don't think there's a strong enough recommendation here (i.e., a bolded statement). Something like: Err on chaining the lower layer exception into the higher one — it can be valuable for debugging — unless you know it doesn't add any value.
So, along with that, consider providing chaining-aware constructors.
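A sketch of such a constructor pair (the exception name is made up):

```java
public class DataAccessException extends Exception {
    public DataAccessException(String message) {
        super(message);
    }

    // Chaining-aware: getCause() works, and stack traces are integrated
    public DataAccessException(String message, Throwable cause) {
        super(message, cause);
    }
}
```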
A closely related approach to achieving failure atomicity is to order the computation so that any part that may fail takes place before any part that modifies the object. This approach is a natural extension of the previous one when arguments cannot be checked without performing a part of the computation.
I would suggest, more explicitly, that you choose to allow exceptions to be thrown during computation (if it happens before modification, and if there's any cost to checking separately). Don't let the inefficiency of protecting programmers from their own errors affect the performance of non-error cases (code should converge toward the non-error path).
An empty catch block defeats the purpose of exceptions, which is to force you to handle exceptional conditions. Ignoring an exception is analogous to ignoring a fire alarm—and turning it off so no one else gets a chance to see if there’s a real fire. You may get away with it, or the results may be disastrous. Whenever you see an empty catch block, alarm bells should go off in your head.
It may be wise to log the exception, so that you can investigate the matter if these exceptions happen often. If you choose to ignore an exception, the catch block should contain a comment explaining why it is appropriate to do so, and the variable should be named ignored:
Too often there's a comment "// will never happen." Well, then you should have no problem having the catch block throw something like an AssertionError (since you're so sure it will never happen).
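For example, with a checked exception that genuinely can never fire (the Java spec guarantees UTF-8 support):

```java
import java.io.UnsupportedEncodingException;

public class NeverHappens {
    static byte[] utf8Bytes(String s) {
        try {
            return s.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            // "Will never happen" - so prove it, instead of an empty catch block
            throw new AssertionError(e);
        }
    }
}
```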
And here is how to tell the executor to terminate gracefully (if you fail to do this, it is likely that your VM will not exit):
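(The code image is missing here; a minimal sketch:)

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ExecutorShutdown {
    public static void main(String[] args) {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        exec.execute(() -> System.out.println("task ran"));
        exec.shutdown(); // without this, the non-daemon worker thread keeps the VM alive
    }
}
```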
Failing to do this is also a memory leak. If your application creates executors repeatedly (without shutting them down, even if you dereference them), memory usage will build up, possibly eventually causing the application to fail.
In fact, you can do even better. ConcurrentHashMap is optimized for retrieval operations, such as get. Therefore, it is worth invoking get initially and calling putIfAbsent only if get indicates that it is necessary:
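(The code image is missing here too; the book's version is essentially this String-interning example:)

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class Intern {
    private static final ConcurrentMap<String, String> map = new ConcurrentHashMap<>();

    public static String intern(String s) {
        String result = map.get(s);         // retrieval is ConcurrentHashMap's fast path
        if (result == null) {
            result = map.putIfAbsent(s, s); // returns the prior value if another thread won the race
            if (result == null)
                result = s;
        }
        return result;
    }
}
```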
Not that I have a problem with this, but this kinda _seems_ like it contradicts earlier recommendations to only optimize after knowing it's necessary. I guess when it comes to concurrency, the value of optimizing is higher, making it worth it more often. Almost seems worth explicitly calling that out.