Martin Fowler's Blog, page 25

January 5, 2017

Basics of Web Application Security: Authorize Actions



Authentication means you know who your user is; protecting their session
ensures that information stays correct. Now Cade and Daniel move on to authorization:
checking that users only do what they are allowed to do. Authorization should
always be checked on the server and should deny by default. Actual authorization
schemes are domain-specific, but some common patterns help get you started.
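
As a rough illustration of "deny by default", here is a minimal JavaScript sketch. The role names, the `permissions` table and the `isAuthorized` function are hypothetical and not taken from the article. The only point is that anything not explicitly granted is refused, and that the check is meant to run on the server.

```javascript
// Hypothetical permission table: each role lists the actions it is explicitly granted.
const permissions = new Map([
  ["admin", new Set(["viewOrder", "cancelOrder"])],
  ["customer", new Set(["viewOrder"])],
]);

// Server-side check. Unknown roles and unlisted actions both end up as `false`,
// so the default answer is "deny".
function isAuthorized(role, action) {
  const granted = permissions.get(role);
  return granted !== undefined && granted.has(action);
}

console.log(isAuthorized("customer", "viewOrder"));   // true
console.log(isAuthorized("customer", "cancelOrder")); // false
console.log(isAuthorized("guest", "viewOrder"));      // false
```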



Published on January 05, 2017 06:40

December 21, 2016

Dominion Second Edition



If you don't follow the board game news, you may have missed that the excellent
game Dominion has been upgraded to a second edition. Half-a-dozen cards in the base
and Intrigue sets have been replaced, and if you have the existing game you can get
update packs to ensure you're up to date.



Published on December 21, 2016 18:00

December 3, 2016

photostream 104





Regent's Park, London, England

Published on December 03, 2016 06:55

November 30, 2016

Bliki: FunctionLength

During my career, I've heard many arguments about how long a function should be. This
is a proxy for the more important question - when should we enclose code in its own
function? Some of these guidelines were based on length, such as functions should be no
larger than fit on a screen [1]. Some were based on reuse - any
code used more than once should be put in its own function, but code only used once
should be left inline. The argument that makes most sense to me, however, is the
separation between intention and implementation. If you have to spend effort
looking at a fragment of code to figure out what it's doing, then you should
extract it into a function and name the function after that “what”. That way when you
read it again, the purpose of the function leaps right out at you, and most of the time
you won't need to care about how the function fulfills its purpose - which is the body
of the function.
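
A small JavaScript sketch of that extraction follows. The order structure, the free-shipping rule and the function names are invented for illustration. The first version makes the reader decode the condition. The second names the "what" and leaves the "how" in the function body.

```javascript
// Before: the reader has to decode the condition to learn what it is checking.
function shippingCostInline(order) {
  if (order.total >= 50 && order.country === "US" && !order.items.some(i => i.oversized)) {
    return 0;
  }
  return 7.5;
}

// After: the intention gets a name, the implementation lives in the body.
function qualifiesForFreeShipping(order) {
  return order.total >= 50
    && order.country === "US"
    && !order.items.some(i => i.oversized);
}

function shippingCost(order) {
  return qualifiesForFreeShipping(order) ? 0 : 7.5;
}

const order = {total: 80, country: "US", items: [{oversized: false}]};
console.log(shippingCost(order) === shippingCostInline(order)); // true
```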



Once I accepted this principle, I developed a habit of writing very small functions -
typically only a few lines long [2].
Any function more than half-a-dozen lines of code
starts to smell to me, and it's not unusual for me to have functions that are a single
line of code [3]. The fact that size isn't important was brought
home to me by an example that Kent Beck showed me from the original Smalltalk system.
Smalltalk in those days ran on black-and-white systems. If you wanted to highlight some
text or graphics, you would reverse the video. Smalltalk's graphics class had a method
for this called 'highlight', whose implementation was just a call to the method
'reverse' [4]. The name of the method was longer than its
implementation - but that didn't matter because there was a big distance between the
intention of the code and its implementation.
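
The shape of that method can be sketched in JavaScript. This is a hypothetical rendition, not the Smalltalk source: the `Region` class is invented, and the display is modelled as a single boolean flag.

```javascript
class Region {
  constructor() {
    this.reversed = false;
  }

  // Implementation: flip the pixels (modelled here as a boolean flag).
  reverse() {
    this.reversed = !this.reversed;
  }

  // Intention: callers ask for a highlight and need not know that,
  // on a black-and-white display, highlighting happens to mean reversing.
  highlight() {
    this.reverse();
  }
}

const selection = new Region();
selection.highlight();
console.log(selection.reversed); // true
```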



Some people are concerned about short functions because they are worried about the
performance cost of a function call. When I was young, that was occasionally a factor,
but that's very rare now. Optimizing compilers often work better with shorter functions
which can be cached more easily. As ever, the general guidelines on performance optimization
are what count. Sometimes inlining the function later is what you'll
need to do, but often smaller functions suggest other ways to speed things up. I remember
people objecting to having an isEmpty method for a list when the
common idiom is to use aList.length == 0. But here using the
intention-revealing name on a function may also support better performance if it's
faster to figure out if a collection is empty than to determine its length.
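
A singly linked list is one case where this holds. The sketch below is a hypothetical JavaScript list in which `length` has to walk every node, while `isEmpty` only looks at the head.

```javascript
class LinkedList {
  constructor() {
    this.head = null;
  }

  // Adds a value at the front of the list.
  push(value) {
    this.head = {value, next: this.head};
  }

  // O(n): counts nodes by walking the whole list.
  get length() {
    let count = 0;
    for (let node = this.head; node !== null; node = node.next) {
      count++;
    }
    return count;
  }

  // O(1): only needs to inspect the head.
  isEmpty() {
    return this.head === null;
  }
}

const aList = new LinkedList();
console.log(aList.isEmpty()); // true
aList.push("a");
console.log(aList.isEmpty()); // false
```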



Small functions like this only work if the names are good, so you need to pay good
attention to naming. This takes practice, but once you get good at it, this approach can
make code remarkably self-documenting. Larger scale functions can read like a story, and
the reader can choose which functions to dive into for more detail as she needs it.




Acknowledgements


Brandon Byars, Karthik Krishnan, Kevin Yeung, Luciano Ramalho, Pat Kua, Rebecca Parsons, Serge
Gebhardt, Srikanth Venugopalan, and Steven Lowe
discussed drafts of this post on our internal mailing list.



Christian Pekeler reminded me that nested functions don't fit my sizing observations.





Notes


1:
Or in my first programming job: two pages of line printer paper - around 130 lines
of Fortran IV.





2:
Many languages allow you to use functions to contain other functions. This is often
used as a scope reduction mechanism, such as using the Function as Object
pattern to implement a class. Such functions are naturally much
larger.





3: Length of my functions

Recently I got curious about function length in the toolchain that builds this
website. It's mostly Ruby and runs to about 15 KLOC. Here's a cumulative frequency
plot for the method body lengths.

[Figure: cumulative frequency plot of method body lengths]



As you see there's lots of small methods there - half of the methods in my
codebase are two lines or less. (Lines here are non-comment, non-blank, and
exclude the def and end lines.)



Here's the data in a crude tabular form (I'm feeling too lazy to turn it into
proper HTML tables).




lines    lines.freq  lines.cumfreq  lines.cumrelfreq
[1,2)           875            875         0.4498715
[2,3)           264           1139         0.5856041
[3,4)           195           1334         0.6858612
[4,5)           120           1454         0.7475578
[5,6)           116           1570         0.8071979
[6,7)            69           1639         0.8426735
[7,8)            75           1714         0.8812339
[8,9)            46           1760         0.9048843
[9,10)           50           1810         0.9305913
[10,15)          98           1908         0.9809769
[15,20)          24           1932         0.9933162
[20,50)          12           1944         0.9994859




4:
The example is in Kent's excellent Smalltalk Best Practice Patterns
in Intention Revealing Message.






Published on November 30, 2016 05:58

November 22, 2016

Bliki: HiddenPrecision

Sometimes when I work with some data, that data is more precise than
I expect. One might think that would be a good thing, after all precision is good, so
more is better. But hidden precision can lead to some subtle bugs.



const validityStart = new Date("2016-10-01"); // JavaScript
const validityEnd = new Date("2016-11-08");
const isWithinValidity = aDate => (aDate >= validityStart && aDate <= validityEnd);
const applicationTime = new Date("2016-11-08T08:00:00Z"); // explicit UTC, like the date-only strings above

assert.notOk(isWithinValidity(applicationTime)); // NOT what I want

What happened in the above code is that I intended to create an inclusive date range by
specifying the start and end dates. However I didn't actually specify dates, but
instants in time, so I'm not marking the end date as November 8th, I'm marking the end
as the time 00:00 on November 8th. As a consequence any time (other than midnight)
within November 8th falls outside the date range that's intended to include it.
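
One way to get the intended inclusive range is to compare calendar dates rather than instants. The following is a minimal sketch, assuming all dates are interpreted in UTC. The `toIsoDate` helper is hypothetical, and a date library with a real date type would be a sturdier choice.

```javascript
// Reduce an instant to its UTC calendar date as a "YYYY-MM-DD" string.
// Strings of this form sort in chronological order.
const toIsoDate = instant => instant.toISOString().slice(0, 10);

const validityStart = "2016-10-01";
const validityEnd = "2016-11-08";

// Inclusive on both ends, because only the calendar date is compared.
const isWithinValidity = instant => {
  const day = toIsoDate(instant);
  return day >= validityStart && day <= validityEnd;
};

console.log(isWithinValidity(new Date("2016-11-08T08:00:00Z"))); // true
console.log(isWithinValidity(new Date("2016-11-09T00:00:00Z"))); // false
```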



Hidden precision is a common problem with dates, because it's sadly common to have a
date creation function that actually provides an instant like this. It's an example of
poor naming, and indeed general poor modeling of dates and times.



Dates are a good example of the problems of hidden precision, but another culprit
is floating point numbers.



const tenCharges = [
0.10, 0.10, 0.10, 0.10, 0.10,
0.10, 0.10, 0.10, 0.10, 0.10,
];
const discountThreshold = 1.00;
const totalCharge = tenCharges.reduce((acc, each) => acc + each);
assert.ok(totalCharge < discountThreshold); // NOT what I want

When I just ran it, a log statement showed totalCharge was
0.9999999999999999. This is because floating point doesn't exactly
represent many values, leading to a little invisible precision that can show up at
awkward times.



One conclusion from this is that you should be extremely wary of representing money
with a floating point number. (If you have a fractional currency part like cents, then
usually it's best to use integers on the fractional value, representing €5.00 with 500,
preferably within a money type.)
The more general conclusion is that floating point is tricksy when it comes to
comparisons (which is why test framework asserts for floating point values usually
take a precision for comparisons).
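
The ten-charges example can be redone with integer cents. This is a sketch only: the variable names and the `formatCents` helper are made up for illustration, and the helper handles non-negative amounts only.

```javascript
// Ten charges of 10 cents each, held as integers.
const tenChargesInCents = Array(10).fill(10);
const discountThresholdInCents = 100;

const totalInCents = tenChargesInCents.reduce((acc, each) => acc + each, 0);

console.log(totalInCents);                            // 100
console.log(totalInCents < discountThresholdInCents); // false

// Hypothetical display helper for non-negative amounts.
const formatCents = cents =>
  `${Math.floor(cents / 100)}.${String(cents % 100).padStart(2, "0")}`;

console.log(formatCents(totalInCents)); // "1.00"
```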




Acknowledgements

Arun Murali, James Birnie, Ken McCormack, and Matteo Vaccari
discussed a draft of this post on our internal mailing list.



Published on November 22, 2016 09:54

November 20, 2016

The Thrilling Adventures of Lovelace and Babbage



As a rule, I don't do book reviews. My main area of activity is software
development, and I know too many authors. If I started reviewing books it would be an
endless task. So I just don't do it.



But I can't help writing a few lines about The Thrilling Adventures of
Lovelace and Babbage, drawn and written by Sydney Padua. The book is mostly
graphic novel, but with a sizable dollop of fascinating history thrown in. It opens
with a comic book narrative of the collaboration between Countess Lovelace and
Charles Babbage, which explains why we refer to Countess Lovelace as the first
computer programmer.



Published on November 20, 2016 12:48

photostream 103





Stoneham, MA

Published on November 20, 2016 06:40

November 14, 2016

Bliki: ValueObject

When programming, I often find it's useful to represent things as a compound. A
2D coordinate consists of an x value and y value. An amount of money
consists of a number and a currency. A date range consists of start and end dates, which
themselves can be compounds of year, month, and day.



As I do this, I run into the question of whether two compound objects are the same.
If I have two point objects that both represent the Cartesian coordinates
of (2,3), it makes sense to treat them as equal. Objects that are equal due to the value
of their properties, in this case their x and y coordinates, are called value
objects.



But unless I'm careful when programming, I may not get that
behavior in my programs.



Say I want to represent a point in JavaScript.



const p1 = {x: 2, y: 3};
const p2 = {x: 2, y: 3};
assert.notEqual(p1,p2); // NOT what I want

Sadly that test passes. It does so because JavaScript tests equality for objects
by comparing their references, ignoring the values they contain.



In many situations using references rather than values makes sense. If I'm loading
and manipulating a bunch of sales orders, it makes sense to load each order into a
single place. If I then need to see if Alice's latest order is in the next delivery,
I can take the memory reference, or identity, of Alice's order and see if that reference
is in the list of orders in the delivery. For this test, I don't have to worry about
what's in the order. Similarly I might rely on a unique order number, testing to see if
Alice's order number is on the delivery list.



Therefore I find it useful to think of two classes of object: value objects and reference
objects, depending on how I tell them apart [1]. I need to ensure that I know how I expect each
object to handle equality and to program them so they behave according to my
expectations. How I do that depends on the programming language I'm working in.



Some languages treat all compound data as values. If I make a simple compound in Clojure, it
looks like this.



> (= {:x 2, :y 3} {:x 2, :y 3})
true


That's the functional style - treating everything as immutable values.



But if I'm not in a functional language, I can still often create value objects. In
Java for example, the default point class behaves how I'd like.



assertEquals(new Point(2, 3), new Point(2, 3)); // Java

The way this works is that the point class overrides the default equals
method with the tests for the values. [2] [3]



I can do something similar in JavaScript.



class Point {
constructor(x, y) {
this.x = x;
this.y = y;
}
equals (other) {
return this.x === other.x && this.y === other.y;
}
}


const p1 = new Point(2,3);
const p2 = new Point(2,3);
assert(p1.equals(p2));

The problem with JavaScript here is that this equals method I defined is a mystery to
any other JavaScript library.



const somePoints = [new Point(2,3)];
const p = new Point(2,3);
assert.isFalse(somePoints.includes(p)); // not what I want

//so I have to do this
assert(somePoints.some(i => i.equals(p)));

This isn't an issue in Java because
Object.equals is defined in the core library and all other libraries use it
for comparisons (== is usually used only for primitives).
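
One workaround that sometimes helps in JavaScript is to give the value object a canonical key and hand that key to library collections. The sketch below is not from the article: the `key` method is a hypothetical helper, and it assumes finite numeric coordinates.

```javascript
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }

  equals(other) {
    return this.x === other.x && this.y === other.y;
  }

  // Hypothetical helper: two points with equal finite coordinates yield the same string.
  key() {
    return `${this.x},${this.y}`;
  }
}

// A Set compares strings by value, so the keys behave the way equals does.
const seenKeys = new Set([new Point(2, 3)].map(point => point.key()));

console.log(seenKeys.has(new Point(2, 3).key())); // true
console.log(seenKeys.has(new Point(3, 2).key())); // false
```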



One of the nice consequences of value objects is that I don't need to care about
whether I have a reference to the same object in memory or a different reference with an
equal value. However if I'm not careful that happy ignorance can lead to a problem,
which I'll illustrate with a bit of Java.



Date retirementDate = new Date(Date.parse("Tue 1 Nov 2016"));

// this means we need a retirement party
Date partyDate = retirementDate;

// but that date is a Tuesday, let's party on the weekend
partyDate.setDate(5);

assertEquals(new Date(Date.parse("Sat 5 Nov 2016")), retirementDate);
// oops, now I have to work three more days :-(

This is an example of an Aliasing Bug: I change a date in one place
and it has consequences beyond what I expected [4]. To avoid
aliasing bugs I follow a simple but important rule: value objects should be
immutable. If I want to change my party date, I create a new
object instead.



Date retirementDate = new Date(Date.parse("Tue 1 Nov 2016"));
Date partyDate = retirementDate;

// treat date as immutable
partyDate = new Date(Date.parse("Sat 5 Nov 2016"));

// and I still retire on Tuesday
assertEquals(new Date(Date.parse("Tue 1 Nov 2016")), retirementDate);

Of course, it makes it much easier to treat value objects as immutable if they really
are immutable. With objects I can usually do this by simply not providing any setting
methods. So my earlier JavaScript class would look like this: [5]



class Point {
constructor(x, y) {
this._data = {x: x, y: y};
}
get x() {return this._data.x;}
get y() {return this._data.y;}
equals (other) {
return this.x === other.x && this.y === other.y;
}
}


While immutability is my favorite technique to avoid aliasing bugs, it's also
possible to avoid them by ensuring assignments always make a copy. Some languages
provide this ability, such as structs in C#.



Whether to treat a concept as a reference object or value object depends on your
context. In many situations it's worth treating a postal address as a simple structure
of text with value equality. But a more sophisticated mapping system might link postal
addresses into a sophisticated hierarchic model where references make more sense. As
with most modeling problems, different contexts lead to different solutions. [6]



It's often a good idea to replace common primitives, such as strings, with appropriate
value objects. While I can represent a telephone number as a string, turning it into a
telephone number object makes variables and parameters more explicit (with type checking
when the language supports it), provides a natural focus for validation, and avoids
inapplicable behaviors (such as doing arithmetic on integer id numbers).
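
As a sketch of what such a telephone number object might look like in JavaScript: the class below is illustrative only, and its validation rule (an optional leading plus and 3 to 15 digits after stripping spaces and punctuation) is an assumption, not a real numbering standard.

```javascript
class TelephoneNumber {
  constructor(text) {
    // Strip spaces, parentheses, dots and hyphens before validating.
    const digits = String(text).replace(/[\s().-]/g, "");
    // Assumed rule for illustration: optional "+" followed by 3 to 15 digits.
    if (!/^\+?\d{3,15}$/.test(digits)) {
      throw new Error(`not a telephone number: ${text}`);
    }
    this._digits = digits;
    Object.freeze(this); // value objects should be immutable
  }

  equals(other) {
    return other instanceof TelephoneNumber && this._digits === other._digits;
  }

  toString() {
    return this._digits;
  }
}

const office = new TelephoneNumber("+1 (617) 555-0100");
const sameOffice = new TelephoneNumber("+16175550100");
console.log(office.equals(sameOffice)); // true
```

The constructor is the single place where validation happens, so any `TelephoneNumber` that exists can be assumed well-formed by the rest of the code.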



Small objects, such as points, monies, or ranges, are good examples of value objects.
But larger structures can often be programmed as value objects if they don't have any
conceptual identity or don't need to share references around a program. This is a more
natural fit with functional languages that default to immutability. [7]



I find that value objects, particularly small ones, are often overlooked - seen as
too trivial to be worth thinking about. But once I've spotted a good set of value
objects, I find I can create a rich behavior over them. For a taste of this try using a
Range class and see how it prevents all sorts of duplicate
fiddling with start and end attributes by using richer behaviors. I often run into code
bases where domain-specific value objects like this can act as a focus for refactoring,
leading to a drastic simplification of a system. Such a simplification often surprises
people, until they've seen it a few times - by then it is a good friend.
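
A minimal JavaScript sketch of such a Range class follows. The method names `includes` and `overlaps` and the choice of an inclusive range are assumptions made for illustration. It works for any values that compare with `<=`, such as numbers.

```javascript
class Range {
  constructor(start, end) {
    if (start > end) {
      throw new Error("start must not be after end");
    }
    this.start = start;
    this.end = end;
    Object.freeze(this); // immutable value object
  }

  // Inclusive on both ends.
  includes(value) {
    return value >= this.start && value <= this.end;
  }

  // True when the two ranges share at least one value.
  overlaps(other) {
    return this.start <= other.end && other.start <= this.end;
  }

  equals(other) {
    return this.start === other.start && this.end === other.end;
  }
}

const morning = new Range(9, 12);
console.log(morning.includes(12));                // true
console.log(morning.overlaps(new Range(12, 14))); // true
console.log(morning.overlaps(new Range(13, 14))); // false
```

Callers ask the range a question instead of comparing start and end attributes themselves, which is where the duplicated fiddling tends to disappear.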




Acknowledgements

James Shore, Beth Andres-Beck, and Pete Hodgson shared their experiences of using
value objects in JavaScript.





Graham Brooks, James Birnie, Jeroen Soeters, Mariano Giuffrida, Matteo Vaccari, Ricardo
Cavalcanti, and Steven Lowe
provided valuable comments on our internal mailing lists.





Further Reading

Vaughn Vernon's description is probably the best in-depth
discussion of value objects
from a DDD perspective. He covers how to decide
between values and entities, implementation tips, and the techniques for persisting
value objects.



The term started gaining traction in the early noughties. Two books that talk about
them from that time are PoEAA and DDD. There was also some interesting discussion on Ward's Wiki.



One source of terminological confusion is that around the turn of the century some
J2EE literature used "value object" for Data Transfer Object. That usage has
mostly disappeared by now, but you might run into it.





Notes


1:
In Domain-Driven Design the Evans Classification contrasts value
objects with entities. I consider entities to be a common form of reference object,
but use the term "entity" only within domain models while the reference/value object
dichotomy is useful for all code.





2:
Strictly this is done in java.awt.geom.Point2D, which is a superclass of java.awt.Point.





3:
Most object comparisons in Java are done with equals - which is
itself a bit awkward since I have to remember to use that rather than the equals
operator ==. This is annoying, but Java programmers soon get used to it
since String behaves the same way. Other OO languages can avoid this - Ruby uses the
== operator, but allows it to be overridden.





4:
There is robust competition for the worst feature of the pre-Java-8 date and
time system - but my vote would be this one. Thankfully we can avoid most of
this now with Java 8's java.time package.





5:
This isn't strictly immutable since a client can manipulate the _data
property. But a suitably disciplined team can make it immutable in practice.
If I was concerned that a team wouldn't be disciplined enough I might use
Object.freeze. Indeed I could just use Object.freeze on a simple JavaScript object,
but I prefer the explicitness of a class with declared accessors.
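
A minimal sketch of the freeze option, using the standard `Object.freeze`. Note that the freeze is shallow: it protects the object's own properties, not any objects those properties refer to.

```javascript
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
    Object.freeze(this); // later writes to x and y are rejected
  }

  equals(other) {
    return this.x === other.x && this.y === other.y;
  }
}

const p = new Point(2, 3);
try {
  p.x = 10; // throws a TypeError in strict mode, is silently ignored otherwise
} catch (e) {
  // strict-mode code (for example an ES module) ends up here
}

console.log(p.x);                // 2
console.log(Object.isFrozen(p)); // true
```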





6:
There is more discussion of this in Evans's DDD book.





7:
Immutability is valuable for reference objects too - if a sales order doesn't change
during a get request, then making it immutable is valuable; and that would make it
safe to copy it, if that were useful. But that wouldn't make the sales order be a
value object if I'm determining equality based on a unique order number.






Published on November 14, 2016 07:39

Bliki: AliasingBug

Aliasing occurs when the same memory location is accessed through more than one
reference. Often this is a good thing, but frequently it occurs in an unexpected way,
which leads to confusing bugs.



Here's a simple example of the bug.



Date retirementDate = new Date(Date.parse("Tue 1 Nov 2016"));

// this means we need a retirement party
Date partyDate = retirementDate;

// but that date is a Tuesday, let's party on the weekend
partyDate.setDate(5);

assertEquals(new Date(Date.parse("Sat 5 Nov 2016")), retirementDate);
// oops, now I have to work three more days :-(

What's happening here is that when we do the assignment, the partyDate variable is
assigned a reference to the same object that the retirementDate variable refers to. If I then
alter the internals of that object (with setDate) then both variables are
updated, since they refer to the same thing.
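
JavaScript's built-in Date is mutable too, so the same bug can be reproduced there. The sketch below is a JavaScript rendition of the example, followed by one way to sidestep it: make a copy before changing anything. The variable names with a `safe` prefix are made up for illustration.

```javascript
// Aliasing: both variables refer to the same Date object.
const retirementDate = new Date(2016, 10, 1); // 1 Nov 2016 (months are zero-based)
const partyDate = retirementDate;
partyDate.setDate(5);
console.log(retirementDate.getDate()); // 5, the retirement date moved as well

// Copying: the party date is a separate object with the same initial value.
const safeRetirementDate = new Date(2016, 10, 1);
const safePartyDate = new Date(safeRetirementDate.getTime());
safePartyDate.setDate(5);
console.log(safeRetirementDate.getDate()); // 1, unchanged
```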





Although aliasing is a problem in that example, in other contexts it's what I expect.



Person me = new Person("Martin");
me.setPhoneNumber("1234");
Person articleAuthor = me;
me.setPhoneNumber("999");
assertEquals("999", articleAuthor.getPhoneNumber());

It's common to want to share records like this, and then if it changes, it changes
for all references. This is why it's useful to think of reference objects, which
we deliberately share [1], and Value Objects, for which we don't want this kind of
shared update behavior. A good way to
avoid shared updates of value objects is to make value objects immutable.



Functional languages, of course, prefer everything to be immutable. So if we want
changes to be shared, we need to handle that as the exception rather than the rule.
Immutability is a handy property, one that makes it harder to create several kinds of
bugs. But when things do need to change, immutability can introduce complexity, so it's
by no means a free breakfast.




Acknowledgements

Graham Brooks and James Birnie's comments on our internal mailing list led me to
write this post.





Further Reading

The term aliasing bug has been around for a while. It appears in Eric Raymond's
Jargon file
in the context of the C language where the raw memory accesses make it even more
unpleasant.





Notes


1:
The Evans Classification has the notion of Entity, which I see as a common form of
reference object.






Published on November 14, 2016 07:38

October 11, 2016

Vote Against Trump

In my writing, I don't usually get into US party politics. I have Opinions, but
most political discussion quickly deteriorates into partisan bickering, which I find
unsatisfying. But this presidential election is striking. Donald Trump is a demagogue
who could do a lot of damage to both the US and the rest of the world. If you're an
American who is undecided about who to vote for, or wishes to vote for a third party
candidate, I feel I must explain why he is uniquely dangerous, and therefore why it
is necessary to vote for Mrs Clinton.



Published on October 11, 2016 12:59
