Martin Fowler's Blog, page 39
August 9, 2014
photostream 71
August 7, 2014
Retreaded: FaultyTechniqueDichotomy
Retread of post originally made on 05 Aug 2004
My main inspiration in life is trying to capture and improve the
way in which we do software development. So I spend a lot of time
talking to people about various techniques they've used, which ones
work well and which ones suck.
As I do this, I often hear about faulty techniques: "FIT wasn't
worth the effort", "never put any logic in stored procedures", "test
driven design led to a chaotic mess". The problem with any report of a
faulty technique is figuring out whether the technique itself is faulty,
or whether the application of the technique was faulty.
Let's take a couple of examples. Several friends of mine
commented how stored procedures were a disaster because they weren't
kept in version control (instead they had names like GetCust01,
GetCust02, GetCust02B etc). That's not a problem with stored
procedures, that's a problem with people not using them properly.
Similarly, a criticism that TDD led to a brittle design turned out,
on further questioning, to come from a team that hadn't done any
refactoring - and refactoring is a critical step in TDD.
Of course if you take all this too far, you get the opposite
effect. I often say "no methodology has ever failed". My reason for
this is that given any failure (assuming you can know
WhatIsFailure) you can find some variation from the
methodology. Hence the methodology wasn't followed and therefore
didn't fail. This issue is compounded even further with self-adaptive
agile methods.
So when you hear of techniques failing, you need to ask a lot
more questions.
Was it the technique itself that had problems, or was something
else being missed out? Does the technique have an influence on
this? (Version control is a separate thing from stored procedures, but
it can be harder to use version control with stored procedures due to
the nature of the tools involved.)
Was the technique used in a context that wasn't suitable for
it? (Don't use wide-scale manual refactoring when you don't have
tests.) Remember that software development is a very human activity;
often techniques aren't suitable for a context because of culture and
personality.
Were important pieces missed out of the technique?
Were people focused on outward signs that didn't correspond to
the reality? This kind of thing is what Steve McConnell called Cargo Cult
Software Engineering.
An interesting aspect of this is whether certain techniques are
fragile; that is they are hard to apply correctly and thus more prone
to a faulty application. If it's hard to use a technique properly,
that's a reasonable limitation on the technique, reducing the contexts
in which it can be used.
There's no simple answer to this problem, since with these
techniques we are as unable to measure compliance as we are to
measure their success. The important thing is this: whenever
you hear of a technique failing, always remember the dichotomy.
reposted on 06 Aug 2014
August 5, 2014
Retreaded: PreferFunctionalStaffOrganization
Retread of post originally made on 02 Aug 2004
For as long as I've been in software there's been a debate
between FunctionalStaffOrganization and
TechnicalStaffOrganization. The debate occurs within project teams,
and across whole IT organizations. It's a constant debate because
both sides have good logical arguments to support them, and there's
no real way to test which has an advantage in practice.
Despite the fact that I acknowledge this, I greatly prefer a
functional organization. I say this knowing there are exceptions and
you can't follow one route all the time. But I'd rather side with
too much functional orientation than the other way around.
For me the compelling factor is the alignment of
application teams to business value. I very much believe in the
irresistibility of Conway's
Law and see the setting of an ApplicationBoundary to be
primarily a social construct. Since the whole point of software
development is to serve its customers, then the organization should
reflect this - yielding teams that are focused on providing business
value rather than delving deep into technical esoterica.
Fundamentally the argument of
TechnicalStaffOrganization rests on efficiency - that it's
wasteful to duplicate systems and that people are more efficient if
they specialize. I won't deny that you get duplication if you use a
functional organization, but I'm not so convinced it's so
wasteful. After all many people believed a centralized, planned
economy was bound to be more efficient than the wasteful duplication
of capitalist competition. I'm wary of stretching the parallel
between macro-economics and software development too far, but I suspect the
same issue underlies each of them - human motivation. When people are
focused away from solving business problems, all sorts of factors
creep in that introduce inefficiencies far greater than the
duplication that a technical organization is designed to solve.
I also think there's an inevitability here. Business units need
applications, if they don't get them they will build their own,
creating a functional organization by default. After all they have
the money and power - essentially the same dynamics as drive the
boom-bust cycle of EnterpriseArchitecture.
So I think that most of the time you should organize
functionally. But that doesn't mean that you should be blind to the
problems of the approach. Duplicate work and lack of technical
specialization will be serious problems, and you'll need to do things to
counter those problems. They're just better problems to have than
the alternative.
reposted on 04 Aug 2014
August 1, 2014
Retreaded: ComposedRegex
Retread of post originally made on 24 Jul 2009
One of the most powerful tools in writing maintainable code is
to break large methods into well-named smaller methods - a technique
Kent Beck refers to as the Composed Method pattern.
People can read your programs much more quickly and accurately
if they can understand them in detail, then chunk those details
into higher level structures.
-- Kent Beck
What works for methods often works for other things as well. One
area where I've occasionally seen people fail to do this is with
regular expressions.
Let's say you have a file full of rules for scoring frequent
sleeper points for a hotel chain. The rules all look rather like:
score 400 for 2 nights at Minas Tirith Airport
We need to pull out the points (400), the number of nights (2), and
the hotel name (Minas Tirith Airport) from each of these rows.
This is an obvious task for a regex, and I'm sure right now
you're thinking - oh yes we need:
const string pattern =
@"^score\s+(\d+)\s+for\s+(\d+)\s+nights?\s+at\s+(.*)";
Then our three values just pop out of the groups.
I don't know whether or not you're comfortable in understanding
how that regex works and whether it's correct. If you're like me you
have to look at a regex like this and carefully figure out what it's
saying. I often find myself counting parentheses so I can see where
the groups line up (not actually that hard in this case, but I've
seen plenty of others where it's tougher).
You may have read advice to take a pattern like this and
comment it. (This usually needs a switch when you turn it into a
regex - RegexOptions.IgnorePatternWhitespace in .NET - so the engine
ignores the whitespace and comments.) That way you can write it like this.
protected override string GetPattern() {
    const string pattern =
        @"^score
          \s+
          (\d+)   # points
          \s+
          for
          \s+
          (\d+)   # number of nights
          \s+
          night
          s?      # optional plural
          \s+
          at
          \s+
          (.*)    # hotel name
        ";
    return pattern;
}
This is easier to follow, but comments never quite satisfy
me. Occasionally I've been accused of saying comments are bad, and
that you shouldn't use them. This is wrong, in both senses.
Comments are not bad - but there are often better options. I always
try to write code that doesn't need comments, usually by good
naming and structure. (I can't always succeed, but I feel I do more
often than not.)
People often don't try to structure regexs, but I find it
useful. Here's one way of doing this one.
const string scoreKeyword = @"^score\s+";
const string numberOfPoints = @"(\d+)";
const string forKeyword = @"\s+for\s+";
const string numberOfNights = @"(\d+)";
const string nightsAtKeyword = @"\s+nights?\s+at\s+";
const string hotelName = @"(.*)";
const string pattern = scoreKeyword + numberOfPoints +
forKeyword + numberOfNights + nightsAtKeyword + hotelName;
I've broken down the pattern into logical chunks and put them
together again at the end. I can now look at that final expression
and understand the basic chunks of the expression, diving into the
regex for each one to see the details.
Here's another alternative that seeks to separate out the whitespace
to make the actual regexs look more like tokens.
const string space = @"\s+";
const string start = "^";
const string numberOfPoints = @"(\d+)";
const string numberOfNights = @"(\d+)";
const string nightsAtKeyword = @"nights?\s+at";
const string hotelName = @"(.*)";
const string pattern = start + "score" + space + numberOfPoints + space +
"for" + space + numberOfNights + space + nightsAtKeyword +
space + hotelName;
I find this makes the individual tokens a bit clearer, but all
those space variables make the overall structure harder to
follow. So I prefer the previous one.
But this does raise a question. All of the elements are separated
by space, and putting in lots of space variables or \s+
in the patterns feels wet. The nice thing about breaking out the
regexs into substrings is that I can now use the programming logic
to come up with abstractions that suit my particular purpose
better. I can write a method that will take substrings and join
them up with whitespace.
private String composePattern(params String[] arg) {
return "^" + String.Join(@"\s+", arg);
}
Using this method, I then have:
const string numberOfPoints = @"(\d+)";
const string numberOfNights = @"(\d+)";
const string hotelName = @"(.*)";
const string pattern = composePattern("score", numberOfPoints,
"for", numberOfNights, "nights?", "at", hotelName);
You may not use any of these alternatives exactly as they are, but I
do urge you to think about how to make regular expressions
clearer. Code should not need to be figured out, it should just be
read.
Updates
In this discussion I've made the elements for the composed
regexs be local variables. A variation is to take commonly used
regex elements and use them more widely. This can be handy for
common regexs that are needed in lots of places. My colleague
Carlos Villela comments that one thing to watch out for is
fragments that are not well-formed, i.e. having an opening
parenthesis that's closed in another fragment. This can be tricky
to debug. I've not felt the need to do it, so haven't run into
this problem.
A few people mentioned using fluent interfaces (internal DSLs)
as a more readable alternative
to regexs. I see this as a separate thing. Regexs don't bother
me if they are small, indeed I prefer a small regex to an
equivalent fluent interface. It's the composition that counts,
which you can do with either technique.
Some others mentioned named capture groups. Like comments, I
find these are better than the raw regex, but still find a
composed structure more readable. The point of composition is that
it breaks the overall regex into small pieces that are easier to
understand.
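To illustrate that comparison (my own sketch, not from the original post), here is the same pattern with Python-style named groups; .NET uses the (?&lt;name&gt;...) syntax instead:

```python
import re

# Named capture groups label each group instead of relying on position.
pattern = (r"^score\s+(?P<points>\d+)\s+for\s+(?P<nights>\d+)"
           r"\s+nights?\s+at\s+(?P<hotel>.*)")

m = re.match(pattern, "score 400 for 2 nights at Minas Tirith Airport")
# m["points"] -> '400', m["nights"] -> '2', m["hotel"] -> 'Minas Tirith Airport'
```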
reposted on 31 Jul 2014
July 29, 2014
Final part of Collection Pipelines

In this final installment I touch on laziness, parallelism, and immutability then conclude by outlining when we should use collection pipelines.
July 26, 2014
photostream 70
July 24, 2014
Part 4 of Collection Pipelines: alternatives

In this installment I look at alternatives to using a collection pipeline: loops and comprehensions.
July 23, 2014
Part 3 of Collection Pipelines: complex example inverting a many-to-many relationship

I’ve added a more complex example to the article, one that inverts a many-to-many relationship. This also raises the question of how to factor more complex pipelines.
July 22, 2014
Collection Pipelines
I’ve often come across a pattern in code where you organize some computation by passing collections through a pipeline of operations. I first came across it in Unix, did it in Smalltalk and Ruby, and find it common in functional programming. I’ve written an article to describe this pattern, and this is the first installment which contains an initial introduction and a definition of the pattern.
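As a minimal sketch of the idea (my own example, not from the article): each operation takes a collection and feeds its result to the next.

```python
orders = [
    {"customer": "ana", "amount": 120},
    {"customer": "bo",  "amount": 45},
    {"customer": "ana", "amount": 80},
]

# Pipeline: filter the large orders, map to their amounts, then sum.
large = filter(lambda o: o["amount"] > 50, orders)
amounts = map(lambda o: o["amount"], large)
total = sum(amounts)  # 200
```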
Part 2 of Collection Pipelines: a couple more examples

I’ve added two more examples of collection pipelines with this second installment. The first is the classic combination of map and reduce, also introducing specifying functions with names as well as lambdas. The second introduces the group-by operation and treating hashmaps as key-value pairs.
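A hedged sketch of those two operations in Python (my own example, not taken from the installment):

```python
from functools import reduce
from itertools import groupby

words = ["apple", "avocado", "banana", "cherry", "citron"]

# map + reduce: total length of all the words.
total_length = reduce(lambda acc, n: acc + n, map(len, words), 0)  # 30

# group-by: a hashmap of first letter -> words, treated as key-value
# pairs. itertools.groupby needs its input sorted by the grouping key.
by_letter = {key: list(group)
             for key, group in groupby(sorted(words), key=lambda w: w[0])}
# {'a': ['apple', 'avocado'], 'b': ['banana'], 'c': ['cherry', 'citron']}
```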