Martin Fowler's Blog, page 6

Distributed Systems Pattern: Lease

Cluster nodes need exclusive access to certain resources. But nodes can crash, become temporarily disconnected, or experience a process pause. Under these error scenarios, they should not keep access to a resource indefinitely.
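
As a rough illustration of the idea, here is a minimal single-process sketch of time-limited exclusive access. The `LeaseTable` class, its method names, and the injectable `clock` parameter are assumptions made for this example; they are not taken from the pattern's write-up, and the sketch ignores replication of the lease table and clock drift between nodes.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class LeaseTable:
    """Grants time-limited exclusive access to named resources (illustrative only)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        # resource name -> (holder id, expiry time on the injected clock)
        self._leases: Dict[str, Tuple[str, float]] = {}

    def holder(self, resource: str) -> Optional[str]:
        """Return the current holder, or None if the lease is absent or expired."""
        entry = self._leases.get(resource)
        if entry is None or entry[1] <= self.clock():
            return None
        return entry[0]

    def acquire(self, resource: str, node: str) -> bool:
        """Grant or renew the lease unless another node holds an unexpired one."""
        current = self.holder(resource)
        if current is not None and current != node:
            return False
        self._leases[resource] = (node, self.clock() + self.ttl)
        return True


if __name__ == "__main__":
    now = [0.0]  # fake clock so the demo does not need to sleep
    table = LeaseTable(ttl_seconds=5.0, clock=lambda: now[0])
    print(table.acquire("shard-1", "node-a"))  # True: nobody holds the lease
    print(table.acquire("shard-1", "node-b"))  # False: node-a's lease is live
    now[0] = 6.0  # node-a crashed and never renewed
    print(table.acquire("shard-1", "node-b"))  # True: the old lease expired
```

A holder that stays healthy renews by calling `acquire` again before the expiry; a holder that crashes or pauses simply stops renewing, so the resource frees itself without anyone having to detect the failure.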

View more on Martin Fowler's website »

Published on January 13, 2021 07:33

The Lies that can Undermine Democracy

Like many Americans, I was transfixed and horrified by the recent assault on the Capitol. Much of this anger originates in lies perpetrated by irresponsible politicians and spread through media agencies. Lies like this can destroy democracies, and while we must have free speech, we must not be free of the consequences of that speech.

Published on January 12, 2021 06:56

Maximizing Developer Effectiveness: Feedback Loops

Tim continues his comparison of high- and low-effectiveness
organizations by comparing their key feedback loops. To improve these,
organizations need to understand the importance of micro feedback loops,
which are often neglected because they are so small.

Published on January 06, 2021 07:05

Some more Distributed Systems Patterns

Unmesh Joshi has a few more of his Patterns of Distributed Systems ready to share with the world.

Consistent Core looks at how a large cluster can keep some information strongly consistent, Lease allows unreliable nodes to access limited resources without blocking them when they fail, and State Watch allows clients to be notified of changes on a server.

Published on January 05, 2021 06:43

Distributed Systems Pattern: Consistent Core

Unmesh has a few more of his Patterns of Distributed Systems ready to
share with the world. In this one he looks at the problem of a large
cluster, one that is too large to effectively maintain strong consistency,
yet needs to maintain some data in a strongly consistent way. It can do
this by using a smaller cluster, which he calls a Consistent Core.

Published on January 05, 2021 06:43

Maximizing Developer Effectiveness

My colleague Tim Cochran has helped many software engineering
organizations transform to respond faster to changing market needs. Often
companies struggle with these transformations, and a primary reason for
these problems is that the engineering organization has neglected to provide
developers with an effective working environment. The key
to developing an effective environment is to concentrate on feedback loops.

In this first installment, Tim contrasts a developer's day between
high-effectiveness and low-effectiveness environments, using this contrast
to show that poor organizations need to remove the common frictions that make
developers feel unproductive.

Published on January 05, 2021 06:40

My favorite musical discoveries of 2020

Like most people, I'm looking forward to seeing 2020 in the rear-view
mirror, but even this ugly year has brought some good things. For the last
three decades I've regularly bought a few albums every month, and I
thought I'd pick out a half-dozen favorites in the hope that they lead
some readers to share at least a bit of my musical tastes. I've been doing
most of my musical buying on Bandcamp, so you can easily sample them.

Published on December 22, 2020 09:56

Data Mesh Principles and Logical Architecture

Last year, my colleague Zhamak Dehghani introduced the notion of the Data Mesh, shifting from
the notion of a centralized data lake to a distributed vision of
data. Based on more thinking, and the lessons of a year's worth of work
with clients, she's now written an article outlining four
foundational principles of a data mesh, and how they drive a
logical architecture.

Published on December 03, 2020 07:27

Don't put data science notebooks into production

We've come across many clients who are interested in taking the
computational notebooks developed by their data scientists, and putting
them directly into the codebase of production applications. My colleague
David Johnston points out that while data science
ideas do need to move out of notebooks and into production, trying to
deploy those notebooks as a code artifact breaks a multitude of good
software practices. Predictably, that results in a number of observed pain
points. This behavior is a symptom of a deeper problem: a lack of
collaboration between data scientists and software developers.

Published on November 18, 2020 07:25

Bliki: ComputationalNotebook

A computational notebook is an environment for writing a prose
document that allows the author to embed code which can be easily executed
with the results also incorporated into the document. It's a platform
particularly well-suited for data science work. Such environments include
Jupyter Notebook, R Markdown, Mathematica, and Emacs's org-mode.

When I'm exploring some data, it's useful to keep my notes close together
with the code that performs the exploration. I like to try some code, look at
the results, and note down any observations I have from that execution. A
computational notebook allows me to combine these together easily in a single document.

Here's an example of this, looking at some analysis of my Google Analytics
data for martinfowler.com. I'm doing this in RStudio, which uses the R
Markdown format.

The example output here is a graph, as notebooks are well suited for plotting
various charts. But it's just as useful to embed various data manipulations in
the code and display the data in the document as a table.

I first encountered a computational notebook in the late 1980s with
Mathematica. I remember wishing I'd had access to such a tool during my
university degree, but didn't use a computational notebook again until recent
years, with the rise of their use in data science circles. The notebook
software I hear most about is Jupyter Notebook, which is popular in the Python
community, but as I do my data munging with R I tend to use R Markdown,
usually within RStudio. I also use a rather more niche notebook, org-mode,
which is part of Emacs.

The code embedded in Mathematica is its own programming language, designed for expressing
mathematics. Although Jupyter began in the Python world, it supports a wide
range of programming languages, as does R Markdown. Mathematica is a
commercial tool, but Jupyter and R Markdown are open source. Jupyter stores
its files in JSON, while R Markdown uses markdown files with some special markup for
the code blocks. Using a text format for the documents allows them to be
stored in regular version control tools, and using a markup language makes
diffing easier. Using a markup language allows the possibility of editing the
documents in other editors, but they need to have a suitable environment for
executing the code blocks.
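
To make the point about text formats concrete, the sketch below builds a tiny notebook in what I understand to be the nbformat 4 JSON layout used by Jupyter, serializes it, and diffs two versions with Python's standard `difflib`. The cell contents, the cell ids, and the helper names `notebook_to_text`, `code_sources`, and `diff_notebooks` are invented for the illustration.

```python
import difflib
import json

# A minimal notebook: one prose cell and one code cell (contents are made up).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "id": "intro",
            "metadata": {},
            "source": ["# Traffic notes\n"],
        },
        {
            "cell_type": "code",
            "id": "calc",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["visits = [120, 95, 143]\n", "sum(visits)"],
        },
    ],
}


def notebook_to_text(nb: dict) -> str:
    """Serialize with stable key order so version control sees small diffs."""
    return json.dumps(nb, indent=1, sort_keys=True) + "\n"


def code_sources(nb: dict) -> list:
    """Pull out just the code, which is what another tool would need to execute."""
    return ["".join(cell["source"]) for cell in nb["cells"] if cell["cell_type"] == "code"]


def diff_notebooks(old: dict, new: dict) -> list:
    """Line-based diff of two notebooks, as a regular text diff tool would show it."""
    return list(
        difflib.unified_diff(
            notebook_to_text(old).splitlines(),
            notebook_to_text(new).splitlines(),
            fromfile="old.ipynb",
            tofile="new.ipynb",
            lineterm="",
        )
    )
```

Because the file is plain text, any editor can open it and a one-line code change shows up as a one-line diff, but running the cells still needs an environment that knows how to execute them.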

Computational notebooks are useful when exploring a problem, such as
trying various forms of analysis on a dataset. The document acts as a record
of what's been tried and all the observations the researcher makes as they try
things. By keeping the code and results together the writer can see exactly
what they did and what results that generated. This coupling of code and
results is a form of IllustrativeProgramming, making the
environment appealing to lay programmers. One thing to be wary of,
however, is if any external environmental factors change the result - such as
the contents of a database. If the dataset isn't too large it can be exported
and kept in the version control system, but often its size is prohibitive.

Notebooks are also useful for preparing reports, usually by generating a
document in PDF, HTML, or other formats. If I want to report to an author on
the traffic for their article, I take the last such report, change the subject
URL, rerun all the code, and tweak any prose commentary I think is
appropriate. If I were sufficiently motivated I could auto-generate such
reports every few months. I like that such reports can easily include the code
used to generate the results, so readers can accurately understand the logic
behind the figures they see.

Notebooks shouldn't be used, however, as a component of a production
system. The notebook structure - with its casual mix of IO, calculation, and
UI - is there to encourage interactivity, but works against the modularity
needed for code that is used as part of a broader code base. It's best to
think of notebooks as a way of exploring logic; once you've found a path, that
logic should be replicated into a library designed for production use.
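
One way to picture that replication step is the sketch below, which pulls IO, calculation, and presentation apart into separate functions. The page-views scenario, the CSV column names `page` and `views`, and the function names are all hypothetical; the shape of the split is the point, not the specifics.

```python
import csv
import io
from typing import Dict, Iterable, List, TextIO, Tuple


def read_rows(handle: TextIO) -> List[Dict[str, str]]:
    """IO only: parse CSV rows from any file-like object."""
    return list(csv.DictReader(handle))


def top_pages(rows: Iterable[Dict[str, str]], limit: int = 3) -> List[Tuple[str, int]]:
    """Calculation only: total views per page, highest first, ties by page name."""
    totals: Dict[str, int] = {}
    for row in rows:
        totals[row["page"]] = totals.get(row["page"], 0) + int(row["views"])
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))[:limit]


def format_report(ranking: List[Tuple[str, int]]) -> str:
    """Presentation only: render the ranking as plain text."""
    return "\n".join(f"{page}: {views}" for page, views in ranking)


if __name__ == "__main__":
    sample = io.StringIO("page,views\n/a,10\n/b,5\n/a,3\n")
    print(format_report(top_pages(read_rows(sample))))
    # prints:
    # /a: 13
    # /b: 5
```

In a notebook these three steps would typically sit interleaved in a few cells; once split, `top_pages` can be unit tested and reused without touching files or output.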

Published on November 18, 2020 07:24