How to Measure Anything: Finding the Value of Intangibles in Business
higher than can be attributed to pure chance.
This effect occurs even when the initially perceived positive or negative attribute should be unrelated to subsequent evaluations. An experiment conducted by Robert Kaplan of San Diego State University shows how physical attractiveness causes graders to give essay writers better evaluations on their essays.1 Subjects were asked to grade an essay written by a student. A photograph of the student was provided with the essay. The grade given for the essay correlated strongly with a subjective attractiveness scale evaluated by other judges. What is interesting is that all the subjects received the exact same essay, and the photograph attached
it is actually changing one’s preferences midcourse in the analysis of a decision in a way that supports a forming opinion.
In 1954, he [Paul Meehl] stunned the psychiatric profession with his monumental, classic book, Clinical versus Statistical Prediction.
In predicting college freshman GPAs, a simple linear model of high school rank and aptitude tests outperformed experienced admissions staff.
In predicting the recidivism of criminals, criminal records and prison records outperformed criminologists.
The academic performance of medical school students was better predicted with simple models based on past academic performance t...
In a World War II study of predictions of how well Navy recruits would perform in boot camp, models based on high school records and aptitude tests outperformed expert interviewers. Even when the interviewers were given the same data, the predictio...
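The "simple linear models" in these findings are just weighted sums of a small number of objective inputs. A minimal sketch in Python of the GPA example; the variable names and weights are invented for illustration, not taken from the studies cited:

# Illustrative only: a "simple linear model" here means a weighted sum of a few
# objective inputs, e.g. predicting freshman GPA from high school percentile
# rank and an aptitude test score. These weights are made up; in the studies
# above they were fit to historical outcome data.
def predict_gpa(hs_percentile_rank, aptitude_score):
    return 0.8 + 0.015 * hs_percentile_rank + 0.0025 * aptitude_score

print(predict_gpa(85, 620))  # hypothetical applicant -> about 3.6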
“the illusion of learning.” They feel as if their judgments must be getting better with time.
A study of experts in horse racing found that as they were given more data about horses, their confidence in their prediction about the outcomes of races improved. Those who were given some data performed better than those who were given none. But as the amount of data they were given increased, actual performance began to level off and even degrade. However, their confidence in their predictions continued to go up even after the information load was making the predictions worse.5
Another study shows that, up to a point, seeking input from others about a decision may improve decisions, but beyond that point the decisions actually get slightly worse as the expert collaborates with ...
continues to increase even after the decisions ha...
Dr. Ram felt the error being introduced into the evaluation process was, at this point, mostly one of inconsistently presented data. Almost any improvement in simply organizing and presenting the data in an orderly format would be a benefit. To improve on this situation, he simply organized all the relevant data on faculty performance and presented it in a large matrix. Each row is a faculty member, and each column is a particular category of professional accomplishments (awards, publications, etc.).
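A small Python sketch of that layout, with hypothetical faculty members and categories:

# Hypothetical example of the matrix described above: one row per faculty
# member, one column per category of professional accomplishment.
faculty_matrix = {
    "A. Sharma": {"Awards": 1, "Publications": 12, "Committees": 3},
    "B. Okafor": {"Awards": 0, "Publications": 7, "Committees": 5},
    "C. Diaz": {"Awards": 2, "Publications": 19, "Committees": 1},
}
for name, row in faculty_matrix.items():
    print(name, row)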
While I used to categorically dismiss the value of weighted scores as something no better than astrology, subsequent research has convinced me that they may offer some benefit after all. (And any fair researcher should always be able to say that sufficient empirical evidence would change their mind.)
How to Standardize Any Evaluation: Rasch Models
One expert in the educational testing field told me about something he called “invariant comparison”—a feature of measurement he considered so basic that it was simply “measurement fundamentals, statistics 101 stuff.”
The concept of invariant comparison deals with a key problem central to many human performance tests, such as the IQ test. “Invariant comparison” is a principle that says if one measurement instrument says A is more than B, then another measurement instrument should give the same answer.
Yet this is exactly what could happen with an IQ test or any other test of human performance. It is possible for one IQ test, having different questions, to give a very different result from another type of IQ test. Therefore, it is possible for Bob to score higher than Sherry on one test and lower on another.
The comparison of different project managers would not be invariant (i.e., independent) of who judged them or the projects they were judged on. In fact, the overriding determinant of their relative standing among project
Previously, the best predictor of who was given certification was simply the judge and the cases the candidates were randomly assigned to, not, as we might hope, the proficiency of the candidate. In other words, lenient examiners were very likely to pass incompetent candidates.
Rasch scores for each judge, case, and candidate for each skill category. Using this approach, it was possible to predict whether a candidate would have passed with an average judge with an average case even if the candidate had a lenient judge and an easy case (or a hard judge and a hard case). Now variance due to judges or case difficulty can be completely removed from consideration in the certification process. None too soon for the general public, I’m sure.
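A minimal Python sketch of the logistic form that underlies this kind of many-facet Rasch adjustment; the parameter values below are hypothetical, not figures from the certification study:

import math

def pass_probability(candidate_ability, judge_severity, case_difficulty):
    # Rasch-style (logistic) model: the log-odds of passing are the candidate's
    # ability minus the judge's severity and the case's difficulty, all in logits.
    logit = candidate_ability - judge_severity - case_difficulty
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical parameters estimated from historical ratings.
candidate = 1.2        # candidate ability
lenient_judge = -0.8   # negative severity = a lenient judge
easy_case = -0.5       # negative difficulty = an easy case

print(pass_probability(candidate, lenient_judge, easy_case))  # the situation observed
print(pass_probability(candidate, 0.0, 0.0))                  # average judge, average case

Setting the judge and case parameters to zero is what "would the candidate have passed with an average judge and an average case" amounts to in this framing.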
“We should be less like geologists and more like cartographers.”
Amazingly, he also found that the formula, while simply based on expert judgments and no objective historical data, was better than the expert at making these judgments.
In other words, the formula based only on analysis of expert judgments would predict better than the expert in such problems as who would do well in graduate school or which tumor was malignant.
to military logistics, cybersecurity, estimating movie box office receipts, prioritizing R&D projects, and even in prioritizing agricultural projects in the developing world.
The Lens Model does this by removing the error of judge inconsistency from the evaluations. The evaluations of experts usually vary even in identical situations. As discussed at the beginning of this chapter, human experts can be influenced by a variety of irrelevant factors yet still maintain the illusion of learning and expertise. The linear model of the expert’s evaluation, however, gives perfectly consistent valuations.
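A minimal sketch of how such a linear model of the expert can be built, assuming synthetic data and ordinary least squares; the variables and numbers are placeholders, not the book's examples:

import numpy as np

# Lens-model sketch: fit a linear model to the expert's own judgments over a
# set of scenarios, then use the fitted model in place of the expert.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 3))   # 50 scenarios, 3 variables the expert saw
# The expert's estimates: some consistent signal plus inconsistent "noise".
expert_estimates = 2.0 * X[:, 0] + 0.5 * X[:, 1] - 1.0 * X[:, 2] + rng.normal(0, 2, 50)

A = np.column_stack([X, np.ones(len(X))])      # add an intercept column
weights, *_ = np.linalg.lstsq(A, expert_estimates, rcond=None)

new_case = np.array([6.0, 3.0, 1.0, 1.0])      # a new case plus the intercept term
print(new_case @ weights)                      # the model's perfectly repeatable estimate

Because the fitted weights never change, the same case always gets the same valuation, which is where the consistency gain over the human expert comes from.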
A very different behavior occurs when the task is to generate exact values for a business case, especially one where the estimator has a stake in the outcome, as opposed to a calibrated estimator providing an initial 90% confidence interval (CI). Sitting in a room, one or more people working on the business case will play a game with each estimate.
The Information Economics method adds new errors in another way. It takes a useful and financially meaningful quantity, such as an ROI, and converts it to a score. The conversion goes like this: An ROI of 0 or less is a score of 0, 1% to 299% is a score of 1, 300% to 499% is a 2, and so on. In other words, a modest ROI of 5% gets the same score as an ROI of 200%. In more quantitative portfolio prioritization methods, such a difference would put a huge distance between the priorities of two projects.
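The loss of information is easy to see if the conversion is written out. A sketch in Python, extrapolating the "and so on" as further 200-point bands, which is an assumption:

def ie_score(roi_percent):
    # Score conversion as described above: ROI of 0 or less -> 0, 1% to 299% -> 1,
    # 300% to 499% -> 2; the continuation in 200-point bands is assumed.
    if roi_percent <= 0:
        return 0
    if roi_percent < 300:
        return 1
    return 2 + int((roi_percent - 300) // 200)

print(ie_score(5), ie_score(200))    # both score 1, despite a 40x difference in ROI
print(ie_score(300), ie_score(499))  # both score 2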
The critical revenue forecasts, on the other hand, were based on the Lens modeling method discussed previously. Several of their experts were asked to estimate first- and second-year revenue for over 50 hypothetical new products. Each product was described with a set of parameters the experts felt would inform their estimates.
Some parameters were related to marketing strategy, some were related to describing the target market; some were related to details of the product itself, and so on. After the estimates were collected, we used a...
modeling method to approximate expert estimates based on the data they were given. This model was then used to generate the revenue estim...
When I asked the renowned physicist and author Freeman Dyson what he thought to be the most important, most clever, and most inspiring measurement, he responded without hesitation, “GPS [Global Positioning System] is the most spectacular example. It has changed everything.” Actually, I was expecting a different kind of response, perhaps something from his days in operations research for the Royal Air Force during World War II, but GPS made sense as both a truly revolutionary measurement instrument as well as a measurement in its own right. GPS is economically available for just about anyone ...
If Eratosthenes could measure the circumference of Earth by looking at shadows, I wonder what sorts of economic, political, and behavioral phenomena he could measure with web-based GPS.
A “postbooking” survey was sent in an automated e-mail right after a reservation was made, and another was sent in a “welcome home” e-mail after the customer returned from the cruise. Hale says: “We just wanted to see what kind of results we would get. We were getting a 4% to 5% response rate initially, but with the welcome-home e-mail we were getting an 11.5% response rate.” By survey standards, that is very high. In a clever use of a simple control, NLG compares responses to questions like “Will you refer us to a friend?” before and after customers take the trip to see if scores are higher after the vacation. When they found that clients weren’t as happy after the trip, NLG decided to launch a whole new program with the sales team. Hale says, “We had to retrain the sales team to sell in a different way and get the customer to the right vacation.” Simply discovering the problem was a measurement success. Now the company needs to measure the effect of the new program.
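A toy illustration of that before/after comparison in Python; the scores are invented, not NLG's data:

import statistics as stats

# Hypothetical 1-5 answers to "Will you refer us to a friend?" collected before
# and after the trip, illustrating the simple control described above.
pre_trip = [5, 4, 5, 4, 5, 5, 4, 5, 4, 5]
post_trip = [4, 3, 5, 4, 3, 4, 4, 3, 5, 4]

print("Mean before trip:", stats.mean(pre_trip))
print("Mean after trip:", stats.mean(post_trip))
print("Change:", round(stats.mean(post_trip) - stats.mean(pre_trip), 2))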
Calibration Training: Best when lots of quick, low-cost estimates are ... expert to work, and an answer is immediate. Should be the first estimating method in most cases—more elaborate methods can be used if the information value justifies it.
Lens Model: Used when there are a large number of repeated estimates of the same type (e.g., assessment of investments in a big portfolio) and when the same type of data can be gathered on each. Once created, the Lens Model generates instant answers for this class of problems regardless of the availability of the original expert(s). The model can be created using only hypothetical scenarios.
Rasch Model: Used to standardize ... of real evaluati... hypothetical). All are taken into consideration for standardization.
Prediction Market: Best for forecasts, especially where it is useful to track changes in probabilities over time. It requires at least two market players for even the first transaction to occur. It is not ideal if you need fast answers for a large number of quantities, homogeneous or not. If the num...
No attempt at computing a return on investment (ROI) was made, much less any attempt at quantifying risk. This came as a surprise to some of the 30 Chicago companies represented in the room.
His concern, however, was that the benefits for SDWIS were ultimately about public health, which he didn’t know how to quantify economically.
12% chance of a negative return. The other two modifications had less than a 1% chance of a negative return. We plotted these three investments on the investment boundary (Chapter 11) we had already documented for the EPA. All three were acceptable, but not equally so. The reengineering of exception reporting had the highest risk and lowest return of the three.
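Probabilities like that 12% come from running the uncertain costs and benefits through a Monte Carlo simulation. A minimal Python sketch with invented 90% confidence intervals, not the EPA's figures:

import numpy as np

rng = np.random.default_rng(42)
n = 100_000

def normal_from_90ci(low, high, size):
    # Treat a calibrated 90% CI as a normal distribution; the interval spans
    # 3.29 standard deviations (plus or minus 1.645).
    mean = (low + high) / 2
    sigma = (high - low) / 3.29
    return rng.normal(mean, sigma, size)

benefits = normal_from_90ci(1.0e6, 5.0e6, n)   # uncertain benefit (hypothetical CI)
costs = normal_from_90ci(1.5e6, 2.5e6, n)      # uncertain cost (hypothetical CI)
net = benefits - costs

print("Chance of a negative return:", round((net < 0).mean(), 3))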
It is an example of how an intangible like public health is quantified for an IT project. I’ve seen many IT projects dismiss much more easily measured benefits as “immeasurable” and exclude them from the ROI calculation.
This example is about what didn’t have to be measured. Only 1 variable out of 99 turned out to require uncertainty reduction. The initial calibrated estimates were sufficient for the other 98.