Nicholas C. Zakas's Blog, page 12
October 9, 2012
ECMAScript 6 collections, Part 2: Maps
Maps[1], like sets, are also a familiar topic for those coming from other languages. The basic idea is to map a value to a unique key in such a way that you can retrieve that value at any point in time by using the key. In JavaScript, developers have traditionally used regular objects as maps. In fact, JSON is based on the premise that objects represent key-value pairs. However, the same limitation that affects objects used as sets also affects objects used as maps: the inability to have non-string keys.
Prior to ECMAScript 6, you might have seen code that looked like this:
var map = {};
// later
if (!map[key]) {
    map[key] = value;
}
This code uses a regular object to act like a map, checking to see if a given key exists. The biggest limitation here is that key will always be converted into a string. That’s not a big deal until you want to use a non-string value as a key. For example, maybe you want to store some data that relates to a particular DOM element. You could try to do this:
// element gets converted to a string
var data = {},
    element = document.getElementById("my-div");

data[element] = metadata;
Unfortunately, element will be converted into the string "[object HTMLDivElement]" or something similar (the exact value may be different depending on the browser). That’s problematic because every element gets converted into the same string, meaning you will constantly be overwriting the same key even though you’re technically using different elements. For this reason, the Map type is a welcome addition to JavaScript.
The ECMAScript 6 Map type is an ordered list of key-value pairs where both the key and the value can be of any type. A key of 5 is different from a key of "5", and keys are determined to be the same using the same rules as values for a set: NaN is considered the same as NaN, -0 is different from 0, and otherwise === applies. You can store and retrieve data from a map using the set() and get() methods, respectively:
var map = new Map();
map.set("name", "Nicholas");
map.set(document.getElementById("my-div"), { flagged: false });
// later
var name = map.get("name"),
meta = map.get(document.getElementById("my-div"));
In this example, two key-value pairs are stored. The key "name" stores a string while the key document.getElementById("my-div") is used to associate metadata with a DOM element. If the key doesn’t exist in the map, then the special value undefined is returned when calling get().
Maps share a couple of methods with sets, such as has() for determining if a key exists in the map and delete() for removing a key-value pair from the map. You can also use size() to determine how many items are in the map:
var map = new Map();
map.set("name", "Nicholas");
console.log(map.has("name")); // true
console.log(map.get("name")); // "Nicholas"
console.log(map.size()); // 1
map.delete("name");
console.log(map.has("name")); // false
console.log(map.get("name")); // undefined
console.log(map.size()); // 0
In order to make it easier to add large amounts of data into a map, you can pass an array of arrays to the Map constructor. Internally, each key-value pair is stored as an array with two items, the first being the key and the second being the value. The entire map, therefore, is an array of these two-item arrays and so maps can be initialized using that format:
var map = new Map([ ["name", "Nicholas"], ["title", "Author"]]);
console.log(map.has("name")); // true
console.log(map.get("name")); // "Nicholas"
console.log(map.has("title")); // true
console.log(map.get("title")); // "Author"
console.log(map.size()); // 2
When you want to work with all of the data in the map, you have several options. There are actually three generator methods to choose from: keys(), which iterates over the keys in the map; values(), which iterates over the values in the map; and items(), which iterates over key-value pairs by returning an array containing the key and the value (items() is the default iterator for maps). The easiest way to make use of these is to use a for-of loop:
for (let key of map.keys()) {
    console.log("Key: %s", key);
}

for (let value of map.values()) {
    console.log("Value: %s", value);
}

for (let item of map.items()) {
    console.log("Key: %s, Value: %s", item[0], item[1]);
}

// same as using map.items()
for (let item of map) {
    console.log("Key: %s, Value: %s", item[0], item[1]);
}
When iterating over keys or values, you receive a single value each time through the loop. When iterating over items, you receive an array whose first item is the key and the second item is the value.
Another way to iterate over items is to use the forEach() method. This method works in a similar manner to forEach() on arrays. You pass in a function that gets called with three arguments: the value, the key, and the map itself. For example:
map.forEach(function(value, key, map) {
    console.log("Key: %s, Value: %s", key, value);
});
Also similar to the arrays version of forEach(), you can pass in an optional second argument to specify the this value to use inside the callback:
var reporter = {
    report: function(key, value) {
        console.log("Key: %s, Value: %s", key, value);
    }
};

map.forEach(function(value, key, map) {
    this.report(key, value);
}, reporter);
Here, the this value inside of the callback function is equal to reporter. That allows this.report() to work correctly.
Compare this to the clunky way of iterating over values and a regular object:
for (let key in object) {

    // make sure it's not from the prototype!
    if (object.hasOwnProperty(key)) {
        console.log("Key: %s, Value: %s", key, object[key]);
    }
}
When using objects as maps, it was always a concern that properties from the prototype might leak through in a for-in loop. You always need to use hasOwnProperty() to be certain that you are getting only the properties that you wanted. Of course, if there were methods on the object, you would also have to filter those:
for (let key in object) {

    // make sure it's not from the prototype or a function!
    if (object.hasOwnProperty(key) && typeof object[key] !== "function") {
        console.log("Key: %s, Value: %s", key, object[key]);
    }
}
The iteration features of maps allow you to focus on just the data without worrying about extra pieces of information slipping into your code. This is another big benefit of maps over regular objects for storing key-value pairs.
Browser Support
Both Firefox and Chrome have implemented Map; however, in Chrome you need to manually enable ECMAScript 6 features: go to chrome://flags and enable "Experimental JavaScript Features". Both implementations are incomplete. Neither browser implements any of the generator methods for use with for-of, Chrome’s implementation is missing the size() method (which is part of the ECMAScript 6 draft specification[2]), and Chrome’s constructor doesn’t do initialization when passed an array of arrays.
Summary
ECMAScript 6 maps bring a very important, and often used, feature to the language. Developers have long wanted a reliable way to store key-value pairs and have relied on regular objects for far too long. Maps provide the capabilities that regular objects can’t, including easy ways to iterate over keys and values as well as freedom from worrying about prototypes.
As with sets, maps are part of the ECMAScript 6 draft that is not yet complete. Because of that, maps are still considered an experimental API and may change before the specification is finalized. All posts about ECMAScript 6 should be considered previews of what’s coming, and not definitive references. The experimental APIs, although implemented in some browsers, are not yet ready to be used in production.
References
Simple Maps and Sets (ES6 Wiki)
ECMAScript 6 Draft Specification (ECMA)





October 4, 2012
Thoughts on TypeScript
Earlier this week, Microsoft released TypeScript[1], a new compile-to-JavaScript language for “application scale JavaScript.” My initial reaction was confusion:
Um, why? blogs.msdn.com/b/somasegar/ar… (via @izs)
— Nicholas C. Zakas (@slicknet) October 1, 2012
It seems like almost every week there’s a new language that’s trying to replace JavaScript on the web. Google received a lukewarm reception when it introduced Dart[2], its own idea for fixing all of JavaScript’s perceived flaws. CoffeeScript[3] continues to be the most prominent of these options, frequently inciting holy wars online. And now Microsoft is throwing its hat into the ring, and I couldn’t help but wonder why.
My bias
Before talking about TypeScript specifically, I want to explain my personal bias so that you can take the rest of my comments in their proper context. There is a very real problem in the web development industry and that problem is a significant lack of good JavaScript developers. I can’t tell you the number of companies that contact me trying to find above-average JavaScript talent to work on their applications. Yes, there are many more competent JavaScript developers now than there were 10 years ago, but the demand has increased in a way that far outpaces the supply increase. There are simply not enough people to fill all of the JavaScript jobs that are available. That’s a problem.
Some would argue that the high demand and low supply puts good JavaScript developers in an awesome position and we should never want to change that. After all, that’s why we can demand the salaries that we do. From a personal economic standpoint, I agree. From the standpoint of wanting to improve the web, I disagree. Yes, I want to be able to make a good living doing what I do, but I also want the web as a whole to continue to grow and get better, and that only happens when we have more competent developers entering the workforce.
I see compile-to-JavaScript languages as a barrier to that goal. We should be convincing more people to learn JavaScript rather than giving them more options to not write JavaScript. I often wonder what would happen if all of the teams and companies who spent time, energy, personnel, and money to develop these alternatives instead used those resources on improving JavaScript and teaching it.
To be clear, I’m not saying that JavaScript is a perfect language and doesn’t have its warts. Every language I’ve ever used has parts that suck and parts that are awesome, and JavaScript is no different. I do believe that JavaScript has to evolve and that necessarily introduces more parts that will suck as well as more parts that are awesome. I just wish we were all spending our efforts in the same area rather than splintering them across different projects.
What is TypeScript?
I spent a lot of time this week looking at TypeScript, reading through the documentation, and watching the video on the site. I was then invited by Rey Bango to meet with a couple members of the TypeScript team to have my own questions answered. With all of that background, I feel like I have a very good idea about what TypeScript is and what it is not.
TypeScript is first and foremost a superset of JavaScript. That means you can write regular JavaScript inside of TypeScript and it is completely valid. TypeScript adds additional features on top of JavaScript that then get converted into ECMAScript 5 compatible code by the TypeScript compiler. This is an interesting approach and one that’s quite different from the other compile-to-JavaScript languages out there. Instead of creating a completely new language with new syntax rules, TypeScript starts with JavaScript and adds additional features that fit in with the syntax quite nicely.
At its most basic, TypeScript allows you to annotate variables, function arguments, and functions with type information. This additional information allows tools to provide better autocomplete and error checking than you could get using normal JavaScript. The syntax is borrowed from the original JavaScript 2/ECMAScript 4 proposal[4] that was also implemented in ActionScript 3:
var myName: string = "Nicholas";
function add(num1: number, num2: number): number {
    return num1 + num2;
}

function capitalize(name: string): string {
    return name.toUpperCase();
}
The colon syntax may look familiar if you ever used Pascal or Delphi, both of which use the same syntax for indicating the type. The strings, numbers, and booleans in JavaScript are represented in TypeScript as string, number, and bool (note: all lowercase). These annotations help the TypeScript compiler figure out whether you are using correct values. For example, the following would cause a warning:
// warning: add() was defined to accept numbers
var result = add("a", "b");
Since add() was defined to accept numbers, this code causes a warning from the TypeScript compiler.
TypeScript is also smart enough to infer types when there is an assignment. For example, each of these declarations is automatically assigned a type:
var count = 10; // assume ": number"
var name = "Nicholas"; // assume ": string"
var found = false; // assume ": bool"
That means to get some benefit out of TypeScript, you don’t necessarily have to add type annotations everywhere. You can choose not to add type annotations and let the compiler try to figure things out, or you can add a few type annotations to help out.
Perhaps the coolest part of these annotations is the ability to properly annotate callback functions. Suppose you want to run a function on every item in an array, similar to Array.prototype.forEach(). Using JavaScript, you would define something like this:
function doStuffOnItems(array, callback) {
    var i = 0,
        len = array.length;

    while (i < len) {
        callback(array[i], i, array);
        i++;
    }
}
The callback function accepts three arguments, a value, an index, and the array itself. There’s no way to know that aside from reading the code. In TypeScript, you can annotate the function arguments to be more specific:
function doStuffOnItems(array: string[],
        callback: (value: string, i: number, array: string[]) => void) {

    var i = 0,
        len = array.length;

    while (i < len) {
        callback(array[i], i, array);
        i++;
    }
}
This code adds annotations to both arguments of doStuffOnItems(). The first argument is defined as an array of strings, and the second argument is defined as a function accepting three arguments. Note that the format for defining a function type is the ECMAScript 6 fat arrow function syntax[5]. With that in place, the compiler can check that a function matches the signature before the code is ever executed.
The type annotations really are the core of TypeScript and what it was designed to do. By having this additional information, editors can be made that not only do type checking of code before it’s executed, but also provide better autocomplete support as you’re coding. TypeScript already has plug-ins for Visual Studio, Vim, Sublime Text 2, and Emacs[6], so there are lots of options to try it out.
Additional features
While the main point of TypeScript is to provide some semblance of static typing to JavaScript, it doesn’t stop there. TypeScript also has support for ECMAScript 6 classes[7] and modules[8] (as they are currently defined). That means you can write something like this:
class Rectangle {
    length: number;
    width: number;

    constructor(length: number, width: number) {
        this.length = length;
        this.width = width;
    }

    area() {
        return this.length * this.width;
    }
}
And TypeScript converts it into this:
var Rectangle = (function () {
    function Rectangle(length, width) {
        this.length = length;
        this.width = width;
    }
    Rectangle.prototype.area = function () {
        return this.length * this.width;
    };
    return Rectangle;
})();
Note that the constructor function is created appropriately and the one method is properly placed onto the prototype.
Aside from modules and classes, TypeScript also introduces the ability to define interfaces. Interfaces are not defined in ECMAScript 6 at all but are helpful to TypeScript when it comes to type checking. Since JavaScript code tends to have a large amount of object literals defined, interfaces provide an easy way to validate that the right type of object is being used. For example:
interface Point {
    x: number;
    y: number;
}

function getDistance(pointA: Point, pointB: Point) {
    return Math.sqrt(
        Math.pow(pointB.x - pointA.x, 2) +
        Math.pow(pointB.y - pointA.y, 2)
    );
}
var result = getDistance({ x: -2, y: -3 }, { x: -4, y: 4 });
In this code, there’s an interface called Point with two properties, x and y. The getDistance() function accepts two points and calculates the distance between them. The arguments can be any objects containing those two properties, meaning I can pass in object literals and TypeScript will check to ensure that they contain the correct properties.
Both interfaces and classes feed into the type system to provide better error checking. Modules are just ways to group related functionality together.
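Modules aren’t shown above, so for illustration, here’s a rough sketch of what one might look like (the Geometry module and its contents are made-up examples, not taken from the TypeScript documentation):

module Geometry {
    // only exported members are visible outside the module
    export function area(length: number, width: number): number {
        return length * width;
    }
}

var a = Geometry.area(5, 10); // 50

Anything not marked with export stays private to the module, which is how the grouping keeps related functionality together.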
What I like
The more I played with TypeScript, the more I found parts of it that I really like. First and foremost, I like that you can write regular JavaScript inside of TypeScript. Microsoft isn’t trying to create a completely new language; they are trying to augment JavaScript in a useful way. I can appreciate that. I also like that the code compiles down into regular JavaScript that actually makes sense. Debugging TypeScript-generated code isn’t all that difficult because it uses familiar patterns.
What impressed me the most is what TypeScript doesn’t do. It doesn’t output type checking into your JavaScript code. All of those type annotations and error checks are designed to be used only while you’re developing. The final code doesn’t do any type checking unless you are doing it manually using JavaScript code. Classes and modules get converted into regular JavaScript while interfaces completely disappear. No code for interfaces ever appears in the final JavaScript because they are used purely during development for type checking and autocomplete purposes.
The editor integration for TypeScript is quite good. All you have to do is add a few annotations and all of a sudden the editor starts to light up with potential errors and suggestions. The ability to explicitly define expectations for callback functions is especially impressive, since that’s the one area where I tend to see a lot of issues related to passing incorrect values into functions.
I also like that Microsoft open-sourced TypeScript. They seem to be committed to developing this in the open and to developing a community around TypeScript. Whether or not they follow through and actually operate as an open source project is yet to be seen, but they’ve at least taken steps to allow for that possibility.
What I don’t like
While I applaud Microsoft’s decision to use ECMAScript 6 classes, I fear it puts the language in a difficult position. According to the TypeScript team members I spoke with, they’re absolutely planning on staying in sync with ECMAScript 6 syntax for modules and classes. That’s a great approach in theory because it encourages people to learn skills that will be useful in the future. In reality, that’s a difficult proposition because ECMAScript 6 is not yet complete and there is no guarantee that the syntax won’t change again before the specification is finished. That puts the TypeScript team in a very difficult position: continue to update the syntax to reflect the current reality of ECMAScript 6, or lag behind (possibly fork?) in order to keep their development environment stable.
The same goes for the type annotations. While there is significant prior work indicating that the colon syntax will work in JavaScript, there’s no guarantee that it will ever be added to the language. That means what TypeScript is currently doing may end up at odds with what ECMAScript eventually does. That will also lead to a decision as to which way to go.
The TypeScript team is hoping that a community will evolve around the language and tools in order to help inform them of which direction to go when these sorts of decisions appear. That’s also a double-edged sword. If they succeed in creating a large community around TypeScript, it’s very likely that the community will want to move away from the ECMAScript standard rather than stick with it, due to the high maintenance cost of upgrading existing code.
And I really don’t like having a primitive type named bool. I already told them I’d like to see that changed to boolean so that it maps back to the values returned from typeof, along with string and number.
Should you use it?
I think TypeScript has a lot of promise but keep one thing in mind: the current offering is an early alpha release. It may not look like that from the website, which is quite polished, or the editor plug-ins, or the fact that the version number is listed as 0.8.0, but I did confirm with the TypeScript team that they consider this a very early experimental release to give developers a preview of what’s coming. That means things may change significantly over the next year before TypeScript stabilizes (probably as ECMAScript 6 stabilizes).
So is it worth using now? I would say only experimentally and to provide feedback to the TypeScript team. If you choose to use TypeScript for your regular work, you do so at your own risk and I highly recommend that you stick to using type annotations and interfaces exclusively because these are removed from compiled code and less likely to change since they are not directly related to ECMAScript 6. I would avoid classes, modules, and anything else that isn’t currently supported in ECMAScript 5.
Conclusion
TypeScript offers something very different from the other compile-to-JavaScript languages in that it starts with JavaScript and adds additional features on top of it. I’m happy that regular JavaScript can be written in TypeScript and still benefit from some of the type checking provided by the TypeScript compiler. That means writing TypeScript can actually help people learn JavaScript, which makes me happy. There’s no doubt that these type annotations can create a better development experience when integrated with editors. Once ECMAScript 6 is finalized, I can see a big use for TypeScript, allowing developers to write ECMAScript 6 code that will still work in browsers that don’t support it natively. We are still a long way from that time, but in the meantime, TypeScript is worth keeping an eye on.
References
TypeScript (typescriptlang.org)
Dart (dartlang.org)
CoffeeScript (coffeescript.org)
Proposed ECMAScript 4th Edition – Language Overview (ECMA)
ECMAScript 6 Arrow Function Syntax (ECMA)
Sublime Text, Vi, Emacs: TypeScript enabled! (MSDN)
ECMAScript 6 Maximally Minimal Classes (ECMA)
ECMAScript 6 Modules (ECMA)





October 2, 2012
Computer science in JavaScript: Merge sort
Merge sort is arguably the first useful sorting algorithm you learn in computer science. Merge sort has a complexity of O(n log n), making it one of the more efficient sorting algorithms available. Additionally, merge sort is a stable sort (just like insertion sort) so that the relative order of equivalent items remains the same before and after the sort. These advantages are why Firefox and Safari use merge sort for their implementation of Array.prototype.sort().
The algorithm for merge sort is based on the idea that it’s easier to merge two already sorted lists than it is to deal with a single unsorted list. To that end, merge sort starts by creating n one-item lists, where n is the total number of items in the original list to sort. Then, the algorithm proceeds to combine these one-item lists back into a single sorted list.
The merging of two lists that are already sorted is a pretty straightforward algorithm. Assume you have two lists, list A and list B. You start from the front of each list and compare the two values. Whichever value is smaller is inserted into the results array. So suppose the smaller value is from list A; that value is placed into the results array. Next, the second value from list A is compared to the first value in list B. Once again, the smaller of the two values is placed into the results list. So if the smaller value is now from list B, then the next step is to compare the second item from list A to the second item in list B. The code for this is:
function merge(left, right){
    var result = [],
        il = 0,
        ir = 0;

    while (il < left.length && ir < right.length){
        if (left[il] < right[ir]){
            result.push(left[il++]);
        } else {
            result.push(right[ir++]);
        }
    }

    return result.concat(left.slice(il)).concat(right.slice(ir));
}
This function merges two arrays, left and right. The il variable keeps track of the index to compare for left while ir does the same for right. Each time a value from one array is added, its corresponding index variable is incremented. As soon as one of the arrays has been exhausted, then the remaining values are added to the end of the result array using concat().
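For example, calling merge() with two small sorted arrays (the values here are arbitrary) interleaves them into a single sorted result:

var result = merge([1, 4, 9], [2, 3, 10]);
console.log(result); // [1, 2, 3, 4, 9, 10]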
The merge() function is pretty simple but now you need two sorted lists to combine. As mentioned before, this is done by splitting an array into numerous one-item lists and then combining those lists systematically. This is easily done using a recursive algorithm such as this:
function mergeSort(items){

    // Terminal case: 0 or 1 item arrays don't need sorting
    if (items.length < 2) {
        return items;
    }

    var middle = Math.floor(items.length / 2),
        left = items.slice(0, middle),
        right = items.slice(middle);

    return merge(mergeSort(left), mergeSort(right));
}
The first thing to note is the terminal case of an array that contains zero or one items. These arrays don’t need to be sorted and can be returned as is. For arrays with two or more values, the array is first split in half creating left and right arrays. Each of these arrays is then passed back into mergeSort() with the results passed into merge(). So the algorithm is first sorting the left half of the array, then sorting the right half of the array, then merging the results. Through this recursion, eventually you’ll get to a point where two single-value arrays are merged.
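A quick sanity check with an arbitrary input array shows the behavior; note that the original array is not modified:

var items = [6, 10, 1, 3, 9, 8];
var sorted = mergeSort(items);

console.log(sorted); // [1, 3, 6, 8, 9, 10]
console.log(items);  // [6, 10, 1, 3, 9, 8] - unchanged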
This implementation of merge sort returns a different array than the one that was passed in (this is not an “in-place” sort). If you would like to create an in-place sort, then you can always empty the original array and refill it with the sorted items:
function mergeSort(items){

    if (items.length < 2) {
        return items;
    }

    var middle = Math.floor(items.length / 2),
        left = items.slice(0, middle),
        right = items.slice(middle),
        params = merge(mergeSort(left), mergeSort(right));

    // Add the arguments to replace everything between 0 and last item in the array
    params.unshift(0, items.length);
    items.splice.apply(items, params);
    return items;
}
This version of the mergeSort() function stores the results of the sort in a variable called params. The best way to replace items in an array is using the splice() method, which accepts two or more arguments. The first argument is the index of the first value to replace and the second argument is the number of values to replace. Each subsequent argument is the value to be inserted in that position. Since there is no way to pass an array of values into splice(), you need to use apply() and pass in the first two arguments combined with the sorted array. So, 0 and items.length are added to the front of the array using unshift() so that apply() can be used with splice(). Then, the original array is returned.
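To make that splice() trick concrete, here’s the same pattern in isolation with a hypothetical three-item array:

var items = [3, 1, 2];
var params = [1, 2, 3];             // pretend this is the sorted copy

params.unshift(0, items.length);    // params is now [0, 3, 1, 2, 3]
items.splice.apply(items, params);  // equivalent to items.splice(0, 3, 1, 2, 3)

console.log(items); // [1, 2, 3]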
Merge sort may be the most useful sorting algorithm you will learn because of its good performance and easy implementation. As with the other sorting algorithms I’ve covered, it’s still best to start with the native Array.prototype.sort() before attempting to implement an additional algorithm yourself. In most cases, the native method will do the right thing and provide the fastest possible implementation. Note, however, that not all implementations use a stable sorting algorithm. If using a stable sorting algorithm is important to you then you will need to implement one yourself.
You can get both versions of mergeSort() from my GitHub project, Computer Science in JavaScript.





September 25, 2012
ECMAScript 6 collections, Part 1: Sets
For most of JavaScript’s history, there has been only one type of collection, represented by the Array type. Arrays are used in JavaScript just like arrays in other languages, but pull double and triple duty mimicking queues and stacks as well. Since arrays only use numeric indices, developers had to use objects whenever a non-numeric index was necessary. ECMAScript 6 introduces several new types of collections to allow better and more efficient storing of ordered data.
Sets
Sets are nothing new if you come from languages such as Java, Ruby, or Python, but have been missing from JavaScript. A set is an ordered list of values that cannot contain duplicates. You typically don’t access items in the set like you would items in an array; instead, it’s much more common to check the set to see if a value is present.
ECMAScript 6 introduces the Set type[1] as a set implementation for JavaScript. You can add values to a set by using the add() method and see how many items are in the set using size():
var items = new Set();
items.add(5);
items.add("5");
console.log(items.size()); // 2
ECMAScript 6 sets do not coerce values when determining whether or not two values are the same. So, a set can contain both the number 5 and the string "5" (internally, the comparison is done using ===). If the add() method is called more than once with the same value, all calls after the first one are effectively ignored:
var items = new Set();
items.add(5);
items.add("5");
items.add(5); // oops, duplicate - this is ignored
console.log(items.size()); // 2
You can initialize the set using an array, and the Set constructor will ensure that only unique values are used:
var items = new Set([1, 2, 3, 4, 5, 5, 5, 5]);
console.log(items.size()); // 5
In this example, an array of numbers is used to initialize the set. The number 5 only appears once in the set even though it appears four times in the array. This functionality makes it easy to convert existing code or JSON structures to use sets.
You can test to see which items are in the set using the has() method:
var items = new Set();
items.add(5);
items.add("5");
console.log(items.has(5)); // true
console.log(items.has(6)); // false
Last, you can remove an item from the set by using the delete() method:
var items = new Set();
items.add(5);
items.add("5");
console.log(items.has(5)); // true
items.delete(5);
console.log(items.has(5)); // false
All of this amounts to a very easy mechanism for tracking unique values.
Iteration
Even though there is no random access to items in a set, it is still possible to iterate over all of the set’s values by using the new ECMAScript 6 for-of statement[2]. The for-of statement is a loop that iterates over the values of a collection, including arrays and array-like structures. You can output the values in a set like this:
var items = new Set([1, 2, 3, 4, 5]);

for (let num of items) {
    console.log(num);
}
This code outputs each item in the set to the console in the order in which they were added to the set.
Example
Currently, if you want to keep track of unique values, the most common approach is to use an object and assign the unique values as properties with some truthy value. For example, there is a CSS Lint[3] rule that looks for duplicate properties. Right now, an object is used to keep track of CSS properties such as this:
var properties = {
    "width": 1,
    "height": 1
};

if (properties[someName]) {
    // do something
}
Using an object for this purpose means always assigning a truthy value to a property so that the if statement works correctly (the other option is to use the in operator, but developers rarely do). This whole process can be made easier by using a set:
var properties = new Set();
properties.add("width");
properties.add("height");

if (properties.has(someName)) {
    // do something
}
Since it only matters if the property was used before and not how many times it was used (there is no extra metadata associated), it actually makes more sense to use a set.
Another downside of using object properties for this type of operation is that property names are always converted to strings. So you can’t have an object with a property name of the number 5; it gets converted to the string "5", so you can only have one such property. That also means you can’t easily keep track of objects in the same manner, because the objects get converted to strings when assigned as a property name. Sets, on the other hand, can contain any type of data without fear of conversion into another type.
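A short sketch of the difference, using two distinct objects as keys (the variable names are arbitrary):

var key1 = {},
    key2 = {},
    tracker = {};

tracker[key1] = true;
console.log(tracker[key2]); // true - both keys became "[object Object]"

var tracked = new Set();
tracked.add(key1);
console.log(tracked.has(key2)); // false - the two objects remain distinct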
Browser Support
Both Firefox and Chrome have implemented Set, however, in Chrome you need to manually enable ECMAScript 6 features: go to chrome://flags and enable “Experimental JavaScript Features”. Both implementations are incomplete. Neither browser implements set iteration using for-of and Chrome’s implementation is missing the size() method.
Summary
ECMAScript 6 sets are a welcome addition to the language. They allow you to easily create a collection of unique values without worrying about type coercion. You can add and remove items very easily from a set even though there is no direct access to items in the set. It’s still possible, if necessary, to iterate over items in the set by using the ECMAScript 6 for-of statement.
Since ECMAScript 6 is not yet complete, it’s also possible that the implementation and specification might change before other browsers start to include Set. At this point in time, it is still considered an experimental API and shouldn’t be used in production code. This post, and other posts about ECMAScript 6, are only intended to be a preview of functionality that is to come.
References
Simple Maps and Sets (ES6 Wiki)
for…of (MDN)
CSS Lint
Set (MDN)





September 17, 2012
Computer science in JavaScript: Insertion sort
Insertion sort is typically the third sorting algorithm taught in computer science programs, after bubble sort[1] and selection sort[2]. Insertion sort has a best-case complexity of O(n), which is less complex than bubble and selection sort at O(n²). This is also the first stable sort algorithm taught.
Stable sort algorithms are sorts that don’t change the order of equivalent items in the list. In bubble and selection sort, it’s possible that equivalent items may end up in a different order than they are in the original list. You may be wondering why this matters if the items are equivalent. When sorting simple values, like numbers or strings, it is of no consequence. If you are sorting objects by a particular property, for example sorting person objects on an age property, there may be other associated data that should be in a particular order.
Sorting algorithms that perform swaps are inherently unstable. Items are always moving around and so you can’t guarantee any previous ordering will be maintained. Insertion sort doesn’t perform swaps. Instead, it picks out individual items and inserts them into the correct spot in an array.
An insertion sort works by separating an array into two sections, a sorted section and an unsorted section. Initially, of course, the entire array is unsorted. The sorted section is then considered to be empty. The first step is to add a value to the sorted section, so the first item in the array is used (a list of one item is always sorted). Then at each item in the unsorted section:
If the item value goes after the last item in the sorted section, then do nothing.
If the item value goes before the last item in the sorted section, remove the item value from the array and shift the last sorted item into the now-vacant spot.
Compare the item value to the previous value (second to last) in the sorted section.
If the item value goes after the previous value and before the last value, then place the item into the open spot between them, otherwise, continue this process until the start of the array is reached.
Insertion sort is a little bit difficult to explain in words. It’s a bit easier to explain using an example. Suppose you have the following array:
var items = [5, 2, 6, 1, 3, 9];
To start, the 5 is placed into the sorted section. The 2 then becomes the value to place. Since 5 is greater than 2, the 5 shifts over to the right one spot, overwriting the 2. This frees up a new spot at the beginning of the sorted section into which the 2 can be placed. See the figure below for a visualization of this process (boxes in yellow are part of the sorted section, boxes in white are unsorted).
[Figure: the 5 shifts right and the 2 is inserted at the front; yellow boxes are the sorted section, white boxes are unsorted.]
The process then continues with 6. Each subsequent value in the unsorted section goes through the same process until the entire array is in the correct order. This process can be represented fairly succinctly in JavaScript as follows:
function insertionSort(items) {

    var len = items.length,     // number of items in the array
        value,                  // the value currently being compared
        i,                      // index into unsorted section
        j;                      // index into sorted section

    for (i=0; i < len; i++) {

        // store the current value because it may shift later
        value = items[i];

        /*
         * Whenever the value in the sorted section is greater than the value
         * in the unsorted section, shift all items in the sorted section over
         * by one. This creates space in which to insert the value.
         */
        for (j=i-1; j > -1 && items[j] > value; j--) {
            items[j+1] = items[j];
        }

        items[j+1] = value;
    }

    return items;
}
The outer for loop moves from the front of the array towards the back while the inner loop moves from the back of the sorted section towards the front. The inner loop is also responsible for shifting items as comparisons happen. You can download the source code from my GitHub project, Computer Science in JavaScript.
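Running the example array from earlier in this post through the function gives the expected result:

var items = [5, 2, 6, 1, 3, 9];
console.log(insertionSort(items)); // [1, 2, 3, 5, 6, 9]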
Insertion sort isn’t terribly efficient, with an average complexity of O(n²). That puts it on par with selection sort and bubble sort in terms of performance. These three sorting algorithms typically begin a discussion about sorting algorithms even though you would never use them in real life. If you need to sort items in JavaScript, you are best off starting with the built-in Array.prototype.sort() method before trying other algorithms. V8, the JavaScript engine in Chrome, actually uses insertion sort in Array.prototype.sort() for arrays with 10 or fewer items.
References
Computer science in JavaScript: Bubble sort
Computer science in JavaScript: Selection sort





September 14, 2012
Replacing Apache with nginx on Elastic Beanstalk
WellFurnished has been using Amazon’s Elastic Beanstalk[1] service for some time now with one of the default configurations. For those who are unaware, Elastic Beanstalk is Amazon’s answer to services like Heroku and Google App Engine. You set up an application and one or more environments made up of a load balancer and any number of EC2 instances. There are several default instance types you can select from such as Apache with Tomcat 6 or 7, Apache with PHP, and Apache with Python (all are available in either 32-bit or 64-bit configurations).
Once you have created an application, you can update any environment using Git. We’ve been using this for our deploys since the start. WellFurnished is written in Java using the Play framework[2], so we have been using the 32-bit configuration with Apache and Tomcat 7. Play can export an exploded WAR file of the application, which we then check into the Elastic Beanstalk Git repository for the application. Then, we sit back and watch as the update rolls out to the environments we specified.
It’s a really nice service that costs nothing over and above the AWS resources that you’re using. That being said, the number of configuration options is limited. Amazon says that you are always free to make modifications but there are no good tutorials or examples of how to do that. Over the past couple of days, I’ve gone through the process of replacing Apache with nginx in our Elastic Beanstalk instances. I did so by piecing together information I found spread across the Internet. This is my attempt to explain how I did it in the hopes that this will help others with the process.
Understanding how it works
Before modifying a configuration, it helps to understand exactly how Elastic Beanstalk works. Each of the default AMIs for Elastic Beanstalk comes configured with a Ruby on Rails web application called hostmanager. This application is responsible for interacting with Elastic Beanstalk including health checks and deployment of changes, among other things. As long as hostmanager is functioning properly, an EC2 instance will work properly within an Elastic Beanstalk application. It sounds simple, but in reality, this is usually where the problem is. Ensuring that hostmanager continues to work in the way that it used to is the key to creating custom EC2 configurations that still work with Elastic Beanstalk.
The hostmanager runs at http://localhost:8999 on the machine and is accessible publicly at /_hostmanager. No matter what you do, you must ensure that /_hostmanager continues to work properly. The Rails app for hostmanager starts automatically so all you need to do is make sure that it’s publicly accessible.
One of Apache’s jobs on an Elastic Beanstalk instance is to make sure that hostmanager is accessible. It’s set up as a reverse proxy where /_hostmanager points to http://localhost:8999. Since my goal was to replace Apache with nginx, I had to make sure that this setting was preserved.
Modifying an AMI
To get started, it’s easiest to have an Elastic Beanstalk environment already running. As mentioned before, we were using the 32-bit Apache/Tomcat 7 configuration on a small instance. This configuration is stored in an Amazon-produced AMI. In order to create a custom AMI based on that configuration, you need to create a standalone EC2 instance using that same AMI.
With Elastic Beanstalk already running, locate the EC2 instance in the EC2 section of the AWS console. Even though you’re using Elastic Beanstalk, it still just creates regular EC2 instances. If you’re unsure which EC2 instance is being used, take a look at the elastic load balancer associated with that particular environment. That will give you the EC2 instance ID. In the EC2 section of the AWS console, right click on the instance and select “Launch More Like This”. You’ll be taken through the EC2 instance creation process and in about a minute you’ll have an instance based off of the Elastic Beanstalk AMI.
Then you can SSH into the new instance and configure it however you want. By default, neither Apache nor Tomcat is running. You will need to manually start them in order to have a working environment:
sudo service httpd start
sudo service tomcat7 start
Now everything is working just as it is in an instance used by Elastic Beanstalk. You’ll have to manually add in your application code if you want to test that, but otherwise the instance is completely functional.
Installing and configuring nginx
Swapping nginx for Apache is a simple procedure in theory. Since both are being used as reverse proxies, it’s a matter of ensuring that the nginx configuration is equivalent to the Apache configuration. Start by installing nginx:
sudo yum -y install nginx
The next step is to modify the number of worker processes that nginx will use. By default it’s set to 1 and it’s a good idea to change it to 4. So find this line in /etc/nginx/nginx.conf:
worker_processes 1;
And change it to this:
worker_processes 4;
Next, set up the proxy configuration for nginx (exact configuration taken from Hacking Elastic Beanstalk (O’Reilly)[3]). Create a file called /etc/nginx/conf.d/proxy.conf and place this inside:
proxy_redirect off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
client_max_body_size 10m;
client_body_buffer_size 128k;
client_header_buffer_size 64k;
proxy_connect_timeout 90;
proxy_send_timeout 90;
proxy_read_timeout 90;
proxy_buffer_size 16k;
proxy_buffers 32 16k;
proxy_busy_buffers_size 64k;
Note that you need to manually set some headers using nginx whereas Apache sets those for you when using the ProxyPassReverse directive.
There’s a default server configured when nginx is installed, so remove that:
sudo rm /etc/nginx/conf.d/default.conf
Create a file called /etc/nginx/conf.d/beanstalk.conf and fill it with this (also from Hacking Elastic Beanstalk[3]):
server {
    listen 80;
    server_name _;

    access_log /var/log/httpd/elasticbeanstalk-access_log;
    error_log /var/log/httpd/elasticbeanstalk-error_log;

    # set the default location
    location / {
        proxy_pass http://127.0.0.1:8080/;
    }

    # make sure the hostmanager works
    location /_hostmanager/ {
        proxy_pass http://127.0.0.1:8999/;
    }
}
These settings mimic the settings from Apache right down to the location of the logs. This ensures that nginx works almost exactly like Apache (for our purposes on WellFurnished, we put in a few more modifications, but this is what you need to get started).
Now that nginx is set up, you can safely remove Apache and start nginx:
sudo yum remove httpd
sudo service nginx start
As one final measure, make sure that nginx will start automatically when the server starts:
sudo /sbin/chkconfig nginx on
With that, nginx is ready to go as an Apache replacement.
Modifying hostmanager
The tricky part of the configuration is that hostmanager actually interacts with Apache. Since you just removed Apache, that’s going to make hostmanager quite confused and angry. Thankfully, Hacking Elastic Beanstalk[3] has a bash script that modifies hostmanager to deal with nginx instead of Apache (this must be run as sudo):
cd /opt/elasticbeanstalk/srv/hostmanager/lib/elasticbeanstalk/hostmanager
cp utils/apacheutil.rb utils/nginxutil.rb
sed -i 's/Apache/Nginx/g' utils/nginxutil.rb
sed -i 's/apache/nginx/g' utils/nginxutil.rb
sed -i 's/httpd/nginx/g' utils/nginxutil.rb
cp init-tomcat.rb init-tomcat.rb.orig
sed -i 's/Apache/Nginx/g' init-tomcat.rb
sed -i 's/apache/nginx/g' init-tomcat.rb
The script creates a utility to deal with nginx and then modifies init-tomcat.rb, the code that starts up Tomcat, so that it also keeps track of nginx.
Creating a custom AMI
Now that the instance is set up with the appropriate modifications, it’s time to create a custom AMI. Go back to the EC2 instances section of the AWS console and find your modified instance. Right click on it and select “Create Image (EBS AMI)”. It will ask you for an image name and an optional description; just make sure you put in something that makes sense to you. You can make any other modifications that you want (or just use the defaults) and then click “Yes, Create”. When your AMI is ready, it will be listed in the AMIs list.
Important: You might be wondering why you can’t just go into an instance that Elastic Beanstalk started, modify that, and create a custom AMI directly from there. That would seem to be the logical thing to do because it requires fewer steps. However, Elastic Beanstalk writes some configuration information to the instance when it starts up. If you create an AMI from that instance, it will always have that configuration, and that actually prevents hostmanager from starting properly (trust me, I tried).
Using the custom AMI
Go to the AMI list in the EC2 section of the AWS console and find your new AMI. Copy the AMI ID (not the name). In your Elastic Beanstalk environment, click on the Actions menu and select “Edit/Load Configuration”. Paste your custom AMI ID into the “Custom AMI” field and click “Apply Changes”. Elastic Beanstalk will then start deploying instances using the new AMI. After a couple of minutes, you should have a fully functioning environment with your new AMI.
Conclusion
Creating a custom AMI for Elastic Beanstalk is a little bit tricky and takes a lot of patience. I hope that someday Amazon will add nginx default configurations so that custom AMIs will no longer be needed. One thing to be aware of when using a custom AMI is that the instances won’t receive automatic updates, so you’ll need to keep on top of security fixes and other critical updates. Other than that, as long as hostmanager is running and accessible, you should be able to make any changes that you want.
I created a bash script with all of the steps mentioned in this post as a gist that you can copy from. Note that the script must be run as sudo.
References
Elastic Beanstalk (Amazon)
Play Framework (Play)
Hacking Elastic Beanstalk (Safari Online)





September 12, 2012
CSS Lint v0.9.9 now available
CSS Lint v0.9.9 is now available both on the command line and at the web site. This release is mostly a maintenance release with a few small features added in. This lays the groundwork for an eventual 1.0.0 release, but that doesn’t preclude the possibility of a 0.9.10 release before then. There’s still a lot of work to do on the parser to make it fully CSS3 compliant, and if you’re interested in helping out, please take a look at the separate GitHub repository.
The biggest changes to CSS Lint in this release are for the command line. The first change is the ability to specify some rules to ignore. This was requested by Zach Leatherman via Twitter, and fits in nicely with the options to set rules as warnings or errors. The intent is that you specify rules to ignore when you want to use all of the default settings except for omitting a few rules. The syntax is as follows:
csslint --ignore=important,ids file.css
The --ignore option follows the same format as --warnings and --errors, using a comma-delimited list of rules to ignore. Read more in the documentation.
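For comparison, combining these options might look like this (the rule names here are just examples):

csslint --warnings=box-model --errors=important,ids --ignore=adjoining-classes file.css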
The second change to the CLI is the ability to specify a configuration file with default options for CSS Lint. CSS Lint will look for the file named .csslintrc in the current working directory and use those options. The file is in the same format as the command line arguments so that you can do something like this:
--errors=important,ids
--format=checkstyle-xml
As long as this is in the current working directory when CSS Lint is run, those options will be picked up and used by default. Any options that are passed in on the command line will override those in the file.
For more information about the CLI, see the documentation.
In addition to that, we had several other small changes:
Jos Hirth fixed several bugs, including some bugs in tests.
Jonathan Barnett contributed a JUnit XML output format to allow easier integration with CI environments.
Zach Leatherman removed references to Microsoft-specific vendor prefixes that will not be supported in Internet Explorer 10.
The last piece of news for this release is that CSS Lint is now on Travis CI, so you can always keep up to date on the latest status of the build. Of course, we welcome all contributions from the community on GitHub, which is why we have an extensive Developer Guide to help you get set up and ready to submit your changes. Enjoy!





August 22, 2012
The innovations of Internet Explorer
Long before Internet Explorer became the browser everyone loves to hate, it was the driving force of innovation on the Internet. Sometimes it’s hard to remember all of the good that Internet Explorer did before Internet Explorer 6 became the scourge of web developers everywhere. Believe it or not, Internet Explorer 4-6 is heavily responsible for web development as we know it today. A number of proprietary features became de facto standards and then official standards, with some ending up in the HTML5 specification. It may be hard to believe that Internet Explorer is actually to thank for a lot of the features that we take for granted today, but a quick walk through history shows that it’s true.
DOM
If Internet Explorer is a browser that everyone loves to hate, the Document Object Model (DOM) is the API that everyone loves to hate. You can call the DOM overly verbose, ill-suited for JavaScript, and somewhat nonsensical, and you would be correct on all counts. However, the DOM gives developers access to every part of a webpage through JavaScript. There was a time when you could only access certain elements on the page through JavaScript. Internet Explorer 3 and Netscape 3 only allowed programmatic access to form elements, images, and links. Netscape 4 improved the situation by expanding programmatic access to the proprietary <layer> element via document.layers. Internet Explorer 4 improved the situation even further by allowing programmatic access to every element on the page via document.all.
In many regards, document.all was the very first version of document.getElementById(). You still used an element’s ID to access it through document.all, such as document.all.myDiv or document.all["myDiv"]. The primary difference was that Internet Explorer used a collection instead of a function, which matched the other access methods of the time such as document.images and document.forms.
Internet Explorer 4 was also the first browser to introduce the ability to get a list of elements by tag name via document.all.tags(). For all intents and purposes, this was the first version of document.getElementsByTagName() and worked the exact same way. If you wanted to get all <div> elements, you would use document.all.tags("div"). Even in Internet Explorer 9, this method still exists and is just an alias for document.getElementsByTagName().
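For illustration, here’s a small sketch (the "my-div" ID is made up, and the document.all lines run only in Internet Explorer) showing the old forms next to their standardized descendants:
// old Internet Explorer 4 forms (IE only)
var oldElement = document.all["my-div"],
    oldDivs = document.all.tags("div");

// standardized descendants (all modern browsers)
var newElement = document.getElementById("my-div"),
    newDivs = document.getElementsByTagName("div");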
Internet Explorer 4 also introduced us to perhaps the most popular proprietary DOM extension of all time: innerHTML. It seems the folks at Microsoft realized what a pain it would be to build up a DOM programmatically and afforded us this shortcut, along with outerHTML. Both proved to be so useful that they were standardized in HTML5[1]. The companion APIs dealing with plain text, innerText and outerText, also proved influential enough that DOM Level 3 introduced textContent[2], which acts in a similar manner to innerText.
Along the same lines, Internet Explorer 4 introduced insertAdjacentHTML(), yet another way of inserting HTML text into a document. This one took a little longer, but it was also codified in HTML5[3] and is now widely supported by browsers.
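As a quick illustration of how these APIs fit together (the markup here is made up):
var div = document.createElement("div");

// parse an HTML string into DOM nodes (IE4, now in HTML5)
div.innerHTML = "<p>Hello <strong>world</strong>!</p>";

// read the contents back as plain text (DOM Level 3)
console.log(div.textContent);    // "Hello world!"

// insert more parsed HTML without touching the existing children (IE4, now in HTML5)
div.insertAdjacentHTML("beforeend", "<p>Another paragraph.</p>");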
Events
In the beginning, there was no event system for JavaScript. Both Netscape and Microsoft took a stab at it and each came up with different models. Netscape brought us event capturing, the idea that an event is first delivered to the window, then the document, and so on until finally reaching the intended target. Netscape browsers prior to version 6 supported only event capturing.
Microsoft took the opposite approach and came up with event bubbling. They believed that the event should begin at the actual target and then fire on its parents and so on up to the document. Internet Explorer prior to version 9 supported only event bubbling. Although the official DOM events specification evolved to include both event capturing and event bubbling, most web developers use event bubbling exclusively, with event capturing reserved for a few workarounds and tricks buried deep inside of JavaScript libraries.
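The standardized addEventListener() method (from DOM Level 2 Events, not part of the Internet Explorer versions discussed here) embraces both models through its third argument; a minimal sketch, assuming an element with the ID "my-button":
var button = document.getElementById("my-button");

// false registers for the bubbling phase - Microsoft's model and the common case
button.addEventListener("click", function(event) {
    console.log("bubbling handler");
}, false);

// true registers for the capturing phase - Netscape's model
document.addEventListener("click", function(event) {
    console.log("capturing handler fires first");
}, true);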
In addition to creating event bubbling, Microsoft also created a bunch of additional events that eventually became standardized:
contextmenu – fires when you use the secondary mouse button on an element. First appeared in Internet Explorer 5 and later codified as part of HTML5[4]. Now supported in all major desktop browsers.
beforeunload – fires before the unload event and allows you to block unloading of the page. Originally introduced in Internet Explorer 4 and now part of HTML5[4]. Also supported in all major desktop browsers.
mousewheel – fires when the mouse wheel (or similar device) is used. The first browser to support this event was Internet Explorer 6. Just like the others, it’s now part of HTML5[4]. The only major desktop browser to not support this event is Firefox (which does support an alternative DOMMouseScroll event).
mouseenter – a non-bubbling version of mouseover, introduced by Microsoft in Internet Explorer 5 to help combat the troubles with using mouseover. This event became formalized in DOM Level 3 Events[5]. Also supported in Firefox and Opera, but not in Safari or Chrome (yet?).
mouseleave – a non-bubbling version of mouseout to match mouseenter. Introduced in Internet Explorer 5 and also now standardized in DOM Level 3 Events[6]. Same support level as mouseenter.
focusin – a bubbling version of focus to help more easily manage focus on a page. Originally introduced in Internet Explorer 6 and now part of DOM Level 3 Events[7]. Not currently well supported, though Firefox has a bug opened for its implementation. Because it bubbles, focusin is handy for delegating focus handling (see the sketch after this list).
focusout – a bubbling version of blur to help more easily manage focus on a page. Originally introduced in Internet Explorer 6 and now part of DOM Level 3 Events[8]. As with focusin, not well supported yet but Firefox is close.
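To show why bubbling focus events are useful, here’s a hedged sketch of focus delegation with focusin and focusout (the form ID is made up, and this works only in browsers that support these events):
var form = document.getElementById("my-form");

// a single pair of listeners handles focus for every field in the form,
// because focusin and focusout bubble up from the individual inputs
form.addEventListener("focusin", function(event) {
    event.target.style.backgroundColor = "#ffffcc";
});

form.addEventListener("focusout", function(event) {
    event.target.style.backgroundColor = "";
});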
XML and Ajax
Although XML isn’t used nearly as much on the web today as many thought it would be, Internet Explorer also led the way with XML support. It was the first browser to support client-side XML parsing and XSLT transformation in JavaScript. Unfortunately, it did so through ActiveX objects representing XML documents and XSLT processors. The folks at Mozilla clearly thought there was something there, because they invented similar functionality in the form of DOMParser, XMLSerializer, and XSLTProcessor. The first two are now part of HTML5[9]. Although the standards-based JavaScript XML handling is quite different from Internet Explorer’s version, it was undoubtedly influenced by IE.
The client-side XML handling was all part of Internet Explorer’s implementation of XMLHttpRequest, first introduced as an ActiveX object in Internet Explorer 5. The idea was to enable retrieval of XML documents from the server in a webpage and allow JavaScript to manipulate that XML as a DOM. Internet Explorer’s version required you to use new ActiveXObject("MSXML2.XMLHttp"), making it reliant upon version strings and forcing developers to jump through hoops to test for and use the most recent version. Once again, Firefox came along and cleaned up the mess by creating a then-proprietary XMLHttpRequest object that duplicated the interface of Internet Explorer’s version exactly. Other browsers then copied Firefox’s implementation, ultimately leading Internet Explorer 7 to create an ActiveX-free version as well. Of course, XMLHttpRequest was the driving force behind the Ajax revolution that got everybody excited about JavaScript.
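This history is why pre-library Ajax code always began with a cross-browser creation function; a minimal sketch (the URL is hypothetical):
function createXHR() {
    if (typeof XMLHttpRequest != "undefined") {
        // the standardized object, first copied from IE by Firefox
        return new XMLHttpRequest();
    } else if (typeof ActiveXObject != "undefined") {
        // Internet Explorer 5-6 fallback
        return new ActiveXObject("MSXML2.XMLHttp");
    } else {
        throw new Error("No XMLHttpRequest support.");
    }
}

var xhr = createXHR();
xhr.open("get", "/data.xml", true);
xhr.onreadystatechange = function() {
    if (xhr.readyState == 4 && xhr.status == 200) {
        console.log(xhr.responseXML);   // the retrieved XML as a DOM
    }
};
xhr.send(null);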
CSS
When you think of CSS, you probably don’t think much about Internet Explorer. After all, it’s the browser that tends to lag behind in CSS support (at least up to Internet Explorer 10). However, Internet Explorer 3 was the first browser to implement CSS. At the time, Netscape was pursuing an alternate proposal, JavaScript Style Sheets (JSSS)[10]. As the name suggests, this proposal used JavaScript to define stylistic information about the page. Netscape 4 introduced JSSS and CSS, a full version behind Internet Explorer. The CSS implementation was less than stellar, often translating styles into JSSS in order to apply them properly[11]. That also meant that if JavaScript was disabled, CSS didn’t work in Netscape 4.
While Internet Explorer’s implementation of CSS was limited to font family, font size, colors, and backgrounds, the implementation was solid and usable. Meanwhile, Netscape 4’s implementation was buggy and hard to work with. Yes, in some small way, Internet Explorer led to the success of CSS.
Internet Explorer also brought us other CSS innovations that ended up being standardized:
text-overflow – used to show ellipses when text is larger than its container. First appeared in Internet Explorer 6 and standardized in CSS3[12]. Now supported in all major browsers.
overflow-x and overflow-y – allow you to control overflow in two separate directions of the container. These properties first appeared in Internet Explorer 5 and were later formalized in CSS3[13]. Now supported in all major browsers.
word-break – used to specify line breaking rules between words. Originally in Internet Explorer 5.5 and now standardized in CSS3[14]. Supported in all major browsers except Opera.
word-wrap – specifies whether the browser should break lines in the middle of words or not. First created for Internet Explorer 5.5 and now standardized in CSS3 as overflow-wrap[15], although all major browsers support it as word-wrap.
Additionally, many of the new CSS3 visual effects have Internet Explorer to thank for laying the groundwork. Internet Explorer 4 introduced the proprietary filter property, making it the first browser capable of:
Generating gradients from CSS instructions (CSS3: gradients)
Creating semitransparent elements with an alpha filter (CSS3: opacity and RGBA)
Rotating an element an arbitrary number of degrees (CSS3: transform with rotate())
Applying a drop shadow to an element (CSS3: box-shadow)
Applying a matrix transform to an element (CSS3: transform with matrix())
Additionally, Internet Explorer 4 had a feature called transitions, which allowed you to create some basic animation on the page using filters. The transitions were mostly based on the transitions commonly available in PowerPoint at the time, such as fading in or out, checkerboard, and so on[16].
All of these capabilities are featured in CSS3 in one way or another. It’s pretty amazing that Internet Explorer 4, released in 1997, had all of these capabilities and we are now just starting to get the same capabilities in other browsers.
Other HTML5 contributions
There is a lot of HTML5 that comes directly out of Internet Explorer and the APIs it introduced. Here are some that have not yet been mentioned in this post:
Drag and Drop – one of the coolest parts of HTML5 is the definition of native drag-and-drop[17]. This API originated in Internet Explorer 5 and has been described, with very few changes, in HTML5. The main difference is the addition of the draggable attribute to mark arbitrary elements as draggable (Internet Explorer used a JavaScript call, element.dragDrop(), to do this). Other than that, the API closely mirrors the original and is now supported in all major desktop browsers (a short sketch follows this list).
Clipboard Access – now split out of HTML5 into its own spec[18], this API grants the browser access to the clipboard in certain situations. It originally appeared in Internet Explorer 6 and was then copied by Safari, which moved clipboardData off of the window object and onto the event object for clipboard events. Safari’s change was kept as part of the HTML5 version, and clipboard access is now available in all major desktop browsers except for Opera.
Rich Text Editing – rich text editing using designMode was introduced in Internet Explorer 4 because Microsoft wanted a better text editing experience for Hotmail users. Later, Internet Explorer 5.5 introduced contentEditable as a lighter-weight way of doing rich text editing. Along with both of these came the dreaded execCommand() method and its associated methods. For better or worse, this API for rich text editing was standardized in HTML5[19] and is currently supported in all major desktop browsers as well as Mobile Safari and the Android browser.
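As an illustration of the drag-and-drop API from the first item above, here’s a hedged sketch (the element IDs are made up):
var source = document.getElementById("drag-source"),
    target = document.getElementById("drop-target");

// the HTML5 way to mark an arbitrary element as draggable
// (Internet Explorer used element.dragDrop() instead)
source.draggable = true;

source.addEventListener("dragstart", function(event) {
    event.dataTransfer.setData("text/plain", "Hello from the source element");
});

// canceling dragover indicates that dropping is allowed here
target.addEventListener("dragover", function(event) {
    event.preventDefault();
});

target.addEventListener("drop", function(event) {
    event.preventDefault();
    console.log(event.dataTransfer.getData("text/plain"));
});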
Conclusion
While it’s easy and popular to poke at Internet Explorer, in reality, we wouldn’t have the web as we know it today if not for its contributions. Where would the web be without XMLHttpRequest and innerHTML? Those were the very catalysts for the Ajax revolution of web applications, upon which a lot of the new capabilities have been built. It seems funny to look back at the browser that has become a “bad guy” of the Internet and see that we wouldn’t be where we are today without it.
Yes, Internet Explorer had its flaws, but for most of the history of the Internet it was the browser that was pushing technology forward. Now that we’re in a period of massive browser competition and innovation, it’s easy to forget where we all came from. So the next time you run into people who work on Internet Explorer, instead of hurling insults and tomatoes, say thanks for helping to make the Internet what it is today and for making web development one of the most important jobs in the world.
References
innerHTML in HTML5
textContent in DOM Level 3
insertAdjacentHTML() in HTML5
Event Handlers on Elements (HTML5)
mouseenter (DOM Level 3 Events)
mouseleave (DOM Level 3 Events)
focusin (DOM Level 3 Events)
focusout (DOM Level 3 Events)
DOMParser interface (HTML5)
JavaScript Style Sheets (Wikipedia)
The CSS Saga by Håkon Wium Lie and Bert Bos
text-overflow property (CSS3 UI)
overflow-x and overflow-y (CSS3 Box)
word-break (CSS3 Text)
overflow-wrap/word-wrap (CSS3 Text)
Introduction to Filters and Transitions (MSDN)
Drag and Drop (HTML5)





August 15, 2012
Setting up SSL on an Amazon Elastic Load Balancer
In my last post, I talked about setting up Apache as an SSL front end to Play, with the goal of having SSL to the end-user while using normal HTTP internally. That approach works well when you have just one server. When you have multiple servers behind a load balancer, the approach is a little bit different.
We’re using Amazon Web Services for WellFurnished and so are using an Elastic Load Balancer (ELB) to handle traffic. It’s possible to have SSL terminated at the ELB and use HTTP the rest of the way, creating a setup similar to the one with Apache. You basically upload your SSL certificate to the ELB, open up port 443, and you’re in business. In theory it’s a very simple setup, but I found that in reality there were a few hiccups.
Setting up SSL
The Amazon Web Services console lets you go right to an ELB (in the EC2 section under “Load Balancers”). When you click on an ELB, you get its properties in the bottom pane. Click on the Listeners tab and you see all of the ports that are currently enabled. The last row is reserved so that you can add new ports. If you change the first drop-down to HTTPS, the entire row changes so you can enter the appropriate information.
In this dialog, the load balancer protocol and port are set to HTTPS and 443, respectively. The instance protocol and port are still set at HTTP and 80, meaning that the ELB will talk HTTP to all of its instances.
Of course, HTTPS is useless without a valid certificate that web browsers can use to verify the site.
Uploading certificates to an ELB
When you click on the Select link to specify an SSL certificate, you get the following dialog:
The dialog asks you to enter four pieces of information:
Certificate Name – The name you want to use to keep track of the certificate within the AWS console.
Private Key – The key file you generated as part of your certificate request.
Public Key Certificate – The public facing certificate provided by your certificate authority.
Certificate Chain – An optional group of certificates to validate your certificate.
Providing the certificate name is pretty straightforward; it can be anything you want. The name is just so you can keep track of the certificate and has no other value.
The other three fields are a little bit trickier. Depending on the source of your SSL certificates, you may have to do a few more steps in order to get things working. We started out by getting a Comodo PositiveSSL certificate. When we received our certificate, we actually received three files in a single zip:
www_wellfurnished_com.crt
PositiveSSLCA2.crt
AddTrustExternalCARoot.crt
The actual names of the files may vary depending on the type of SSL certificate you purchase and the certificate authority. The first file is the one that is unique to your domain, while the other two are used to form a certificate chain[1] for your domain. You will always have a file with the word “root” in it, which is the root certificate[2] for your domain; the other is an intermediate certificate.
Amazon Web Services works with PEM files for certificates, and you’ll note that none of the files we received were in that format. So before using the files, they have to be translated into a format that Amazon will understand.
Private key
The private key is something that you generated along with your certificate request. Hopefully, you kept it safe knowing that you would need it again one day. To get the Amazon supported format for your key, you need to use OpenSSL[3] in this way:
openssl rsa -in host.key -text
The result of this command is a lot of text, the final piece of which is what Amazon is looking for. You’ll see something that looks like this:
-----BEGIN RSA PRIVATE KEY-----
(tons of text)
-----END RSA PRIVATE KEY-----
Copy this whole block, including the delimiters that begin and end the private key text, and paste it into the Private Key box in the AWS dialog.
Public certificate
The public certificate is the domain-specific file that you receive, in our case, www_wellfurnished_com.crt. This certificate file must be changed into PEM format for Amazon to use (your certificate might already be in PEM format, in which case you can just open it up in a text editor, copy the text, and paste it into the dialog). Once again, OpenSSL saves the day by transforming the certificate file into PEM format:
openssl x509 -inform PEM -in www_example_com.crt
The output will look something like this:
-----BEGIN CERTIFICATE-----
(tons of text)
-----END CERTIFICATE-----
Copy this entire text block, including the begin and end delimiters, and paste it into the Public Certificate field in the AWS dialog.
Certificate chain
Don’t be fooled by the AWS dialog: the certificate chain isn’t really optional when your ELB is talking directly to a browser. The certificate chain is the part that verifies which certificate authority issued the domain certificate, and therefore whether or not the browser can trust that the certificate is valid. Different browsers handle a missing chain in different ways, but in Firefox you get a pretty scary warning page:
So if your ELB is going to be talking to browsers directly, you definitely need to provide the certificate chain.
The certificate chain is exactly what it sounds like: a series of certificates. For the AWS dialog, you need to include the intermediate certificate and the root certificate one after the other without any blank lines. Both certificates need to be in PEM format, so you need to go through the same steps as with the domain certificate.
(openssl x509 -inform PEM -in PositiveSSLCA2.crt; openssl x509 -inform PEM -in AddTrustExternalCARoot.crt)
The output of this command is the concatenation of the two certificates in PEM format. Copy the entire output into the Certificate Chain box in the dialog.
Note on errors
The AWS dialog will give you an error message if any of the fields contains an invalid value. This is helpful, as it prevents you from having to debug a misconfigured certificate down the line. On the other hand, the error message is usually completely unhelpful, along the lines of “Invalid private key”. It will never tell you specifically why your input is wrong, though the most common causes are formatting problems: not using PEM format, not including all of the delimiters, and having extra blank lines.
Use with Play
When setting this up in production, I noticed an interesting wrinkle with our Play server. I had it set up with XForwardedSupport=127.0.0.1 to enable the use of X-Forwarded-* headers with Apache[4] in our integration environment. When I put this into production, I started getting all kinds of errors. The problem was that the X-Forwarded-For header was set by the ELB to the ELB’s IP address rather than the Apache server’s (Apache is still used on the front-end servers to front Play). The result was that Play was not allowing requests through.
I disabled XForwardedSupport in production only, figuring I would go back and solve the problem later. As it turned out, I didn’t need to do anything else. My first test in production worked correctly, and request.secure was already returning the correct value. Upon looking at the source code, it appears that XForwardedSupport only affects the values of request.host and request.remoteAddress.
Conclusion
The process of setting up SSL on an Amazon elastic load balancer isn’t as straightforward as it seems from the simple dialog. I spent several hours scouring the Internet for tips on how to deal with certificates on an ELB. I hope that this post saves you some time when setting up SSL on an ELB.
References
How certificate chains work (IBM)