Nicholas C. Zakas's Blog, page 16

December 20, 2011

Introducing Props2Js

One of my principles of Maintainable JavaScript[1] is to separate your configuration data from your application logic. Configuration data is hardcoded information that your JavaScript uses to work properly. This could be anything such as a URL or a UI string. For example:


function validate(value) {
    if (!value) {
        alert("Invalid value");
        location.href = "/errors/invalid.php";
    }
}

function toggleSelected(element) {
    if (hasClass(element, "selected")) {
        removeClass(element, "selected");
    } else {
        addClass(element, "selected");
    }
}

There are three pieces of configuration data in this code. The first is the string, "Invalid value", which is displayed to the user. As a UI string, there's a high chance that it will change frequently. The second is the URL "/errors/invalid.php". URLs tend to change as development progresses due to architectural decisions. The third is the CSS class name "selected". This class name is used three times, meaning that a class name change requires changes in three different places, increasing the likelihood that one will be missed.


Configuration data is best extracted from the core application logic, such as:


//Configuration data externalized
var config = {
    MSG_INVALID_VALUE: "Invalid value",
    URL_INVALID: "/errors/invalid.php",
    CSS_SELECTED: "selected"
};

function validate(value) {
    if (!value) {
        alert(config.MSG_INVALID_VALUE);
        location.href = config.URL_INVALID;
    }
}

function toggleSelected(element) {
    if (hasClass(element, config.CSS_SELECTED)) {
        removeClass(element, config.CSS_SELECTED);
    } else {
        addClass(element, config.CSS_SELECTED);
    }
}

This example stores all of the configuration data in the config object. Each property of config holds a single piece of data, and each property name has a prefix indicating the type of data (MSG for a UI message, URL for a URL, and CSS for a class name). The naming convention is, of course, a matter of preference. The important part of this code is that all of the configuration data has been removed from the functions, replaced with placeholders from the config object.


Externalizing the configuration data means that anyone can go in and make a change without fear of introducing an error in the application logic. It also means that the entire config object can be moved into its own file, so edits are made far away from the code that uses the data.


Having an external object managing your configuration data is a good start, but I'm not a fan of storing configuration data directly in JavaScript code. Because such data changes frequently, I prefer to keep it in a simpler file format – one that's free from worries about missing a semicolon or comma. And that's when I turned to the Java properties file[2].


Java properties files are incredibly simple. One name-value pair per line and comments begin with a #. It's really hard to mess up this format. Here's what the previous example's configuration data looks like in a Java properties file:


# UI Strings
MSG_INVALID_VALUE = Invalid value

# URLs
URL_INVALID = /errors/invalid.php

# CSS Classes
CSS_SELECTED = selected

Even though I had my configuration data in a Java properties file, I had no easy way of making this data available to JavaScript.


This is why I created Props2Js[3], a simple tool that does just one thing: reads a Java properties file and outputs it in a format that JavaScript can use. Actually, it's capable of outputting the data into three formats that JavaScript can use: JSON, JSONP, and regular JavaScript.


java -jar props2js-0.1.0.jar --to jsonp --name myfunc --output result.js source.properties

The --to option specifies the output format, either "js", "json", or "jsonp". The --name option specifies either the variable name (for "js") or the function name (for "jsonp"); this option is ignored for "json". The --output option specifies the file to write the data into. So this line takes the Java properties file named source.properties and outputs JSONP with a callback function of myfunc to a file named result.js.


Props2Js outputs the properties file mentioned above into JSON format:


{"MSG_INVALID_VALUE":"Invalid value","URL_INVALID":"/errors/invalid.php",
"CSS_SELECTED":"selected"}

Here's the JSONP output:


myfunc({"MSG_INVALID_VALUE":"Invalid value","URL_INVALID":"/errors/invalid.php",
"CSS_SELECTED":"selected"});

And here's the plain JavaScript option with --name config:


var config={"MSG_INVALID_VALUE":"Invalid value","URL_INVALID":"/errors/invalid.php",
"CSS_SELECTED":"selected"};

Props2Js is also smart enough to know that you're assigning to an object property if you include a dot in the --name option. In that case, it omits the var.
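
For example, assuming a hypothetical namespace object called MyApp, passing --name MyApp.config would produce output along these lines:

MyApp.config={"MSG_INVALID_VALUE":"Invalid value","URL_INVALID":"/errors/invalid.php",
"CSS_SELECTED":"selected"};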


Props2Js is available under an MIT License and is hosted at GitHub[3].


References

1. Maintainable JavaScript (2011) by Nicholas C. Zakas
2. .properties (Wikipedia)
3. Props2Js (GitHub)




December 16, 2011

Book review: HTML & CSS

It had been a while since I'd read a book that didn't have to do with JavaScript or something very computer-sciency, so when I was asked to review HTML & CSS: Design and Build Websites by Jon Duckett, I was interested to see how these books have changed. I learned HTML back in 1996, and honestly haven't picked up another book on the subject since that time.


My first impression of the book is that it's beautiful. The text is large and the pages are colorful, making it very easy to thumb through when in a hurry. When I wasn't in a hurry and sat down to read it, I found that the book almost told the entire story through pictures. The words are there and technically correct, but it's the visuals in the book that really communicate information to the reader.


I admired Duckett's approach to this book. He completely dispenses with the buzzwords that litter so many books these days. There's mention of HTML5 and CSS3, for sure, but it's done in such a way that it doesn't seem gimmicky or hyped. The title of the book itself is evidence of this. Duckett clearly doesn't want you thinking about HTML 4 vs. HTML5 or CSS 2 vs. CSS3. Instead, he wants you to understand the concepts that link together web technology and good design. Some of that is done with HTML 4 and CSS 2 while some is done with HTML5 and CSS3.


This book is really targeted at beginners without a technical background, and it does an exceptional job in serving this audience. The approach is perhaps the gentlest introduction to the concept of web programming that I've ever encountered. So gentle, in fact, I think that almost anyone could pick up this book and start to make a simple web page relatively quickly. It takes you right from creating your HTML file with a text editor, through learning HTML and CSS, all the way to deploying your file and adding Google Analytics.


Sprinkled throughout the book are useful tidbits about typography, contrast, design concepts, and even how multimedia plugins such as Flash work in conjunction with a web page. The very visual nature of the book makes picking up these concepts easy, as every piece of code is accompanied with a diagram, figure, or screenshot showing the result.


If you're an experienced web developer, you'll probably want to pass on this book since it will be far too basic. However, if you're looking for a good book to introduce web development to an inexperienced web developer, or even someone who has no experience, then this book is a great place to start.





December 14, 2011

Timer resolution in browsers

Timer resolution refers to how frequently a clock is updated. For most of their history, web browsers used the default system timer for functionality such as setTimeout(), setInterval(), and Date objects. This meant browsers could only schedule code to run as frequently as the system timer would fire, and dates could only be created with differences equivalent to the timer resolution.


A brief history

Windows machines have a timer resolution of 10-15.6ms by default (most at 15.6ms), which meant that browsers using the system timer were stuck with this resolution. Of course, 10-15.6ms is a lifetime when you have a CPU running as fast as today's processors do. It probably doesn't surprise you that Internet Explorer through version 8 exclusively used system timers, which led John Resig to write about how timer resolution affects benchmarks[1]. On OS X, browser timers were much more accurate than on Windows.


Until recently, the other browsers on Windows also used the system timer and so were all stuck at 15.6ms timer resolution. This was true for Firefox, Safari, and Opera. Chrome may have been the first Windows browser to switch to a higher-resolution timer[2], and their experiments led to some interesting results.


The original idea was for Chrome to have sub-millisecond timers, but this was abandoned in favor of a one millisecond timer resolution. They decided to use the Windows multimedia timer API, which allows you to specify a timer with a resolution as small as one millisecond, and use that instead of the system timer. This is the same timer used by plugins such as Flash and Quicktime.


Chrome 1.0 beta had a one millisecond timer resolution. That seemed okay, but then the team started having bug reports. It turns out that timers cause the CPU to spin, and when the CPU is spinning, more power is being consumed because it can't go into sleep (low power) mode.[3] That caused Chrome to push its timer resolution to 4ms.


The 4ms delay was codified in HTML5 as part of the Timer section[4], where it states that the minimum resolution for setTimeout() should be 4ms. The minimum resolution for setInterval() is specified as 10ms.


Timer resolution today

Internet Explorer 9, Firefox 5, Safari 5.1, and Opera 11 all feature a 4ms timer resolution, following Chrome's lead. Prior to that, Firefox 4 and earlier and Safari 5 and earlier had a timer resolution of 10ms (apparently, this was hardcoded in WebKit). Mobile Safari on iOS 5 also has a 4ms timer resolution. Silk on the Kindle Fire has a 10ms timer resolution, potentially indicating it was built off an older version of WebKit. However, just because today's browsers have a timer resolution of 4ms, it doesn't mean that's the resolution you'll be getting.
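
If you're curious what you're actually getting, a rough sketch like the following measures the average delay between back-to-back zero-delay timeouts (treat the numbers as approximate, since new Date() is limited by the same clock):

//Rough measurement of the effective setTimeout() resolution
var samples = [],
    last = new Date().getTime();

function tick() {
    var now = new Date().getTime();
    samples.push(now - last);
    last = now;

    if (samples.length < 50) {
        setTimeout(tick, 0);
    } else {
        var total = 0,
            i;
        for (i = 0; i < samples.length; i++) {
            total += samples[i];
        }
        //the average delay between callbacks approximates the timer resolution
        console.log("Approximate resolution: " + (total / samples.length) + "ms");
    }
}

setTimeout(tick, 0);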


Most browsers also do some sort of timer throttling based on different conditions. The intent is to save battery at opportune times – times when, theoretically, you either won't notice the difference or would gladly make the trade for improved battery life on a laptop or mobile device. Here are some circumstances where timer resolution changes:



Chrome and Internet Explorer 9+ switch back to the system timer when a laptop is running on battery power. When plugged in, the browser switches back to the 4ms timer resolution.
Firefox 5+, Chrome 11+, and Internet Explorer 10+ change timer resolution in inactive tabs to 1000 milliseconds.[5]

Mobile Safari on iOS5 and Silk on the Kindle Fire freeze the timer completely when you switch to a different app. The timer restarts when you switch back to the browser.

Browsers will likely continue to make adjustments to timer resolution as it pertains to power consumption on battery-powered devices. The HTML5 spec leaves room for browser vendors to make such changes.


Conclusion

There has been a silent timer resolution evolution going on as browsers have developed over the past few years. Timer resolution isn't one of those topics that gets discussed frequently, but if you're using setTimeout() and setInterval(), it pays to have a deeper understanding of the functionality. We're getting closer to the point of having per-millisecond control of the browser. When someone figures out how to manage timers without CPU interrupts, we're likely to see timer resolution drop again. Until then, keep 4ms in mind, but remember that you still won't always get that.


References

1. Accuracy of JavaScript Time by John Resig
2. Chrome: Cranking up the clock by Mike Belshe
3. CPU Power Utilization on Intel® Architectures by Karthik Krishnan
4. Timers in HTML5
5. Clamp setTimeout/setInterval to something higher than 10ms in inactive tabs
6. Timer Resolution Test by Ryan Grove




November 29, 2011

How content delivery networks (CDNs) work

Content delivery networks (CDNs) are an important part of Internet infrastructure that are frequently used without a full understanding of what's happening behind the scenes. You'll hear people saying, "oh, we put that on the CDN" or "make sure static assets go on the CDN," when they have only a rudimentary idea of what CDNs are and how they work. As with most pieces of technology, CDNs are not magic and actually work in a pretty simple and straightforward manner.


When a web browser makes a request for a resource, the first step is to make a DNS request. Making a DNS request is a lot like looking up a phone number in a phone book: the browser gives the domain name and expects to receive an IP address back. With the IP address, the browser can then contact the web server directly for subsequent requests (there are actually multiple layers of DNS caching, but that's beyond the scope of this post). For your simple blog or small commercial web site, a domain name may have a single IP address; for large web applications, a single domain name may have multiple IP addresses.


Physics determines how fast one computer can contact another over physical connections, and so attempting to access a server in China from a computer in the United States will take longer than trying to access a U.S. server from within the U.S. To improve user experience and lower transmission costs, large companies set up servers with copies of data in strategic geographic locations around the world. This is called a CDN, and these servers are called edge servers, as they are closest on the company's network to the end-user.


DNS resolution

When the browser makes a DNS request for a domain name that is handled by a CDN, the process is slightly different than with small, one-IP sites. The server handling DNS requests for the domain name looks at the incoming request to determine the best set of servers to handle it. At its simplest, the DNS server does a geographic lookup based on the DNS resolver's IP address and then returns an IP address for an edge server that is physically closest to that area. So if I'm making a request and the DNS resolver I'm routed to is in Virginia, I'll be given an IP address for a server on the East coast; if I make the same request through a DNS resolver in California, I'll be given an IP address for a server on the West coast. Keep in mind that you may not end up with a DNS resolver in the same geographic location as the one from which you're making the request.


United States CDNs often have edge servers located on the Pacific and Atlantic coasts


That's the first step of the process: getting the request to the closest server possible. Keep in mind that companies may optimize their CDNs in other ways as well, for instance, redirecting to a server that is cheaper to run or one that is sitting idle while another is almost at capacity. In any case, the CDN smartly returns the best possible IP address to handle the request.


Accessing content

Edge servers are proxy caches that work in a manner similar to the browser caches. When a request comes into an edge server, it first checks the cache to see if the content is present. The cache key is the entire URL including query string (just like in a browser). If the content is in cache and the cache entry hasn't expired, then the content is served directly from the edge server.


If, on the other hand, the content is not in the cache or the cache entry has expired, then the edge server makes a request to the origin server to retrieve the information. The origin server is the source of truth for content and is capable of serving all of the content that is available on the CDN. When the edge server receives the response from the origin server, it stores the content in cache based on the HTTP headers of the response.
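
The decision logic on the edge is conceptually quite simple. Here's a minimal sketch in JavaScript (the request, cache, and originServer objects are hypothetical stand-ins, not any particular CDN's API):

function handleRequest(request, cache, originServer) {
    var key = request.url;                  //full URL, including the query string
    var entry = cache.get(key);

    //cache hit: serve directly from the edge server
    if (entry && !entry.isExpired()) {
        return entry.response;
    }

    //cache miss or stale entry: go back to the origin server
    var response = originServer.fetch(request);
    cache.store(key, response, response.headers);  //honor the HTTP caching headers
    return response;
}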


When a request comes into an edge server it either contacts the origin server for the content or serves it from cache


Yahoo! created and open sourced the Apache Traffic Server, which is what Yahoo! uses in its CDN for managing this traffic. Reading through the Traffic Server documentation is highly recommended if you'd like to learn more about how cache proxies work.


Example

For example, Yahoo! serves the YUI library files off of its CDN using a tool called the combo handler. The combo handler takes a request whose query string contains filenames and concatenates the files into a single response. Here's a sample URL:


http://yui.yahooapis.com/combo?3.4.1/...

The domain yui.yahooapis.com is part of the Yahoo! CDN and will route you to the closest edge server based on your location. This particular request combines two files, yui-base-min.js and array-extras-min.js, into a single response. The logic to perform this concatenation doesn't exist on the edge servers; it exists only on the origin server. So if an edge server receives this request and has no content, a request is made to the origin server to retrieve the content. The origin server is running the proprietary combo handler (specified by /combo? in the URL), so it combines the files and returns the result to the edge server. The edge server can then serve up the appropriate content.


What does static mean?

I frequently get confused looks when I describe systems similar to the combo handler. There is a misconception that CDNs act like FTP repositories, where you simply upload static files so that others can retrieve them. I hope that it's clear from the previous section that this is not the case. An edge server is a proxy; the origin server is the one that tells the edge server exactly what content should be returned for a particular request. The origin server may be running Java, Ruby, Node.js, or any other type of web server and, therefore, can do anything it wants. The edge server does nothing but make requests and serve content. So the YUI combo handler exists only on the origin server and not on the edge servers.


If that's the case, why not serve everything from the CDN? The CDN is a cache, meaning that it has value when it can serve data directly without needing to contact the origin server. If an edge server needs to make a request to the origin server for every request, then it has no value (and in fact, costs more than just making the request to the origin server itself).


The reason JavaScript, CSS, images, Flash, audio, and video are frequently served from CDNs is precisely because they don't change that frequently. That means not only will the same user receive content from cache, but all users will receive the same data from cache. Once the cache is primed with content, all users benefit. A site's homepage is a poor candidate for edge caching because it's frequently customized per user and needs to be updated several times throughout the day.


Cache expiration

Yahoo! performance guidelines specify that static assets should have far-future Expires headers. This is for two reasons: first, so the browser will cache the resources for a long time, and second, so the CDN will cache the resources for a long time. Doing so also means you can't use the same filename twice, because it may be cached in at least two places and users will receive the cached version instead of the new one for quite a while.


There are several ways to work around this. The YUI library uses directories containing the version number of the library to differentiate file versions. It's also common to append identifiers to the end of a filename, such as an MD5 hash or source control revision. Any of these techniques ensures that users are receiving the most up-to-date version of the file while maintaining far-future Expires headers on all requests.


Conclusion

CDNs are an important part of today's Internet, and they're only going to become more important as time goes on. Even now, companies are hard at work trying to figure out ways to move more functionality to edge servers in order to provide users with the fastest possible experience. This includes a technique called Edge Side Includes (ESI), which is designed to serve partial pages from cache. A good understanding of CDNs and how they work is key to unlocking greater performance benefits for users.


Update (29 Nov 2011): Added information about DNS resolvers based on Philip's comments.





November 22, 2011

Book review: Closure: the Definitive Guide

I have to admit that I couldn't remember who Michael Bolin was when he first contacted me to review Closure: The Definitive Guide. The embarrassment turned worse when he reminded me that he interviewed me at Google almost five years ago, before I joined Yahoo!. Having had the chance since then to grab lunch with Michael, the somewhat blurry events of that day in 2006 came flashing back – especially because he had done the "lunchtime interview" portion of my day.


Since that time, Michael had gone on to contribute to Google's Closure library, an incredibly sophisticated universe of tools for developing web applications. There literally could be no better person to write The Definitive Guide for Closure than Michael. Having very little knowledge of Closure, I was the target audience for this book. I had no idea just how large the Closure library truly was, encompassing a JavaScript library, the compiler, and a templating system (Google also just recently released, with the help of Michael, their CSS system as well – though this is not included in the book).


The book is incredibly well-written considering the daunting task of documenting each piece of the Closure ecosystem. Michael does an admirable job of not just explaining what different pieces do, but also why they do it that way. Every time I found myself saying, "hmm, I wonder why they did that," the next couple of paragraphs were spent explaining the design decisions and how they work together with the overriding design of Closure. It's not hard to see that Google engineers spent a lot of time thinking through every little piece of Closure, and Michael consistently dropping in details of these decisions is, in my opinion, the real gold of this book. You really get a sense for what it's like to build a massive, fine-tuned web application from the perspective of Google. While you may or may not agree with the approach, it's very nice to actually read about the decisions rather than just, "accept this, we know what we're doing."


That being said, I did walk away from the book feeling that the entire Closure ecosystem is too complex for me to start using. The ability to quickly get up and running just isn't there, and that's okay, because Closure wasn't intended for small web sites. However, the vast amount of knowledge you need to keep in your brain to use Closure to its peak is quite overwhelming. Michael says that you don't need to use all of the pieces of Closure if you don't want to, and that's absolutely true. However, the value of the whole system used together is clearly much greater than the sum of its parts. So while I could see people using the Closure Compiler without using the Closure Library, I'd be shocked to find people doing the opposite.


I commend Michael for taking an extremely complex project and writing about it so thoughtfully and thoroughly. If you want to learn about Closure and the way that Google thinks about web application development, this is certainly an excellent book. However, I don't think the book will pull people away from their current set of tools.





November 18, 2011

Setting up multi-user Apache on EC2

Not too long ago, I wrote about how to set up an EC2 instance in five minutes. I've grown even more enamored with how quickly I can create a server, maybe use it for just an hour or two, and then tear it down at a cost of a few cents. In one of my projects, a class needed a public-facing shared web server where people could upload files and access them from mobile devices.


Knowing that Apache supports multi-user environments, I immediately went towards that solution. Unfortunately, and surprisingly given the massive amount of Apache documentation, I didn't find an end-to-end tutorial on how to set up Apache for multiple users. I pieced together the configuration options and steps for creating users from multiple tutorials and pieces of documentation I found scattered around the web.


Enabling user directories

The most useful part of multi-user Apache is the ability for every user to have a directory inside of their home directory where files are served to the public. This is typically done in the public_html directory, allowing you to access files via /~username/ in a web browser. For example, the user foobot has a home directory of /home/foobot and a publicly-accessible web directory in /home/foobot/public_html. So /home/foobot/public_html/index.html is accessible from a web browser as http://www.example.com/~foobot/index....


Apache disables user directories by default, but it's pretty simple to turn them back on. It just takes a few changes to the httpd.conf file, which is located at /etc/httpd/conf/httpd.conf on default EC2 configurations. There's already a section in the file for setting up user directories; all you need to do is change the default settings so it looks like this:



#
# UserDir is disabled by default since it can confirm the presence
# of a username on the system (depending on home directory
# permissions).
#
UserDir enabled all

#
# To enable requests to /~user/ to serve the user's public_html
# directory, remove the "UserDir disabled" line above, and uncomment
# the following line instead:
#
UserDir public_html



The section immediately underneath defines any overrides for the user directories. If you want to allow the use of .htaccess in these directories, then make sure to provide the options here. This is the configuration I use to allow .htaccess:



<Directory /home/*/public_html>
    AllowOverride All
    Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec
</Directory>

Once you've made these changes, you need to reload the server configuration:


sudo service httpd reload

Now the server is ready to support multiple users. All that's left is to create a few users.


Creating users

Each user must have a public_html directory in their home directory with appropriate permissions set. The permissions necessary are:



The user's home directory must have 711.
The public_html directory must have 755.

For any users you already have, you'll have to manually create the public_html directory and set permissions appropriately.


In my case, I was creating completely new users, so I found a script for creating users and then modified it to create the public_html directory and set permissions appropriately (as well as fixing a minor bug). You must use sudo to run this script:


#!/bin/bash
# Script to add a user to Linux system
if [ $(id -u) -eq 0 ]; then
    read -p "Enter username : " username
    read -s -p "Enter password : " password
    egrep "^$username:" /etc/passwd >/dev/null
    if [ $? -eq 0 ]; then
        echo "$username exists!"
        exit 1
    else
        pass=$(perl -e 'print crypt($ARGV[0], "password")' $password)
        useradd -m -p $pass $username
        if [ $? -eq 0 ]; then
            mkdir /home/$username/public_html
            chmod 711 /home/$username
            chmod 755 /home/$username/public_html
            cp /home/temp/.htaccess /home/$username/public_html
            chown -R $username /home/$username/public_html

            echo "User has been added to system!"
        else
            echo "Failed to add a user!"
        fi
    fi
else
    echo "Only root may add a user to the system"
    exit 2
fi

When run, this script prompts you to enter a username and password for a new user. It then creates the user as well as the public_html directory with the correct permissions.


Once the user is created using this script, they can SSH into the server and use SCP to copy files into their public_html directory.


Enjoy

Setting up a multi-user Apache environment is incredibly useful in situations where each user needs a separate web space. Common situations are professional and academic classes as well as professional organizations where developers need to be able to share things easily. A multi-user Apache environment behind a firewall is a nice alternative for file sharing as well.





November 4, 2011

Custom types (classes) using object literals in JavaScript

This past week, Jeremy Ashkenas (of CoffeeScript fame) started a flurry of discussion around class syntax for JavaScript. ECMAScript Harmony is scheduled to have classes and the proposal has been up for a while. Of course, JavaScript has never had a true concept of classes (which is why I call them "types" instead), and the current strawman is no exception – it simply creates some syntactic sugar on top of the current constructor/prototype method of defining custom types. An example:


class Color {

    constructor(hex) {
        ...
    }

    public r = 1;
    public g = 1;
    public b = 1;

    copy(color) {
        ...
    }

    setRGB(r, g, b) {
        ...
    }

    setHSV(h, s, v) {
        ...
    }

}

This would be instead of defining a separate constructor and prototype. The above desugars to:


function Color(hex){
    ...
}

Color.prototype.r = 1;
Color.prototype.g = 1;
Color.prototype.b = 1;

Color.prototype.copy = function(color){
    ...
};

Color.prototype.setRGB = function(r,g,b){
    ...
};

Color.prototype.setHSV = function(h,s,v){
    ...
};

Essentially the new class syntax just helps you define the prototype of the new type while the constructor is responsible for creating instance members.


Jeremy didn't like it, and so came up with an alternate proposal in the form of a gist. At the center of his idea: use the familiar object literal syntax to define new types with just a small amount of syntactic sugar to make things easier.


class Color {

    constructor: function(hex) {
        ...
    },

    r: 1, g: 1, b: 1,

    copy: function(color) {
        ...
    },

    setRGB: function(r, g, b) {
        ...
    },

    setHSV: function(h, s, v) {
        ...
    }

}

Jeremy's proposal looks closer to object literal syntax with the class keyword and the type name. A lot of commenters on the gist liked this idea – I'm actually not one of them; I think the proposed Harmony syntax is much more succinct and implements sugaring of known patterns in a straightforward way.


Regardless, there is something to Jeremy's approach of being able to define new custom types in one step. It's pretty trivial to do that today using JavaScript. First, you need a simple function:


function type(details){
    details.constructor.prototype = details;
    return details.constructor;
}

That's all it takes. Basic usage:


var Color = type({
    constructor: function(hex) {
        ...
    },

    r: 1, g: 1, b: 1,

    copy: function(color) {
        ...
    },

    setRGB: function(r, g, b) {
        ...
    },

    setHSV: function(h, s, v) {
        ...
    }
});

var mycolor = new Color("ffffff");

The syntax is just a bit different from Jeremy's as it adheres to ECMAScript 5 syntax, but works pretty much the same way. The key to understanding this approach is understanding the constructor property. You may be used to accessing constructor from an object instance to get the function that created the object. However, constructor is actually a prototype property, shared by all instances. For any given function created from scratch:


function f(){}
console.log(f === f.prototype.constructor); //true

So basically, the type() function takes the passed-in object and looks for the constructor property. At first, details.constructor.prototype has its default value. The function overwrites the prototype with the details object itself (which already has an appropriate reference to constructor). Then, it simply returns the now-fully-formed constructor function. You can start to use the returned constructor with new immediately.


In the absence of Harmony's new syntax, I've very quickly come to like this approach. Using a single object literal is quick and easy, and of course, works right now in all browsers. There are also any number of ways you could modify type() in order to support things like inheritance and mixins, depending on your use cases.
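
As one example (my own sketch, not something from Jeremy's gist or the Harmony proposal), type() could accept an optional parent type and chain the prototypes:

//A variation of type() that supports simple prototype-based inheritance
function type(parent, details) {
    if (!details) {            //only one argument passed: no parent type
        details = parent;
        parent = null;
    }

    var constructor = details.constructor;

    if (parent) {
        //chain the prototypes so instanceof works against the parent type
        var F = function() {};
        F.prototype = parent.prototype;
        constructor.prototype = new F();

        //copy the members of details onto the new prototype
        for (var name in details) {
            if (details.hasOwnProperty(name)) {
                constructor.prototype[name] = details[name];
            }
        }
        constructor.prototype.constructor = constructor;
    } else {
        constructor.prototype = details;
    }

    return constructor;
}

var Base = type({
    constructor: function(name) { this.name = name; }
});

var Sub = type(Base, {
    constructor: function(name) { Base.call(this, name); },
    hello: function() { return "Hello, " + this.name; }
});

var s = new Sub("world");
console.log(s instanceof Base);   //true
console.log(s.hello());           //"Hello, world"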


In the end, I'm looking forward to having some syntactic sugar for defining custom types in JavaScript. We've battled for too long with overly-verbose composition statements while those using class-based languages looked over our shoulders and laughed. I, for one, welcome our new Harmony overlords.


Update (04-Nov-2011): Fixed Harmony example.





October 25, 2011

Improving Rhino CLI utility performance

Back when I worked at Yahoo!, we spent a lot of time improving our build and checkin systems. Part of that meant using JSLint for JavaScript validation and a tool I wrote for CSS validation (not related to CSS Lint). Both of these tools were run using Rhino, the Java-based command line JavaScript engine. We started using these tools and quickly found them to be incredibly useful…when they were actually run. Developers seemed to have trouble remembering to run the lint check.


This wasn't necessarily the developers' fault. There were actually a number of lint checks that could be run based on the type of work being done. We soon determined that we'd combine all of the checks into a single step so that everyone always ran the same check every time. And that's when we discovered a problem: that single step was taking minutes to complete on our massive code base. Not very conducive to productivity.


After doing some digging, we discovered that the root of the problem was the Rhino-based utilities. While we tinkered with the JavaScript and got some improvements, it wasn't anywhere near good enough. The biggest improvement came from changing the utilities in a very simple way: allowing them to process more than one file.


To understand the change, consider how you would currently run JSHint with Rhino:


java -jar js.jar jshint-rhino.js yourfile.js

This executes the jshint-rhino.js file using Rhino and passes in yourfile.js as the file to run on. Most build systems using JSHint basically run this same line once for every single file. For example, the Ant target I was using in the CSS Lint build script looked essentially like this (an <apply> task that spawns one JVM per JavaScript file):

<target name="lint">
    <apply executable="java" parallel="false" failonerror="true">
        <fileset dir="${src.dir}" includes="**/*.js" />
        <arg line="-jar"/>
        <arg path="${lib.dir}/js.jar"/>
        <arg path="${lib.dir}/jshint-rhino.js"/>
        <srcfile/>
    </apply>
</target>
Running this target would result in each file being run through JSHint. So if, for example, there were only five files, this is the equivalent:


java -jar js.jar jshint-rhino.js yourfile1.js
java -jar js.jar jshint-rhino.js yourfile2.js
java -jar js.jar jshint-rhino.js yourfile3.js
java -jar js.jar jshint-rhino.js yourfile4.js
java -jar js.jar jshint-rhino.js yourfile5.js

Nothing wrong with that, really. The target took about 45 seconds to run through the entire CSS Lint codebase. That isn't bad in the grand scheme of things but it is rather painful when you want to run the check frequently. Can you spot the problem?


Consider this: even though Rhino isn't as fast as Node.js, it is still pretty fast. So where do you think the majority of the time is spent?


The problem is in the setting up and tearing down of the JVM for every file. That is a fixed cost you're paying each and every time you run Java, and if you have dozens of files in your code base, you're paying that cost dozens of times. What you really want to do is the equivalent of:


java -jar js.jar jshint-rhino.js yourfile1.js yourfile2.js yourfile3.js yourfile4.js yourfile5.js

Running all of your JavaScript through JSHint using a single JVM will be much, much faster than running each file through individually. Unfortunately, the JSHint Rhino CLI didn't support passing in multiple files, so as part of my work I made the change and submitted a pull request. That change has now been merged into JSHint.


Once I had JSHint capable of evaluating multiple files in one pass, I changed the Ant target to the following (thanks to Tim Beadle on the CSS Lint mailing list for this). The essential difference is parallel="true", which passes all of the filenames to a single java invocation:

<target name="lint">
    <apply executable="java" parallel="true" failonerror="true">
        <fileset dir="${src.dir}" includes="**/*.js" />
        <arg line="-jar"/>
        <arg path="${lib.dir}/js.jar"/>
        <arg path="${lib.dir}/jshint-rhino.js"/>
        <srcfile/>
    </apply>
</target>
Now, running ant lint on the CSS Lint code base takes 3 seconds. That's 3 seconds, down from 45 seconds before the change. Not bad.


The CSS Lint CLI for both Rhino and Node.js already support passing in multiple files on the command line, and so you can take advantage of this same pattern to validate all of your files very quickly.


The bottom line here is to keep an eye on your Rhino CLIs. The overhead of creating and destroying a JVM is something you shouldn't be penalized for multiple times while using a utility on your code. If you're using any Rhino-based JavaScript utilities, ask the author to support passing in multiple files. If the utility can already accept multiple files, then make sure your build scripts are actually passing in multiple files at once.





October 24, 2011

CSS Lint v0.8.0 now available

I'm happy to announce that version 0.8.0 of CSS Lint is now available on the web site, via GitHub, and through npm. This release focused very heavily on two things: documentation and web UI improvements.


The documentation work has continued over on the wiki, where we've added some initial documentation on using CSS Lint in a build system (we'd love contributions to this, by the way). The Developer Guide has been reorganized to be more useful and more details have been added regarding tests and rule creation. The rules documentation has also had a facelift as part of our ongoing efforts to improve documentation around rules.


Other notable changes in this release:



Based on a suggestion from Kevin Sweeney, the web UI now remembers your settings in between visits.
You can also specify which rules to use as a hash in the URL for the web UI. For example, http://csslint.net#warnings=ids,import,important. The hash is automatically updated whenever you change which rules you're applying, so it's easy to copy-paste the URL to a friend and share your favorite settings.
Eric Wendelin added an error/warning indication to the compact CLI output format. As a bonus, you can copy the hash and use it with the CLI.
Based on a suggestion from Mahonnaise, you can now configure warnings and errors by rule in the CSS Lint CLI by specifying --warnings and --errors with lists of rules (see the example after this list). The --rules option is deprecated. For more, check out the CLI documentation.
After a lengthy debate, the Broken Box Model rule has been renamed to the Box Model Sizing rule to better indicate the rule's intent.
The rule that checks for known properties got smarter. Instead of just checking property names, it also checks their values. Not all properties are supported yet, but CSS Lint is now capable of helping you out with a large number of properties.
New Rule: The box-sizing rule warns when box-sizing is used (rule documentation).
New Rule: Our first accessibility rule warns when you use outline: none (rule documentation).
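
For example, using the Node.js CLI and some of the rule names mentioned in this post (the file name here is just a placeholder), a command might look like this:

csslint --errors=import,important --warnings=box-sizing,ids styles.css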

Thanks once again for all of the great feedback and contributions we've been receiving from the growing CSS Lint community.





October 20, 2011

So you want to write JavaScript for a living? [repost]

In October 2007, Hans Brough published a blog post entitled, "So you want to write JavaScript for a living?" Hans put a lot of effort into the post, contacting me as well as several others to get quotes and insights into the hiring process for JavaScript development. Through a series of unfortunate events, the article ended up being lost at its original site. I told him at the time that if he ever found a draft, to let me know and I would repost it for posterity.


Just recently Hans contacted me to let me know he had found a copy of the post. What follows is Hans' original article, reposted with permission. It is a bit dated but still a nice read to see how far we've come.



By Hans Brough


What do you need to know if you are interviewing for a job that involves Javascript development? What kind of expectations do employers have of candidates now that the state of client side development has been changed with the rise of asynchronous JavaScript and the often slick, supporting interfaces? These were questions I was asking myself after a friend pointed me to an interesting job posting over at Meebo that included some JavaScript puzzlers on logical operators, DOM oddities and… well that's all I should say so as not to drop any hints. At any rate I thought it was time to do a reality check and ask members of the development community what they expect a candidate to bring to the table.


When I asked Elaine Wherry, Ajax Girl and co-founder at Meebo, how her puzzler questions were working she had this to say:


We see many candidates apply for the Front End Developer position with a web developer background. We're looking for people with good CS fundamentals and guerilla debugging skills – these questions help us focus on that pool.


It seems that over the last few years everyone is willing to get their hands dirty with a little JavaScript. As Elaine implies above, those using the language come from a wide range of backgrounds, which almost certainly guarantees a wide range of experience levels and approaches to problem solving.


Neelesh Tendulkar, Senior Software Engineer at Simply Hired, approaches these differences with a programming exercise named 'buzz' that helps him understand a candidate's approach to problem solving.


I understand that an interview is a nerve wracking process, so when I'm looking at what they generate, I'm not looking for it to be perfect on the first try – very seldom do folks get it completely correct right off the bat. If they make a mistake I might ask them a question like "Ok, that looks good, but what happens when your program gets to a number like X? Walk me through what happens." Usually they'll be able to figure out what's wrong here and correct it. Even if the candidate makes several mistakes, if they're able to take vague hints like the one I gave above and interpret these and modify their program, then that tells me a lot about their programming ability.


At some point you will be asked specifics about the language that cover topics beyond basic programming itself. What you need to know depends, of course, on the position you're applying for, but everyone should know about basic DOM manipulation. Tom Trenka, a contributor to the Dojo toolkit, puts it this way:


If your candidate doesn't understand getElementById, getElementsByTagName, appendChild and removeChild, you have a problem


Nicholas Zakas, author of Professional JavaScript for Web Developers, said virtually the same thing:


…you need to know how to create an element on the fly, get a reference to any element on the page, insert, remove, replace, etc. nodes in the page. These methods should be memorized!


This presumes you know a bit about how the document object model is put together. It's safe to say that before going into an interview you should be able to look at a given page and mentally traverse its structure. At the very least you need a basic understanding of how element nodes relate to one another on the page. This might be a great launching point into a discussion about how semantically correct markup can make your life easier once you start adding behaviors to a page.
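
To make that concrete, here's the sort of thing you should be able to write off the top of your head (a small sketch that assumes a <ul id="nav"> element exists on the page):

//create an element on the fly and insert it into the page
var item = document.createElement("li");
item.appendChild(document.createTextNode("New item"));

var list = document.getElementById("nav");
list.appendChild(item);

//and remove it again
list.removeChild(item);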


Another must-know topic is working with events and event handlers across browsers, as Nicholas notes:


No modern web application can survive without event handlers. Knowledge of the differences across browsers and issues surrounding event handling are a must.
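
The classic illustration of those cross-browser differences is a small helper along these lines (a sketch only, roughly reflecting the browser landscape at the time this was written):

function addListener(target, type, handler) {
    if (target.addEventListener) {              //DOM Level 2 (W3C model)
        target.addEventListener(type, handler, false);
    } else if (target.attachEvent) {            //the Internet Explorer model
        target.attachEvent("on" + type, handler);
    } else {
        target["on" + type] = handler;          //DOM Level 0 fallback
    }
}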


I think that if you are working in a web shop that is not dabbling in asynchronous programming (aka Ajax) or building high-traffic web applications, you might be able to get away with a solid knowledge of the above as well as a strong understanding of XHTML and CSS. As Tom mentioned, "there's a lot of halfway decent coding one can still do with JavaScript without having guru-level or even intermediate levels of understanding."


Assuming you want to work at a job that involves building web apps, there are a few more must-knows to add to our list. Again, here is a quote from Nicholas about Ajax:


You need to know not just what the acronym stands for, but practically speaking, what the technique is and why it's important. A working knowledge of multiple ways to implement Ajax communication is always a plus, since using XMLHttpRequest is not always an option.
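
A bare-bones sketch of one way to do that without a library (the URL is just a placeholder):

function ajaxGet(url, callback) {
    var xhr = window.XMLHttpRequest ?
            new XMLHttpRequest() :
            new ActiveXObject("Microsoft.XMLHTTP");   //older versions of IE

    xhr.onreadystatechange = function() {
        if (xhr.readyState === 4 && xhr.status === 200) {
            callback(xhr.responseText);    //hand the response to the callback
        }
    };

    xhr.open("get", url, true);
    xhr.send(null);
}

ajaxGet("/data.txt", function(text) {
    console.log(text);
});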


If you're making an Ajax call there is a good chance you'll need to know about callback functions. Be prepared to talk about what a callback function is, why it's used, and how to write one efficiently. Additionally, be prepared to talk about supporting questions like this one from Neelesh:


Do you have any experience with JSON? If so, … why do you think some developers may prefer to use this as the envelope language as opposed to XML?


Another topic to know well is object-oriented programming in JavaScript. If you're going to be part of a team building a web application, then considerations like re-usability and scalability are paramount. Tom had this to say on the topic:


This means understanding how to set up a prototype chain and how to make sure a base constructor is applied correctly in the process of object instantiation. I suppose just understanding what I said might be enough but they should be able to recognize the following pattern:



function mySub(){
    myBase.call(this);
    // additional info here
}
mySub.prototype = new myBase;
mySub.prototype.constructor = mySub;

This also means being able to talk about JavaScript's prototype-based inheritance vs. the class inheritance used in other languages. Talking about inheritance in JavaScript can get into the deeper end of the pool pretty quickly. To start out with, you might be asked a simpler question, as Eric Todd, Senior Application Engineer at Corbis, mentions:


I like "this" [the keyword] because I was hacking javascript for maybe 2 years before I ever saw the word, and I struggled to get it… mostly because javascript was the first language I had ever used and did [not] understand OO programming and the use of "this".


This is a good indicator question as to whether the candidate has any sense of objects within JavaScript. Another is to simply ask them to list a few of JavaScript's core objects, which may seem silly but certainly points out any glaring gaps in their knowledge of the language.


A point I like to explore is object notation, as it can get to the heart of understanding objects in JavaScript. The examples do not need to be complicated to work well. For example, I might show the candidate the following object literal:


var candidate = {name:{first:'hans',last:'brough'},age:'15'};

I ask them to demonstrate how they access its properties, add a method, or otherwise modify the object. Even better, ask them to demonstrate how the same object could be created in different ways. It's a simple example that you can build on or branch off into related topics depending on the candidate's experience. For example, if they don't know what an object literal is, then perhaps it's an indicator that the candidate has not used JSON strings in asynchronous scripts. This is also a good launching point into another 'must have' noted by Tom:


The basics of JS object mutability, and using that to isolate code. Basically faking namespaces by using objects to hold other objects.
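
In practice that looks something like this (a minimal sketch with a made-up MYAPP namespace):

//fake a namespace with a plain object so the global scope stays clean
var MYAPP = MYAPP || {};

MYAPP.utils = {
    trim: function(text) {
        return text.replace(/^\s+|\s+$/g, "");
    }
};

MYAPP.utils.trim("  hello  ");   //"hello"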


All in all, given the modern usage of JavaScript in web apps today, it is an excellent idea to grasp the fundamentals of OOP in JavaScript.


Another topic you should be prepared to talk about is any experience with libraries like Dojo, Prototype, or effects libraries like Script.aculo.us. There are so many libraries out there now that someone is bound to at least ask you about your preference. Although, as Nicholas points out, they should not serve as too much of a crutch:


It's really important for you to be able to write your own code without relying on JavaScript libraries like Dojo, Prototype, etc. Libraries like these are helpful in some cases, but when it comes to enterprise-level web applications, you will run into situations that the libraries didn't anticipate and you'll need to know how to maneuver around with them. A sure fire way to say "no hire" is by answering a question with, "well, if I could use Prototype|Dojo|etc., then I could do it this way…but otherwise, I don't know how."


So to sum up this little research project, here is a short list of the minimum to know when you interview for a JS development job:



problem solving, debugging and fundamental CS skills
DOM manipulation
Events and event handling, including differences between the IE model and the W3C model
Asynchronous programming (Ajax)
Object-oriented programming, including setting up prototype-based inheritance
Familiarity with popular JS libraries

Keep in mind it's not only about how much you know. Here's a parting thought from Neelesh:


… in addition to trying to ascertain the candidate's technical prowess, I'm also looking at other things. How well does this person communicate (especially in a pressure situation like an interview)? What kind of personality do they have? How do they interact with me? How do they approach the problems that I give them? In the end, what I'm trying to determine is how good of a fit this candidate is to not only the position but to the company as well. There have been times when I've interviewed folks who are technically brilliant, but probably wouldn't integrate well into the culture of the company.




