Nicholas C. Zakas's Blog, page 14

Working with files in JavaScript, Part 4: Object URLs

Up to this point in the blog series, you’ve learned how to use files in the traditional way. You can upload files to the server and you can read file data from disk. These all represent the most common ways of dealing with files. However, there is a completely new way to deal with files that has the capacity to simplify some common tasks. This new way is to use object URLs.

What is an object URL?

Object URLs are URLs that point to files on disk. Suppose, for example, that you want to display an image from the user’s system on a web page. The server never needs to know about the file, so there’s no need to upload it. You just want to load the file into a page. You could, as shown in the previous posts, get a reference to a File object, read the data into a data URI, and then assign the data URI to an <img> element. But think of all the waste: the image already exists on disk, so why read the image into another format in order to use it? If you create an object URL, you can assign that to the <img> and access that local file directly.

How does it work?

The File API[1] defines a global object called URL that has two methods. The first is createObjectURL(), which accepts a reference to a File and returns an object URL. This instructs the browser to create and manage a URL to the local file. The second method is revokeObjectURL(), which instructs the browser to destroy the URL that is passed into it, effectively freeing up memory. Of course, all object URLs are revoked once the web page is unloaded, but it’s good to free them up when they’re no longer needed anyway.

Support for the URL object isn’t as good as for other parts of the File API. As of the time of my writing, Internet Explorer 10+ and Firefox 9+ support a global URL object. Chrome supports it in the form of webkitURL while Safari and Opera have no support.

Example

So how would you display an image from disk without reading the data first? Suppose that you’ve given the user a way to select a file and now have a reference to it in a variable called file. You can then use the following:

var URL = window.URL || window.webkitURL,
    imageUrl,
    image;

if (URL) {
    imageUrl = URL.createObjectURL(file);
    image = document.createElement("img");

    image.onload = function() {
        URL.revokeObjectURL(imageUrl);
    };

    image.src = imageUrl;
    document.body.appendChild(image);
}

This example creates a local URL variable that normalizes the browser implementations. Assuming that URL is supported, the code goes on to create an object URL directly from file and stores it in imageUrl. A new <img> element is created and given an onload event handler that revokes the object URL (more on that in a minute). Then, the object URL is assigned to the src property and the element is added to the page (you may want to use an already-existing image).

Why revoke the object URL once the image is loaded? After the image is loaded, the URL is no longer needed unless you intend to reuse it with another element. In this example, the image is being loaded into a single element, and once the image has been completely loaded, the URL isn’t serving any useful purpose. That’s the perfect time to free up any memory associated with it.
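If the same object URL does feed more than one element, revoking it in the first onload handler would be too early. One way to handle that, sketched here under the assumption that you know up front how many elements will use the URL, is to count finished consumers and revoke after the last one. The helper name shareObjectUrl and its urlApi parameter (the normalized URL object from the example above) are hypothetical and not part of the File API.

```javascript
// Hypothetical helper: creates one object URL for several consumers and
// revokes it only after every consumer has reported that it is finished.
function shareObjectUrl(urlApi, file, consumers) {
    var objectUrl = urlApi.createObjectURL(file),
        remaining = consumers;

    return {
        url: objectUrl,

        // Call once per consumer, for example from each image's onload handler.
        release: function() {
            remaining -= 1;
            if (remaining === 0) {
                urlApi.revokeObjectURL(objectUrl);
            }
        }
    };
}
```

In a page, you would pass the normalized URL object as urlApi, assign the returned url to each element, and call release() from each element’s onload handler.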

Security and other considerations

At first glance, this capability is a bit scary. You’re actually loading a file directly from the user’s machine via a URL. There are, of course, security implications to this capability. The URL itself isn’t a big security issue because it’s a URL that’s assigned dynamically by the browser and would be useless on any other computer. What about cross-origin?

The File API disallows using object URLs on different origins. When an object URL is created, it is tied to the origin of the page in which the JavaScript executed, so you can’t use an object URL from www.wrox.com on a page at p2p.wrox.com (an error occurs). However, two pages from www.wrox.com, where one is embedded in the other with an iframe, are capable of sharing object URLs.

Object URLs exist only as long as the document that created them. When the document is unloaded, all object URLs are revoked. So, it doesn’t make sense to store object URLs in client-side data storage to use later; they are useless after the page has been unloaded.

You can use object URLs anywhere the browser would make a GET request, which includes images, scripts, web workers, style sheets, audio, and video. You can never use an object URL when the browser would perform a POST, such as within a <form> whose method is set to “post”.

Up next

The ability to create URLs that link directly to local files is a powerful one. Instead of needing to read a local file into JavaScript in order to display it on a page, you can simply create a URL and point the page to it. This process greatly simplifies the use case of including local files in a page. However, the fun of working with files in JavaScript has only just begun. In the next post, you’ll learn some interesting ways to work with file data.

References

File API

View more on Nicholas C. Zakas's website »


Published on May 31, 2012 07:00

Now available: Maintainable JavaScript

I’m happy to announce that my latest book, Maintainable JavaScript, is now available in print. Thanks to the folks at O’Reilly, the ebook was released as a preview last month, but now all the edits have been completed and the book is officially done. I’m very excited about this book, even more so than some of the others, because it’s quite different from any I’ve written before.

One of the reasons I’m very excited about this book is that it’s the first book that I’ve conceived of and written entirely on my own. Each of my previous books developed through others. Professional JavaScript wasn’t the book I set out to write, but it was the one I agreed to write. I worked on the outline with Jim Minatel before finally putting the book together. Professional Ajax was entirely Jim’s (brilliant) idea. I even fought him about it and at first declined to write it. Lesson learned: Jim is the man. For High Performance JavaScript, I was approached jointly by Yahoo! and O’Reilly to write it.

Maintainable JavaScript, on the other hand, grew out of a talk I gave when I first started at Yahoo! (the embarrassing video is available in YUI Theater). It was my first ever talk, and so it was very rough. I also spilled my water right on top of my laptop about halfway through, so I was freaking out that the computer might explode. In any event, last year I was asked to reprise the talk for PayPal, and then received other requests to give the talk. In redoing the presentation, I noticed that almost everything I said in the first version still held true (aside from a few personal preferences).

In December, I kept thinking about better ways to explain the topics, and before I knew it, I had a whole book outline sketched out and was digging into writing. The book almost wrote itself, as I blasted out 45 pages on the first day. It wasn’t long before the book was written and ready to go.

I’m also excited about this book because it’s largely an opinion book. I’m telling you about my experience writing enterprise-level JavaScript in my career. Because of that, I get to share stories from my personal experience as to why some practices are better than others. To put it simply, I use the word “I” in this book, and that is a fantastic feeling. It gives the book a more personal, conversational tone than my others.

Yeah, yeah…what’s it about?

Maintainable JavaScript, like my talk of the same name, is all about writing JavaScript that will continue working for five years. Code that remains working for five years might seem like a pipe dream with the rapid evolution of browsers and web technologies, but it’s not only possible, it’s important to your team. Your code should outlive your presence on any given job, and further, it should be able to be worked on by others with ease.

To that end, Maintainable JavaScript focuses on three things:

Code Style – yes, everyone loves a good discussion about code style guidelines. I compare and contrast style guidelines from several popular style guides and add in my own opinions on what makes a good code style. In the end, style is personal, and all that really matters is that everyone on the team writes code in the same way. This part of the book takes you through all of the important stylistic considerations that you should put into your style guide (a copy of my personal style guide is included as an appendix).
Programming Practices – these go a step further than code style and instruct you on common solutions to simple problems. Programming practices are algorithms and approaches rather than syntax. Browser sniffing is a programming practice, for example. This section goes through several practices that are either good or bad, and explains why using real-life situations.
Automation – the way that you ensure style guides are followed and other errors don’t creep in over time. By having automated ways of processing and verifying code, you prevent code rot and ensure that new code is always following established guidelines. This section uses Ant as an example of how to build out an automation system that can validate, minify, concatenate, and test your code.

Unlike my other books, I believe the tips and techniques in this book will remain relevant for a long time to come. As I said, it started as a talk in 2007 and pretty much everything I mentioned is still relevant, so I hope the tips will continue to be relevant going forward. I hope you enjoy it!


Published on May 29, 2012 07:00

Working with files in JavaScript, Part 3: Progress events and errors

The FileReader object is used to read data from files that are made accessible through the browser. In my previous post, you learned how to use a FileReader object to easily read data from a file in a variety of formats. The FileReader is very similar to XMLHttpRequest in many ways.

Progress events

Progress events are becoming so common that they’re actually written up in a separate specification[1]. These events are designed to generically indicate the progress of data transfers. Such transfers occur when requesting data from the server, but also when requesting data from disk, which is what FileReader does.

There are six progress events:

loadstart – indicates that the process of loading data has begun. This event always fires first.
progress – fires multiple times as data is being loaded, giving access to intermediate data.
error – fires when loading has failed.
abort – fires when data loading has been canceled by calling abort() (available on both XMLHttpRequest and FileReader).
load – fires only when all data has been successfully read.
loadend – fires when the object has finished transferring data. Always fires and will always fire after error, abort, or load.

Two events, error and load, were discussed in my previous post. The other events give you more fine-grained control over data transfers.
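To see the order described above for yourself, you can record each event as it fires. This is a minimal sketch: traceProgressEvents and the log array are hypothetical names, and the function works with any object that exposes the six on* handler properties, such as a FileReader.

```javascript
// The six progress events, in the order they are listed above.
var PROGRESS_EVENTS = ["loadstart", "progress", "error", "abort", "load", "loadend"];

// Hypothetical helper: assigns a handler for each progress event that
// appends the event type to `log`, so the firing order can be inspected.
function traceProgressEvents(target, log) {
    PROGRESS_EVENTS.forEach(function(type) {
        target["on" + type] = function() {
            log.push(type);
        };
    });
}
```

In a browser, you might call traceProgressEvents(reader, log) on a new FileReader before starting a read and then inspect log inside a loadend handler of your own (registered with addEventListener() so that it doesn’t replace the tracing handler).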

Tracking progress

When you want to track progress of a file reader, use the progress event. The event object for this event contains three properties to monitor the data being transferred:

lengthComputable – a boolean indicating if the browser can determine the complete size of the data.
loaded – the number of bytes that have been read already.
total – the total number of bytes to be read.

The intent of this data is to allow for progress bars to be generated using the information from the progress event. For example, you may be using an HTML5 <progress> element to monitor the progress of reading a file. You can tie the progress value to the actual data using code like this:
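A minimal sketch of that wiring follows. It assumes that bar is a reference to a <progress> element and that file is a File reference; updateProgress and readWithProgress are hypothetical helper names.

```javascript
// Copies the state of a progress event onto a <progress>-like object.
// `bar` only needs writable `max` and `value` properties.
function updateProgress(bar, event) {
    // Only update when the browser knows the total size of the data.
    if (event.lengthComputable) {
        bar.max = event.total;
        bar.value = event.loaded;
    }
}

// Browser-only: starts reading `file` as text and keeps `bar` in sync.
function readWithProgress(file, bar) {
    var reader = new FileReader();

    reader.onprogress = function(event) {
        updateProgress(bar, event);
    };

    reader.readAsText(file);
    return reader;
}
```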

This is similar to the approach that Gmail uses for its drag and drop file upload implementation, where you see a progress bar immediately after dropping a file onto the email. That progress bar indicates how much of the file has been transferred to the server.

Dealing with errors

Even though you’re reading a local file, it’s still possible for the read to fail. The File API specification[2] defines four types of errors:

NotFoundError – the file can’t be found.
SecurityError – something about the file or the read is dangerous. The browser has some leeway as to when this occurs, but generally if the file is dangerous to load into the browser or the browser has been performing too many reads, you’ll see this error.
NotReadableError – the file exists but can’t be read, most likely due to a permissions problem.
EncodingError – primarily when trying to read as a data URI and the length of the resulting data URI is beyond the maximum length supported by the browser.

When an error occurs during a file read, the FileReader object’s error property is assigned to be an instance of one of the above mentioned errors. At least, that’s how the spec is written. In reality, browsers implement this as a FileError object that has a code property indicating the type of error that has occurred. Each error type is represented by a numeric constant value:

FileError.NOT_FOUND_ERR for file not found errors.
FileError.SECURITY_ERR for security errors.
FileError.NOT_READABLE_ERR for not readable errors.
FileError.ENCODING_ERR for encoding errors.
FileError.ABORT_ERR when abort() is called while there is no read in progress.

You can test for the type of error either during the error event or during loadend:
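A minimal sketch of such a check follows. It assumes the numeric values that the FileError interface assigned to its constants (1 through 5) instead of reading them from a global FileError object, which a browser may not expose. It also checks error.name first, since current browsers report read failures as a DOMException with a name such as NotFoundError. The helper names and the messages are hypothetical.

```javascript
var MESSAGES_BY_NAME = {
    NotFoundError: "file not found",
    SecurityError: "security error",
    AbortError: "read aborted",
    NotReadableError: "file not readable",
    EncodingError: "encoding error"
};

// Assumed FileError constant values, mapped to the spec's error names.
var NAMES_BY_CODE = {
    1: "NotFoundError",     // FileError.NOT_FOUND_ERR
    2: "SecurityError",     // FileError.SECURITY_ERR
    3: "AbortError",        // FileError.ABORT_ERR
    4: "NotReadableError",  // FileError.NOT_READABLE_ERR
    5: "EncodingError"      // FileError.ENCODING_ERR
};

// Returns a readable message for a FileReader error object.
function describeReadError(error) {
    var name = MESSAGES_BY_NAME.hasOwnProperty(error.name) ?
            error.name : NAMES_BY_CODE[error.code];

    return MESSAGES_BY_NAME[name] || "unknown error";
}

// Browser usage: check for an error once the read has finished either way.
function reportErrors(reader) {
    reader.onloadend = function(event) {
        var error = event.target.error;
        if (error) {
            console.error("Read failed: " + describeReadError(error));
        }
    };
}
```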

The FileReader object is a fully-featured object with a lot of functionality and a lot of similarities to XMLHttpRequest. By following these last three posts, you should now be able to read data from files using JavaScript and send that data back to the server if necessary. However, the File API ecosystem is quite a bit larger than has been already discussed in this series, and in the next part you’ll learn about a powerful new feature designed to work with files.

References

Progress Events
File API


Published on May 22, 2012 07:00

Working with files in JavaScript, Part 2: FileReader

In my previous post, I introduced using files in JavaScript, focusing specifically on how to get access to File objects. These objects contain file metadata obtained only when the user opts either to upload a file or to drag and drop a file onto the web page. Once you have files, however, the next step is to read data from them.

The FileReader type

The FileReader type has a single job: to read data from a file and store it in a JavaScript variable. The API is intentionally designed to be similar to XMLHttpRequest since both are loading data from an external (outside of the browser) resource. The read is done asynchronously so as not to block the browser.

There are several formats that a FileReader can create to represent the file data, and the format must be requested when asking the file to be read. Reading is done through calling one of these methods:

readAsText() – returns the file contents as plain text
readAsBinaryString() – returns the file contents as a string of encoded binary data (deprecated – use readAsArrayBuffer() instead)
readAsArrayBuffer() – returns the file contents as an ArrayBuffer (good for binary data such as images)
readAsDataURL() – returns the file contents as a data URL

Each of these methods initiates a file read, much as the XHR object’s send() method initiates an HTTP request. As such, you must assign a load event handler before starting the read. The result of the read is always represented by event.target.result. For example:

var reader = new FileReader();
reader.onload = function(event) {
    var contents = event.target.result;
    console.log("File contents: " + contents);
};

reader.onerror = function(event) {
    console.error("File could not be read! Code " + event.target.error.code);
};

reader.readAsText(file);

This example simply reads the contents of a file and outputs it in plain text to the console. The onload handler is called when the file is successfully read whereas the onerror handler is called if the file wasn’t read for some reason. The FileReader instance is available inside of the event handler via event.target and it’s recommended to use that instead of referencing the reader variable directly. The result property contains the file contents on success and error contains error information about the failed operation.

Reading data URIs

You can use the same basic setup for reading to a data URI. Data URIs (sometimes called data URLs) are an interesting option if you want to, for example, display an image that was just read from disk. You could do so with the following code:

var reader = new FileReader();
reader.onload = function(event) {
    var dataUri = event.target.result,
        img = document.createElement("img");

    img.src = dataUri;
    document.body.appendChild(img);
};

reader.onerror = function(event) {
    console.error("File could not be read! Code " + event.target.error.code);
};

reader.readAsDataURL(file);

This code simply inserts an image that was read from disk into a page. Since the data URI contains all of the image data, it can be passed directly into the src attribute of an image and displayed on the page. You could, alternately, load the image and draw it onto a <canvas> as well:
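A minimal sketch of that approach follows, assuming canvas is a reference to a <canvas> element already in the page and file is a File reference. The function name drawFileOnCanvas is hypothetical.

```javascript
// Browser-only: reads `file` as a data URI, loads it into an Image,
// and draws that image onto `canvas` at 100 by 100 pixels.
function drawFileOnCanvas(file, canvas) {
    var reader = new FileReader();

    reader.onload = function(event) {
        var image = new Image();

        // Draw only after the image data has been decoded.
        image.onload = function() {
            var context = canvas.getContext("2d");
            context.drawImage(image, 0, 0, 100, 100);
        };

        image.src = event.target.result;
    };

    reader.readAsDataURL(file);
}
```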

This code loads the image data into a new Image object and then uses that to draw the image onto a canvas (specifying both the width and height as 100).

Data URIs are generally used for this purpose, but can be used with any type of file. The most common use case for reading a file into a data URI is to display the file contents on a web page immediately.

Reading ArrayBuffers

The ArrayBuffer type[1] was first introduced as part of WebGL. An ArrayBuffer represents a finite number of bytes that may be used to store numbers of any size. The way data is read from an ArrayBuffer is by using a specific view, such as Int8Array, which treats the underlying bytes as a collection of 8-bit signed integers or Float32Array, which treats the underlying bytes as a collection of 32-bit floating point numbers. These are called typed arrays[2], which force you to work with a specific numeric type rather than containing any type of data (as with traditional arrays).
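As a small illustration of views, the following sketch looks at the same four bytes through three different typed arrays. It isn’t tied to any file, and the variable names are arbitrary.

```javascript
var buffer = new ArrayBuffer(4),          // four raw bytes, all zero initially
    unsigned = new Uint8Array(buffer),    // view as 8-bit unsigned integers
    signed = new Int8Array(buffer),       // view as 8-bit signed integers
    floats = new Float32Array(buffer);    // view as 32-bit floating point numbers

// Writing through one view changes the bytes that every view sees.
unsigned[0] = 255;

console.log(unsigned.length);   // 4
console.log(signed[0]);         // -1, the same byte read as a signed value
console.log(floats.length);     // 1, since one float occupies all four bytes
```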

You use an ArrayBuffer primarily when dealing with binary files, to have more fine-grained control over the data. It’s beyond the scope of this post to explain all the ins and outs of ArrayBuffer; just realize that you can read a file into an ArrayBuffer pretty easily if you need to. You can pass an ArrayBuffer directly into an XHR object’s send() method to send the raw data to the server (you’ll have to read this data from the request on the server to reconstruct the file), so long as your browser fully supports XMLHttpRequest Level 2[3] (most recent browsers, including Internet Explorer 10 and Opera 12).
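As one example of that fine-grained control, this sketch checks whether a file begins with the eight-byte PNG signature instead of trusting its reported MIME type. The function names looksLikePng and checkFile are hypothetical, and file is assumed to be a File reference.

```javascript
// The eight bytes that begin every PNG file.
var PNG_SIGNATURE = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A];

// Returns true if the ArrayBuffer starts with the PNG signature.
function looksLikePng(buffer) {
    var bytes = new Uint8Array(buffer),
        i;

    if (bytes.length < PNG_SIGNATURE.length) {
        return false;
    }

    for (i = 0; i < PNG_SIGNATURE.length; i++) {
        if (bytes[i] !== PNG_SIGNATURE[i]) {
            return false;
        }
    }

    return true;
}

// Browser-only: reads `file` into an ArrayBuffer and reports the result.
function checkFile(file, callback) {
    var reader = new FileReader();

    reader.onload = function(event) {
        callback(looksLikePng(event.target.result));
    };

    reader.readAsArrayBuffer(file);
}
```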

Up next

Reading data from a file using a FileReader is pretty simple. If you know how to use XMLHttpRequest, there’s no reason you can’t also be reading data from files. In the next part of this series, you’ll learn more about using the FileReader events and understanding more about possible errors.

References

ArrayBuffer
Typed Array Specification
XMLHttpRequest Level 2


Published on May 15, 2012 07:30

Working with files in JavaScript, Part 1: The Basics

Many years ago, I was asked during a job interview at Google what changes I would make to the web in order to provide better experiences. At the top of my list was having some way to work with files other than the <input type="file"> control. Even as the rest of the web was evolving, the way we dealt with files never changed since it was first introduced. Thankfully, with HTML5 and related APIs, we now have far more options for working with files than ever before in the latest versions of desktop browsers (iOS still has no support for the File API).

The File type

The File type is defined in the File API[1] specification and is an abstract representation of a file. Each instance of File has several properties:

name – the filename
size – the size of the file in bytes
type – the MIME type for the file

A File object basically gives you essential information about the file without providing direct access to the file contents. That’s important because reading from files requires disk access, and depending on the size of the file, that process has the potential to take a significant amount of time. A File object is just a reference to a file, and getting data from that file is a separate process altogether.

Getting File references

Of course, access to user files is strictly forbidden on the web because it’s a very obvious security issue. You wouldn’t want to load up a web page and then have it scan your hard drive and figure out what’s there. You need permission from the user in order to access files from their computer. There’s no need for messy permission windows, however, because users grant permission for web pages to read files all the time when they decide to upload something.

When you use an <input type="file"> control, you’re giving the web page (and the server) permission to access that file. So it makes sense that the first place you can retrieve File objects is through an <input type="file"> control.

HTML5 defines a files property for all <input type="file"> controls. This property is a FileList, an array-like structure containing a File object for each selected file in the control (remember, HTML5 allows multiple file selection in these controls). So at any point in time, you can get access to the files a user has selected using code similar to this:
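A minimal sketch of such code follows, assuming control is a reference to a file input element. The helper names describeFiles and watchFileControl, and the output format, are hypothetical.

```javascript
// Builds one descriptive line per File in a FileList (or any array-like).
function describeFiles(files) {
    var lines = [],
        i;

    for (i = 0; i < files.length; i++) {
        lines.push(files[i].name + " (" + files[i].type + ", " +
            files[i].size + " bytes)");
    }

    return lines;
}

// Browser-only: logs file information whenever the selection changes.
function watchFileControl(control) {
    control.addEventListener("change", function(event) {
        describeFiles(event.target.files).forEach(function(line) {
            console.log(line);
        });
    }, false);
}
```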

This relatively simple code listens for the change event on the file control. When the event fires, it signifies that the file selection has changed, and the code iterates through each File object and outputs its information. Keep in mind that the files property is always accessible from JavaScript, so you don’t have to wait for change to try to read it.

Drag and drop files

Accessing files from form controls still requires the form control and the associated user action of browsing to find the files of interest. Fortunately, HTML5 Drag and Drop[2] provides another way for users to grant access to their files: by simply dragging a file from the desktop into the web browser. All you have to do to take advantage is listen for two events.

In order to read files that are dropped onto an area of the page, you must listen for the dragover and drop events and cancel the default action of both. Doing so tells the browser that you are handling the action directly and it shouldn’t, for example, open an image file.
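A minimal sketch of those two listeners follows, assuming target is the element that should accept dropped files. The function name enableFileDrop and its onFiles callback are hypothetical.

```javascript
// Browser usage: lets `target` accept dropped files and hands the
// resulting FileList to `onFiles`.
function enableFileDrop(target, onFiles) {
    // Canceling dragover tells the browser that a drop is allowed here.
    target.addEventListener("dragover", function(event) {
        event.preventDefault();
    }, false);

    // Canceling drop keeps the browser from opening the file itself.
    target.addEventListener("drop", function(event) {
        event.preventDefault();
        onFiles(event.dataTransfer.files);
    }, false);
}
```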

The event.dataTransfer.files property is another FileList object that you can access to get file information. The code is almost exactly the same as when using the file form control, and the File objects can be accessed in the same way.

Ajax file upload

Once you have a reference to the file, you’re able to do something that’s pretty cool: upload a file via Ajax. This is all possible due to the FormData object, which is defined in XMLHttpRequest Level 2[3]. This object represents an HTML form and allows you to add key-value pairs to be submitted to the server via the append() method:

var form = new FormData();
form.append("name", "Nicholas");

The great thing about the FormData object is that you can add a file directly to it, effectively mimicking a file upload by HTML form. All you have to do is add the File reference with a specific name, and the browser does the rest. For example:

// create a form with a couple of values
var form = new FormData();
form.append("name", "Nicholas");
form.append("photo", control.files[0]);

// send via XHR - look ma, no headers being set!
var xhr = new XMLHttpRequest();
xhr.onload = function() {
    console.log("Upload complete.");
};
xhr.open("post", "/entrypoint", true);
xhr.send(form);

Once the FormData object is passed into send(), the proper HTTP headers are automatically set for you. You don’t have to worry about setting the correct form encoding when using files, so the server gets to act as if a regular HTML form has been submitted, reading file data from the “photo” key and text data from the “name” key. This gives you the freedom to write processing code on the backend that can easily work with both traditional HTML forms and Ajax forms of this nature.

And all of this works in the most recent version of every browser, including Internet Explorer 10.

Up next

You now know the two methods of accessing File information in the browser: through a file upload control and through native drag and drop. There will likely be other ways to access files in the future, but for now, these are the two you need to know. Of course, reading information about files is just part of the problem. The next step is to read data from those files, and that’s where part 2 will pick up.

References

File API specification (editor’s draft)
HTML5 Drag and Drop
XMLHttpRequest Level 2


Published on May 08, 2012 07:30

Book review: The Linux Command Line

I have a confession to make: before joining Yahoo!, I had never used Linux. After having one semester of UNIX in college, I spent the next five years just using Windows. When I got to Yahoo!, I was faced with the daunting task of learning Linux on the job as I went. I still remember our intern that first year I was there laughing as I struggled to navigate around my development box. Yeah, the intern was laughing at me.

I really and truly wish that The Linux Command Line had been available to me at that time. This is exactly what a Linux beginner needs to get up to speed quickly. The book goes beyond simply walking through all of the command line utilities, and ventures into the realm of theory and how things work together. I’m definitely not a Linux expert, but I have been using it on a day-to-day basis for about six years, and I still felt like I learned an incredible amount from this book.

The author approaches all of the topics in a very friendly way. All of the examples are easy to follow and he even takes time to explain some of the differences across various flavors of Linux. He covers all of the common command line tools, taking you through in a logical order with great narrative.

This book has earned a permanent place on my desk, as I find myself repeatedly going back to it when I get stuck on something. If you’re a Linux beginner, or just really want to understand the system better, I can’t recommend this book enough. I’ll probably be rereading it a couple more times, myself, to make sure that everything sticks. In the meantime, I’ve already gone through and updated a bunch of bash scripts that I now realize aren’t coded as well as they should be.

View more on Nicholas C. Zakas's website »


Published on May 01, 2012 07:00

The performance of localStorage revisited

Now a few weeks removed from a large amount of hand-wringing around the performance of localStorage in browsers, I’ve learned some more about why there was such a concern at Mozilla (which prompted Chris to write his blog post[1]). The post was met with skepticism because it lacked two key components: numbers and a comparison. The assertion was that localStorage is slow, but there was no data to back it up.

Wanting to get to the bottom of it, I[2] and John Allsopp[3] wrote blog posts trying to provide numbers around localStorage. John’s post focused on quantifying the amount of time it takes to perform a single read and a single write, which gave us good initial numbers for these operations. My post focused on comparing localStorage reads and writes to cookie reads and writes from JavaScript. My theory was that cookies are the closest approximation of localStorage because their contents are stored on disk and are shared by all tabs pointing to the same origin. Both John and I concluded by saying that localStorage doesn’t have an appreciably bad effect on performance either as an aggregate rating or in comparison to cookies.

More details

Subsequent to that, I started a conversation with Jonas Sicking from Mozilla, who actually worked on the localStorage implementation for Firefox and so has a unique perspective. He started from the position that there is a performance problem and I started from the position that there is not, based on the numbers from John and me. Jonas pointed out a key piece of information I wasn’t aware of: the performance issue isn’t with individual reads and writes, it’s with the initial read into memory.

Firefox starts out by reading all of the data from localStorage into memory for the page’s origin. Once the data is in memory, reads and writes should be relatively fast (though they do still appear slower than reading and writing to a native JavaScript object – not sure why), so our measuring of reads and writes doesn’t capture the full picture. Jonas’ assertion is that reading the data from localStorage on page load is the concern.

As Jonas kept telling me (and finally it stuck), the real problem with localStorage is that it’s a synchronous API, which makes the implementors decide between a limited number of options. One option is to load all the data as the page is loading, but that has a side effect of slowing down initial page load because JavaScript using localStorage can’t execute until the data for localStorage has been completely read. That means a large amount of data in localStorage could actually increase page load time because JavaScript needs to wait before executing.

The other option isn’t much better. If you were to wait until the first time localStorage was used, it would require a full (blocking) stop while the data was read from disk initially. Once again, this could be noticeable if there’s a large amount of data on disk. What’s more, you could argue that a delay on calling localStorage.getItem() is unexpected, because there is an assumption that you’re already working in memory and so the operation should be fast. This is why Firefox loads the data on page load.

In reality, this becomes the same problem as cookies. Cookies are stored on disk and read into memory upon page load as well. The difference is in the size of the data. Cookies are still fairly limited in size (around 4KB) whereas localStorage is much larger (5MB). Of course, reading a 5MB file from the file system will be faster than downloading it over the internet, but who’s to say if it would significantly affect page load time?

Benchmarks?

I tried to run some benchmarks but was met with a technical limitation: no one is sure if our current testing tools are accurately taking the initial localStorage read into account. Without that information, it’s hard to know whether or not localStorage is actually a performance problem for initial page load. It definitely isn’t a performance issue for reads and writes after the fact (though it doesn’t come without some cost, as noted previously).
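To make the measurement problem concrete, here is a minimal sketch of how one might time the first read against a later one. The helper name timeFirstRead() and its injectable clock are assumptions made for illustration, not part of any browser API. It can only observe a delay in engines that defer the disk read until first use; if the data is loaded during page load, as described above for Firefox, that cost has already been paid before this code runs, which is exactly the limitation in question.

```javascript
// Sketch only: timeFirstRead() is a hypothetical helper, not a standard API.
// "storage" is any object with getItem(), such as window.localStorage.
// "now" is an optional clock function, defaulting to Date.now.
function timeFirstRead(storage, key, now) {
    now = now || Date.now;

    var start = now();
    storage.getItem(key);       // may trigger the initial read from disk
    var first = now() - start;

    start = now();
    storage.getItem(key);       // should now be served from memory
    var second = now() - start;

    return { first: first, second: second };
}
```

In a browser this would be called as timeFirstRead(localStorage, "someKey") before any other script on the page touches localStorage; the key name here is just a placeholder.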

A new API?

The call to create a new API to replace localStorage seems a bit premature, but is basically centered around three main ideas:

The browser shouldn’t need to read a large amount of data from disk on page load.
The read from disk should be asynchronous and not block the UI thread.
The developer should be able to indicate when the read should happen.

This led Jonas to suggest several alternative APIs on Chris’ original post. The one I like the best is this:

getBetterLocalStorage(function(storage) {
    x = storage.foo;
    storage.bar = calculateStuff(y);
    storage.baz++;
});

Ignoring the name, the getBetterLocalStorage() function signals the browser that it’s time to read everything into memory, so the storage object can be used as any other object. Once the callback function is finished executing, the changes would be written back to disk. Though I’m not ready to throw out localStorage completely, I do like the direction of this API. In fact, it closely follows a proposal I made for improving localStorage with expiration dates and encryption.[4]
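As a rough illustration of the read-everything, run-the-callback, write-back flow described here, the following sketch emulates the shape of that proposal on top of an ordinary storage object. Everything in it is hypothetical: getBetterLocalStorage() is not a real browser API, runWithSnapshot() and the optional backend parameter are names invented for this example, and because it sits on top of the synchronous localStorage the disk read is only deferred, not truly asynchronous.

```javascript
// Hypothetical sketch of the proposed API; none of this exists in browsers.
// "backend" is any object with length, key(), getItem(), setItem(), removeItem().
function runWithSnapshot(backend, callback) {
    var storage = Object.create(null),
        originalKeys = [],
        i, key;

    // Read everything into a plain object in one pass.
    for (i = 0; i < backend.length; i++) {
        key = backend.key(i);
        originalKeys.push(key);
        storage[key] = backend.getItem(key);
    }

    callback(storage);

    // Write changes back: remove deleted keys, then store the rest as strings.
    originalKeys.forEach(function(name) {
        if (!(name in storage)) {
            backend.removeItem(name);
        }
    });
    Object.keys(storage).forEach(function(name) {
        backend.setItem(name, String(storage[name]));
    });
}

function getBetterLocalStorage(callback, backend) {
    // Defer the work so the caller is not blocked at the call site.
    setTimeout(function() {
        runWithSnapshot(backend || window.localStorage, callback);
    }, 0);
}
```

Inside the callback, the storage argument behaves like any other object, so reads, assignments, increments and delete all work as usual; values are converted back to strings only when the callback returns.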

Conclusion

Whether or not localStorage is a performance issue on page load is still a question. It’s hard to know for sure if this is a real issue until we can get some good benchmarks from browsers. Unfortunately, this will likely have to come from browser developers who can look at the code and figure out whether localStorage is already being accounted for, and if not, how to measure it.

In the meantime, in almost every case IndexedDB is not a suitable replacement for localStorage. IndexedDB could be used, as Jonas pointed out, to create a solution similar to the one he proposed. However, it’s still a bit of overhead to write that out. My advice: don’t worry too much about localStorage for now…but don’t go storing 5MB of data in it either, just in case.
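If you want a rough idea of how much you are already storing, a small helper along these lines can add it up. This is a sketch under assumptions: estimateStorageChars() is a made-up name, and it counts characters in keys and values, which is only an approximation of how a given browser measures usage against its limit.

```javascript
// Sketch only: estimateStorageChars() is a hypothetical helper.
// Returns the total number of characters across all keys and values.
function estimateStorageChars(storage) {
    var total = 0,
        i, key;

    for (i = 0; i < storage.length; i++) {
        key = storage.key(i);
        total += key.length + storage.getItem(key).length;
    }

    return total;
}
```

In a browser, estimateStorageChars(localStorage) gives the figure for the current origin.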

References

[1] There is no simple solution for localStorage by Chris Heilmann
[2] In defense of localStorage by Me
[3] localStorage, perhaps not so harmful by John Allsopp
[4] Towards more secure client-side data storage by Me

View more on Nicholas C. Zakas's website »


Published on April 25, 2012 14:52

How to install Apache Ant on Windows

Apache Ant[1] is still my favorite tool for creating build systems for my code. Yes, I know there are a lot of shiny new tools written in Node.js or something else, but I've used Ant for a long time and have found it easy to teach others. What's more, it comes installed on Macs and is an easy install on Linux as a package.

Unfortunately, it's a bit of a beast to install on Windows. Every time I have to install Ant on another Windows machine I end up searching the web yet again for a good set of instructions. So this post is primarily for myself, so that I don't need to search too far.

Prerequisites

Before beginning, make sure you have the latest JDK installed. If not, go download it from Sun[2] and install it. It's better to install the JDK instead of just the JRE because some Ant tasks require the JDK.

Step 1: Download and install

The first step, as with most software, is to download Ant. Go to the Ant homepage and click to download the binary. Because we're talking about Windows, choose to download the ZIP file rather than any of the others. Scroll down to where it says "Current release of Ant" and click on the ZIP filename.

Once downloaded, unzip the file. You'll now need to choose a permanent home for Ant on the computer. I tend to use c:\java\ant for simplicity, but you can use whatever you want. I do recommend, however, that the path have no spaces in it (spaces make things more complicated).

Step 2: Set environment variables

This is the part that I always forget. Because you're installing Ant by hand, you also need to deal with setting environment variables by hand.

For Windows XP: To set environment variables on Windows XP, right click on My Computer and select Properties. Then go to the Advanced tab and click the Environment Variables button at the bottom.

For Windows 7: To set environment variables on Windows 7, right click on Computer and select Properties. Click on Advanced System Settings and click the Environment Variables button at the bottom.

The dialog for both Windows XP and Windows 7 is the same. Make sure you're only working on system variables and not user variables.

The only environment variable that you absolutely need is JAVA_HOME, which tells Ant the location of your JRE. If you've installed the JDK, this is likely c:\Program Files\Java\jdk1.x.x\jre on Windows XP and c:\Program Files (x86)\Java\jdk1.x.x\jre on Windows 7. You'll note that both have spaces in their paths, which causes a problem. You need to use the mangled name[3] instead of the complete name. So for Windows XP, use C:\Progra~1\Java\jdk1.x.x\jre and for Windows 7, use C:\Progra~2\Java\jdk1.x.x\jre if it's installed in the Program Files (x86) folder (otherwise use the same as Windows XP).

That alone is enough to get Ant to work, but for convenience, it's a good idea to add the Ant binary path to the PATH variable. This variable is a semicolon-delimited list of directories to search for executables. To be able to run ant in any directory, Windows needs to know both the location for the ant binary and for the java binary. You'll need to add both of these to the end of the PATH variable. For Windows XP, you'll likely add something like this:

;c:\java\ant\bin;C:\Progra~1\Java\jdk1.x.x\jre\bin

For Windows 7, it will look something like this:

;c:\java\ant\bin;C:\Progra~2\Java\jdk1.x.x\jre\bin

Done

Once you've done that and applied the changes, you'll need to open a new command prompt to see if the variables are set properly. You should be able to simply run ant and see something like this:

Buildfile: build.xml does not exist!
Build failed

That means Ant is installed properly and is looking for a build.xml file.

References

[1] Apache Ant homepage
[2] Java SE Downloads

View more on Nicholas C. Zakas's website »


Published on April 12, 2012 09:49

It’s time to start using JavaScript strict mode

ECMAScript 5 introduced strict mode to JavaScript. The intent is to allow developers to opt in to a “better” version of JavaScript, where some of the most common and egregious errors are handled differently. For a while, I was skeptical, especially with only one browser (Firefox) initially supporting strict mode. Fast forward to today: every major browser supports strict mode in its latest version, including Internet Explorer 10 and Opera 12. It’s time to start using strict mode.

What does it do?

Strict mode makes a lot of changes to how JavaScript runs, and I group these into two categories: obvious and subtle. The subtle changes aim to fix subtle problems, and I’m not going to delve into those here; if you’re interested in those details, please see Dmitry Soshnikov’s excellent ECMA-262-5 in Detail. Chapter 2. Strict Mode[1]. I’m far more interested in talking about the obvious changes: the ones you should know about before using strict mode, and the ones that will most likely help you the most.

Before getting into specific features, keep in mind that one of the goals of strict mode is to allow for faster debugging of issues. The best way to help developers debug is to throw errors when certain patterns occur, rather than silently failing or behaving strangely (which JavaScript does today outside of strict mode). Strict mode code throws far more errors, and that’s a good thing, because it quickly calls to attention things that should be fixed immediately.

Eliminates with

To begin, strict mode eliminates the with statement. It is now considered invalid JavaScript syntax and will throw a syntax error when it appears in strict mode code. So the first step to using strict mode: make sure you’re not using with.

// Causes a syntax error in strict mode
with (location) {
    alert(href);
}

Prevents accidental globals

Next, variables must be declared before they can be assigned to. Without strict mode, assigning a value to an undeclared variable automatically creates a global variable with that name. This is one of the most common errors in JavaScript. In strict mode, attempting to do so throws an error.

// Throws an error in strict mode
(function() {

    someUndeclaredVar = "foo";

}());

Eliminates this coercion

Another important change is that a this-value of null or undefined is no longer coerced to the global object. Instead, this keeps its original value, which may break code that depends on the coercion. For example:

window.color = "red";
function sayColor() {
    alert(this.color);
}

// Throws an error in strict mode, alerts "red" otherwise
sayColor();

// Throws an error in strict mode, alerts "red" otherwise
sayColor.call(null);

Basically, the this-value must be assigned a value or else it remains undefined. That means constructors accidentally called without new are also affected:

function Person(name) {
    this.name = name;
}

// Error in strict mode
var me = Person("Nicholas");

In this code, this is undefined when the Person constructor is called without new. Since you can’t assign a property to undefined, this code throws an error. In non-strict mode, this would be coerced to the global and so name would be assigned as a global variable.

No duplicates

It can be quite easy to duplicate properties in objects or named arguments in functions if you’ve been doing a lot of coding. Strict mode throws an error when it comes across either pattern:

// Error in strict mode - duplicate arguments
function doSomething(value1, value2, value1) {
    // code
}

// Error in strict mode - duplicate properties
var object = {
    foo: "bar",
    foo: "baz"
};

These are both syntax errors and so the error is thrown before the code is executed.
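One way to see that the error really happens at parse time is to hand the source to the Function constructor, which parses code without running it. The sketch below does this for the duplicate-argument case only; the compiles() helper is a name invented for this example, and the approach assumes an environment where new Function is allowed (some content security policies block it, in which case a different error is thrown and passed along).

```javascript
// Sketch only: compiles() is a hypothetical helper for this illustration.
function compiles(source) {
    try {
        new Function(source);   // parses the source but does not run it
        return true;
    } catch (ex) {
        if (ex instanceof SyntaxError) {
            return false;
        }
        throw ex;               // something other than a parse failure
    }
}

var sloppySource = "function doSomething(value1, value2, value1) {}";
var strictSource = '"use strict";\n' + sloppySource;

var sloppyOk = compiles(sloppySource);   // true: allowed outside strict mode
var strictOk = compiles(strictSource);   // false: rejected before anything runs
```

Because nothing is ever executed, the difference between the two results comes purely from how the engine parses the same function with and without the pragma.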

Safer eval()

Even though eval() wasn’t removed, it has undergone some changes in strict mode. The biggest change is that variables and functions declared inside of an eval() statement are no longer created in the containing scope. For example:

(function() {

    eval("var x = 10;");

    // Non-strict mode, alerts 10
    // Strict mode, throws an error because x is undeclared
    alert(x);

}());

Any variables or functions created inside of eval() stay inside of eval(). You can, however, return a value from eval() if you wish to pass a value back out:

(function() {

    var result = eval("var x = 10, y = 20; x + y");

    // Works in strict and non-strict mode (30)
    alert(result);

}());

Errors for immutables

ECMAScript 5 also introduced the ability to modify property attributes, such as setting a property as read only or freezing an entire object’s structure. In non-strict mode, attempting to modify an immutable property fails silently. You’ve probably run into this issue with some native APIs. Strict mode ensures that an error is thrown whenever you try to modify an object or object property in a way that isn’t allowed.

var person = {};
Object.defineProperty(person, "name", {
    writable: false,
    value: "Nicholas"
});

// Fails silently in non-strict mode, throws error in strict mode
person.name = "John";

In this example, the name property is set to read only. In non-strict mode, assigning to name fails silently; in strict mode, an error is thrown.

Note: I very strongly encourage you to use strict mode if you’re using any of the ECMAScript 5 property attribute capabilities. If you’re changing the mutability of objects, you’ll run into a lot of operations that fail silently in non-strict mode.
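The same applies to frozen objects, not just individual read-only properties. The following sketch runs three disallowed operations against a frozen object from strict mode code and records what each one throws; describeFailure() and the age property are made up for the illustration.

```javascript
// Sketch only: describeFailure() is a hypothetical helper.
// Returns the name of the error thrown by action(), or "no error".
function describeFailure(action) {
    try {
        action();
        return "no error";
    } catch (ex) {
        return ex.name;
    }
}

var frozen = Object.freeze({ name: "Nicholas" });

var results = (function() {
    "use strict";

    return {
        change: describeFailure(function() { frozen.name = "John"; }),
        add:    describeFailure(function() { frozen.age = 29; }),
        remove: describeFailure(function() { delete frozen.name; })
    };
}());

// results.change, results.add and results.remove are all "TypeError"
```

Outside of strict mode, each of these three operations would simply do nothing, leaving you to discover later that the object never changed.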

How do you use it?

Strict mode is very easily enabled in modern browsers using the following pragma:

"use strict";

Even though this looks like a string that isn’t assigned to a variable, it actually instructs conforming JavaScript engines to switch into strict mode (browsers that don’t support strict mode simply read this as an unassigned string and continue to work as usual). You can use it either globally or within a function. That being said, you should never use it globally. Using the pragma globally means that any code within the same file also runs in strict mode.

// Don't do this
"use strict";

function doSomething() {
    // this runs in strict mode
}

function doSomethingElse() {
    // so does this
}

This may not seem like a big deal; however, it can cause big problems in our world of aggressive script concatenation. All it takes is one script to include the pragma globally for every script it’s concatenated with to be switched into strict mode (potentially revealing errors you never would have anticipated).
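The effect can be simulated with two source strings standing in for two files. In this sketch, a function body created with the Function constructor plays the role of a concatenated file, since a pragma at the top of a function body covers the whole body just as a pragma at the top of a file covers the whole file. The script contents and the runs() helper are hypothetical.

```javascript
// Two hypothetical scripts, as they might exist in separate files.
var scriptA = '"use strict";\nvar a = 1;';
var scriptB = 'var o = { total: 0 };\nwith (o) { total = 5; }';   // legacy code using with

// Sketch only: returns true if the source parses and runs without an error.
function runs(source) {
    try {
        new Function(source)();
        return true;
    } catch (ex) {
        return false;
    }
}

var aloneOk = runs(scriptB);                      // true: fine on its own
var combinedOk = runs(scriptA + "\n" + scriptB);  // false: with is now a syntax error
```

Nothing in the second script changed; it broke only because of what it was concatenated after.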

For that reason, it’s best to only use strict mode inside of functions, such as:

function doSomething() {
    "use strict";
    // this runs in strict mode
}

function doSomethingElse() {
    // this doesn't run in strict mode
}

If you want strict mode to apply to more than one function, use an immediately-invoked function expression (IIFE):

(function() {

    "use strict";

    function doSomething() {
        // this runs in strict mode
    }

    function doSomethingElse() {
        // so does this
    }
}());
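Since engines without strict mode support treat the pragma as an ordinary string, it can be handy to check whether strict mode is actually in effect. A common approach, sketched here with a variable name chosen just for this example, relies on the this-value behavior described earlier: in a strict mode function called without an object, this is undefined rather than the global object.

```javascript
// Sketch only: supportsStrictMode is a name chosen for this example.
var supportsStrictMode = (function() {
    "use strict";
    return this === undefined;
}());
```

The value is true in engines that honor the pragma and false in those that ignore it, where this is coerced to the global object as usual.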

Conclusion

I strongly recommend everyone start using strict mode now. There are enough browsers supporting it that strict mode will legitimately help save you from errors you didn’t even know were in your code. Make sure you don’t include the pragma globally, but use IIFEs as frequently as you like to apply strict mode to as much code as possible. Initially, there will be errors you’ve never encountered before – this is normal. Make sure you do a fair amount of testing after switching to strict mode to make sure you’ve caught everything. Definitely don’t just throw "use strict" in your code and assume there are no errors. The bottom line is that it’s time to start using this incredibly useful language feature to write better code.

Update (14-Mar-2012): Added note about using strict mode pragma with non-conforming JavaScript engines.

Update (21-Mar-2012): Fixed typo.

References

[1] ECMA-262-5 in Detail. Chapter 2. Strict Mode by Dmitry Soshnikov