Toby Segaran's Blog, page 4

April 17, 2009

My latest project: Freerisk

For the past few months, between writing books and my day job, I’ve been working on a project with my friend Jesper called Freerisk.


A few months ago, after we first heard Tim O’Reilly’s “Work on stuff that matters” speech, we started talking about which issues, besides the environmental concerns mentioned in his speech, were important to us and within our skills to work on. We came to the idea of how hackers could help the financial system, particularly when it came to evaluating the default risk of companies or looking for fraudulent behavior.


The financial system itself has always been very closed. The government republishes filings by the SEC in a variety of messy formats, but those who want clean data need to pay subscription fees and have very limited republication rights. So our plan is to make Freerisk a huge open data store of financial data taken primarily from company filings. It’s all going to be available to download or query using standards like SPARQL.


On top of that, there will be APIs for building risk models and submitting your results. We hope to show that “financial hackers” can come up with more interesting and accurate calculators that can model a wider variety of risk scenarios.


If you’re interested in this, several people have written about the project:



Harvard Business Review
Fast Company Interview
O’Reilly Radar on Freerisk
Innovation Lab Write up (Danish)
Next 6 Presentation Followup

We’ve also given several presentations. The O’Reilly Emerging Technology Conference (ETech) was kind enough to record and post a video of our talk there (this was our first one, so it’s a little rough, but it should give you a good idea!).






We are looking for people who are interested in getting involved in this project. We have started a discussion group called Open Finance Hackers (just started, nothing there yet). If you’re interested in this at all, please email me and join the group.

Published on April 17, 2009 13:39

March 1, 2009

A crazy few months

Apologies for the lack of recent posts (I think you’ll forgive me in just a moment). I’ve had a crazy few months, but here’s what I’ve been up to, with links for stuff that you can pre-order and download!



Finished the draft of my second book, with my coworkers Jamie Taylor and Colin Evans. It’s called “Programming the Semantic Web” and it’s already listed on Amazon (the description there right now will be changed, trust me).
Working on collecting and editing essays for what will be a great collection, called Beautiful Data. I’m not sure if I’m allowed to tell you who the contributors are yet, but I will say they’re fantastic and we were very lucky to get them.
I gave a 3-hour workshop and a 40-minute session talk at Webstock which was held in Wellington, New Zealand a couple of weeks ago. It was an amazing experience and warrants a whole post on its own. For now, the slides for both sessions are available as PDFs at http://kiwitobes.com/webstock/

And coming up, there’s still more stuff going on:



I’m giving a talk at ETech on March 10th. It’s about the failure of risk rating agencies and ideas for how the tech community can help.
I’ll also be at Web 2.0 Expo on April 2nd, giving another talk on Sources for Data Geeks.
And I’m getting married on July 4th!

(because a lot of people ask me, the answer is: no, conference speaking is not even slightly lucrative. I do it for fun)

Published on March 01, 2009 18:50

October 14, 2008

Personal data integration (part 1)

I’ve been toying with the idea of attempting “semantic integration” of a lot of personal data in my life. I’ll be sure to share more later, but so far I’ve managed to pull together my September phone records, my email history, my contacts, my calendar and my Facebook friends (via the API, not something sketchy!) into a single triple-store.


Using this data, I was able to create this chart, which shows my friend network (I have removed myself and Brooke, since we’re connected to everyone and it ruins the layout). The people who I emailed, texted or called in September are shown in green.


[Image: friend-network chart; people contacted in September shown in green]


You can see tight clusters of my friend groups. The tightest is the big hairball near the bottom that makes up much of Brooke’s Stanford GSB class, but also clear are groupings for my friends from MIT, Chapel Hill, Boston (post-MIT return), my San Francisco tech friends and my family. My family is the only group that is isolated from the rest of the graph — everyone else is connected, which is partly because I’ve introduced some of these groups to each other, and partly just because it’s a small world.


Also good to see is that almost every cluster has at least one green node (my family notably doesn’t, but that’s because my parents aren’t on Facebook), so I’ve generally done a good job of keeping in touch with at least a few people from different phases of my life.


There’s a lot of talk about breaking the silos in the enterprise and, in the semantic-web community, data integration across the entire web. But right now, people don’t even have decent integration across their own personal information. The current proliferation of single-feature applications encourages you to store different aspects of your life in different places — the advantage of course, is that something highly specialized is much more pleasant to use, but the disadvantage is that there’s no way to query across these aspects. I’m interested in experimenting with ways that help people “break the silos” with their own information, in the hope that this will both yield useful applications and help us get a better grip on the bigger problems.


I now have code to keep my triple-store synced with my friend network, my contacts, my phone records, my email and my calendar. I can construct queries across all of this (who did I forget to call on their birthday? Who have I seen recently who went to Stanford?). I’ll be sharing this code at some point, but I want to see how far I can take this. I’m also interested in hearing from anyone who has tried similar experiments and wants to collaborate.
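To make the kind of cross-source query described above concrete, here is a minimal sketch in Python. It is not the code from this project: the triple layout, the predicate names (`birthday`, `attended`, `type`, `with`, `on`), the people and the helper functions are all hypothetical, and a plain list of tuples stands in for a real triple-store.

```python
# Hypothetical (subject, predicate, object) triples merged from several sources.
# All names, predicates, and dates here are made up for illustration.
TRIPLES = [
    ("alice", "birthday", "09-12"),        # from contacts
    ("bob", "birthday", "09-20"),
    ("carol", "birthday", "03-02"),
    ("alice", "attended", "Stanford"),     # from a friend-network profile
    ("dave", "attended", "Stanford"),
    ("call1", "type", "call"),             # from phone records
    ("call1", "with", "bob"),
    ("call1", "on", "2008-09-20"),
    ("meet1", "type", "meeting"),          # from the calendar
    ("meet1", "with", "alice"),
    ("meet1", "on", "2008-09-25"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def missed_birthday_calls(triples, year, month):
    """People with a birthday in the given month and no call on that day."""
    missed = []
    for person, _, mmdd in match(triples, p="birthday"):
        if not mmdd.startswith(f"{month:02d}-"):
            continue
        day = f"{year}-{mmdd}"
        # Events involving this person, filtered down to calls made that day.
        events = {s for s, _, _ in match(triples, p="with", o=person)}
        called = any(
            match(triples, s=e, p="type", o="call")
            and match(triples, s=e, p="on", o=day)
            for e in events
        )
        if not called:
            missed.append(person)
    return sorted(missed)


def seen_recently_from(triples, school, since):
    """Alumni of `school` met on or after `since` (ISO date string)."""
    alumni = {s for s, _, _ in match(triples, p="attended", o=school)}
    seen = set()
    for event, _, _ in match(triples, p="type", o="meeting"):
        dates = [o for _, _, o in match(triples, s=event, p="on")]
        if any(d >= since for d in dates):  # ISO dates compare as strings
            seen |= {o for _, _, o in match(triples, s=event, p="with")} & alumni
    return sorted(seen)


print(missed_birthday_calls(TRIPLES, 2008, 9))
print(seen_recently_from(TRIPLES, "Stanford", "2008-09-01"))
```

The point of the sketch is that once contacts, phone records and calendar entries share one store and one vocabulary, each question becomes a short join over patterns. A real triple-store would express the same joins in a query language such as SPARQL.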


So, anyone have any thoughts on other sources of personal data or questions you might want to ask once it’s integrated?

Published on October 14, 2008 13:40

September 15, 2008

Web 2.0 NYC, Freebase UG meeting, and Taleb

A few quick updates:



I’ll be speaking at Web 2.0 in New York City this Thursday at 3pm. If you’re at the conference, find me and say hi!
While I’m gone, Freebase is having a user group meeting. Here is the info. Great speakers, you’ll seriously love the GeoSearch API
A new article by my favorite non-fiction author, Nassim Taleb, is at Edge. Highly recommended

I’m working on a lot of new projects right now, and I’ll have more to share soon.

Published on September 15, 2008 19:13

August 31, 2008

O’Reilly interview at OSCON

While I was at OSCON earlier this year, I did a 20-minute video interview with O’Reilly. I think the idea is to take a lot of interviews and edit them down to shorter segments for some kind of video supplement, but they’ve also posted the entire thing on YouTube.


I talk a little bit about my biotech experience, my book, working at Freebase and the importance of open data to new applications. The whole 20-minute segment is embedded below.




Let me know what you think!

Published on August 31, 2008 11:25
