Julia Silge's Blog, page 6

April 22, 2021

Which #TidyTuesday Netflix titles are movies and which are TV shows?

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from just starting out to tuning more complex models with many hyperparameters. Today’s screencast walks through how to build features for modeling from text, with this week’s #TidyTuesday dataset on Netflix titles.

Here is the code I used in the video, for those who prefer reading instead of or in addition to video.

Explore data

Our modeling goal is to predict whether a title on Ne...

Published on April 22, 2021 17:00

April 13, 2021

Which #TidyTuesday post offices are in Hawaii?

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from starting out with first modeling steps to tuning more complex models. Today’s screencast walks through how to use text information at the subword level in predictive modeling, with this week’s #TidyTuesday dataset on United States post offices.

Here is the code I used in the video, for those who prefer reading instead of or in addition to video.

Explore data

Our modeling goal i...

Published on April 13, 2021 17:00

March 23, 2021

Dimensionality reduction of #TidyTuesday United Nations voting patterns

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from starting out with first modeling steps to tuning more complex models. One change I have recently made on my blog is to remove Disqus comments. I want to say a huge THANK YOU to everyone who ever commented on my blog before and express how much I appreciate folks’ interest and willingness to share their thoughts. Disqus was becoming frustrating for a couple of reasons, so I downloaded my...

Published on March 23, 2021 17:00

March 3, 2021

Bootstrap confidence intervals for #TidyTuesday Super Bowl commercials

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from starting out with first modeling steps to tuning more complex models. Today’s screencast uses a relatively new function from rsample for quickly finding bootstrap confidence intervals, with this week’s #TidyTuesday dataset on Super Bowl commercials.

Here is the code I used in the video, for those who prefer reading instead of or in addition to video.

Explore the data

Our modeling g...

Published on March 03, 2021 16:00

February 23, 2021

Getting started with k-means and #TidyTuesday employment status

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from starting out with first modeling steps to tuning more complex models. Today’s screencast uses the broom package to tidy output from k-means clustering, with this week’s #TidyTuesday dataset on employment and demographics.

Here is the code I used in the video, for those who prefer reading instead of or in addition to video.

Explore the data

Our modeling goal is to use k-means c...

Published on February 23, 2021 16:00

February 11, 2021

Understand your models with #TidyTuesday inequality in student debt

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from starting out with first modeling steps to tuning more complex models. Today’s screencast is a short one! It walks through how we can use tidyverse and tidymodels functions to explore a model after we have trained it, using this week’s #TidyTuesday dataset on student debt inequality.

Here is the code I used in the video, for those who prefer reading instead of or in addition ...

Published on February 11, 2021 16:00

February 1, 2021

Learn tidytext with my new learnr course

Today I am happy to announce that a new free, online, open source, interactive tutorial, Text Mining with Tidy Data Principles, has been published!

I previously developed an interactive course on text mining for an online learning company, but that course is no longer available. I’ve been wanting to revisit the ideas behind that course, update them, and make a new tutorial freely available for a long time, much like I did for my supervised machine learning course; I recently sat down and...

Published on February 01, 2021 16:00

January 14, 2021

Explore art media over time in the #TidyTuesday Tate collection dataset

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from starting out with first modeling steps to tuning more complex models. Today’s screencast walks through how to train a regularized regression model with text features and then check model diagnostics like residuals, using this week’s #TidyTuesday dataset on the artwork in the Tate collection.

Here is the code I used in the video, for those who prefer reading instead of or in additi...

Published on January 14, 2021 16:00

January 3, 2021

Predicting injuries for Chicago traffic crashes

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from starting out with first modeling steps to tuning more complex models. Instead of Tidy Tuesday data, this screencast uses some “wild caught” data from Chicago’s open data portal and is planned to be the first in a series walking through how to approach model ops tasks using tidymodels and other R tools. This screencast focuses on training a model for traffic crashes in Chicago. We can build a...

Published on January 03, 2021 16:00

December 15, 2020

Upcoming changes to tidytext: threat of COLLAPSE

The tidytext package passed one million downloads from CRAN this year! It has been truly amazing to see this project grow out of an rOpenSci unconference several years ago to be a piece of software useful to people’s real world work.


Some of the package’s infrastructure is still around from its very early days, and as more people have continued to use it, some early decisions have needed to be revisited. I recently made some updates that fix what most people would consider a bug...

Published on December 15, 2020 16:00