Julia Silge's Blog, page 2

July 18, 2023

Classification metrics for #TidyTuesday GPT detectors

Learn about different kinds of metrics for evaluating classification models, and how to compute, compare, and visualize them.
Published on July 18, 2023 17:00

July 4, 2023

What tokens are used more vs. less in #TidyTuesday place names?

Let’s use byte pair encoding tokenization along with Poisson regression to understand which tokens are used more often (or less often) in US place names.
Published on July 04, 2023 17:00

May 19, 2023

Predict the magnitude of #TidyTuesday tornadoes with effect encoding and xgboost

How well can we predict the magnitude of tornadoes in the US? Let’s use xgboost along with effect encoding to fit our model.
Published on May 19, 2023 17:00

May 10, 2023

Tune an xgboost model with early stopping and #TidyTuesday childcare costs

Can we predict childcare costs in the US using an xgboost model? In this blog post, learn how to use early stopping for hyperparameter tuning.
Published on May 10, 2023 17:00

May 3, 2023

Deploy a model on AWS SageMaker with vetiver

Learn how to train and deploy a model with R and vetiver on AWS SageMaker infrastructure.
Published on May 03, 2023 17:00

April 4, 2023

Use OpenAI text embeddings with #TidyTuesday horror movie descriptions

High-quality text embeddings are becoming more available from companies like OpenAI. Learn how to obtain them and then use them for text analysis.
Published on April 04, 2023 17:00

February 7, 2023

Resampling to understand gender in #TidyTuesday art history data

Artists who are women are underrepresented in art history textbooks, and we can use resampling to robustly understand more about this imbalance.
Published on February 07, 2023 16:00

January 17, 2023

To downsample imbalanced data or not, with #TidyTuesday bird feeders

Will squirrels come eat from your bird feeder? Let’s fit a model both with and without downsampling to find out.
Published on January 17, 2023 16:00

November 24, 2022

High cardinality predictors for #TidyTuesday museums in the UK

Learn how to handle predictors with high cardinality using tidymodels for accreditation data on UK museums.
Published on November 24, 2022 16:00

November 9, 2022

Delete all your tweets using rtweet

Worried about how a certain social media platform is going and want to start removing yourself? Learn how to delete all your tweets.
Published on November 09, 2022 16:00