Just How Do Those $&%*# Amazon Algos Work Anyway?

(Yes, I'm still around -- thank you to those of you who have emailed your concern. I'm just concentrating on other social media areas and posting to a few private and public groups, which is cutting into personal blog time.)



Warning: This (very long) post contains math.



Authors are a funny lot. Since a core group of us introduced the idea that Amazon's algorithms for its popularity lists contain a price bias and that freebies now seem to count about 1/10 as much as paid sales on those lists, the amount of misunderstanding regarding the findings has been, to say the least, staggering. Misinformation and disinformation propagating through the interweebs and the various forums point to a serious non-grasping of the underlying concepts.



And the authors reacting to just plain wrong information are understandably questioning why they aren't seeing results in line with what they're reading.



So let's quickly revisit what those findings are and what they aren't. Then we'll take a look at how the algorithm for determining rank on the popularity lists probably works at its most basic structure. Mind you, the actual algorithm is likely much more complicated than what I'll present, but the simple form you'll see here should help you to understand how it determines the playing field.



Popularity List Findings



First, the popularity list is NOT the bestseller list. They are two different beasts. On the Amazon webpage and on Kindle devices, you have to actually navigate to the bestseller list to find it. The BS list will show the paid bestsellers in one column on the left and the free bestsellers in a column to the right. If you're not seeing BOTH paid and free books on the page, you are not looking at the actual bestsellers. If at the bottom of the page you don't see links to specifically scroll through the Top 100 books and ONLY the Top 100, you're not looking at the actual bestseller list.



What you're looking at is the popularity list. And it's THIS list we'll be discussing.



Differences between the popularity and bestseller lists:




The popularity list figures in freebies. The bestseller list does not.
The popularity list does not figure in borrows. The bestseller list does.
The popularity list has a price bias. The bestseller list does not.
The popularity list influences the bestseller list more than the bestseller list influences popularity.
The popularity list figures in sales (and sales-equivalents) over the last 30 days. The bestseller list weights sales history, but not to the extent the pop list does.
The popularity list recrunches about once per day. The bestseller list recrunches hourly.
The popularity list has a lag time of about 2 days. The bestseller list has about an hour lag time.
The popularity list rank does not display anywhere except in the list itself. The bestseller rank is the rank you find on a book's product page.



The only way to know where your book ranks on the popularity list in any given category is to tediously scroll through the list to find it. (If you're pretty sure you're several pages in and don't want to scroll through every page, you can change the page number via the URL in your address bar; but this option is only for advanced users who can find the page number designation in the URL.)



So, to make this as clear as possible:




The number of freebies you give away during a free run does not in any way affect your bestseller ranking. 
The price of your book does not in any way affect your bestseller ranking.



These variables are used only for determining a book's rank in the popularity lists.



The reason the popularity list rankings are important is that your book's visibility in those lists seems to be a huge sales driver. YOU may not personally find books by browsing that list, but a lot of folk apparently do. Also, some of the recommendation emails Amazon sends out display the top 6 or 7 books in a category, then provide a link to the pop list to discover more books in that category.



Watching your pop list numbers is as important as -- and for some, even more important than -- watching your bestseller numbers.



Algorithm for Determining Popularity List Rank



The popularity list algorithm has undergone at least 2 major changes since it came under scrutiny in January. Back then, freebies appeared to be weighted 100% of a sale and borrows appeared to be counted in as well. Because of these weightings, books in Select that went on a successful free run with 2000 or so downloads would wind up at the top of the pop lists after the 2-day lag to get there. That resulted in the famous 3-day bump when browsers would start seeing a book on the first page of a pop list and hit Buy, catapulting a lot of indie books into the stratosphere. That was the Golden Age.



In March, Amazon started doing split marketing, testing different algorithms to create its popularity lists. Between late March and early May, there appeared to be 3 separate lists being tested, and predicting the popularity of a freebie following its free run was difficult because of the multiple lists.



In early May, Amazon apparently settled on a single algorithm to display to the majority of its customers. (Caveat: the list for the Fire seems to be out of sync with the rest -- either Fire readers are being presented a different list entirely or else the servers sending out the data to Fires are delayed.) There are umpteen possibilities as to WHY Amazon settled on the algorithm it did. I've speculated elsewhere about the why, as have others, and this post won't rehash those speculations. We're simply accepting that Amazon wanted to elevate certain classes of books and decelerate the meteoric rise of others. It's how they're accomplishing this that we'll look at today.



Remember, we're working on best-guess speculation here, figured out from watching how the books on the list perform against each other. It's reverse-engineering -- and subject to a lot of variables that those of us outside of Amazon are simply not privy to. There will always be outliers, and there will always be minor differences in rank performance due to those other variables. For the most part, though, this simple formula seems to be the base for the current popularity list algorithm.



[(.1 x A) + B] x C / 30 = average daily sales equivalents



where

A = the number of freebies given away in the past 30 days (notice it gets multiplied by 0.1 or 1/10);

B = the number of actual sales in the past 30 days; and

C = the weighting given for pricing.

30 = the number of days in a month (hence, the 30-day cliff that's talked about in conjunction with the pop list)
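As a rough illustration, the formula can be written as a small Python function. This is only a sketch of the best-guess formula above, not Amazon's actual code; the function and parameter names are made up for this example.

```python
FREEBIE_WEIGHT = 0.1   # a freebie seems to count about 1/10 of a paid sale
WINDOW_DAYS = 30       # the pop list appears to look back 30 days


def daily_sales_equivalents(freebies: int, paid_sales: int, price_weight: float) -> float:
    """Best-guess average daily sales equivalents: [(.1 x A) + B] x C / 30."""
    return (FREEBIE_WEIGHT * freebies + paid_sales) * price_weight / WINDOW_DAYS


# Hypothetical book: 1000 freebies, 200 paid sales, no price bonus (C = 1.0).
print(round(daily_sales_equivalents(1000, 200, 1.0), 1))
```

For that made-up book, the 1000 freebies add only 100 sales equivalents to the 200 paid sales, which works out to about 10 sales equivalents a day.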



C is guesswork since it's hard to figure exactly how Amazon is weighting price. It's a big enough variable to be noticeable, but not so big that it skews the results in a truly huge way. It also seems that the weighting of price goes by ranges of price, so a 2.99 book might be weighted the same as a 3.99 book. As a guess, the following matrix might be reasonably close:



$0.99 - $2.98 = 1.0

$2.99 - $3.99 = 1.1

$4.00 - $5.99 = 1.2

$6.00 - $7.99 = 1.3

$8.00 - $9.99 = 1.4
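The guessed matrix can be sketched as a simple lookup. The weights are this post's estimates, not anything Amazon has published, and the names `PRICE_TIERS` and `price_weight` are invented for the example. Prices are in cents to avoid floating-point surprises, and anything outside the ranges listed above raises an error rather than guessing.

```python
# Guessed price tiers from the matrix above, in cents: (low, high, weight).
# These are estimates from watching the lists, not published Amazon values.
PRICE_TIERS = [
    (99, 298, 1.0),
    (299, 399, 1.1),
    (400, 599, 1.2),
    (600, 799, 1.3),
    (800, 999, 1.4),
]


def price_weight(price_cents: int) -> float:
    """Return the guessed weight C for a list price given in cents."""
    for low, high, weight in PRICE_TIERS:
        if low <= price_cents <= high:
            return weight
    # The matrix has no guess for prices below 99c or above $9.99.
    raise ValueError(f"no guessed weight for a price of {price_cents} cents")


print(price_weight(299), price_weight(399))  # same tier, same weight
```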



So let's put some real numbers in there to see how this works. I'll use SECTOR C's past 30 days as an example since it had only a modest free run the last time out and its overall July sales were modest as well.



So for SECTOR C,

A = 3325 (number of freebies given away on the US site)

B = 328 (number of US sales from July 4 - Aug 3)

C = 1.2 ($4.39 is the book's typical list price)



Plugging the numbers into the equation, and showing our work (with 332.5 rounded down to 332), we get:



[(.1 x 3325) + 328] x 1.2 / 30 =

(332 + 328) x 1.2 / 30 =

660 x 1.2 / 30 =

792 / 30 = 26.4



So, 26.4 is the average daily sales equivalent for the past 30 days. Because of a healthy number of freebies being figured in, that means that SECTOR C is going to enjoy a better popularity rank than another book that has sold 328 copies over the last 30 days -- even if that other book currently has a better bestseller rank.
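For anyone who wants to double-check the arithmetic, here is the same calculation in Python. It's just a sketch using the guessed formula and SECTOR C's numbers from above; the variable names are mine. Without rounding 332.5 down to 332, the result comes out a hair higher, at about 26.42.

```python
freebies = 3325      # A: US freebies given away
paid_sales = 328     # B: US sales, July 4 - Aug 3
price_weight = 1.2   # C: guessed weight for a $4.39 book

# Freebies count at about 1/10 of a sale, so roughly 332.5 sales equivalents.
freebie_equivalents = 0.1 * freebies

# Weight by price, then average over the 30-day window.
daily_equivalents = (freebie_equivalents + paid_sales) * price_weight / 30

print(round(daily_equivalents, 2))
```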



26.4 books is equivalent to a bestseller sales rank of around #3500. On Aug 3, SECTOR C's actual bestseller rank was between #5565 and #6930.



Now, because we don't know the exact number of books other authors are selling, we have to look at current ranks to make some best guesses to see why SECTOR C is at #29 on the popularity list for Technothrillers. And because books that have been on free runs are more volatile in the ranks, it's best to compare books that are not in Select (who is this Tom Clancy that has books on either side of mine on that list?!).



Here are the bestseller ranks and prices of the non-Select books closest to mine at #29 (shown as popularity rank - bestseller rank - price):

#25 - 3178 - $3.99

#26 - 6030 - $8.99

#27 - 5380 - $8.99

#28 - 8720 - $4.95

#29 - 3500 (equivalent) - $4.39 (This is SECTOR C)

#30 - 24,365 - $3.99

#34 - 3931 - $3.99



"Aha!" you say. "A flaw in the calculations! Look at the 24,000+ rank on the book at #30!" Well, yes, I did look at that book and I found through Google that it had been free on at least July 25, so it was either price matched during the last 30 days or left Select in the past week. Variables like this are what make reverse-engineering difficult -- and likely what makes many folk looking at a single snapshot question the accuracy of the findings. It has taken several snapshots over an extended period of time and deep research to come up with the guesstimations that we have.



While we can never draw firm conclusions from such limited data, we can look at the data above and see a couple of things:



At about the same price, the books at #25, #29 and #34 line up in the rank right where we would expect them to in relation to one another. We've already determined that #30 is skewed by an earlier free run. As I can only see today's rank for #28 (it's not listed on any of the tracker sites), it could well have had a better rank 2 days ago (another reason it's important to look at all this stuff over time as well). The 2 $8.99 books at #26 and #27 are Tom Clancy books that have been selling steadily at those ranks and are a demonstration of the price bias in action.



So, Realistically, What Can You Do With This Information?



Honestly? Not a whole lot. A higher price will give you a slight advantage, but only if you're selling well enough to be near the top of the pop lists anyway. It's not like a $4.99 book is going to rank dozens of ranks better than one that sells the same number of copies at $3.99. And a 99c book that sells 1000 copies will still rank higher than a $2.99 book that sells only 300. Simply pricing your book higher is not going to automatically boost your ranking.
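To see why price alone doesn't carry a book, here's a quick sketch comparing the two cases above. It assumes the guessed price weights (1.0 for 99c, 1.1 for $2.99 and $3.99, 1.2 for $4.99) and ignores freebies; the helper name and the 500-copy figure are invented for the example.

```python
def weighted_sales(paid_sales: int, price_weight: float) -> float:
    """Paid sales scaled by the guessed price weight (freebies ignored)."""
    return paid_sales * price_weight


# A 99c book selling 1000 copies vs a $2.99 book selling 300.
cheap_big_seller = weighted_sales(1000, 1.0)
pricier_small_seller = weighted_sales(300, 1.1)
print(cheap_big_seller > pricier_small_seller)

# Same copies sold at $4.99 (1.2) vs $3.99 (1.1): the edge is the ratio of weights.
edge = weighted_sales(500, 1.2) / weighted_sales(500, 1.1)
print(round(edge, 2))
```

Under these guesses the higher price buys only about a 9% edge, which matters only when two books are already selling at nearly the same rate.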



Giving away a LOT of books during a free run can certainly help. Even so, the 3325 copies of SECTOR C given away last month only equaled about 332 sales equivalents. Depending on the category your book is in, that could be a drop in the bucket. In categories where the top books are selling 1000 copies a day (30,000 over the 30-day window), you'd have to give away 300,000 books to compete for first-page visibility. If you only gave away 200,000 books, you'd have to make up the difference with 10,000 paid sales. For most of us, it ain't gonna happen.
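The back-of-the-envelope numbers above can be reproduced with a short sketch. It assumes the guessed 1/10 freebie weighting and a 30-day window, and it leaves out the price weight entirely (in effect C = 1.0), as the figures above do. The function names are hypothetical.

```python
WINDOW_DAYS = 30
FREEBIES_PER_SALE = 10  # a freebie seems to count about 1/10 of a sale


def freebies_needed(target_daily_sales: int, paid_sales: int = 0) -> int:
    """Freebies required to match a target daily rate over the 30-day window."""
    shortfall = target_daily_sales * WINDOW_DAYS - paid_sales
    return max(shortfall, 0) * FREEBIES_PER_SALE


def paid_sales_needed(target_daily_sales: int, freebies: int) -> int:
    """Paid sales still required after counting freebies at 1/10 each."""
    shortfall = target_daily_sales * WINDOW_DAYS - freebies // FREEBIES_PER_SALE
    return max(shortfall, 0)


print(freebies_needed(1000))             # 300000
print(paid_sales_needed(1000, 200_000))  # 10000
```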



So if you're looking to the algorithms to help you sell, understand exactly what the algos are doing for you -- and how they work against you. There's no magic to them. It's all pure math. And Amazon may choose to change the math that feeds them tomorrow. Just maybe, the next changes will be in our favor...


Published on August 06, 2012 10:44


Phoenix Sullivan's Blog

Phoenix Sullivan