SEO Automation for Expansive Websites #SMX #14A
SEO Automation for Expansive Websites #SMX #14A was originally published on BruceClay.com, home of expert search engine optimization tips.

Michael Nguyen, Director of SEO at Connexity/Shopzilla, on stage at SMX West 2015
This SMX West 2015 mini session is grounded in the belief that you can automate everything for SEO and integrate it directly into your products. This is especially important for large websites. Michael Nguyen, Director of SEO at Connexity/Shopzilla, is our presenter.
SEO for Large Sites
What is large? Too big to fit in your head. Examples are networks of sites and sites with thousands or millions of pieces of content. Enterprise sites and long-tail businesses also count as large.
Large sites have big problems. Good things about large sites include domain authority and a lot of content. But problems include:
Too many products!
Duplicate content and canonical issues
Crawl efficiency
Shifting inventory
Shifting search demand
This is a discovery optimization problem: how do you make it easy for users and search engines to get to the content you care about? The solution is to organize content, surface content and improve the content’s value.
Checklist if you’ve inherited a large site and run into the problems:
Identify valuable content: figure out the pages/products that are valuable
Keyword research: natural language and not jargon
Site architecture: manually create a taxonomy with the help of info architects and keyword research
Content for category pages: manually create content for landing pages to merchandise products/content
Flag dupe content: audit content and map duplicate to a canonical URL
Deal with stale inventory: audit content/products for removal and canonicalization. If it’s temporary, keep it up; if it’s permanent, redirect or 404.
Deal with pagination and facets: create rules for managing crawl and technical SEO. Control the crawl.
Promote high quality content: remove the junk, market the good, and be selective.
Build category linking support: link to categories and subcats in all key areas. Flatten the crawl architecture by linking across deep pages. Remember links come in from everywhere.
You’re done! … Not so fast. If you’re doing the whole process by hand, you’ll find you run into changes as you go. Change is constant. Inventory changes and the business changes, so this is not a one-time deal. Doing it manually doesn’t scale for large operations; there are too many variables to manage. The answer is automation. Scale your operation with technology and process.
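As a small illustration of turning one checklist item into code, the stale-inventory rule from the checklist (temporary: keep it up; permanent: redirect or 404) could be sketched in Python roughly as follows. The Product fields, and the choice of a 301 redirect when a replacement page exists, are assumptions made for this example and were not shown in the session.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Product:
    url: str
    in_stock: bool
    discontinued: bool                     # hypothetical flag: gone for good
    replacement_url: Optional[str] = None  # hypothetical: closest live page


def stale_inventory_action(product: Product) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, redirect target) for a product page."""
    if product.in_stock or not product.discontinued:
        # In stock, or only temporarily unavailable: keep the page up.
        return 200, None
    if product.replacement_url:
        # Permanently gone, but a relevant page exists: redirect to it.
        return 301, product.replacement_url
    # Permanently gone with nowhere sensible to send the visitor.
    return 404, None
```

A rule like this can run whenever the inventory feed updates, so nobody has to re-audit stale pages by hand.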
How to Automate
1. Identify routine tasks. Figure out how often you do a task. Start small. Automate very small tasks to start.
2. Improve operational efficiency. Focus on speeding up processes.
3. Test and validate. When you start to trust an automated system, check to make sure it’s doing the job right.
4. Keep in mind the bigger picture.
Great SEO Platforms
Characteristics of a great SEO platform:
Make the core product better
Enable testing and experimentation
Leverage big data
Combine data and expert intuition
Dynamic content management
Related searches
Page scoring: if you can evaluate content with KPIs, you can direct users to the pages that matter most. This scoring is business-specific.
Duplicate content classification: ensure that you only promote your most useful content
Backlink classification: easily audit large amounts of backlinks
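To make the page scoring idea above concrete, here is a minimal Python sketch of a weighted KPI score. The KPI names and weights are hypothetical, and the sketch assumes each KPI has already been normalized to a 0 to 1 range; a real business would pick its own KPIs and weighting.

```python
from typing import Dict, List, Tuple

# Hypothetical weights; which KPIs matter is business-specific.
WEIGHTS = {"conversion_rate": 0.5, "revenue_share": 0.3, "organic_share": 0.2}


def score_page(kpis: Dict[str, float],
               weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of KPIs; a KPI missing from a page counts as 0."""
    return sum(w * kpis.get(name, 0.0) for name, w in weights.items())


def rank_pages(pages: Dict[str, Dict[str, float]],
               top_n: int = 10) -> List[Tuple[str, float]]:
    """Return the top_n (url, score) pairs, highest score first."""
    scored = [(url, score_page(kpis)) for url, kpis in pages.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

The ranked list could then decide which pages get internal links or category placement, which is one way to direct users to the pages that matter most.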
Get to Good Enough
You can’t ever get to perfection. You’ll never automate yourself out of a job. Get to good enough first. Try to get value out of every step. APIs are your friend; you don’t always have to build things in-house. Offload tasks and data gathering (AuthorityLabs, GA/GWT, Deepcrawl, Botify). Use your internal search engine to understand search engine concepts like keyword data. Leverage data and search science: books on information retrieval and open source tools for sentiment analysis, spam filtering and duplicate content classification (Nguyen gives Mahout, Hadoop and HBase as examples). Build a feedback loop into your system so that it generates data, makes judgments, continually tracks what’s going on and feeds the results back into the system.
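In the spirit of "good enough", a duplicate content check does not have to start with a large framework. The Python sketch below is a deliberately simple stand-in, not the classifier described in the session: it compares pages by word shingles and Jaccard similarity and maps each near-duplicate to the first similar page seen. The function names and the 0.8 threshold are assumptions, and the pairwise comparison would need something like MinHash to scale to millions of pages.

```python
import re
from typing import Dict, List, Set, Tuple


def shingles(text: str, k: int = 3) -> Set[Tuple[str, ...]]:
    """Set of k-word sequences in the text (lowercased, punctuation dropped)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Overlap of two sets: |a & b| / |a | b|; two empty sets count as equal."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def flag_duplicates(pages: Dict[str, str],
                    threshold: float = 0.8) -> Dict[str, str]:
    """Map each near-duplicate URL to the first-seen URL it resembles."""
    canonical_for: Dict[str, str] = {}
    kept: List[Tuple[str, set]] = []  # pages treated as canonical so far
    for url, text in pages.items():
        sig = shingles(text)
        match = next((c for c, s in kept if jaccard(sig, s) >= threshold), None)
        if match is None:
            kept.append((url, sig))
        else:
            canonical_for[url] = match
    return canonical_for
```

The output is a duplicate-to-canonical mapping, which is the same audit result the manual checklist asks for, and its flagged pairs can be reviewed by a person as part of the feedback loop.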
This is the future of SEO.