The system also discovers and maps all of your internal links, builds a link graph and calculates internal PageRank for each page. This gives a more nuanced view of how pages are connected, which can inform link recommendation strategy and help fine-tune the final output. Internal page authority is also useful for creating before-and-after optimisation projections and tying them to your business goals.
The final outcome of data extraction is:
- Clean text
- Page metadata
- Internal link graph, anchor text and PageRank
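As a rough illustration of how internal PageRank can be derived from a link graph, here is a minimal pure-Python sketch using power iteration. The function name `internal_pagerank`, the damping factor of 0.85 and the toy URLs are assumptions for the example only and do not describe the production system.

```python
def internal_pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Compute PageRank for a link graph given as {source_url: [target_urls]}."""
    # Collect every URL that appears as a source or a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    # De-duplicate outlinks and ignore self-links.
    out = {p: [t for t in dict.fromkeys(links.get(p, ())) if t != p] for p in pages}
    rank = {p: 1.0 / n for p in pages}

    for _ in range(max_iter):
        # Pages without outlinks spread their rank evenly over all pages.
        dangling = sum(rank[p] for p in pages if not out[p])
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            if out[p]:
                share = damping * rank[p] / len(out[p])
                for t in out[p]:
                    new[t] += share
        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:
            break
    return rank


# Toy site: the home page links to two pages, which link back.
toy_links = {
    "/": ["/a", "/b"],
    "/a": ["/"],
    "/b": ["/", "/a"],
}
toy_rank = internal_pagerank(toy_links)
```

In this toy graph the home page ends up with the highest score because it receives links from both other pages. On a real site the graph would come from the crawl, and a graph library would normally replace the hand-written loop.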
BERT Vector Embeddings
Pre-processed and tokenised text is then converted into multi-dimensional vectors using language-agnostic BERT sentence embeddings. This enables similarity searches across pages in any major human language, making the approach suitable for multilingual websites with complex alternate hreflang setups.
Similarity Search
In this stage we use cosine similarity to generate a similarity matrix, from which we can produce an arbitrary number of link recommendations in a many-to-many scenario. The matrix maps every page to all similar pages in the entire dataset.
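To make the idea concrete, here is a minimal pure-Python sketch of cosine similarity and a top-k lookup per page. The helper names (`cosine_similarity`, `top_k_similar`) and the three-dimensional toy vectors are assumptions for illustration; real sentence embeddings have hundreds of dimensions, and at scale the pairwise loop would typically be replaced by vectorised or approximate nearest-neighbour search.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k_similar(embeddings, k=3):
    """Return {url: [(other_url, score), ...]} with the k most similar pages."""
    result = {}
    for url, vec in embeddings.items():
        scored = [
            (other, cosine_similarity(vec, other_vec))
            for other, other_vec in embeddings.items()
            if other != url  # a page is never recommended to itself
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        result[url] = scored[:k]
    return result


# Hypothetical embeddings, hand-written so the example stays self-contained.
toy_embeddings = {
    "/shoes": [1.0, 0.9, 0.0],
    "/trainers": [0.9, 1.0, 0.1],
    "/returns-policy": [0.0, 0.1, 1.0],
}
toy_matches = top_k_similar(toy_embeddings, k=1)
```

Here `/shoes` and `/trainers` point in almost the same direction, so each is the other's closest match, while `/returns-policy` scores low against both.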
Prioritisation
Your similarity matrix can hold many thousands, even millions, of link recommendations, so it’s important to prioritise the rollout carefully: use heuristics to find high-impact link recommendations and filter out the rest.
Example rules:
- Only link from higher-PageRank to lower-PageRank URLs
- Exclude pages which already link to each other
- Prevent linking between pages that are too far apart in authority
- Suggest links only between certain topical clusters
Note: We do not recommend automatic linking; however, LLM-based automated evaluation is an option.
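The example rules above could be expressed as a simple filter over candidate links. The sketch below is only one possible reading of them: the `Candidate` class, the `prioritise` function, the authority-ratio threshold of 50 and the cluster labels are all hypothetical, and "already link to each other" is interpreted here as an existing link from source to target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    source: str        # page that would carry the new link
    target: str        # page that would receive it
    similarity: float  # cosine similarity between the two pages


def prioritise(candidates, pagerank, existing_links, cluster_of,
               allowed_cluster_pairs, max_authority_ratio=50.0):
    """Apply the four example rules, then sort by similarity (highest first)."""
    kept = []
    for c in candidates:
        src_pr, tgt_pr = pagerank[c.source], pagerank[c.target]
        # Rule 1: only link from higher-PageRank to lower-PageRank URLs.
        if src_pr <= tgt_pr:
            continue
        # Rule 2: skip pairs where the source already links to the target.
        if (c.source, c.target) in existing_links:
            continue
        # Rule 3: skip pairs that are too far apart in authority.
        if tgt_pr <= 0 or src_pr / tgt_pr > max_authority_ratio:
            continue
        # Rule 4: only suggest links between permitted topical clusters.
        if (cluster_of[c.source], cluster_of[c.target]) not in allowed_cluster_pairs:
            continue
        kept.append(c)
    return sorted(kept, key=lambda c: c.similarity, reverse=True)


# Hypothetical inputs for illustration.
pagerank = {"/": 0.4, "/a": 0.3, "/b": 0.2, "/c": 0.001}
existing_links = {("/", "/a")}
cluster_of = {"/": "shop", "/a": "shop", "/b": "shop", "/c": "blog"}
allowed_cluster_pairs = {("shop", "shop")}
candidates = [
    Candidate("/", "/a", 0.90),   # dropped: link already exists
    Candidate("/a", "/b", 0.80),  # kept
    Candidate("/b", "/a", 0.85),  # dropped: lower to higher PageRank
    Candidate("/", "/b", 0.70),   # kept
    Candidate("/", "/c", 0.95),   # dropped: authority gap too large
]
shortlist = prioritise(candidates, pagerank, existing_links,
                       cluster_of, allowed_cluster_pairs)
```

Keeping each rule as a separate check makes it easy to switch rules on or off per project and to report how many recommendations each rule removed.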
Final Recommendations
You receive a spreadsheet of all link recommendations, prioritised and sorted for implementation. Our work isn’t finished at this stage, however: we stay with you, offering advice, guidance and assistance during implementation, and help you measure and report on the impact of your internal link optimisation project.