©Copyright 2025 Maple Commerce
What are document N-grams?

Written by IV (Creator)
Published on 4/11/2025

In Semantic SEO, document N-grams are sequences of N items (typically words) found within a document. The sequences are usually contiguous; non-contiguous variants are called skip-grams. Here are several key aspects of document N-grams:

  • Identification of patterns and phrases: N-gram analysis involves processing text to identify how frequently different sequences of words appear. This helps in understanding the common phrases and linguistic structures present in a document.
  • Different types of N-grams: The value of 'N' determines the type of n-gram. Bigrams consist of two consecutive words, trigrams of three, four-grams of four, and so on. Skip-grams, by contrast, are non-contiguous sequences in which some words are skipped (e.g., 1-skip bigrams, 2-skip bigrams). In "the quick brown fox", for example, "the brown" is a 1-skip bigram because one word ("quick") is skipped.
  • Understanding document topic and context: Site-wide n-grams, which appear on every web page of a source, are particularly helpful for search engines to locate the main topic and macro context of the entire website. Analysing the consistent appearance of certain target words across a document can help understand its overall character.
  • Semantic SEO and ranking: Unique phrase sequences or unique n-grams containing original information can convey authority on a topic. By providing unique n-grams, particularly within supplementary content, a website can be perceived as an authority by search engines. Using lexical relationships, like hyponyms, can aid in creating these unique n-grams.
  • Query semantics: Understanding the n-grams within documents is related to query semantics, which focuses on the meaning and relevance of search terms. Search engines use n-gram analysis, along with other Natural Language Processing (NLP) techniques, to understand the relationship between queries and documents, focusing on context rather than just string matching.
  • Tools for analysis: Tools like Oncrawl offer features such as "N-gram Analysis as site-wide" to help analyse these word sequences within a website.
  • Sequence modelling: Sequence modelling, which is the backbone of semantic SEO, involves estimating the likelihood of words appearing together. N-gram analysis contributes to this by revealing which word sequences are common in documents.

In essence, document N-grams provide a way to analyse the composition of text at a multi-word level, offering insights into the content's themes, linguistic patterns, and its potential relevance to user queries for search engines.
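
To make the ideas above concrete, here is a minimal Python sketch of contiguous n-grams, k-skip bigrams, site-wide n-grams and a simple word-sequence likelihood. It is an illustration under stated assumptions, not how any search engine or SEO tool actually works: the tokenizer is deliberately naive, the function names (tokenize, ngrams, skip_bigrams, site_wide_ngrams, bigram_probability) and the sample "Acme Shop" pages are hypothetical, and "k-skip bigram" is taken to mean a word pair with at most k words skipped between the two words (so plain bigrams are included).

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (naive tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-grams as tuples, in document order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skip_bigrams(tokens, k):
    """Return word pairs with at most k skipped words between them.

    Assumption: k=0 yields ordinary bigrams; k=1 adds pairs that skip one word.
    """
    pairs = []
    for i, first in enumerate(tokens):
        for j in range(i + 1, min(i + k + 2, len(tokens))):
            pairs.append((first, tokens[j]))
    return pairs


def site_wide_ngrams(pages, n):
    """Return the n-grams that occur on every page in `pages`."""
    per_page = [set(ngrams(tokenize(page), n)) for page in pages]
    return set.intersection(*per_page) if per_page else set()


def bigram_probability(tokens, first, second):
    """Maximum-likelihood estimate of P(second | first) from bigram counts."""
    counts = Counter(ngrams(tokens, 2))
    first_total = sum(c for (a, _), c in counts.items() if a == first)
    return counts[(first, second)] / first_total if first_total else 0.0


if __name__ == "__main__":
    tokens = tokenize("the quick brown fox")
    print(ngrams(tokens, 2))        # contiguous bigrams
    print(skip_bigrams(tokens, 1))  # 1-skip bigrams, e.g. ('the', 'brown')

    # Hypothetical pages from one site; the shared header text is site-wide.
    pages = [
        "Acme Shop semantic seo guide",
        "Acme Shop semantic seo checklist",
        "About Acme Shop semantic seo team",
    ]
    print(sorted(site_wide_ngrams(pages, 2)))

    text = tokenize("semantic seo and technical seo and semantic search")
    print(Counter(ngrams(text, 2)).most_common(1))    # most frequent bigram
    print(bigram_probability(text, "semantic", "seo"))  # 0.5
```

In the sample pages, the bigrams shared by every page ("acme shop", "shop semantic", "semantic seo") are the site-wide ones, while page-specific bigrams such as "seo guide" drop out. The probability function is the simplest possible form of sequence modelling: it only counts how often one word follows another in the given text.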
