Part-of-Speech (POS)

Part-of-speech tagging is the process of labeling each word (token) in a sentence with its grammatical category, such as noun, verb, or adjective. These tags give an NLP model explicit information about the structure of a sentence, which in turn helps it resolve meaning.

Example sentence:

The cat sat on the mat.

POS tags:

  • The → Determiner (DT)

  • cat → Noun (NN)

  • sat → Verb, past tense (VBD)

  • on → Preposition (IN)

  • the → Determiner (DT)

  • mat → Noun (NN)


Common Parts of Speech in NLP

| POS | Description | Example |
| --- | --- | --- |
| Noun (NN) | Person, place, thing, or idea | cat, city, happiness |
| Pronoun (PRP) | Replaces a noun | he, she, it, they |
| Verb (VB, VBD, VBG, VBN, VBP, VBZ) | Action or state | run, running, ran, is, have |
| Adjective (JJ) | Describes a noun | beautiful, tall, red |
| Adverb (RB) | Describes a verb, adjective, or other adverb | quickly, very, well |
| Determiner (DT) | Introduces a noun | the, a, this, those |
| Preposition (IN) | Shows relationship between noun and other words | in, on, at, by |
| Conjunction (CC) | Connects words or phrases | and, but, or |
| Interjection (UH) | Expresses emotion | wow!, oh!, hey! |
| Modal (MD) | Expresses possibility, ability, necessity | can, should, must |
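
Some categories above cover several tags (the verb row alone lists six). As a minimal sketch of how fine-grained tags might be collapsed into these coarse categories, the snippet below matches on tag prefixes. `COARSE_BY_PREFIX` and `coarse_category` are hypothetical names introduced for illustration, and the mapping covers only the categories listed above:

```python
# Hypothetical helper: collapse fine-grained tags into the coarse
# categories listed above. Tags outside that list fall back to "Other".
COARSE_BY_PREFIX = [
    ("NN", "Noun"),         # NN and variants such as NNS, NNP
    ("PRP", "Pronoun"),     # PRP, PRP$
    ("VB", "Verb"),         # VB, VBD, VBG, VBN, VBP, VBZ
    ("JJ", "Adjective"),    # JJ, JJR, JJS
    ("RB", "Adverb"),       # RB, RBR, RBS
    ("DT", "Determiner"),
    ("IN", "Preposition"),
    ("CC", "Conjunction"),
    ("UH", "Interjection"),
    ("MD", "Modal"),
]


def coarse_category(tag: str) -> str:
    """Return the coarse category for a tag, or "Other" if it is not covered."""
    for prefix, category in COARSE_BY_PREFIX:
        if tag.startswith(prefix):
            return category
    return "Other"


# Tagged form of "The cat sat on the mat."
tagged = [("The", "DT"), ("cat", "NN"), ("sat", "VBD"), ("on", "IN"),
          ("the", "DT"), ("mat", "NN"), (".", ".")]
for word, tag in tagged:
    print(word, tag, coarse_category(tag))
```

Prefix matching keeps the lookup short, at the cost of treating every tag that shares a prefix as the same category.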


Why POS Tagging is Important in NLP

  1. Syntax Understanding

    • Helps models understand sentence structure.

    • Example: Distinguishing “run” as a noun (“a morning run”) from “run” as a verb (“I run every day”).

  2. Named Entity Recognition (NER)

    • Proper-noun tags (NNP, NNPS) flag candidate names of people, organizations, and locations.

  3. Sentiment Analysis

    • Adjectives and adverbs often carry sentiment; POS tagging helps isolate them.

  4. Information Extraction

    • Helps extract subject, object, and action from a sentence.

  5. Machine Translation

    • Knowing each word's grammatical role helps a translation system choose the right word sense and word order in the target language.
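
As a rough illustration of the sentiment-analysis point above, the sketch below keeps only the adjectives and adverbs from a tagged sentence. The sentence and its tags are hand-written for this example rather than produced by a tagger, and `sentiment_candidates` is a hypothetical helper name:

```python
# Hand-tagged example sentence; in practice these (word, tag) pairs
# would come from a POS tagger.
tagged = [
    ("The", "DT"), ("movie", "NN"), ("was", "VBD"),
    ("surprisingly", "RB"), ("good", "JJ"), ("but", "CC"),
    ("the", "DT"), ("ending", "NN"), ("felt", "VBD"),
    ("very", "RB"), ("weak", "JJ"), (".", "."),
]


def sentiment_candidates(tagged_tokens):
    """Keep words tagged as adjectives (JJ*) or adverbs (RB*)."""
    return [word for word, tag in tagged_tokens
            if tag.startswith(("JJ", "RB"))]


print(sentiment_candidates(tagged))
# ['surprisingly', 'good', 'very', 'weak']
```

The filtered words are only candidates: a sentiment model would still need to score them.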


POS Tagging in NLP Tools

  • NLTK and spaCy are popular libraries for POS tagging.

Example with NLTK and spaCy

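import nltk
import spacy

# Download the tokenizer and tagger data used by NLTK (skipped if already present).
# Recent NLTK releases may ask for 'punkt_tab' and
# 'averaged_perceptron_tagger_eng' instead.
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

sentence = "The cat sat on the mat."

# NLTK: split the sentence into tokens, then tag each token.
words = nltk.word_tokenize(sentence)
pos_tags = nltk.pos_tag(words)
print(pos_tags)

# spaCy: one pipeline call tokenizes and tags; token.pos_ is the coarse POS label.
# The model must be installed first: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp(sentence)
for token in doc:
    print(token.text, token.pos_)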

Output:

[('The', 'DT'), ('cat', 'NN'), ('sat', 'VBD'), ('on', 'IN'), ('the', 'DT'), ('mat', 'NN'), ('.', '.')]
The DET
cat NOUN
sat VERB
on ADP
the DET
mat NOUN
. PUNCT
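
The two results use different label sets: `nltk.pos_tag` returns Penn Treebank tags such as DT and VBD, while spaCy's `token.pos_` returns coarse Universal POS labels such as DET and VERB. The sketch below relates the two for this one sentence. `PTB_TO_UPOS` is a hand-written, partial mapping covering only the tags that appear here; it is an assumption for illustration, not a complete conversion table:

```python
# Hand-written partial mapping, covering only the tags seen in this example.
# Note: IN can also mark subordinating conjunctions; ADP fits "on" here.
PTB_TO_UPOS = {
    "DT": "DET",
    "NN": "NOUN",
    "VBD": "VERB",
    "IN": "ADP",
    ".": "PUNCT",
}

# The (word, tag) pairs printed by the NLTK example.
nltk_output = [("The", "DT"), ("cat", "NN"), ("sat", "VBD"), ("on", "IN"),
               ("the", "DT"), ("mat", "NN"), (".", ".")]

# Unknown tags fall back to "X", the Universal POS label for "other".
converted = [(word, PTB_TO_UPOS.get(tag, "X")) for word, tag in nltk_output]
for word, label in converted:
    print(word, label)
```

In spaCy, the fine-grained tag for each token is also available as `token.tag_`, alongside the coarse `token.pos_`.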

Key Points

  • POS tagging assigns grammatical roles to words.

  • It supports downstream tasks such as syntactic parsing, sentiment analysis, and information extraction.

  • NLP libraries such as NLTK, spaCy, and Stanza (the successor to StanfordNLP) ship pretrained taggers, so tagging a sentence takes only a few lines of code.