Handle Emojis & Emoticons

  • Emojis/emoticons often replace words: 🙂 = “happy”, 😢 = “sad”.

  • They affect sentiment polarity: “I love this movie 😂🔥” → highly positive.

  • Ignoring them can lead to loss of meaning.


Ways to Handle Emojis/Emoticons

1. Remove emojis/emoticons

  • Useful when emojis are irrelevant (e.g., formal text classification).

import re
text = "I am happy 😍 but tired :("
cleaned = re.sub(r'[^\w\s,]', '', text)  # keeps only word chars, whitespace & commas, so all other punctuation is removed too
print(cleaned)  # "I am happy  but tired "
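
The pattern above also strips ordinary punctuation such as apostrophes and periods. When that is not wanted, a narrower approach is possible. The following is a minimal sketch using only the standard library; the Unicode ranges and the emoticon list are assumptions chosen for illustration and are not exhaustive, and `strip_emojis_and_emoticons` is a hypothetical helper name.

```python
import re

# Assumption: a small, non-exhaustive set of Unicode ranges where most emojis live.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F1E6-\U0001F1FF"  # regional indicators (flags)
    "\U0001F300-\U0001FAFF"  # pictographs, emoticon faces, transport, extended symbols
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\uFE0F\u200D"           # variation selector and zero-width joiner
    "]+"
)

# Assumption: a hand-picked list of ASCII emoticons, matched only as whole tokens.
EMOTICON_PATTERN = re.compile(r"(?<!\S)(?::-?\)|:-?\(|;-?\)|:'\()(?!\S)")

def strip_emojis_and_emoticons(text: str) -> str:
    """Remove emojis and a few emoticons, leaving other punctuation intact."""
    text = EMOJI_PATTERN.sub("", text)
    text = EMOTICON_PATTERN.sub("", text)
    # Collapse the gaps left behind by the removed symbols.
    return re.sub(r"\s+", " ", text).strip()

print(strip_emojis_and_emoticons("Don't worry, I am happy 😍 but tired :("))
# Don't worry, I am happy but tired
```

The apostrophe in "Don't" survives here, which the broader pattern would have removed.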

2. Convert emojis/emoticons to words

  • Map 😀 → “smile”, 😢 → “sad”, etc.

  • Use libraries like emoji or custom dictionaries.

import emoji

text = "I love pizza 🍕😍"
converted = emoji.demojize(text)
print(converted)  # "I love pizza :pizza::smiling_face_with_heart-eyes:"
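
Shortcodes such as `:pizza:` are not always convenient tokens for a downstream model, and adjacent emojis produce adjacent shortcodes with no space between them. A possible post-processing step is sketched below. It works on a string that is already in shortcode form, so it needs only the standard library; `shortcodes_to_words` is a hypothetical helper, and the shortcode pattern is an assumption about what shortcodes look like.

```python
import re

# Assumption: a shortcode starts with a letter and contains letters, digits, "_" or "-".
SHORTCODE = re.compile(r":([A-Za-z][A-Za-z0-9_\-]*):")

def shortcodes_to_words(text: str) -> str:
    """Turn ':smiling_face:' style shortcodes into space-separated plain words."""
    def to_words(match: re.Match) -> str:
        words = match.group(1).replace("_", " ").replace("-", " ")
        return f" {words} "
    text = SHORTCODE.sub(to_words, text)
    # Normalise the extra spaces introduced around each shortcode.
    return re.sub(r"\s+", " ", text).strip()

demojized = "I love pizza :pizza::smiling_face_with_heart-eyes:"
print(shortcodes_to_words(demojized))
# I love pizza pizza smiling face with heart eyes
```

Requiring a leading letter keeps strings like `10:30:45` from being mistaken for a shortcode.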

3. Keep only emojis

  • Useful for emotion-only analysis.

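# Reuses `text` from the previous example. The check is per code point, so
# multi-code-point emojis (flags, skin tones, ZWJ sequences) may be split or dropped.
emoji_only = ''.join(char for char in text if char in emoji.EMOJI_DATA)
print(emoji_only)  # "🍕😍"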

4. Emoticon handling

  • Create a dictionary for emoticons:

emoticons = {
    ":)": "smile",
    ":-)": "smile",
    ":(": "sad",
    ":'(": "crying",
    ";)": "wink"
}
text = "I am happy :) but sad :("
for emo, meaning in emoticons.items():
    text = text.replace(emo, meaning)
print(text)  # "I am happy smile but sad sad"
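
Plain `str.replace` matches emoticons anywhere, including inside other text such as `below:)`. One way to make the replacement stricter is a single regular expression that only matches emoticons standing alone as tokens. This is a sketch under that assumption; `replace_emoticons` is a hypothetical helper and the mapping simply mirrors the dictionary above.

```python
import re

# Illustrative mapping; the entries mirror the dictionary above.
EMOTICONS = {
    ":)": "smile",
    ":-)": "smile",
    ":(": "sad",
    ":'(": "crying",
    ";)": "wink",
}

# Try longer emoticons first so a shorter one can never shadow a longer one.
_alternatives = "|".join(
    re.escape(emo) for emo in sorted(EMOTICONS, key=len, reverse=True)
)
# (?<!\S) and (?!\S) require whitespace or a string edge on both sides.
EMOTICON_RE = re.compile(rf"(?<!\S)(?:{_alternatives})(?!\S)")

def replace_emoticons(text: str) -> str:
    """Replace stand-alone emoticons with their word meaning."""
    return EMOTICON_RE.sub(lambda m: EMOTICONS[m.group(0)], text)

print(replace_emoticons("Results (see below:) look fine :)"))
# Results (see below:) look fine smile
```

The trade-off is that emoticons glued to a word, as in `great:)`, are left untouched; whether that is acceptable depends on the data.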

5. Sentiment-aware processing

  • Emojis can be features in models.

  • Example: 😍 → strong positive, 😡 → strong negative.

  • Libraries like VADER already account for emoticons.
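
To illustrate the idea of emojis as model features, here is a minimal sketch that turns the emojis in a text into a few numeric features. The polarity scores in `EMOJI_POLARITY` are made-up values for illustration only and do not come from VADER or any published lexicon; `emoji_features` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical polarity scores, for illustration only.
EMOJI_POLARITY = {
    "😍": 1.0,
    "😂": 0.5,
    "🔥": 0.5,
    "😢": -0.5,
    "😡": -1.0,
}

def emoji_features(text: str) -> dict:
    """Count known emojis and average their (assumed) polarity scores."""
    counts = Counter(ch for ch in text if ch in EMOJI_POLARITY)
    total = sum(counts.values())
    weighted = sum(EMOJI_POLARITY[ch] * n for ch, n in counts.items())
    return {
        "n_emojis": total,
        "n_positive": sum(n for ch, n in counts.items() if EMOJI_POLARITY[ch] > 0),
        "n_negative": sum(n for ch, n in counts.items() if EMOJI_POLARITY[ch] < 0),
        "emoji_polarity": weighted / total if total else 0.0,
    }

print(emoji_features("I love this movie 😂🔥"))
# {'n_emojis': 2, 'n_positive': 2, 'n_negative': 0, 'emoji_polarity': 0.5}
```

Features like these could be appended to a bag-of-words or embedding vector before training a classifier.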


Summary

  • Remove → when not needed.

  • Convert to text → keeps sentiment/context.

  • Extract only emojis → for emoji-specific tasks.

  • Use as features → boosts sentiment/emotion models.

# Install the emoji library if it is not already installed
# !pip install emoji

import re
import emoji

# Sample text with emojis and emoticons
text = "I love pizza 🍕😍 but I am tired 😴 :("

print("Original Text:")
print(text)

# 1️⃣ Remove emojis & emoticons
cleaned_text = re.sub(r'[^\w\s,]', '', text)  # keep only word chars, whitespace, and commas
print("\nAfter Removing Emojis & Emoticons:")
print(cleaned_text)

# 2️⃣ Convert emojis to words
converted_text = emoji.demojize(text)
print("\nAfter Converting Emojis to Words:")
print(converted_text)

# 3️⃣ Extract only emojis
emojis_only = ''.join(char for char in text if char in emoji.EMOJI_DATA)
print("\nExtracted Emojis Only:")
print(emojis_only)

# 4️⃣ Handle common emoticons with dictionary
emoticon_dict = {
    ":)": "smile",
    ":-)": "smile",
    ":(": "sad",
    ":-(": "sad"
}
text_with_emoticons_handled = text
for emo, meaning in emoticon_dict.items():
    text_with_emoticons_handled = text_with_emoticons_handled.replace(emo, meaning)

print("\nAfter Converting Emoticons to Words:")
print(text_with_emoticons_handled)

Original Text:
I love pizza 🍕😍 but I am tired 😴 :(

After Removing Emojis & Emoticons:
I love pizza  but I am tired  

After Converting Emojis to Words:
I love pizza :pizza::smiling_face_with_heart-eyes: but I am tired :sleeping_face: :(

Extracted Emojis Only:
🍕😍😴

After Converting Emoticons to Words:
I love pizza 🍕😍 but I am tired 😴 sad