Naive Bayes Algorithm

Naive Bayes is a probabilistic classification algorithm based on Bayes’ Theorem.
It is especially useful for text classification tasks such as spam detection and sentiment analysis.

🔶 Core Principle of Naive Bayes:

It is built on Bayes’ Theorem, which combines the prior probability and the likelihood to compute the posterior probability of an event.

📐 Bayes’ Theorem:

$$P(C \mid X) = \frac{P(X \mid C)\, P(C)}{P(X)}$$

where:

P(C∣X): posterior probability of class C given the input X
P(X∣C): likelihood of observing X in class C
P(C): prior probability of class C
P(X): total probability of the input X (the evidence)
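
As a quick numeric check of Bayes’ Theorem, the sketch below plugs made-up numbers into the formula. The values (20% of emails are spam, the word "offer" appears in 60% of spam and 5% of ham) and the helper name `posterior` are illustrative assumptions, not figures from any dataset.

```python
def posterior(prior_c, likelihood_x_given_c, evidence_x):
    """Bayes' theorem: P(C|X) = P(X|C) * P(C) / P(X)."""
    return likelihood_x_given_c * prior_c / evidence_x

# Hypothetical numbers, chosen only for illustration.
p_spam = 0.2                 # P(C): prior probability of spam
p_offer_given_spam = 0.6     # P(X|C): "offer" appears in a spam email
p_offer_given_ham = 0.05     # P(X|not C): "offer" appears in a ham email

# P(X) via the law of total probability over both classes.
p_offer = p_offer_given_spam * p_spam + p_offer_given_ham * (1 - p_spam)

p_spam_given_offer = posterior(p_spam, p_offer_given_spam, p_offer)
print(round(p_spam_given_offer, 3))  # 0.75
```

With these assumed numbers, seeing the word raises the probability of spam from the 20% prior to a 75% posterior.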

🎯 Naive Assumption:

Naive Bayes is called “naive” because it assumes that:

all features are independent of one another given the class, that is,

$$P(X \mid C) = P(x_1, x_2, \dots, x_n \mid C) = \prod_{i=1}^{n} P(x_i \mid C)$$
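
Under the independence assumption, the likelihood of the whole input for a class is simply the product of the per-feature likelihoods. A minimal sketch, assuming three made-up likelihood values; the function names are hypothetical. The log-space variant is a common practical choice because multiplying many small probabilities can underflow.

```python
import math

def class_conditional_likelihood(feature_likelihoods):
    """Naive assumption: P(X|C) is the product of the per-feature P(x_i|C)."""
    return math.prod(feature_likelihoods)

def log_class_conditional_likelihood(feature_likelihoods):
    """Same quantity in log space, which avoids underflow with many features."""
    return sum(math.log(p) for p in feature_likelihoods)

# Hypothetical per-feature likelihoods P(x_i|C) for one class.
likelihoods = [0.5, 0.2, 0.1]

p_x_given_c = class_conditional_likelihood(likelihoods)
log_p_x_given_c = log_class_conditional_likelihood(likelihoods)

print(p_x_given_c)
print(math.exp(log_p_x_given_c))  # same value, up to floating-point rounding
```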

🔍 Types of Naive Bayes:

Type	Use Case	Feature Type
Gaussian Naive Bayes	Continuous measurements (e.g., height)	Numerical (continuous)
Multinomial Naive Bayes	Text classification (e.g., spam)	Discrete counts
Bernoulli Naive Bayes	Binary features (yes/no)	Boolean (0/1)
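
To make the difference between count features and binary features concrete, here is a rough sketch of how the per-class log-likelihood is usually written for the multinomial and Bernoulli variants. The three-word vocabulary and all probabilities are made-up assumptions, and the function names are hypothetical.

```python
import math

# Hypothetical per-class parameters for a 3-word vocabulary.
vocab = ["cheap", "offer", "hello"]
p_word = [0.5, 0.4, 0.1]      # Multinomial: P(word | C), sums to 1
p_present = [0.7, 0.6, 0.2]   # Bernoulli: P(word appears at least once | C)

def multinomial_log_likelihood(counts, p_word):
    """Sum of count_i * log P(word_i|C); absent words contribute nothing.
    The multinomial coefficient is omitted because it is the same for every class."""
    return sum(c * math.log(p) for c, p in zip(counts, p_word))

def bernoulli_log_likelihood(present, p_present):
    """Sum over all words: log p_i if present, log(1 - p_i) if absent."""
    return sum(math.log(p) if x else math.log(1 - p)
               for x, p in zip(present, p_present))

counts = [2, 0, 0]                       # the document "cheap cheap"
present = [int(c > 0) for c in counts]   # [1, 0, 0]

multi_ll = multinomial_log_likelihood(counts, p_word)
bern_ll = bernoulli_log_likelihood(present, p_present)
print(multi_ll, bern_ll)
```

The multinomial form uses how often each word occurs, while the Bernoulli form ignores repeats but explicitly penalizes words that are absent.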

🔬 Gaussian Naive Bayes Formula:

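If a feature x is continuous, we assume that within each class C it follows a Gaussian (normal) distribution:

$$P(x \mid C) = \frac{1}{\sqrt{2\pi\sigma_C^2}} \exp\left(-\frac{(x - \mu_C)^2}{2\sigma_C^2}\right)$$

where μ_C and σ_C² are the mean and variance of the feature x in class C, estimated from the training data.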
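
The Gaussian likelihood of a continuous feature value can be computed by hand from the class mean and variance. A minimal sketch, assuming five made-up heights for one class and the population (maximum-likelihood) variance; the function name `gaussian_likelihood` is hypothetical.

```python
import math
from statistics import mean, pvariance

def gaussian_likelihood(x, mu, var):
    """Gaussian density: exp(-(x - mu)^2 / (2 * var)) / sqrt(2 * pi * var)."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

# Hypothetical heights (cm) observed for one class C.
heights_in_class = [160.0, 165.0, 170.0, 175.0, 180.0]

mu = mean(heights_in_class)          # class mean
var = pvariance(heights_in_class)    # population variance (an assumption of this sketch)

likelihood = gaussian_likelihood(172.0, mu, var)
print(mu, var, likelihood)
```

Note that the result is a density rather than a probability, so for small variances it can be larger than 1.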

🔧 Naive Bayes Code in Sklearn:

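from sklearn.datasets import load_iris  # example dataset (an assumption, any numeric data works)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GaussianNB()
model.fit(X_train, y_train)  # estimates a per-class mean and variance for each feature

y_pred = model.predict(X_test)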

📄 Text Classification (Multinomial Naive Bayes) Example:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["spam offer now", "buy cheap", "hello friend", "how are you"]
labels = [1, 1, 0, 0]  # 1=spam, 0=ham

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

model = MultinomialNB()
model.fit(X, labels)

test = vectorizer.transform(["cheap offer"])
print(model.predict(test))  # Output: [1]
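
To see what happens behind the prediction, here is a from-scratch sketch of the same calculation using only the standard library. It makes simplifying assumptions: plain whitespace tokenization instead of CountVectorizer's tokenizer, and add-one (Laplace) smoothing, which matches MultinomialNB's default `alpha=1.0`. The helper name `log_posterior` is hypothetical.

```python
import math
from collections import Counter

texts = ["spam offer now", "buy cheap", "hello friend", "how are you"]
labels = [1, 1, 0, 0]  # 1=spam, 0=ham

vocab = sorted({word for text in texts for word in text.split()})

# Word counts and document counts per class.
word_counts = {c: Counter() for c in set(labels)}
doc_counts = Counter(labels)
for text, label in zip(texts, labels):
    word_counts[label].update(text.split())

def log_posterior(text, c, alpha=1.0):
    """Unnormalized log P(C|X) = log P(C) + sum of log P(word|C), with add-alpha smoothing."""
    total = sum(word_counts[c].values())
    score = math.log(doc_counts[c] / len(labels))
    for word in text.split():
        if word in vocab:  # words never seen in training are ignored
            score += math.log((word_counts[c][word] + alpha) / (total + alpha * len(vocab)))
    return score

scores = {c: log_posterior("cheap offer", c) for c in doc_counts}
prediction = max(scores, key=scores.get)
print(prediction)  # 1
```

Both "cheap" and "offer" were seen only in spam messages, so each has a smoothed probability of 2/15 under spam versus 1/15 under ham, which is why class 1 wins.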

📌 Advantages:

✅ Fast and memory-efficient
✅ Probabilistic interpretation (each prediction comes with class probabilities that serve as a confidence level)
✅ Works well for text and NLP tasks
✅ Performs well even with little training data
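
The confidence level comes from normalizing the per-class scores so that they sum to 1; in scikit-learn, `predict_proba` returns such normalized values. A minimal sketch of that normalization, assuming two made-up classes and scores; the function name `normalize_log_scores` is hypothetical.

```python
import math

def normalize_log_scores(log_scores):
    """Turn unnormalized log P(C) + log P(X|C) per class into probabilities that sum to 1."""
    m = max(log_scores.values())  # subtract the max for numerical stability
    exp_scores = {c: math.exp(s - m) for c, s in log_scores.items()}
    total = sum(exp_scores.values())
    return {c: v / total for c, v in exp_scores.items()}

# Hypothetical unnormalized log scores for two classes.
log_scores = {
    "spam": math.log(0.5 * (2 / 15) ** 2),
    "ham": math.log(0.5 * (1 / 15) ** 2),
}

probabilities = normalize_log_scores(log_scores)
print(probabilities)
```

Here the spam score is four times the ham score, so the normalized probabilities come out at roughly 0.8 and 0.2. These values are best read as a ranking signal, since Naive Bayes probabilities are often poorly calibrated.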

⚠️ Limitations:

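❌ The independence assumption does not always hold
❌ Accuracy can drop on complex datasets, especially when features are strongly correlated
❌ For continuous features, Gaussian Naive Bayes relies on the assumption that each feature is normally distributed within each class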
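
A small illustration of why a violated independence assumption matters: if the same feature is effectively counted several times, Naive Bayes treats each copy as fresh evidence and becomes overconfident. The sketch assumes two classes A and B, equal priors, and a made-up likelihood ratio of 2; the function name `posterior_class_a` is hypothetical.

```python
def posterior_class_a(likelihood_ratios, prior_a=0.5):
    """P(A|X) for two classes A and B, given per-feature ratios P(x_i|A) / P(x_i|B)."""
    odds = prior_a / (1 - prior_a)
    for ratio in likelihood_ratios:
        odds *= ratio  # naive assumption: every feature counts as independent evidence
    return odds / (1 + odds)

# Hypothetical feature that is twice as likely under class A as under class B.
single = posterior_class_a([2.0])

# The same feature copied three times (perfectly correlated, no new information).
duplicated = posterior_class_a([2.0, 2.0, 2.0])

print(round(single, 3), round(duplicated, 3))  # 0.667 0.889
```

The duplicated copies carry no new information, yet the posterior rises from about 0.67 to about 0.89.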

📊 Summary Table:

Element	Description
Based On	Bayes’ Theorem
Assumption	Feature Independence
Output	Class probabilities (0 to 1)
Applications	Spam Detection, NLP, Document Classification
Speed	Very Fast

📝 Practice Questions:

Why is Naive Bayes called “naive”?
What is the formula for Bayes’ Theorem, and what does it mean?
How is the likelihood computed in Gaussian Naive Bayes?
What is the difference between Multinomial and Bernoulli Naive Bayes?
When should Naive Bayes not be used?