Completed Research

Fake News Detector

Python, TensorFlow / Keras, LSTM, NLP, pandas, NumPy

About this project

A recurrent neural network for detecting fake news from article text. An LSTM model is trained on labelled news corpora and classifies the credibility of an article at inference time.

Background

This project explored text classification with sequence models, motivated by the question of whether surface-level linguistic patterns in news articles are predictive of credibility. The hypothesis was that manipulated or fabricated content carries systematic stylistic signatures (specific vocabulary choices, sentence structures, or narrative patterns) that a trained model can learn to detect.

An LSTM architecture suited this problem because news articles are sequences in which long-range dependencies matter: the relationship between the opening framing and the conclusion carries information about credibility that a bag-of-words model, which ignores word order, would miss. The text pre-processing pipeline handles the noisy input typical of news text in three steps: tokenisation, stop-word removal, and embedding, which converts the tokens into the vector representations the network needs.
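The pre-processing steps described above can be sketched roughly as follows. This is a minimal, standard-library illustration rather than the project's actual pipeline: the stop-word list, the function names (`tokenise`, `build_vocab`, `encode`), the reserved ids and the example sentences are all assumptions made for the example. The padded integer sequences it produces are the kind of input an embedding layer in front of an LSTM would typically consume.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "that", "it"}
PAD_ID, UNK_ID = 0, 1  # reserved ids for padding and out-of-vocabulary tokens


def tokenise(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_vocab(token_lists, max_words=10000):
    """Map the most frequent tokens to integer ids, starting after the reserved ids."""
    counts = Counter(t for tokens in token_lists for t in tokens)
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}


def encode(tokens, vocab, max_len):
    """Convert tokens to ids, then truncate or right-pad to exactly max_len."""
    ids = [vocab.get(t, UNK_ID) for t in tokens][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


# Made-up example sentences, not taken from any dataset.
articles = [
    "The senator said the bill is dead.",
    "Shocking! You won't believe the miracle cure.",
]
token_lists = [tokenise(a) for a in articles]
vocab = build_vocab(token_lists)
sequences = [encode(t, vocab, max_len=8) for t in token_lists]
```

Fixing the sequence length keeps every article the same shape for batching; the choice of `max_len` trades coverage of long articles against memory and training time.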

The honest conclusion from this project is that the problem is harder than the benchmarks suggest. A model that performs well on labelled datasets built from known fact-checked sources doesn't necessarily generalise to novel misinformation: the lexical patterns associated with unreliable sources in 2019 are not identical to those in 2024. That limitation is important context for anyone considering deploying a classifier of this type in production. It is a useful signal, not a definitive oracle.
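One way to probe this kind of drift is to split the data by publication date instead of at random, so the model is scored on articles newer than anything it was trained on. The sketch below is an assumption about how such a check might look, not the evaluation used in this project: the record fields, the cutoff date, the toy records and the `keyword_predict` stand-in classifier are all invented for illustration.

```python
from datetime import date


def temporal_split(records, cutoff):
    """Return (earlier, later): records published before the cutoff, and the rest."""
    earlier = [r for r in records if r["published"] < cutoff]
    later = [r for r in records if r["published"] >= cutoff]
    return earlier, later


def accuracy(predict, records):
    """Fraction of records whose predicted label matches the true label."""
    if not records:
        raise ValueError("cannot score an empty set of records")
    correct = sum(predict(r["text"]) == r["label"] for r in records)
    return correct / len(records)


def keyword_predict(text):
    """Stand-in for a trained model: flags text containing one fixed cue word."""
    return 1 if "shocking" in text else 0


# Made-up toy records (label 1 = unreliable), not data from the project.
records = [
    {"text": "shocking cure doctors hate", "label": 1, "published": date(2019, 3, 1)},
    {"text": "council approves budget", "label": 0, "published": date(2019, 6, 1)},
    {"text": "leaked memo reveals hidden truth", "label": 1, "published": date(2024, 2, 1)},
    {"text": "central bank holds rates", "label": 0, "published": date(2024, 5, 1)},
]

earlier, later = temporal_split(records, cutoff=date(2022, 1, 1))
in_period = accuracy(keyword_predict, earlier)
out_of_period = accuracy(keyword_predict, later)
```

Comparing `in_period` with `out_of_period` gives a rough sense of how much a classifier depends on cues that were specific to the training period.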

Highlights

  • LSTM-based sequence model capturing long-range textual dependencies
  • Text pre-processing pipeline: tokenisation, stop-word removal, embedding
  • Binary credibility classification with confidence scoring
  • Evaluated against labelled fake news benchmark datasets
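
Binary classification with confidence scoring can be sketched as below, assuming the network ends in a single sigmoid unit that outputs the probability that an article is unreliable. That output convention, the `classify` function name, the label strings and the default threshold are assumptions for illustration, not details confirmed by the project.

```python
def classify(prob_unreliable, threshold=0.5):
    """Turn a sigmoid output into a (label, confidence) pair.

    Confidence is the probability assigned to the predicted class.
    """
    if not 0.0 <= prob_unreliable <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if prob_unreliable >= threshold:
        return "unreliable", prob_unreliable
    return "credible", 1.0 - prob_unreliable


label, confidence = classify(0.9)
```

Returning the confidence alongside the label lets a downstream system treat low-confidence predictions differently, for example by routing them to human review.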