Welcome to the second post in BrightPlanet’s three-part series, which follows completely unformatted, unstructured web pages through the three-step process that turns raw data into actionable intelligence.
- Stage 1 – Harvesting
- Stage 2 (this post) – Normalization / Enrichment
- Stage 3 – Reporting and Analytics
In our last post, we covered the first stage, harvesting, and described how BrightPlanet harvested over 100,000 news articles from the top 50 newspapers using the Deep Web Harvester. In this post, we’ll talk about the second stage, normalization.

