Understanding Bot Blockers and Web Harvesting the Smart Way

Bot blockers are tools that monitor your website traffic and look for patterns indicating that a robot is harvesting your site’s content. When a bot blocker detects a robot, it usually drops the request, so the robot concludes that the server is not there and stops making data requests.

Bot blocker algorithms work because robots, which are built to grab every page as quickly as possible, leave common patterns behind. At BrightPlanet, we often deal with bot blockers, but we have adopted a reactive position instead of a defensive position. In this post, you’ll learn how BrightPlanet harvests data and handles bot blockers.
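
As a rough illustration of the kind of pattern detection described above, the sketch below flags a client that sends more than a fixed number of requests within a short time window. It is a hypothetical example, not BrightPlanet's harvester or any particular bot blocker's algorithm; the class name `RateBasedBotDetector`, the thresholds, and the client identifiers are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class RateBasedBotDetector:
    """Hypothetical detector: flags clients that exceed a request rate."""

    def __init__(self, max_requests=10, window_seconds=1.0):
        # Assumed thresholds; a real bot blocker would tune these.
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Recent request timestamps, kept separately for each client.
        self._history = defaultdict(deque)

    def is_bot(self, client_id, timestamp):
        """Record one request and report whether the client looks automated."""
        recent = self._history[client_id]
        recent.append(timestamp)
        # Discard timestamps that have fallen out of the sliding window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_requests


detector = RateBasedBotDetector(max_requests=5, window_seconds=1.0)

# A robot grabbing pages every 50 ms versus a reader clicking every 2 s.
fast = [detector.is_bot("crawler", t * 0.05) for t in range(10)]
slow = [detector.is_bot("reader", t * 2.0) for t in range(10)]

print(fast)
print(slow)
```

A client that paces its requests, like the `reader` client here, never trips this particular check, while the rapid `crawler` client is flagged once it exceeds the assumed limit.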

Posted in Deep Web and Big Data

[Whitepaper] Open Source Intelligence (OSINT) for Fraud Prevention and Detection

Laws against fraud, money laundering, and counterfeiting are useless without effective online monitoring. Companies that suffer losses from fraud and counterfeit sales of their products need automated open source intelligence (OSINT) monitoring to target and take down the offenders. In this free whitepaper, Open Source Intelligence (OSINT) for Fraud Prevention and Detection, discover how to use OSINT to stop online fraud.

Posted in Financial Industry, White Papers and Publications

BrightPlanet Presents at OSINT Automation Product Training Seminar

On February 19, BrightPlanet will present at the premier OSINT Automation Product Training Seminar in Washington, DC. This one-day event, presented by ISS World and featuring a presentation by our own Tyson Johnson, will cover everything one needs to know about OSINT, or open source intelligence.

Posted in Intelligence Community, Law Enforcement

Preventing Automated Collection Through Captcha

CAPTCHA is the image, often containing scrambled words or letters, that appears when you fill out a Web form. The name stands for “Completely Automated Public Turing test to tell Computers and Humans Apart.” It prevents robots from automatically submitting Web forms without any human interaction.

BrightPlanet cannot submit forms or harvest content from sites that use CAPTCHA because our harvester is not an actual human. In this post, you’ll learn how CAPTCHA works and why our harvester and others do not support harvesting content from sites protected by CAPTCHA.

Posted in Deep Web and Big Data

BrightPlanet Launches Deep Web Data Feeds: Global News Data Feed is First Available Data Feed

BrightPlanet is excited to announce a new product line called Deep Web Data Feeds. Deep Web Data Feeds use a data-as-a-service (DaaS) model and give users access to unstructured data collected from the Internet that has been enriched and structured through BrightPlanet’s harvesting and enrichment process. The Global News Data Feed is the first data feed available.

Posted in Deep Web and Big Data

Using Big Data from the Deep Web for Financial Compliance, Fraud Detection & Fraud Prevention

In our last posting, we featured two case studies on how our customers are using Big Data from the Deep Web in the insurance and risk management industries. In this post, you’ll learn how customers in the financial services industry are using Web data to understand their exposure to risk and to develop better initiatives for institutional compliance related to ‘Know Your Customer’ (KYC).

Posted in Financial Industry

Why you should tap into the Deep Web in 2015

It’s no surprise to anyone that use of the Internet has continued to grow steadily. Over three billion people, or 42% of the global population, now have access to the Internet at home. These three billion people contribute to the content on the Internet by generating:

In this posting, we cover why you should tap into content on the Web in 2015. We also recap how users of Web data capitalized on harvested and enriched content from the largest known database in existence, the Internet, in 2014.

Posted in Deep Web and Big Data

Using Email Harvesting to Mitigate Risk and Detect Online Fraud  

We spend a lot of time talking about Web data harvesting and collection here on the Deep Web University and often forget to talk about email harvesting and how it can benefit users.

In today’s posting we uncover how we harvest email content and explore two case studies that illustrate how risk managers and security officers are using email harvesting to mitigate risk and detect fraud.

Posted in Case Studies, Deep Web and Big Data, Intelligence Community

BrightPlanet and Mahindra SSG Team Up to Bring New OSINT Solutions for Corporate Security and Risk Management to India

Sioux Falls, SD – December 16, 2014 – BrightPlanet, a leader in providing scalable Web harvesting technology, and Mahindra Special Services Group (MSSG), India’s leading corporate security risk consulting firm, have developed a technology partnership to offer open source intelligence (OSINT) solutions for risk management to customers seeking them in India.

The technology partnership will combine BrightPlanet’s patented ability to harvest and enrich data from the Web at scale with Mahindra’s expertise in the risk management sector. The partnership will bring a first-of-its-kind solution to the market.

Posted in Deep Web and Big Data, Intelligence Community

Common Deep Web and Big Data Questions Answered – Part 2

Welcome to the second post in our two-part series, which answers some of the questions we hear most often from visitors and customers. Last week, we posted Part 1, which focused on questions related to the Deep Web and how we get data from it. This post focuses on questions about Big Data and how we enrich and structure it from the Deep Web.

Posted in Deep Web and Big Data