This is Jamie Martin writing once again about a new data visualization I’ve been working on. If you didn’t see my basketball ranking data visualization a few weeks ago, you should check it out as well.
You’ve probably heard terms like search, index, mine, extract, and even harvest used to describe data collection. We use the term harvest and are frequently asked why.
At BrightPlanet, we answer a lot of frequently asked questions in our Deep Web University content. Some of our most-read blog posts answer these important questions.
Education continues to be an important part of what we do at BrightPlanet, so we decided to take some of our most-read blog posts and share the information in short videos.
As if taking second place in our office’s March Madness bracket pool wasn’t good enough, last Saturday I had another reason to celebrate.
This is Jamie Martin writing to you. I’m a Data Acquisition Engineer at BrightPlanet. My alma mater, Augustana University, made school history by winning its first ever NCAA Division II National Championship in men’s basketball. The men’s basketball team beat Lincoln Memorial University 90-81. As a proud Viking alum, it was a blast watching them rack up 34 wins across the season.
Plus, Augustana’s winning streak inspired me to put together this visualization of the NABC Division II Men’s Basketball polls, detailing the rankings from the start of the season back in October.
Our Global News Data Feed API just got better! We’re excited to announce two new features, available directly from the API.
- Users can identify how positively or negatively entities (people, places, or companies) are mentioned within each Web page. (Polarity)
- Users can get a score rating the importance of the entity within each Web page. (Salience)
BrightPlanet Global News Data Feed API users now have a whole new set of intelligence to include in their analysis.
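As a rough illustration of how the two scores might be used together, the sketch below keeps only the entities that matter to a page and labels how each is mentioned. The record layout, the field names (`entities`, `name`, `polarity`, `salience`), the score ranges, and the thresholds are all assumptions made for this example; they are not the actual response format of the Global News Data Feed API.

```python
# Hypothetical document record. Field names and score ranges are assumed
# for illustration and are NOT taken from the real API response.
document = {
    "url": "http://example.com/article",
    "entities": [
        {"name": "Acme Corp", "type": "company", "polarity": -0.6, "salience": 0.8},
        {"name": "Sioux Falls", "type": "place", "polarity": 0.1, "salience": 0.2},
        {"name": "Jane Doe", "type": "person", "polarity": 0.7, "salience": 0.5},
    ],
}


def key_entities(doc, min_salience=0.4):
    """Return entities at or above the salience cutoff, most salient first."""
    kept = [e for e in doc["entities"] if e["salience"] >= min_salience]
    return sorted(kept, key=lambda e: e["salience"], reverse=True)


def label(polarity, threshold=0.25):
    """Map a polarity score to a coarse label (assumes a -1 to 1 scale)."""
    if polarity >= threshold:
        return "positive"
    if polarity <= -threshold:
        return "negative"
    return "neutral"


summary = [(e["name"], label(e["polarity"])) for e in key_entities(document)]
print(summary)
```

Filtering on salience first keeps passing mentions from skewing a polarity summary; the cutoff values here are arbitrary and would need tuning against real data.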
Advances in hardware, cloud computing, and more efficient database and warehousing solutions have allowed for a new way of thinking in data analysis, particularly when it comes to data collection. The old way of thinking, which revolves around the concept of sample size, often described as ‘n’, is a thing of the past.
Instead of having to collect an appropriate sample of a dataset, why not go directly to a source and collect the dataset in its entirety? This new paradigm of using data has forced a large number of industries to focus on exploiting large amounts of unstructured external data, which becomes significantly more difficult to manage than a structured dataset. In this blog post, we explore the noisiness in data, how it happens, and why sometimes it’s best to embrace all of the data instead of some of it.
With over 320 million monthly active users, Twitter continues to be a rich and vast resource for data on the Web. Twitter provides direct access to Tweets and user data in a number of ways.
What can you actually collect from Twitter and how can it be used? In this post, we focus on uncovering the exact data that is stored in a user profile and Tweet and how it can be used for analysis.
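To give a feel for what a single Tweet record can yield, here is a minimal sketch that flattens one Tweet into a row for analysis. It assumes the JSON layout of the Tweet object returned by Twitter’s REST API v1.1 (`created_at`, `id_str`, `text`, `entities.hashtags`, and a nested `user` object); the sample values and the `flatten_tweet` helper are made up for illustration.

```python
from datetime import datetime

# Made-up sample record, shaped like a v1.1 Tweet object (assumed layout).
tweet = {
    "created_at": "Wed Mar 30 14:05:00 +0000 2016",
    "id_str": "1234567890",
    "text": "Example Tweet about #data harvesting",
    "entities": {"hashtags": [{"text": "data"}]},
    "user": {
        "screen_name": "example_user",
        "location": "Sioux Falls, SD",
        "followers_count": 42,
    },
}


def flatten_tweet(t):
    """Pull a few Tweet and user profile fields into one flat dictionary."""
    user = t.get("user", {})
    hashtags = t.get("entities", {}).get("hashtags", [])
    return {
        "tweet_id": t["id_str"],
        # Parse the timestamp string into a timezone-aware datetime.
        "posted_at": datetime.strptime(t["created_at"], "%a %b %d %H:%M:%S %z %Y"),
        "text": t["text"],
        "hashtags": [h["text"] for h in hashtags],
        "author": user.get("screen_name"),
        "author_location": user.get("location"),
        "author_followers": user.get("followers_count"),
    }


row = flatten_tweet(tweet)
print(row["author"], row["posted_at"].isoformat(), row["hashtags"])
```

Rows like this can be loaded into a spreadsheet or database to group Tweets by author, location, hashtag, or time of day.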
Is your company working with spreadsheets and trying to find a way to better manage them through visualization? Or maybe you already use Tableau for data visualization and graphing? Either way, we’d love to have you join us for the first Sioux Falls Tableau User Group (TUG) Meeting later this month.
Not using anonymization when accessing your competitor’s Web pages can turn you into competitive intelligence. In our last post, we covered how this happens.
The question we didn’t answer is how anonymizing systems work. In the following technical post, we’ll uncover how anonymizing systems, including BrightPlanet’s, keep you anonymous on the Internet.
Companies big and small seek competitive intelligence on the biggest threats to their business.
Competitive intelligence collection typically involves harvesting large amounts of unstructured data from competitors’ websites. But what most collectors of competitive intelligence don’t know is that by doing this, they themselves become competitive intelligence.
In today’s post, we uncover the importance of anonymization when collecting data.