At Datastreamer we index a lot of HTML. On an average day we index about 5TB of HTML content and write about 600GB of that to our Elasticsearch index. As part of indexing, we perform data augmentation, including language detection.
But how do we scale that and provide high-quality language classification without any central point of failure?
During a customer conversation today I mentioned that RSS was dead, which prompted an interesting discussion about why it died, and specifically the technical reasons behind its death.
I was actually one of the inventors of RSS and one of the co-authors of the RSS 1.0 spec. I started two companies around RSS aggregation. Saying it is dead doesn’t really give me much comfort, but at least we can learn from our mistakes.
Today we’re announcing the release of Datastreamer 6.5, which includes a number of new APIs and features that let our clients listen to, engage with, classify, and analyze social media content.
Earlier today we launched a major new release of Datastreamer. It has been in development for about a year, so it’s really great to get it over the fence and in front of customers.
Datastreamer 6.0 is here. In a nutshell, we have dramatically improved our pricing for customers needing smaller amounts of data, brought a new partner program online, doubled our hardware capacity, and implemented more features. Let’s take a look…
Datastreamer is a big data and social media analytics company that provides access to massive datasets of social media, blogs, forums, and other real-time and live content.
Datastreamer 5.0 has been in development for the last year and a half, and today we’re making it available to the public for the first time. This release incorporates new technology that we’ve developed based on customer feedback received over the last eight years. The latest version of Datastreamer enables a number of compelling new features.