Datastreamer 5.0 Released

Datastreamer 5.0 has been in development for the last year and a half, and today we’re making it available to the public for the first time. This release incorporates new technology that we’ve developed based on customer feedback received over the last eight years, and it enables a number of compelling new features.

For starters, we’re releasing full-text search support based on Elasticsearch and Kibana. It’s essentially a fully hosted social media monitoring and search API that indexes weblogs, mainstream news, and social media.
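To give a feel for what a full-text query could look like, here is a minimal sketch that builds a request body in the standard Elasticsearch query DSL. The field names (`body`, `published`, `source_type`) and the helper `build_search_body` are hypothetical and chosen only for illustration; the actual field names in the Datastreamer index may differ.

```python
import json


def build_search_body(text, source_types, since="now-7d/d", size=10):
    """Build an Elasticsearch query DSL body.

    Field names (body, published, source_type) are assumptions,
    not the documented Datastreamer schema.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                # Full-text match on the (hypothetical) post body field.
                "must": [{"match": {"body": text}}],
                # Non-scoring filters: restrict by source type and date.
                "filter": [
                    {"terms": {"source_type": source_types}},
                    {"range": {"published": {"gte": since}}},
                ],
            }
        },
        # Newest posts first.
        "sort": [{"published": {"order": "desc"}}],
    }


if __name__ == "__main__":
    body = build_search_body("electric vehicles", ["weblog", "news"])
    print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized to JSON and sent to a search endpoint, or adapted into a saved search in Kibana.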


Improved Firehose API

Additionally, we’ve expanded our core firehose API to include better filtering, much higher parallelism, and easier integration with third-party systems.

We've spent a great deal of time making the API easy to use right out of the box. We handle 95% of the heavy lifting for you and provide a client that works right away and scales massively. Essentially, all you have to do is download a client and write a small piece of code to import the data into your system.
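As a rough illustration of that "small piece of code", the sketch below imports a stream of documents into your own storage. It assumes, purely for the example, that the client hands you newline-delimited JSON; the function `import_stream` and the `store` callback are hypothetical names, not part of the Datastreamer client.

```python
import io
import json


def import_stream(lines, store):
    """Parse newline-delimited JSON and pass each document to `store`.

    `lines` is any iterable of strings (a file, a socket wrapper, etc.).
    `store` is your own callback that writes a document to your system.
    Returns a tuple of (imported, skipped) counts.
    """
    imported = 0
    skipped = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank keep-alive lines
        try:
            doc = json.loads(line)
        except json.JSONDecodeError:
            skipped += 1  # count malformed lines instead of crashing
            continue
        store(doc)
        imported += 1
    return imported, skipped


if __name__ == "__main__":
    # Stand-in for a real stream: two valid documents, one bad line.
    sample = io.StringIO('{"id": "1", "title": "a"}\n\nnot json\n{"id": "2"}\n')
    rows = []
    print(import_stream(sample, rows.append))
```

In a real integration, `store` would write to your database or message queue rather than appending to a list.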

New Indexing Technology

We completely rewrote major portions of our indexing technology. In addition to RSS/Atom and third-party APIs, we now support raw HTML5 and JSON. This lets us be pragmatic about content indexing, so we can index essentially any type of content on the web.

5.0 is incredibly good at indexing new types of content, whether or not they have APIs or feeds. If there's a data source you want to index, one that may be difficult (or too expensive) for existing providers, you should definitely talk to us.

All this is available on a platform that’s proven to scale north of 90M weblog posts and 45M mainstream news posts per month.

New Admin Console

We've redesigned the admin console for our clients. Integrated stats for our crawler are now front and center.


JSON, JSON, and more JSON

We reworked our document object model, and now use a new document format based on JSON. This allows us to easily extend our API as the web changes and we index new types of content.

It also makes schema changes easy as the web and its metadata schemas evolve, which helps when we provision new types of data with new fields that are valuable to our customers.
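On the consuming side, one way to stay compatible with an evolving JSON schema is to read the fields you know about and keep everything else in a catch-all map. The sketch below shows that pattern; the `Document` class, its fields (`id`, `title`), and the sample `sentiment` field are hypothetical and not the actual Datastreamer document format.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Fields this consumer understands today (hypothetical names).
KNOWN_FIELDS = {"id", "title"}


@dataclass
class Document:
    id: str
    title: Optional[str] = None
    # Any field added to the schema later lands here untouched.
    extra: Dict[str, Any] = field(default_factory=dict)


def parse_document(raw: str) -> Document:
    """Parse one JSON document, preserving unknown fields in `extra`."""
    data = json.loads(raw)
    extra = {k: v for k, v in data.items() if k not in KNOWN_FIELDS}
    return Document(id=data["id"], title=data.get("title"), extra=extra)


if __name__ == "__main__":
    doc = parse_document('{"id": "42", "title": "Hi", "sentiment": 0.8}')
    print(doc.extra)
```

With this approach, a new field in the feed does not break existing code, and it can be promoted to a named attribute once you decide to use it.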

Let's get started

We'd love your feedback, and we can set you up with an evaluation when you're ready.