Social media search and analytics

Full-text search over 10B posts from weblogs, forums, mainstream news, comments, and social media through a simple, easy-to-use API that includes analytics, NLP, sentiment analysis, multi-language support, and gender detection.

Search over 60 days of posts on ultra-fast SSD with query times less than 500 milliseconds.

Hosted in the cloud

Don't spend tens of thousands of dollars per month in hardware and engineering costs to host your own search or social media monitoring platform. We already have a huge cluster deployed that works now!

ALL social media in one API

We index and store all Datastreamer content in one full-text search API. This lets you work with all social media, including weblogs, microblogs, mainstream news, comments, and forums, within one source. It removes the need to integrate multiple vendors' APIs and avoids the complexity and cost of multiple content sources.

Based on Elasticsearch

Our full-text search API is based on Elasticsearch. This means you get the full power of the Elasticsearch API and Apache Lucene including filters, aggregations, complex boolean logic, etc.

This also includes a hosted Kibana install, which provides a UI for aggregations, visualizations, and analytics.

Simple JSON over HTTP API

Our API is exposed as JSON over HTTP - the easiest API known to man! Requests can specify more advanced features, including aggregations and filters. Additionally, the results are just JSON, so parsing the documents and integrating them into your workflow should be straightforward.

Example search

Here's an example request searching for Ebola within mainstream news. Notice that the query uses a simple domain-specific language for fielded and boolean search.

source_publisher_type:MAINSTREAM_NEWS AND Ebola
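
Since the API is based on Elasticsearch, one plausible way to send this query is to wrap it in a standard Elasticsearch query_string query. The sketch below only builds the JSON request body; the helper name build_search_body and the default result size are assumptions for illustration, not part of the documented API.

```python
import json


def build_search_body(query: str, size: int = 10) -> dict:
    """Wrap a fielded/boolean query string in an Elasticsearch query_string query.

    The helper name and the default size are illustrative assumptions.
    """
    return {
        "size": size,
        "query": {"query_string": {"query": query}},
    }


# The example query from the text above.
body = build_search_body("source_publisher_type:MAINSTREAM_NEWS AND Ebola")
payload = json.dumps(body)
print(payload)
```

The resulting payload would typically be sent as the body of an HTTP request to a _search endpoint; the host, index name, and authentication depend on your account and are not shown here.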

Aggregations and analytics

We support integrated aggregations and analytics. Searches can return the top 100 documents and you can also group by top tags, by language, bucketed into time windows, etc.

Example search for 'Scalia' after his passing in February 2016.
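
As a rough sketch of what such a request might look like, the body below asks for the top 100 documents matching 'Scalia' and groups the matches by tag, by language, and by day using standard Elasticsearch aggregations. The field names tags, lang, and published are hypothetical placeholders; substitute the field names from the actual document schema.

```python
import json


def build_aggregation_body(query: str) -> dict:
    """Build a search body that returns top documents plus three aggregations.

    Field names ("tags", "lang", "published") are assumed for illustration only.
    """
    return {
        "size": 100,  # top 100 documents, as described in the text
        "query": {"query_string": {"query": query}},
        "aggs": {
            # Most frequent tags among matching documents.
            "top_tags": {"terms": {"field": "tags", "size": 10}},
            # Document counts per language.
            "by_language": {"terms": {"field": "lang", "size": 10}},
            # Document counts bucketed into one-day windows.
            # Elasticsearch versions before 7.2 use "interval" instead of
            # "calendar_interval".
            "per_day": {
                "date_histogram": {"field": "published", "calendar_interval": "day"}
            },
        },
    }


agg_body = build_aggregation_body("Scalia")
print(json.dumps(agg_body, indent=2))
```

Each named aggregation comes back under the same name in the aggregations section of the JSON response, next to the matching documents.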

Long-term archives

We maintain 30 days of content on hyper-fast SSDs, supplemented with infinite historical archives, currently at 8 months. Queries on Datastreamer execute in less than half a second. Our infinite archives on HDD are efficiently maintained for longer retroactive reporting.

Hive / SQL queries

Need to perform bulk data analysis on large amounts of content? We have 9 months of content in our Apache Spark and Hive cluster giving you access to a raw bulk data and compute platform for performing large batch computations.

Indexed in real time

We index content in real time, with documents waiting less than 30 seconds to make it into our full-text search index.

Search by inbound links

You can search for all inbound links to a specific URL. Find all posts from a certain site or all posts from a specific domain. You can also filter the links by other fields including gender, sentiment, etc.
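
One way an inbound-link search with extra filters could be expressed is as an Elasticsearch bool query whose filter clauses are exact-match term queries. This is a minimal sketch under stated assumptions: the field names links, author_gender, and sentiment, and their values, are hypothetical and may not match the real schema.

```python
import json


def build_inbound_link_body(url, filters=None, size=10):
    """Build a search body for posts that link to `url`.

    "links" is an assumed field name holding a post's outbound URLs.
    `filters` maps additional (assumed) field names to exact values.
    """
    # Every clause in a bool "filter" list must match for a post to be returned.
    clauses = [{"term": {"links": url}}]
    for field, value in (filters or {}).items():
        clauses.append({"term": {field: value}})
    return {"size": size, "query": {"bool": {"filter": clauses}}}


link_body = build_inbound_link_body(
    "http://example.com/story",
    filters={"author_gender": "FEMALE", "sentiment": "POSITIVE"},
)
print(json.dumps(link_body, indent=2))
```

Filter clauses do not affect relevance scoring, which suits exact matches on values such as a URL, a gender, or a sentiment label.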

