We have been using diffbots for about two years at my current job, and I haven't had any issues with it aside from some minor lag in page load times when a lot was going on or when loading multiple pages within one session.
It does what we need without much attention from us; all we really have to do is set up our content sources and point them at our diffbots instance (which is hosted on AWS). The only other thing I can think of would be more advanced features like custom fields, but it looks like those may not work well with S3/SES, so we stuck with the standard options. Our main use case for this product has been tracking social media mentions and trends related to our brand and projects.