Firstly, I am a huge fan of Solr. It is easy to test, blisteringly fast, and incredibly stable. I would recommend it as a solution to many search problems in a heartbeat. However, despite the recent efforts of SolrCloud, I have found that, as a scalable cloud solution, Solr is still in its infancy.
Thanks to a useful blog article I was able to try out some multi-core examples (note: if you follow the blog example, beware of capitalisation issues with attribute names, e.g. "instancedir" should be camel case: "instanceDir"). My major issue is that the whole approach feels too clunky, particularly for a solution such as ours, which requires multi-tenancy and therefore a large number of cores. At the time of writing, SolrCloud doesn't seem to have a nice way of discovering other nodes in a cluster. It all works, but it isn't slick.
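To illustrate the capitalisation point and the one-core-per-tenant setup, here is a minimal sketch that generates a legacy-style multi-core solr.xml. The tenant names and the choice of using the tenant name as both core name and directory are my own assumptions for illustration; the layout follows the old `<solr><cores><core/></cores></solr>` format and may differ in your Solr version.

```python
import xml.etree.ElementTree as ET


def build_solr_xml(tenants):
    """Return a legacy-style solr.xml string with one core per tenant."""
    solr = ET.Element("solr", {"persistent": "true"})
    cores = ET.SubElement(solr, "cores", {"adminPath": "/admin/cores"})
    for tenant in tenants:
        # Attribute names are case-sensitive: "instanceDir", not "instancedir".
        # Using the tenant name as the directory is an assumption of this sketch.
        ET.SubElement(cores, "core", {"name": tenant, "instanceDir": tenant})
    return ET.tostring(solr, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical tenant names, purely for illustration.
    print(build_solr_xml(["tenant_a", "tenant_b"]))
```

Even with a generator like this, every new tenant means another core entry and another directory to manage, which is where the clunkiness comes from.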
ElasticSearch, on the other hand, has been designed for the cloud. James Cook has written a brilliant tutorial on running ElasticSearch on EC2. In just a few hours I had created some test EC2 instances running ElasticSearch.
Discovery of other ElasticSearch instances was trivial. The cloud-aws plugin allows for several options, e.g. filtering by security groups or tags. When I stopped the "master" instance, the "slave" instance noticed and took over as master. Asynchronous backup to S3 just worked.
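As a rough sketch of what the discovery options look like, the snippet below builds the flat settings you might put in elasticsearch.yml. The setting names (`discovery.type`, `discovery.ec2.groups`, `discovery.ec2.tag.*`) are as I recall them from the cloud-aws plugin documentation, so please check them against your plugin version; the group and tag values are hypothetical, and AWS credentials are deliberately left out.

```python
def ec2_discovery_settings(security_groups=None, tags=None):
    """Build flat elasticsearch.yml-style settings for EC2 discovery."""
    settings = {"discovery.type": "ec2"}
    if security_groups:
        # Only consider instances in these security groups.
        settings["discovery.ec2.groups"] = ",".join(security_groups)
    for key, value in (tags or {}).items():
        # Only consider instances carrying this tag key/value pair.
        settings["discovery.ec2.tag.%s" % key] = value
    return settings


def render(settings):
    """Render the settings as 'key: value' lines, sorted for stable output."""
    return "\n".join("%s: %s" % (k, v) for k, v in sorted(settings.items()))


if __name__ == "__main__":
    # Hypothetical security group and tag, purely for illustration.
    print(render(ec2_discovery_settings(["search-nodes"], {"stage": "test"})))
```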
Multi-tenancy has been made really easy too: if you post to an index that doesn't exist, it gets created. It was trivial to get going and, so far, very impressive.
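To show how little is needed for a new tenant, here is a minimal sketch that indexes a document into a per-tenant index over the REST API. The host, the index-per-tenant naming, the `doc` type and the document itself are all assumptions for illustration, and the `/index/type/id` URL shape is the one I used at the time of writing, so newer versions may differ.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_type, doc_id, document):
    """Build (but do not send) a PUT request that indexes one document."""
    url = "%s/%s/%s/%s" % (base_url.rstrip("/"), index, doc_type, doc_id)
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def index_document(request):
    """Send the request; needs a reachable ElasticSearch node."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical tenant index; with default settings it does not
    # need to exist before the first document is sent.
    req = build_index_request(
        "http://localhost:9200", "tenant_a", "doc", "1", {"title": "hello"}
    )
    print(req.get_method(), req.full_url)
```

Calling `index_document(req)` against a running node should be all it takes to bring the tenant's index into existence; there is no separate "create core" step as there is with Solr.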