Elasticsearch is the leading technology for the capability we are looking for, and that is the main reason for this choice.
Some of the prominent reasons for choosing Elasticsearch to provide this capability in our Data Lake implementation are:
- Compatibility with Hadoop (as this is our persistent store)
- Distributed
- Scalable
- Capability of indexing data
- Highly performant (fast query and search)
- Battle-hardened technology (enterprise-grade, with all the capabilities an enterprise requires)
- Capability of handling a huge volume and variety of data
- Failover and data redundancy capability
Compared to other technologies in the Big Data arena, Elasticsearch is relatively recent and has a short history. This doesn't mean that the technology is immature; rather, it is a mature (70+ million product downloads) and well-adopted technology backed by a vibrant community (70,000+ community members).
In...