December 19, 2014
Migrating backend search technologies on a high-throughput production site is no easy task, but Vector Media Group recently took it on. With a popular client site struggling under the load of complex MySQL full-text search queries, they switched to Elasticsearch.
I spoke with Matt Weinberg to learn how the migration went. Was the switch to Elasticsearch worth the effort?
We created a custom search using MySQL queries and integrated it into ExpressionEngine, the CMS we used for the project.
To support full-text search, we needed to use the MySQL MyISAM storage engine. This has major downsides, the primary one being full table locks: when a table is updated, no other changes to that table can be performed.
Our tables have considerable update activity, so this sometimes caused significant performance issues.
We ended up doing this. It was a fairly simple step and allowed us to switch to the InnoDB engine on the master, eliminating the table lock issues.
This bought us some time, but it wasn't a long-term solution: we were basically rolling our own search, and this frequently involved complex queries that third-party search libraries could perform more efficiently. We ended up with massive queries composed of many JOINs plus AND/OR conditions, and these aren't easy to maintain.
Besides query complexity, it's tough to beat the performance of a dedicated search solution.
We also considered Solr and Amazon CloudSearch. However, we started with Elasticsearch because we had heard really great things about it. Elasticsearch met all of our needs in our early experiments, so we decided to continue with it.
We started with a list of common search terms and result expectations and ensured the results looked solid to our team and to our client. Elasticsearch shined: in many cases, the results returned from Elasticsearch were more relevant than those from MySQL.
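A check like the one described above can be scripted. The following is a minimal sketch, not Vector's actual tooling: `check_expectations`, `fake_search`, the document IDs, and the search terms are all hypothetical, and `fake_search` stands in for a call to a real search backend.

```python
from typing import Callable, Dict, List


def check_expectations(
    search: Callable[[str], List[str]],
    expectations: Dict[str, List[str]],
    top_k: int = 5,
) -> Dict[str, List[str]]:
    """Return, per search term, the expected IDs missing from the top_k results."""
    failures: Dict[str, List[str]] = {}
    for term, expected_ids in expectations.items():
        top_results = search(term)[:top_k]
        missing = [doc_id for doc_id in expected_ids if doc_id not in top_results]
        if missing:
            failures[term] = missing
    return failures


# Hypothetical ranked results standing in for a real search backend.
FAKE_RESULTS = {
    "red shoes": ["p-12", "p-7", "p-31"],
    "winter coat": ["p-44", "p-2"],
}


def fake_search(term: str) -> List[str]:
    """Return ranked document IDs for a term (stub for illustration only)."""
    return FAKE_RESULTS.get(term, [])


# Common search terms mapped to the documents the team expects to see.
expectations = {
    "red shoes": ["p-7"],
    "winter coat": ["p-2", "p-90"],
}

failures = check_expectations(fake_search, expectations)
print(failures)
```

Running the same expectations against both the old and the new backend gives a rough, repeatable way to compare result quality before switching traffic.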
We rolled over to Elasticsearch in three stages:
About 2-3 months, including investigation time, tool building, and implementation.
Definitely. Facets/aggregations are much, much faster now than with the MySQL approach we used before.
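For readers unfamiliar with aggregations, here is a rough sketch of what a faceted request body can look like in the Elasticsearch query DSL. It builds a `terms` aggregation per facet field. The helper name `build_facet_request` and the field names (`title`, `category`, `author`) are assumptions for illustration, not details of the project's index.

```python
import json
from typing import Any, Dict, List


def build_facet_request(
    text_field: str,
    query_text: str,
    facet_fields: List[str],
    bucket_count: int = 10,
) -> Dict[str, Any]:
    """Build a search body that returns only facet counts for a text query."""
    aggs = {
        # One "terms" aggregation per facet: counts documents per distinct value.
        f"by_{field}": {"terms": {"field": field, "size": bucket_count}}
        for field in facet_fields
    }
    return {
        "size": 0,  # skip the hits themselves; only the aggregation buckets are needed
        "query": {"match": {text_field: query_text}},
        "aggs": aggs,
    }


# Hypothetical field names, for illustration only.
body = build_facet_request("title", "elasticsearch", ["category", "author"])
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent to an index's `_search` endpoint. The equivalent in SQL typically needs one `GROUP BY` query per facet, which is part of why the dedicated engine tends to be faster here.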
Our MySQL master and slave are each 8-core machines with 52GB of memory. We have a single Elasticsearch box, a 4-core server with 15GB of memory, and search performance is much better than with standard MySQL queries. While we've continued to upgrade our MySQL hardware as the app's usage has grown, we haven't needed to bump up our Elasticsearch resources.
We'll almost certainly scale vertically for as long as we can before adding new nodes. While Elasticsearch makes adding new nodes easy, we prefer the ease of administration, monitoring, and deployment (along with a lower surface area for issues) that having fewer, larger nodes affords.
The search query syntax/API is really easy to work with and the client library support is great. It's much better than writing massive MySQL SELECT queries.
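To give a sense of the query syntax, the sketch below shows how AND/OR-style conditions can map onto an Elasticsearch `bool` query: `must` clauses are all required, `should` clauses are alternatives, and `must_not` clauses exclude matches. The helper `build_bool_query` and the field names are hypothetical and not taken from the project.

```python
import json
from typing import Any, Dict, Optional


def build_bool_query(
    required: Dict[str, str],
    optional: Optional[Dict[str, str]] = None,
    excluded: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Combine per-field match clauses into a single bool query body."""
    bool_clause: Dict[str, Any] = {
        # Every "must" clause has to match (the AND part).
        "must": [{"match": {field: text}} for field, text in required.items()],
    }
    if optional:
        # At least one "should" clause has to match (the OR part).
        bool_clause["should"] = [
            {"match": {field: text}} for field, text in optional.items()
        ]
        bool_clause["minimum_should_match"] = 1
    if excluded:
        # Documents matching any "must_not" clause are dropped.
        bool_clause["must_not"] = [
            {"match": {field: text}} for field, text in excluded.items()
        ]
    return {"query": {"bool": bool_clause}}


# Hypothetical fields, for illustration only.
query = build_bool_query(
    required={"title": "search migration"},
    optional={"tags": "mysql", "body": "elasticsearch"},
    excluded={"status": "draft"},
)
print(json.dumps(query, indent=2))
```

Because the query is a nested data structure and not a long SQL string, conditions can be added or removed programmatically without string concatenation.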
Scout provides three plugins for monitoring Elasticsearch:
Elasticsearch Cluster Status tracks key cluster health metrics and alerts when the status changes.
Elasticsearch Index Status reports key metrics on an Elasticsearch index.
Elasticsearch Node Status reports stats on a specific node in an Elasticsearch cluster.
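As a rough illustration of the kind of check a cluster status monitor performs, the sketch below compares two responses from Elasticsearch's `_cluster/health` endpoint and reports a status change. This is not how the Scout plugins are implemented. The function names, the `http://localhost:9200` address, and the sample values are assumptions. The response fields used (`status`, `cluster_name`, `unassigned_shards`) are part of the cluster health API.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

# Cluster health is reported as green, yellow, or red; higher number = worse.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def fetch_cluster_health(base_url: str = "http://localhost:9200") -> Dict[str, Any]:
    """Fetch cluster health from a node (the address is an assumption)."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health", timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def describe_status_change(
    previous: Dict[str, Any], current: Dict[str, Any]
) -> Optional[str]:
    """Return an alert message if the status changed between two polls, else None."""
    old, new = previous["status"], current["status"]
    if old == new:
        return None
    direction = "degraded" if SEVERITY[new] > SEVERITY[old] else "recovered"
    return (
        f"Cluster {current['cluster_name']} {direction}: {old} -> {new} "
        f"({current['unassigned_shards']} unassigned shards)"
    )


# Sample responses with made-up values, so the sketch runs without a cluster.
earlier = {"cluster_name": "search", "status": "green", "unassigned_shards": 0}
later = {"cluster_name": "search", "status": "yellow", "unassigned_shards": 5}

message = describe_status_change(earlier, later)
print(message)
```

In practice, `fetch_cluster_health` would be polled on a schedule and each response compared with the previous one.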
Vector Media Group is a 24-person interactive agency based in Manhattan. They specialize in web and mobile development, design, and online marketing. Their clients range from large Fortune 100 companies to small startups and everything in between. They are well-known for a variety of work, including their significant ExpressionEngine experience.