Cache Architecture - Part II - Membase Servers - Implementation and Configuration
In Cache Architecture - Part I - Choosing a Provider for The Knot, Inc. we discussed how The Knot, Inc. chose a cache provider and our history with memcached. We also discussed the reasons for choosing Membase and what Membase does for us at The Knot.
In Part II, we will discuss how we actually set up the Membase instances, some of the decisions we made when creating a cluster, and our installation scripting.
Hardware – Bare Metal vs. VM
Our first inclination was to place our cache infrastructure on VMs in our VMware ESX cluster. After talking it over, we decided to go with bare metal boxes for the following reasons:
· Additional overhead when reading from and writing to RAM
· Additional cost for ESX-level RAM and hardware
· Additional cost for ESX licensing on top of the Membase and server OS licenses
After deciding to go with bare metal boxes, we investigated hardware options. Originally we planned to order HP DL360 G7 servers with Intel Xeon X5650 processors and 6 hot-swap HD bays. What we realized was that the Intel-based DL360 required DDR3 RAM configured in groups of 3, which means buying RAM in groups of 3. In addition, the max RAM per server was 192GB. Not that we think we will ever reach this level, but we wanted servers with the highest possible RAM ceiling. The final configuration we went with was 3 bare metal boxes with the following configuration:
· HP DL165
· AMD Opteron 6172
· 1U Chassis
· 2x300GB HP SAS 10K HD
· 4x16GB HP PC3-8500R-7 (64GB total)
We also had 3 relatively new boxes lying around in a different rack, so we rebuilt those as well for a total of 6 cache boxes in a 5+1 configuration. If one of the 5 boxes were to fail, we could bring the 6th box up in just under 17 minutes (we timed it).
Each DL165 has 16 RAM slots; with 16GB sticks, that puts us at a maximum of 256GB per server. Provided Membase and the OS can address that much RAM, we would have a whopping 1TB of cache.
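The per-server and cluster figures are plain multiplication, and a short sketch makes the numbers easy to recheck. The slot count and stick size come from the text; the node counts in the loop are assumptions, since the post does not say how many fully populated servers the 1TB figure refers to (four such servers would give exactly 1TB).

```python
# Capacity arithmetic for the DL165 boxes described above.
SLOTS_PER_SERVER = 16  # RAM slots per DL165 (from the text)
STICK_GB = 16          # stick size considered in the text


def max_ram_gb(slots: int = SLOTS_PER_SERVER, stick_gb: int = STICK_GB) -> int:
    """Maximum RAM per server when every slot holds one stick."""
    return slots * stick_gb


def cluster_ram_gb(nodes: int) -> int:
    """Total RAM across `nodes` fully populated servers."""
    return nodes * max_ram_gb()


if __name__ == "__main__":
    print("per server:", max_ram_gb(), "GB")
    # Node counts below are illustrative assumptions, not from the post.
    for nodes in (3, 4, 5):
        total = cluster_ram_gb(nodes)
        print(nodes, "nodes:", total, "GB =", total / 1024, "TB")
```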
Windows vs. Linux
When we were deciding which cache solution to use and had almost settled on standalone memcached, we anticipated having to build out the cache cluster in a Linux-based environment. Since our shop is mostly .NET, our depth of expertise is in managing and administering Windows Server environments. Once we understood that Membase releases Linux and Windows flavors together, it was an easy choice for us to go with our expertise and use Windows servers.
Memcached vs. Membase Bucket Type
Although NoSQL and document databases are quickly gaining ground and reputation, our practice is built around RDBMS, specifically SQL Server. We have backups, reporting, monitoring, and configuration management for our SQL Server environment. What we needed was not a persistent data store but a transient cache. We wanted the cache to provide fast, fault-tolerant access to READ data from within our app server environments, but use a more traditional ACID structure for our data environments. That is why we created our buckets with the memcached bucket type rather than the persistent Membase type.
Environments and Buckets
We divided our apps into 10 buckets. Each bucket corresponds (generally) to one of the major platform apps within the company:
· Commerce – Online Store
· Community – Forums, Blogs, Public Profiles, Friends
· Editorial – Real Weddings and Content Slideshows
· Enterprise – Internal Sales and Account applications
· Galleries – User Photo Galleries
· Local – Wedding Vendor Directories
· Membership – Shared cache for user information
· Registry – Universal Registry system
· Tools – Member Tools (Guest List Manager, Wedding Checklist, Baby Name finder)
· Websites – Personal wedding and baby websites
We have the traditional 4 environments in our application flow:
· Development (1 Membase Server VM)
· QA (1 Membase Server VM)
· Staging (2 Membase Server VMs, clustered)
· Production (5 Membase Servers Bare Metal, clustered)
Our full configuration table looks like this (in MB):
Per Server Total/ Cluster Total
32GB vs. 64GB RAM Quota
On the last day before production setup, we realized that the old boxes only had 32GB of RAM while the new boxes had 64GB. Membase requires every Membase server instance in a cluster to have the same quota size. So we had to decide quickly: did we want to set up 3 boxes at 64GB and wait until we could add 32GB to the older boxes and bring them into the new cluster, or did we want to set up 5 nodes at 32GB and upgrade all 5 at the same time later on?
After a lengthy email chain, we went with the 32GB servers. We decided that, given how little time it takes to set up a Membase server (more on that later), we could jump to the 64GB quota very quickly when/if we needed to. It was better not to buy the extra RAM until we had to. Also, our configuration needs did not require the extra RAM.
In addition, we can now expand the cache horizontally as we retire our app servers, since we usually build those with 32GB.
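As a rough check on the claim that our configuration did not need the extra RAM, the bucket sizes from the creation script later in this post can be summed and compared against a 32GB box. This is a minimal sketch under two assumptions: that `--bucket-ramsize` is a per-server figure (which matches the "Per Server Total / Cluster Total" headings above), and that the full physical RAM is the upper bound (in practice the OS needs some headroom).

```python
# Per-server bucket sizes in MB, copied from the bucket-create script in this post.
BUCKET_RAM_MB = {
    "Commerce": 2048, "Editorial": 2048, "Membership": 4096, "Local": 2048,
    "Enterprise": 2048, "Community": 2048, "Galleries": 2048, "Websites": 4096,
    "Registry": 2048, "Tools": 2048,
}


def per_server_total_mb(buckets=BUCKET_RAM_MB) -> int:
    """RAM the buckets claim on each node (assumes ramsize is per server)."""
    return sum(buckets.values())


def fits_in_server(server_ram_gb: int, buckets=BUCKET_RAM_MB) -> bool:
    """True if the bucket total does not exceed the server's physical RAM."""
    return per_server_total_mb(buckets) <= server_ram_gb * 1024


def cluster_total_mb(nodes: int, buckets=BUCKET_RAM_MB) -> int:
    """Cache capacity across the cluster for a given node count."""
    return per_server_total_mb(buckets) * nodes


if __name__ == "__main__":
    print("per server MB:", per_server_total_mb())
    print("fits in 32GB:", fits_in_server(32))
    print("5-node cluster MB:", cluster_total_mb(5))
```

Under those assumptions the buckets total 24GB per server, which leaves room on a 32GB node.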
Once the hardware arrived, our crack Systems Engineering team got the Win2k8 R2 Images up in less than 1 day for all 6 servers. From the time we got into the Windows RDP to the time Membase was ready for production was … wait for it … 20 minutes. And I have the tweets to prove it.
I do admit that we “cheated” a little by scripting out the bucket creation using the CLI. Here is our script:
membase bucket-create -c 127.0.0.1:8091 --bucket=Commerce --bucket-type=memcached --bucket-port=11111 --bucket-ramsize=2048 -u Administrator -p XXXXXXX
membase bucket-create -c 127.0.0.1:8091 --bucket=Editorial --bucket-type=memcached --bucket-port=11112 --bucket-ramsize=2048 -u Administrator -p XXXXXXX
membase bucket-create -c 127.0.0.1:8091 --bucket=Membership --bucket-type=memcached --bucket-port=11113 --bucket-ramsize=4096 -u Administrator -p XXXXXXX
membase bucket-create -c 127.0.0.1:8091 --bucket=Local --bucket-type=memcached --bucket-port=11114 --bucket-ramsize=2048 -u Administrator -p XXXXXXX
membase bucket-create -c 127.0.0.1:8091 --bucket=Enterprise --bucket-type=memcached --bucket-port=11115 --bucket-ramsize=2048 -u Administrator -p XXXXXXX
membase bucket-create -c 127.0.0.1:8091 --bucket=Community --bucket-type=memcached --bucket-port=11116 --bucket-ramsize=2048 -u Administrator -p XXXXXXX
membase bucket-create -c 127.0.0.1:8091 --bucket=Galleries --bucket-type=memcached --bucket-port=11117 --bucket-ramsize=2048 -u Administrator -p XXXXXXX
membase bucket-create -c 127.0.0.1:8091 --bucket=Websites --bucket-type=memcached --bucket-port=11118 --bucket-ramsize=4096 -u Administrator -p XXXXXXX
membase bucket-create -c 127.0.0.1:8091 --bucket=Registry --bucket-type=memcached --bucket-port=11119 --bucket-ramsize=2048 -u Administrator -p XXXXXXX
membase bucket-create -c 127.0.0.1:8091 --bucket=Tools --bucket-type=memcached --bucket-port=11110 --bucket-ramsize=2048 -u Administrator -p XXXXXXX
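Because the ten commands differ only in bucket name, port, and RAM size, they can also be generated from a small table. The sketch below is not the script we ran; it only rebuilds the same command lines, using exactly the flags shown above. The `MEMBASE_PASSWORD` environment variable and the helper names are hypothetical, introduced for illustration.

```python
import os

# (bucket name, dedicated port, per-server RAM in MB), copied from the script above.
BUCKETS = [
    ("Commerce", 11111, 2048),
    ("Editorial", 11112, 2048),
    ("Membership", 11113, 4096),
    ("Local", 11114, 2048),
    ("Enterprise", 11115, 2048),
    ("Community", 11116, 2048),
    ("Galleries", 11117, 2048),
    ("Websites", 11118, 4096),
    ("Registry", 11119, 2048),
    ("Tools", 11110, 2048),
]


def bucket_create_args(name, port, ram_mb, user, password, cluster="127.0.0.1:8091"):
    """Argument list for one bucket-create call, mirroring the flags used above."""
    return [
        "membase", "bucket-create", "-c", cluster,
        f"--bucket={name}", "--bucket-type=memcached",
        f"--bucket-port={port}", f"--bucket-ramsize={ram_mb}",
        "-u", user, "-p", password,
    ]


def build_commands(buckets, user, password):
    """Build all commands, refusing a table in which two buckets share a port."""
    ports = [port for _, port, _ in buckets]
    if len(set(ports)) != len(ports):
        raise ValueError("each bucket needs its own port")
    return [bucket_create_args(n, p, r, user, password) for n, p, r in buckets]


if __name__ == "__main__":
    # MEMBASE_PASSWORD is a hypothetical variable name; it keeps the password out of the file.
    password = os.environ.get("MEMBASE_PASSWORD", "XXXXXXX")
    for args in build_commands(BUCKETS, "Administrator", password):
        print(" ".join(args))
```

Each list could be passed to `subprocess.run` instead of printed; printing first makes it easy to review the commands before running them.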
For the initial setup, we are monitoring the Membase console on a daily basis, and we set up:
· Nagios alerts for CPU, RAM Spikes
· Nagios memcached alerts using the Nagios plugin:
Although the ability to flush a cache directly from the Membase console is on the Membase roadmap, that feature is not implemented yet. Membase suggests using the standard flush command from the memcached interface, so we installed phpMemCacheAdmin to provide basic memcached stats as well as the flush command from a GUI.
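For readers who want to see what a flush over the memcached interface looks like without a GUI, here is a minimal sketch that sends the memcached text-protocol `flush_all` command over a plain socket. It assumes the bucket's dedicated port accepts the text protocol and that the flush is scoped to that bucket; the function name is hypothetical, and this is not what phpMemCacheAdmin does internally.

```python
import socket


def flush_bucket(host: str, port: int, timeout: float = 5.0) -> bool:
    """Send `flush_all` to a memcached-compatible port; True if the reply is OK."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"flush_all\r\n")
        reply = b""
        # Read until the line terminator, or until the server closes the connection.
        while not reply.endswith(b"\r\n"):
            chunk = sock.recv(1024)
            if not chunk:
                break
            reply += chunk
    return reply.strip() == b"OK"


# Example (assumes a bucket is listening on this port):
#   flush_bucket("127.0.0.1", 11111)
```

A flush discards everything in the bucket, so a call like this belongs behind the same access controls as the admin console.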
I wish that were it…
Standing up the Membase servers was actually the easy part. Because Gear6 uses IP addresses to separate buckets and Membase uses ports, we still had to update all our configuration files. In Part III, I will discuss:
· Application Stack
· Memcached Clients (yes, more than 1)
· Cache Migration Process
· Future Plans
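To make the IP-versus-port point concrete: with Membase, every bucket lives on the same set of nodes and is selected by its port, so a client's server list for a bucket is just the node list paired with that bucket's port. The sketch below illustrates that mapping using the ports from the creation script; the node names are hypothetical, since this post does not list the real ones, and the actual client configuration is a topic for Part III.

```python
# Ports copied from the bucket-create script in this post.
BUCKET_PORTS = {
    "Commerce": 11111, "Editorial": 11112, "Membership": 11113, "Local": 11114,
    "Enterprise": 11115, "Community": 11116, "Galleries": 11117, "Websites": 11118,
    "Registry": 11119, "Tools": 11110,
}

# Hypothetical node names for the 5 production servers.
NODES = ["cache01", "cache02", "cache03", "cache04", "cache05"]


def endpoints(bucket: str, nodes=NODES) -> list:
    """host:port entries a memcached client would use for one bucket."""
    port = BUCKET_PORTS[bucket]
    return [f"{node}:{port}" for node in nodes]


if __name__ == "__main__":
    for bucket in sorted(BUCKET_PORTS):
        print(bucket, ", ".join(endpoints(bucket)))
```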