Object stores need to be flexible, not just scalable

IT Infrastructure, Storage

22 Jun 2016

I've written a bit about Object Stores before, and I was interested to see that Gartner recently updated their "Critical Capabilities" review of Object Storage products, and the top five candidates finished pretty much first equal in their overall ratings.

Gartner_Graph.png

It was also interesting to see Ceph so low on the list. Ceph does seem to have a slightly negative reputation for being complicated and having a single active node for all hashing control.

Gartner also evaluated five specialist use cases and apparently the top five again came out almost equal, with a bit of place shuffling depending on the category, but well ahead of those outside of the top five.

Gartner is of course not necessarily the ultimate authority, but they have identified five clear pack leaders. Personally, I have my own criteria which help me to refine that list even further. Living in a country where object store requirements are more often measured in hundreds of terabytes than in petabytes, I’m less likely to be impressed by extreme scalability, and more likely to be impressed by flexibility.

Flexibility rules

One thing that I think is important in an object store is flexibility. It’s all very well being able to scale to an exabyte, but if you can’t also scale down to 100 TB, or you can’t deliver reasonable performance for small files, or you can’t expand one node at a time, then the suitability of the solution is going to be limited to very large-scale, heavily controlled requirements.

However, scalability and flexibility are two of the core values of software-defined storage, and I want a good measure of both. So if you’re asking me to choose between them, then I’m thinking your product might not ring my bells.

The ability to deliver both is the reason I like Cloudian. The GUI might not be as pretty as Scality’s, and the extreme scalability might not be as well proven as Cleversafe’s, but overall Cloudian is going to be more versatile and less frustrating to own for most people with mid-sized requirements. For example, why wouldn’t I prefer a solution that is so completely software-defined that it will run on just about any commodity Intel hardware without the need to check a hardware compatibility list?

Small file performance

Object stores are optimised for capacity, not latency, and small file performance in particular is a common bugbear for object storage systems. Cloudian deals with this under the covers with a bi-modal approach, using Cassandra for efficient handling of files up to 5 MB in size, and using the Linux file system to handle larger files more efficiently than Cassandra can. This gives it a distinct advantage in the flexibility/versatility stakes.
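To make the bi-modal idea concrete, here is a minimal sketch of size-based routing. It is not Cloudian's code: the function name, the backend labels and the reading of "5 MB" as 5 × 1024 × 1024 bytes are all my own assumptions, used only to illustrate the principle.

```python
from dataclasses import dataclass

# Assumption: the 5 MB cut-off from the text, read here as binary megabytes.
SMALL_OBJECT_LIMIT_BYTES = 5 * 1024 * 1024


@dataclass
class Placement:
    backend: str      # "cassandra" or "filesystem" (illustrative labels only)
    size_bytes: int


def choose_backend(size_bytes: int) -> Placement:
    """Pick a storage backend for an object based purely on its size."""
    if size_bytes < 0:
        raise ValueError("object size cannot be negative")
    if size_bytes <= SMALL_OBJECT_LIMIT_BYTES:
        # Small objects go to the database-style store.
        return Placement("cassandra", size_bytes)
    # Anything larger is written as a file on the local file system.
    return Placement("filesystem", size_bytes)


if __name__ == "__main__":
    for size in (4 * 1024, 5 * 1024 * 1024, 50 * 1024 * 1024):
        print(size, choose_backend(size).backend)
```

The point is that the caller never sees the split: it simply writes an object, and the store decides where it lands.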

Object_Storage.png

Unified S3 access and object store nodes

Needing different node types to scale S3 protocol access, or object hashing, separately from the actual object store is another point of potential frustration for object stores. Cloudian unifies the S3 access and object nodes, so scaling is simpler and the nodes stay homogeneous. That doesn’t make it unique, but it does make it simpler than Cleversafe or Ceph, for example. Cloudian’s S3 protocol support is also more mature and complete than any other vendor’s. S3 compatibility is a major point of pride for Cloudian, while some other object stores try to avoid using S3 if they can, preferring their own APIs or the less functional Swift instead.

Start with 3 nodes and scale to 200+

The minimum recommended number of nodes, and the ability to expand in whatever steps you like, are another area of trouble for most object stores. Cloudian will actually allow you to build a virtualised object store with a single node, but recommended best practice is to start with three physical nodes. The reason for three is that it allows you to still have object mirror protection for new writes if a node goes down. If you want to use erasure coding (RAID-like protection) then a minimum of 6 nodes is recommended (4+2, i.e. four data fragments plus two parity fragments).
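The arithmetic behind those node counts can be sketched in a few lines. This is a back-of-envelope illustration only: the function names are hypothetical, and I am assuming that "mirror" means two copies and that every copy or fragment must sit on a different node.

```python
from fractions import Fraction


def min_nodes(data_parts: int, parity_parts: int) -> int:
    """One node per fragment, so the stripe width is the minimum node count."""
    return data_parts + parity_parts


def usable_fraction(data_parts: int, parity_parts: int) -> Fraction:
    """Share of raw capacity that holds user data rather than protection."""
    return Fraction(data_parts, data_parts + parity_parts)


def nodes_needed_with_failures(width: int, nodes_down: int) -> int:
    """Nodes required so that `width` distinct nodes remain after failures."""
    return width + nodes_down


if __name__ == "__main__":
    # 4+2 erasure coding: 6 nodes, two thirds of raw capacity usable.
    print(min_nodes(4, 2), usable_fraction(4, 2))
    # Two-copy mirror (width 2, assumed) that keeps mirroring with 1 node down.
    print(nodes_needed_with_failures(2, 1))
```

Under these assumptions a two-copy mirror needs three nodes to keep protecting new writes through a single node failure, while 4+2 erasure coding needs six nodes and leaves two thirds of raw capacity usable.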

Flexible licensing from 10 TiB to 10 PiB

Another positive feature of Cloudian is the licensing model. Licensing starts as low as 10 TiB and you can buy the licence up-front (usable primary storage) and have as many copies of data (sync and async) as you want, or you can buy a subscription and pay annually for what you need. For the big guys you can also buy licences in multiples of 10 PiB if you need them.

Efficiency and security

Capacity efficiency is vital for any large object store. Cloudian delivers efficiency through back-end compression, and can also auto-tier out to AWS based on when an object was last accessed. Each object or bucket can also be configured for encryption. Cloudian is also supported as an S3 target for both Commvault and NetBackup.
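As a rough illustration of what tiering on last access means, the sketch below picks out objects that have sat idle for longer than a threshold. It is purely hypothetical: the function name, the 30-day threshold and the in-memory dictionary stand in for whatever policy engine and metadata the product really uses.

```python
from datetime import datetime, timedelta, timezone


def select_for_tiering(last_access_by_key, now, max_idle):
    """Return the keys of objects not accessed within `max_idle`, sorted."""
    return sorted(
        key
        for key, last_access in last_access_by_key.items()
        if now - last_access > max_idle
    )


if __name__ == "__main__":
    now = datetime(2016, 6, 22, tzinfo=timezone.utc)
    catalogue = {
        "archive/2015-report.pdf": now - timedelta(days=200),
        "photos/team.jpg": now - timedelta(days=3),
    }
    # Hypothetical policy: move anything idle for more than 30 days.
    print(select_for_tiering(catalogue, now, timedelta(days=30)))
```

A real policy would also have to decide what happens when a tiered object is read again, which this sketch deliberately leaves out.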

Optional NAS

While there are many use cases today for object storage, often there is a NAS requirement lurking there as well. Cloudian offers a free (unsupported) NFSv3 connector as a download, but if you want fully supported SMB2 and NFSv3 access you can buy HyperStore Connect for Files (HCF), which can run on a separate slim physical node or as a VM. HCF is also supported as an NFS target for Veeam.

An impending update to HCF is planned to deliver a local read cache with deduplication and compression, making it ideal as a distributed NAS gateway back to a central Cloudian object store. There are also plans (uncommitted) to deliver FTP, WebDAV and iSCSI access.

Objects_Buckets_Files_Folders.png

In summary, while Cloudian is not necessarily the best choice for every use case, it is a pretty good choice for most object store use cases from 10 TB to 2 PB+ where flexibility and efficiency are key.

Jim Kelly

Jim specialises in cloud consulting, defining client requirements, developing strategies and designing infrastructure solutions with a focus on cost and risk optimisation.
