Channel: Xoriant Blog » NoSQL

ReDiS – A Magnificent NoSQL Key-Value Data Store

Day by day, our world is becoming more dependent on technology, and the Internet, the World Wide Web in particular, is the key resource connecting people in far-off geographic locations. Web platforms such as Google, Facebook, Twitter and LinkedIn are increasing our social presence and multiplying our interactions, in the form of hundreds of comments on Facebook walls or thousands of tweets on Twitter. This exponential growth challenges the technology world to handle the enormous volume of information floating around the web. The challenge is not only to store such unstructured data but also to access it as quickly as possible. One answer to this challenge is NoSQL databases.

What is a NoSQL database?

A ‘NoSQL’ database, also read as ‘Not Only SQL’, provides a mechanism for storing and retrieving data that is modeled in a way other than the tabular relations used in relational databases, i.e. data stored in tables, identified by primary keys and associated with each other via foreign keys. Some of these databases offer SQL-like query languages alongside their non-SQL interfaces. A NoSQL database has the following advantages over an RDBMS [Relational Database Management System]:

  • Makes the schema design process simple i.e. no need to take care of normalization etc.,
  • Minimizes the cost of horizontal scaling (increasing nodes/servers as per load) and
  • Improves performance by facilitating faster data access.

These days, NoSQL databases such as ReDiS, MongoDB and Couchbase are gaining popularity in big data and real-time web applications.

What is ReDiS?

ReDiS is a NoSQL database. You may be surprised by the way I have written the word “ReDiS”, in mixed case. The casing is inspired by what the name stands for: “Remote Dictionary Server”. Most of you will know what the words ‘remote’ & ‘server’ mean in the technical world, but some of you may be unsure about ‘dictionary’. Here, a dictionary is a container of key-value pairs in which the key serves as the path to the value. It is easily compared to the Map data structure.

In simple words, ReDiS is an open-source, in-memory key-value data store written in ANSI C. It offers persistent storage with optional durability. Though it is described as a key-value storage engine, it is equally fair to see it as a data structure server, because its keys can hold a wide range of data types as values.

ReDiS keys are binary safe: any binary sequence can be used as a key, from a string to the content of a JPEG file. The empty string is also a valid key. The maximum allowed key size is 512 MB.

The supported data types for ReDiS values are:

  1. Strings: Binary-safe strings.
  2. Lists: Collections of string elements sorted according to the order of insertion.
  3. Hashes: Maps composed of fields associated with values. Both the field and the value are strings. Using a hash, several field-value pairs can be stored under a single key. Think of it as a Java HashMap.
  4. Sets: Collections of unique, unsorted string elements.
  5. Sorted-sets: Sets of unique string elements, each associated with a floating-point value called its score. Elements are ordered by score, and elements with equal scores are ordered lexicographically.
  6. Bitmaps: String values can be handled like an array of bits using special ReDiS commands. Operations can be performed on individual bits or on groups of bits. Bitmaps are an extremely space-saving way to store information.
  7. HyperLogLogs: A probabilistic data structure used to estimate the cardinality of a set (the count of unique elements).

Atomic operations can be performed on values according to their type, such as appending to a string, incrementing a value in a hash, pushing an element onto a list, or computing the intersection or union of sets.
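As a rough mental model, most of the value types above have a close analogue among Python's built-in types. The sketch below is plain Python, not a ReDiS client: nothing in it talks to a server, the variable names are made up for illustration, and the command named in each comment is only the ReDiS operation that the line imitates.

```python
# Plain-Python analogues of the ReDiS value types (illustrative only, no server involved).
from collections import deque

# String: binary safe, so bytes is the closest match. Imitates APPEND.
greeting = b"Hello"
greeting += b" World"

# List: ordered by insertion. Imitates RPUSH (add at the tail) and LPUSH (add at the head).
queue = deque()
queue.append("job-1")
queue.append("job-2")
queue.appendleft("job-0")

# Hash: a field -> value map stored under one key. Imitates HSET and HINCRBY.
user = {"name": "alice", "visits": 0}
user["visits"] += 1

# Set: unique, unordered members. Imitates SINTER and SUNION.
tags_a = {"nosql", "cache", "redis"}
tags_b = {"cache", "sql"}
common = tags_a & tags_b
either = tags_a | tags_b

# Sorted set: unique members, each with a float score, read back ordered by score
# with ties broken lexicographically. Imitates ZADD and ZRANGE.
scores = {"bob": 72.5, "alice": 91.0, "carol": 72.5}
ranking = sorted(scores, key=lambda member: (scores[member], member))

# Bitmap: a value treated as an array of bits. Imitates SETBIT and BITCOUNT;
# a Python int stands in for the bit array here.
bitmap = 0
for offset in (0, 3, 7):
    bitmap |= 1 << offset
bits_set = bin(bitmap).count("1")
```

The important difference is that in ReDiS each of these operations runs on the server as a single atomic command, so concurrent clients never observe a half-applied update.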

Who is using ReDiS?

Flickr, Twitter, Stack Overflow, Craigslist, Pinterest, GitHub, Snapchat, etc.

Why should one use ReDiS?

  • Data Replication: it supports a multilevel master-slave setup.
    • A master can have multiple slaves, and slaves can have slaves of their own.
    • After an outage, master-slave connections are restored automatically and a partial resynchronization is attempted.
  • Persistence can be achieved in 2 ways:
    • Snapshotting: point-in-time snapshots of your dataset are written to a binary file called ‘dump.rdb’, either automatically at configurable intervals or manually by executing the ‘SAVE’ or ‘BGSAVE’ command.
    • Append Only File (AOF) & Log Rewriting: every time ReDiS receives a command that changes the dataset, it appends that command to the AOF. When you restart ReDiS, it replays the AOF to rebuild the state. The ‘BGREWRITEAOF’ command compacts the AOF into the shortest sequence of commands needed to rebuild the current in-memory dataset.
    • Persistence can also be disabled altogether for better performance.
  • Publish-Subscribe Interface: built around the paradigm of decoupling senders from receivers, it gives users a ready-to-use publication & subscription interface.
  • Auto Cleanup with key expiry (time to live): keys can be set to expire automatically at a given time or after a given interval, which cleans up stale jobs/values and keeps data storage within limits.
  • LRU (Least Recently Used) Capabilities: the ‘maxmemory’ directive limits memory usage to a fixed amount. On reaching that limit, ReDiS evicts keys according to the configured eviction policy, one of 6 available options.
  • Automatic Failover & Highly Available (HA): high availability is provided by ReDiS Sentinel, a system designed to monitor master and slave instances, notify the user if something goes wrong, handle failover by promoting a slave to master, and act as a configuration provider for clients.
  • Partially Transactional: it supports ACID (atomic, consistent, isolated, durable) transactions only partially. A transaction starts with the MULTI command, which queues the commands that follow. EXEC then executes the queued commands, while DISCARD flushes the queue and exits the transaction without executing anything. Transactions support CAS (check-and-set) behavior through the WATCH command but cannot be rolled back.
  • Ease of Use: the commands are easy to learn and use. High-level, atomic, server-side operations such as the intersection or union of sets and the sorting of lists and sets are built in, so there is no need to write code for them.
  • Great Performance: get/set operations on keys stay fast under heavy load. Though it works on a single-threaded model, its in-memory nature typically gives it faster reads/writes than a disk-based RDBMS.
  • Database as a Service: it is available as a managed service from cloud providers such as Amazon & Rackspace.
  • Partially Secure: it is designed to be accessed by trusted clients inside trusted environments only. While it does not try to implement access control, it provides an authentication layer that can be turned on by editing the ‘redis.conf’ file. In secure mode, a client must authenticate itself by sending the ‘AUTH’ command followed by the password before executing any data request.
  • Supported programming languages: C, C#, Java, PHP, Python, JavaScript, Perl, Ruby, Scala, etc.
  • Scripting Support: it supports Lua scripting. Lua is a cross-platform, lightweight language.
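
The append-only-file idea from the persistence bullet above can be sketched in a few lines. This is a toy model under simplifying assumptions, not how ReDiS is implemented: the class `ToyStore` and its methods are hypothetical names, the "file" is just a Python list, and only two write commands are modeled.

```python
# Toy model of append-only-file persistence (not ReDiS code; all names are made up).

class ToyStore:
    def __init__(self):
        self.data = {}   # the in-memory dataset
        self.log = []    # stands in for the append-only file on disk

    def execute(self, command, key, value=None, record=True):
        """Apply one write command and, unless replaying, append it to the log."""
        if command == "SET":
            self.data[key] = value
        elif command == "DEL":
            self.data.pop(key, None)
        else:
            raise ValueError(f"unsupported command: {command}")
        if record:
            self.log.append((command, key, value))

    @classmethod
    def replay(cls, log):
        """Rebuild a store after a 'restart' by re-running every logged command."""
        store = cls()
        for command, key, value in log:
            store.execute(command, key, value, record=False)
        store.log = list(log)
        return store

    def rewrite_log(self):
        """Compact the log to one SET per surviving key (the idea behind BGREWRITEAOF)."""
        self.log = [("SET", key, value) for key, value in self.data.items()]


store = ToyStore()
store.execute("SET", "counter", "1")
store.execute("SET", "counter", "2")
store.execute("SET", "temp", "x")
store.execute("DEL", "temp")

restored = ToyStore.replay(store.log)   # same state as before the "restart"
store.rewrite_log()                     # four logged commands shrink to one
```

Replaying either the full log or the rewritten one yields the same dataset, which is why the rewrite can safely discard overwritten and deleted entries.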

What do you give up with ReDiS compared to an RDBMS (Relational Database)?

  • Data Processing Limit: as an in-memory data store, it limits the size of the dataset to the memory available.
  • No SQL Support: it supports only commands, not a query language, so users cannot run ad-hoc queries the way they can with SQL on an RDBMS. All data accesses must be anticipated by the developer, and proper data access paths must be designed.
  • Partially Transactional: it does not support ACID completely, i.e. a transaction cannot be rolled back when a command in it fails.
  • Partially Secure: it offers only a basic level of security in terms of access rights. Also, it does not support encryption.
  • Single Threaded: it works on a single-threaded model, so scaling it requires running several ReDiS instances as a cluster.
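
To make the point about designing data access paths concrete, here is a minimal sketch of a hand-maintained secondary index. It is plain Python with a dict standing in for the ReDiS keyspace, and the key naming scheme (`user:<id>` for a hash, `city:<name>` for a set of ids) and both function names are assumptions chosen for illustration.

```python
# Toy model of a hand-maintained secondary index (plain Python, no server involved).
# The key names "user:<id>" and "city:<name>" are an assumed naming scheme.

keyspace = {}  # stands in for the ReDiS keyspace


def save_user(user_id, name, city):
    """Store the user as a hash and add its id to the set for its city."""
    key = f"user:{user_id}"
    old = keyspace.get(key)
    if old is not None:
        # The user may have moved: drop the id from the old city's set.
        keyspace[f"city:{old['city']}"].discard(user_id)
    keyspace[key] = {"name": name, "city": city}
    keyspace.setdefault(f"city:{city}", set()).add(user_id)


def users_in_city(city):
    """Answer 'who lives in <city>?' through the index instead of scanning every key."""
    ids = keyspace.get(f"city:{city}", set())
    return sorted(keyspace[f"user:{user_id}"]["name"] for user_id in ids)


save_user(1, "alice", "Pune")
save_user(2, "bob", "Pune")
save_user(3, "carol", "Mumbai")
save_user(2, "bob", "Mumbai")   # bob moves; the index is updated in the same call
```

In an RDBMS the same question would be a single ad-hoc WHERE clause. Here the lookup only works because the index was planned and kept up to date on every write.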

Do you want to get started?

Instead of SQL, ReDiS offers a set of commands that form a fast, short & easy-to-learn operational language. The reply depends on the command: SET replies ‘OK’ on success, EXPIRE replies ‘1’ when the timeout was set and ‘0’ when the key does not exist, and GET replies nil when there is no value for the key.
A few example commands:

  • To insert: SET mykey myvalue
  • To retrieve: GET mykey
  • To set keys to expire in specified seconds: EXPIRE mykey time_in_seconds
  • To serialize the value stored at a key: DUMP mykey
  • To recreate a key from a serialized value (a ttl of 0 means no expiry): RESTORE key_name ttl dumped_value
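
The replies to the first three commands above can be imitated with a small toy class. This is a sketch in plain Python rather than a ReDiS client: `ToyRedis` is a hypothetical name, the clock is injected so that expiry can be shown without waiting, and Python's `None` plays the role of the nil reply.

```python
# Toy model of SET / GET / EXPIRE reply values (plain Python, not a ReDiS client).
import time


class ToyRedis:
    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be demonstrated without sleeping
        self._values = {}
        self._deadlines = {}     # key -> time at which the key expires

    def _purge_if_expired(self, key):
        deadline = self._deadlines.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._values.pop(key, None)
            self._deadlines.pop(key, None)

    def set(self, key, value):
        self._values[key] = value
        self._deadlines.pop(key, None)   # a plain SET clears any existing time to live
        return "OK"

    def get(self, key):
        self._purge_if_expired(key)
        return self._values.get(key)     # None plays the role of the nil reply

    def expire(self, key, seconds):
        self._purge_if_expired(key)
        if key not in self._values:
            return 0                     # key does not exist, no timeout set
        self._deadlines[key] = self._clock() + seconds
        return 1


now = [0.0]                              # fake clock, advanced by hand
db = ToyRedis(clock=lambda: now[0])
replies = [
    db.set("mykey", "myvalue"),
    db.get("mykey"),
    db.expire("mykey", 10),
    db.expire("missing", 10),
]
now[0] = 11.0                            # eleven "seconds" later the key is gone
after_expiry = db.get("mykey")
```

The toy checks for expiry lazily, only when a key is touched, which is enough to show the visible behavior of a key with a time to live.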

Are you interested in knowing more?

Visit the ReDiS website, the source of my inspiration.

And last but not least, do not forget to post your queries. I would love to hear from you and help you sort out any problems.
