Random ID generation for distributed systems

Vishnu R
4 min read · Jan 28, 2019

A problem faced by many distributed systems is how to generate an ID that is unique across all the machines in the system. We discuss some common approaches in this post.

Typical requirements include:

  • Time-sortable — Once the IDs are generated, they may be required to be sortable by time (say to order photos by time).
  • Distributed — While some systems may use a single (or group of) server/system(s) for simplicity, some systems would prefer not to have such an implementation to avoid a single point of failure.
  • Scalable — no matter how many servers are added to the system, this should work with minimal to no changes.

DB Auto-increment

(Or DB Ticket servers)

Simple systems backed by a database can take advantage of the auto-increment feature that many databases offer (MySQL's AUTO_INCREMENT, for example; MongoDB instead assigns each document a 12-byte ObjectId). First, the advantages of this approach:

  • Databases are well understood, and most engineers are already comfortable working with one.
  • Will be globally (globally => system wide) unique.
  • Will be monotonically increasing.
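The mechanism can be sketched with SQLite's AUTOINCREMENT (the table and column names here are made up; any database with an auto-increment feature works similarly):

```python
import sqlite3

# In-memory database for illustration; a real system would use a shared server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY AUTOINCREMENT, stub TEXT)")

def next_id(conn):
    """Insert a throwaway row and let the database assign the next ID."""
    cur = conn.execute("INSERT INTO tickets (stub) VALUES ('a')")
    return cur.lastrowid

ids = [next_id(conn) for _ in range(3)]
print(ids)  # [1, 2, 3] — unique and monotonically increasing
```

Every application server calling `next_id` gets a distinct value, but each call is a round trip to the database, which is exactly the bottleneck discussed below.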

There are also disadvantages:

  • Some businesses may be reluctant to move application logic to a database (what would happen if they move to a different database system?).
  • You must rely on the database provider's guarantee of global uniqueness (edge cases may exist).
  • Single point of failure (even if the database is replicated in some form of master-master setup like in Flickr’s system). At the very least, this can introduce a bottleneck in the system where the application servers need to query the databases every time they need to generate an ID.
  • More complexity in the system: databases need managing, and a query channel (messaging, RPC, etc.) has to be implemented for requesting IDs.

Ticket Server

Ticket server (DB servers follow a similar layout)

This is essentially an extension of the DB-based ID generation above: one or more servers exist solely to hand out unique IDs. The advantages and disadvantages are similar. While the implementation is simple, it still adds servers that must be maintained and managed.

There are systems like Twitter's Snowflake service that aim to solve some of the issues mentioned above (SPOF, bottlenecks, etc.), but they bring in other complexities, such as having to run ZooKeeper to coordinate nodes. This may still work well in systems where ZooKeeper is already up and running and there is a willingness to maintain it.
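The core idea can be sketched without the coordination machinery. Below is a minimal, hypothetical Snowflake-style generator using Twitter's published bit layout (41-bit timestamp, 10-bit worker ID, 12-bit sequence); a real deployment would add clock-rollback handling and would assign worker IDs via something like ZooKeeper rather than hard-coding them:

```python
import time
import threading

CUSTOM_EPOCH_MS = 1288834974657  # Twitter's epoch; any fixed past instant works

class SnowflakeSketch:
    def __init__(self, worker_id):
        assert 0 <= worker_id < 1024   # worker ID must fit in 10 bits
        self.worker_id = worker_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = int(time.time() * 1000)
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF  # 12-bit sequence
                if self.sequence == 0:
                    # Sequence exhausted for this millisecond: spin to the next one.
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            return ((now - CUSTOM_EPOCH_MS) << 22) | (self.worker_id << 12) | self.sequence

gen = SnowflakeSketch(worker_id=7)
a, b = gen.next_id(), gen.next_id()
print(a < b)  # later IDs compare greater, so the IDs are time-sortable
```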

UUID

There are several ways to generate UUIDs. The standard approach is the one specified in RFC 4122, which combines a timestamp or random number with a node ID (usually the MAC address) and other fields. A UUID is 128 bits, however, and some systems prefer a 64-bit ID.

Also, depending on the version used, a UUID may not be time-sortable. It is, however, almost always guaranteed to be globally unique, so for systems where the disadvantages above are not an issue, UUIDs are a very good choice.
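Python's standard uuid module implements the RFC versions, which makes the trade-off easy to see:

```python
import uuid

# uuid1() embeds a timestamp plus the node ID (typically the MAC address);
# uuid4() is essentially 122 random bits. Both are 128 bits in total.
u1 = uuid.uuid1()
u4 = uuid.uuid4()

print(u1.version, u4.version)  # 1 4
print(len(u1.bytes))           # 16 bytes = 128 bits

# Version-1 UUIDs carry an extractable timestamp, so they can be ordered by
# creation time; version-4 UUIDs cannot.
print(u1.time)  # 60-bit count of 100-ns intervals since 1582-10-15
```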

Custom solutions

While all the above solutions are perfectly valid under the right circumstances, there are other custom solutions which aim to provide a completely distributed solution with as few moving pieces as possible.

An example is the scheme used by Instagram to generate a 64-bit ID:

  • 41 bits for time in milliseconds (roughly 69 years of IDs with a custom epoch)
  • 13 bits that represent the logical shard ID
  • 10 bits that represent an auto-incrementing sequence, modulo 1024. This means we can generate 1024 IDs per shard, per millisecond
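A sketch of that bit layout in Python (Instagram actually builds the ID inside PL/pgSQL on each shard; the shard ID and sequence below are illustrative values, and the epoch is the one from their post):

```python
import time

CUSTOM_EPOCH_MS = 1314220021721  # Instagram's custom epoch

def make_id(shard_id, sequence, now_ms=None):
    """Pack 41 bits of milliseconds, 13 bits of shard ID, 10 bits of sequence."""
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    millis = now_ms - CUSTOM_EPOCH_MS
    return (millis << 23) | ((shard_id & 0x1FFF) << 10) | (sequence % 1024)

def unpack(id64):
    return (id64 >> 23, (id64 >> 10) & 0x1FFF, id64 & 0x3FF)

id64 = make_id(shard_id=5, sequence=3)
print(unpack(id64)[1:])  # (5, 3) — shard and sequence round-trip intact
```

Because the timestamp occupies the high bits, sorting the IDs numerically sorts them by creation time.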

Another example is Boundary Flake. As described by their GitHub page, Flake IDs are 128 bits wide, described here from most significant to least significant bits:

  • 64-bit timestamp — milliseconds since the epoch (Jan 1 1970)
  • 48-bit worker ID — MAC address from a configurable device
  • 16-bit sequence number — usually 0; incremented when more than one ID is requested in the same millisecond and reset to 0 when the clock ticks forward
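The same packing can be sketched in Python; here `uuid.getnode()` stands in for reading the MAC address of a configured device:

```python
import time
import uuid

def flake_id(sequence=0, now_ms=None, worker=None):
    """Pack a 64-bit ms timestamp, 48-bit worker ID, and 16-bit sequence."""
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    worker = uuid.getnode() if worker is None else worker  # 48-bit node ID
    return (now_ms << 64) | ((worker & 0xFFFFFFFFFFFF) << 16) | (sequence & 0xFFFF)

a = flake_id(sequence=0)
b = flake_id(sequence=1)
print(a < b)  # later IDs sort after earlier ones (assuming the clock doesn't roll back)
```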

Such implementations have the advantage of being time-sortable, distributed (each node can generate its own IDs independently, as long as it is configured correctly), and simple to use. Some care has to be taken to understand them, especially the case where a local clock is used and it rolls backwards. But done right, this scales well, and little to no configuration change is needed when new servers are added to the system.
