
Google F1 Database: One Step Closer To Discovering The DB Holy Grail


Fari Payandeh





Sept 15, 2013


Google recently replaced the MySQL database behind AdWords with F1, a database it built in-house. AdWords serves thousands of users, “which all share a database over 100TB serving up hundreds of thousands of requests per second, and runs SQL queries that scan tens of trillions of data rows per day,” Google said.

After reading Google’s paper on its F1 Database (not open source), I started thinking about its ramifications for Databases in general and Big Data in particular. The paper might trigger new initiatives that eventually turn the Database Holy Grail (described in the next paragraph) into something real. The paper mentions a few challenges with F1 that still need to be addressed. I came away with two lingering issues. First, there is no mention of security. Second, it states, “Hide RPC latency, Buffer writes in client, send as one RPC”. What happens if the network connection between the client and the Database goes down before the buffered writes are sent? Will the data be lost? That would be a serious problem for operations that need to commit as fast as possible; airline reservations are one example. I may have misunderstood.
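To make the buffering question concrete, here is a minimal Python sketch of the general idea of holding writes on the client and sending them in one round trip. It is not F1's client library: `BufferedWriter`, `send_rpc`, and the sample keys are hypothetical names I made up for illustration.

```python
class BufferedWriter:
    """Collects writes in client memory and sends them as one batch."""

    def __init__(self, send_rpc):
        # send_rpc is a hypothetical callable standing in for the single
        # network call to the database; it should raise if the call fails.
        self._send_rpc = send_rpc
        self._pending = []

    @property
    def pending_count(self):
        return len(self._pending)

    def write(self, key, value):
        # No network traffic here: the write exists only on the client.
        self._pending.append((key, value))

    def commit(self):
        # One round trip for the whole batch. The buffer is cleared only
        # after the call returns, so a failure leaves the writes in place.
        batch = list(self._pending)
        self._send_rpc(batch)
        self._pending.clear()
        return len(batch)


# Usage: two writes, one simulated RPC.
calls = []
writer = BufferedWriter(calls.append)
writer.write("campaign:1", "budget=100")
writer.write("campaign:2", "budget=250")
committed = writer.commit()
```

In this sketch nothing counts as committed until `commit()` returns. If the connection drops before that, the call raises, the client knows the batch did not go through, and it can retry or report the failure. Whether F1 behaves the same way is my assumption, not something the sketch proves.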

The system resembles a hybrid between Relational and Hierarchical (think mainframe) Databases. What is the Holy Grail in the Database world? Relational Databases (RDBMS) are like high-rises comprising many apartments. What if there are no vacancies and people have lined up to rent from us? The way RDBMS has handled the demand is by adding more floors on top of the high-rise. It is expensive and slows down the day-to-day operations. A new technology (NoSQL) emerged a few years ago and solved the space allocation problem. Instead of building new floors, we place the tenants in inexpensive houses. Once we run out of vacant houses, we build more houses for the new tenants. The downside? It makes managing the place more difficult, and we might unwittingly reserve the same house for two different individuals. There are ways to prevent that, but it’s a perplexing task and it places a lot of pressure on the engineers who design the housing complex. The Holy Grail is to discover a method by which we can combine the best of both worlds and remove the negatives.

Following Google’s invaluable tips in the paper, no doubt some engineers are working hard to figure out how to build an F1++ Database. What if they succeed? What will happen to NoSQL and NewSQL if they produce an open source Database System? The confluence of several forces that are currently shaping open source, Big Data, Mobile, and Cloud technologies might in time make NoSQL and the existing NewSQL irrelevant: flash-aware applications, shared-nothing architecture, MapReduce methods, software-defined storage, in-memory computing, shared virtual storage array networks, new compression algorithms, atomic writes, horizontal scalability, software-defined networking, columnar technology, progress in fault tolerance, database sharding, and solid state drives.

There is one very powerful force that in my view will keep NoSQL alive and well for years to come, and that is the power of developers. The genie is out of the bottle, and all the nuclear fusion in the world combined cannot put it back in there. Speaking from personal experience as a Developer/DBA, I know that developers hate roadblocks. Once they start on something, they like to continue working. Pulling them away from what they are deeply involved in is like taking a pacifier from a baby. For the first time in history, they can get on their generally free and open source bikes and ride without the hassle of calling the DBAs to open the gates for them every 40 miles. NoSQL pushed the Database inside the developers’ world and they love it! Is it good for the industry? Perhaps not, but it might just create millions of programming jobs. After all, somebody has to untangle the convoluted code (through no fault of the developers) left behind. Separation of Database and code, as painful as it might be for developers, is a necessity. It establishes checks and balances. According to Google’s paper, they have taken those factors into account. Google F1 is a developer-friendly Database. Hopefully the trend will continue.

From Google:

F1 is a distributed relational database system built at Google to support the AdWords business. F1 is a hybrid database that combines high availability, the scalability of NoSQL systems like Bigtable, and the consistency and usability of traditional SQL databases. F1 is built on Spanner, which provides synchronous cross-datacenter replication and strong consistency. Synchronous replication implies higher commit latency, but we mitigate that latency by using a hierarchical schema model with structured data types and through smart application design. F1 also includes a fully functional distributed SQL query engine and automatic change tracking and publishing.
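As I read the paper, the hierarchical schema means a child table's primary key begins with its parent's primary key, so a parent row and its children are stored next to each other. Here is a toy Python sketch of that ordering only. The table names, keys, and helper functions are illustrative, and it models nothing about Spanner's actual storage or replication.

```python
# Hypothetical rows: (table name, primary key tuple).
# A child key starts with its parent's key: a Campaign key is
# (customer_id, campaign_id) and an AdGroup key appends adgroup_id.
rows = [
    ("AdGroup", (1, 3, 6)),
    ("Customer", (2,)),
    ("Campaign", (1, 4)),
    ("Customer", (1,)),
    ("Campaign", (2, 5)),
    ("Campaign", (1, 3)),
]


def clustered_order(rows):
    """Sort rows by primary key so children follow their parent."""
    return sorted(rows, key=lambda row: row[1])


def subtree(rows, root_key):
    """Return the root row and its descendants, in clustered order."""
    ordered = clustered_order(rows)
    return [row for row in ordered if row[1][:len(root_key)] == root_key]


ordered = clustered_order(rows)
customer_1 = subtree(rows, (1,))
```

Sorting by the key tuples puts customer 1, its campaigns, and their ad groups in one contiguous run ahead of customer 2. That adjacency is the point: work that touches a single customer's data can stay within one neighborhood of rows, which is how I understand the paper's claim that the schema helps offset commit latency.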

2 replies »

  1. Distributed transactional SQL databases promise to solve scale-out, high availability, geo-distribution and a range of other important database problems. They are hard to build, and F1 adopts one of the three traditional models for doing this (synchronous commit). Google can do that because it has infinitely fast networks and atomic clocks on each machine. NuoDB does the same thing, but it is a downloadable product that can run on your own machine, on public clouds, or on a combination of the two. No doubt NuoDB is the first of many such products – usually when a major player like Google publishes a paper like this, a number of startups set about building one. Check out NuoDB at http://www.nuodbcom

  2. Hi Barry,
    NuoDB, ParStream, and VoltDB have been on my radar screen. The key phrase in my post is “open source”. Just as Hadoop (open source) is dominating the market for inexpensive, scalable, and fault-tolerant distributed file systems, the potential for the emergence of an open source Database System similar to F1 is there.
    I personally wish all of you folks success because you are making a difference. Nevertheless, the fact is that the competition in the space you are in is ferocious.

