Cloud Dataproc


Dataproc is a managed Hadoop and Spark service. If you use the Hadoop ecosystem, you know that setting up a cluster can be complicated and can take hours or even days. Dataproc can spin up a cluster in about 90 seconds, so you can start analyzing data quickly.
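
As a rough illustration of how little setup is involved, the sketch below assembles a `gcloud dataproc clusters create` command in Python. The helper name `build_create_cluster_command`, the cluster name, the region, and the worker count are all placeholder assumptions, and the script only prints the command rather than running it.

```python
import shlex


def build_create_cluster_command(cluster_name, region, num_workers=2):
    """Build the argv list for `gcloud dataproc clusters create`.

    Hypothetical helper for illustration; it only assembles the command.
    """
    if not cluster_name or not region:
        raise ValueError("cluster_name and region are required")
    return [
        "gcloud", "dataproc", "clusters", "create", cluster_name,
        f"--region={region}",
        f"--num-workers={num_workers}",
    ]


# Placeholder values; substitute your own cluster name and region.
command = build_create_cluster_command("example-cluster", "us-central1")

# Print a shell-safe version of the command instead of executing it.
print(shlex.join(command))
```

To actually create the cluster, the list could be passed to `subprocess.run(command, check=True)` on a machine where the gcloud CLI is installed and authenticated.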

HDFS: Hadoop Distributed File System. Its design was modeled on the Google File System (GFS).
On Google Cloud, Cloud Storage typically takes its place for persistent data.
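
On Dataproc, jobs can usually read from Cloud Storage by pointing at `gs://` URIs where an on-premises job would use `hdfs://` ones. The sketch below shows that path mapping only; the function name `hdfs_to_gcs_uri`, the bucket name, and the assumption that the directory layout is copied over unchanged are all hypothetical.

```python
from urllib.parse import urlsplit


def hdfs_to_gcs_uri(hdfs_uri, bucket):
    """Map an hdfs:// URI to a gs:// URI with the same path.

    Hypothetical helper; assumes the data was copied to the bucket
    with its directory layout unchanged.
    """
    parts = urlsplit(hdfs_uri)
    if parts.scheme != "hdfs":
        raise ValueError(f"expected an hdfs:// URI, got {hdfs_uri!r}")
    # Drop the namenode host and port (if any) and keep only the path.
    return f"gs://{bucket}{parts.path}"


# Placeholder bucket name for illustration.
print(hdfs_to_gcs_uri("hdfs:///data/logs/2020/part-00000", "example-bucket"))
```

Keeping data in Cloud Storage rather than in the cluster's HDFS means it survives after the cluster is deleted, which suits short-lived clusters.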

HBase: open-source, distributed NoSQL big data store that runs on top of HDFS. The Google Cloud equivalent is Bigtable.