How to Stream Data From Firebase to BigQuery Easily
You may have installed the Stream Firestore to BigQuery extension, and maybe even read some of the docs, but you still can’t quite figure out how to get everything working. This guide will help you with that.
Have data in Firebase Firestore and need to bring it into BigQuery? That’s fairly straightforward. You simply export the data into a Cloud Storage bucket and pull it into BigQuery from your BigQuery console. There are a few articles online that explain this well, particularly this one.
But when it comes to streaming data in real time from Firestore to BQ, I couldn’t find any good articles that got me from start to finish without errors, so I wrote one.
This guide assumes you have an existing Firestore instance set up and one or more collections in your Firestore DB. To start, navigate to the Extensions screen within your Firebase console. Search for and install the Stream Firestore to BigQuery extension.
Once in, set up the extension by following the prompts.
NB: You’ll only be able to use the extension if your workspace or account has a credit card on file, since extensions require billing to be enabled on the project (the Blaze plan).
Keep hitting next till you get to Step 4: Configure extension.
Set your Project ID, Collection Path, Dataset ID and Table ID. Project ID should be the ID of your Firebase project (you can find it in Project settings), which isn’t always the same as its display name. Collection Path refers to the primary collection you would like to stream to BQ. In our case, we want the users collection. Dataset ID and Table ID can be named as you wish. Keep all other configurations as default and hit the install button.
The extension will take around 3 to 5 minutes to install.
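To keep the four values straight, it can help to write them down as shell variables before you go on. This is only a sketch: my-firebase-project and firestore_export are placeholder values I made up, while users is the collection used in this guide. The derived name follows the table_name_raw_changelog format that shows up later in this guide.

```shell
# Placeholder values; replace them with your own.
PROJECT_ID="my-firebase-project"   # hypothetical project ID
COLLECTION_PATH="users"            # the collection streamed in this guide
DATASET_ID="firestore_export"      # any dataset name you like
TABLE_ID="users"                   # any table name you like

# Name of the changelog table this guide looks for in BigQuery later on.
CHANGELOG_TABLE="${TABLE_ID}_raw_changelog"

# Print the fully qualified table name: project.dataset.table
echo "${PROJECT_ID}.${DATASET_ID}.${CHANGELOG_TABLE}"
```

Having the fully qualified name written down makes it easier to spot a typo in one of the IDs when the table doesn’t show up where you expect it.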
At this point, there’s no new data in your BQ console just yet.
While waiting for the extension’s installation to complete, you can start setting up the gcloud CLI. You’ll need the CLI to authenticate when running the import script that will take your data from Firestore to BQ.
Follow the instructions here to set up the CLI for your device. Don’t forget to make sure the CLI is in your PATH by running the command below (on a Mac) from your HOME directory.
./google-cloud-sdk/install.sh
Restart your terminal instance before continuing.
With the CLI set up, run the command below in your terminal window to authenticate.
gcloud auth application-default login
The command will open the Google sign-in screen in your browser. Authenticate using the email address that has access to the Firestore project you’re working with.
If everything goes fine, you’ll see a confirmation screen in your browser.
Your terminal should also show a confirmation message.
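If you’d like to double-check the login from the command line, a small helper along these lines can do it. It’s a sketch, not part of the extension: check_adc is a name I made up, and it relies on gcloud auth application-default print-access-token succeeding only when application-default credentials are in place.

```shell
# Report whether gcloud is installed and application-default credentials work.
# check_adc is a hypothetical helper name, not a gcloud command.
check_adc() {
  if ! command -v gcloud >/dev/null 2>&1; then
    echo "gcloud is not on PATH"
    return 1
  fi
  # The token itself is discarded; only the exit status matters here.
  if gcloud auth application-default print-access-token >/dev/null 2>&1; then
    echo "application-default credentials look OK"
  else
    echo "no usable application-default credentials; run the login command again"
    return 1
  fi
}
```

Paste the function into your terminal and run check_adc. If it reports that gcloud is not on PATH, revisit the install step and restart your terminal before trying again.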
Google has an import script that will allow you to automatically bring all the data in your Firestore collection into BigQuery by running one command. This script works alongside the Stream Firestore to BigQuery extension. Run the command below in your terminal to trigger the script.
npx @firebaseextensions/fs-bq-import-collection
P.S. The command above will only work if you have Node.js and npm installed.
Follow the prompts, starting with entering your project ID. You only need to enter the custom information we set earlier: Project ID, Dataset ID, Table prefix and Collection Path. See the screenshot below.
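If you’d rather skip the prompts, the script can reportedly run non-interactively as well. The sketch below only prints a command for you to review; it doesn’t run anything. The flag names are my reading of the script’s documentation and may differ between versions, so treat them as assumptions and confirm them with the script’s --help output. The script may also expect more flags than the four shown (batch size and dataset location, for example). build_import_cmd is a made-up helper name and the values are the placeholders used in this guide.

```shell
# Print (not run) a non-interactive import command so it can be reviewed first.
# Assumption: flag names are taken from my reading of the script's docs.
# $1 = project ID, $2 = collection path, $3 = dataset ID, $4 = table prefix
build_import_cmd() {
  echo "npx @firebaseextensions/fs-bq-import-collection --non-interactive" \
       "--project $1" \
       "--source-collection-path $2" \
       "--dataset $3" \
       "--table-name-prefix $4"
}

# Placeholder values; swap in your own before using the printed command.
build_import_cmd "my-firebase-project" "users" "firestore_export" "users"
```

Printing the command first is a deliberate choice: you get to compare each value against what you entered when configuring the extension before anything touches BigQuery.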
The script will create a raw changelog table named in the format table_name_raw_changelog. Once you navigate to your BigQuery console, you will be able to see it.
You can also run queries on your data using the Compose a new query button.
Now you can add a new document to the users collection, refresh your table within BigQuery and see the new document show up in your query results!
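The same kind of query can be run from the terminal with the bq tool that ships with the gcloud CLI. Here is a minimal sketch, assuming the changelog table has document_name, operation, timestamp and data columns (that’s my understanding of the extension’s schema, so check the table’s Schema tab first). latest_changes_sql is a made-up helper name and the IDs are placeholders.

```shell
# Build a Standard SQL query for the 10 newest rows of a changelog table.
# $1 = project ID, $2 = dataset ID, $3 = table ID chosen during installation
latest_changes_sql() {
  echo "SELECT document_name, operation, timestamp, data" \
       "FROM \`$1.$2.${3}_raw_changelog\`" \
       "ORDER BY timestamp DESC LIMIT 10"
}

# Print the query so it can be pasted into the BigQuery console.
latest_changes_sql "my-firebase-project" "firestore_export" "users"

# To run it from the terminal instead (needs the bq CLI and a signed-in account):
# bq query --use_legacy_sql=false "$(latest_changes_sql my-firebase-project firestore_export users)"
```

Sorting by timestamp in descending order puts the most recent changes at the top, which is handy for confirming that new documents are actually streaming in.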