DBT Project Setup
Fabric Workspace Setup
Next, we will create a new dbt project and configure it to use the dbt-fabricsparknb adapter. Before doing so, we need to gather some information from the Power BI / Fabric portal. To do this, follow the steps below:
- Open the Power BI Portal and navigate to the workspace you want to use for development. If necessary, create a new workspace.
- Ensure that the workspace is Fabric enabled. If not, enable it.
- Make sure that there is at least one Lakehouse in the workspace.
- Get the connection details for the workspace.
  - You will need the workspace name, workspace id, lakehouse id, and lakehouse name.
  - The lakehouse name and workspace name are easily viewed in the Fabric / Power BI portal.
  - The easiest way to get the id information is to:
    - Navigate to a file or folder in your target lakehouse.
    - Click on the three dots to the right of the file or folder name, and select "Properties". Details will be displayed in the properties window.
    - From these properties, select copy URL and paste it into a text editor. The workspace id is the first GUID in the URL and the lakehouse id is the second GUID in the URL.
    - In the example below, the workspace id is 4f0cb887-047a-48a1-98c3-ebdb38c784c2 and the lakehouse id is aa2e5f92-53cc-4ab3-9a54-a6e5b1aeb9a9.
https://onelake.dfs.fabric.microsoft.com/4f0cb887-047a-48a1-98c3-ebdb38c784c2/aa2e5f92-53cc-4ab3-9a54-a6e5b1aeb9a9/Files/notebooks
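If you prefer not to pick the GUIDs out by eye, the rule above (first GUID is the workspace id, second GUID is the lakehouse id) can be sketched in a few lines of Python. This is only an illustrative helper: the function name `extract_ids` is hypothetical and not part of dbt or the adapter, and it assumes the copied URL has the same shape as the example above.

```python
import re
from urllib.parse import urlparse

# Canonical hyphenated GUID, e.g. 4f0cb887-047a-48a1-98c3-ebdb38c784c2
GUID_RE = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")


def extract_ids(onelake_url: str) -> tuple[str, str]:
    """Return (workspace_id, lakehouse_id) from a copied properties URL.

    Hypothetical helper: assumes the first two GUID path segments are the
    workspace id and the lakehouse id, in that order.
    """
    segments = [s for s in urlparse(onelake_url).path.split("/") if s]
    guids = [s for s in segments if GUID_RE.match(s)]
    if len(guids) < 2:
        raise ValueError("expected at least two GUIDs in the URL path")
    return guids[0], guids[1]


if __name__ == "__main__":
    url = (
        "https://onelake.dfs.fabric.microsoft.com/"
        "4f0cb887-047a-48a1-98c3-ebdb38c784c2/"
        "aa2e5f92-53cc-4ab3-9a54-a6e5b1aeb9a9/Files/notebooks"
    )
    workspace_id, lakehouse_id = extract_ids(url)
    print("workspace id:", workspace_id)
    print("lakehouse id:", lakehouse_id)
```

The helper raises a `ValueError` when fewer than two GUIDs are found, which is a hint that the wrong URL was copied.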
Create dbt Project
Once you have taken note of the workspace id, lakehouse id, workspace name, and lakehouse name, you can create a new dbt project and configure it to use the dbt-fabricsparknb adapter. To do this, run the command shown below:
# Create your dbt project directories and profiles.yml file
dbt init my_project # Note that the name of the project is arbitrary... call it whatever you like
When asked the questions below, provide the answer shown beneath each prompt:
- Which database would you like to use?
  Select dbt-fabricsparknb
- Desired authentication method option (enter a number):
  Select livy
- workspaceid (GUID of the workspace. Open the workspace from fabric.microsoft.com and copy the workspace url):
  Enter the workspace id
- lakehouse (Name of the Lakehouse in the workspace that you want to connect to):
  Enter the lakehouse name
- lakehouseid (GUID of the lakehouse, which can be extracted from url when you open lakehouse artifact from fabric.microsoft.com):
  Enter the lakehouse id
- log_lakehouse (Name of the log Lakehouse in the workspace that you want to log to):
  Enter the log_lakehouse name
- endpoint [https://api.fabric.microsoft.com/v1]:
  Press enter to accept the default
- auth (Use CLI (az login) for interactive execution or SPN for automation) [CLI]:
  Select cli
- client_id (Use when SPN auth is used.):
  Enter a single space and press enter
- client_scrent (Use when SPN auth is used.):
  Enter a single space and press enter
- tenant_id (Use when SPN auth is used.):
  Enter a single space or enter your Power BI tenant id
- connect_retries [0]:
  Enter 0
- connect_timeout [10]:
  Enter 10
- schema (default schema that dbt will build objects in):
  Enter dbo
- threads (1 or more) [1]:
  Enter 1
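Before typing the answers into the prompts, it can help to sanity-check the values you collected. The sketch below is a hypothetical pre-flight check, not something dbt or the adapter provides: the function name `check_answers` and the choice of which fields to validate are assumptions, and the dictionary keys simply mirror the prompt names above.

```python
import uuid


def check_answers(answers: dict) -> list[str]:
    """Return a list of problems found in the collected prompt answers.

    Hypothetical helper; keys mirror the dbt init prompt names.
    """
    problems = []
    # The two id prompts expect GUIDs. Note that uuid.UUID is lenient and
    # also accepts forms without hyphens or wrapped in braces.
    for key in ("workspaceid", "lakehouseid"):
        try:
            uuid.UUID(str(answers.get(key, "")))
        except ValueError:
            problems.append(f"{key} is not a valid GUID")
    # Names must not be blank.
    for key in ("lakehouse", "schema"):
        if not str(answers.get(key, "")).strip():
            problems.append(f"{key} must not be empty")
    # The threads prompt asks for 1 or more.
    threads = answers.get("threads", 1)
    if not isinstance(threads, int) or threads < 1:
        problems.append("threads must be 1 or more")
    return problems


if __name__ == "__main__":
    example = {
        "workspaceid": "4f0cb887-047a-48a1-98c3-ebdb38c784c2",
        "lakehouseid": "aa2e5f92-53cc-4ab3-9a54-a6e5b1aeb9a9",
        "lakehouse": "my_lakehouse",  # placeholder name for illustration
        "schema": "dbo",
        "threads": 1,
    }
    print(check_answers(example))
```

An empty list means none of these basic checks failed; it does not confirm that the workspace or lakehouse actually exists.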
The command above will create a new directory called my_project. Within this directory you will find a dbt_project.yml file. Open this file in your favourite text editor and note that it should look like the example below, except that in your case my_project will be replaced with the name of the project you created above:
# Name your project! Project names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: 'my_project'
version: '1.0.0'
# This setting configures which "profile" dbt uses for this project.
profile: 'my_project'
# These configurations specify where dbt should look for different types of files.
# The `model-paths` config, for example, states that models in this project can be
# found in the "models/" directory. You probably won't need to change these!
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
clean-targets: # directories to be removed by `dbt clean`
  - "target"
  - "dbt_packages"
# Configuring models
# Full documentation: https://docs.getdbt.com/docs/configuring-models
models:
  my_project:
    # Config indicated by + and applies to all files under models/example/
    example:
      +materialized: view
The dbt init command will also update your profiles.yml file (by default, dbt keeps it in the .dbt folder of your home directory) with a profile matching your dbt project name. Open this file in your favourite text editor using the command below:
When run, this will display a file similar to the one below. Check that your details are correct.
Note
- The profiles.yml file should look like the example below, except that in your case the highlighted lines may contain different values.
- log_lakehouse is an optional value in the profile.
Info
You are now ready to move to the next step in which you will build your dbt project. Follow the Dbt Build Process guide.