Databricks with Visual Studio Code

Following on from last week's post about installing and configuring Databricks Connect, this post covers the steps required to install Visual Studio Code, in which you will write your commands and scripts.

First, a definition is in order: Visual Studio Code (VSC) is a lightweight source code editor available for Windows, macOS and Linux. It differs from Visual Studio in that it is not a full IDE.

Install

We need to carry out two steps:

  1. Install Visual Studio Code if it is not already installed.
  2. Install the Python Extension for Visual Studio Code (VSC).

FYI, this post is based on the following version of VSC:

Version: 1.46.1 (user setup)
Commit: cd9ea6488829f560dc949a8b2fb789f3cdc05f5d
Date: 2020-06-17T21:13:20.174Z
Electron: 7.3.1
Chrome: 78.0.3904.130
Node.js: 12.8.1
V8: 7.8.279.23-electron.0
OS: Windows_NT x64 10.0.18363

Configuration

OK, assuming you now have VSC and the Python Extension installed, we need to configure the setting python.venvPath to tell VSC where the Python environment is located. This is achieved by running the following from the command line to obtain the location of the Python environment:

databricks-connect get-jar-dir

This returns a path inside the Python environment (the pyspark JAR directory). In the example below it is:

c:\users\user\miniconda3\lib\site-packages\pyspark/jars

Take the path returned, open VSC and add the path to Python: Venv Path via the Settings tab:

Setting the Python Environment
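
If you prefer to edit settings.json directly rather than use the Settings tab, the sketch below shows one way to produce the entry. The helper names get_jar_dir and venv_path_setting are my own, not part of Databricks Connect or VSC, and the example path is the one shown above; substitute your own. The main thing it illustrates is that backslashes in a Windows path must be escaped in JSON.

```python
import json
import subprocess


def get_jar_dir() -> str:
    """Run `databricks-connect get-jar-dir` and return the path it prints.

    Hypothetical helper: assumes databricks-connect is installed and on PATH.
    """
    result = subprocess.run(
        ["databricks-connect", "get-jar-dir"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def venv_path_setting(path: str) -> str:
    """Return a settings.json fragment that sets python.venvPath to `path`.

    json.dumps takes care of escaping the backslashes in Windows paths.
    """
    return json.dumps({"python.venvPath": path}, indent=4)


# Example using the path from this post; use get_jar_dir() on your own machine.
print(venv_path_setting(r"c:\users\user\miniconda3\lib\site-packages\pyspark/jars"))
```

Copy the printed key and value into your User Settings JSON, merging it with any settings already there.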

Testing

You will now be good to go and able to send commands to the Databricks clusters previously created.

  1. Launch Visual Studio Code
  2. Create a new file. In this case it is called Test Connectivity
  3. Paste the following code into the new file

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
print("Testing connectivity from VSC to Databricks")
print(spark.range(100).count())

The final step is to run the script. If the connection works, you should see the test message followed by the row count, 100.

Code sample courtesy of Databricks
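
As a variation on the test (my own sketch, not the Databricks sample), you can wrap the check in a function that compares the count with the expected value, so a failed round trip is reported explicitly. The names check_connectivity and main are hypothetical, and pyspark is imported inside main() on the assumption that Databricks Connect is installed and configured as in the previous post.

```python
def check_connectivity(spark, n=100):
    """Return True if the cluster reports n rows for spark.range(n)."""
    return spark.range(n).count() == n


def main():
    # Imported here so the helper above can be used without pyspark loaded.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    if check_connectivity(spark):
        print("Connectivity from VSC to Databricks confirmed")
    else:
        raise SystemExit("Cluster returned an unexpected row count")
```

To run it, add a call to main() as the last line of the file and run the script as before.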

References

Why use VS Code?

Connecting to Databricks from Visual Studio Code
