Ritu Singh
Problem:
As the title suggests, I have a shapefile (.shp) in an Azure Blob Storage container, and I'm trying to read it directly into my Azure Databricks notebook without downloading it to my local drive first.
I'm able to read CSV files from Blob Storage, but I'm running into problems with the shapefile, and I haven't been able to find a solution in past Stack Overflow questions.
Here's the code I'm using:
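(The snippet itself was not preserved in the post; a minimal sketch of the usual pattern with the azure-storage-blob package follows. The connection string, container, and blob names in angle brackets are placeholders, and the helper name is mine.)

```python
import io


def fetch_shapefile_bytes(conn_str: str, container: str, blob_name: str) -> io.BytesIO:
    """Download a blob into an in-memory buffer (requires azure-storage-blob)."""
    from azure.storage.blob import BlobServiceClient  # third-party dependency

    service = BlobServiceClient.from_connection_string(conn_str)
    blob = service.get_blob_client(container=container, blob=blob_name)
    return io.BytesIO(blob.download_blob().readall())


# Example call (placeholders, won't run without real credentials):
# data = fetch_shapefile_bytes("<connection-string>", "<container>", "myshapefile.shp")
```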
This returns <_io.BytesIO at 0x7f1ad4bbbcc0>.
Subsequently I've tried reading the shapefile with geopandas and Fiona:
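(Again reconstructed, since the original code is missing from the post; `data` is the BytesIO object returned earlier. Both calls hand the in-memory buffer to GDAL, which exposes it as a /vsimem/ path.)

```python
def try_read_in_memory(data):
    """Both attempts fail with DriverError (requires geopandas and fiona)."""
    import geopandas as gpd
    import fiona

    gdf = gpd.read_file(data)       # DriverError: '/vsimem/...' not recognized ...
    with fiona.open(data) as src:   # similar DriverError
        return src.schema
```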
Geopandas gives the error DriverError: '/vsimem/31debcdbc2b0480b9f0567aea3a687d7' not recognized as a supported file format.
Fiona gives a similar error: DriverError: '/vsimem/04e527ecf5324605bdcf3643ea3b4bd2/04e527ecf5324605bdcf3643ea3b4bd2' not recognized as a supported file format.
There don't appear to be any issues with the file itself: I uploaded the shapefile to my Azure Workspace and it read fine from there. But because this file is meant to be used in a workflow on the cloud, I can't use that approach.
Solution:
You can mount your storage account to Databricks and read the shapefile (.shp) from the mount point. Below is the shapefile I am working with.
Code to mount the container:
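(The mount snippet was not preserved; a minimal sketch using dbutils.fs.mount follows. The storage account, container, key, and mount point are placeholders, and `dbutils` exists only inside a Databricks notebook, so it is passed in here.)

```python
def mount_blob_container(dbutils, storage_account: str, container: str,
                         account_key: str, mount_point: str = "/mnt/blob"):
    """Mount an Azure Blob Storage container onto DBFS via wasbs."""
    dbutils.fs.mount(
        source=f"wasbs://{container}@{storage_account}.blob.core.windows.net/",
        mount_point=mount_point,
        extra_configs={
            f"fs.azure.account.key.{storage_account}.blob.core.windows.net":
                account_key
        },
    )


# In a notebook: mount_blob_container(dbutils, "<storage-account>",
#                                     "<container>", "<storage-account-key>")
```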
You can read it with the code below.
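(The exact mount path and file name aren't shown in the post; a minimal sketch with placeholder names in angle brackets would be:)

```python
def read_mounted_shapefile(path: str = "/dbfs/mnt/blob/<shapefile-name>.shp"):
    """Read a shapefile from a DBFS mount (requires geopandas on the cluster).

    Note the /dbfs prefix: DBFS mounts are exposed on the driver's local
    filesystem under /dbfs, which is where geopandas resolves paths.
    """
    import geopandas as gpd

    return gpd.read_file(path)


# In a notebook:
# gdf = read_mounted_shapefile()
# gdf.head()
```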
Here, you can see I prefixed /dbfs to the path: geopandas.read_file resolves paths on the driver's local filesystem from the root, not in the Spark context, so the mounted file must be accessed through the /dbfs prefix.