Revolutionizing Geospatial Analyses: Python and F# for Local Community Leaders

Jun 15, 2023

TL;DR: In this post, I describe how to bring programming and data analysis closer to local community leaders by combining two programming languages and their ecosystems. Citizens draw their living spaces with a universal tool (felt.com), and F# turns those maps into Domain-Specific Language (DSL) types. The resulting information is then shared with Python, which offers a rich ecosystem for further, more advanced data processing and visualization.

Example notebook is here.

Make sure to use the latest VS Code with the Polyglot Notebooks extension, and start VS Code from Anaconda so that the Python modules used here, such as GeoPandas, are already installed.

Before you read: benefits of supporting me

If you found my GeojsonCloud and other libraries useful and want something similar for your home or beloved city, consider buying me a coffee (or a lifetime membership). Type your city in the text field. I will support more cities according to my supporters' preferences.

Polyglot Notebooks

Polyglot Notebooks, a Visual Studio Code extension, has recently expanded its language support to include Python and R (in preview), in addition to the existing .NET, JavaScript, SQL, and more. The extension lets you combine multiple programming languages in one notebook and draw on the strengths of each language's ecosystem.

One area where this combination pays off is geospatial analysis, where Python and GeoPandas make complex spatial tasks easy to perform.

F# brings its unique superpowers, such as type providers and domain modeling with type safety, to the table. By leveraging F#'s capabilities, developers can enhance their code's clarity, reliability, and maintainability in various domains.

Combining these two cool programming languages can unleash a powerful synergy in geospatial analyses, enabling developers to tackle complex challenges with a robust and expressive toolset.

As the toolset becomes more expressive and less procedural, programming and data analyses become accessible to community leaders seeking to enhance their local areas.

Smart Citizens

Data-Driven cities need data-driven community leaders. Still, all that object-oriented or procedural coding is hard to start with.

An ideal starting point for newcomers would be coding with words from their everyday life: neighborhood names, points of interest and facilities, and even the names of their family members and friends, with very little of the noise common in general-purpose programming languages.

In my spare time, I'm working on libraries in F#, where we can access open city data with no setup and evaluate places with just a single function:

Such a function is all a citizen has to write on their own, and it is enough to address many municipal, social, or business challenges.

But what is the Place<'properties> coming from? How can it be beneficial for Python developers?

Geojson Cloud

GeojsonCloud.CityName is a dedicated NuGet package that can talk to open city data without any setup (literally). Many Python modules let you work with various geo datasets, including OpenStreetMap, but open city data is on a totally different level. City authorities have a monopoly on gathering these data. Such data are more exhaustive than OSM, and the vast majority are open to copying, modification, and redistribution.

Geojson Cloud is a cloud copy of several files downloaded from open city data portals. They are stored in the cloud along with their metadata (license, attribution, number of items, and features). Thanks to the dedicated F# Type Provider, particular data can be queried and downloaded as if it were part of the language, with strongly-typed access that promotes discoverability and supports metadata:

While this feature is only available with the F# kernel, the final value can be shared with other kernels.

Hence we can use it from Python:

You can further continue with other Geopandas features (or any other library):

Analyses of living space with Florence library

In this section, I will show how to rank a neighborhood using mostly the names of our focal points in the city. But how can we easily define them?

There are many new cool geospatial tools. One of my favorites is felt.com.

Felt is the World's first collaborative mapping tool built for anyone to make a beautiful map in minutes.

I created a simple living space in Southwark (London) with Felt.

Additionally, using the HTML kernel, we can embed the created map in our notebook to get a better overall picture of what we are addressing:

Our goal is to find the best place to live, according to the citizen who created his map.

This is a very subjective task: one person wants to live as close to a certain point as possible, while another may want to live as far from it as possible. That is why it is extremely important to let users define the ranker function with regular words from their everyday lives and neighborhoods. That way, they do geospatial analysis without even knowing it.

This is exactly what my Florence library aims to provide:

We can read the map created with Felt:

A special type is created (YournameDistances), and we can access spatial distances with properties created based on the map. We can check distances to a certain point at once:

Or just by typing their names:

Note that these are spatial distances that represent the shortest line between two points.
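To make the idea of a straight-line distance concrete, here is a minimal Python sketch based on the haversine (great-circle) formula. It is only an illustration: the post does not say how the Florence library computes its distances, and the function name and the coordinates below are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical focal points as (latitude, longitude); not taken from the Felt map
home = (51.5033, -0.0900)
work = (51.5074, -0.1278)

distance_to_work_km = haversine_km(*home, *work)
print(f"Straight-line distance to work: {distance_to_work_km:.2f} km")
```

A distance like this ignores streets, rivers, and transport, so it is only a rough proxy for how far a place really is.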

What if we want to know the real distances and times according to different traveling profiles?

You can use the Mapbox Directions API. To do so, you must use your own Mapbox token, but be very careful if you run it against many areas during a single month. It can either fail or (worse) exhaust your quota and lead to potentially large bills.

Warning:

Mapbox provides a generous free quota every month (100k directions), but use it only if you know what you want to achieve. At the time of writing this post, Mapbox doesn't support custom usage limits, so be careful.
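Before running against many areas, a rough request estimate can help. The Python sketch below assumes one Directions request per area, per focal point, per traveling profile. That is an assumption about how the calls add up, not documented behavior of the library. The helper names are hypothetical, and the 100k figure is the one quoted in this post, so check current Mapbox pricing before relying on it.

```python
FREE_REQUESTS_PER_MONTH = 100_000  # figure quoted in this post; may have changed


def estimated_requests(n_areas, n_focal_points, n_profiles=1):
    """Assumed cost model: one request per (area, focal point, profile) combination."""
    return n_areas * n_focal_points * n_profiles


def fits_free_quota(n_areas, n_focal_points, n_profiles=1):
    """True if the estimated number of requests stays within the monthly free quota."""
    return estimated_requests(n_areas, n_focal_points, n_profiles) <= FREE_REQUESTS_PER_MONTH


# Example: 500 areas, 6 focal points, 3 profiles (e.g. walking, cycling, driving)
print(estimated_requests(500, 6, 3))
print(fits_free_quota(500, 6, 3))
```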

Sample usage:

You can see that the generated type lets you check distances and times between points statically, in plain English. But this is only one side of it. You can also use it as a parameter to any function that expects a location (coordinate points). Like this:

Although it still looks like a mathematical formula, it resembles everyday language: "Live as far as possible from your work and as close as possible to your best friends or grandma."
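The original ranker is written in F# against the generated types. As a rough Python analogue of the same idea, the sketch below uses a hypothetical Distances class as a stand-in for the generated type, with all distances in kilometres. The field names and the formula itself are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Distances:
    """Hypothetical stand-in for the generated distances type (values in km)."""
    work: float
    best_friend: float
    grandma: float


def best_place_to_live(d: Distances) -> float:
    # Higher is better: being far from work adds to the score,
    # being far from a best friend or grandma subtracts from it.
    return d.work - d.best_friend - d.grandma


quiet_corner = Distances(work=5.0, best_friend=1.0, grandma=1.0)
next_to_office = Distances(work=0.5, best_friend=3.0, grandma=4.0)

print(best_place_to_live(quiet_corner))
print(best_place_to_live(next_to_office))
```

The function body reads almost like the sentence it encodes, which is the point of naming the fields after everyday places and people.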

If it doesn't look natural (and in most cases, it doesn't), you can write such a function differently:

In this case, you give a maximum of 10 points for each location but take into consideration only places closer than 4 km.
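One possible reading of this scoring rule, sketched in Python: each focal place contributes up to 10 points, and places at 4 km or farther contribute nothing. The linear fall-off between 0 and 4 km is an assumption, as the post does not specify the exact shape. The function and key names are hypothetical.

```python
def points_for(distance_km, max_points=10.0, cutoff_km=4.0):
    """Score one focal place: max_points at distance 0, falling linearly to 0 at the cutoff."""
    if distance_km >= cutoff_km:
        return 0.0  # places at or beyond the cutoff are not taken into account
    return max_points * (1.0 - distance_km / cutoff_km)


def total_points(distances_km):
    """Sum the points over all focal places, given a mapping of name to distance in km."""
    return sum(points_for(d) for d in distances_km.values())


# Hypothetical distances from one candidate area to three focal places
example = {"grandma": 1.0, "best_friend": 2.0, "gym": 6.0}
print(total_points(example))
```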

You may ask: this function calculates distances to my focal places in the city, but what location does it compare to?

The information goes with the value of type you pass into the function:

Having that, we can execute our "bestPlaceToLiveIn" function against any possible area.

So let's check which is the best place to live in the Southwark borough of London.

If we load it from GeojsonCloud we can use it as a regular sequence of places in F#:

And rank all Southwark areas as follows:
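Conceptually, ranking means applying the ranker function to every area and sorting by the result. Here is a minimal Python sketch of that step. The area names, distances, and the score function are made up for illustration and are not real Southwark data or the library's API.

```python
# Made-up areas with distances (in km) to two focal points
areas = [
    {"name": "Area A", "distances_km": {"work": 3.2, "grandma": 0.8}},
    {"name": "Area B", "distances_km": {"work": 1.0, "grandma": 2.5}},
    {"name": "Area C", "distances_km": {"work": 4.0, "grandma": 0.5}},
]


def score(area):
    """Hypothetical ranker: far from work is good, far from grandma is bad."""
    d = area["distances_km"]
    return d["work"] - d["grandma"]


# Best area first
ranked = sorted(areas, key=score, reverse=True)

for position, area in enumerate(ranked, start=1):
    print(position, area["name"], round(score(area), 2))
```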

Note that inside the ranker function, we can include other datasets and their (strongly-typed) properties:

Python time!

Python support in Polyglot notebooks is in preview, and we can only share variables between languages. At some point, other possibilities like sending commands (already possible between JS and .NET) will be available. However, sharing variables already gives us a lot.

We want to provide geojson content, created with F#, for Python to consume:

Note that one of each area's properties holds all the distances to our focal points, which were first drawn with the Felt tool and then used to rank the area with your life-inspired types.
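To give a feel for what the Python side receives, here is a minimal sketch of a GeoJSON FeatureCollection that carries a score and the distances as feature properties, parsed with the standard json module. The property names, the values, and the polygon are assumptions for illustration. The real content produced by the F# side may be structured differently.

```python
import json

# Hypothetical GeoJSON text, standing in for the value shared from the F# kernel
geojson_text = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {
                "type": "Polygon",
                # One closed ring of [longitude, latitude] pairs
                "coordinates": [[
                    [-0.10, 51.50], [-0.09, 51.50], [-0.09, 51.51],
                    [-0.10, 51.51], [-0.10, 51.50],
                ]],
            },
            "properties": {
                "name": "Area A",
                "score": 12.5,
                "distances": {"grandma": 1.0, "best_friend": 2.0},
            },
        }
    ],
})

collection = json.loads(geojson_text)

# Pull out (name, score) pairs for a quick look at the ranking results
rows = [(f["properties"]["name"], f["properties"]["score"])
        for f in collection["features"]]
print(rows)
```

If GeoPandas is installed, passing collection["features"] to geopandas.GeoDataFrame.from_features should turn the same features into a GeoDataFrame for further analysis.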

To make the results more granular, we can use more factors in the function:

Finding out which area is the best, or what value a specific place has, is hard when just looking at a visualization. This is where Pandas, GeoPandas, and NumPy shine:

Summary

I have just started to learn Python, so this is all I can present for now. Still, I'm sure regular Python developers can go pretty far with analyzing all this data: finding the best place to live, to rent a flat, to set up a business, the best place for... everything.

Enjoy this post?

Buy Paweł Stadnicki a coffee
