Apache Sedona

Original author(s)	Jia Yu, Mohamed Sarwat
Developer(s)	Apache Sedona
Stable release	1.2.0
Repository	github.com/apache/incubator-sedona/
Written in	Java, Scala, Python, R
License	Apache-2.0 license
Website	sedona.apache.org

Apache Sedona, formerly known as GeoSpark^[1], is an open-source cluster computing system for managing large-scale spatial data. It joined the Apache Software Foundation in 2020 and is now in the incubation phase.


Description

Apache Sedona supports analyzing spatial data using general-purpose programming languages such as Java, Scala, Python, and R, as well as declarative languages such as SQL. It extends cluster computing systems such as Apache Spark and Apache Flink with a set of out-of-the-box distributed Spatial Datasets and Spatial SQL for loading, processing, and analyzing large-scale spatial data efficiently across machines. It supports loading data from heterogeneous sources, including but not limited to CSV, GeoJSON, GeoTIFF, WKT, and ESRI Shapefile. Spatial datasets may contain different types of geometry objects such as Point, Line, Circle, Polygon, and MultiPolygon. Apache Sedona has built-in support for all basic geometry types^[2].
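To make the idea of loading WKT data and filtering it with a spatial predicate concrete, the following is a minimal single-machine sketch in plain Python. It does not use Apache Sedona or any of its APIs; the names parse_wkt_point and range_query are hypothetical helpers introduced only for illustration, and the sketch handles nothing beyond two-dimensional WKT points and a rectangular query window. Sedona's contribution is running this kind of operation in a distributed fashion over much larger datasets.

```python
import re
from typing import List, Tuple

Point = Tuple[float, float]

# Matches 2-D WKT points such as "POINT (1.5 -2)" (hypothetical, simplified parser).
_WKT_POINT = re.compile(
    r"^\s*POINT\s*\(\s*([^\s()]+)\s+([^\s()]+)\s*\)\s*$", re.IGNORECASE
)


def parse_wkt_point(wkt: str) -> Point:
    """Parse a 2-D WKT point into an (x, y) tuple."""
    match = _WKT_POINT.match(wkt)
    if match is None:
        raise ValueError(f"not a WKT point: {wkt!r}")
    return float(match.group(1)), float(match.group(2))


def range_query(points: List[Point], min_x: float, min_y: float,
                max_x: float, max_y: float) -> List[Point]:
    """Return the points that fall inside the rectangle, borders included."""
    return [
        (x, y) for (x, y) in points
        if min_x <= x <= max_x and min_y <= y <= max_y
    ]


# Illustrative input rows, as they might appear in a WKT column of a CSV file.
wkt_rows = ["POINT (1 1)", "POINT (5 5)", "POINT (2.5 3)"]
points = [parse_wkt_point(row) for row in wkt_rows]
inside = range_query(points, 0.0, 0.0, 3.0, 3.0)
```

Here the query window spans (0, 0) to (3, 3), so the point at (5, 5) is filtered out while the other two are kept.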

Since Apache Sedona is a general-purpose geographical data processing engine, it can be used for a wide range of spatial applications. Example applications include generating regional heat maps, mining spatial co-location patterns, calculating the average rating of neighboring schools, and calculating the spatial autocorrelation of neighboring regions for properties such as housing prices, traffic volume, traffic inflow and outflow, and crowd volume.^[3]
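One of the applications above, the average rating of neighboring schools, amounts to a distance join followed by an aggregation. The sketch below shows that logic on a single machine in plain Python, assuming planar coordinates and Euclidean distance. The function average_nearby_rating and the sample data are hypothetical and are not part of Apache Sedona; a distributed engine would partition and index the data rather than compare every pair.

```python
from math import hypot
from typing import Dict, List, Optional, Tuple

Location = Tuple[float, float]
School = Tuple[float, float, float]  # (x, y, rating)


def average_nearby_rating(houses: Dict[str, Location],
                          schools: List[School],
                          radius: float) -> Dict[str, Optional[float]]:
    """For each house, average the ratings of schools within `radius`.

    Houses with no school in range map to None.
    """
    result: Dict[str, Optional[float]] = {}
    for name, (hx, hy) in houses.items():
        # Distance join: keep ratings of schools whose distance is within the radius.
        ratings = [
            rating for (sx, sy, rating) in schools
            if hypot(sx - hx, sy - hy) <= radius
        ]
        # Aggregation: mean rating, or None when nothing matched.
        result[name] = sum(ratings) / len(ratings) if ratings else None
    return result


# Made-up sample data for illustration only.
houses = {"a": (0.0, 0.0), "b": (10.0, 10.0)}
schools = [(1.0, 0.0, 4.0), (0.0, 2.0, 2.0), (50.0, 50.0, 5.0)]
averages = average_nearby_rating(houses, schools, radius=3.0)
```

House "a" has two schools within the radius, so it receives their mean rating, while house "b" has none and maps to None.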

Supported Spatial Operations

Apache Sedona supports the following operations used for spatial data analysis and spatial data mining:

References

[1] Yu, Jia; Wu, Jinxuan; Sarwat, Mohamed (November 2015). "GeoSpark: a cluster computing framework for processing large-scale spatial data". Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems.
[2] Yu, Jia; Zhang, Zongsi; Sarwat, Mohamed (22 October 2018). "Spatial data management in Apache Spark: the GeoSpark perspective and beyond". GeoInformatica. 33 (9): 1064–1073.
[3] Sarwat, Mohamed. "Introducing Apache Sedona: Where 'Big Data' meets geospatial data". Retrieved 10 May 2022.
