Essential Means for Urban Computing: Specification of Web-Based Computing Platforms for Urban Planning, a Hitchhiker’s Guide

This article provides an overview of the specifications of web-based computing platforms for urban data analytics and computational urban planning practice. There are currently a variety of tools and platforms that can be used in urban computing practices, including scientific computing languages, interactive web languages, data sharing platforms and still many desktop computing environments, e.g., GIS software applications. We have reviewed a list of technologies considering their potential and applicability in urban planning and urban data analytics. This review is not only based on the technical factors such as capabilities of the programming languages but also the ease of developing and sharing complex data processing workflows. The arena of web-based computing platforms is currently under rapid development and is too volatile to be predictable; therefore, in this article we focus on the specification of the requirements and potentials from an urban planning point of view rather than speculating about the fate of computing platforms or programming languages. The article presents a list of promising computing technologies, a technical specification of the essential data models and operators for geo-spatial data processing, and mathematical models for an ideal urban computing platform.


Introduction
In this article we focus on the applications of urban computing in Smart Cities Planning practice (as proposed by Batty et al., 2012). They suggest that there is a need for a paradigm shift in urban planning, from a focus on built environment problems to social problems such as deprivation, and their relations to space, spatial distributions and spatial planning. Considering the complexity of cities, they imply that there is a need to develop "a new science of human [spatial] behaviour". This paradigm shift towards developing new [spatial] sciences of cities can be facilitated by so-called urban computing practices, e.g., by facilitating access to large datasets on human spatial behaviour. This article seeks to illustrate the essential means of urban computing practice from a methodological point of view, i.e., the computational requirements for 1) developing scientific knowledge in the form of validated analytic/simulation models using spatial data and spatial relations; and 2) informing planning actions using the insight gained from analytic/simulation models on the effectiveness of actions.

What is Urban Computing?
It is difficult, and perhaps even futile, to provide a comprehensive definition of the emerging field of Urban Computing (e.g., as referred to in Kindberg, Chalmers, & Paulos, 2007; Zheng, Capra, Wolfson, & Yang, 2014) and the closely related field of Urban Informatics (e.g., as referred to in Foth, Choi, & Satchell, 2011). These two are umbrella terms for describing diverse practices involving geo-spatial data analysis related to cities and citizens. While the former has a technical connotation related to sensing, analysis and actuation technologies (Kindberg et al., 2007), the latter is more focused on the computational social sciences applied to the analysis of cities. Without attempting to provide a comprehensive definition, we choose to use the term urban computing with a broader scope to refer to all data-intensive 'computational workflows' that can be used for improving urban planning and urban decision-making by providing the means of data acquisition, analysis and simulation, e.g., to reduce traffic congestion or energy consumption. From a technical point of view, urban computing can involve acquisition, integration, and analysis of (big) data generated by diverse sources such as sensing technologies and large-scale computing infrastructures in the context of urban spaces. The volume, velocity and variety of such data often require the use of cloud computing infrastructure and software services (Hashem et al., 2015). Urban Computing is applicable in a variety of fields, namely:
• environmental studies (e.g., Shang, Zheng, Tong, Chang, & Yu, 2014; Zheng, Liu, & Hsieh, 2013);
• modelling energy use/generation (e.g., Simão, Densham, & Haklay, 2009);
• transport modelling (e.g., Zheng, Liu, Yuan, & Xie, 2011);
• monitoring health (e.g., Varshney, 2007);
• epidemiology (e.g., Lopez, Gunasekaran, Murugan, Kaur, & Abbas, 2015);
• social informatics (e.g., Foth, Forlano, Satchell, Gibbs, & Donath, 2011; Pires & Crooks, 2017);
• criminology (e.g., Bogomolov et al., 2014); and
• participatory planning (e.g., Robinson & Johnson, 2016; Tenney & Sieber, 2016).

Why Is Urban Computing Needed in Urban Planning?
In Urban Planning, we are often interested in analysing so-called what-if scenarios using simulations and projections (Batty & Torrens, 2001). Traditionally, the geospatial analysis of intervention scenarios, urban plans, and urban data is done by means of Geographic Information Systems (GIS), Planning Support Systems (PSS; see Batty, 2007; Harris & Batty, 1993) and Spatial Decision Support Systems (SDSS). PSS and SDSS are typically stand-alone desktop applications that have a database, a library of computational methods for geospatial data processing, and an interface. Despite the technical similarity of using a spatial database, the two categories differ in that SDSS are geared towards operational decision-making, whereas PSS are geared towards strategic planning, which often involves land-use planning and thus requires the consideration of land-use transport interactions (the distinction between PSS and SDSS is from Geertman & Stillwell, 2009). In these systems, there exist some workflows for spatial analysis of urban data, which do not require new ground-breaking technology. However, the promise of urban computing lies in the potential of web-based computing platforms for developing a new generation of shareable and editable geo-spatial data processing workflows for informing decisions in urban planning. From the urban computing applications listed in Section 1.1, it can be seen that so far urban computing technologies have mostly been applied in operational and managerial contexts (based on the definition of urban planning actions; Couclelis, 2005). For a wider adoption of urban computing practices in strategic urban planning, urban computing platforms must provide the essential means of analysis and simulation procedures needed in PSS.
Although most of the scholarly works in the area of PSS are focused on land-use change, there are other aspects of urban dynamics that could be modelled computationally. That is to say, the broader discussion is on what changes can be explained, anticipated, and taken into account when making strategic decisions on spatial plans; this broader field of research and development is called Urban Modelling (Batty, 2009). Considering the nature of the outcomes of planning processes (e.g., land-use plans), we can observe that the spatial relations between land-use distributions and a variety of phenomena need to be considered while making strategic planning decisions: for instance, land-use and transport interactions and their effects on energy use in transport (see Keirstead, Jennings, & Sivakumar, 2012), and the effect of land-use distribution on bio-diversity and the use of natural resources (especially water), should ideally be considered when proposing plans. From a pragmatic point of view, however, the adoption of PSS in practice is not high (Geertman & Stillwell, 2009):
"It is disturbing, in fact, to observe the extent to which new computer-based support systems are developed by researchers to the point of adoption but are never implemented in planning practice or policy making. Similarly, there is evidence to indicate that systems which are made operational are not extensively used, after the initial novelty has passed, by those planning organizations for which they have been developed in the first instance. In terms of application, it is possible to point to more failures than successes, i.e., to more cases where systems have not been implemented than examples where they are used routinely. Moreover, many state-of-the-art systems appear to take a long time to reach the 'market' and this is often a process requiring considerable financial resources."
We suggest that the research and development culture of Spatial Planning and Decision Support Systems (SPDSS, terminology of Geertman & Stillwell, 2009) must adopt open-source and agile development principles for effective 'market' uptake and for ensuring the viability of the R&D products (Crowston & Howison, 2005; Hey & Payne, 2015; Pressman & Roger, 2009; von Krogh, 2003). Adopting urban computing practices will ease the utilization of scientific knowledge in planning practice, because web-based computing platforms facilitate rapid prototyping, development, release, sharing, and testing of SPDSS (incorporating a variety of Urban [Analysis/Simulation] Models).

Problem Statement
Although much can be said about the graphical user interfaces of GIS applications, we do not focus on them, because these interfaces are generally geared towards manual operations. Instead, our focus is on the essential means for developing 'geo-spatial computing workflows'. Workflows can be as simple as routines of sequential actions or more sophisticated procedures with flow-control mechanisms, which are better known as algorithms (see Figures 1 and 2 for workflow examples). There are two types of challenges in using the currently available GIS desktop applications for innovative inter-disciplinary research in Urban Computing applied in Urban Planning (i.e., Design and Development of Web-Based SPDSS):
• Data-Related Challenges:
- Data-Availability: how easy is it to acquire a relevant dataset?
- Data-Interoperability: how easy is it to read/write datasets from/to file formats?
- Data-Mergeability: how easy is it to overlay multiple datasets?
• Workflow-Related Challenges:
- Workflow Comprehensibility: to what extent is the whole workflow understandable?
- Workflow Editability: how easy is it to modify the workflow explicitly?
- Workflow Repeatability: how easy is it to repeat a certain data processing workflow?
- Workflow Shareability: how easy is it to share a workflow from one system to another?
- Workflow Scalability: how easy is it to process large datasets with a workflow?
- Workflow Sustainability: to what extent is the workflow modular and recyclable?
A rather neglected matter about SPDSS is the very social/human process of developing them. We argue that there are three determining factors to consider with regard to 'the suitability of a computing technology for urban computing', i.e., the availability and quality of:
1. Visual Dataflow Programming
2. Spatial Computing Libraries
3. Internet of Things (IoT) APIs

Visual Dataflow Programming
It is well known that the time spent on research and development is often much more valuable than the computation time. Therefore, we need to consider human interface requirements with regard to the ease of ideation-development-test cycles (prototyping). We propose that, with a dataflow programming platform, the user can interact with the platform knowing only a common programming language to edit the nodes (blocks of code) and a handful of UI manoeuvres to get started, without having to learn a sophisticated UI. In processing big data, there are two generic approaches, namely batch processing and real-time processing (Hashem et al., 2015). Considering the real-time data processing requirement, especially in dealing with managerial and operational planning actions, we can conclude that Dataflow Programming is an appropriate paradigm for setting up an R&D/prototyping environment (Blackstock & Lea, 2014; Szydlo, Brzoza-Woch, Sendorek, Windak, & Gniady, 2017). Considering the sustainability and repeatability of the workflow, it is practical to adopt a modularization and standardization approach to workflow development. Standardization is important for reusability. Specifically, the code-blocks (alias nodes, blocks, or subsystems) of a workflow must input and output data in formats readable by one another. Of course, having a visual overview of the workflow is of high added value, as it makes the workflow as intuitive as a flowchart. The idea of a visual dataflow programming language is to represent the high-level logic of a program/workflow as a graph of nodes, which are blocks of (reusable/shareable) code. Representing the high-level logic as a graph makes it easy for a group of developers working on a workflow to focus on the complex big picture. Instead of developing a complete software application with a graphical user interface, a research software engineer can focus on the core of the workflow, model the workflow, test it, share it, and release it as a functional prototype.
If the workflow description language is a (de facto) standard, the intended user does not need to learn a new interface to interact with the workflow. In other words, instead of focusing on optimizing a new software application in terms of its interface and computational efficiency, more attention can be paid to the effectiveness of the workflow itself. In addition, if the workflow is also cloud-based, then it will be easier to share it and collaborate on-line in real-time.
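In short, adopting a visual cloud-based dataflow processing language (and ecosystem) brings about a few advantages:
• Automation of repetitive tasks for data cleansing, validation, etc.;
• Informal and yet sustainable standardization based on common practices and bottom-up emergence of workflow patterns;
• Sharing workflow pattern solutions instead of reinventing the wheel;
• The possibility of interdisciplinary collaboration;
• Ultimate modularization of workflows based on sharing nodes/blocks of code;
• Agile development-test-release cycles;
• Promotion of Open-Source development practices and therefore rapid progress;
• Ensuring re-usability and repeatability of workflow-based practices such as spatial analyses;
• Saving time by significantly reducing the time and effort in re-inventing interfaces;
• Raising the level of comprehensibility of analytic workflows by providing a glass-box view of the process (as opposed to black-box SPDSS); and
• The possibility of public participation in planning processes by means of rapid development and integration of apps (e.g., using Node-RED, a visual data-flow programming tool for wiring together hardware devices, APIs and online services, see Figure 2).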
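To make the notion of a workflow as a graph of reusable nodes more concrete, the following is a minimal sketch in Python. It is not the API of Node-RED or of any GIS modeller; the `Workflow` class, its method names, and the parcel records are hypothetical and serve only to illustrate how nodes that agree on input and output formats can be wired together and evaluated in dependency order.

```python
from collections import deque
from typing import Any, Callable, Dict, List, Tuple


class Workflow:
    """A tiny dataflow graph: nodes are functions, wires carry their outputs."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Callable[..., Any]] = {}
        self.inputs: Dict[str, List[str]] = {}  # node name -> upstream node names

    def add_node(self, name: str, func: Callable[..., Any],
                 inputs: Tuple[str, ...] = ()) -> None:
        self.nodes[name] = func
        self.inputs[name] = list(inputs)

    def run(self) -> Dict[str, Any]:
        """Evaluate every node once, in topological order (Kahn's algorithm)."""
        pending = {name: len(ups) for name, ups in self.inputs.items()}
        ready = deque(name for name, count in pending.items() if count == 0)
        results: Dict[str, Any] = {}
        while ready:
            name = ready.popleft()
            args = [results[up] for up in self.inputs[name]]
            results[name] = self.nodes[name](*args)
            # Release downstream nodes whose inputs are now all available.
            for other, ups in self.inputs.items():
                if name in ups:
                    pending[other] -= ups.count(name)
                    if pending[other] == 0:
                        ready.append(other)
        if len(results) != len(self.nodes):
            raise ValueError("workflow contains a cycle or a missing input")
        return results


# Hypothetical parcel records; every node reads and writes lists of dicts.
flow = Workflow()
flow.add_node("read", lambda: [
    {"id": 1, "area_m2": 120.0},
    {"id": 2, "area_m2": 80.0},
    {"id": 3, "area_m2": 300.0},
])
flow.add_node("filter", lambda rows: [r for r in rows if r["area_m2"] >= 100.0],
              inputs=("read",))
flow.add_node("total", lambda rows: sum(r["area_m2"] for r in rows),
              inputs=("filter",))
results = flow.run()
```

Because each node only depends on the agreed data format of its inputs, a node such as `filter` could be replaced or shared without touching the rest of the graph, which is the kind of modularity argued for above.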

Spatial Computing Libraries
Here we provide an overview of the requirements of a software application for urban computing and focus on the specific functionalities that deal with geo-spatial data. Geo-spatial data can be analysed in at least five spatial forms, from the most concrete to the most abstract:
• Geographical Data Models: geographically positioned points, lines, polygons, and polyhedrons;
• Geometrical Data Models: points, lines, polygons, and polyhedrons (in local coordinate systems);
• Topological Data Models: vertices, edges, faces, and bodies (algebraic/combinatorial topology);
• Graphical Data Models: objects and links (Graph Theory); and
• Spectral Data Models: eigenvectors and eigenvalues.
The last category of data models is newer than the other types and is used for modelling the dynamics of diffusion flows and Markov Processes in networks (Nourian, 2016; Nourian, Rezvani, Sariyildiz, & van der Hoeven, 2016; Volchenkov & Blanchard, 2007; Wei & Yao, 2014). Performing spectral analyses requires using a computational linear algebra library such as NumPy. Generally, considering the inter-disciplinary nature of urban computing, evident in the breadth and variety of practices mentioned in Section 1.1, we propose that scientific and numerical computing libraries must be available in an ideal platform for urban computing.
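As an illustration of what a spectral data model looks like in practice, the sketch below uses NumPy to treat a small street network as a Markov process (a simple random walk) and extracts its stationary distribution from an eigenvector. The four-junction network is a hypothetical example, and this is only one of many possible spectral analyses, not the specific method of the works cited above.

```python
import numpy as np

# Hypothetical street network with four junctions; entry [i, j] is 1 when
# junctions i and j are linked by a street segment (undirected graph).
adjacency = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

degrees = adjacency.sum(axis=1)

# Row-stochastic transition matrix of a simple random walk on the network.
transition = adjacency / degrees[:, None]

# The stationary distribution is the left eigenvector for eigenvalue 1,
# i.e. an ordinary (right) eigenvector of the transposed matrix.
eigenvalues, eigenvectors = np.linalg.eig(transition.T)
leading = np.argmin(np.abs(eigenvalues - 1.0))
stationary = np.real(eigenvectors[:, leading])
stationary = stationary / stationary.sum()  # normalise to probabilities
```

For a connected undirected network such as this one, the stationary probability of each junction is proportional to its degree, which offers a simple sanity check on the eigenvector computation.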
In Table 1, we have shown the computational modules required to make spatial analysis and spatial simulation models, which are, in other words, the essential data models and operations in geo-spatial data processing for urban computing. Central to this schema are the three distinct ways of modelling space as:
• Manifolds (often approximated as simplicial complexes);
• Grids (a.k.a. 2D/3D raster data models, see Zlatanova, Nourian, Gonçalves, & Vo, 2016);
• Networks (a.k.a. [directed/weighted] graphs).
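The relation between two of these ways of modelling space can be sketched with the Python standard library alone. The example below, in which the raster, the function names, and the 4-neighbour rule are all illustrative assumptions, converts a small grid of walkable cells into a network and then measures step distances on that network with a breadth-first search.

```python
from collections import deque

# Hypothetical 2D raster: 1 marks a walkable cell, 0 an obstacle (e.g., a building).
grid = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]


def grid_to_network(raster):
    """Return an adjacency dict linking each walkable cell to its 4-neighbours."""
    rows, cols = len(raster), len(raster[0])
    network = {}
    for r in range(rows):
        for c in range(cols):
            if raster[r][c] != 1:
                continue
            neighbours = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and raster[nr][nc] == 1:
                    neighbours.append((nr, nc))
            network[(r, c)] = neighbours
    return network


def hop_distances(network, source):
    """Breadth-first search: number of steps from source to every reachable cell."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        cell = queue.popleft()
        for neighbour in network[cell]:
            if neighbour not in distances:
                distances[neighbour] = distances[cell] + 1
                queue.append(neighbour)
    return distances


network = grid_to_network(grid)
distances = hop_distances(network, (0, 0))
```

The same adjacency structure could be written as a matrix and handed to a linear algebra library, which is one reason why grids and networks can share much of their analytic tooling.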
In Figure 3, we have categorized the functionalities specifically required for spatial computing according to the previously introduced fields of application of urban computing. There we give an overview of exemplary types of analysis or simulation models for planning support workflows, together with their typical goals and required data models, for each of the previously listed areas of application.

IoT APIs
IoT for smart environments is defined by Gubbi, Buyya, Marusic, and Palaniswami (2013) as follows:
"Interconnection of sensing and actuating devices providing the ability to share information across platforms through a unified framework, developing a common operating picture for enabling innovative applications. This is achieved by seamless large scale sensing, data analytics and information representation using cutting edge ubiquitous sensing and cloud computing."

Table 1. A list of typical goals, required spatial data types, and analytic (mathematical) or simulation (computational) modelling approaches of urban computing.
IoT applications can be used for acquisition of data from sensors. They can also be used to directly control some dynamics of cities, such as traffic lights. The devices needed for enabling control of physical things are called actuators or actuating devices. The electronic devices that connect sensors and actuators to the internet can be micro-controllers or micro-computers, some of which, such as Arduino and Raspberry Pi, are open devices popular among amateur enthusiasts. The capability of a computing technology to interact with such devices can be a key factor in making it more pervasive among enthusiast makers and academic software developers, due to the accessibility of such devices in terms of low prices and ease of learning. Operational planning actions can especially benefit from actuators and sensors in urban environments. For instance, traffic lights can be actuated (controlled) by a controller system connected to many sensors and actuators in real time (thus having a real-time overview of a city), continuously analysing the data coming from sensors that measure the volume of traffic. In other words, IoT devices can facilitate (real-time) operational planning actions. With regard to the potential of IoT for Urban Computing, it is logical to assume that Web-based GIS services (alias web mapping) are necessary for urban computing. In addition, moving all workflows from desktop applications to web-based platforms makes it easier to share (standardized) workflows and collaborate on them. In the next section we focus on the potential of four programming languages for setting up web-based computational workflows for geo-spatial data analytics and simulations.
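The traffic-light example can be sketched as a single sense-analyse-actuate cycle. The code below is a simplified illustration in plain Python: the `TrafficLight` class stands in for a real actuator, the vehicle counts stand in for sensor readings, and the linear rule with its parameters (`base`, `per_vehicle`, `maximum`) is a hypothetical choice rather than an established traffic control policy.

```python
from dataclasses import dataclass


@dataclass
class TrafficLight:
    """Stand-in for an actuator; a real one would sit behind a device API."""
    green_seconds: int = 30


def choose_green_time(vehicle_counts, base=20, per_vehicle=2, maximum=90):
    """Map recent sensor readings to a green-phase duration in seconds."""
    if not vehicle_counts:
        return base
    average = sum(vehicle_counts) / len(vehicle_counts)
    return min(maximum, base + int(per_vehicle * average))


def control_step(light, vehicle_counts):
    """One sense-analyse-actuate cycle: read counts, decide, set the light."""
    light.green_seconds = choose_green_time(vehicle_counts)
    return light.green_seconds


light = TrafficLight()
quiet = control_step(light, [2, 3, 1])     # low traffic volume
busy = control_step(light, [40, 50, 45])   # high volume, capped at the maximum
```

In a deployed system the readings would arrive continuously from networked sensors and the decision would be sent back to the device, so this loop would typically live inside a dataflow node or a web service rather than a script.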

Promising Technologies for Urban Computing
We have identified a few promising technologies for urban computing, based on the Python, Java, JavaScript and R-Spatial languages. From a practical perspective, we consider their potential in terms of ease of prototyping, geo-spatial mapping, 3D visualization, handling big data, and numerical computing (computational linear algebra). From a mathematical/computational point of view, all required models mentioned in Figure 3 can be rather easily developed on top of a robust computational linear algebra library. Apart from numerical capabilities, we argue that for a research software engineer, the visualization and mapping capabilities are essential to consider while making technical choices.

Python
This programming language is used, for example, in Geoda-Web, the web-based version of CAST, which, with its spatial analysis library PySal, seems to be a promising open-source project. Python is the de facto language of open-source development in the field of Geo-information science, e.g., in QGIS, Rasterio and Fiona. Python provides a wide range of libraries for numerical and scientific computing, such as NumPy, SciPy and Pandas, which facilitate development. Interactive development environments such as IPython (Interactive Python) (Perez & Granger, 2007) and web-based Jupyter notebooks (Shen, 2014) seem to be promising technologies for prototyping and interactive computing. Some universities have started facilitating the use of Jupyter interactive documents as a common means of exchanging reproducible research products; e.g., JupyterHub, NBViewer, and SURF-sara (Templon & Bot, 2016) provide hosting and viewing services for sharing Jupyter notebooks. A few options which stand out for simple 3D visualization in Python are MatPlotLib, Mayavi and VisPy, while more high-performance applications can be built in OpenGL using PyOpenGL. Web mapping in Python is possible by means of GeoDjango.

Java
This programming language is used, for example, in a web-GIS for environmental analyses by Zavala-Romero et al. (2014). The FIWARE platform (Zahariadis et al., 2014) offers an "Application MashUp Generic Enabler", i.e., WireCloud, for visual programming and prototyping of web applications. Another flow-based programming environment for Java development supported by Apache Hadoop is NiFi. Java can also provide for interactivity and 3D visualization. The Open Source Geospatial Foundation (aka OSGeo) also provides an open-source GIS toolkit for Java called GeoTools. Considering the might of Hadoop for big data analytics and the support of OSGeo, Java seems to be a fertile language for urban computing. One option for 3D visualization in Java is JogAmp, while a more advanced option is JOGL.

JavaScript
This programming language is used, for example, in OpenLayers and the Carto SaaS (Software as a Service, formerly known as CartoDB) to provide user-friendly Web-GIS tools, which can moreover be deployed as desktop applications with tools like Electron. However, neither of them supports explicit workflow development. The other promising JavaScript platform for spatial analysis is MapBox, which offers access to the Turf library. Node-RED (Blackstock & Lea, 2014), based on IBM BlueMix (a.k.a. IBM Cloud), seems to be a promising technology in terms of visual programming and the ease of prototyping IoT applications. Node-RED is distributed as part of an open-source software ecosystem called node package manager or NPM, which is managed by the Node.js foundation. Interactive visualization in web browsers is well supported in JavaScript, and arguably more advanced than comparable libraries in Python, thanks to the D3.js library by Mike Bostock (Bostock, Ogievetsky, & Heer, 2011). In addition to D3 for interactive graphics, there is three.js for WebGL rendering in the browser. Other JavaScript libraries which should not go unnoticed for urban computing are Leaflet (mobile-friendly interactive maps providing access to OSM) and Cesium, the latter providing for quality 3D visualization.

R Spatial
R is a programming language that is part of the R Project for Statistical Computing, which includes a complete set of vector algebra operations and functions to create graphics such as plots. The statistical functions in R are much more complete than those available in other languages (e.g., Python). The R Spatial functionality includes the parts most relevant for urban computing, such as representations for raster and vector data, dealing with coordinate systems and creating 2D maps. Spatial.ly shows several examples of the more advanced visualisation functions in R, including 3D visualisation and animated globes. Shiny is a tool to build web apps with R. There are also other ways in which web sessions of R can be deployed, such as with Rweb and rApache. As with Python, Jupyter notebooks can also be used, thanks to the IRkernel.

Conclusion
In response to this question: "What are the essential means for urban computing?", we have provided an overview of specific data models and functionalities required in dealing with geo-spatial data processing (spatial analysis and spatial simulation), referred to as spatial computing in Figure 3 and Table 1, which we deem to be the essential means for urban computing. We have considered four programming languages and their promising aspects for urban computing. They all come with their own advantages and shortcomings. It is difficult (and perhaps futile) to point to one of these languages as the most promising language for urban computing. We stress that these technologies are not mutually exclusive, but can (in some cases) be used in combination with each other. For example, a web-based GIS system could use a Python backend with Flask and a JavaScript frontend with a 3D visualiser based on Cesium, or a processing pipeline could use Python to fetch data from the web using a tool like BeautifulSoup, use Java to parse and process the data, use R to do statistical analysis on it, and then visualize the results in a browser using JavaScript. However, it can be said that each of them is stronger in a certain direction, respectively: Java in server-side tools, R Spatial in statistical and mathematical operations, Python in the availability of GIS tools, and JavaScript in IoT and web visualisation. Their respective strengths can be combined by using the best language for each task.
In addition, it is perhaps noteworthy that in the related field of computer-aided design (CAD), there is an active movement towards the development of visual programming languages and connecting them together by means of a cloud platform, e.g., Flux, initially sponsored by Google. Considering the attractiveness of aligning urban design and urban planning actions, it would be ideal to work in an environment where planners, designers, and research software engineers could all work and share their workflows. For example, a 3D city modelling SaaS such as Möbius (Janssen, Li, & Mohanty, 2016), Tygron or CityZenith could potentially become such a shared development environment.

Figure 1. Two examples of geo-spatial data processing workflows from QGIS Processing Modeller (top) and ArcGIS Model Builder (bottom), respectively made for calculating area of water within 25 metres of urban roads (tutorial), and finding suitable locations for urban parks (tutorial).
Automation of repetitive tasks for data cleansing, validation, etc.; • Informal and yet sustainable standardization based on common-practices and bottom-up emergence of workflow patterns 7 ; • Sharing workflow pattern solutions instead of reinventing the wheel; • The possibility of interdisciplinary collaboration; • Ultimate modularization of workflows based on sharing nodes/blocks of code; • Agile development-test-release cycles; • Promotion of Open-Source development practices and therefore rapid progress; • Ensuring re-usability and repeatability of workflow-based practices such as spatial analyses; • Saving time by significantly reducing the time and effort in re-inventing interfaces; • Raising the level of comprehensibility of analytic workflows by providing a glass-box view of the process (as opposed to black-box SPDSS); and • The possibility of public participation in planning

Figure 2. Data processing workflow examples, respectively from top left, clockwise: Node-RED, editable by JavaScript (picture from Boyd, 2015), QGIS Graphical Modeller, Anaconda Orange3, and ArcGIS Model Builder, all of which offer Python APIs. The GIS dataflow programming environments make it easy to automate routines, share them, and use standard modules; however, the installation procedures, their domain-specific nature and their UI make them much less accessible than the two all-purpose data-flow programming environments shown.
These systems can be developed by Research Software Engineers. A typical research software developer is not necessarily a software engineer, but usually a domain-specific researcher who can develop software or computational workflows. A typical research software engineer often does not have the means of a software vendor to develop a large application with a custom-made GUI. The core of the work of research software development is developing analytic workflows.
What Do We Need for Urban Computing?
Essential mapping operations and data models required for geo-spatial computing.