Year : 2020  |  Volume : 64  |  Issue : 2  |  Page : 191-197  

Using open-source data to explore distribution of built environment characteristics across Kerala, India

1 PhD Scholar, Achutha Menon Centre for Health Science Studies, SCTIMST, Thiruvananthapuram, Kerala, India
2 Professor, Achutha Menon Centre for Health Science Studies, SCTIMST, Thiruvananthapuram, Kerala, India
3 Scientist B, Achutha Menon Centre for Health Science Studies, SCTIMST, Thiruvananthapuram, Kerala, India

Date of Submission: 20-Sep-2019
Date of Decision: 12-Nov-2019
Date of Acceptance: 29-Apr-2020
Date of Web Publication: 16-Jun-2020

Correspondence Address:
Joanna Sara Valson
Achutha Menon Centre for Health Science Studies, SCTIMST, Thiruvananthapuram - 695 011, Kerala

Source of Support: None, Conflict of Interest: None

DOI: 10.4103/ijph.IJPH_472_19


Background: Built environment characteristics in the neighborhood are of utmost priority for a healthy lifestyle in fast-urbanizing countries. These characteristics are closely linked to the disease burden and challenges in low- and middle-income countries (LMICs), but have remained unexplored using open-source data. Current technology offers online resources and open-source software that enable researchers to explore built environment characteristics in relation to health and allied phenomena.

Objectives: This article intends to delineate methods to capture available and accessible objective built environment variables for a state in India and determine their distribution across the state.

Methods: Built environment variables such as population density and residential density were collated from the Census of India. Safety from crime and traffic were captured as crime rates and pedestrian accident rates, respectively, acquired from the State Crime Records Bureau. Greenness, built-up density, and land slope were gathered from open-source satellite imagery repositories. Road intersection density was derived from OpenStreetMap. Processing and analysis differed for each dataset depending on its source and nature.

Results: Each variable showed a distinct pattern across the state. Population and residential density were found to be closely related to each other across both districts and subdistricts. They were both positively related to crime rates, pedestrian accident rates, built-up density, and intersection density, and negatively related to land slope and greenness across the subdistricts.

Conclusion: Delineating the distribution of built environment variables using available and open-source data in resource-poor settings is a first in public health research among LMICs. The cost-effectiveness and reproducible nature of open-source solutions could equip researchers in resource-poor settings to identify built environment characteristics and patterns across regions.

Keywords: Built environment, distribution, geographical information systems, low-and-middle-income countries, open-source, public health

How to cite this article:
Valson JS, Kutty V R, Soman B, Jissa V T. Using open-source data to explore distribution of built environment characteristics across Kerala, India. Indian J Public Health 2020;64:191-7

How to cite this URL:
Valson JS, Kutty V R, Soman B, Jissa V T. Using open-source data to explore distribution of built environment characteristics across Kerala, India. Indian J Public Health [serial online] 2020 [cited 2021 Dec 5];64:191-7. Available from:

Introduction

The place where we live matters a great deal. The relationship between health and place has been of great interest in the present health scenario. Broadly, the built environment encompasses the places where we live that have been modified by people. It is inclusive of indoor and outdoor physical environments (including climate and air quality) and social environments, which comprise civic participation and community investment. Another definition says that “it includes man-made buildings, infrastructures and cultural landscapes that constitute the physical, natural, economic, social, and cultural capital of a society.”[1] Yet another useful definition states “built environment consists of all buildings, spaces, and products that are created or modified by people.”[2] Research on the relationship of the built environment with health has broadly covered the built environment and physical activity, built environments and food, built environments and mental health, and urban planning and health. The variables documented to capture the built environment range from population density, residential density, land-use mix, street connectivity, greenness, land slope, safety from crime, and safety from traffic to measures of food environments (density of restaurants, distance to food destinations, etc.). Furthermore, types of data used for measuring built environment characteristics encompass objective measures (e.g., systematic scans or audits), perceived measures (e.g., by personal interviews or questionnaires), and archived datasets analyzed using geographical information systems (GIS).[3]

Large epidemiological studies have been undertaken in developing nations to estimate the prevalence and identify contributing factors of communicable and noncommunicable diseases. However, the relationship of health conditions with built environment features has not been explored adequately in these nations. Recent advancements in spatial analysis, capacity building, and availability of spatial data have been crucial for researchers in low- and middle-income countries (LMICs) to spearhead studies on the built environment. Besides, open-source solutions such as Quantum GIS (QGIS), Google Earth, and OpenStreetMap (OSM), together with satellite data availability, provide cost-effective platforms to undertake research involving spatial data.[4] Advancements in geospatial applications, enhanced computational power, and increased availability of spatial data have empowered researchers to incorporate GIS to capture objective built environment measures over large areas, using publicly available data.[5],[6] Taking into account the recent trend toward building healthy communities, this methodology using GIS and available data would also help gauge communities/neighborhoods based on land use and safe infrastructure to walk/bicycle.[7]

Against this backdrop, this study aimed to: (a) delineate methods to capture built environment variables using open-source solutions for a state in India and discuss the challenges thereof, and (b) examine the distribution of built environment variables across Kerala.

Materials and Methods

Study site

Kerala, one of the states with a high epidemiological transition level, lies at the southwestern tip of India. It has 14 districts with 64 sub-districts. It is topographically diverse, with lowlands along the western coast extending toward midlands and highlands in the eastern regions.[8] It is well known for its biodiversity and very high social indicators among the Indian States.[9] In terms of health indicators, Kerala fares excellently in neonatal and maternal mortality rates but faces the greatest challenge in curbing lifestyle diseases, including diabetes, coronary heart disease, renal disease, cancer, and geriatric problems.[10] The urban share of the total population in Kerala has nearly doubled from 26% in 1991 to 48% in 2011, which is the highest in India.[11] Problems due to unplanned urbanization continue to prevail in urban Kerala, including the rise in transportation costs, urban poverty, and urban sanitation problems.[12]

Data sources

This study has compiled data from the following sources:

Census data

Population density and residential density were obtained from the Census of India, 2011.

State Crime Records Bureau

Crime rates and pedestrian accident rates were accessed from the State Crime Records Bureau, the authorized data holding agency under the Government of India directive. Rates were available for all 498 police stations in Kerala. Each police station was linked to the corresponding district and sub-district using jurisdiction details from the corresponding Kerala Police station websites. All 498 police station websites were visited to confirm their jurisdiction. Crime was defined as total crimes inclusive of all cognizable crimes in the Indian Penal Code.[13]

Spatial data

Greenness was measured using the normalized difference vegetation index (NDVI), and built-up density was estimated using the normalized difference built-up index (NDBI) from Landsat 8 images accessed from the United States Geological Survey archives.[14],[15] The land slope was measured from digital elevation model data retrieved from Shuttle Radar Topography Mission (SRTM) 90 m resolution images through the Consortium for Spatial Information.[16],[17] Intersection density was calculated as the number of three-way or more road intersections per square kilometer area in a district or sub-district. This was captured from the road network layer for the state of Kerala using OSM.[18],[19] The timeline for data capture was between February and April 2018.
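As an illustration of the two spectral indices named above, the following minimal sketch computes them per pixel from reflectance values. The source does not state which bands or software steps were used; the sketch assumes the standard definitions, NDVI = (NIR − Red)/(NIR + Red) and NDBI = (SWIR1 − NIR)/(SWIR1 + NIR), which for Landsat 8 correspond to bands 4 (red), 5 (NIR), and 6 (SWIR1). The function names and pixel values are hypothetical.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    total = a + b
    if total == 0:
        return None  # treat as no-data rather than dividing by zero
    return (a - b) / total


def ndvi(nir, red):
    """Greenness: high for vegetation, low for bare or built surfaces."""
    return normalized_difference(nir, red)


def ndbi(swir1, nir):
    """Built-up index: tends to be positive over built-up surfaces."""
    return normalized_difference(swir1, nir)


# Hypothetical surface reflectance values for three pixels (not study data).
pixels = [
    {"red": 0.05, "nir": 0.45, "swir1": 0.20},  # vegetated pixel
    {"red": 0.20, "nir": 0.25, "swir1": 0.35},  # built-up pixel
    {"red": 0.0, "nir": 0.0, "swir1": 0.0},     # no-data pixel
]

ndvi_values = [ndvi(p["nir"], p["red"]) for p in pixels]
ndbi_values = [ndbi(p["swir1"], p["nir"]) for p in pixels]

for n, b in zip(ndvi_values, ndbi_values):
    print(n, b)
```

In practice the same arithmetic would be applied to whole raster bands rather than individual pixels, for example with a raster calculator in a GIS package.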

Data processing

Nonspatial data

Population density was defined as the number of inhabitants per square kilometer area of a district or sub-district, and residential density as the number of residential units per square kilometer area of a district or sub-district. Crime rates were calculated as the number of crimes reported per thousand population in a district or sub-district. Pedestrian accident rates were calculated as the number of pedestrian accidents reported per one lakh (100,000) population in a district or sub-district.
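The four nonspatial measures defined above reduce to simple ratios. The sketch below shows the arithmetic for a single administrative unit; the figures and variable names are hypothetical and are not taken from the study data.

```python
def density_per_sq_km(count, area_sq_km):
    """Count (people or residential units) per square kilometer."""
    return count / area_sq_km


def rate_per(events, population, per):
    """Events per `per` population, e.g. per=1_000 or per=100_000."""
    return events * per / population


# Hypothetical figures for one sub-district (illustration only).
unit = {
    "population": 250_000,
    "residential_units": 60_000,
    "area_sq_km": 500.0,
    "crimes": 1_500,
    "pedestrian_accidents": 50,
}

population_density = density_per_sq_km(unit["population"], unit["area_sq_km"])
residential_density = density_per_sq_km(unit["residential_units"], unit["area_sq_km"])
crime_rate = rate_per(unit["crimes"], unit["population"], 1_000)  # per thousand
pedestrian_accident_rate = rate_per(
    unit["pedestrian_accidents"], unit["population"], 100_000  # per one lakh
)

print(population_density, residential_density, crime_rate, pedestrian_accident_rate)
```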

Spatial dataset

A search using the place names Kerala, Kasaragod, Thiruvananthapuram, and Kanyakumari, a date range of the year 2016, and a cloud cover <10% yielded 25 Landsat 8 Operational Land Imager images. Similarly, a search for Kerala in the SRTM repository produced three SRTM images, which were merged and clipped for calculating the land slope for the extent of Kerala. The OSM layer was also clipped to the extent of Kerala. District and sub-district-level measures of greenness, built-up density, and land slope were captured using the Zonal statistics plugin, whereas intersection density was obtained using the points-in-polygon tool in the QGIS software 3.4.4 (QGIS Development Team, 2009). Data validation and quality check measures were placed in each step of processing, beginning with the download of data, filling of no-data values, calculation of composite measures, and scrutiny of raster histograms. The data processing methods are summarized in [Table 1]. Approval for this research was obtained from the Institutional Ethics Committee (IEC/1164).
Table 1: Data processing methods and definitions (source: Original)
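The two aggregation steps described in the spatial data processing, zonal statistics and a points-in-polygon count, were performed in QGIS. Purely to illustrate the underlying logic, the sketch below computes a per-zone median from a small raster and an intersection density from point counts. It is not the QGIS implementation: the grids, zone labels, and coordinates are hypothetical, and planar coordinates in kilometers are assumed.

```python
from collections import defaultdict
from statistics import median


def zonal_median(values, zones):
    """Median of raster values per zone label, skipping no-data (None) cells."""
    grouped = defaultdict(list)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if value is not None:
                grouped[zone].append(value)
    return {zone: median(cells) for zone, cells in grouped.items()}


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_area(polygon):
    """Planar polygon area by the shoelace formula."""
    n = len(polygon)
    twice_area = sum(
        polygon[i][0] * polygon[(i + 1) % n][1]
        - polygon[(i + 1) % n][0] * polygon[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


# Hypothetical slope raster (degrees) and a zone raster of the same shape.
slope = [[2.0, 4.0, 30.0], [3.0, None, 28.0], [1.0, 5.0, 35.0]]
zones = [["A", "A", "B"], ["A", "A", "B"], ["A", "A", "B"]]
zonal_medians = zonal_median(slope, zones)

# Hypothetical administrative boundary and road intersection points (km).
boundary = [(0.0, 0.0), (4.0, 0.0), (4.0, 5.0), (0.0, 5.0)]
intersections = [(1.0, 1.0), (2.0, 3.0), (3.5, 4.5), (5.0, 1.0), (-1.0, 2.0)]
inside_count = sum(point_in_polygon(x, y, boundary) for x, y in intersections)
intersection_density = inside_count / polygon_area(boundary)

print(zonal_medians, inside_count, intersection_density)
```

The intersection points are assumed to have been filtered beforehand to junctions where three or more road segments meet, matching the definition of intersection density used in this study.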


Data analysis

The geographical distributions of population density, residential density, crime rates, pedestrian accident rates, greenness, built-up density, intersection density, and land slope were summarized using both tables and choropleth maps across districts and subdistricts of Kerala. Choropleth maps were generated using the sp package in R software version 3.6.1 (R Core Team, 2019). Correlation between these variables was also examined.
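The text does not state which correlation coefficient was used. As a minimal sketch, assuming Pearson's product-moment correlation, the pairwise relationship between two variables measured across administrative units could be computed as follows; the values are hypothetical and chosen only to show a positive and a negative relationship.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    covariance_sum = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    ss_x = sum(dx * dx for dx in dev_x)
    ss_y = sum(dy * dy for dy in dev_y)
    return covariance_sum / sqrt(ss_x * ss_y)


# Hypothetical values for four administrative units (illustration only).
population_density = [500.0, 1000.0, 1500.0, 2000.0]
intersection_density = [2.0, 4.0, 6.0, 8.0]
land_slope = [20.0, 15.0, 8.0, 3.0]

r_intersections = pearson_r(population_density, intersection_density)
r_slope = pearson_r(population_density, land_slope)

print(r_intersections, r_slope)
```

A rank-based coefficient such as Spearman's could be substituted by ranking each variable before applying the same function, which may be preferable for skewed variables such as density.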

Results

Distribution of built environment variables – district-wise

The built environment variables were captured for districts and sub-districts, and their geographical distribution is shown in [Figure 1] and [Figure 2]. Thiruvananthapuram district was the most populous and had the highest density of housing units in the State. Crime rates were recorded to be the highest in Ernakulam and lowest in Malappuram districts. Pedestrian accident rates were reported to be highest in Kollam, whereas the lowest rates were reported from Malappuram district. Ernakulam district had the highest built-up density and lowest greenness, while Kozhikode had the lowest built-up density, and Wayanad had the highest greenness. The highest median land slope was found for Idukki, while the lowest was for Alappuzha district. Ernakulam was found to have the highest intersection density, while Idukki had the lowest number of road intersections per square kilometer.
Figure 1: Distribution of selected built environment variables across districts in Kerala (Source: Author generated).

Figure 2: Distribution of selected built environment variables across sub-districts in Kerala (Source: Author generated).


Distribution of built environment variables – sub-district-wise

The highest land slope was in the Devikulam sub-district in Idukki, while the lowest was found to be in Aleppey sub-district in Alappuzha. Cochin subdistrict had the lowest greenness, while Ranni in Pathanamthitta recorded the highest. Kuttanad in Alappuzha recorded the lowest built-up density while the highest built index was recorded for Cochin in Ernakulam. Pirmed in Idukki was the least populous and had the lowest number of housing units per square kilometer while Cochin city recorded the highest population and residential density. Kanayanur and Cochin subdistricts in Ernakulam ranked lowest and highest, respectively, in crime and pedestrian accident rates. Intersection density was highest in Kanayanur and lowest in Ambalappuzha subdistricts.

Correlation of built environment variables – district-wise

Correlations between built environment variables among districts and subdistricts are summarized in [Table 2]. Across districts, population density and residential density were highly correlated with each other. Population density was also positively related to three-way road intersection density and negatively related to land slope. A similar relationship was seen for residential density with the density of road intersections and land slope. Pedestrian accident rates were directly related to the density of houses per square kilometer, crime rates, and the density of road intersections, and inversely related to greenness. The density of road intersections per square kilometer tended to decline with higher land slope.
Table 2: The correlation among built environment variables across districts and subdistricts (source: Original)


Correlation of built environment variables – sub-district-wise

The population and residential density of sub-districts were also highly correlated with each other. Both population and residential density were directly related to crime rates, pedestrian accident rates, urbanicity, and road intersection density, but were negatively related to greenness and land slope. Crime rates tended to be higher with higher intersection density, built-up density, and pedestrian accident rates, while an inverse relationship was found with greenness. Pedestrian accident rates tended to be lower with higher greenness, higher land slope, and lower built-up density. Higher intersection density was related to low greenness, high built-up density, and low land slope. Higher greenness was related to higher land slope and low built-up density.

Discussion

This study intended to capture the available built environment variables using open-source data and examine their distribution across districts and subdistricts of Kerala. The distribution of variables under study showed distinct patterns across districts and sub-districts.

Capturing data and assimilating them continues to be a great challenge for researchers in LMICs.[20] Seeking due permission from government authorities to access data in LMICs continues to be a hurdle. Obtaining spatial data from government-owned sources, for example, Bhuvan in India, entails procedural delays and charges.[21],[22] Moreover, the linking of various data sources has its own constraints of compatibility and standardization; for example, police station jurisdictions and census block jurisdictions may not be the same. The majority of the existing evidence depicts the capture of the walkability index, which may be impossible to capture objectively in LMICs because of the paucity of data regarding land use and accessible destinations. Land use data available from government sources in India depicted agricultural land, barren land, and cropland, which could not be used for examining walkable environments, whereas developed countries had specifics of residential/institutional/commercial use.[6] The digitization of such variables may be plausible in the forthcoming decade, due to advancements in technology and expertise in handling spatial data.

Population density and residential density were highly correlated at both the district and subdistrict levels; across the districts, they were also related to accident rates, intersection density, and land slope. Across the sub-districts, both higher population density and residential density reflected higher built-up density, higher intersection density, and lower land slope and greenness, with higher rates of crimes and accidents. These could determine the urbanicity of the districts and subdistricts. These results coincide with previous evidence, where population density has been related to built-up growth and an increase in the built-up area.[23],[24] Moreover, there has been an established relationship between population density and crime rates, particularly a negative relationship with property crimes.[25] Higher crime rates were also reported in the populous districts of Istanbul.[26] Residential density beyond a threshold has also been evidenced to reduce violent crimes in urban neighborhoods.[27] Furthermore, higher population densities have resulted in the lowering of maximum vegetation fraction in the United States.[28] Greenness was found to be inversely related to population density, crime rates, and pedestrian accident rates. This has been previously documented in Portland, where greenness has resulted in reduced violent crimes.[29] Moreover, greenness has been found to reduce stress and mental fatigue in urban settings, thus supporting an inverse relationship between greenness and crime. Fewer police crime reports were also recorded in more highly vegetated regions in Chicago, and a negative correlation has also been found between tree cover and crime rates.[30] Pedestrian accident rates with higher casualties were found to be higher in extremely dense areas, and pedestrian collisions were also higher in high-density urban neighborhoods and areas with a higher percentage of street space.[31]

The present study showed that it is possible to capture available built environment variables using open-source data, in a manner that could be reproduced across LMICs. It is a first-of-its-kind attempt in public health research from LMICs. Such an exploration is cost-effective and maximizes the use of available resources in public health research. It could be replicated for comparison across different settings or to investigate changes in the neighborhood over time. Such exploration could provide opportunities to answer a multitude of plausible research questions, including relationships to health and disease. This method of spatial data capture using open-source data has demonstrated the relationship of built environment characteristics in the neighborhood with diabetes and physical inactivity.[32] Nevertheless, limitations such as standardization across datasets and jurisdiction boundaries need to be taken care of. This study has also attempted to capture only a few of the variables of the built environment. The objectively captured data, especially the crime and pedestrian accident statistics, were not originally collected for research purposes; hence, their quality cannot be assured. GIS data captured as a single estimate for the whole state may not exactly reflect the true phenomenon and have not been evaluated for validity and reliability. Certain open-source data sources, for example, OSM, are updated through crowd-sourcing and hence may not be as complete and reliable as they are for developed countries.

Conclusion

Objective measurements of the built environment can be captured using open-source data and freely available datasets in resource-poor settings. Capturing built environment data for public health research continues to be challenging in LMICs. However, the available solutions prove to be beneficial in delineating distribution across a state and have the potential to be replicated. These could be used to identify and compare built environment features across neighborhoods in LMICs.


Acknowledgment

The State Crime Records Bureau is gratefully acknowledged for its willingness to share data on crime and traffic.

Financial support and sponsorship

None.


Conflicts of interest

There are no conflicts of interest.

References

1. Hollnagel E. Resilience engineering and the built environment. Build Res Inf 2014;42:221-8.
2. Smit W, de Lannoy A, Dover RV, Lambert EV, Levitt N, Watson V. Making unhealthy places: The built environment and non-communicable diseases in Khayelitsha, Cape Town. Health Place 2016;39:196-203.
3. Su M, Du Y, Liu Q, Ren Y, Kawachi I, Lv J, et al. Objective assessment of urban built environment related to physical activity – Development, reliability and validity of the China Urban Built Environment Scan Tool (CUBEST). BMC Public Health 2014;14:109.
4. Prasad A, Gray CB, Ross A, Kano M. Metrics in urban health: Current developments and future prospects. Annu Rev Public Health 2016;37:113-33.
5. McGinn AP, Evenson KR, Herring AH, Huston SL, Rodriguez DA. Exploring associations between physical activity and perceived and objective measures of the built environment. J Urban Health 2007;84:162-84.
6. Koohsari MJ, Badland H, Sugiyama T, Mavoa S, Christian H, Giles-Corti B. Mismatch between perceived and objectively measured land use mix and street connectivity: Associations with neighborhood walking. J Urban Health 2015;92:242-52.
7. Koehler K, Latshaw M, Matte T, Kass D, Frumkin H, Fox M, et al. Building healthy community environments: A public health approach. Public Health Rep 2018;133:35S-43S.
8. Simon A, Mohankumar K. Spatial variability and rainfall characteristics of Kerala. J Earth Syst Sci 2004;113:211-21.
9. Fox T. Land-use dynamics of Kerala's agroforestry systems [Master of Science]. Quebec, Canada: Department of Geography, Dissertation, McGill University; 2015. Available from: http://digitool.library.mcgill.c a/R/?func=dbin-jump-full&object_id=132825&local_base=GEN01-MCG02. [Last accessed on 2018 Nov 22].
10. State Planning Board G of K. Economic Review 2016, State Planning Board; 2016. Available from: w2016/web/chapter 04_06.php. [Last accessed on 2019 Feb 25].
11. World Bank Group. Kerala Indicators at a Glance; 2017. Available from: ated/en/339981504162153632/pdf/119244-B RI-P157572-Kerala-AtAGlance.pdf. [Last accessed on 2019 Feb 25].
12. Praveen Lal CS, Nair SB. Urbanization in Kerala – What does the census data reveal? Indian J Hum Dev 2017;11:356-86.
13. Indian Penal Code Cases. Available from: mation/crime-statistics/ipc-cases. [Last accessed on 2019 Dec 06].
14. Lane KJ, Stokes EC, Seto KC, Thanikachalam S, Thanikachalam M, Bell ML. Associations between Greenness, Impervious Surface Area, and Nighttime Lights on Biomarkers of Vascular Aging in Chennai, India. Environ Health Perspect 2017;125. c/articles/PMC5783666.
15. Rasul A, Balzter H, Ibrahim G, Hameed H, Wheeler J, Adamu B, et al. Applying built-up and bare-soil indices from Landsat 8 to cities in dry climates. Land 2018;7:81.
16. CGIAR-CSI. CGIAR-Consortium for Spatial Information; 2004. Available from: [Last accessed on 2018 Dec 01].
17. Sreedevi PD, Owais S, Khan HH, Ahmed S. Morphometric analysis of a watershed of South India using SRTM data and GIS. J Geol Soc India 2009;73:543-52.
18. Ridwan SB, Ferdous HS, Ahmed SI. The challenges and prospect of OpenStreetMap in Bangladesh. In: 14th International Conference on Computer and Information Technology (ICCIT 2011). Bangladesh: IEEE; 2011. p. 589-94.
19. Boeing G. OSMnx: New methods for acquiring, constructing, analyzing, and visualizing complex street networks. Comput Environ Urban Syst 2017;65:126-39.
20. Nori-Sarma A, Gurung A, Azhar G, Rajiva A, Mavalankar D, Sheffield P, et al. Opportunities and challenges in public health data collection in Southern Asia: Examples from Western India and Kathmandu Valley, Nepal. Sustainability 2017;9:1106.
21. NRSC. Price List. National Remote Sensing Centre; 2016. Available from: [Last accessed on 2018 Dec 21].
22. Valson JS, Soman B. Spatiotemporal clustering of dengue cases in Thiruvananthapuram district, Kerala. Indian J Public Health 2017;61:74-80.
23. Zeug G, Eckert S. Population growth and its expression in spatial built-up patterns: The Sana'a, Yemen case study. Remote Sens 2010;2:1014-34.
24. Ehrlich D, Kemper T, Pesaresi M, Corbane C. Built-up area and population density: Two Essential Societal Variables to address climate hazard impact. Environ Sci Policy 2018;90:73-82.
25. Battin JR, Crowl JN. Urban sprawl, population density, and crime: An examination of contemporary migration trends and crime in suburban and rural neighborhoods. Crime Prev Community Saf 2017;19:136-50.
26. Ergun N, Yirmibeşoǧlu F. Distribution of crime rates in different districts in Istanbul. Turk Stud 2007;8:435-55.
27. Browning CR, Byron RA, Calder CA, Krivo LJ, Kwan MP, Lee JY, et al. Commercial density, residential concentration, and crime: Land use patterns and violence in neighborhood context. J Res Crime Delinquency 2010;47:329-57.
28. Pozzi F, Small C. Vegetation and population density in urban and suburban areas in the USA. Proceedings of the Third International Symposium of Remote Sensing of Urban Areas. 2002. p. 489-96.
29. Burley BA. Green infrastructure and violence: Do new street trees mitigate violent crime? Health Place 2018;54:43-9.
30. Bogar S, Beyer KM. Green space, violence, and crime: A systematic review. Trauma Violence Abuse 2016;17:160-71.
31. Loukaitou-Sideris A, Liggett R, Sung HG. Death on the crosswalk: A study of pedestrian-automobile collisions in Los Angeles. J Plan Educ Res 2007;26:338-51.
32. Valson JS, Kutty VR, Soman B, Jissa VT. Spatial clusters of diabetes and physical inactivity: Do neighborhood characteristics in high and low clusters differ? Asia Pac J Public Health 2019;31:612-21.

