Crime detection usually refers to the discovery of a crime, which may include identifying suspects and collecting intelligence and evidence, all subject to the law and legal requirements of the country concerned. Its key element, however, is determining whether an alleged crime has in fact been committed once it is discovered.[1] Acquiring relevant information, including information on clandestine online activities, is therefore critical to detection. In the environmental crime domain specifically, existing research demonstrates opportunities for crime detection approaches, for example:
• On observing online data that reveal possible trends in activity and leads to suspected criminal activity, Xu et al. (2020) studied the illicit Chinese wildlife trading landscape. Based on publicly available Facebook trading posts gathered by their automated web scraper, they pointed out that online marketplaces can become “conduits for the distribution, trafficking and sale” in the illegal wildlife trade.[2]
• On illustrating the extent of the impact of an environmentally harmful and illegal activity, Massarelli and Uricchio (2024) mapped potential illegal waste disposal hotspots in an Italian province. They based their findings on data web-scraped from various platforms, such as social media, websites and online forums. They then loaded the results into an open-source geographic information system (GIS) platform and used thematic mapping to visualize them.[3]
• On forecasting the most frequently recurring harmful effects of general products, Koyamparambath et al. (2022) formulated a prediction framework to help practitioners and verifiers of construction products’ life cycles. Their baseline information came from data collected by their in-house web scraper, which was designed to capture specific details of online environmental product declarations. The data were then fed to their AI-based model to predict the harmful effects that construction products are likely to cause most frequently over time.[4]
The above examples take advantage of web scraping to sift through the vast pool of online data.
Web scraping is a technique that converts online data, drawn from general websites, social media, forums and other online sources, into structured data that can be “saved and analyzed in a central spreadsheet or database”.[5]
It has been adapting to the development of the World Wide Web,[6] especially as online content has become more dynamic and visual.
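To make the idea of converting online content into structured data concrete, the following is a minimal sketch in Python using only the standard library. The sample HTML, the class names "listing", "title" and "price", and the helper scrape_to_csv are all hypothetical and chosen for illustration; a real scraper would fetch pages over the network and would need to respect the platform's terms and the applicable law.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would download this instead.
SAMPLE_HTML = """
<ul>
  <li class="listing"><span class="title">Carved pendant</span><span class="price">120</span></li>
  <li class="listing"><span class="title">Live parrot</span><span class="price">450</span></li>
</ul>
"""


class ListingParser(HTMLParser):
    """Collects one dict per <li class="listing"> element."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # name of the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "li" and cls == "listing":
            self.rows.append({})  # start a new record
        elif tag == "span" and cls in ("title", "price") and self.rows:
            self._field = cls

    def handle_data(self, data):
        if self._field:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape_to_csv(html):
    """Parse the markup and return (records, CSV text)."""
    parser = ListingParser()
    parser.feed(html)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["title", "price"])
    writer.writeheader()
    writer.writerows(parser.rows)
    return parser.rows, out.getvalue()


rows, csv_text = scrape_to_csv(SAMPLE_HTML)
print(csv_text)
```

The CSV text produced here could be saved to a file or loaded into a spreadsheet or database, which is the "structured" end state the definition above describes.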
Once developed, a web scraper can be run on demand by the user or through automated means, for example using software that helps the web scraper mimic human behavior, the most popular example being a bot.[7] With the emergence of artificial intelligence (AI), specialized web scrapers have taken on advanced capabilities for parsing markup languages or JSON (a standardized data format used in web development) files. These include the use of computer vision in web page parsing (software-led processing of syntax to extract semantic information), as well as applications that can simulate human behavior when browsing online content, such as natural language processing for extracting information from a web page and machine learning for classifying web pages.[8] These AI advances enable “minimized human intervention”[9] until the coding parameters need to be updated to match the latest user requirements.
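As a rough illustration of parsing JSON and sorting content into categories, the sketch below reads a JSON payload and flags posts that contain watch-list terms. It is deliberately simple: the keyword rule stands in for the trained natural language processing or machine learning models described above and is not one of them. The payload shape, the WATCH_TERMS set and the function flag_posts are assumptions made for this example.

```python
import json

# Hypothetical JSON payload, shaped like what a platform might return.
SAMPLE_JSON = """
{"posts": [
  {"id": 1, "text": "Rare ivory carving, discreet shipping"},
  {"id": 2, "text": "Second-hand bicycle for sale"}
]}
"""

# Illustrative watch-list; real terms would come from domain experts.
WATCH_TERMS = {"ivory", "pangolin", "rhino horn"}


def flag_posts(raw_json, terms=WATCH_TERMS):
    """Return the posts whose text contains at least one watch-list term."""
    posts = json.loads(raw_json)["posts"]
    flagged = []
    for post in posts:
        text = post["text"].lower()
        hits = sorted(term for term in terms if term in text)
        if hits:
            flagged.append({"id": post["id"], "matched": hits})
    return flagged


flagged = flag_posts(SAMPLE_JSON)
print(flagged)
```

A plain substring match like this will miss coded language and produce false positives, which is one reason the literature cited above points to learned classifiers for this step.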
Chaurasia (2023) likewise argues that web scraping is a valuable tool for law enforcement.[10] Specifically, they emphasize that wildlife enforcement agencies, for instance, could “significantly enhance” their efficiency and productivity, and that the tool even acts as a “force multiplier”. Furthermore:
- Structured information from web scrapers enables precise extraction of specific data from online platforms and supports focused and effective investigative efforts,
- Web scrapers can help in real-time monitoring and data gathering from identified public platforms such as online marketplaces, social media, forums and websites,
- The structured data generated by the tool can aid law enforcers’ analysis and thereby support the monitoring of “trends, patterns, potential leads and modus operandi” of suspected traffickers, and finally
- Relevant data are “collected at a fraction of time” through automation of a process that takes longer when done manually.
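The point about trends and patterns can be sketched with a small aggregation over already-scraped records. The example below counts posts per month and platform; the records, field names and the function monthly_counts are invented for illustration and do not come from any of the cited studies.

```python
from collections import Counter
from datetime import date

# Hypothetical structured records, as a scraper might have produced them.
records = [
    {"platform": "forum", "posted": date(2024, 1, 14)},
    {"platform": "marketplace", "posted": date(2024, 1, 30)},
    {"platform": "marketplace", "posted": date(2024, 2, 3)},
    {"platform": "marketplace", "posted": date(2024, 2, 21)},
]


def monthly_counts(rows):
    """Count records per (year-month, platform) pair."""
    return Counter(
        (row["posted"].strftime("%Y-%m"), row["platform"]) for row in rows
    )


counts = monthly_counts(records)
for (month, platform), n in sorted(counts.items()):
    print(month, platform, n)
```

A rise in one platform's monthly count would be a prompt for closer human review, not a finding in itself.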
The exponential growth of the internet is unlikely to halt anytime soon. Social adaptation to technological innovations will continue to progress, and those with ill intent will exploit it, hence the sustained need for more complex policing activities.
Nevertheless, there is a silver lining as technology advances, especially with the steady progress of AI. Collins (2023) underscored the potential gains from AI in navigating the complexities of environmental regulation. First, AI can be used in “predicting environmental risks”, which can help enforcers with response readiness and incident mitigation. Second, it can assist in “informing policy decisions” by generating scenarios and predicting outcomes for the matters under consideration. Finally, AI tools can be used for “monitoring compliance with environmental law”, with real-time monitoring and enforcement among the critical capabilities they can provide.[11] Recent web scrapers have already taken many of these capabilities on board in refining their functionalities and, given the opportunity, can be highly beneficial to law enforcers.
Written by Gerardine Meloy, CENTRIC
[1] Allot, A. and Bernard, T. Detection of crime. Britannica. 13 July 2024. https://www.britannica.com/topic/crime-law/Detection-of-crime, accessed on 19 July 2024.
[2] Xu, Q., Cai, M. and Mackey, T. The illegal wildlife digital market: an analysis of Chinese wildlife marketing and sale on Facebook. Environmental Conservation. Issue 47, pages 206-212. 14 July 2020. https://www.cambridge.org/core/services/aop-cambridge-core/content/view/D9E1850222C1CD521D309821BE596F74/S0376892920000235a.pdf/div-class-title-the-illegal-wildlife-digital-market-an-analysis-of-chinese-wildlife-marketing-and-sale-on-facebook-div.pdf, accessed on 19 July 2024.
[3] Massarelli, C. and Uricchio, V. The Contribution of Open Source Software in Identifying Environmental Crimes Caused by Illicit Waste Management in Urban Areas. Urban Science. Volume 8, Issue 1. 19 March 2024. https://www.mdpi.com/2413-8851/8/1/21, accessed on 19 July 2024.
[4] Koyamparambath, A. et al. Implementing Artificial Intelligence Techniques to Predict Environmental Impacts: Case of Construction Products. Life Cycle Thinking and Sustainability Assessment of Buildings. 21 March 2022. https://www.mdpi.com/2071-1050/14/6/3699, accessed on 22 July 2024.
[5] Khder, M. Web Scraping or Web Crawling: State of Art, Techniques, Approaches and Application. International Journal of Advances in Soft Computing and its Applications. Volume 13 (3), pages 145-168. December 2021. https://www.i-csrs.org/Volumes/ijasca/2021.3.11.pdf, accessed on 17 July 2024.
[6] Smith, V. Go Web Scraping Quick Start Guide. January 2019. Packt Publishing. https://learning.oreilly.com/library/view/go-web-scraping/9781789615708/, accessed on 18 July 2024.
[7] Bot. Dictionary.com. https://www.dictionary.com/browse/bot, accessed on 26 July 2024.
[8] Weerasinghe, K., Maduranga, M. and Kawya, M. Enhancing Web Scraping with Artificial Intelligence: A Review. ResearchGate. January 2024. https://www.researchgate.net/publication/379024314_Enhancing_Web_Scraping_with_Artificial_Intelligence_A_Review, accessed on 30 July 2024.
[9] Chapagain, A. Hands-on Web Scraping with Python – Second Edition. Packt Publishing. October 2023. https://learning.oreilly.com/library/view/hands-on-web-scraping/9781837636211/, accessed on 22 July 2023.
[10] Chaurasia, A. Eyes in the Digital Wilderness: Web Scrapers in the Battle Against Illegal Wildlife Trade. Wildhub. https://wildhub.community/posts/eyes-in-the-digital-wilderness-web-crawlers-in-the-battle-against-illegal-wildlife-trade, accessed on 19 July 2024.
[11] Collins, P. The role of Artificial Intelligence in environmental regulation. London School of Economics and Political Science Politics and Policy blog. 17 October 2023. https://blogs.lse.ac.uk/politicsandpolicy/the-role-of-artificial-intelligence-in-environmental-regulation/, accessed on 01 July 2024.