A Demystified Overview of Data Scraping
DOI:
https://doi.org/10.69511/ijdsaa.v6i6.205
Keywords:
Data Scraping, Web Scraping, Web crawling, Data mining, API Scraping
Abstract
Data scraping is the extraction of relevant data from a pool of information stored on a computer. It is commonly known as web scraping because the web contains a massive amount of information that is easily accessed and extracted. Web scraping is valuable in all fields of human endeavour. This paper gives a clear conceptualization of data scraping and addresses some minor misconceptions about the differences between data mining, web crawling, and web scraping. Furthermore, the phases and procedures of data scraping are outlined, and the merits of web scraping over API scraping are explained. Moreover, numerous software packages and tools that support the scraping of websites are listed. Even though web scraping is highly prominent, there are also technical issues and challenges associated with it. Finally, some of the legal and ethical issues related to information extraction are discussed; the paper concludes that data scraping is permitted as long as users comply with the terms and conditions of the target site.
License
Copyright (c) 2024 Shehu Mustapha, Mustafa Man, Wan Aezwani Wan Abu Bakar, Mohd Kamir Yusof, Ily Amalina Ahmad Sabri

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

International Journal of Data Science and Advanced Analytics (IJDSAA) is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. This license allows users to copy, distribute, transmit, and adapt an article, as long as the author is attributed and the article is not used for commercial purposes.
The author(s) confirm that:
- The manuscript submission has not been previously published, nor is it before another journal for consideration (or an explanation has been provided in Comments to the Editor).
- Permission for reproduction has been obtained for any published materials used in the manuscript.