Resources and Documents for CPI Web Scraping Project
Welcome to the resources and documents page for the CPI web scraping project. Here you will find a variety of materials related to the project, including data sets, code samples, and documentation.
Resources
Installation
A list of software required for the CPI web scraping project: Installation list
Web Scraping Books
Books recommended by Luigi Palumbo: The 5 Best Web Scraping Books 2023: https://scrapeops.io/web-scraping-playbook/best-web-scraping-books
Books recommended by Dominik Dąbrowski: Learning Scrapy now on Amazon and Packt: https://scrapybook.com
Web scraping courses/tutorials
https://www.freecodecamp.org/news/use-selenium-to-create-a-web-scraping-bot
https://www.freecodecamp.org/news/scraping-ecommerce-website-with-python
https://www.freecodecamp.org/news/how-to-scrape-websites-with-python-2
SQL courses/tutorials
https://www.freecodecamp.org/news/learn-sql-free-relational-database-courses-for-beginners/
https://www.freecodecamp.org/news/how-to-read-and-write-data-to-a-sql-database-using-python/
Some learning resources
Web scraping Useful Resources
Additional_references
Project Documents
Web scraping capacity plan v11
NSO charter (English and French versions)
Slack Channel
https://app.slack.com/client/T38UUQEP6/C04JVA1GR70
Sessions and recordings
CPI project initiation (19 January): Africa CPI collaborative project initiation-20230119_112419-Meeting Recording: https://vimeo.com/794501843/b1b06e0ce3?embedded=true&source=video_title&owner=99857619
Web-scraping environment surgery (2 February): Web scraping environment: Surgery-20230202_110427-Meeting Recording: https://vimeo.com/795309590/eb229eebdc?embedded=true&source=video_title&owner=99857619
Module 1
Web-scraping (8 February): CPI project (UN Regional Hub for Africa) 2023-02-08 Meeting Recording: https://vimeo.com/798448554/4df61f2558?embedded=true&source=video_title&owner=99857619
Slides are available here: UN Webscraping.pdf
Module 2
Agreeing on web-scraping strategies (15 February): CPI project (UN Regional Hub for Africa)-2023-02-15: Meeting Recording: https://vimeo.com/799804914/5cc361fa73?embedded=true&source=video_title&owner=99857619
Module 3: Introduction to Python
Session 1: 20 February: CPI project (UN Regional Hub for Africa) 1 of 3-2023-02-20: Meeting Recording: https://vimeo.com/801191519/e6f953df82?embedded=true&source=video_title&owner=99857619
Python Notebook: UNECE_1.ipynb
Data: data.csv
Session 2: 22 February: CPI project (UN Regional Hub for Africa) 2 of 3-2023-02-22: Meeting Recording: https://vimeo.com/801977400/45aff72e39?embedded=true&source=video_title&owner=99857619
Python Notebook: UNECE_2.ipynb
Session 3: 24 February: CPI project (UN Regional Hub for Africa) 3 of 3-2023-02-24: Meeting Recording: https://vimeo.com/803409277/9dfec5e886?embedded=true&source=video_title&owner=99857619
Python Notebook: UNECE_3.ipynb
Dataset to train at home: training_base.csv
Module 4: Designing a general web scraper in Python
Session 1: 13 March: CPI project (UN Regional Hub for Africa) 1 of 3-2023-03-13: Meeting Recording: https://vimeo.com/807581142/6319bc0612?embedded=true&source=video_title&owner=99857619
Python Notebook: scraping_requests.zip
Session 2: 15 March: CPI project (UN Regional Hub for Africa) 2 of 3-2023-03-15 Meeting Recording: https://vimeo.com/808663930/9bd6ecbba8?embedded=false&source=video_title&owner=99857619
Material used by Luigi in this session: scraping_selenium.zip
Session 3: 17 March: CPI project (UN Regional Hub for Africa) 3 of 3-2023-03-17: Meeting Recording: https://vimeo.com/809805403/b500ffe344?embedded=true&source=video_title&owner=99857619
Material used by Luigi in this session: data_post_processing.zip