
Resources and Documents for CPI Web Scraping Project

Welcome to the resources and documents page for the CPI web scraping project. Here you will find a variety of materials related to the project, including data sets, code samples, and documentation.

Resources

Installation

A list of software required for the CPI web scraping project: Installation list

Web Scraping Books  

Books recommended by Luigi Palumbo: The 5 Best Web Scraping Books 2023: https://scrapeops.io/web-scraping-playbook/best-web-scraping-books

Books recommended by Dominik Dąbrowski: Learning Scrapy now on Amazon and Packt: https://scrapybook.com

Web scraping courses/tutorials

https://www.freecodecamp.org/news/use-selenium-to-create-a-web-scraping-bot
https://www.freecodecamp.org/news/scraping-ecommerce-website-with-python
https://www.freecodecamp.org/news/how-to-scrape-websites-with-python-2

SQL courses/tutorials

https://www.freecodecamp.org/news/learn-sql-free-relational-database-courses-for-beginners/
https://www.freecodecamp.org/news/how-to-read-and-write-data-to-a-sql-database-using-python/

Some learning resources 

Web scraping Useful Resources

Additional_references

Project Documents  

Web scraping capacity plan v11

NSO charter (English and French versions)

Slack Channel  

https://app.slack.com/client/T38UUQEP6/C04JVA1GR70

Sessions and recordings 
CPI project initiation (19th January): Africa CPI collaborative project initiation-20230119_112419-Meeting Recording: https://vimeo.com/794501843/b1b06e0ce3?embedded=true&source=video_title&owner=99857619

Web-scraping environment surgery (2 February): Web scraping environment: Surgery-20230202_110427-Meeting Recording: https://vimeo.com/795309590/eb229eebdc?embedded=true&source=video_title&owner=99857619

Module 1

Web-scraping (8 February): CPI project (UN Regional Hub for Africa) 2023-02-08 Meeting Recording: https://vimeo.com/798448554/4df61f2558?embedded=true&source=video_title&owner=99857619

Slides are available here: UN Webscraping.pdf

Module 2

Agreeing on web-scraping strategies (15 February): CPI project (UN Regional Hub for Africa)-2023-02-15: Meeting Recording: https://vimeo.com/799804914/5cc361fa73?embedded=true&source=video_title&owner=99857619
 
Module 3: Introduction to Python   
 
Session 1: 20 February: 
CPI project (UN Regional Hub for Africa) 1 of 3-2023-02-20: Meeting Recording: https://vimeo.com/801191519/e6f953df82?embedded=true&source=video_title&owner=99857619

Python Notebook: UNECE_1.ipynb

Data: data.csv


Session 2: 22 February: CPI project (UN Regional Hub for Africa) 2 of 3-2023-02-22: Meeting Recording: https://vimeo.com/801977400/45aff72e39?embedded=true&source=video_title&owner=99857619

Python Notebook: UNECE_2.ipynb


Session 3: 24 February: CPI project (UN Regional Hub for Africa) 3 of 3-2023-02-24: Meeting Recording: https://vimeo.com/803409277/9dfec5e886?embedded=true&source=video_title&owner=99857619

Python Notebook: UNECE_3.ipynb

Dataset to train at home: training_base.csv

 
Module 4: Designing a general web scraper in Python
 
Session 1: 13 March: 
CPI project (UN Regional Hub for Africa) 1 of 3-2023-03-13: Meeting Recording: https://vimeo.com/807581142/6319bc0612?embedded=true&source=video_title&owner=99857619

Python Notebook: scraping_requests.zip


Session 2: 15 March: CPI project (UN Regional Hub for Africa) 2 of 3-2023-03-15 Meeting Recording: https://vimeo.com/808663930/9bd6ecbba8?embedded=false&source=video_title&owner=99857619

Material used by Luigi in this session: scraping_selenium.zip


Session 3: 17 March: CPI project (UN Regional Hub for Africa) 3 of 3-2023-03-17: Meeting Recording: https://vimeo.com/809805403/b500ffe344?embedded=true&source=video_title&owner=99857619

Material used by Luigi in this session: data_post_processing.zip