scraper-hafensommer
Minimal HTML-to-Event-JSON scraper for the Würzburg Hafensommer.
Vibe Coding Inspiration :)
This is the prompt I used to create it (with the current ChatGPT):
write a simple website scraper script. first download the page https://www.adticket.de/Hafensommer-Wurzburg.html
use css selector .w-paged-listing__list-item to match a event.
every event shall be stored to a single json file. css select child node with .c-list-item-event and pick data attribute data-sync-id and pick id from there. save the event json to a file named hafensommer-{{id}}.json
json shall follow schema.org/Event format.
select child node of time element type for startDate property (element looks like <time datetime="2025-07-31T20:00:00">)
pick name property from <h3 class="c-list-item-event__headline">Mine | Support: Epilog</h3> ... format as Hafensommer: Mine ... also extract this to "performer": { "@type": "Person", "name": "Mine" }
also initialize location hard-coded to "location": { "@type": "PostalAddress", "name": "Freitreppe Alter Hafen", "streetAddress": "Oskar-Laredo-Platz 1", "postalCode": "97080", "addressLocality": "Würzburg" }
pick image property from <img src="https://cdn.adticket.de/core/img/event/detailEvent_2347271.jpg" alt="" class="c-list-item-event__image">
check the following html for offer url https://www.adticket.de/Mine-Support-Epilog/Wuerzburg-Freitreppe-Alter-Hafen/31-07-2025_20-00.html
extract price from <div class="c-list-item-event__event-min-price"> <span>ab 40,00 €</span>
provide in json like: "offers": { "@type": "Offer", "url": "https://www.adticket.de/Mine-Support-Epilog/Wuerzburg-Freitreppe-Alter-Hafen/31-07-2025_20-00.html", "price": 40, "priceCurrency": "EUR" },
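For orientation, here is a minimal sketch of the formatting rules the prompt describes. It is not the generated scraper.py. It assumes the raw strings (start date, headline, image URL, offer URL, price label) have already been pulled out of the HTML with the selectors above, and it only covers the headline split, the price conversion, and the schema.org/Event layout. The function names and the `@context` field are my own assumptions.

```python
import json
import re
from pathlib import Path

# Hard-coded venue, copied from the prompt above.
LOCATION = {
    "@type": "PostalAddress",
    "name": "Freitreppe Alter Hafen",
    "streetAddress": "Oskar-Laredo-Platz 1",
    "postalCode": "97080",
    "addressLocality": "Würzburg",
}


def parse_performer(headline):
    """Return the main act: the text before the first '|' in the headline."""
    return headline.split("|", 1)[0].strip()


def parse_price(text):
    """Turn a German price label such as 'ab 40,00 €' into a number."""
    match = re.search(r"\d[\d.]*(?:,\d+)?", text)
    if match is None:
        return None  # no digits found, e.g. a sold-out label
    # German notation: '.' groups thousands, ',' marks the decimals.
    value = float(match.group().replace(".", "").replace(",", "."))
    return int(value) if value.is_integer() else value


def build_event(start_date, headline, image, offer_url, price_text):
    """Assemble one event dict in the layout requested by the prompt."""
    performer = parse_performer(headline)
    return {
        "@context": "https://schema.org",  # assumption: not asked for in the prompt
        "@type": "Event",
        "name": f"Hafensommer: {performer}",
        "startDate": start_date,
        "performer": {"@type": "Person", "name": performer},
        "location": LOCATION,
        "image": image,
        "offers": {
            "@type": "Offer",
            "url": offer_url,
            "price": parse_price(price_text),
            "priceCurrency": "EUR",
        },
    }


def save_event(event_id, event, out_dir="."):
    """Write the event to hafensommer-{id}.json and return the file path."""
    path = Path(out_dir) / f"hafensommer-{event_id}.json"
    # ensure_ascii=False keeps 'Würzburg' readable in the output file.
    path.write_text(json.dumps(event, ensure_ascii=False, indent=2), encoding="utf-8")
    return path
```

With the sample values from the prompt, `build_event("2025-07-31T20:00:00", "Mine | Support: Epilog", ...)` gives the name `Hafensommer: Mine` and the price `40`. A call such as `save_event("12345", event)` would then write `hafensommer-12345.json`, where `12345` is a made-up id standing in for the value taken from data-sync-id.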