For the history of this project, see discussion thread culturecreates/artsdata-data-model#120.
You can score event JSON-LD on individual webpages:
- Go to artsdata.ca.
- Paste a webpage URL into the top-right search box and click the search button.
- In the options for "External resources", click "dereference" to view the webpage's JSON-LD.
- At the top of the screen, click the "compute score" link.
This will load the external webpage with the score added into the Event data (keep scrolling down and look for a property called "score").
Each event in the JSON-LD will have a total score and the breakdown of the score for each property.
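If you save the scored JSON-LD to a file, the scores can also be pulled out programmatically instead of by scrolling. The following is a minimal sketch, not part of this repo: the helper name `scored_events` and the sample document are hypothetical, and the exact shape of the scored output (where "score" sits and how the breakdown is nested) is an assumption.

```ruby
require "json"

# Event types this README lists as currently supported.
SUPPORTED_TYPES = %w[Event MusicEvent TheaterEvent].freeze

# Hypothetical helper: recursively walk parsed JSON-LD and collect every
# node of a supported event type that carries a "score" property.
# Assumption: the score is a direct property of the event node.
def scored_events(node, found = [])
  case node
  when Array
    node.each { |child| scored_events(child, found) }
  when Hash
    # Accept "MusicEvent", "schema:MusicEvent" or "http://schema.org/MusicEvent".
    types = Array(node["@type"] || node["type"]).map { |t| t.to_s.split(%r{[:/#]}).last }
    found << node if node.key?("score") && (types & SUPPORTED_TYPES).any?
    node.each_value { |child| scored_events(child, found) }
  end
  found
end

# Made-up sample; the real output may be structured differently.
sample = JSON.parse(<<~JSON)
  {
    "@graph": [
      { "@type": "schema:MusicEvent", "name": "Concert", "score": 40 },
      { "@type": "schema:Place", "name": "Hall" }
    ]
  }
JSON

scored_events(sample).each do |event|
  puts "#{event['name']}: #{event['score']}"
end
```

The type filter mirrors the three event types named in the TODO below; nodes of any other type are skipped even if they carry a "score".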
TODO:
- Support more schema:Event sub-types. Currently only schema:Event, schema:MusicEvent and schema:TheaterEvent are supported.
You can score a batch of events across a website, provided that you can find a webpage that lists the events to score and a CSS/XPath class to locate the individual event URLs. The tool supports JSON-LD that is injected by JavaScript with the option "headless: true".
Prerequisite:
- you need to be a team member of this repo
Steps:
- Go to Actions
- Run the action and enter the parameters.
- View the CSV table in the reports section
- clone the repo
- cd into the project
- run `bundle install`
- run `rake test`
- make changes to the SHACL partials, SPARQL and update tests
- run `rake build` to merge all the SHACL partials into a single file
Taking into account all the excellent feedback provided so far, I would like to propose this revised weighting. It introduces a new category worth 4 points for properties that are deemed useful for disambiguation. Required properties are given a weight of 8 points and recommended properties are brought down to a weight of 2 points to address concerns that recommended properties may collectively have a higher cumulative value than required properties. I also propose to integrate @christianroy's proposal to give a null score if an event does not have all three required properties.
Required properties:
- startDate with a value that passes Artsdata SHACL validation (proper ISO-8601 syntax or minimal errors that are tolerated by the SHACL validation)
- name
- location.type with expected object value (i.e. Place object or subtype)
Properties useful for disambiguation (4 points each):
- location.name
- location.address.postalCode with valid postal code
- location.sameAs with a URI constituting a unique identifier for the object
Recommended properties (2 points each):
- id with a proper URI constituting a unique identifier for the Event (within the website domain, but distinct from the `url` value)
- url
- additionalType
- description
- image with a proper url value OR nested ImageObject with a proper image.url value
- organizer.type with expected object value for the property
- organizer.sameAs with a URI constituting a unique identifier for the object
- performer.type with expected object value for the property
- performer.sameAs with a URI constituting a unique identifier for the object
- offers.type with expected object value for the property (Offer or AggregateOffer)
- offers.url
Other properties:
- Any other property with an expected value (including "location.address.type", which I'm proposing to bump down for simplicity's sake and to balance the total weighting of space attributes compared to time attributes).
Under this proposal:
- An event with all three required properties would have a score of 12 + (2 x 8) = 28.
- If any of the three required properties is missing, the score would be 0 (zero), no matter how good the rest of the structured data is.
- An event with all three disambiguation properties would have 12 additional points, for a sub-total of 40.
- An event with all 11 recommended properties would have 22 additional points. If other contributors wished to keep the weight of recommended properties at 3, the total would be 33, which is in the same ballpark as the value of required properties.
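To make the arithmetic concrete, here is a minimal sketch of the proposed weighting in Ruby. It is an illustration only, not the repo's SHACL/SPARQL implementation. The function name `proposed_score` is hypothetical, and the split of the property list into three required, three disambiguation and 11 recommended properties is inferred from the order of the list and the counts quoted above. The required sub-total of 28 is taken as stated, and "any other property" is left out because its weight is not given.

```ruby
# Property groups, assumed from the order of the list in the proposal.
REQUIRED = %w[startDate name location.type].freeze
DISAMBIGUATION = %w[location.name location.address.postalCode location.sameAs].freeze
RECOMMENDED = %w[
  id url additionalType description image
  organizer.type organizer.sameAs performer.type performer.sameAs
  offers.type offers.url
].freeze

REQUIRED_TOTAL = 28          # 12 + (2 x 8), as stated in the proposal
DISAMBIGUATION_WEIGHT = 4
RECOMMENDED_WEIGHT = 2

# `valid` is the list of property paths that passed validation for one event.
# Returns 0 as soon as any required property is missing.
def proposed_score(valid, recommended_weight: RECOMMENDED_WEIGHT)
  return 0 unless (REQUIRED - valid).empty?

  REQUIRED_TOTAL +
    DISAMBIGUATION_WEIGHT * (DISAMBIGUATION & valid).size +
    recommended_weight * (RECOMMENDED & valid).size
end

puts proposed_score(REQUIRED)                                 # 28
puts proposed_score(REQUIRED + DISAMBIGUATION)                # 40
puts proposed_score(REQUIRED + DISAMBIGUATION + RECOMMENDED)  # 62
puts proposed_score(DISAMBIGUATION + RECOMMENDED)             # 0
```

Passing `recommended_weight: 3` reproduces the alternative mentioned above, where the 11 recommended properties contribute 33 points instead of 22.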