TEI2Zenodo

A generic webservice to quickly push TEI XML files to Zenodo deposits, thereby assigning them a DOI and committing them to long-term archival. It offers a GitHub integration: it listens to webhooks sent by GitHub and updates the TEI files, both on Zenodo and in the git repository, with the new Zenodo DOIs.

Quickstart

With the TEI2Zenodo webservice you can submit TEI files to Zenodo deposits. Based on a simple configuration using XPath expressions, it extracts the metadata for the Zenodo deposit description from the TEI file. The webservice accepts direct POST requests and form-style POST requests, and it can also be triggered via GitHub webhooks.

Note that Zenodo requires some metadata fields to be present and to use a controlled vocabulary. Since this webservice can only perform very simple XPath operations, it cannot create the required terms itself. There are two ways to deal with this. The first is to presuppose that the submitted TEI files already use the controlled vocabulary, a presupposition that goes beyond what the TEI guidelines recommend. The second is to hardcode a fixed value for such fields in your configuration. The first approach could, for instance, be applied to the contributor roles: the service could look up the //titleStmt/editor/@type value, but this value must then be one of a specific list of values defined by Zenodo. The second approach could, for example, be used to specify that the "upload_type" should always be "publication", no matter what. For more details, please see the general documentation and the configuration file template.
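The two approaches can be illustrated with a minimal Python sketch. It is not the service's own code: the sample TEI document, the function names and the short list of contributor types are assumptions made for illustration (Zenodo's actual list is longer), and the namespace prefix is only needed because Python's ElementTree requires it.

```python
import xml.etree.ElementTree as ET

# ElementTree needs an explicit prefix for the TEI namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Illustrative subset of contributor types; check Zenodo's documentation
# for the complete controlled vocabulary.
CONTRIBUTOR_TYPES = {"Editor", "DataCurator", "ProjectLeader", "Researcher", "Other"}

# Hypothetical TEI header: one editor uses the controlled vocabulary, one does not.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt>
    <title type="main">Sample Work</title>
    <editor type="Editor">Jane Doe</editor>
    <editor type="proofreader">John Roe</editor>
  </titleStmt></fileDesc></teiHeader>
</TEI>"""


def extract_contributors(tei_xml):
    """First approach: read the role from the TEI file and check the vocabulary."""
    root = ET.fromstring(tei_xml)
    contributors = []
    for editor in root.findall(".//tei:titleStmt/tei:editor", TEI_NS):
        role = editor.get("type", "")
        contributors.append({
            "name": (editor.text or "").strip(),
            "type": role,
            "valid": role in CONTRIBUTOR_TYPES,
        })
    return contributors


def build_metadata(tei_xml):
    """Second approach: hardcode a value, here for upload_type."""
    valid = [c for c in extract_contributors(tei_xml) if c["valid"]]
    return {
        "upload_type": "publication",  # fixed value, independent of the file
        "contributors": [{"name": c["name"], "type": c["type"]} for c in valid],
    }
```

In this sketch, an editor whose @type is outside the vocabulary is simply left out of the result; how the actual service reacts to such a value is not specified here.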

Form-style POST requests

An example form for the form-style POST API, organized into the following sections:

General section

 
 

WriteDOI section

Add other attributes:

  

Zenodo section

The following is the server's current metadata field configuration. To change it, edit the server's config.json. Note: this is a dump of an internal struct; do not copy it as-is into your config.json!

{
	"Fields": [
		{
			"Field": "upload_type",
			"XPath": "",
			"XExpression": "string('publication')",
			"Subfields": null
		},
		{
			"Field": "publication_type",
			"XPath": "",
			"XExpression": "string('other')",
			"Subfields": null
		},
		{
			"Field": "publication_date",
			"XPath": "//publicationStmt/date",
			"XExpression": "",
			"Subfields": null
		},
		{
			"Field": "title",
			"XPath": "//titleStmt//title[@type='main']",
			"XExpression": "",
			"Subfields": null
		},
		{
			"Field": "creators",
			"XPath": "//titleStmt/author",
			"XExpression": "",
			"Subfields": [
				{
					"Field": "name",
					"XPath": ".",
					"XExpression": "",
					"Subfields": null
				},
				{
					"Field": "affiliation",
					"XPath": "",
					"XExpression": "",
					"Subfields": null
				},
				{
					"Field": "orcid",
					"XPath": "",
					"XExpression": "",
					"Subfields": null
				},
				{
					"Field": "gnd",
					"XPath": "",
					"XExpression": "",
					"Subfields": null
				}
			]
		},
		{
			"Field": "description",
			"XPath": "",
			"XExpression": "string('Work published in the context of the School of Salamanca project.')",
			"Subfields": null
		},
		{
			"Field": "access_right",
			"XPath": "",
			"XExpression": "string('open')",
			"Subfields": null
		},
		{
			"Field": "license",
			"XPath": "//publicationStmt/availability/licence/@n",
			"XExpression": "",
			"Subfields": null
		},
		{
			"Field": "contributors",
			"XPath": "//fileDesc//editor",
			"XExpression": "",
			"Subfields": [
				{
					"Field": "name",
					"XPath": ".",
					"XExpression": "",
					"Subfields": null
				},
				{
					"Field": "type",
					"XPath": "@role",
					"XExpression": "",
					"Subfields": null
				},
				{
					"Field": "affiliation",
					"XPath": "",
					"XExpression": "",
					"Subfields": null
				},
				{
					"Field": "orcid",
					"XPath": "",
					"XExpression": "",
					"Subfields": null
				},
				{
					"Field": "gnd",
					"XPath": "",
					"XExpression": "",
					"Subfields": null
				}
			]
		},
		{
			"Field": "doi",
			"XPath": "//publicationStmt//idno[@type='DOI']",
			"XExpression": "",
			"Subfields": null
		},
		{
			"Field": "keywords",
			"XPath": "//teiHeader/profileDesc/textClass/keywords/term",
			"XExpression": "",
			"Subfields": null
		}
	]
}
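To get a quick overview of such a dump, a small Python helper can walk the Fields list and report where each value comes from. This is only a reading aid and a sketch: the function name is hypothetical, and treating an entry with an empty XPath and an empty XExpression as "not configured" is an assumption about the dump, not documented behavior.

```python
import json

# Short excerpt in the same shape as the dump above.
SAMPLE_DUMP = json.loads("""
{
  "Fields": [
    {"Field": "upload_type", "XPath": "",
     "XExpression": "string('publication')", "Subfields": null},
    {"Field": "creators", "XPath": "//titleStmt/author", "XExpression": "",
     "Subfields": [
       {"Field": "name", "XPath": ".", "XExpression": "", "Subfields": null},
       {"Field": "orcid", "XPath": "", "XExpression": "", "Subfields": null}
     ]}
  ]
}
""")


def summarize_fields(fields, prefix=""):
    """Return (field name, value source) pairs, descending into subfields."""
    rows = []
    for field in fields:
        name = prefix + field["Field"]
        if field["XExpression"]:
            source = "expression " + field["XExpression"]
        elif field["XPath"]:
            source = "XPath " + field["XPath"]
        else:
            source = "not configured"  # assumption: both entries empty
        rows.append((name, source))
        # "Subfields" is null (None) for simple fields.
        rows.extend(summarize_fields(field["Subfields"] or [], name + "."))
    return rows


if __name__ == "__main__":
    for name, source in summarize_fields(SAMPLE_DUMP["Fields"]):
        print(f"{name}: {source}")
```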

Direct POST requests

You can also submit direct POST requests to the file API endpoint. Its location depends on your configuration; by default, it is at {hostname}:8081/api/v1/file

(a) Set the Content-Type HTTP header to application/xml and put your file directly in the request body. Options are specified as query parameters: give the file name with the filename query parameter, and set the doPublish query parameter to True or False, depending on whether you want Zenodo to publish the deposit or to leave it in an editable state. (In the latter case, you can edit and publish it manually by logging in to Zenodo and going to your Uploads.)
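Variant (a) could be sketched in Python with the standard library as follows. The function name, the host localhost and the file name work.xml are assumptions for illustration; only the default port and path, the header and the two query parameters come from the description above.

```python
import urllib.parse
import urllib.request


def build_direct_post(endpoint, tei_bytes, filename, do_publish=False):
    """Build (but do not send) a direct POST request for a TEI file."""
    query = urllib.parse.urlencode({
        "filename": filename,
        "doPublish": "True" if do_publish else "False",
    })
    return urllib.request.Request(
        url=f"{endpoint}?{query}",
        data=tei_bytes,  # the request body is the file itself
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


if __name__ == "__main__":
    # Assumes a service running locally with the default endpoint.
    with open("work.xml", "rb") as f:
        request = build_direct_post(
            "http://localhost:8081/api/v1/file", f.read(), "work.xml"
        )
    with urllib.request.urlopen(request) as response:
        print(response.status, response.read().decode("utf-8"))
```

Leaving do_publish at False keeps the deposit editable, which is the safer choice while testing.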

(b) Alternatively, you can send a multipart/form-data request. (That means a Content-Type header of multipart/form-data with a boundary string appended after a semicolon, e.g. multipart/form-data;boundary="myboundary".) The form fields are then called filename, file and doPublish.
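Variant (b) could be sketched like this, again with the Python standard library only. The function name and the order of the form fields are choices made for this sketch, and the body is assembled by hand to show what the boundary string does; in practice an HTTP client library would normally build it for you.

```python
import urllib.request
import uuid


def build_multipart_post(endpoint, tei_bytes, filename, do_publish=False, boundary=None):
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = boundary or uuid.uuid4().hex

    def text_part(name, value):
        # One plain form field, introduced by the boundary line.
        return (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8")

    file_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/xml\r\n\r\n"
    ).encode("utf-8") + tei_bytes + b"\r\n"

    body = (
        text_part("filename", filename)
        + text_part("doPublish", "True" if do_publish else "False")
        + file_part
        + f"--{boundary}--\r\n".encode("utf-8")  # closing boundary
    )
    return urllib.request.Request(
        url=endpoint,
        data=body,
        headers={"Content-Type": f'multipart/form-data; boundary="{boundary}"'},
        method="POST",
    )
```

The returned request can be sent with urllib.request.urlopen(request) once the endpoint is known.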